Open Biblio corpus for content analysis
Description
Description of the corpus
The corpus contains full-text scientific publications (mathematics, computing, statistics) in LaTeX or TXT format.
They are published in open access.
The corpus serves two purposes:
- information extraction (for instance: extract all collocations around a target word, or extract methods names)
- comparison of abstract and body text
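The two uses listed above can be illustrated with a minimal sketch. It is not the author's method: the function names, the simple alphabetic tokenizer, the window size, and the sample strings are all assumptions made for illustration, and real LaTeX sources would need their markup stripped first.

```python
import re
from collections import Counter

# Hypothetical tokenizer: lowercase alphabetic runs only (an assumption;
# LaTeX commands and formulas would need dedicated cleaning beforehand).
TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def collocations(text, target, window=2):
    """Count the tokens found within `window` positions of each `target` occurrence."""
    tokens = tokenize(text)
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Neighbours on both sides, excluding the target itself.
            counts.update(tokens[lo:i] + tokens[i + 1:hi])
    return counts


def jaccard(abstract, body):
    """Vocabulary overlap between an abstract and a body text (Jaccard index)."""
    a, b = set(tokenize(abstract)), set(tokenize(body))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up sample strings, not taken from the corpus.
sample_body = ("We propose a gradient method. "
               "The gradient method converges when the step is small.")
sample_abstract = "A gradient method with a small step."

neighbours = collocations(sample_body, "method", window=1)
overlap = jaccard(sample_abstract, sample_body)
```

Here `neighbours` holds the words adjacent to "method", and `overlap` is a rough measure of how much of the abstract's vocabulary also appears in the body. A frequency-weighted measure could replace the set-based index if word counts matter.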
Size of the publication corpus: 650,000
Size of the publication sample: 20
Data: body text as string data
Author
This dataset has been published on the initiative and under the responsibility of Nicolas Turenne.
Latest update
October 12, 2023
License
Metadata quality:
Data description filled
Files documented
License filled
Update frequency not followed
File formats are open
Temporal coverage filled
Spatial coverage not set
Some files are unavailable
Information
Tags
License
ID
5840026288ee383a2cc65bb3
Temporality
Creation
December 1, 2016
Frequency
Biannual
Temporal coverage
1994/01/01 to 2014/07/01
Latest update
October 12, 2023
Statistics for the year
Views
584
Downloads
46
Reuses of this dataset
0
Followers
0