Agroecology archive of BSV (Bulletins de Santé du Végétal). From Nicolas Turenne. The corpus describes insect and disease damage to crops (wheat, vines...). It contains 41,000 documents, of which 17,000 were published between 1960 and 2000 and have medium-quality text recognition. Each file gives the risk level for a crop in a region of France. Texts are in French; size of… Metadata quality: 77.78/100 (data description filled; files documented; license filled; update frequency not followed; file formats are open; temporal coverage filled; spatial coverage filled; some files are unavailable). Updated on October 12, 2023. Creative Commons Attribution. 0 reuses, 2 favorites.
YouTube archive on whistleblowing. From Nicolas Turenne. The corpus describes videos about whistleblowing on YouTube. Its goal is to automatically detect new videos (from persons or organizations) that blow the whistle, and to find patterns that serve that purpose. Size of video corpus: 347,544… Metadata quality: 77.78/100 (data description filled; files documented; license filled; update frequency not followed; file formats are open; temporal coverage filled; spatial coverage not set; all files are available). Updated on November 30, 2016. Creative Commons Attribution. 0 reuses, 1 favorite.
Credibility Corpus with several datasets (Twitter, Web database) in French and English. From Nicolas Turenne. These datasets are intended for analyzing information credibility in general (rumor and disinformation in English and French documents), particularly as it occurs on the social web. Targeted databases on rumors, hoaxes and disinformation helped to collect clear cases of misinformation.… Metadata quality: 66.67/100 (data description filled; files documented; license filled; update frequency not set; file formats are open; temporal coverage filled; spatial coverage not set; all files are available). Updated on December 1, 2016. Creative Commons Attribution. 0 reuses, 0 favorites.
Open Bilbio corpus for content analysis. From Nicolas Turenne. The corpus consists of full-text scientific publications (mathematics, computing, statistics) in LaTeX or TXT format, all published in open access. The purpose of this corpus is twofold: information extraction (for instance: extract all collocations around a target… Metadata quality: 66.67/100 (data description filled; files documented; license filled; update frequency not followed; file formats are open; temporal coverage filled; spatial coverage not set; some files are unavailable). Updated on October 12, 2023. Creative Commons Attribution. 0 reuses, 0 favorites.