Update render script and Makefile

Jonas Zeunert
2024-04-22 21:54:39 +02:00
parent 2d63fe63cd
commit 4d0cd768f7
10975 changed files with 47095 additions and 4031084 deletions

@@ -1,12 +1,13 @@
 Awesome Linguistics Resources for Spanish !Awesome (https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg) (https://github.com/sindresorhus/awesome)
 Awesome Linguistics Resources for Spanish !Awesome (https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg) 
 (https://github.com/sindresorhus/awesome)
Curated list of Linguistic Resources for doing Spanish NLP & CL.
 Clustering
 Clustering
- Multilingual Latent Dirichlet Allocation LDA (https://github.com/ArtificiAI/Multilingual-Latent-Dirichlet-Allocation-LDA)
 Speech
 Speech
- Mexican Spanish Speech Recognition DB - 150 Speakers (http://www.speechocean.com/en-ASR-Corpora/631.html)
- Mexican Spanish Speech Recognition DB - 299 Speakers (http://www.speechocean.com/en-ASR-Corpora/603.html)
@@ -21,7 +22,7 @@
- Ruby Snowball Implementation (https://github.com/MaG21/estem)
- Spaghetti POS Tagger (based on NLTK + CESS corpus) (https://code.google.com/p/spaghetti-tagger/)
 Multiword Expressions Extractors (MLWE)
 Multiword Expressions Extractors (MLWE)
- Freeling (http://nlp.lsi.upc.edu/freeling/)
Named Entity Recognition (NER)
@@ -63,10 +64,11 @@
- Cross Lingual Textual Entailment (CLTE) Corpus (English-Spanish) (http://www.celct.it/resources.php?id_page=CLTE)
- Ngram Frequencies out of Colombia News Corpora (http://ngrams.cavorite.com/datos/)
- Sagan Textual Entailment Test Suite (http://www.investigacion.frc.utn.edu.ar/mslabs/~jcastillo/Sagan-test-suite/)
- Garcia, Marcos and Pablo Gamallo, 2013 - Portuguese and Spanish biographical relation extraction corpora (Garcia, Marcos and Pablo Gamallo, 2013. Exploring the Effectiveness of Linguistic Knowledge for 
Biographical Relation Extraction. Natural Language Engineering, CJO2013. doi:10.1017/S1351324913000314.) (http://gramatica.usc.es/~marcos/corpora_nle.tgz)
- Garcia, Marcos and Pablo Gamallo, 2014 - Portuguese, Spanish and Galician coreference corpora (Garcia, Marcos and Pablo Gamallo, 2014. Multilingual corpora with coreferential annotation of person entities. In 
Proceedings of the 9th edition of the Language Resources and Evaluation Conference (LREC 2014), Reykjavik: 3229-3233.) (http://gramatica.usc.es/~marcos/resources/corpora_coref.tar.bz2)
- Garcia, Marcos and Pablo Gamallo, 2013 - Portuguese and Spanish biographical relation extraction corpora (Garcia, Marcos and Pablo Gamallo, 2013. Exploring the Effectiveness of Linguistic 
Knowledge for Biographical Relation Extraction. Natural Language Engineering, CJO2013. doi:10.1017/S1351324913000314.) (http://gramatica.usc.es/~marcos/corpora_nle.tgz)
- Garcia, Marcos and Pablo Gamallo, 2014 - Portuguese, Spanish and Galician coreference corpora (Garcia, Marcos and Pablo Gamallo, 2014. Multilingual corpora with coreferential annotation of 
person entities. In Proceedings of the 9th edition of the Language Resources and Evaluation Conference (LREC 2014), Reykjavik: 3229-3233.) 
(http://gramatica.usc.es/~marcos/resources/corpora_coref.tar.bz2)
- COW (Corpora from the Web) Ngram/Annotated People's Name Corpora (http://hpsg.fu-berlin.de/cow/)
- Wikicorpus - Portion of the 2006 Wikipedia annotated with WordNet synsets and POS (http://www.cs.upc.edu/~nlp/wikicorpus/)
- Spanish Billion Words Corpus with word2vec Embeddings (http://crscardellino.me/SBWCE/)