[38;2;255;187;0m[4mAwesome Linguistics[0m
[38;5;14m[1m![0m[38;5;12mAwesome[39m[38;5;14m[1m (https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)[0m[38;5;12m (https://github.com/sindresorhus/awesome)[39m

[38;5;12mA curated list of anything remotely related to linguistics, sorted in alphabetical order.[39m

[38;5;12m- [39m[38;5;14m[1mProgramming[0m[38;5;12m (#programming)[39m
[48;5;235m[38;5;249m- **Platforms and toolkits** (#platforms-and-toolkits)[49m[39m
[48;5;235m[38;5;249m- **Algorithms** (#algorithms)[49m[39m[48;5;235m[38;5;249m                        [49m[39m
[48;5;235m[38;5;249m- **Data sets** (#data-sets)[49m[39m[48;5;235m[38;5;249m                          [49m[39m
[38;5;12m- [39m[38;5;14m[1mResources[0m[38;5;12m (#resources)[39m
[48;5;235m[38;5;249m- **Deep learning models and transformers** (#deep-learning-models-and-transformers)[49m[39m
[48;5;235m[38;5;249m- **On Wikipedia** (#on-wikipedia)[49m[39m[48;5;235m[38;5;249m                                                  [49m[39m
[48;5;235m[38;5;249m- **On Youtube** (#on-youtube)[49m[39m[48;5;235m[38;5;249m                                                      [49m[39m
[48;5;235m[38;5;249m- **Books** (#books)[49m[39m[48;5;235m[38;5;249m                                                                [49m[39m
[48;5;235m[38;5;249m    - **Free** (#free)[49m[39m[48;5;235m[38;5;249m                                                              [49m[39m
[48;5;235m[38;5;249m    - **Non free** (#non-free)[49m[39m[48;5;235m[38;5;249m                                                      [49m[39m
[48;5;235m[38;5;249m    - **Lists** (#lists)[49m[39m[48;5;235m[38;5;249m                                                            [49m[39m
[38;5;12m- [39m[38;5;14m[1mStandards[0m[38;5;12m (#standards)[39m
[38;5;12m- [39m[38;5;14m[1mLists[0m[38;5;12m (#lists)[39m
[38;5;12m- [39m[38;5;14m[1mCommunities[0m[38;5;12m (#communities)[39m

[38;2;255;187;0m[4mProgramming[0m
[48;2;30;30;40m[38;5;13m[3mLibraries, frameworks and applications useful for developing applications.[0m

[38;2;255;187;0m[4mPlatforms and toolkits[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mCLARIN-D web tools[0m[38;5;12m (https://www.clarin-d.net/en/analysing) - Tools for Analysing Research Data [39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mCorpusExplorer[0m
[38;5;12m (https://notes.jan-oliver-ruediger.de/software/corpusexplorer-overview/) - Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 50 interactive visualizations under a user-friendly interface.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mHaxe-linguistics[0m[38;5;12m (https://github.com/sexybiggetje/haxe-linguistics) - Early linguistical analysis and natural language processing library for Haxe.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mNatural[0m[38;5;12m (https://github.com/NaturalNode/natural) - General natural language tools for Node.js.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mNatural Language ToolKit (NLTK)[0m[38;5;12m (http://www.nltk.org/) - The most complete platform for building Python programs to work with human language data.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mSnowball[0m[38;5;12m (https://snowballstem.org/) - Snowball is a language in which stemming algorithms can be easily represented.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mSpacy[0m[38;5;12m (https://spacy.io/) - Industrial-strength  National Language Processing in Python.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mMate Tools[0m[38;5;12m (http://hdl.handle.net/11022/1007-0000-0000-8E4E-A), webservice via WebLicht[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mUBIAI[0m[38;5;12m (https://ubiai.tools/) - Easy-to-use text annotation tool for teams with most comprehensive auto-annotation features. Supports NER, relations and document classification as well as OCR annotation for invoice labeling.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mtextblob-de[0m[38;5;12m (https://github.com/markuskiller/textblob-de) - Nice alternative for spacy (see above).[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mtyo[0m[38;5;12m (https://github.com/mongsvo/tyo) - A utility for finding Typo-Bridges.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mUralicNLP[0m[38;5;12m [39m[38;5;12m(https://github.com/mikahama/uralicNLP)[39m[38;5;12m [39m[38;5;12m-[39m[38;5;12m [39m[38;5;12mAn[39m[38;5;12m [39m[38;5;12mopen[39m[38;5;12m [39m[38;5;12msource[39m[38;5;12m [39m[38;5;12mPython[39m[38;5;12m [39m[38;5;12mlibrary[39m[38;5;12m [39m[38;5;12mfor[39m[38;5;12m [39m[38;5;12mprocessing[39m[38;5;12m [39m[38;5;12mmorphologically[39m[38;5;12m [39m[38;5;12mrich[39m[38;5;12m [39m[38;5;12mand,[39m[38;5;12m [39m[38;5;12mfor[39m[38;5;12m [39m[38;5;12mthe[39m[38;5;12m [39m[38;5;12mmost[39m[38;5;12m [39m[38;5;12mpart,[39m[38;5;12m [39m[38;5;12mendangered[39m[38;5;12m [39m[38;5;12mUralic[39m[38;5;12m [39m[38;5;12mlanguages.[39m[38;5;12m [39m[38;5;12mIt[39m[38;5;12m [39m[38;5;12mcan[39m[38;5;12m [39m[38;5;12mdo[39m[38;5;12m [39m[38;5;12mmorphological[39m[38;5;12m [39m[38;5;12manalysis,[39m[38;5;12m [39m[38;5;12mgeneration,[39m[38;5;12m [39m[38;5;12mlemmatization,[39m[38;5;12m [39m
[38;5;12mdisambiguation[39m[38;5;12m [39m[38;5;12mand[39m[38;5;12m [39m[38;5;12mlexical[39m[38;5;12m [39m[38;5;12mlookup[39m[38;5;12m [39m[38;5;12mfor[39m[38;5;12m [39m[38;5;12ma[39m[38;5;12m [39m[38;5;12mgreat[39m[38;5;12m [39m[38;5;12mmany[39m[38;5;12m [39m[38;5;12mUralic[39m[38;5;12m [39m[38;5;12mlanguages.[39m

[38;2;255;187;0m[4mAlgorithms[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mStemming algorithms for various European languages[0m[38;5;12m (http://snowball.tartarus.org/texts/stemmersoverview.html) - Various stemming algorithms from snowball.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mThe Porter Stemmer Algorithm[0m[38;5;12m (http://tartarus.org/martin/PorterStemmer/) - The ‘official’ home page for distribution of the Porter Stemming Algorithm, written and maintained by its author, Martin Porter.[39m

[38;2;255;187;0m[4mData sets[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mEuroRomCom Data[0m[38;5;12m (https://github.com/kirkins/euroromcom) - JSON formatted Pan-Romance word lists.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mAraneum Germanicum[0m[38;5;12m (http://aranea.juls.savba.sk/aranea_about/_germanicum.html)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mCEHugeWebCorpus[0m[38;5;12m (https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-2638) - German corpus based on CommonCrawl[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mDigitales Wörterbuch der deutschen Sprache (DWDS)[0m[38;5;12m (https://dwds.de)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mGC4 Corpus[0m[38;5;12m (https://german-nlp-group.github.io/projects/gc4-corpus.html) (CommonCrawl)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mIDS Corpora[0m[38;5;12m (https://www1.ids-mannheim.de/kl/projekte/korpora) - German Reference Corpus[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mLeipzig Corpora Collection[0m[38;5;12m (https://wortschatz.uni-leipzig.de/en/download/) - sampled sentences in different languages.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mSdeWaC[0m[38;5;12m (https://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/sdewac.en.html) - big german internet corpus[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mC-WEP[0m[38;5;12m (http://lingured.info/linguistic-resources/cwep/)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mDysList (list of dyslexic errors)[0m[38;5;12m (https://github.com/Rauschii/DysListGerman)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mFalko[0m[38;5;12m (https://www.linguistik.hu-berlin.de/de/institut/professuren/korpuslinguistik/forschung/falko)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mLitkey[0m[38;5;12m (https://www.linguistics.ruhr-uni-bochum.de/litkeycorpus/)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mOpinionSpam[0m[38;5;12m (https://github.com/hdaSprachtechnologie/OpinionSpam)[39m

[38;2;255;187;0m[4mResources[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mLow Resource Languages[0m[38;5;12m (https://github.com/RIchardLitt/low-resource-languages) - A list of resources for conservation, development, and documentation of low resource (human) languages.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mLanguage Science Press[0m[38;5;12m (https://langsci-press.org/) - Language Science Press is a born-digital scholar-led open access publisher in linguistics.[39m

[38;2;255;187;0m[4mDeep learning models and transformers[0m

[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mdbmdz BERT models[0m[38;5;12m (https://github.com/dbmdz/berts)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mDeepset German BERT model[0m[38;5;12m (https://deepset.ai/german-bert)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mEvaluating German Transformer Language Models with Syntactic Agreement Tests[0m[38;5;12m (https://github.com/DFKI-NLP/gevalm)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mGerman ELMo Model[0m[38;5;12m (https://github.com/t-systems-on-site-services-gmbh/german-elmo-model)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mgerman-transformer-training[0m[38;5;12m (https://github.com/PhilipMay/german-transformer-training)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mGermLM[0m[38;5;12m (https://github.com/tonianelope/Multilingual-BERT) (NER exploration)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mGerPT2[0m[38;5;12m (https://github.com/bminixhofer/gerpt2)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mSentence Transformers[0m[38;5;12m (https://github.com/UKPLab/sentence-transformers)[39m

[38;2;255;187;0m[4mOn Wikipedia[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mBag of words model[0m[38;5;12m (https://en.wikipedia.org/wiki/Bag-of-words_model)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mDocument classification[0m[38;5;12m (https://en.wikipedia.org/wiki/Document_classification)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mLanguage models[0m[38;5;12m (https://en.wikipedia.org/wiki/Language_model)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mNaive Bayes classification[0m[38;5;12m (https://en.wikipedia.org/wiki/Naive_Bayes_classifier)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mNatural language processing[0m[38;5;12m (https://en.wikipedia.org/wiki/Natural_language_processing)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mOutline of natural language processing[0m[38;5;12m (https://en.wikipedia.org/wiki/Outline_of_natural_language_processing)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mParts of speech tagging[0m[38;5;12m (https://en.wikipedia.org/wiki/Part-of-speech_tagging)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mSentiment analysis[0m[38;5;12m (https://en.wikipedia.org/wiki/Sentiment_analysis)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mTerm frequency - inverse document frequency[0m[38;5;12m (https://en.wikipedia.org/wiki/Tf%E2%80%93idf)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mVector space model[0m[38;5;12m (https://en.wikipedia.org/wiki/Vector_space_model)[39m

[38;2;255;187;0m[4mOn Youtube[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mComputational Linguistics Lecture Playlist (Youtube)[0m[38;5;12m (https://www.youtube.com/playlist?list=PLegWUnz91WfuPebLI97-WueAP90JO-15i) - Lectures for University of Maryland class on computational linguistics.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mThe Virtual Linguistics Campus[0m[38;5;12m (https://www.youtube.com/channel/UCaMpov1PPVXGcKYgwHjXB3g) - CC-licensed educational videos interconnected with Marburg University's e-learning platform of the same name.[39m

[38;2;255;187;0m[4mBooks[0m
[48;2;30;30;40m[38;5;13m[3mSome of the more interesting and complete books.[0m

[38;2;255;187;0m[4mFree[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mEssentials of Linguistics, 2nd edition[0m[38;5;12m (https://ecampusontario.pressbooks.pub/essentialsoflinguistics2/) - An introductory book (2nd edition).[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mIntroduction to Linguistics[0m[38;5;12m (https://linguistics.ucla.edu/people/Kracht/courses/ling20-fall07/ling-intro.pdf)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mNatural Language Processing with Python[0m[38;5;12m (https://www.nltk.org/book/) - The book from the NLTK package.[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mText Mining with R[0m[38;5;12m (https://www.tidytextmining.com)[39m

[38;2;255;187;0m[4mNon free[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mFoundations of Computational Linguistics[0m[38;5;12m (https://books.google.com/books?id=o9iGAgAAQBAJ&dq=Foundations+of+Computational+Linguistics&hl=nl&source=gbs_navlinks_s)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mFoundations of Statistical Natural Language Processing[0m[38;5;12m (https://books.google.nl/books?id=YiFDxbEX3SUC)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mSemisupervised Learning for Computational Linguistics[0m[38;5;12m (https://books.google.com/books/about/Semisupervised_Learning_for_Computationa.html?id=VCd67cGB_rAC&redir_esc=y)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mSpeech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition[0m[38;5;12m (https://books.google.nl/books?id=fZmj5UNK8AQC)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mThe Oxford Handbook of Computational Linguistics[0m[38;5;12m (https://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199276349.001.0001/oxfordhb-9780199276349)[39m

[38;2;255;187;0m[4mStandards[0m

[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mDTA Basisformat[0m[38;5;12m (https://www.deutschestextarchiv.de/doku/basisformat/)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mISO TC 37 SC 4[0m[38;5;12m (https://www.iso.org/committee/297592.html)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mUIMA[0m[38;5;12m (https://docs.oasis-open.org/uima/v1.0/os/uima-spec-os.html)[39m

[38;2;255;187;0m[4mLists[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1m15 most popular books on good reads[0m[38;5;12m (https://www.goodreads.com/shelf/show/natural-language-processing)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;12mGitHub topics [39m[38;5;14m[1mcorpus-linguistics[0m[38;5;12m (https://github.com/topics/corpus-linguistics) & [39m[38;5;14m[1mnlp[0m[38;5;12m (https://github.com/topics/nlp)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mnlp-datasets[0m[38;5;12m (https://github.com/niderhoff/nlp-datasets)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mNLP-progress[0m[38;5;12m (https://github.com/sebastianruder/NLP-progress)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1m/r/LanguageTechnology/[0m[38;5;12m (https://www.reddit.com/r/LanguageTechnology/)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mawesome-nlp[0m[38;5;12m (https://github.com/keon/awesome-nlp)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mAwesome Community-Curated NLP List[0m[38;5;12m (https://github.com/alvations/awesome-community-curated-nlp)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mawesome-chinese-nlp[0m[38;5;12m (https://github.com/crownpku/Awesome-Chinese-NLP)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mawesome-danish[0m[38;5;12m (https://github.com/fnielsen/awesome-danish)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mawesome-hungarian-nlp[0m[38;5;12m (https://github.com/oroszgy/awesome-hungarian-nlp)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mawesome Information Retrieval[0m[38;5;12m (https://github.com/harpribot/awesome-information-retrieval)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mIndonesian NLP[0m[38;5;12m (https://github.com/kmkurn/id-nlp-resource)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mNorwegian NLP resources[0m[38;5;12m (https://github.com/web64/norwegian-nlp-resources)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mGerman NLP resources[0m[38;5;12m (https://github.com/adbar/German-NLP/)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mawesome-nlp-polish[0m[38;5;12m (https://github.com/ksopyla/awesome-nlp-polish)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mawesome-spanish-nlp[0m[38;5;12m (https://github.com/dav009/awesome-spanish-nlp)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mM. Weisser's list of NLP/Computational Linguistics Resources[0m[38;5;12m (https://martinweisser.org/corpora_site/comp_ling_resources.html)[39m

[38;2;255;187;0m[4mCommunities[0m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mLinguistics Stack Exchange[0m[38;5;12m (https://linguistics.stackexchange.com/)[39m
[48;5;12m[38;5;11m⟡[49m[39m[38;5;12m [39m[38;5;14m[1mUntranslatable.co, Multilingual urban dictionary[0m[38;5;12m (https://untranslatable.co/)[39m

[38;5;12mlinguistics Github: https://github.com/theimpossibleastronaut/awesome-linguistics[39m