update lists

2025-07-18 22:22:32 +02:00
parent 55bed3b4a1
commit 5916c5c074
3078 changed files with 331679 additions and 357255 deletions


@@ -1,9 +1,9 @@
 Awesome Natural Language Generation !Awesome (https://awesome.re/badge.svg) (https://awesome.re)
!Piscis Magnus from BL Harley 647 (logo.png)
Natural Language Generation is a broad domain with applications in chatbots, story generation, and data descriptions. A wide spectrum of technologies addresses part or all of the NLG process. This list aims to represent this diversity of NLG applications and techniques by providing links to various projects, tools, research papers, and learning materials.
Contents
@@ -25,15 +25,15 @@
- Alex Context NLG Dataset (https://github.com/UFAL-DSG/alex_context_nlg_dataset) - A dataset for NLG in dialogue systems in the public transport information domain.
- Box-score data (https://github.com/harvardnlp/boxscore-data/) - This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.
- E2E (http://www.macs.hw.ac.uk/InteractionLab/E2E) - This shared task focuses on recent end-to-end (E2E), data-driven NLG methods, which jointly learn sentence planning and surface realisation from non-aligned data.
- Neural-Wikipedian (https://github.com/pvougiou/Neural-Wikipedian) - The repository contains the code and the corpora used to build a system that "learns" how to generate English biographies for Semantic Web triples.
- WeatherGov (https://cs.stanford.edu/~pliang/data/weather-data.zip) - Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.
- WebNLG (https://github.com/ThiagoCF05/webnlg) - The enriched version of WebNLG, a resource for evaluating common NLG tasks, including Discourse Ordering, Lexicalization and Referring Expression Generation.
- WikiBio - Wikipedia biography dataset (https://rlebret.github.io/wikipedia-biography-dataset/) - This dataset gathers 728,321 biographies from Wikipedia. It is intended for evaluating text generation algorithms.
- The Schema-Guided Dialogue Dataset (https://github.com/google-research-datasets/dstc8-schema-guided-dialogue) - The Schema-Guided Dialogue (SGD) dataset consists of over 20k annotated multi-domain, task-oriented conversations between a human and a virtual assistant.
- The Wikipedia company corpus (https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus) - Company descriptions collected from Wikipedia. The dataset contains semantic representations as well as short and long descriptions for 51K companies in English.
- YelpNLG (https://nlds.soe.ucsc.edu/yelpnlg) - YelpNLG provides resources for natural language generation of restaurant reviews.
Dialog
@@ -139,3 +139,5 @@
!CC0 (http://mirrors.creativecommons.org/presskit/buttons/88x31/svg/cc-zero.svg) (http://creativecommons.org/publicdomain/zero/1.0)
To the extent possible under law, TokenMill (https://www.tokenmill.ai) has waived all copyright and related or neighboring rights to this work.
nlg Github: https://github.com/accelerated-text/awesome-nlg