short-paper

Texygen: A Benchmarking Platform for Text Generation Models

Authors:

Yong Yu

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

Pages 1097 - 1100

https://doi.org/10.1145/3209978.3210080

Published: 27 June 2018

Abstract

We introduce Texygen, a benchmarking platform to support research on open-domain text generation models. Texygen not only implements a majority of text generation models but also covers a set of metrics that evaluate the diversity, quality, and consistency of the generated texts. The platform could help standardize research on text generation and improve the reproducibility and reliability of future work in the field.
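As a rough illustration of what scoring generated text for quality and diversity can look like, the sketch below uses clipped n-gram precision in two ways: against real sentences (a quality proxy) and against the other generated sentences (a diversity proxy, where a higher value means less diverse output). The abstract does not name the platform's metrics, so the choice of n-gram precision and the names `ngram_precision`, `quality_score`, and `self_overlap` are assumptions made for illustration, not Texygen's implementation or API.

```python
from collections import Counter
from typing import List, Sequence

Tokens = Sequence[str]


def ngrams(tokens: Tokens, n: int) -> Counter:
    """Count the n-grams in one tokenized sentence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate: Tokens, references: List[Tokens], n: int = 2) -> float:
    """Clipped n-gram precision of a candidate against a set of references."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    # For each n-gram, keep the largest count seen in any single reference.
    max_ref: Counter = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as it appears in a reference.
    matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())


def quality_score(generated: List[Tokens], real: List[Tokens], n: int = 2) -> float:
    """Hypothetical quality proxy: mean precision of generated text against real text."""
    if not generated:
        return 0.0
    return sum(ngram_precision(s, real, n) for s in generated) / len(generated)


def self_overlap(generated: List[Tokens], n: int = 2) -> float:
    """Hypothetical diversity proxy: mean precision of each generated sentence
    against all the other generated sentences (higher means less diverse)."""
    if not generated:
        return 0.0
    scores = []
    for i, sentence in enumerate(generated):
        others = generated[:i] + generated[i + 1:]
        scores.append(ngram_precision(sentence, others, n))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    real = [["the", "cat", "sat"], ["a", "dog", "ran"]]
    repetitive = [["the", "cat", "sat"], ["the", "cat", "sat"]]
    varied = [["the", "cat", "sat"], ["a", "dog", "ran"]]
    print(quality_score(repetitive, real), self_overlap(repetitive))
    print(quality_score(varied, real), self_overlap(varied))
```

Reporting the two numbers side by side is the point of the sketch: a generator that repeats one fluent sentence can score well on the quality proxy while scoring poorly on the diversity proxy, which is why a benchmark needs both kinds of metric.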

Cited By

Hajipoor O, Nickabadi A, Homayounpour M (2025) GPTGAN: Utilizing the GPT language model and GAN to enhance adversarial text generation. Neurocomputing 617 (128865). Online publication date: Feb-2025
https://doi.org/10.1016/j.neucom.2024.128865
Feher D, Khered A, Zhang H, Batista-Navarro R, Schlegel V (2025) Learning to generate and evaluate fact-checking explanations with transformers. Engineering Applications of Artificial Intelligence 139:PA. Online publication date: 1-Jan-2025
https://dl.acm.org/doi/10.1016/j.engappai.2024.109492
Wang A, Jiang J, Ma Y, Liu A, Okazaki N (2024) Generative Data Augmentation for Aspect Sentiment Quad Prediction. Journal of Natural Language Processing 31:4 (1523-1544). Online publication date: 2024
https://doi.org/10.5715/jnlp.31.1523

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Natural language generation

Published In

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

June 2018

1509 pages

ISBN:9781450356572

DOI:10.1145/3209978

General Chairs: Kevyn Collins-Thompson (University of Michigan, United States), Qiaozhu Mei (University of Michigan, United States)

Program Chairs: Brian Davison (Lehigh University, United States), Yiqun Liu (Tsinghua University, China), Emine Yilmaz (University College London, United Kingdom)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

DIDI Collaborative Research Funds
Shanghai Sailing Program
National Natural Science Foundation of China
MSRA Collaborative Research

Conference

SIGIR '18

Sponsor:

SIGIR

SIGIR '18: The 41st International ACM SIGIR conference on research and development in Information Retrieval

July 8 - 12, 2018

Ann Arbor, MI, USA

Acceptance Rates

SIGIR '18 Paper Acceptance Rate: 86 of 409 submissions, 21%

Overall Acceptance Rate: 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 163
Total Downloads: 1,703
Downloads (Last 12 months): 218
Downloads (Last 6 weeks): 30

Reflects downloads up to 19 Feb 2025

