Language Modeling and Text Generation Using Hybrid Recurrent Neural Network

Samreen*, Muhammad Javed Iqbal, Iftikhar Ahmad, Suleman Khan, Rizwan Khan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

1 Citation (Scopus)

Abstract

As machines become increasingly capable of understanding complex behavior and solving problems that traditionally require human intelligence, automatic text generation (ATG) has attracted wide attention. Language modeling, or text generation, is the task of predicting the next character or word in a sequence based on an analysis of the input data. ATG enables machines to write, reducing human effort. It is also useful for understanding and analyzing languages, and it provides techniques that enable machines to exchange information in natural language. Text data are created at large scale everywhere (WhatsApp, Facebook, tweets, etc.) and are freely available online, so an effective system is needed to automate text generation and to analyze text data in order to extract meaningful information. This work presents a case study on how to develop a text generation model for the English language using a hybrid recurrent neural network (HRNN). The explored model learns the dependencies between characters and the conditional probabilities of characters in sequences from the available input text, and generates wholly new character sequences that resemble human writing (correct in meaning, spelling, and sentence structure). A comprehensive comparison of four models, namely LSTM, deep LSTM, GRU, and HRNN, is also presented. RNN models were previously used for text prediction and automatic text generation, but they suffer from the vanishing gradient problem (short memory) when processing long text; the GRU and LSTM models were created to solve this problem. However, text generated by GRU and LSTM contains many spelling errors and incorrect sentence structures, so the HRNN model is explored to fill this gap. The HRNN model is a combination of LSTM, GRU, and a dense layer. Experiments were performed on the Penn Treebank, Shakespeare, and Nietzsche datasets. The HRNN model achieves a perplexity of 3.27, 1.18 bits per character, and an average word prediction accuracy of 0.63. Compared with the baseline work and the previous models (LSTM, deep LSTM, and GRU), the HRNN model achieves lower perplexity and fewer bits per character. The texts generated by HRNN have fewer spelling errors and sentence structure mistakes. A closer analysis of the explored models' performance and efficiency is provided through graph plots and through texts generated from sample input strings. These graphs illustrate the performance of each model.
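For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how bits per character and perplexity are commonly computed for a character-level model. It uses the standard textbook definitions with base-2 logarithms; the abstract does not state the exact formula or logarithm base used in the chapter, so this should not be read as the authors' evaluation code. The function names and the sample probabilities are hypothetical.

```python
import math

def bits_per_character(probs):
    """Average negative log2 probability assigned to each true next character."""
    return -sum(math.log2(p) for p in probs) / len(probs)

def perplexity(probs):
    """Perplexity as 2 raised to the average per-character bits."""
    return 2 ** bits_per_character(probs)

# Hypothetical probabilities a model assigned to the correct next character
# at four positions of a test string (illustrative values only).
probs = [0.5, 0.25, 0.5, 0.125]

print(bits_per_character(probs))
print(perplexity(probs))
```

Under these definitions, lower values are better for both metrics: a model that always assigns probability 1 to the correct character has 0 bits per character and a perplexity of 1.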

Original language: English
Title of host publication: Deep Learning for Unmanned Systems
Editors: Anis Koubaa, Ahmad Taher Azar
Place of Publication: Cham
Publisher: Springer
Pages: 669-687
Number of pages: 19
Edition: 1st
ISBN (Electronic): 9783030779399
ISBN (Print): 9783030779382, 9783030779412
Publication status: Published - 2 Oct 2021

Publication series

Name: Studies in Computational Intelligence
Publisher: Springer
Volume: 984
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503
