Communications - Natural Language Generation


In Chapter 5 we look at the directions of work and research within the field of Natural Language Generation. However, whilst Natural Language Understanding is a heavily researched field, most NLU systems, once they have identified what the user is saying, simply have the bot respond with either a fixed response (which may have some synonyms and alternatives for variety) or a template (e.g. "The train to X from Y leaves at Z") which is populated at run-time. This latter approach has been dismissed as "mail-merge" (Reiter and Dale, 1997), but it is by far the most common approach in commercial chatbots.
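The "mail-merge" approach can be sketched in a few lines. This is an illustrative toy, not the code of any particular chatbot framework; the response lists, template, and function names are all assumptions made for the example.

```python
import random

# Fixed responses with some synonyms and alternatives for variety
GREETINGS = ["Hello!", "Hi there!", "Good to see you!"]

# A template of the kind described above, populated at run-time
TRAIN_TEMPLATE = "The train to {destination} from {origin} leaves at {time}."

def respond_greeting() -> str:
    """Pick one of the canned greeting variants."""
    return random.choice(GREETINGS)

def respond_train(destination: str, origin: str, time: str) -> str:
    """Fill the slots of the fixed template with run-time values."""
    return TRAIN_TEMPLATE.format(destination=destination,
                                 origin=origin, time=time)

print(respond_train("London", "Oxford", "09:15"))
# → The train to London from Oxford leaves at 09:15.
```

The simplicity is exactly why this approach dominates commercially, and exactly why Reiter and Dale's "mail-merge" label sticks: nothing here models what is being said, only where the values go.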

With the rise of machine learning, work is under way to apply these techniques to NLG (e.g. Oh and Rudnicky, 2000; Wen et al., 2015). However, a large corpus is needed to make this approach work: Wen et al. describe using 1,000 dialogues just to implement a restaurant advice system.
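One trick that helps corpus-based systems like Wen et al.'s get by on a modest corpus is delexicalisation: concrete slot values in the training sentences are replaced with placeholder tokens, so the model learns sentence shapes rather than specific restaurants. A minimal sketch follows; the slot names and lexicon here are invented for illustration, not taken from Wen et al.'s dataset.

```python
import re

# Hypothetical slot lexicon for a restaurant-advice domain
SLOT_VALUES = {
    "NAME": ["Pizza Express", "The Golden Wok"],
    "FOOD": ["Italian", "Chinese"],
    "AREA": ["city centre", "north"],
}

def delexicalise(sentence: str) -> str:
    """Replace concrete slot values with placeholder tokens."""
    for slot, values in SLOT_VALUES.items():
        for value in values:
            sentence = re.sub(re.escape(value), f"<{slot}>", sentence)
    return sentence

print(delexicalise("Pizza Express serves Italian food in the city centre"))
# → <NAME> serves <FOOD> food in the <AREA>
```

At generation time the process runs in reverse: the model emits a delexicalised skeleton and the placeholders are filled from the dialogue state, much like the templates above.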

Reiter and Dale do provide a good description of the high-level process needed:


  1. Content determination (what needs to be communicated).
  2. Discourse planning (how each bit needs to be said – which may include micro-templates).
  3. Sentence aggregation (how all the bits will fit together).
  4. Lexicalization (the exact choice of words to reflect the expressed relations and concepts, some of which may be hard-coded).
  5. Referring expression generation (the exact choice of words to reflect the entities referred to).
  6. Linguistic realization (a tidy up to ensure that rules of grammar are being followed).
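The six steps above can be sketched as a pipeline of functions. This is a highly simplified toy under assumed data structures; every function body is a stand-in for what would be a substantial component in a real Reiter-and-Dale-style system.

```python
def content_determination(facts: dict) -> dict:
    # 1. Decide what needs to be communicated; here, everything we know.
    return facts

def discourse_planning(content: dict) -> list:
    # 2. Decide how each bit is to be said, and in what order.
    return [("departure", content)]

def sentence_aggregation(plan: list) -> list:
    # 3. Decide how the bits fit together (trivial with one message).
    return plan

def lexicalization(sentences: list) -> list:
    # 4. Choose words for the relations: a departure is realised as "leaves".
    return [("leaves", data) for _, data in sentences]

def referring_expression_generation(sentences: list) -> list:
    # 5. Choose words for the entities referred to: the train, stations, time.
    return [f"the train to {d['dest']} from {d['origin']} {verb} at {d['time']}"
            for verb, d in sentences]

def linguistic_realization(sentences: list) -> str:
    # 6. Tidy up: capitalisation and sentence-final punctuation.
    return " ".join(s[0].upper() + s[1:] + "." for s in sentences)

facts = {"dest": "London", "origin": "Oxford", "time": "09:15"}
result = facts
for stage in (content_determination, discourse_planning, sentence_aggregation,
              lexicalization, referring_expression_generation,
              linguistic_realization):
    result = stage(result)
print(result)
# → The train to London from Oxford leaves at 09:15.
```

Even in this toy form the pipeline makes the contrast clear: the template approach jumps straight from the facts to the final string, while each of the six stages here is a separate decision point that a fuller system could make intelligently.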


The end goal really ought to be a system that builds sentences the way a human does: the bot understands the user's question or statement, works out what it needs to communicate (step 1 above), and then works out what words, idioms and grammatical structures are needed to communicate this (steps 2–6 above). However, research in this area seems to be very limited. As we find good papers exploring this approach, and progress our own work, we'll add more information here.


Useful References

van Deemter, K., Theune, M., & Krahmer, E. (2005). Real versus template-based natural language generation: A false opposition? Computational Linguistics, 31(1), 15–24.

Oh, A. H., & Rudnicky, A. I. (2000). Stochastic language generation for spoken dialogue systems. In Proceedings of the 2000 ANLP/NAACL Workshop on Conversational Systems – Volume 3 (pp. 27–32). Association for Computational Linguistics.

Reiter, E., & Dale, R. (1997). Building applied natural language generation systems. Natural Language Engineering, 3(1), 57-87.

Wen, T. H., Gasic, M., Mrksic, N., Su, P. H., Vandyke, D., & Young, S. (2015). Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. In Proceedings of EMNLP (pp. 1711–1721). Lisbon, Portugal, September.