There's been quite a bit in the press recently about Meena,
Google's new chatbot powered by an "Evolved Transformer seq2seq" neural network model, which is even more powerful than the recently touted GPT-2. The bot has been trained on "341 GB of text, filtered from public domain social media conversations".
Whilst the results for general conversation seem impressive, I can't help but think that this is a bit of a dead end, as the bot really has no idea what it's talking about and is in no way directing the conversation; it's just coming up with the "best" response to each input, albeit taking the conversation to date into account. It just doesn't seem to have any intent or goals of its own.
More interestingly, though, Google has been measuring the performance of the bot (and using that as part of its reward function) with just two variables - how sensible is a reply, and how specific is it. Lots of bots go for sensibility at the expense of specificity: they come back with general statements ("why do you say that?") rather than actually trying to drive the conversation forward. For a while now we've been using
Grice's maxims (quantity, quality, relation, manner) as a way of assessing bot performance. These aim to cover similar ground, but the Google model may be even simpler and sharper - particularly when trying to train a human to do the assessing!
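
To make the two-variable idea concrete, here is a rough sketch of how such a score could be tallied from human ratings. It is only an illustration, not Google's actual scoring code: the `RatedReply` class, the `score_replies` function and the sample replies are all made up for this post, and it assumes each reply gets two yes/no judgements, that a reply which isn't sensible can't count as specific, and that the two rates are simply averaged.

```python
from dataclasses import dataclass


@dataclass
class RatedReply:
    """One bot reply plus a human rater's two yes/no judgements (hypothetical)."""
    text: str
    sensible: bool  # does the reply make sense in context?
    specific: bool  # is it specific to this conversation, not a stock phrase?


def score_replies(replies):
    """Return (sensibleness, specificity, average) as fractions of all replies."""
    if not replies:
        raise ValueError("need at least one rated reply")
    n = len(replies)
    sensibleness = sum(r.sensible for r in replies) / n
    # Assumption: a reply only counts as specific if it is also sensible.
    specificity = sum(r.sensible and r.specific for r in replies) / n
    return sensibleness, specificity, (sensibleness + specificity) / 2


# Made-up ratings for four replies.
ratings = [
    RatedReply("Why do you say that?", sensible=True, specific=False),
    RatedReply("I love tennis too, Federer's backhand is a joy.", True, True),
    RatedReply("Purple monkey dishwasher.", False, False),
    RatedReply("Tell me more.", True, False),
]

sensibleness, specificity, average = score_replies(ratings)
print(sensibleness, specificity, average)  # prints: 0.75 0.25 0.5
```

A bot that leans on stock phrases like "why do you say that?" scores well on the first number and badly on the second, which is exactly the trade-off described above.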