How AI Doesn’t Speak Like a Robot

I often go on about how my AI characters can’t just speak to me in their own words, and how, if they only could, it would really demonstrate the capabilities of my Extreme AI personality engine (and possibly let my AI tell me just what it thought of all this). While that’s an admirable end goal, spending a little time working on even simplistically adaptable text is more productive than just whining about full natural language processing, so I’m going to write a little about my explorations in trying to get speech to “work” in my SMMG (that’s “Sports Management Mystery Game” to those not yet in the know).

By “work” I mean a kind of semi-procedurally generated text that is both created for reports (say, when your AI head coach tells you about the team’s practice that week or when the AI press writes up how your team did in actual games) and influenced by the reporter’s personality. The text shouldn’t read like it’s being said by a robot, and it has certain beats to hit: advancing the plot of the game and giving you (the Player) the information you need to play. (I’ll also explore how best to save these kinds of conversations, and any emotional ties to them, in the characters’ memory using RealMemory, but that’ll be in a later article, I think.)

I’m sure I’m approaching this from a fairly simplistic, naïve POV, as procedurally generated text isn’t my expertise; if you’re looking for that, definitely visit Emily Short’s blog and other similar sites for a deeper insight. (I have, but I know I’ve only scratched the surface; plus, I’ve only researched far enough to be able to implement the text in my specific game situation. Part of trying not to get lost in a sea of interesting information and never managing to complete my game, lol.)

Let’s take the example of Practice. In the SMMG, each week you can have your team practice (or not, but that leads to out-of-shape players, resentment, and losses). Each of your players can focus on a specific skill (one of six, such as “catching” or “running”) or have a more general practice that spans the lot. The results of this practice can range from “great” (definitely going to help in a real match) to “terrible” (practice has managed to make the player worse). As manager/owner, you find out about these results from the head coach. (You can also get an idea of which kinds of things a player is good at/enjoys doing from the results.)

Originally, I had the coach tell you results using a simple iteration for each player that evaluated each area in which a player practiced and gave the result, such as (for a general practice covering all kinds of skills):

Did great at running. Did well at catching. Did terribly at kicking. Did average at run defense. [etc]
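In code, that first pass amounted to something like the following. (A minimal Python sketch; the game itself isn’t written in Python, and all the names here are made up.)

```python
# Hypothetical practice results for one player: skill -> how it went.
results = {
    "running": "great",
    "catching": "well",
    "kicking": "terribly",
    "run defense": "average",
}

def linear_report(results):
    """Spit out one flat sentence per practiced skill, in order."""
    return " ".join(f"Did {level} at {skill}." for skill, level in results.items())

print(linear_report(results))
# -> Did great at running. Did well at catching. Did terribly at kicking. Did average at run defense.
```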

This gets the information across, but that’s about it. It doesn’t sound like a person is telling you this, and it certainly doesn’t sound like a person with any sort of personality is saying any of it. I guess it’s a kind of procedural text generation, but … well, not really. Actually, to me it reads like a computer printout on an old dot-matrix printer. I can almost hear the sound … rat-a-tat whir, rat-a-tat whir …

Next I decided that if my text were to sound more human, it should be processed more like a human processes info, rather than just being spit out linearly. My second iteration thus tried a little preprocessing, in a limited way, of the information that the coach was about to give you. She would actually “think” about a player’s overall practice (in code, that meant creating lists that kept track of which skills were done to what effect before saying anything about them, such as a simple list of strings called “practiceGreat” including every skill practiced at that level, and other lists for each of the other levels). Now we have a choice: The coach’s report can either be robotic as before, or (by adding a simple grammar parser as well) can say something like “DotsPlayer1 did really well in running, catching, and kicking, but was terrible at pass defense and run defense. Otherwise, was very average.”
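Roughly, that “think first, speak second” step looks like this. (Again an illustrative Python sketch with hypothetical names; the real code just builds per-level string lists like practiceGreat before the coach says anything.)

```python
from collections import defaultdict

def group_by_level(results):
    """'Think' about the practice first: bucket skills by how well they went."""
    buckets = defaultdict(list)
    for skill, level in results.items():
        buckets[level].append(skill)
    return buckets

def join_list(items):
    """Tiny grammar helper: 'a', 'a and b', or 'a, b, and c'."""
    if len(items) == 1:
        return items[0]
    sep = ", and " if len(items) > 2 else " and "
    return ", ".join(items[:-1]) + sep + items[-1]

results = {
    "running": "great", "catching": "great", "kicking": "great",
    "pass defense": "terrible", "run defense": "terrible",
    "punting": "average",
}
b = group_by_level(results)
report = (f"DotsPlayer1 did really well in {join_list(b['great'])}, "
          f"but was terrible at {join_list(b['terrible'])}. Otherwise, was very average.")
print(report)
```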

This is much better. It reads like real sentences, especially the sort of thing someone would write in a report. It does become kind of repetitive when written out for several players in a row, though. And it also still doesn’t take into account personality: every AI coach on every team would say the same thing the same way, which (since in this case you don’t know what any other coach said to his or her respective manager) may not be a big deal, but doesn’t make the text generator very adaptable. Plus, it doesn’t take into account change in your coach’s personality or attitude toward you over time.

In terms of just making the sentences a little more variable and realistic, one can add some simple code (just checking how many times she’s talked about a general practice already, and adapting the opening) to create sentences like these:

Practiced everything. Did really well at kicking, and made many good choices. Otherwise very average.

Also practiced everything. Did really well at pass defense and kicking, and had a good week defending against the run. Otherwise average.

Had a go at everything. Did well at running and catching, but was terrible at kicking. Otherwise average.

This kind of thing just requires conditional statements that take into account how many players have already been described, and maybe a random element as we get further down the list. For one report, this works great, actually. I could even “add” personality quirks to the way it’s said, all hard-coded. But how lovely it would be to make it a bit more automated: to give the coach the ability to take this information and speak for herself in a limited way, without my having to hard-code every block, set of conditions, and personality variance possible throughout the game.
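The conditional opener might be sketched as follows. (Illustrative Python; the phrase list and the point at which randomness kicks in are invented.)

```python
import random

# Hypothetical openers, roughly matching the examples above.
OPENINGS = ["Practiced everything.", "Also practiced everything.", "Had a go at everything."]

def opening(player_index, rng=random):
    """Fixed openers for the first couple of players, then pick at random
    so later entries in the report don't all start the same way."""
    if player_index < 2:
        return OPENINGS[player_index]
    return rng.choice(OPENINGS[1:])
```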

Using ExAI, one can check the coach’s current feelings and attitudes toward the Player (you), and her current overall personality, very easily. The difficulty is in knowing which personality facets will make her say which kinds of things.

Luckily, in reading Emily Short’s blog I came across links that led me to a series of articles on the Personage project (a project of François Mairesse and Marilyn A. Walker; see refs at end). Between this and works referenced in my own Master’s project (Costa and McCrae, 1995; John et al., 2010; and Saucier and Ostendorf, 1999*), I was able to create a more research-based connection between a character’s language and her Big Five personality traits (I say “more” because I don’t think you can with certainty predict what someone will say, or what sentence structure they will use, etc., but you can make generalizations that help when determining what an AI character will say in a limited situation such as this one). Averaging our coach’s underlying facets to give us her Big Five, we get:

  • Openness 43
  • Conscientiousness 71
  • Extraversion 69
  • Agreeableness 53
  • Neuroticism 33

There are many very complex ways to look at this (see the 59-page Personage paper), but in a really general way for our purposes here, we can say the following:

  • Mid-range Openness shouldn’t have as much of an effect as other, more extreme scores.
  • High Conscientiousness should lead to a high ratio of positive to negative emotion words, good information, and getting straight to the point. It would also indicate fewer swear words, more hedges, and longer words.
  • The high Extraversion score would indicate less-formal sentences, few tentative phrases, few hedges/softeners, few errs and ums, more near-swear words, verbal exaggeration, shorter words, and a less-rich vocabulary.
  • Mid-range Agreeableness shouldn’t have as much of an effect as other scores.
  • Low Neuroticism should lead to calmness, few conjunctions, few pronouns, many articles, less exaggeration, and again shorter words plus a less-rich vocabulary.
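As a very rough sketch, those generalizations could become a naive trait-to-style mapping like the one below. (Illustrative Python; the thresholds, flag names, and last-rule-wins conflict handling are all my own guesses, not Personage’s weighting scheme.)

```python
def style_from_big_five(o, c, e, a, n):
    """Map Big Five scores (0-100) to a few crude style knobs.
    Mid-range Openness and Agreeableness get no rules here, per the notes above."""
    style = {
        "hedges": "few", "fillers": "few", "exaggeration": "low",
        "vocab_richness": "normal", "sentence_length": "medium",
    }
    if e > 60:                       # extraverted: informal and direct
        style["fillers"] = "few"
        style["vocab_richness"] = "low"
        style["sentence_length"] = "short"
    if c > 60:                       # conscientious: to the point, positive wording
        style["positivity"] = "high"
        style["hedges"] = "more"     # conflicts with high extraversion; here the
                                     # last rule simply wins (a real system would weight them)
    if n < 40:                       # calm: little exaggeration
        style["exaggeration"] = "low"
        style["vocab_richness"] = "low"
    return style

coach = style_from_big_five(o=43, c=71, e=69, a=53, n=33)
```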

Some of these work against one another, and one could create a complex series of weights to figure out what wins out/how they affect one another, which I believe the Personage project does (I can’t actually get the project files themselves to work, unfortunately). For my simple solution, my overall take on this is that the coach would get straight to the point in her reports, would tend to vary her word choice but avoid overly negative phrasing, and wouldn’t be enamored of long sentences or flowery language. She generally wouldn’t hesitate or use verbal placeholders like uh, um, and er. Interestingly, this is basically what I’ve already written for her, although maybe the sentences can get a bit long with the grammar parser (so maybe my writer’s ear for this sort of thing is working 🙂 ). She’s not one for varying her vocab a whole bunch, so maybe the repetition in the phrasing is workable—although I still don’t want her to sound un-human.

However, in the interest of there being a little variation in her speech, especially over the course of many weeks, I could change the probabilities of using one of a limited number of phrases or words, still keeping to the general rules above. And, of course, if I were using this for a character with high verbal variability, I’d need to provide more opportunities for varied speech: varied phrasing, types of words, etc. And if I were to try to use the same set of code for several characters, I’d need even more complexity. But that would start to be an engine unto itself, and that’s a whole other ball game.
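The weighted-phrase idea is easy to sketch. (Illustrative Python; the phrases and probabilities are invented.)

```python
import random

def pick_phrase(options, weights, rng=random):
    """Choose among a small set of equivalent phrases with tunable probabilities,
    so a character with low verbal variability mostly repeats herself."""
    return rng.choices(options, weights=weights, k=1)[0]

# Our coach leans heavily on her usual phrasing:
well_phrases = ["did really well at", "had a good week with", "shone at"]
print(pick_phrase(well_phrases, weights=[0.7, 0.2, 0.1]))
```

Raising the weights on the rarer phrases is then all it takes to give a more verbally varied character her wider range.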

Also, when using this type of variability even for a single character but over a long period (such that her personality could change over time, as in ExAI or real life), you’d need to have her speech adapt to these changes; e.g., if she became more extraverted, those qualities associated with extraversion would become more pronounced (or would be likely to become more pronounced).
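A drift sketch (illustrative Python, made-up numbers): nudge the trait score over time and re-derive the style flags from the updated value each week, rather than baking them in.

```python
def drift(trait, delta, lo=0, hi=100):
    """Nudge a Big Five score over time, clamped to its 0-100 scale."""
    return max(lo, min(hi, trait + delta))

extraversion = drift(69, +8)         # the coach grows more outgoing over the season
informal_speech = extraversion > 60  # style flags recomputed from the new score, so
                                     # extraversion-linked qualities become more pronounced
```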

Yikes! I seem to have gone a bit overboard here; I wasn’t trying to write a paper of my own! I’ll save the discussion of press reports for another time.

‘Til later!

(As always, comments on this blog should either be tweeted to me @QuantumTigerAI or emailed to one of QTG’s contact emails. The actual comments section has been overrun by spam bots, as has my Forum, and I don’t have time to weed through them every day.)

*Full refs are as follows:

Costa, PT, Jr and McCrae, RR 1995, ‘Domains and Facets: Hierarchical Personality Assessment Using the Revised NEO Personality Inventory’, Journal of Personality Assessment, vol. 64, no. 1, pp. 21-50.

John, OP et al. 2010, ‘Paradigm Shift to the Integrative Big Five Trait Taxonomy: History, Measurement, and Conceptual Issues’, in John, OP et al. (eds), Handbook of Personality: Theory and Research, 3rd edn, Guilford, New York. (US edition.)

Mairesse, F and Walker, MA [n.d.], ‘Can Conversational Agents Express Big Five Personality Traits through Language?: Evaluating a Psychologically-Informed Language Generator’, available online.

Saucier, G and Ostendorf, F 1999, ‘Hierarchical Subcomponents of the Big Five Personality Factors: A Cross-Language Replication’, Journal of Personality and Social Psychology, vol. 76, no. 4, pp. 613-627.

Short, E [various dates], Emily Short’s Interactive Storytelling, blog, available online.
