Hi Crispy,
Well, while Alice is not the 'perfect' dialog machine, the things you say it lacks are actually at the core of what Alice is all about. Unfortunately, the ALICE that you talk to on alicebot.org is a general Alice that uses a very general and limited AIML set. In order to converse with the 'real' Alice (also known as Silver Alice), you need to be a member of the Alice AI Foundation. You are basically talking to the 'base' set of AIML categories that is distributed as a starting point for bot creation, with a small set of ALICE personality thrown in.
That being said, the concept you are talking about of 'synonyms' is at the heart of the ALICE paradigm. This is what is known as 'symbolic reduction'.
Let's take your case example.
quote:
For instance, consider the following example as input: "I'm taken aback". While ALICE, with all its knowledge base, searches for a match for "be taken aback", it fails to take the shortcut of finding a logical answer to the statement rather than a hard-coded one. Suppose "be taken aback" wasn't added to the bot's knowledge base, but a list of synonyms, which requantify the size of the knowledge base. Suppose "be taken aback" was resolved to a more common word or phrase, such as "astonished" before being processed, which wouldn't have the bot match for "I'm taken aback", but rather "I'm astonished". If "astonished" was chosen as the base synonym for also "very surprised", "speechless", etc., this would highly simplify the pattern matcher's job and liability to actually find a match on the whole. In my view, this would far better serve as AI than direct matching to the input.
In AIML, we would do it like this:
<category>
  <pattern>I M VERY SURPRISED</pattern>
  <template>For some reason, that doesn't surprise me.</template>
</category>
<category>
  <pattern>_ ASTONISHED</pattern>
  <template><srai><star index="1"/> VERY SURPRISED</srai></template>
</category>
<category>
  <pattern>_ TAKEN ABACK</pattern>
  <template><srai><star index="1"/> ASTONISHED</srai></template>
</category>
With this set of categories, this would be the dialog:
Client: I'm taken aback!
Alice: For some reason, that doesn't surprise me.
The SRAI tag is the symbolic reduction tag. First it expands the tags and text inside the SRAI element, and then it takes what is generated and passes it back through the pattern matcher. So first it reduces I M TAKEN ABACK to I M ASTONISHED and passes I M ASTONISHED back through the Graphmaster. I M ASTONISHED is further reduced to I M VERY SURPRISED, which is fed into the Graphmaster again and matches a category that has no SRAI tag. One could also do this:
<category>
  <pattern>_ VERY _</pattern>
  <template><srai><star index="1"/> <star index="2"/></srai></template>
</category>
<category>
  <pattern>I M SURPRISED</pattern>
  <template>For some reason, that doesn't surprise me.</template>
</category>
In this case, the first category effectively removes the extraneous 'very', which reduces I M VERY SURPRISED to I M SURPRISED. It would also reduce I M VERY VERY VERY VERY VERY VERY VERY VERY VERY VERY VERY SURPRISED to I M SURPRISED, because each pass back through the matcher strips another VERY until none are left.
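If it helps to see the reduction loop outside of AIML, here is a rough Python sketch of the first set of categories above. It is only a toy: RULES, normalize, match and respond are names I made up for illustration, it only understands a leading _ wildcard, and a real interpreter walks the Graphmaster tree rather than a flat list.

```python
# Toy model of SRAI-style symbolic reduction (not a real AIML interpreter).
# Each hypothetical rule maps a pattern to either a final reply or a
# reduction that is fed back into the matcher.
RULES = [
    # (pattern, kind, template); a leading "_" binds one or more words
    ("_ TAKEN ABACK", "srai", "{star} ASTONISHED"),
    ("_ ASTONISHED", "srai", "{star} VERY SURPRISED"),
    ("I M VERY SURPRISED", "reply", "For some reason, that doesn't surprise me."),
]


def normalize(text):
    """Uppercase the input and turn punctuation into spaces (I'm -> I M)."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.upper())
    return " ".join(cleaned.split())


def match(pattern, words):
    """Return the words bound to a leading "_" ("" if none), or None on no match."""
    p = pattern.split()
    if p[0] == "_":
        tail = p[1:]
        if len(words) > len(tail) and words[-len(tail):] == tail:
            return " ".join(words[:-len(tail)])
        return None
    return "" if p == words else None


def respond(text, max_depth=10):
    """Reduce the input until a rule with a plain reply matches."""
    current = normalize(text)
    for _ in range(max_depth):  # guard against reductions that never terminate
        words = current.split()
        for pattern, kind, template in RULES:
            star = match(pattern, words)
            if star is None:
                continue
            if kind == "reply":
                return template
            current = template.format(star=star)  # reduce, then match again
            break
        else:
            return None  # no rule matched at all
    return None


print(respond("I'm taken aback!"))
```

The max_depth guard is my own addition; it is there because a careless set of reductions can loop forever.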
In addition to the inherent AIML reduction, most of the AIML interpreters also have a file called 'substitutions.xml', which is a list of synonyms just like you described, as well as substitutions for first/second/third person conversions.
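To give a feel for what such a substitution pass does, here is a small Python sketch. Everything in it is an assumption for illustration: the SYNONYMS and PERSON tables and both function names are made up, the word lists are tiny, and the actual file format and contents differ from interpreter to interpreter.

```python
import re

# Hypothetical substitution tables; real interpreters ship their own lists.
SYNONYMS = {
    "TAKEN ABACK": "ASTONISHED",
    "SPEECHLESS": "ASTONISHED",
}
PERSON = {"I": "YOU", "MY": "YOUR", "ME": "YOU", "YOU": "I", "YOUR": "MY"}


def substitute_phrases(text, table):
    """Replace whole-word phrases, trying the longest phrases first."""
    for phrase in sorted(table, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", table[phrase], text)
    return text


def swap_person(text):
    """Swap first and second person word by word in a single pass."""
    return " ".join(PERSON.get(word, word) for word in text.split())


print(substitute_phrases("I M TAKEN ABACK", SYNONYMS))
print(swap_person("I LIKE YOUR HAT"))
```

The person swap is done in one pass over the words so that an I that just became YOU is not immediately turned back into I.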
Now let's take your example a little bit further... What if we wanted ALICE to respond in context to the conversation? There are a couple of ways to facilitate this. First the THAT method. Consider the following categories:
<category>
  <pattern>WHAT IS DADA</pattern>
  <template>The first thing you hear when a shark approaches.</template>
</category>
<category>
  <pattern>* YOUR SEXUAL PREFERENCE</pattern>
  <template>I am gay.</template>
</category>
<category>
  <pattern>I M VERY SURPRISED</pattern>
  <template>For some reason, that doesn't surprise me.</template>
</category>
<category>
  <pattern>I M VERY SURPRISED</pattern>
  <that>* GAY</that>
  <template>Does my sexual preference really bother you?</template>
</category>
<category>
  <pattern>_ ASTONISHED</pattern>
  <template><srai><star index="1"/> VERY SURPRISED</srai></template>
</category>
<category>
  <pattern>_ TAKEN ABACK</pattern>
  <template><srai><star index="1"/> ASTONISHED</srai></template>
</category>
Possible dialog:
Client: What is Dada?
Alice: The first thing you hear when a shark approaches.
Client: I'm taken aback!
Alice: For some reason, that doesn't surprise me.
Client: What is your sexual preference?
Alice: I am gay.
Client: I'm taken aback!
Alice: Does my sexual preference really bother you?
...
In this case, there are two categories for I M VERY SURPRISED: one default, and one with a THAT. THAT refers to the last thing Alice said, and you can use all the wildcards in the THAT as well. Now any time you say a phrase that evaluates to I M VERY SURPRISED, it will first match the pattern I M VERY SURPRISED, and then check the THAT. THAT is added to the end of the search sequence as follows:
I'M TAKEN ABACK is first evaluated to I M VERY SURPRISED and then expanded to:
I M VERY SURPRISED <that> I AM GAY
And then the category pattern is expanded to:
I M VERY SURPRISED <that> * GAY
So we get a match.
....
The idea of context can be expanded further into topics. You can embed categories in topics, and they will only be evaluated if their topic is set as the current topic. Some interpreters allow various types of nesting of topics, but I don't think the standard defines topic nesting. Anyway, here is an example:
<category>
  <pattern>WHAT IS DADA</pattern>
  <template>The first thing you hear when a shark approaches.</template>
</category>
<category>
  <pattern>* YOUR SEXUAL PREFERENCE</pattern>
  <template>
    <condition name="botsexualpreference">
      <li value="hetero">I'm straight as an arrow!</li>
      <li value="homo">I'm gay.</li>
      <li>Uhhh... I'm not really sure?</li>
    </condition>
    <think>
      <set name="topic">sexual preference</set>
    </think>
  </template>
</category>
<category>
  <pattern>I M VERY SURPRISED</pattern>
  <template>For some reason, that doesn't surprise me.</template>
</category>
<topic name="sexual preference">
  <category>
    <pattern>I M VERY SURPRISED</pattern>
    <template>Sexual preference is a boring subject, I'm surprised it surprises you.</template>
  </category>
  <category>
    <pattern>*</pattern>
    <template>Hmmmm. So, what is your preference?</template>
  </category>
</topic>
<category>
  <pattern>_ ASTONISHED</pattern>
  <template><srai><star index="1"/> VERY SURPRISED</srai></template>
</category>
<category>
  <pattern>_ TAKEN ABACK</pattern>
  <template><srai><star index="1"/> ASTONISHED</srai></template>
</category>
Client: What is your sexual preference?
Alice: I'm straight as an arrow!
Client: I'm taken aback!
Alice: Sexual preference is a boring subject, I'm surprised it surprises you.
Client: Well it does!
Alice: Hmmmm. So, what is your preference?
The TOPIC is expanded just like the THAT. In a normal input, this is what it would be like:
WHAT IS YOUR SEXUAL PREFERENCE <that> * <topic> *
Then the second input would look like:
I M VERY SURPRISED <that> I M STRAIGHT AS AN ARROW <topic> SEXUAL PREFERENCE
and the relevant category would be defined as:
I M VERY SURPRISED <that> * <topic> SEXUAL PREFERENCE
Therefore, a match.
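Here is a rough Python sketch of that expansion, assuming the match path is laid out as INPUT, then a that marker, then a topic marker. The CATEGORIES list, the function names and the "most literal words wins" ranking are all my own simplifications for illustration; a real Graphmaster resolves priority word by word as it walks the tree.

```python
# Toy matcher for "INPUT <that> THAT <topic> TOPIC" paths (a sketch only).
# "*" matches one or more words; the categories below are hypothetical.
CATEGORIES = [
    ("I M VERY SURPRISED <that> * <topic> *",
     "For some reason, that doesn't surprise me."),
    ("I M VERY SURPRISED <that> * <topic> SEXUAL PREFERENCE",
     "Sexual preference is a boring subject, I'm surprised it surprises you."),
    ("* <that> * <topic> SEXUAL PREFERENCE",
     "Hmmmm. So, what is your preference?"),
]


def build_path(user_input, that="*", topic="*"):
    """Join normalized input, last bot reply and topic into one match path."""
    return f"{user_input} <that> {that} <topic> {topic}"


def wild_match(pattern, path):
    """Word-level match where "*" stands for one or more words."""
    p, w = pattern.split(), path.split()

    def go(i, j):
        if i == len(p):
            return j == len(w)
        if p[i] == "*":
            # try every non-empty run of words for this wildcard
            return any(go(i + 1, k) for k in range(j + 1, len(w) + 1))
        return j < len(w) and p[i] == w[j] and go(i + 1, j + 1)

    return go(0, 0)


def specificity(pattern):
    """Count the literal (non-wildcard) words; a crude stand-in for priority."""
    return sum(1 for word in pattern.split() if word != "*")


def respond(user_input, that="*", topic="*"):
    """Return the reply of the most specific category that matches the path."""
    path = build_path(user_input, that, topic)
    ranked = sorted(CATEGORIES, key=lambda c: specificity(c[0]), reverse=True)
    for pattern, reply in ranked:
        if wild_match(pattern, path):
            return reply
    return None


print(respond("I M VERY SURPRISED"))
print(respond("I M VERY SURPRISED",
              that="I M STRAIGHT AS AN ARROW",
              topic="SEXUAL PREFERENCE"))
```

With no topic set, only the default category can match. Once the topic is SEXUAL PREFERENCE, the topic category is more specific and wins, and any other input inside that topic falls through to the topic's catch-all.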
...
Anyway, if you really need more powerful cognitive functionality and actual concept evaluation, you might consider looking into Thought Treasure. It is an open source project that combines a variety of different methods, from simple pattern matching to hardcore natural language processing. However, I tend to get better conversations out of AIML. Personally, I am currently adding Thought Treasure style cognition to my AIML interpreter, as I find XML a lot easier to work with in a content-oriented way, and ALICE's means of robust dialog personality are far superior.
Peace
[edited by - krippy2k on March 5, 2004 8:22:33 AM]