If you haven’t already, it may help to read the introduction page for a general intuition of how intents fit together with the other parts of the buddy. Here, we build on that intuition to show how the intent-and-utterance classification system maps what the user says to the action needed.
There are many resources online that explain the various methods one may employ for text classification, and it is beyond the scope of this document to enumerate them. For the sake of intuition, though, we describe a toy version using the two sentences from the earlier intuition: ‘Take me to my cart’ and ‘I want to checkout’. One way to solve this problem is a direct-match lookup: we keep all these sentences in a lookup table that indexes sentences to intents. However, such a rigid approach makes designing the lookup table difficult, because we would have to account for every possible way a user may say something.
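To make the direct-match idea concrete, here is a minimal sketch of such a lookup table. The intent names used here are illustrative, not part of any Slang API:

```python
# A toy exact-match lookup: each known sentence maps directly to an intent.
# Intent names ('navigate', 'checkout') are illustrative placeholders.
lookup = {
    "take me to my cart": "navigate",
    "i want to checkout": "checkout",
}

def classify_exact(utterance):
    # Normalize lightly, then look the sentence up verbatim.
    # Any phrasing not enumerated in the table simply fails to match.
    return lookup.get(utterance.strip().lower())

print(classify_exact("Take me to my cart"))  # matches the table entry
print(classify_exact("Bring up my cart"))    # no entry, so no intent
```

Note how ‘Bring up my cart’ falls through even though a human would recognise it instantly; this is exactly the brittleness described above.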
Instead, we build some statistics on top of the provided utterances, which ‘softens’ the lookup so that we can also match sentences that are merely ‘close to’ the ones given. This is what machine learning does: it removes the burden of enumerating every possible phrasing.
However, there are limits to this system. It does not actually understand what you are saying the way a human would; rather, it performs math on what is being said, using the sentences it was given together with general knowledge, to match the user’s utterance to an intent. The problem with general knowledge is that it can sometimes be too general. Therefore, to make sure the system understands the domain better, it is best to supply as many example utterances as possible. About 50 per intent is a good number, taking care to make them as varied as possible.
In effect, if N intents have been specified for the buddy, the classifier has to decide whether the spoken utterance matches one of the N intents or none of them. The classifier therefore chooses among N+1 outcomes: each of the intents, plus ‘no intent’.
An intent has a name, a list of entities that belong to it, and a set of marked-up utterances that define it. The name acts as the identifier of the intent, even in the client app: if the classifier decides that the utterance the user spoke maps to that intent, it is this name that will be returned.
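The three parts of an intent can be pictured as a simple record. The field names and markup below are hypothetical, chosen only to make the structure concrete; the actual Slang schema may differ:

```python
from dataclasses import dataclass, field

# A hypothetical shape for an intent; field names are illustrative,
# not the actual Slang data model.
@dataclass
class Intent:
    name: str                                       # identifier returned to the client app
    entities: list = field(default_factory=list)    # entities that belong to this intent
    utterances: list = field(default_factory=list)  # marked-up example sentences

add_to_cart = Intent(
    name="add_to_cart",
    entities=["item"],
    utterances=["Add this [item](item) to the cart"],
)
print(add_to_cart.name)  # on a match, this name is what the classifier returns
```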
The utterances are sentences that are marked up with entities where possible. An utterance can therefore be thought of as a string of marked-up and plain text. The plain text helps the classifier identify the intent even if the exact words inside the entities are replaced. It also helps pinpoint the locations of entities when they are of an expandable type; more on entities in the Entities and Entity Types page.
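To illustrate the split between marked-up and plain text, the sketch below parses a made-up markup format, `[word](entity_name)`. The markup syntax is an assumption for illustration only; Slang’s actual markup may look different:

```python
import re

# Hypothetical markup: "[word](entity_name)". Used only to show how an
# utterance separates into plain text and entity spans.
ENTITY = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def split_utterance(utterance):
    """Return (plain_text, entities) for a marked-up utterance."""
    entities = [(m.group(1), m.group(2)) for m in ENTITY.finditer(utterance)]
    # Replace each marked-up span with its bare word to recover plain text.
    plain = ENTITY.sub(lambda m: m.group(1), utterance)
    return plain, entities

text, ents = split_utterance("Take me to my [cart](screen)")
print(text)  # Take me to my cart
print(ents)  # [('cart', 'screen')]
```

The surrounding plain words (‘Take me to my …’) are what lets the classifier still recognise the intent if ‘cart’ were swapped for another screen name.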
In this subsection we run through some examples of intents and utterances in context.
One is the ‘navigate’ use case. We may have utterances such as:
- Take me to my cart
- Show me my saved items
- Show me my inbox
- Take me back
Another intent could be ‘filter’.
- Show only black colour
- Show the ones with long sleeves
- I want to see office wear clothes
Another intent could be ‘adding to the cart’, with utterances such as:
- Add this item to the cart
- I'll take this item
- I want to buy this item
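Collected together, the three example intents above amount to a mapping from intent names to their example utterances. This is just an illustrative structure, not the actual Slang configuration format:

```python
# The example intents above as a name -> utterances mapping.
# Structure and names are illustrative, not the Slang config format.
training_utterances = {
    "navigate": [
        "Take me to my cart",
        "Show me my saved items",
        "Show me my inbox",
        "Take me back",
    ],
    "filter": [
        "Show only black colour",
        "Show the ones with long sleeves",
        "I want to see office wear clothes",
    ],
    "add_to_cart": [
        "Add this item to the cart",
        "I'll take this item",
        "I want to buy this item",
    ],
}
print(sorted(training_utterances))  # the N intents the classifier decides among
```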
Note that, technically, the utterances above contain examples of entities. We shall not concern ourselves with them here; they are dealt with in the Entities and Entity Types page.
Currently, two system intents are included by default with every buddy: slang_help and slang_cancel. When matched by a user utterance, these open the Slang help menu and dismiss the Slang surface, respectively.
The topics covered here, along with those on the Entities and Entity Types page and the Prompts page, should be enough to develop most apps and use cases. However, there are many tips and tricks we have learned over time that can help boost accuracy; those are included in the Advanced Topics section.