Updated by Thays Oliveira
In the Settings section, you can manage general information about your intelligence, such as its Name, Description, Language, Categories, prediction algorithm, and AI options.
In this article, we will learn what each of these means and how to manage them in the best way possible for your bot's needs.
- Name: it's how your intelligence is identified in the platform by the community. It's a good practice to name your intelligence in a way that its purpose and application are clear for other users.
- Description: a place where you can explain to other users of the community what your bot does and how to contribute and improve it, if needed! You can list intents and entities and describe how each of them works in your intelligence. You can build your description in Markdown, if you want to.
- Language: the default language of your bot. You can select any language you want as the base language for your intelligence.
- Category: a classification by occupation area. It helps the community understand the applications and scope of your intelligence. For example, a bot that has intents related to medical issues can be classified in the Health category. You can select one or more categories for your intelligence.
There are five different prediction algorithms on BotHub:
1. Neural Network with Internal Vocabulary
2. Neural Network with External Vocabulary
3. Transformer Neural Network with Internal Vocabulary (recommended)
4. Transformer Neural Network with Word Embedding External Vocabulary
5. Transformer Neural Network with BERT Word Embedding (recommended)
Algorithms with Internal Vocabulary (1, 3)
These algorithms use only the vocabulary in the training dataset to perform their classifications.
One disadvantage of this training method is that it needs a large number of training phrases to make accurate predictions.
It is a good option for intelligences that have a more specific context.
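To illustrate why an internal vocabulary needs many training phrases, here is a minimal sketch in Python. It is not BotHub's implementation: the whitespace tokenizer, the function names, and the sample phrases are all assumptions made for this example. It only shows that words missing from the training dataset carry no information for the model.

```python
def build_vocabulary(training_phrases):
    """Collect every lowercase token that appears in the training phrases."""
    vocab = set()
    for phrase in training_phrases:
        vocab.update(phrase.lower().split())
    return vocab


def known_tokens(sentence, vocab):
    """Keep only the tokens that were seen during training."""
    return [tok for tok in sentence.lower().split() if tok in vocab]


# Hypothetical training phrases, for illustration only.
training_phrases = [
    "I want to go outside",
    "let me go out",
    "book a doctor appointment",
]

vocab = build_vocabulary(training_phrases)

# "wanna" and "outdoors" never appear in the dataset, so they are dropped.
seen = known_tokens("I wanna go outdoors", vocab)
print(seen)  # ['i', 'go']
```

The more training phrases you add, the fewer words fall outside the vocabulary, which is why this approach tends to suit narrow, well-covered contexts.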
Algorithms with External Vocabulary (2, 4, 5)
These algorithms, in addition to being trained with the dataset inserted by the owner, use word embeddings: pre-trained word representations that give the intelligence a general context.
This context helps to process words and sentences that have never been added to the dataset, making the intelligence more likely to relate a sentence it has never seen before to the right intent.
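The idea can be sketched in Python as follows. The three-dimensional vectors below are hand-written for illustration and are not real embeddings, and the helper names are hypothetical. Real word embeddings have hundreds of dimensions and are learned from large text collections. The sketch only shows how a word that is absent from the dataset can still be related to a known word through vector similarity.

```python
import math

# Hypothetical, hand-written embeddings (illustration only).
EMBEDDINGS = {
    "outside": [0.9, 0.1, 0.0],
    "outdoors": [0.8, 0.2, 0.1],
    "doctor": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_training_word(word, training_words):
    """Return the training word whose embedding is most similar to `word`."""
    return max(
        training_words,
        key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]),
    )


# "outdoors" is not in the training dataset, but its vector is close to "outside".
nearest = closest_training_word("outdoors", ["outside", "doctor"])
print(nearest)  # outside
```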
Transformer Neural Network with BERT word embedding (5)
It is the latest BotHub algorithm. It uses pre-trained word embeddings from the state-of-the-art BERT (Bidirectional Encoder Representations from Transformers) model.
It is currently available for the PT-BR and EN languages.
BotHub has some AI configuration options that help your intelligence work in different ways for different goals! Below you can find what these options are and how to use them.
Intent confidence can be displayed in two forms. In the following examples, we use the same sentence and the same dataset, changing only the "Competing Intents" AI option, to see the differences.
- Percentage for each intent
Intent ranking for the sentence "I wanna go outside".
With this option (Competing Intents off), the confidence for each intent is given in a 1:1 proportion (1 sentence, 1 confidence), meaning that the algorithm only tells us which intent has the highest confidence.
- All intents competing
The confidence of a sentence in the default configuration (without competing intents) is based on the percentage probability of each intent, that is, how certain the intelligence is that the sentence belongs to each intent.
When you turn this option on, it changes how the confidence is displayed in the response JSON. The confidences of all intents now sum to 1 (100%), so the results complement each other. This means that the algorithm shows us the most "dominant" intent for that entry according to the dataset.
Note that this option only changes how the confidence information is shown; it does not change how the algorithm works.
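The difference between the two displays can be sketched in Python. This article does not state the exact formula BotHub applies, so the simple rescaling below (dividing each score by the total) is an assumption, and the intent names and scores are made up for illustration.

```python
def compete(confidences):
    """Rescale per-intent confidences so that they sum to 1."""
    total = sum(confidences.values())
    if total == 0:
        return dict(confidences)
    return {intent: score / total for intent, score in confidences.items()}


# Hypothetical independent confidences (Competing Intents off).
independent = {"go_out": 0.8, "stay_home": 0.3, "greet": 0.1}

# Same scores, displayed as competing intents.
competing = compete(independent)

top_before = max(independent, key=independent.get)
top_after = max(competing, key=competing.get)

for intent, score in competing.items():
    print(f"{intent}: {score:.2f}")
```

In this sketch the top intent is the same in both displays. Only the scale of the numbers changes, which matches the point that the option affects presentation and not the underlying prediction.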
This option uses a pre-built external dataset to identify some names, places, and companies, and returns them as entities in the response JSON.
This option changes how the algorithm processes words. When Analyze Char is activated, the algorithm breaks each word into many small parts and learns its vocabulary from these pieces. It can improve the intelligence in some cases, such as when a language has words with many variations of the same root (e.g., Portuguese).
It is recommended to run tests before using this feature.
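A rough sketch of the idea behind Analyze Char, in Python. The article does not say how BotHub splits words, so character trigrams with boundary markers are used here as an assumption. This is a common way to break words into small parts. The function names are hypothetical.

```python
def char_ngrams(word, n=3):
    """Split a word into overlapping character n-grams, with boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def shared_ngrams(a, b, n=3):
    """Return the n-grams that two words have in common."""
    return set(char_ngrams(a, n)) & set(char_ngrams(b, n))


# Two forms of the Portuguese verb "falar" (to speak).
print(char_ngrams("falar"))  # ['<fa', 'fal', 'ala', 'lar', 'ar>']

common = shared_ngrams("falar", "falando")
print(sorted(common))  # ['<fa', 'ala', 'fal']
```

Because both verb forms share several pieces, a model that learns from these pieces can relate "falando" to "falar" even if only one of them appears in the dataset. The cost is a larger and noisier vocabulary, which is one reason to test before enabling the option.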