Guidance and Best Practices
by Manu da Silva
To build the best possible intelligence, meaning a dataset that is accurate in its predictions, we must follow certain best practices when creating training phrases.
In this article, we will learn about some training methods and best practices in Weni.
Main Guidelines
Within best practices, there are some key guidelines we need to follow:
- Number of Phrases
- Balanced Number of Phrases
- Vocabulary Specificity
- Variation in Sentence Structures
Each of these topics is explained below.
Number of Phrases
Most NLP models rely on the number of training examples to predict each intent reliably. To achieve high accuracy, the number of phrases therefore needs to scale with the number of intents in the dataset.
Below are some classifications of dataset quality based on the number of training phrases per intent, for an example with five intents or fewer:
- Minimum: 10 phrases per intent;
- Good: 25 phrases per intent;
- Excellent: 40 phrases per intent.
Some factors may influence these numbers, such as the total number of intents in the intelligence (which can affect the number of false positives). The more intents there are, the more phrases are required per intent.
The chosen algorithm also affects this number. For instance, the BERT-based algorithm builds on a pre-trained model, so it tends to need fewer phrases to achieve good results.
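As a rough illustration, the tiers above can be checked with a few lines of Python. This is only a sketch: the `(phrase, intent)` pair layout, the `rate_intents` helper, and the sample intents are assumptions made for the example, not a Weni Platform format or API, and the thresholds apply to the "five intents or fewer" case only.

```python
from collections import Counter

# Thresholds taken from the tiers listed above (five intents or fewer).
TIERS = [(40, "Excellent"), (25, "Good"), (10, "Minimum")]


def rate_intents(examples):
    """Map each intent to a quality tier based on its phrase count.

    `examples` is a list of (phrase, intent) pairs; this layout is an
    assumption for illustration, not a Weni Platform format.
    """
    counts = Counter(intent for _, intent in examples)
    ratings = {}
    for intent, n in counts.items():
        # Pick the first (highest) tier whose threshold the count reaches.
        ratings[intent] = next(
            (label for threshold, label in TIERS if n >= threshold),
            "Below minimum",
        )
    return ratings


# Hypothetical dataset: 42, 12, and 4 phrases for three made-up intents.
examples = (
    [(f"greeting phrase {i}", "greet") for i in range(42)]
    + [(f"order phrase {i}", "order") for i in range(12)]
    + [(f"goodbye phrase {i}", "bye") for i in range(4)]
)
print(rate_intents(examples))
```

Intents that fall under the minimum are the first candidates for new training phrases. With more than five intents, or with a different algorithm, the thresholds would need adjusting as described above.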
Balanced Number of Phrases
Using a balanced number of phrases across all intents of your intelligence reduces the chances of bias toward a specific intent.
For example, if an intelligence has an intent X with 50 phrases and an intent Y with 200 phrases, the algorithm may be more likely to classify a new input, one never seen during training, as intent Y simply because Y has more examples.
Thus, a good practice is to have an approximately equal number of phrases for all intents in your dataset, if possible.
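One simple way to spot imbalance is to compare the largest and smallest intents. The sketch below assumes the dataset is available as a list of `(phrase, intent)` pairs; the `imbalance_ratio` helper is hypothetical and only illustrates the idea. What ratio counts as acceptable is a judgment call and is not defined by the platform.

```python
from collections import Counter


def imbalance_ratio(examples):
    """Return (ratio, counts), where ratio = largest / smallest intent size.

    `examples` must contain at least one (phrase, intent) pair.
    """
    counts = Counter(intent for _, intent in examples)
    ratio = max(counts.values()) / min(counts.values())
    return ratio, dict(counts)


# Mirrors the example above: intent X with 50 phrases, intent Y with 200.
examples = (
    [(f"x phrase {i}", "X") for i in range(50)]
    + [(f"y phrase {i}", "Y") for i in range(200)]
)
ratio, counts = imbalance_ratio(examples)
print(counts)  # {'X': 50, 'Y': 200}
print(ratio)   # 4.0
```

A ratio close to 1 means the intents are roughly balanced; a ratio of 4, as here, suggests adding phrases to the smaller intent.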
Vocabulary Specificity
To reduce the number of false positives in the dataset and increase precision, we recommend that the training phrases generated respect the topic specificity rule.
This rule states that words specific to an intent should appear only in that intent's phrases, while words that should not signal any intent should be distributed across all intents, so that the algorithm does not associate them with a specific topic.
For example, if we have an intelligence that identifies orders in a fast-food restaurant, with the intents "food" and "drinks," we need to associate specific words with each intent, such as "sandwich" with the first and "juice" with the second.
Thus, we would generate training phrases like "I would like to buy a sandwich" for the intent "food" and "I want to buy a juice" for the intent "drinks."
Note that specific words like "sandwich" and "juice" are each associated with a single intent, while generic words like "I would like," "want," "to," "buy," and "a" should be distributed between the two intents. That way, if someone types just "I would like to buy," the intelligence should not associate the input with either intent, because the confidence for both will be very low.
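To audit a dataset against this rule, it can help to list which words occur in only one intent and which are shared. The following sketch uses naive lowercase whitespace splitting and a hypothetical `vocabulary_by_intent` helper; both are assumptions for illustration and do not reflect how the platform tokenizes text.

```python
from collections import defaultdict


def vocabulary_by_intent(examples):
    """Split the vocabulary into intent-specific and shared words."""
    intents_per_word = defaultdict(set)
    for phrase, intent in examples:
        # Naive tokenization: lowercase and split on whitespace.
        for word in phrase.lower().split():
            intents_per_word[word].add(intent)

    specific = defaultdict(set)  # intent -> words seen only in that intent
    shared = set()               # words seen in two or more intents
    for word, intents in intents_per_word.items():
        if len(intents) == 1:
            specific[next(iter(intents))].add(word)
        else:
            shared.add(word)
    return dict(specific), shared


examples = [
    ("I would like to buy a sandwich", "food"),
    ("I want to buy a juice", "drinks"),
]
specific, shared = vocabulary_by_intent(examples)
print(sorted(shared))
# ['a', 'buy', 'i', 'to']
print({intent: sorted(words) for intent, words in specific.items()})
# {'food': ['like', 'sandwich', 'would'], 'drinks': ['juice', 'want']}
```

With only these two phrases, "would," "like," and "want" each appear under a single intent. Adding phrases such as "I want to buy a sandwich" and "I would like to buy a juice" would move them into the shared set, leaving only "sandwich" and "juice" as intent-specific words.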
Variation in Sentence Structures
Sentence structure is also an important factor in interpreting user input. For example, if the phrase "I would like to eat a pizza" is trained for the intent "food," the algorithm would likely classify the phrase "I would love to eat a pizza" as the same intent, because the sentence structure is similar (assuming there is a sufficient number of trained phrases with this structure).
This means that the more varied the example phrases are, in both structure and vocabulary, the more likely the intelligence is to correctly classify new inputs related to that intent.
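A very rough way to gauge structural variety is to count how many training phrases start the same way. The heuristic below (grouping by the first three words) is an assumption chosen for simplicity, not a measure used by the platform, and the sample phrases other than the pizza example are invented for illustration.

```python
from collections import Counter


def opening_counts(phrases, n_words=3):
    """Count how often each phrase opening (first `n_words` words) occurs."""
    return Counter(" ".join(p.lower().split()[:n_words]) for p in phrases)


# Hypothetical training phrases for the "food" intent.
food_phrases = [
    "I would like to eat a pizza",
    "I would like to eat a burger",
    "I would like a sandwich",
    "Can I get a pizza",
    "A burger, please",
]
counts = opening_counts(food_phrases)
print(counts.most_common(1))             # [('i would like', 3)]
print(len(counts) / len(food_phrases))   # 0.6
```

Here three of the five phrases share the opening "I would like," so rewriting some of them with different structures would give the intent more variety.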