The report below shows each intent’s performance across several measures. Each measure helps explain why an intent performs the way it does.
Types of performance measurements
1. Precision
Precision shows how often the model is right when it assigns a sample to a given intent, so it reflects how much that intent is confused with other intents. For example, if precision is 100%, no samples from other intents are classified as this intent. This measure gives you insight into confusion between two or more intents.
Assume you have two intents, and one contains samples that should be in the other. The model is then confused about which intent those samples belong to. For example, if one intent has 20 samples that are all accurate, and another intent has 10 samples, 5 of which belong to the first intent, then the precision of the first intent drops.
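The example above can be worked through in a few lines of Python. This is a minimal sketch that assumes the standard definition of precision (correct predictions for an intent divided by all predictions for that intent); the intent names `A` and `B` and the helper `precision` are hypothetical and are not part of the product.

```python
def precision(true_labels, predicted_labels, intent):
    """Share of samples predicted as `intent` that are really labelled `intent`."""
    # Labels of every sample the model assigned to this intent.
    labels_of_predicted = [
        t for t, p in zip(true_labels, predicted_labels) if p == intent
    ]
    if not labels_of_predicted:
        return 0.0  # The intent was never predicted.
    correct = sum(1 for t in labels_of_predicted if t == intent)
    return correct / len(labels_of_predicted)


# Hypothetical data: intent A has 20 samples, intent B has 10 samples.
true_labels = ["A"] * 20 + ["B"] * 10
# All 20 A samples are predicted as A; 5 of the B samples are also predicted as A.
predicted_labels = ["A"] * 20 + ["A"] * 5 + ["B"] * 5

precision_a = precision(true_labels, predicted_labels, "A")
precision_b = precision(true_labels, predicted_labels, "B")

print(precision_a)  # 0.8 (20 correct out of 25 predictions for A)
print(precision_b)  # 1.0 (5 correct out of 5 predictions for B)
```

The 5 samples that sit in intent B but are predicted as A pull the precision of A down from 100% to 80%.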
What to do if intent precision is low
- Go to the intents confusion report.
- Check the intent column to see which intents are confused.
ℹ️ The graph shows intents with lower precision in yellow and intents with higher precision in blue.
2. Recall
Recall shows how many of an intent’s samples the model classifies correctly. For example, if an intent has 20 samples, 10 of which are classified correctly and 10 incorrectly, the intent’s recall is 50%. It answers the question: “How many of this intent’s samples are correctly classified?”
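The 20-sample example can be checked with a short Python sketch. It assumes the standard definition of recall (correctly classified samples of an intent divided by all samples of that intent); the intent names and the helper `recall` are hypothetical.

```python
def recall(true_labels, predicted_labels, intent):
    """Share of samples labelled `intent` that the model also predicts as `intent`."""
    # Predictions for every sample that really belongs to this intent.
    predictions_for_intent = [
        p for t, p in zip(true_labels, predicted_labels) if t == intent
    ]
    if not predictions_for_intent:
        return 0.0  # The intent has no samples.
    correct = sum(1 for p in predictions_for_intent if p == intent)
    return correct / len(predictions_for_intent)


# Hypothetical data: intent A has 20 samples.
true_labels = ["A"] * 20
# 10 are classified correctly, 10 are classified under another intent (B).
predicted_labels = ["A"] * 10 + ["B"] * 10

recall_a = recall(true_labels, predicted_labels, "A")
print(recall_a)  # 0.5 (10 correct out of 20 samples)
```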
What to do if an intent’s recall is low
- Go back to the intent sample page to see whether the samples are unclassified or classified under another intent.
- If the samples are unclassified, add more samples.
- If the samples are misclassified under other intents, check whether they belong to this intent. If not, add more similar samples.
ℹ️ If an intent’s recall is low, you need to add more samples to that intent.
ℹ️ The graph shows intents with lower recall in red and intents with higher recall in blue.
3. F1 Score
The F1 score is the harmonic mean of precision and recall, which combines both into a single number. To fully evaluate the effectiveness of the model, you should consider both precision and recall.
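Assuming the standard definition, in which the F1 score is the harmonic mean of precision and recall, the calculation can be sketched as follows. The input values 0.8 and 0.5 are illustrative only, and the helper `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # Avoid division by zero when both measures are 0.
    return 2 * precision * recall / (precision + recall)


# Illustrative values: 80% precision and 50% recall.
precision_a = 0.8
recall_a = 0.5

f1_a = f1_score(precision_a, recall_a)
simple_average = (precision_a + recall_a) / 2

print(round(f1_a, 3))            # 0.615
print(round(simple_average, 3))  # 0.65
```

The harmonic mean sits below the simple average and closer to the weaker of the two measures, so an intent cannot reach a high F1 score unless both precision and recall are high.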
What to do if the F1 score is low
- Balance the number of samples between intents and resolve any precision and recall issues.