In our previous chapter we explored the steps needed to train and evaluate our neural networks. That said, what would you do if your training wasn't up to the mark, or if you realised that everything you taught the model was in vain?
When a teacher fails in her duty to impart knowledge to her students, she must try and try again until she succeeds.
Today's article is all about understanding where, how, and what to look out for when analysing performance changes in your model, along with the best practices around it.
Where do we Start?
From our journey so far, we know that many configurations, parameters, features, and components go into training a neural network. We also learnt how to analyse whether a model is performing at its best, or whether something is wrong with it, based on the learning curve of the training cycle.
What we haven't yet seen are some of the most important terms in the field of neural networks -
These are crucial terms we need to examine in a little more detail before taking the next step.
Accuracy - The percentage of predictions where your model's output matched the ground-truth annotation, out of the total number of predictions.
Recall - The percentage of actual positives the model classified correctly. That is, the ratio of the number of times the model predicted an instance was present and it actually was, to the total number of times instances were actually present.
Precision - The percentage of the model's positive predictions that were actually true.
Loss - Simply put, this is the error measure of the neural model. The function used to calculate it is called the loss function. In other words, it is a measure of the divergence between the predicted value and the actual value.
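The first three metrics above can be sketched in a few lines of plain Python. This is purely illustrative, not a library implementation; `y_true` and `y_pred` are made-up names for the ground-truth and predicted labels (1 = positive, 0 = negative).

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall(y_true, y_pred):
    """Of all actual positives, how many did the model catch?"""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)
    return true_pos / actual_pos

def precision(y_true, y_pred):
    """Of all predicted positives, how many were actually positive?"""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    predicted_pos = sum(p == 1 for p in y_pred)
    return true_pos / predicted_pos
```

Note how recall and precision share the same numerator (the true positives) but divide by different totals.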
Now, in the above definitions we were talking about positives and negatives. So where exactly do we get these values from? The answer is what we call the CONFUSION MATRIX.
What is a Confusion Matrix?
Simply defined, it is a tabular representation of four different components used to analyse a neural model's predictions.
These components are -
- True Positive
- True Negative
- False Positive
- False Negative
I know these terms don't make much sense yet, but let's try to understand them with a simple example.
Let's take the case of a model built to predict whether a person has a migraine or not.
True Positive - The person has a migraine and the model predicts that they do.
True Negative - The person doesn't have a migraine and the model predicts that they don't.
False Positive - The person doesn't have a migraine but the model says that they do.
False Negative - The person has a migraine but the model says that they don't.
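The four cells can be counted with a short, self-contained sketch. The labels here are invented for illustration, with 1 meaning "has a migraine":

```python
def confusion_matrix(y_true, y_pred):
    """Count the four confusion-matrix cells from label lists (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    return tp, tn, fp, fn
```

From these four counts you can derive all the metrics above: accuracy is (tp + tn) / total, recall is tp / (tp + fn), and precision is tp / (tp + fp).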
Based on the above parameters, one can analyse in which segment of learning the model is performing poorly.
So now that we have analysed where the model performs poorly, how do we optimise it? Before we answer that question, let's understand what we mean by optimisation in the first place.
What is Optimisation?
If one were to understand optimisation intuitively, one would say -
It is the process of altering the training parameters and the neural architecture to align with the dataset and the use case, in order to deliver state-of-the-art results.
From our previous posts we know how to build neural architectures, and therefore how to make changes to them. But what are these training parameters that we keep talking about?
Some of the standard training parameters which can be altered to improve the model's performance are as follows -
- Batch Size
- Learning Rate
- Activation Function
- Hidden Layers
So now we know what we can optimise. Let's see how to do it.
How to Optimise?
Optimisation is an extremely iterative and exhaustive process if one is not aware of the necessary changes; it's easy to end up down a rabbit hole.
So here are some tips for each parameter -
Batch Size - Keep the batch size a multiple of 2, preferably somewhere in the range of 8-32 for large datasets. A very high or very low batch size can introduce too much variance into the model.
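As a minimal illustration of what the batch size controls, here is a sketch that slices a dataset into consecutive mini-batches (the function name is illustrative, not from any framework):

```python
def make_batches(samples, batch_size):
    """Split a list of samples into consecutive mini-batches;
    the last batch may be smaller if the dataset doesn't divide evenly."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
```

Each mini-batch produces one gradient update, so the batch size directly sets how many updates you get per epoch and how noisy each update is.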
Epochs - There is no standard number of epochs one should train a model for. Based on the learning curve or the loss value, remember to add a callback or an early-stopping mechanism to STOP the training when there isn't much change in accuracy or loss.
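Keras provides an `EarlyStopping` callback for exactly this. As a framework-agnostic sketch of the same idea, here is a minimal early-stopping check (the class name and defaults are illustrative):

```python
class EarlyStopping:
    """Signal a stop when the monitored loss hasn't improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience    # epochs to tolerate without improvement
        self.min_delta = min_delta  # smallest change that counts as improvement
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, loss):
        if loss < self.best - self.min_delta:
            self.best = loss  # loss improved: remember it and reset the counter
            self.wait = 0
        else:
            self.wait += 1    # no improvement this epoch
        return self.wait >= self.patience
```

You would call `should_stop(val_loss)` at the end of each epoch and break out of the training loop when it returns `True`.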
Learning Rate - Determine what effect changing the learning rate has on the model. A standard starting value is 0.01. A very high value makes the system learn very fast but may never reach satisfactory results, whereas with a very low value the model might take too long to learn.
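To see the trade-off concretely, here is a toy gradient-descent loop minimising f(x) = x² (purely illustrative; real training loops are far more involved):

```python
def minimise(lr, steps=50, x=5.0):
    """Run `steps` gradient-descent updates on f(x) = x**2 (gradient is 2x)."""
    for _ in range(steps):
        x = x - lr * 2 * x
    return x
```

With `lr=0.1` the iterate converges to essentially 0; with `lr=0.001` it barely moves in 50 steps (too slow); with `lr=1.1` each update overshoots the minimum and the iterate diverges.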
Activation Function - Based on your problem statement, try out various functions, as each has its preferred use cases. E.g. tanh and sigmoid often work well for classification.
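For reference, here are minimal pure-Python versions of three common activations (real frameworks provide vectorised equivalents of these):

```python
import math

def sigmoid(x):
    """Squashes any real number into (0, 1) -- common for binary classification outputs."""
    return 1 / (1 + math.exp(-x))

def tanh(x):
    """Squashes into (-1, 1), zero-centred."""
    return math.tanh(x)

def relu(x):
    """Passes positives through, zeroes out negatives -- a common default for hidden layers."""
    return max(0.0, x)
```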
Regularisation - A technique for tuning the model by adding a penalty term to the error function. The additional term discourages an excessively fluctuating function, so that the coefficients don't take extreme values.
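As a sketch of the idea, here is an L2-regularised mean-squared-error loss in plain Python (the names are illustrative; `lam` is the penalty strength, an assumption for this example):

```python
def l2_regularised_loss(errors, weights, lam=0.01):
    """Mean squared error plus an L2 penalty on the weights.
    Larger `lam` punishes large coefficients more strongly."""
    mse = sum(e ** 2 for e in errors) / len(errors)
    penalty = lam * sum(w ** 2 for w in weights)
    return mse + penalty
```

With `lam=0` this reduces to plain MSE; increasing `lam` trades a little training error for smaller, better-behaved weights.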
Normalisation - Normalising the dataset to a particular range before feeding it to the network ensures uniformity across the selected features and their relative importance when first perceived by the model.
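A common choice is min-max normalisation, which rescales a feature to the [0, 1] range. A minimal sketch, assuming the values are not all equal:

```python
def min_max_normalise(values):
    """Rescale a list of numbers linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```

Applied per feature, this stops a feature measured in thousands from dominating one measured in fractions before the model has learnt anything.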
Hidden Layers & Neurons - Sometimes increasing the number of neurons or layers in an architecture helps the model learn more features, thereby improving accuracy.
I hope this article finds you well and helps you in your future endeavours, speeding up your model development cycle and assisting you in achieving high-performing results.
In future posts we will talk more about how to speed up training and how to leverage previous learnings from different models and datasets to attain higher accuracy faster.
STAY TUNED 😁.