Difference between revisions of "24F Final Project: Overfitting"

Revision as of 07:27, 22 October 2022

By Thomas Zhang

Overfitting is a phenomenon in machine learning that occurs when a learning algorithm fits its training data too closely (or even exactly), resulting in a model that cannot make accurate predictions on new data.^[1] More generally, the model has learned the training data too well, including “noise” (irrelevant information and random fluctuations), which degrades its performance on new data. This is a major problem because the ability of machine learning models to make predictions, make decisions, and classify data has many real-world applications; overfitting undermines a model’s ability to generalize to new data, and therefore its ability to perform the classification and prediction tasks it was intended for.^[1]

Background/History

The term “overfitting” originated in the field of statistics, where it was studied extensively in the context of regression analysis and pattern recognition; with the arrival of artificial intelligence and machine learning, the phenomenon has received increased attention because of its implications for the performance of AI models.^[2] Since its early days, the concept has evolved significantly, with researchers continually developing methods to mitigate overfitting’s adverse effects on model accuracy and generalization.^[2]

Overfitting

Overfitting occurs when a machine learning model captures not only the underlying patterns in the training data but also the random noise or errors. This tends to happen when the model trains for too long on sample data or when the model is too complex. A model that is trained excessively on a limited dataset ends up memorizing specific data points rather than learning the underlying patterns of the dataset.^[2] Complex models with a large number of parameters are also prone to overfitting because they have the capacity to learn intricate details, including noise, from the training data.^[2]
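The memorization described above can be illustrated with a small sketch. It is not taken from the cited sources: the toy relationship y = 2x plus Gaussian noise, the nearest-neighbour model, and the helper names make_data, knn_predict and mse are all assumptions chosen for illustration. With k = 1 the model simply stores the training points, so its training error is exactly zero, yet its error on fresh data is not; a larger k averages over neighbours and would typically generalize better on data like this.

```python
import random


def make_data(n, rng):
    """Sample n points from y = 2x + noise (a made-up toy relationship)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.3)) for x in xs]


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, data, k):
    """Mean squared error of the k-NN model built on `train`, measured on `data`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
train = make_data(40, rng)   # small training set
test = make_data(200, rng)   # unseen data from the same process

for k in (1, 10):
    print(f"k={k:2d}  train MSE={mse(train, train, k):.3f}  "
          f"test MSE={mse(train, test, k):.3f}")
```

A training error of zero alongside a clearly non-zero test error is the signature of a model that has memorized its data, noise included.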

With this in mind, one might expect that ending the training process earlier, known as “early stopping,” or reducing the model’s complexity would prevent overfitting; however, stopping training too early or excluding too many important features may result in the opposite problem: underfitting.^[1] Underfitting describes a model that does not capture the underlying relationship in the dataset on which it was trained.^[3] Although training for too short a time or with poorly chosen hyperparameters can lead to underfitting, the most common reason that models underfit is that they exhibit too much bias.^[3] An underfit model exhibits high bias and low variance, meaning that it generates reliably inaccurate predictions - while reliability is desirable, inaccuracy is not.^[3] An overfit model, by contrast, has lower bias but higher variance.^[1]

Figure illustrating underfitting versus overfitting^[4]

In both overfitting and underfitting, the model fails to establish the dominant pattern in the training data and, as a result, cannot generalize well to new data. This leads to the concept of the bias-variance tradeoff, which describes the need to balance underfitting against overfitting when devising a model that performs well: one trades off the bias and variance components of a model’s error so that neither becomes overwhelming.^[3] Tuning a model away from either underfitting or overfitting pushes it closer to the other.
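The tradeoff can be made concrete with the standard decomposition of expected squared error. The notation below is introduced here as an assumption and does not come from the cited sources: the data are taken to follow y = f(x) + ε with zero-mean noise of variance σ², f̂ is the model fit on a randomly drawn training set, and the expectation is over training sets and noise.

```latex
\mathbb{E}\!\left[(y - \hat{f}(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Under these assumptions an underfit model is dominated by the bias term and an overfit model by the variance term, while the noise term cannot be reduced by any choice of model.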

How to Detect Overfit Models

The performance (accuracy) of a machine learning model is assessed to determine whether the model has issues such as overfitting or underfitting; k-fold cross-validation is one of the most popular techniques for this assessment.^[1] In this procedure, the dataset is divided into k groups (folds); each group in turn serves as the test set, with the remaining k - 1 groups forming the corresponding training set.^[5] A model is fit on each training set and evaluated on the corresponding test set; the evaluation score is retained and the model is discarded.^[5] Once all k evaluations are complete, the model’s performance is summarized from the scores obtained, usually by averaging them.^[5] It is also good practice to report a measure of the variance of the scores, such as the standard deviation or standard error.^[5]
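The procedure above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the implementation from the cited tutorial: the synthetic dataset, the least-squares line used as the model, the choice of mean squared error as the score, and the helper names k_fold_indices, fit_line and cross_validate are all hypothetical.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mse(model, points):
    """Mean squared error of the line `model` on `points`."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in points) / len(points)


def cross_validate(data, k):
    """Return one held-out MSE per fold."""
    scores = []
    for test_idx in k_fold_indices(len(data), k):
        held_out = set(test_idx)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        test = [data[i] for i in test_idx]
        model = fit_line(train)          # fit on the other k-1 folds
        scores.append(mse(model, test))  # keep the score, discard the model
    return scores


# Synthetic data (an assumption): y = 3x + 1 plus Gaussian noise.
rng = random.Random(1)
xs = [rng.uniform(0.0, 10.0) for _ in range(50)]
data = [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.5)) for x in xs]

scores = cross_validate(data, k=5)
print("mean MSE:", statistics.mean(scores))
print("std dev :", statistics.stdev(scores))
```

Reporting the standard deviation next to the mean shows how much the score depends on which fold was held out; a large spread suggests the estimate is unstable.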

How to Avoid Overfitting

Below are some techniques that one can use to prevent overfitting:

Early Stopping

Validation error vs testing error ^[6]

As mentioned earlier, this method involves stopping training before the model starts learning unnecessary noise in the training data. It avoids the phenomenon known as the “learning speed slow-down,” in which the accuracy of an algorithm stops improving after some point (or even gets worse due to noise-learning).^[6] In the figure, the horizontal axis is the epoch and the vertical axis is the error; the blue line shows the training error and the red line shows the validation error. If the model continues learning after the exclamation point, the validation error increases while the training error continues to decrease. Stopping before that point results in underfitting and stopping after it results in overfitting, so one should aim to stop training as close to that point as possible, striking a balance between underfitting and overfitting.^[6]
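One common way to approximate that stopping point is to monitor the validation error and stop once it has failed to improve for a fixed number of epochs. The sketch below assumes this “patience” rule, which is not described in the cited source; the function train_with_early_stopping, its callback parameters, and the validation curve are all hypothetical and stand in for a real training loop.

```python
def train_with_early_stopping(train_one_epoch, validation_error,
                              max_epochs, patience):
    """Train until validation error has not improved for `patience` epochs.

    Returns (best_epoch, best_error, epochs_run).
    """
    best_epoch, best_error = None, float("inf")
    epochs_run = 0
    for epoch in range(max_epochs):
        train_one_epoch(epoch)
        epochs_run += 1
        error = validation_error(epoch)
        if error < best_error:
            # New best; a real loop would also save the model weights here.
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, best_error, epochs_run


# Made-up validation curve: falls, wobbles slightly, then rises as the
# model starts to overfit.
val_curve = [0.90, 0.60, 0.45, 0.38, 0.35, 0.36,
             0.34, 0.37, 0.40, 0.44, 0.50, 0.57]

best_epoch, best_error, epochs_run = train_with_early_stopping(
    train_one_epoch=lambda epoch: None,             # placeholder training step
    validation_error=lambda epoch: val_curve[epoch],
    max_epochs=len(val_curve),
    patience=3,
)
print(best_epoch, best_error, epochs_run)
```

The small bump at epoch 5 shows why some patience is useful: stopping at the first increase in validation error would have missed the lower error reached at epoch 6.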

References

[1] What is Overfitting? IBM. https://www.ibm.com/topics/overfitting
[2] Lark Editorial Team. (2023, Dec 26). Overfitting. Lark Suite. https://www.larksuite.com/en_us/topics/ai-glossary/overfitting
[3] What is Underfitting in Machine Learning? Domino Data Lab. https://domino.ai/data-science-dictionary/underfitting
[4] Model Fit: Underfitting vs. Overfitting. Amazon Web Services. https://docs.aws.amazon.com/machine-learning/latest/dg/model-fit-underfitting-vs-overfitting.html
[5] Brownlee, J. (2023, Oct 4). A Gentle Introduction to k-fold Cross-Validation. Machine Learning Mastery. https://machinelearningmastery.com/k-fold-cross-validation/
[6] Ying, X. (2019). An Overview of Overfitting and its Solutions. 2018 International Conference on Computer Information Science and Application Technology, 1168(2), 1-7. https://doi.org/10.1088/1742-6596/1168/2/022022


