Double descent
Double descent in statistics and machine learning is the phenomenon where a model with a small number of parameters and a model with an extremely large number of parameters can both achieve a small test error, whereas a model whose number of parameters is about the same as the number of training data points has a much greater test error than one with a much larger number of parameters.[2] This phenomenon has been considered surprising, as it contradicts assumptions about overfitting in classical machine learning.[3]
History
Early observations of what would later be called double descent in specific models date back to 1989.[4][5]
The term "double descent" was coined by Belkin et al.[6] in 2019,[3] when the phenomenon gained popularity as a broader concept exhibited by many models.[7][8] This development was prompted by a perceived contradiction between the conventional wisdom that too many parameters in a model cause significant overfitting error (an extrapolation of the bias–variance tradeoff)[9] and empirical observations in the 2010s that some modern machine learning techniques tend to perform better with larger models.[6][10]
Theoretical models
Double descent occurs in linear regression with isotropic Gaussian covariates and isotropic Gaussian noise.[11]
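The linear regression setting can be explored numerically. The following is a minimal sketch, not taken from the cited work: it assumes a well-specified model with a unit-norm signal vector, uses the minimum-norm least-squares solution (computed with the pseudoinverse) as the estimator, and fixes hypothetical values for the sample size, noise level and number of trials. Because the covariates are isotropic, the excess test risk equals the squared distance between the estimated and true coefficient vectors, so no separate test set is needed.

```python
import numpy as np


def excess_risk(n, p, sigma=0.5, trials=50, seed=0):
    """Average excess test risk of min-norm least squares with n samples, p features.

    Assumptions (illustrative only): covariates x ~ N(0, I_p), noise ~ N(0, sigma^2),
    and a true coefficient vector with unit Euclidean norm.
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)  # assumed signal, ||beta|| = 1
    risks = []
    for _ in range(trials):
        X = rng.standard_normal((n, p))                 # isotropic Gaussian covariates
        y = X @ beta + sigma * rng.standard_normal(n)   # isotropic Gaussian noise
        beta_hat = np.linalg.pinv(X) @ y                # minimum-norm least-squares fit
        # For isotropic covariates, E[(x . (beta_hat - beta))^2] = ||beta_hat - beta||^2.
        risks.append(np.sum((beta_hat - beta) ** 2))
    return float(np.mean(risks))


if __name__ == "__main__":
    n = 40  # number of training points (hypothetical choice)
    for p in (10, 20, 36, 40, 44, 80, 160):
        print(f"p = {p:4d}   excess risk = {excess_risk(n, p):.3f}")
```

Under these assumptions the printed risk should be small for p well below n, rise sharply as p approaches n, and fall again as p grows beyond n. In this particular well-specified setup the risk in the heavily overparameterized regime settles near the squared norm of the signal rather than dropping below the small-p risk.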
A model of double descent in the thermodynamic limit has been analyzed using the replica trick, and the result has been confirmed numerically.[12]
Empirical examples
The scaling behavior of double descent has been found to follow the functional form of a broken neural scaling law.[13]
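As an illustration, a broken neural scaling law is a smoothly broken power law, commonly written as y = a + b x^(-c0) multiplied by a product over breaks i of (1 + (x/d_i)^(1/f_i))^(-c_i f_i). The sketch below assumes that form; it should be checked against the cited paper before being relied upon. The function name `bnsl`, the argument layout and all parameter values are hypothetical, chosen only so that the curve descends, rises and descends again.

```python
import numpy as np


def bnsl(x, a, b, c0, breaks=()):
    """Smoothly broken power law (assumed form of a broken neural scaling law).

    x      : quantity being scaled (e.g. parameter count), positive
    a      : limiting offset
    b, c0  : scale and exponent of the initial power-law segment
    breaks : iterable of (c_i, d_i, f_i); d_i is the break location, c_i the
             change in slope on a log-log plot, f_i the sharpness of the break
    """
    x = np.asarray(x, dtype=float)
    y = b * x ** (-c0)
    for c_i, d_i, f_i in breaks:
        y = y * (1.0 + (x / d_i) ** (1.0 / f_i)) ** (-c_i * f_i)
    return a + y


if __name__ == "__main__":
    # Hypothetical parameters: the first break has a negative c_i, which turns the
    # descending curve upward; the second break turns it downward again.
    xs = np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])
    ys = bnsl(xs, a=0.0, b=1.0, c0=0.5, breaks=[(-1.0, 10.0, 0.1), (1.0, 100.0, 0.1)])
    for x, y in zip(xs, ys):
        print(f"x = {x:8.0f}   y = {y:.3f}")
```

With no breaks the function reduces to an ordinary power law plus an offset. A negative slope change at one break followed by a positive one at a later break is what allows a single expression of this kind to describe non-monotonic, double-descent-like curves.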
References
- [13] Caballero, Ethan; Gupta, Kshitij; Rish, Irina; Krueger, David (2022). "Broken Neural Scaling Laws". International Conference on Learning Representations (ICLR), 2023.
External links
- Understanding "Deep Double Descent" at evhub.