Double descent

An example of the double descent phenomenon in a two-layer neural network: as the ratio of parameters to data points increases, the test error first falls, then rises, then falls again.[1] The vertical line marks the "interpolation threshold", the boundary between the underparameterized regime (more data points than parameters) and the overparameterized regime (more parameters than data points).

Double descent in statistics and machine learning is the phenomenon where a model with a small number of parameters and a model with an extremely large number of parameters both have small test error, whereas a model whose number of parameters is roughly equal to the number of training data points has a large test error.[2] The phenomenon has been considered surprising, as it contradicts classical assumptions about overfitting in machine learning.[3]
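The effect can be reproduced in very simple settings. The sketch below is an illustrative assumption rather than a setup taken from the cited sources: it fits minimum-norm least-squares regression on fixed random ReLU features to a noisy one-dimensional target and reports test error as the number of features p grows past the number of training points n. Averaged over a few random seeds, the test error typically peaks near the interpolation threshold p ≈ n and then falls again.

```python
import numpy as np

# Illustrative sketch of double descent with random-feature regression.
# The target function, noise level, and feature counts below are
# assumptions chosen for demonstration, not taken from the references.
rng = np.random.default_rng(0)

n_train, n_test, noise, n_trials = 40, 1000, 0.3, 20
target = lambda x: np.sin(2 * np.pi * x)

def relu_features(x, W, b):
    # Fixed random ReLU features: phi_j(x) = max(0, w_j * x + b_j)
    return np.maximum(0.0, np.outer(x, W) + b)

for p in [2, 5, 10, 20, 30, 40, 60, 100, 400, 2000]:
    errs = []
    for _ in range(n_trials):
        x_tr = rng.uniform(-1, 1, n_train)
        x_te = rng.uniform(-1, 1, n_test)
        y_tr = target(x_tr) + noise * rng.standard_normal(n_train)
        y_te = target(x_te)
        W, b = rng.standard_normal(p), rng.standard_normal(p)
        # Minimum-norm least-squares fit; it interpolates the training
        # data exactly once p >= n_train.
        theta = np.linalg.pinv(relu_features(x_tr, W, b)) @ y_tr
        errs.append(np.mean((relu_features(x_te, W, b) @ theta - y_te) ** 2))
    print(f"p = {p:5d}  (p/n = {p / n_train:5.2f})  mean test MSE = {np.mean(errs):.3f}")
```

In the overparameterized regime the pseudoinverse returns the minimum-ℓ2-norm interpolating solution; this implicit regularization is what allows the test error to decrease again beyond the interpolation peak.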

  1. ^ Rocks, Jason W.; Mehta, Pankaj (2022). "Memorizing without overfitting: Bias, variance, and interpolation in overparameterized models". Physical Review Research. 4 (1): 013201. arXiv:2010.13933. doi:10.1103/PhysRevResearch.4.013201.
  2. ^ "Deep Double Descent". OpenAI. 2019-12-05. Retrieved 2022-08-12.
  3. ^ Schaeffer, Rylan; Khona, Mikail; Robertson, Zachary; Boopathy, Akhilan; Pistunova, Kateryna; Rocks, Jason W.; Fiete, Ila Rani; Koyejo, Oluwasanmi (2023-03-24). "Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle". arXiv:2303.14151v1 [cs.LG].
