Model Selection and Evaluation
• Model Selection refers to the process of optimizing a model (e.g., a classifier, a regression model, and so on).
• Model Selection encompasses both the selection of a model (e.g., C4.5 versus Naïve Bayes) and the adjustment of a particular model’s parameters (e.g., adjusting the number of hidden units in a neural network).
• Regression — for the quantitative response variable Y, the training error is:
\overline{\mathrm{err}} = \frac{1}{N} \sum_{i=1}^{N} L(y_i, \hat{f}(x_i))
• Classification — for the categorical response variable G, the training error is:
\overline{\mathrm{err}} = \frac{1}{N} \sum_{i=1}^{N} I(g_i \neq \hat{G}(x_i))
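• As a quick illustration of the two training-error formulas above, here is a minimal Python sketch; the arrays y, y_hat, g, and g_hat are made-up toy values, not data from the slides.

import numpy as np

def training_error_regression(y, y_hat):
    # Average squared-error loss over the training sample:
    # err_bar = (1/N) * sum_i (y_i - f_hat(x_i))^2
    return np.mean((y - y_hat) ** 2)

def training_error_classification(g, g_hat):
    # Average 0-1 loss (misclassification rate) over the training sample:
    # err_bar = (1/N) * sum_i I(g_i != G_hat(x_i))
    return np.mean(g != g_hat)

# Toy example (assumed values, for illustration only)
y, y_hat = np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.2])
g, g_hat = np.array([0, 1, 1]), np.array([0, 1, 0])
print(training_error_regression(y, y_hat))      # 0.02
print(training_error_classification(g, g_hat))  # ~0.333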
Book: The Elements of Statistical Learning (Second Edition), Chapter 7, Model Assessment and Selection
What is Model Selection?
• For the categorical response G, the log-likelihood (deviance) loss is:
L(G, \hat{p}(X)) = -2 \sum_{k=1}^{K} I(G = k) \log \hat{p}_k(X) = -2 \log \hat{p}_G(X)
(log-likelihood)
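• A minimal Python sketch of this loss for a single example; the probability vector p_hat below is an assumed example, not taken from the text.

import numpy as np

def deviance_loss(p_hat, g):
    # -2 * log-likelihood loss for one example:
    # p_hat is the vector of predicted class probabilities p_hat_k(X),
    # g is the index of the observed class G.
    return -2.0 * np.log(p_hat[g])

p_hat = np.array([0.7, 0.2, 0.1])   # assumed predicted probabilities, K = 3
print(deviance_loss(p_hat, g=0))    # -2*log(0.7) ~= 0.713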
Training Error
• Training error is the average loss over the training sample (empirical risk).
What are potential issues with Model Selection?
• It is usually possible to improve a model’s fit to the data, up to a point (e.g., adding hidden units allows a neural network to fit its training data better).
• Under the log-likelihood loss, the training error is:
\overline{\mathrm{err}} = -\frac{2}{N} \sum_{i=1}^{N} \log \hat{p}_{g_i}(x_i)
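• Building on the deviance-loss sketch above, the training error just averages that loss over the sample; p_hat_matrix is an assumed N-by-K array of predicted class probabilities.

import numpy as np

def training_error_deviance(p_hat_matrix, g):
    # err_bar = -(2/N) * sum_i log p_hat_{g_i}(x_i),
    # where p_hat_matrix[i, k] is the predicted probability of class k for
    # example i, and g[i] is the observed class of example i.
    return -2.0 * np.mean(np.log(p_hat_matrix[np.arange(len(g)), g]))

p_hat_matrix = np.array([[0.7, 0.3], [0.4, 0.6], [0.9, 0.1]])  # assumed values
g = np.array([0, 1, 0])
print(training_error_deviance(p_hat_matrix, g))   # ~0.649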
Test Error (Generalization Error)
• Generalization error, or test error, is the expected prediction error over an independent test sample (the true risk).
• For quantitative response Y:
\mathrm{Err} = E[L(Y, \hat{f}(X))]
• For categorical response G:
\mathrm{Err} = E[L(G, \hat{G}(X))]
\mathrm{Err} = E[L(G, \hat{p}(X))]
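• Since Err is an expectation over new data, in practice it is estimated by averaging the loss over a large independent test sample. A minimal sketch, assuming a hypothetical fitted model f_hat and a synthetic data-generating process of my own choosing:

import numpy as np

rng = np.random.default_rng(0)

# Assumed data-generating process: Y = 2X + noise.
def sample(n):
    x = rng.uniform(-1, 1, n)
    return x, 2 * x + rng.normal(0, 0.5, n)

f_hat = lambda x: 1.8 * x     # hypothetical fitted regression model

x_test, y_test = sample(100_000)                   # large independent test sample
err_hat = np.mean((y_test - f_hat(x_test)) ** 2)   # estimates E[L(Y, f_hat(X))]
print(err_hat)                                     # roughly 0.26 for this setup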
Bias, Variance, and Model Complexity
Typical loss choices for the quantitative response Y:
L(Y, \hat{f}(X)) = (Y - \hat{f}(X))^2   (squared error)
L(Y, \hat{f}(X)) = |Y - \hat{f}(X)|   (absolute error)
• Classification — typical choices for the categorical response G:
L(G, \hat{G}(X)) = I(G \neq \hat{G}(X))   (0-1 loss)
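• The loss choices above are straightforward to write down; a minimal Python sketch (the function names are my own, and the inputs are assumed to be NumPy arrays):

import numpy as np

def squared_error(y, f_hat_x):
    return (y - f_hat_x) ** 2            # L(Y, f_hat(X)) = (Y - f_hat(X))^2

def absolute_error(y, f_hat_x):
    return np.abs(y - f_hat_x)           # L(Y, f_hat(X)) = |Y - f_hat(X)|

def zero_one_loss(g, g_hat_x):
    return (g != g_hat_x).astype(float)  # L(G, G_hat(X)) = I(G != G_hat(X))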
What do we see from the preceding figure?
• There is an optimal model complexity that gives minimum test error.
• Training error is not a good estimate of the test error.
• There is a bias-variance tradeoff in choosing the appropriate complexity of the model.
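• These three observations can be reproduced with a small simulation; the sketch below uses an assumed sinusoidal truth and polynomial degree as the measure of model complexity (my choice, not from the slides).

import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: np.sin(2 * np.pi * x)

# Small training sample, large independent test sample.
x_tr = rng.uniform(0, 1, 30);   y_tr = true_f(x_tr) + rng.normal(0, 0.3, 30)
x_te = rng.uniform(0, 1, 5000); y_te = true_f(x_te) + rng.normal(0, 0.3, 5000)

# Training error keeps falling with complexity; test error typically
# bottoms out at an intermediate degree and rises again (overfitting).
for degree in (1, 3, 5, 9, 15):
    coefs = np.polyfit(x_tr, y_tr, degree)
    tr = np.mean((y_tr - np.polyval(coefs, x_tr)) ** 2)
    te = np.mean((y_te - np.polyval(coefs, x_te)) ** 2)
    print(f"degree={degree:2d}  train={tr:.3f}  test={te:.3f}")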
Goals
• Model Selection: estimating the performance of different models in order to choose the best one.
Training set: used to fit the models.
• As such, model selection is very tightly linked with the issue of the Bias/Variance tradeoff.
Performance Assessment: Loss Function
• Regression — typical choices for the quantitative response Y are the squared-error and absolute-error losses given earlier.
• We want the model to use enough information from the data set to be as unbiased as possible, yet discard the information it does not need, so that it generalizes as well as possible (i.e., fares well across a variety of different contexts).
• Model Assessment: having chosen a final model, estimating its generalization error on new data.
Splitting the data
Split the dataset into three parts: a training set, a validation set, and a test set.
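• A minimal Python sketch of such a split; the 50%/25%/25% fractions are one common convention, and the function name is my own.

import numpy as np

def split_three_ways(n, frac_train=0.5, frac_val=0.25, seed=0):
    # Randomly partition indices 0..n-1 into training / validation / test sets.
    idx = np.random.default_rng(seed).permutation(n)
    n_tr, n_val = int(frac_train * n), int(frac_val * n)
    return idx[:n_tr], idx[n_tr:n_tr + n_val], idx[n_tr + n_val:]

train_idx, val_idx, test_idx = split_three_ways(1000)
print(len(train_idx), len(val_idx), len(test_idx))   # 500 250 250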