
1. Re: Beginner: How to evaluate the return value by CV function?
Ajit Kumar Pookalangara Dec 19, 2017 8:17 PM (in response to ZYQ)
The goal of cross-validation (CV) is either to tune the hyperparameters of the algorithm being used (e.g. K in KNN) or to select between models. In both cases we choose the option that generalizes best to unseen data. For KNN, that means using CV to find the best value of K.
You run KNN for a range of values of K (say 1 to 10) and compare the validation scores.
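To make the return value concrete, here is a minimal sketch of what cross_val_score gives back for a single model. The iris dataset and n_neighbors=5 are assumptions chosen for illustration; they are not from the original question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris stands in for whatever X and y you are working with.
X, y = load_iris(return_X_y=True)

knn = KNeighborsClassifier(n_neighbors=5)  # K=5 is an arbitrary choice here

# With cv=10 the function returns a NumPy array of 10 scores, one per fold.
scores = cross_val_score(knn, X, y, cv=10, scoring='accuracy')

print(scores)         # the individual fold accuracies
print(scores.mean())  # the single number usually used for comparison
print(scores.std())   # how much the accuracy varies between folds
```

The mean is what you compare across models or parameter values; the standard deviation gives a rough idea of how stable that estimate is.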
Model Selection:
Compare different models by calculating the mean of each model's cross-validation scores:
print(cross_val_score(knn, X, y, cv=10, scoring='accuracy').mean())
say 0.97
print(cross_val_score(logreg, X, y, cv=10, scoring='accuracy').mean())
say 0.93
Since 0.97 > 0.93, we can conclude that KNN performs better than logistic regression on this dataset.
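The comparison above can be sketched as a self-contained script. The iris dataset, n_neighbors=5, and max_iter=1000 are assumptions made so the example runs on its own; the scores you get will depend on your data and need not match the 0.97 and 0.93 used for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris stands in for the X and y of the post.
X, y = load_iris(return_X_y=True)

# Assumption: these hyperparameters are illustrative, not tuned.
models = {
    'knn': KNeighborsClassifier(n_neighbors=5),
    'logreg': LogisticRegression(max_iter=1000),
}

mean_scores = {}
for name, model in models.items():
    # Same folds and same metric for every model, so the means are comparable.
    scores = cross_val_score(model, X, y, cv=10, scoring='accuracy')
    mean_scores[name] = scores.mean()
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f}")

# The model with the highest mean accuracy is the candidate to keep.
best_model = max(mean_scores, key=mean_scores.get)
print("best:", best_model)
```

If the two means are very close relative to the spread between folds, the difference may not be meaningful, so it is worth looking at the standard deviation as well.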
Parameter Selection: Finding the best value of K when using KNN
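from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

k_range = range(1, 11)  # K = 1 to 10; range() excludes the upper bound
k_scores = []
for k in k_range:
    knn = KNeighborsClassifier(n_neighbors=k)
    # 10-fold CV returns one accuracy score per fold
    scores = cross_val_score(knn, X, y, cv=10, scoring='accuracy')
    k_scores.append(scores.mean())  # use the average over the folds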
You can then plot the K values against the scores, or simply print the scores, and choose the K that gives the best mean accuracy.
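As a rough sketch of that last step, the loop can be followed by a printout and a simple selection of the best K. The iris dataset is an assumption used only to make the example runnable, and best_k and best_score are names introduced here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris stands in for the X and y of the post.
X, y = load_iris(return_X_y=True)

k_range = range(1, 11)  # K = 1 to 10
k_scores = []
for k in k_range:
    knn = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(knn, X, y, cv=10, scoring='accuracy')
    k_scores.append(scores.mean())

# Print the mean accuracy for each K so the values can be inspected.
for k, score in zip(k_range, k_scores):
    print(f"K={k:2d}  mean accuracy={score:.3f}")

# Pick the K with the highest mean accuracy.
# max() returns the first maximum, so ties go to the smallest K.
best_k, best_score = max(zip(k_range, k_scores), key=lambda pair: pair[1])
print("best K:", best_k, "with mean accuracy", round(best_score, 3))
```

To plot instead of print, the same k_range and k_scores can be passed to a plotting library, for example matplotlib's plt.plot(list(k_range), k_scores).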