Testing Machine Learning Algorithms with K-Fold Cross Validation


Norbert Krupa wrote a blog post on choosing a machine learning algorithm and then testing it with a validation technique. He uses Talend Studio, without hand coding.

Taking the example of predicting a user’s activity from mobile phone accelerometer data, the model must classify each observation into one of three categories: resting, walking, or running.

This classification exercise presents common algorithms such as Logistic Regression, Decision Tree, Random Forest, and Naïve Bayes. In K-Fold Cross Validation, the dataset is partitioned into K folds of roughly equal size. Each fold serves once as the test set while the remaining K-1 folds form the training set, so the model is trained and evaluated K times and the K scores are averaged.
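The article itself builds everything in Talend Studio without hand coding, so the following is not the author's method. It is only a minimal Python sketch of the K-fold idea, using the standard library alone. The names `k_fold_indices` and `cross_validate`, and the `fit` and `score` callables, are hypothetical helpers introduced for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists, one pair per fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(fit, score, X, y, k=5):
    """Train on K-1 folds, score on the held-out fold, and repeat K times.

    `fit(X_train, y_train)` returns a model and
    `score(model, X_test, y_test)` returns a number (assumed interfaces).
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        held_out = score(model,
                         [X[i] for i in test_idx],
                         [y[i] for i in test_idx])
        scores.append(held_out)
    return scores
```

Averaging the returned list gives the cross-validated estimate for one algorithm, and repeating the call with a different `fit` lets you compare algorithms on the same folds. This sketch splits the data in its existing order; in practice the rows are usually shuffled first so that each fold contains a mix of classes.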

Interesting article, with how-to steps and loads of screenshots.


Tags machine-learning big-data