Preparing the Data for Machine Learning Problems

Our machine learning process will follow these steps:

1- Import Data
2- Clean the Data
3- Split the Data into Training/Test Sets
4- Create a Model
5- Train the Model
6- Make Predictions
7- Evaluate and Improve
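
The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the rows are made up, gender is assumed to be encoded as 0/1, and scikit-learn's DecisionTreeClassifier is just one possible model choice, not necessarily the one this tutorial ends up using.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1- Import data: made-up rows standing in for the real CSV file.
music = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 21, 22, 25, 27, 30, 34],
    "gender": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # assumed 0/1 encoding
    "genre": ["HipHop", "HipHop", "HipHop", "Jazz", "Jazz", "Jazz",
              "Dance", "Dance", "Dance", "Acoustic", "Acoustic", "Acoustic"],
})

# 2- Clean the data: remove duplicate rows and rows with missing values.
music = music.drop_duplicates().dropna()

# 3- Separate input and output, then split into training and test sets.
X = music.drop(columns=["genre"])
y = music["genre"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 4- Create a model (an assumed choice) and 5- train it.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# 6- Make predictions on data the model has not seen.
predictions = model.predict(X_test)

# 7- Evaluate: share of test rows predicted correctly (between 0 and 1).
score = accuracy_score(y_test, predictions)
print(f"accuracy on {len(X_test)} test rows: {score:.2f}")
```

With so few rows the accuracy value means very little; the point is only the order of the steps.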

When we train a model, we give it two separate data sets: the input set and the output set. The output set contains the answers, that is, the values we want the model to predict. During training, the model learns how the inputs relate to those answers.

The CSV file used in this tutorial:

In this tutorial, the data has three columns: age, gender and music genre. Eventually, we will use age and gender to predict the genre.

import pandas as pd

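Now we use the "drop" method to build the input set. It returns a copy of the data without the columns we name, so dropping "genre" leaves only the input columns. It works like this: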

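import pandas as pd
music = pd.read_csv('music.csv')  # assumed file name; use the path to your own CSV
X = music.drop(columns=['genre'])  # input set: every column except 'genre'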
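
To see what "drop" actually does, here is a small self-contained sketch. The rows are made up for illustration, and the column names and the 0/1 gender encoding are assumptions rather than the contents of the real CSV file. It also builds the output set by selecting the "genre" column on its own.

```python
import pandas as pd

# Made-up sample rows standing in for the tutorial's CSV data.
music = pd.DataFrame({
    "age": [20, 26, 31],
    "gender": [1, 1, 0],  # assumed 0/1 encoding
    "genre": ["HipHop", "Jazz", "Classical"],
})

X = music.drop(columns=["genre"])  # input set: all columns except 'genre'
y = music["genre"]                 # output set: the answers to predict

print(list(X.columns))      # ['age', 'gender']
print(list(music.columns))  # ['age', 'gender', 'genre']
print(y.tolist())           # ['HipHop', 'Jazz', 'Classical']
```

Note that "drop" does not change the original data: "music" still has its "genre" column afterwards, which is why we can read the output set from it.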
