Preparing the Data for Machine Learning Problems

Our machine learning process will follow these steps:

1- Import Data
2- Clean the Data
3- Split the Data into Training/Test Sets
4- Create a Model
5- Train the Model
6- Make Predictions
7- Evaluate and Improve
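
As a rough sketch of how these seven steps can fit together in code, the example below uses scikit-learn with a decision tree classifier. Both the library and the model choice are assumptions for illustration, not something this tutorial has specified, and the inline rows are made up so the example runs without a CSV file.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1- Import data (hypothetical inline rows standing in for a CSV file)
data = pd.DataFrame({
    'age': [20, 23, 25, 26, 29, 30, 31, 33, 34, 35],
    'gender': [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    'genre': ['HipHop', 'HipHop', 'HipHop', 'Jazz', 'Jazz',
              'Dance', 'Dance', 'Dance', 'Acoustic', 'Acoustic'],
})

# 2- Clean the data: drop rows with missing values and exact duplicates
data = data.dropna().drop_duplicates()

# 3- Split into input/output sets, then into training and test sets
X = data.drop(columns=['genre'])
y = data['genre']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 4- Create a model (assumed choice: a decision tree)
model = DecisionTreeClassifier(random_state=0)

# 5- Train the model on the training set only
model.fit(X_train, y_train)

# 6- Make predictions for the held-out test rows
predictions = model.predict(X_test)

# 7- Evaluate: fraction of test rows predicted correctly
score = accuracy_score(y_test, predictions)
print(score)
```

With so few rows the score says little; the point is only the order of the steps.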

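When we train a model, we give it two separate data sets: the input set and the output set. The input set holds the values the model learns from, and the output set holds the answers, that is, the value we want the model to predict for each row of the input set. During training, the model learns the relationship between the two.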

The CSV file used in this tutorial: bit.ly/3muqqta

In this tutorial, the data set has three columns: age, gender and music genre. Eventually, we will try to predict the music genre from the age and gender.

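import pandas as pd
# Load the CSV file into a DataFrame
music = pd.read_csv('music.csv')
# Display the DataFrame (in a notebook, the last expression is shown)
music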

Next, we use the "drop" method to build the input set. It returns a copy of the data without the 'genre' column and leaves the original DataFrame unchanged:

import pandas as pd
music = pd.read_csv('music.csv')
# Input set: every column except 'genre'
X = music.drop(columns=['genre'])
X
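
The output set can be built in the same spirit by selecting only the 'genre' column. Here is a minimal sketch, using a small made-up DataFrame in place of music.csv so it runs on its own. The rows and the 'age' and 'gender' column names are assumptions for illustration; only 'genre' appears in the code above.

```python
import pandas as pd

# Hypothetical stand-in for music.csv; the values are made up
music = pd.DataFrame({
    'age': [20, 23, 25, 26],
    'gender': [1, 1, 0, 0],
    'genre': ['HipHop', 'HipHop', 'Dance', 'Dance'],
})

# Input set: all columns except the one we want to predict
X = music.drop(columns=['genre'])

# Output set: only the column we want to predict
y = music['genre']

# drop() returned a new DataFrame, so 'genre' is still in the original
print(list(X.columns))
print(list(music.columns))
```

Keeping X and y as separate objects makes it easy to pass them to a model later, since most training functions take the inputs and the expected outputs as two arguments.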
