Preparing the Data for Machine Learning Problems
Our machine learning process will follow these steps:
1- Import Data
2- Clean the Data
3- Split the Data into Training/Test Sets
4- Create a Model
5- Train the Model
6- Make Predictions
7- Evaluate and Improve
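To make the steps above concrete, here is a minimal sketch of the whole workflow. It is only an illustration under several assumptions: it uses scikit-learn with a decision tree (the tutorial has not chosen a model at this point), it builds a small made-up DataFrame instead of reading music.csv, and the column names 'age' and 'gender', the 0/1 gender encoding and all the values are hypothetical.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1- Import Data (hypothetical stand-in for music.csv; values are made up)
music = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 20, 21, 25, 26, 27, 31],
    "gender": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "genre": ["HipHop", "HipHop", "HipHop", "Jazz", "Jazz", "Jazz",
              "Dance", "Dance", "Dance", "Acoustic", "Acoustic", "Acoustic"],
})

# 2- Clean the Data (here: drop rows with missing values and duplicate rows)
music = music.dropna().drop_duplicates()

# 3- Split the Data into Training/Test Sets
X = music.drop(columns=["genre"])  # input set
y = music["genre"]                 # output set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 4- Create a Model (a decision tree is an assumption, not the only option)
model = DecisionTreeClassifier(random_state=0)

# 5- Train the Model
model.fit(X_train, y_train)

# 6- Make Predictions
predictions = model.predict(X_test)

# 7- Evaluate and Improve (accuracy = share of correct predictions)
score = accuracy_score(y_test, predictions)
print(score)
```

With so few rows the accuracy will swing a lot depending on how the data is split, so treat the printed number as a smoke test rather than a real evaluation.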
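When we train a model, we give it two separate data sets: the input set and the output set. The input set holds the values the model learns from, and the output set holds the answers we want it to learn to predict. By training on both, the model learns how the inputs relate to the outputs.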
The CSV file used in this tutorial: bit.ly/3muqqta
In this tutorial, our data has three kinds of values: age, gender and music genre. We will eventually try to predict the music genre from the age and gender.
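import pandas as pd

# Load the CSV file into a DataFrame
music = pd.read_csv('music.csv')
# Display the DataFrame (in a notebook, the last expression of a cell is shown)
music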
Now we use the "drop" method to prepare our input set. It returns a copy of the data without the 'genre' column and leaves the original data unchanged. It works like this:
import pandas as pd

music = pd.read_csv('music.csv')
# Input set: every column except 'genre'
X = music.drop(columns=['genre'])
X
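The input set X is only half of what the model needs; the output set is the 'genre' column on its own. The sketch below shows both side by side. To stay runnable without the CSV file, it uses a small made-up DataFrame; the column names 'age' and 'gender' and all the values are assumptions, and only 'genre' is taken from the code above.

```python
import pandas as pd

# Hypothetical stand-in for music.csv (values are made up)
music = pd.DataFrame({
    "age": [20, 26, 21, 27],
    "gender": [1, 1, 0, 0],
    "genre": ["HipHop", "Jazz", "Dance", "Acoustic"],
})

# Input set: every column except 'genre'
X = music.drop(columns=["genre"])

# Output set: only the 'genre' column
y = music["genre"]

print(X)
print(y)
```

Because "drop" returns a new DataFrame, the music variable still contains the 'genre' column afterwards, which is why y can be selected from it after X is created.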