Learning and Predicting, Machine Learning
Day 22 of #100DaysOfCode
The previous blog post may help you follow this one: ilkecandan.hashnode.dev/preparing-the-data-..
Building a model with a machine learning algorithm is the next stage in our Machine Learning journey. Many algorithms have already been developed, and each has its benefits and drawbacks, typically trade-offs between accuracy and performance. For now we'll use a decision tree, which is a fairly basic method.
Decision trees are a type of supervised machine learning: in the training data, you specify what the input is and what the related output is. The algorithm then repeatedly splits the data according to a parameter. We don't need to program these methods ourselves, because they have already been implemented for us in the "scikit-learn" library.
"scikit-learn" is one of the most popular machine learning libraries in Python, and "sklearn" is the package it provides. We will use a class called "DecisionTreeClassifier", which implements the decision tree algorithm. To use it, we need to create a new instance of the class. Let's create an object called "model" and set it to a new instance of "DecisionTreeClassifier".
Now we have the model. Next, we have to train it so that it learns patterns in the data. To do that, we call the model's "fit" method. This method takes two data sets: the input set and the output set. We will define them as X and y.
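To make the idea of "splitting the data according to a parameter" more concrete, here is a minimal sketch in plain Python. It is not how scikit-learn is implemented internally; it only illustrates the principle of trying several thresholds on one feature and keeping the one that separates the labels best, measured here with Gini impurity (the default criterion of "DecisionTreeClassifier"). The ages and genres are made-up toy values, and the helper names "gini" and "best_threshold" are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_threshold(values, labels):
    """Try a threshold between each pair of adjacent distinct values and
    return the (threshold, weighted_impurity) pair with the lowest impurity."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best


# Made-up toy data: ages and the genre each person likes.
ages = [20, 23, 25, 26, 29, 30]
genres = ["HipHop", "HipHop", "HipHop", "Jazz", "Jazz", "Jazz"]

threshold, impurity = best_threshold(ages, genres)
print(threshold, impurity)
```

In this toy data, splitting at age 25.5 puts every "HipHop" listener on one side and every "Jazz" listener on the other, so the impurity after the split is 0.0. A real decision tree repeats this kind of search on each resulting group until the groups are pure or a stopping rule applies.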
Like this:
model.fit(X, y)
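If you want to see what the input set and the output set look like before working with the real file, the sketch below builds a tiny stand-in for music.csv in memory. The rows are made up, and the column names "age" and "gender" are assumptions; only "genre" is taken from this post.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for music.csv; the rows are made up and the
# "age" and "gender" column names are assumed.
music = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30],
    "gender": [1, 1, 1, 1, 1, 1],
    "genre": ["HipHop", "HipHop", "HipHop", "Jazz", "Jazz", "Jazz"],
})

X = music.drop(columns=["genre"])  # input set: every column except the label
y = music["genre"]                 # output set: the label column

model = DecisionTreeClassifier()
model.fit(X, y)

print(X.shape, y.shape)   # number of rows and columns in each set
print(model.classes_)     # the distinct labels the model has learned
```

After "fit" has run, the "classes_" attribute lists the labels the model can predict, which is a quick way to check that training used the column you intended.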
Finally, we can ask the model to make a prediction. We can ask it something like "What kind of music would a 20-year-old female listen to?"
But first, let's import our initial data.
So our code looks like this:
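import pandas as pd
from sklearn.tree import DecisionTreeClassifier
# Load the data set
music = pd.read_csv('music.csv')
# Input set: every column except 'genre'
X = music.drop(columns=['genre'])
# Output set: the 'genre' column
y = music['genre']
# Create the model and train it
model = DecisionTreeClassifier()
model.fit(X, y)
music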
The last line, music, displays the data set.
The data set does not need to contain every possible age. For ages that fall between the ones listed, our model can predict which genre those listeners will like.
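Here is a small sketch of that idea. The training rows are made up and the "age" and "gender" column names are assumptions, so this is not the real music.csv. Ages 22 and 28 never appear in the training rows, yet the model still returns a genre for them.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up training rows; note that ages 22 and 28 are not among them.
music = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30],
    "gender": [0, 1, 0, 1, 0, 1],
    "genre": ["HipHop", "HipHop", "HipHop", "Jazz", "Jazz", "Jazz"],
})

X = music.drop(columns=["genre"])
y = music["genre"]

model = DecisionTreeClassifier()
model.fit(X, y)

# Ask about two ages that fall between the ones in the training data.
unseen = pd.DataFrame({"age": [22, 28], "gender": [0, 1]})
predictions = model.predict(unseen)
print(predictions)
```

In this toy data everyone aged 25 or younger likes HipHop and everyone aged 26 or older likes Jazz, so the tree places 22 on the HipHop side and 28 on the Jazz side.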
Let's move ahead and predict the music genre according to age and gender. We delete the last line and add these "predictions" lines:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
music = pd.read_csv('music.csv')
X = music.drop(columns=['genre'])
y = music['genre']
model = DecisionTreeClassifier()
model.fit(X, y)
# Each inner list is one person: [age, gender]
predictions = model.predict([[21, 0], [29, 1]])
predictions
0 means female and 1 means male. We are asking what a 21-year-old female and a 29-year-old male would like to listen to.
Output:
array(['Dance', 'Jazz'], dtype=object)
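If you are curious how a trained tree arrives at its answers, scikit-learn can print the learned rules as text with "export_text". The sketch below uses made-up rows rather than the real music.csv, and the "age" and "gender" column names are assumptions, so the rules it prints will differ from the ones learned from the actual data.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up rows standing in for music.csv; column names other than
# "genre" are assumed.
music = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30],
    "gender": [0, 1, 0, 1, 0, 1],
    "genre": ["HipHop", "HipHop", "HipHop", "Jazz", "Jazz", "Jazz"],
})

X = music.drop(columns=["genre"])
y = music["genre"]

model = DecisionTreeClassifier()
model.fit(X, y)

# Render the learned splits as indented text, using the column names
# instead of generic feature indices.
rules = export_text(model, feature_names=list(X.columns))
print(rules)
```

Each indented line is one split on a feature, and the "class" lines at the ends of the branches are the genres the model predicts.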