Day27 ML Review - Perceptron (2)
Step by Step - Training Perceptron (2)
(Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, 3rd Edition)
Choosing the right classification algorithm for a particular problem involves practice and experience. Each algorithm has its distinct characteristics and is based on specific assumptions.
We will review the following steps for these classification algorithms, starting with the Perceptron.
- Selecting features and collecting labeled training examples.
- Choosing a performance metric.
- Choosing a classifier and optimization algorithm.
- Evaluating the performance of the model.
- Tuning the algorithm.
Step by Step - Training Perceptron (2)
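The variables X_train, X_test, y_train, and y_test used below carry over from the last posting. As a reminder, here is a minimal sketch of that setup, assuming the same Iris data with the petal length and petal width features and the same 70/30 split:

from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the Iris dataset and keep petal length and petal width (columns 2 and 3)
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target

# Hold out 30 percent of the examples as a test set, stratified by class label
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)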
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
sc.fit(X_train)                        # estimate mu and sigma from the training data only
X_train_std = sc.transform(X_train)    # standardize the training features
X_test_std = sc.transform(X_test)      # standardize the test features with the same mu and sigma
Continuing from the last posting, we will proceed with the above steps for standardization. Using the fit method, StandardScaler estimated the parameters $\mu$ (the sample mean) and $\sigma$ (the standard deviation) for each feature dimension from the training data. Calling transform then standardizes both the training set and the test set with these same parameters, so that values in the two sets remain comparable.
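To verify what fit estimated, we can inspect the fitted scaler directly; this is a minimal sketch using the mean_ and scale_ attributes that StandardScaler exposes for $\mu$ and $\sigma$:

import numpy as np

print('mu:   ', sc.mean_)    # per-feature sample mean
print('sigma:', sc.scale_)   # per-feature standard deviation

# transform applies z = (x - mu) / sigma; the manual computation matches it
assert np.allclose(X_train_std, (X_train - sc.mean_) / sc.scale_)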
Having standardized the training data, we can now train a perceptron model. Most algorithms in scikit-learn already support multiclass classification by default via the one-vs.-rest (OVR) method, which allows us to feed the three flower classes to the perceptron all at once.
from sklearn.linear_model import Perceptron
ppn = Perceptron(eta0=0.1, random_state=1)   # learning rate 0.1, fixed seed for shuffling
ppn.fit(X_train_std, y_train)                # learn the weights from the standardized data
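Because of the OVR strategy, the fitted model keeps one weight vector per class internally. A quick way to see this, as a minimal sketch using the coef_ and intercept_ attributes of the fitted estimator (the shapes assume the two-feature Iris setup sketched above):

import numpy as np

print(np.unique(y_train))     # the three class labels: [0 1 2]
print(ppn.coef_.shape)        # (3, 2): one weight vector per class, one weight per feature
print(ppn.intercept_.shape)   # (3,): one bias unit per class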
The model parameter eta0 is equivalent to the learning rate eta that we used in our own perceptron implementation, and the max_iter parameter (named n_iter in older scikit-learn versions) defines the number of epochs (passes over the training dataset).
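For example, to cap training at a fixed number of epochs, we can set max_iter explicitly; this is a minimal sketch, and the variable name ppn_capped is just for illustration. After fitting, the n_iter_ attribute reports how many epochs actually ran, since training may stop early once the loss stops improving (controlled by tol):

# Allow at most 50 passes over the training dataset
ppn_capped = Perceptron(eta0=0.1, max_iter=50, random_state=1)
ppn_capped.fit(X_train_std, y_train)
print('Epochs run:', ppn_capped.n_iter_)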
Finding an appropriate learning rate requires some experimentation. If the learning rate is too large, the updates can overshoot the global cost minimum, causing the algorithm to oscillate and potentially never converge. On the other hand, if the learning rate is too small, the algorithm makes very small updates and needs more epochs until convergence, which makes learning slow, especially for large datasets. Also, we used the random_state parameter to make the shuffling of the training dataset after each epoch reproducible.
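To make that experimentation concrete, one could compare a few candidate learning rates side by side. This is a minimal sketch; the eta0 values are arbitrary choices, and score reports the mean accuracy on the test set:

# Compare test accuracy for several candidate learning rates
for eta0 in (0.0001, 0.01, 0.1, 1.0):
    model = Perceptron(eta0=eta0, random_state=1)
    model.fit(X_train_std, y_train)
    print('eta0=%-7g test accuracy: %.3f' % (eta0, model.score(X_test_std, y_test)))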
Having trained a model in scikit-learn, we can make predictions via the predict method.
y_pred = ppn.predict(X_test_std)
print('Misclassified examples: %d' % (y_test != y_pred).sum())
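Instead of counting misclassifications, we can also report the classification accuracy directly, either through scikit-learn's metrics module or through the estimator's own score method; a minimal sketch:

from sklearn.metrics import accuracy_score

print('Accuracy: %.3f' % accuracy_score(y_test, y_pred))
print('Accuracy: %.3f' % ppn.score(X_test_std, y_test))   # equivalent shortcut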