Day30 ML Review - Logistic Regression (2)
Cost Function of Logistic Regression
Learning the Weights of the Logistic Cost Function
In logistic regression, the weights (or parameters) are crucial in determining the decision boundary that separates the classes. During training, the weights are optimized to minimize the logistic cost function, which evaluates how closely the model’s predictions align with the actual class labels.
The logistic cost function, also known as the log-loss or binary cross-entropy loss, is used to evaluate the performance of a logistic regression model. It is defined as follows:
For a single training example, the cost function $J(\theta)$ is given by: $ J(\theta) = - \left[ y \log(h_\theta(x)) + (1 - y) \log(1 - h_\theta(x)) \right] $
For a dataset with $m$ training examples, the overall cost function is the average of the individual costs: $ J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right] $
Here:
- $\theta$ represents the vector of weights.
- $y^{(i)}$ is the true label for the $i$-th training example.
- $x^{(i)}$ is the feature vector for the $i$-th training example.
- $h_\theta(x^{(i)}) = \sigma(\theta^T x^{(i)})$ is the predicted probability that the output is 1, given by the sigmoid function $\sigma(z) = \frac{1}{1 + e^{-z}}$.
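As a concrete illustration of this formula, here is a minimal NumPy sketch of the cost computation. The array names (`X`, `y`, `theta`) and the toy values are assumptions made for the example, not something from the text above.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    """Average binary cross-entropy J(theta) over m training examples.

    X is assumed to already include a leading column of 1s for the bias term,
    so h_theta(x) = sigmoid(X @ theta).
    """
    m = y.shape[0]
    h = sigmoid(X @ theta)  # predicted probabilities h_theta(x^(i))
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Toy example: 3 training examples, 1 feature plus a bias column of ones.
X = np.array([[1.0, 0.5],
              [1.0, 1.5],
              [1.0, 3.0]])
y = np.array([0, 1, 1])
theta = np.array([-2.0, 1.5])
print(logistic_cost(theta, X, y))
```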
Role of Weights
The weights $\theta = (\theta_0, \theta_1, \ldots, \theta_n)$ determine the influence of each feature $x_j$ on the predicted probability $h_\theta(x)$, as illustrated in the short sketch after this list. Specifically:
- $\theta_0$ is the intercept (bias term).
- $\theta_1, \theta_2, \ldots, \theta_n$ are the coefficients for the respective features $x_1, x_2, \ldots, x_n$.
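The weights and the single feature vector below are made up purely for illustration; the sketch just shows how the intercept and coefficients combine into one predicted probability.

```python
import numpy as np

# Hypothetical learned weights: intercept theta_0 and coefficients theta_1, theta_2.
theta_0 = -1.0
theta = np.array([0.8, -0.4])

# One feature vector x = (x_1, x_2).
x = np.array([2.0, 1.0])

# Linear combination z = theta_0 + theta_1*x_1 + theta_2*x_2, squashed by the sigmoid.
z = theta_0 + theta @ x
probability = 1.0 / (1.0 + np.exp(-z))  # h_theta(x) = P(y = 1 | x)
print(probability)                       # > 0.5 => predict class 1, otherwise class 0
```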
The weights in logistic regression are important parameters that determine the classifier’s decision boundary. They are optimized by minimizing the logistic cost function using gradient descent or other optimization techniques, potentially with regularization to prevent overfitting.
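In practice this minimization is usually delegated to a library. The sketch below uses scikit-learn's `LogisticRegression` (the library named in the cited book) on a small synthetic dataset that is invented purely for illustration; `C` is the inverse of the regularization strength.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic dataset: one informative feature, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = (X[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)

# Smaller C means stronger (by default L2) regularization on the weights.
clf = LogisticRegression(C=1.0)
clf.fit(X, y)

print(clf.intercept_)            # learned bias term theta_0
print(clf.coef_)                 # learned feature weights theta_1, ..., theta_n
print(clf.predict_proba(X[:3]))  # predicted P(y=0) and P(y=1) for the first examples
```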
(Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, 3rd Edition)
The cost approaches 0 if we correctly predict that an example belongs to class 1, i.e., when $h_\theta(x)$ approaches 1 and $y = 1$. Similarly, the cost approaches 0 if we correctly predict $y = 0$. If the prediction is wrong, however, the cost goes toward infinity. The main point is that wrong predictions are penalized with an increasingly large cost.
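The toy calculation below (the probability values are arbitrary) makes this penalty behavior visible: the closer the predicted probability for the true class gets to 0, the larger the per-example cost becomes.

```python
import numpy as np

# Per-example cost when the true label is y = 1: J = -log(h_theta(x)).
for h in [0.99, 0.9, 0.5, 0.1, 0.01]:
    print(f"h_theta(x) = {h:.2f}  ->  cost = {-np.log(h):.3f}")
# As h_theta(x) -> 1 (correct, confident prediction) the cost approaches 0;
# as h_theta(x) -> 0 (confidently wrong) the cost grows without bound.
```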