3 minute read

Designing Machine Learning Systems: Framing ML Problems (2) (Types of ML Tasks & Objective Functions)



Understanding ML Task Types

The nature of your output determines the type of ML task you’re solving. The two most common types are classification and regression.

  • Classification assigns inputs to discrete categories (e.g., spam vs. not spam, or selecting a department).
  • Regression predicts a continuous value, such as estimating house prices.

Interestingly, many problems can be reframed from one to the other.

A regression task, such as house pricing, can be transformed into a classification task if prices are bucketed into ranges. Likewise, spam detection could be modeled as a regression problem if the output is a probability score, which is then thresholded for classification.
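
As a minimal, hypothetical sketch of both directions (the price buckets and the 0.5 spam threshold below are arbitrary choices for illustration, not values from the book):

    # Regression -> classification: bucket continuous prices into discrete ranges.
    def price_to_bucket(price):
        if price < 200_000:
            return "low"
        elif price < 500_000:
            return "mid"
        return "high"

    # Classification -> regression: threshold a continuous spam score.
    def spam_label(score, threshold=0.5):
        return "spam" if score >= threshold else "not spam"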


Subtypes of Classification

Binary vs. Multiclass
  • Binary classification involves two classes: fraud versus non-fraud, positive versus negative sentiment, etc.
  • Multiclass classification involves three or more classes. For instance, routing customer support tickets to one of four departments.

The complexity increases with the number of classes, especially when the data is imbalanced and some classes have very few examples. A common rule of thumb: a model needs at least 100 examples per class to learn that class effectively.

High Cardinality & Hierarchical Classification

When the number of classes grows to hundreds or thousands, such as in disease diagnosis or product categorization, we face high cardinality challenges. To manage this, we often use hierarchical classification, where:

  • A top-level model predicts a coarse category, such as “Fashion.”
  • A second-level model narrows down to a subcategory (e.g., “Shoes”).
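
A minimal sketch of this two-stage setup, assuming already-trained models that each expose a predict method for a single item (the model and category names are illustrative):

    def hierarchical_predict(item, top_level_model, subcategory_models):
        # Stage 1: coarse category, e.g. "Fashion".
        category = top_level_model.predict(item)
        # Stage 2: a category-specific model narrows it down, e.g. "Shoes".
        subcategory = subcategory_models[category].predict(item)
        return category, subcategory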

Multilabel Classification

In multilabel problems, one instance can belong to multiple classes. For example, a news article might be labeled as both tech and finance. This is different from multiclass classification, where each instance belongs to exactly one class.

Multilabel problems introduce complications:

  • Label disagreement: Annotators may differ on how many labels to assign.
  • Prediction extraction: It’s unclear how many labels to predict per instance, especially when model outputs are probability scores like [0.45, 0.2, 0.02, 0.33].

To handle multilabel tasks, we either:

  1. Use a binary classifier for each label.
  2. Treat it as a vector prediction problem with multi-hot encoded targets.
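
A sketch of the second framing, assuming the model already outputs one probability per label and we keep every label whose score clears a chosen cutoff (the label names and the 0.3 threshold are illustrative):

    import numpy as np

    LABELS = ["tech", "finance", "sports", "politics"]

    def extract_labels(probs, threshold=0.3):
        # Multi-hot decision: keep each label whose probability clears the threshold.
        return [label for label, p in zip(LABELS, np.asarray(probs)) if p >= threshold]

    print(extract_labels([0.45, 0.2, 0.02, 0.33]))  # ['tech', 'politics']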


Framing Matters: Multiple Perspectives

How you frame a problem can make a massive difference in scalability and maintenance.

Take the example of predicting which app a user will open next. A naive approach is to treat it as multiclass classification, outputting a probability distribution over N possible apps. But this approach breaks down if a new app is added—you’d need to retrain the model from scratch.

Instead, you can frame it as a regression problem:

  • Input: user features, environment features, and the candidate app’s features
  • Output: a score between 0 and 1 representing the likelihood that the user opens that app

Now, simply run N predictions—one per app—and rank them. This decoupled framing is flexible and future-proof, as new apps can be evaluated without retraining the model.
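
A sketch of this framing at inference time, assuming a trained scoring model with a predict method that returns a likelihood between 0 and 1 (all names here are illustrative):

    def rank_apps(user_features, env_features, candidate_apps, scoring_model):
        # One independent prediction per candidate app; adding a new app
        # just adds a new candidate, with no retraining required.
        scored = [
            (app["name"], scoring_model.predict({**user_features, **env_features, **app}))
            for app in candidate_apps
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)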


Defining the Objective Function

To guide learning, every machine learning (ML) model requires an objective function, also referred to as a loss function. In supervised learning, this function measures the difference between the model’s output and the true label.

Standard objective functions include:

  • RMSE or MAE for regression
  • Log loss for binary classification
  • Cross-entropy loss for multiclass classification

For example, if your model predicts [0.45, 0.2, 0.02, 0.33] for a politics-labeled article [0, 0, 0, 1], you can use cross-entropy to measure the loss.

  • In Python:

    import numpy as np

    def cross_entropy(p, q):
        # p: true distribution (one-hot label), q: predicted probabilities
        return -np.sum(np.asarray(p) * np.log(q))

    # cross_entropy([0, 0, 0, 1], [0.45, 0.2, 0.02, 0.33]) ≈ 1.11

This loss becomes the signal that guides the learning algorithm during training.
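
The regression losses mentioned above (RMSE and MAE) follow the same pattern; a minimal NumPy sketch:

    import numpy as np

    def rmse(y_true, y_pred):
        # Root mean squared error: penalizes large errors more heavily.
        return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

    def mae(y_true, y_pred):
        # Mean absolute error: average magnitude of the errors.
        return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))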


Decoupling Objectives in Multi-Goal Systems

Real-world applications often require optimizing multiple goals, and these goals can conflict. Take a newsfeed ranking system:

  • You want to maximize engagement (e.g., clicks and likes).
  • You also want to filter out spam, NSFW content, and misinformation.
  • You may even want to promote quality content.

How do you balance these competing objectives?

  • One way is to combine losses:

    total_loss = α * quality_loss + β * engagement_loss
    

    But tuning α and β requires retraining the model each time priorities change.

  • An alternative is to decouple the objectives:

    • Train one model to score engagement.
    • Train another model to score quality.
    • Combine outputs at inference time with a weighted sum:
    final_score = α * quality_score + β * engagement_score
    

This allows you to adjust your business trade-offs without retraining. It also supports independent updates: spam filters can be retrained more frequently than quality scorers, for instance.
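
A minimal sketch of the decoupled setup at inference time, assuming two independently trained models and illustrative weights:

    def rank_posts(posts, quality_model, engagement_model, alpha=0.7, beta=0.3):
        # Each objective is scored by its own model; only the weights
        # alpha and beta encode the business trade-off, so changing
        # priorities does not require retraining either model.
        def final_score(post):
            return alpha * quality_model.predict(post) + beta * engagement_model.predict(post)
        return sorted(posts, key=final_score, reverse=True)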


