Alex's Notes

Chollet: Chapter 06: Universal Workflow

Metadata

Core Ideas

The chapter introduces a template workflow for tackling any machine learning problem. It has three main parts:

  • Define the task. Understand the problem domain and business logic, collect a dataset and understand what it represents. Choose evaluation metrics.

  • Develop the model. Prepare the data, pick an evaluation protocol and a baseline to beat, then train a first model that achieves statistical power and scale it up until it can overfit. Finally, regularize and tune the model to get the best generalization performance.

  • Deploy the model. Present results to stakeholders, then ship to a web server, mobile app, web page, embedded device, or whatever fits. Monitor performance in production and keep collecting data to build the next model.

Define the Task

The first step is Task Definition.

When you take on a new machine learning project, first define the problem at hand:

  • Understand the broader context of what you’re setting out to do—what’s the end goal and what are the constraints?

  • Collect and annotate a dataset; make sure you understand your data in depth.

  • Choose how you’ll measure success for your problem—what metrics will you monitor on your validation data?
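A useful way to ground the "measure success" step: before any modeling, compute what a trivial baseline scores on your chosen metric, since that is the number any real model must beat. A minimal sketch in plain Python, with made-up validation labels for an assumed binary classification task:

```python
# Hypothetical validation labels for a binary classification task.
y_val = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]

# A common-sense baseline: always predict the majority class.
majority = max(set(y_val), key=y_val.count)
baseline_acc = sum(1 for y in y_val if y == majority) / len(y_val)
print(baseline_acc)  # 0.7: a model that can't beat this adds nothing
```

The same idea extends to any metric you monitor on validation data: accuracy, ROC AUC, mean absolute error, and so on.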

Develop the Model

The second step is Model Development.

Once you understand the problem and you have an appropriate dataset, develop a model:

  • Prepare your data.
  • Pick your evaluation protocol: holdout validation? K-fold validation? Which portion of the data should you use for validation?

  • Achieve statistical power: beat a simple baseline.

  • Scale up: develop a model that can overfit.

  • Regularize your model and tune its hyperparameters, based on performance on the validation data. A lot of machine learning research tends to focus only on this step, but keep the big picture in mind.
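The evaluation-protocol choice above can be made concrete. A minimal K-fold index generator in plain Python (sample count and K are assumed for illustration; in practice you would typically use scikit-learn's `KFold` or Keras's built-in validation split):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) index pairs for K-fold validation."""
    fold_size = n_samples // k
    indices = list(range(n_samples))
    for fold in range(k):
        # The current fold becomes the validation set...
        val_idx = indices[fold * fold_size:(fold + 1) * fold_size]
        # ...and everything else is used for training.
        train_idx = indices[:fold * fold_size] + indices[(fold + 1) * fold_size:]
        yield train_idx, val_idx

# Toy example: 10 samples, 5 folds of 2 validation samples each.
folds = list(k_fold_indices(10, 5))
```

K-fold is worth the extra compute when the dataset is small enough that a single holdout split gives a noisy validation score.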

Deploy the Model

The final step is Model Deployment.

When your model is ready and yields good performance on the test data, it’s time for deployment:

  • First, make sure you set appropriate expectations with stakeholders.

  • Optimize a final model for inference, and ship a model to the deployment environment of choice—web server, mobile, browser, embedded device, etc.

  • Monitor your model’s performance in production, and keep collecting data so you can develop the next generation of the model.
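The monitoring bullet can be sketched as a toy drift check: compare live accuracy against the accuracy measured at deployment time and flag the model for retraining when the gap grows too large. The numbers and threshold below are assumptions for illustration only:

```python
DEPLOY_ACC = 0.92        # accuracy on the held-out test set (assumed)
ALERT_THRESHOLD = 0.05   # tolerated accuracy drop before retraining (assumed)

def needs_retraining(live_correct, live_total):
    """Flag the model for retraining if live accuracy drifts too far."""
    live_acc = live_correct / live_total
    return (DEPLOY_ACC - live_acc) > ALERT_THRESHOLD

print(needs_retraining(80, 100))  # True: 0.80 is more than 0.05 below 0.92
```

In a real system the live accuracy would come from labeled production samples, and the newly collected data feeds the next iteration of the workflow.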