Challenge 4: Automagic training with pipelines


Introduction

The previous challenge introduced build pipelines. There are other types of pipelines, though, and this challenge gets you started with Agent Platform Pipelines for Continuous Training. In our example, the continuous training pipeline extracts data from BigQuery, validates it, prepares it, trains a model with it (using the Python package built during the previous challenge), evaluates that model, and registers it in Agent Platform Model Registry.

Description

If you’ve successfully completed the previous challenge, your training code has been packaged and can be run from an Agent Platform Pipeline.

The provided project has a pipeline.py file that generates a pipeline definition. Run it to generate a pipeline definition file in YML format, then use that file to create a new Pipeline Run through the GCP Console. Make sure to use the service account sa-mlops-kfp (Kubeflow Pipelines Service Account). In the Runtime configuration step, fill in the required pipeline parameters, but do not set or override the endpoint and monitoring_job parameters (keep their default values).
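Before filling in the Runtime configuration form, it can help to write down the values you intend to enter. The following is only a minimal sketch in plain Python, not part of the provided project: the helper names, the bucket name, the region, and the package file name are all hypothetical placeholders. It shows one way to derive a pipeline root inside a bucket and to leave endpoint and monitoring_job out of the values you set, so their defaults apply.

```python
# Parameters the challenge says to leave at their default values.
KEEP_DEFAULT = {"endpoint", "monitoring_job"}


def pipeline_root(bucket: str, folder: str = "pipelines") -> str:
    """Return a gs:// URI inside the bucket to use as the pipeline root."""
    name = bucket[len("gs://"):] if bucket.startswith("gs://") else bucket
    name = name.strip("/")
    return f"gs://{name}/{folder}" if folder else f"gs://{name}"


def runtime_parameters(values: dict) -> dict:
    """Drop empty values and any parameter that must keep its default."""
    return {
        key: value
        for key, value in values.items()
        if key not in KEEP_DEFAULT and value not in (None, "")
    }


# All values below are hypothetical placeholders, not the real ones.
params = runtime_parameters(
    {
        "location": "us-central1",
        "python_pkg": "my_trainer-0.1.tar.gz",
        "endpoint": "",          # left out so the default applies
        "monitoring_job": None,  # left out so the default applies
    }
)
print(pipeline_root("my-bucket"))
print(params)
```

The resulting dictionary holds only the parameters you would actually type into the form; anything listed in KEEP_DEFAULT is dropped even if a value is supplied by mistake.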

 Note
Once the pipeline is triggered, it will take ~10 minutes to complete.

Success Criteria

  1. There’s at least one successful Agent Platform Pipeline run that has generated a Managed Model in the Model Registry.
  2. No code was modified.

Tips

  • Read pipeline.py to understand what it does.
  • Note that pipeline.py can generate either a JSON or a YML pipeline definition file, depending on the extension of the output file name.
  • You can either upload the pipeline definition from a local machine, or put it on GCS and refer to its location.
  • The service account can be configured in the Run details phase when you expand the Advanced options.
  • You have already created a bucket; you can use it as the pipeline root (optionally add a pipelines folder in it).
  • For the location parameter, look up the region of the storage bucket created in the first challenge.
  • For the python_pkg parameter, check the Cloud Build pipeline to find out where the created Python package is stored, then browse to that location to get the name of the package.
  • If you’re in doubt about the parameters, remember to Use the Force and read the Source ;)
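If reading the source is not enough, the generated definition file itself lists the pipeline parameters. The sketch below assumes the definition was generated as JSON and follows the Kubeflow Pipelines v2 layout, where parameters live under root.inputDefinitions.parameters and carry an optional defaultValue; that layout, the required_parameters helper, and the sample document are assumptions for illustration, so compare them against your own generated file.

```python
import json

# Hypothetical, heavily trimmed stand-in for a generated definition file.
DEFINITION_JSON = """
{
  "root": {
    "inputDefinitions": {
      "parameters": {
        "location": {"parameterType": "STRING"},
        "python_pkg": {"parameterType": "STRING"},
        "endpoint": {"parameterType": "STRING", "defaultValue": "example-default"},
        "monitoring_job": {"parameterType": "STRING", "isOptional": true}
      }
    }
  }
}
"""


def required_parameters(definition: dict) -> list:
    """Return the sorted names of parameters that have no default value."""
    # Some compiler versions wrap the spec in a "pipelineSpec" key.
    spec = definition.get("pipelineSpec", definition)
    declared = (
        spec.get("root", {}).get("inputDefinitions", {}).get("parameters", {})
    )
    return sorted(
        name
        for name, info in declared.items()
        if "defaultValue" not in info and not info.get("isOptional", False)
    )


definition = json.loads(DEFINITION_JSON)
print(required_parameters(definition))
```

To try it on a real file, replace the inline string with the contents of your own JSON definition, for example via json.load on an opened file.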

Learning Resources
