Build ML Pipelines at Scale with Kubeflow

Prafull Ladha

Artificial Intelligence / Machine Learning

Tags:

Kubeflow

ML workloads

ML pipeline

ML at scale

Setting up an ML stack means assembling many tools for analyzing data and training models in an ML pipeline, and setting up the same stack across multi-cloud environments is even harder. This is where Kubeflow comes into the picture: it makes it easier to develop, deploy, and manage ML pipelines.

In this article, we will learn how to install Kubeflow on Kubernetes (GKE), train an ML model on Kubernetes, and publish the results. This introductory guide should help anyone who wants to understand how to use Kubernetes to run an ML pipeline in a simple, portable, and scalable way.

Kubeflow Installation on GKE

You can install Kubeflow onto any Kubernetes cluster no matter which cloud it is, but the cluster needs to fulfill the following minimum requirements:

4 CPU
50 GB storage
12 GB memory

The recommended Kubernetes version is 1.14 and above.
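As a quick sanity check, the minimum requirements above can be written down as a small comparison. This is only a sketch: the function name and the dictionary keys are hypothetical, and the figures for your own cluster would have to be collected separately (for example, from your cloud console).

```python
# Minimum requirements quoted in this article.
MIN_REQUIREMENTS = {"cpu": 4, "storage_gb": 50, "memory_gb": 12}


def missing_capacity(cluster):
    """Return the shortfall for every resource that is below the minimum.

    `cluster` is a hypothetical dict using the same keys as MIN_REQUIREMENTS.
    An empty result means the cluster meets all of the minimums.
    """
    return {
        name: required - cluster.get(name, 0)
        for name, required in MIN_REQUIREMENTS.items()
        if cluster.get(name, 0) < required
    }


# Example: enough CPU and storage, but 4 GB short on memory.
print(missing_capacity({"cpu": 8, "storage_gb": 100, "memory_gb": 8}))
```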

You need to download kfctl from the Kubeflow website and untar the file:
tar -xvf kfctl_v1.0.2_<platform>.tar.gz -C /home/velotio/kubeflow

Also, install kustomize using these instructions.

Start by exporting the following environment variables:

CODE: https://gist.github.com/velotiotech/8515baf54b6007207cacd7f3ec4a9f72.js

After exporting these variables, we can build the Kubeflow configuration and customize it according to our needs. Run the following command:

CODE: https://gist.github.com/velotiotech/b2ad11c5dfa00d7bb65fb3f8b5e3517e.js

This will download the file kfctl_k8s_istio.v1.0.2.yaml and a kustomize folder. If you want to expose the UI with a LoadBalancer, edit the file $KF_DIR/kustomize/istio-install/base/istio-noauth.yaml and change the type of the istio-ingressgateway service from NodePort to LoadBalancer.
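If you would rather script that change than edit the file by hand, the idea can be sketched in plain Python. The sketch works on a manifest that has already been loaded into a dictionary; reading the multi-document YAML file itself needs a YAML parser and is not shown. The function name is hypothetical, and the trimmed-down Service below (including its namespace) is an assumption for illustration, not a copy of the real file.

```python
import copy


def expose_with_load_balancer(manifest, service_name="istio-ingressgateway"):
    """Return a copy of a Service manifest with spec.type set to LoadBalancer.

    Manifests of another kind or with another name are returned unchanged.
    """
    patched = copy.deepcopy(manifest)
    is_target = (
        patched.get("kind") == "Service"
        and patched.get("metadata", {}).get("name") == service_name
    )
    if is_target:
        patched.setdefault("spec", {})["type"] = "LoadBalancer"
    return patched


# Trimmed-down, hypothetical stand-in for the real Service definition.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "istio-ingressgateway", "namespace": "istio-system"},
    "spec": {"type": "NodePort"},
}
patched_service = expose_with_load_balancer(service)
print(patched_service["spec"]["type"])
```

Working on a copy keeps the original manifest untouched, which makes it easy to compare the two before writing anything back.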

Now, you can install Kubeflow using the following commands:

CODE: https://gist.github.com/velotiotech/02bd57036f42e0ed5f3f57c3b19e4e0e.js

This will install a bunch of services that are required to run the ML workflows.

Once Kubeflow is successfully deployed, you can access the Kubeflow UI dashboard through the istio-ingressgateway service. You can find the IP using the following command:

CODE: https://gist.github.com/velotiotech/cc99f9c01c24b416305cb0651fea00d2.js
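If you want to pick up the address from a script, one option is to parse the JSON form of the service. The sketch below assumes the service is of type LoadBalancer and that its JSON was produced by something like kubectl get svc istio-ingressgateway -n istio-system -o json (the namespace is an assumption). The function name and the sample payload are hypothetical.

```python
import json


def external_address(service_json):
    """Return the first LoadBalancer IP or hostname, or None if none is assigned yet."""
    service = json.loads(service_json)
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None


# Hypothetical, heavily trimmed service JSON with a documentation-range IP.
sample = json.dumps({
    "kind": "Service",
    "metadata": {"name": "istio-ingressgateway"},
    "status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}},
})
print(external_address(sample))
```

A result of None usually just means the cloud provider has not finished provisioning the load balancer, so it is worth retrying after a short wait.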

ML Workflow

Developing your ML application consists of several stages:

Gathering data and data analysis
Researching the model for the type of data collected
Training and testing the model
Tuning the model
Deploying the model

These stages apply to almost any ML problem you're trying to solve, but where does Kubeflow fit into this workflow?

Kubeflow provides its own pipelines to solve this problem. A Kubeflow pipeline consists of the ML workflow description, the different stages of the workflow, and how they combine in the form of a graph.
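To make the "graph of stages" idea concrete, here is a small illustration in plain Python. It does not use the Kubeflow Pipelines SDK; it only models the stages listed earlier as a dependency graph and derives a valid execution order from it. The stage names are hypothetical labels, and the standard-library graphlib module requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it can start.
stages = {
    "gather_data": set(),
    "research_model": {"gather_data"},
    "train_and_test": {"research_model"},
    "tune": {"train_and_test"},
    "deploy": {"tune"},
}

# A topological sort yields an order in which every dependency runs first.
execution_order = list(TopologicalSorter(stages).static_order())
print(execution_order)
```

A real pipeline is rarely a straight line. Stages that do not depend on each other can run in parallel, which is one reason for describing the workflow as a graph rather than a script.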

Kubeflow lets you run your ML pipeline on any hardware, be it your laptop, a cloud, or a multi-cloud environment. Wherever you can run Kubernetes, you can run your ML pipeline.

Training your ML Model on Kubeflow

Once you've deployed Kubeflow in the first step, you should be able to access the Kubeflow UI.

The first step is to upload your pipeline, but before you can do that, you need to prepare it. We are going to train our model on a financial time series dataset. You can find the example code here:

CODE: https://gist.github.com/velotiotech/a183d92dcb81c9a5deec210013e7b0ce.js

The command above builds the Docker images. Next, we create the bucket to store our data and model artifacts.

CODE: https://gist.github.com/velotiotech/f0897435b3096fc9240b7611a90342e1.js

Once we have our image ready in the GCR repo, we can start our training job on Kubernetes. Have a look at the TFJob resource in CPU/tfjob1.yaml and update the image and bucket references.

CODE: https://gist.github.com/velotiotech/91b1c63a4dffcf0b6d457a27e9972f0a.js

Kubeflow Pipelines expects our pipeline to be described in a domain-specific language (DSL) and compiled into a package. We can compile our Python 3 file with dsl-compile, a tool that comes with the Python 3 SDK. So, first, install that SDK:

CODE: https://gist.github.com/velotiotech/bb5bc96a8a17ccd77673a367d042628c.js

Next, inspect ml_pipeline.py and update it with the CPU image path that you built in the previous steps. Then, compile the DSL using:

CODE: https://gist.github.com/velotiotech/e34348dbe3737ac037f396cfccad05e1.js

Now, a file ml_pipeline.py.tar.gz is generated, which we can upload to the Kubeflow Pipelines UI.
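Before uploading, it can be reassuring to peek inside the generated package. The sketch below lists the files in a gzipped tar archive using only the standard library. To stay self-contained, it builds a stand-in archive first; the member name pipeline.yaml and the helper name are assumptions for illustration, so point the function at your own package to see what it really contains.

```python
import io
import os
import tarfile
import tempfile


def list_package_files(package_path):
    """Return the names of the regular files inside a .tar.gz package."""
    with tarfile.open(package_path, "r:gz") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


# Demo with a stand-in archive; a real package would come from the compile step.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "demo_pipeline.tar.gz")
    payload = b"# placeholder workflow definition\n"
    with tarfile.open(demo_path, "w:gz") as archive:
        info = tarfile.TarInfo("pipeline.yaml")  # assumed member name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    demo_files = list_package_files(demo_path)

print(demo_files)
```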

Once the pipeline is uploaded, you can see the stages in a graph-like format.

Next, we can click on the pipeline and create a run. For each run, you need to specify the parameters that you want to use. While the pipeline is running, you can inspect the logs.

Run Jupyter Notebook in your ML Pipeline

You can also interactively define your pipeline from the Jupyter notebook:

1. Navigate to the Notebook Servers through the Kubeflow UI.

2. Select the namespace and click on “new server.”

3. Give the server a name and provide the Docker image for the TensorFlow version on which you want to train your model. I took the TensorFlow 1.15 image.

4. Once a notebook server is available, click on “connect” to connect to the server.

5. This will open up a new window and a Jupyter terminal.

6. Input the following command: pip install -U kfp.

7. Download the notebook using the following command:

CODE: https://gist.github.com/velotiotech/79aec0e153a621f5e5e2415cc614e34d.js

8. Now that you have the notebook, you can replace the environment variables such as WORKING_DIR, PROJECT_NAME, and GITHUB_TOKEN. Once you do that, you can run the notebook step by step (one cell at a time) by pressing Shift+Enter, or you can run the whole notebook by clicking on the menu and selecting the run-all option.
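Rather than hard-coding values such as the GitHub token in a notebook cell, you may prefer to read them from the environment and fail early if one is missing. The sketch below assumes the three settings are available as process environment variables (or passed in as a dictionary); the function name is hypothetical, and the notebook itself may set these values differently.

```python
import os

# Settings the notebook expects, as named in the step above.
REQUIRED_VARIABLES = ("WORKING_DIR", "PROJECT_NAME", "GITHUB_TOKEN")


def load_notebook_settings(environ=None):
    """Collect the notebook settings, raising KeyError if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARIABLES}


# Example with placeholder values passed explicitly instead of os.environ.
settings = load_notebook_settings({
    "WORKING_DIR": "/tmp/work",
    "PROJECT_NAME": "demo-project",
    "GITHUB_TOKEN": "placeholder-token",
})
print(sorted(settings))
```

Calling load_notebook_settings() with no argument reads from os.environ, which keeps secrets out of the notebook file itself.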

Conclusion

The ML world has its own challenges: environments are tightly coupled, and the tools needed to build an ML stack have been extremely hard to set up and configure. This becomes even harder in production environments because you have to be extremely careful not to break the components that are already present.

Kubeflow makes getting started with ML highly accessible. You can run your ML workflows anywhere you can run Kubernetes. Kubeflow makes it possible to run your ML stack in multi-cloud environments, which enables ML engineers to train their models at scale on Kubernetes.


