Kubeflow - Tool to create large scale ML workflows.

Everything you need to build production-ready Kubeflow applications.

By Nandeshwar

Jan 21, 2023



Main repo: I will keep updating this repo every week. Please support it by adding a star. Thanks!

Meta

Originally created by Google engineers. See the Wikipedia page.

Why Kubeflow?

  • Kubeflow is primarily used to build and run containerized ML workflows that are portable and scalable.
  • It aims to make ML workflows on Kubernetes simple, portable, and scalable.
  • It scales horizontally within the capacity of the cluster and inherits self-healing from Kubernetes, which restarts failed containers.

Common Kubeflow use cases

  1. Deploying web-scale models to production
  2. Providing a shared multi-tenant ML environment
  3. Offering a framework for model experimentation, tracking, and versioning
  4. Running Jupyter notebooks on GPUs
  5. Enabling recurring (periodic) training tasks

Can solve the following bottlenecks

  1. One process blocks other processes from running (compute bottleneck)
  2. No automation or scheduling in the existing infrastructure
  3. No auto-scaling (as a serverless architecture would provide)
  4. No offline experiment tracking

Kubeflow Components

Pipelines

Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers.
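To make the idea of a workflow concrete, here is a minimal conceptual sketch in plain Python. It does not use the Kubeflow Pipelines SDK; it only illustrates how a pipeline can be thought of as a graph of steps, where each step runs once its upstream steps have finished. The step names (load_data, train, evaluate), the STEPS table, and the run_pipeline helper are all hypothetical. In Kubeflow Pipelines each step would run in its own Docker container instead of as a local function.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical steps. Each receives a dict of its upstream outputs.
def load_data(_inputs):
    return [1.0, 2.0, 3.0, 4.0]

def train(inputs):
    data = inputs["load_data"]
    return sum(data) / len(data)  # toy "model": the mean of the data

def evaluate(inputs):
    data, model = inputs["load_data"], inputs["train"]
    return max(abs(x - model) for x in data)  # worst-case absolute error

# Each step name maps to (function, names of upstream steps).
STEPS = {
    "load_data": (load_data, []),
    "train": (train, ["load_data"]),
    "evaluate": (evaluate, ["load_data", "train"]),
}

def run_pipeline(steps):
    """Run every step after its dependencies and collect the outputs."""
    graph = {name: set(deps) for name, (_, deps) in steps.items()}
    outputs = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = steps[name]
        outputs[name] = func({dep: outputs[dep] for dep in deps})
    return outputs

results = run_pipeline(STEPS)
print(results)
```

Because the dependencies are declared explicitly, an orchestrator is free to run independent steps in parallel and on different machines, which is where the portability and scalability come from.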

MLMD

Kubeflow uses Google's ML Metadata (MLMD) library behind the scenes to store all its metadata. MLMD is a library for recording and retrieving metadata associated with ML development, such as artifacts and metrics.
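The kind of bookkeeping a metadata store does can be sketched with a toy in-memory version. This is not the MLMD API; ToyMetadataStore and its methods are hypothetical, and the URIs and the accuracy value are made up for illustration. It only mirrors the general idea of recording artifacts, the executions that use them, and the input/output events that link the two, so that lineage can be queried later.

```python
from dataclasses import dataclass, field

@dataclass
class ToyMetadataStore:
    """Hypothetical in-memory stand-in for a metadata store (not MLMD itself)."""
    artifacts: dict = field(default_factory=dict)   # artifact id -> properties
    executions: dict = field(default_factory=dict)  # execution id -> properties
    events: list = field(default_factory=list)      # (execution id, artifact id, kind)

    def put_artifact(self, type_name, uri, **properties):
        artifact_id = len(self.artifacts) + 1
        self.artifacts[artifact_id] = {"type": type_name, "uri": uri, **properties}
        return artifact_id

    def put_execution(self, type_name, inputs, outputs):
        execution_id = len(self.executions) + 1
        self.executions[execution_id] = {"type": type_name}
        self.events += [(execution_id, a, "INPUT") for a in inputs]
        self.events += [(execution_id, a, "OUTPUT") for a in outputs]
        return execution_id

    def upstream_artifacts(self, artifact_id):
        """Ids of artifacts consumed by the executions that produced artifact_id."""
        producers = {e for e, a, kind in self.events
                     if a == artifact_id and kind == "OUTPUT"}
        return sorted(a for e, a, kind in self.events
                      if e in producers and kind == "INPUT")

store = ToyMetadataStore()
# Made-up example values.
dataset = store.put_artifact("Dataset", "s3://example-bucket/train.csv")
model = store.put_artifact("Model", "s3://example-bucket/model/", accuracy=0.93)
store.put_execution("Trainer", inputs=[dataset], outputs=[model])

print(store.upstream_artifacts(model))
```

With this structure, a question such as "which dataset was this model trained on?" becomes a lookup over the recorded events.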

KServe

Highly scalable and standards-based Model Inference Platform on Kubernetes
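As a rough sketch of what deploying a model with KServe can look like, the snippet below builds an InferenceService manifest as a Python dictionary and prints it as JSON. The service name, namespace, and storage URI are placeholders, not real resources, and the inference_service helper is hypothetical. The field layout follows the serving.kserve.io/v1beta1 InferenceService resource; check it against the KServe version you run.

```python
import json

def inference_service(name, model_format, storage_uri, namespace="default"):
    """Build a minimal KServe InferenceService manifest (hypothetical helper)."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "model": {
                    # Model framework, e.g. "sklearn"
                    "modelFormat": {"name": model_format},
                    # Where the trained model files are stored (placeholder URI)
                    "storageUri": storage_uri,
                }
            }
        },
    }

manifest = inference_service(
    name="sklearn-iris",
    model_format="sklearn",
    storage_uri="gs://example-bucket/models/iris",
)
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with kubectl apply -f would ask KServe to create the service, assuming KServe is installed on the cluster and the storage URI points at a real model.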

Examples

Kubeflow
Pipelines
KServe

Plugins

  • Arena CLI - Arena is a command-line interface that lets data scientists run and monitor machine learning training jobs and check their results easily.
  • Kale - Kale (Kubeflow Automated pipeLines Engine) is a project that aims to simplify the data science experience of deploying Kubeflow Pipelines workflows.

Community

GitHub repo link.


Tags

Kubeflow
ML-Ops
Kubernetes