Kubeflow - Tool to create large-scale ML workflows.
Everything you need to build production-ready Kubeflow applications.
Jan 21, 2023
Main repo: I will keep updating this repo every week. Please support it by adding a star. Thanks!
Meta
Originally created by Google engineers. See the Wikipedia page.
Why Kubeflow?
- Kubeflow is primarily used to build and run containerized ML workflows that are portable and scalable.
- It aims to make deploying ML workflows on Kubernetes simple, portable, and scalable.
- It scales horizontally and inherits self-healing from Kubernetes, which restarts failed containers.
Common Kubeflow use cases
- Deploying web-scale models to production
- Shared multi-tenant ML Environment
- Framework for model experimentation, tracking & versioning
- Running Jupyter notebooks on GPUs
- Enable recurring training tasks (periodic ones)
Can solve the following bottlenecks
- One process blocks other processes from running (compute bottleneck)
- No automation or scheduling in the existing infrastructure
- No auto-scaling (for example, no serverless-style scaling on demand)
- No offline experiment tracking
Kubeflow Components
Pipelines
Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers.
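To make the idea of a workflow built from independent steps more concrete, here is a minimal, standard-library-only sketch of running steps in dependency order. It is a conceptual illustration, not the Kubeflow Pipelines SDK: the step names (`load_data`, `train`, `evaluate`), the `run_pipeline` helper, and the toy "model" are all hypothetical, and in a real pipeline each step would run in its own container.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each function stands in for one containerized pipeline step (hypothetical names).
def load_data(_inputs):
    return [1.0, 2.0, 3.0, 4.0]

def train(inputs):
    data = inputs["load_data"]
    return sum(data) / len(data)  # toy "model": the mean of the data

def evaluate(inputs):
    data, model = inputs["load_data"], inputs["train"]
    return sum((x - model) ** 2 for x in data) / len(data)  # mean squared error

STEPS = {"load_data": load_data, "train": train, "evaluate": evaluate}

# Maps each step to the set of steps whose outputs it consumes.
DEPENDS_ON = {
    "load_data": set(),
    "train": {"load_data"},
    "evaluate": {"load_data", "train"},
}

def run_pipeline(steps, depends_on):
    """Run every step after its dependencies, passing upstream outputs along."""
    outputs, order = {}, []
    for name in TopologicalSorter(depends_on).static_order():
        inputs = {dep: outputs[dep] for dep in depends_on[name]}
        outputs[name] = steps[name](inputs)
        order.append(name)
    return order, outputs

order, outputs = run_pipeline(STEPS, DEPENDS_ON)
print(order)                # ['load_data', 'train', 'evaluate']
print(outputs["evaluate"])  # 1.25
```

Declaring dependencies separately from the step bodies mirrors the general design of pipeline systems: the scheduler only needs the graph to decide what can run next.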
MLMD
Kubeflow uses Google's ML Metadata (MLMD) library behind the scenes to store all metadata. MLMD is a library for recording and retrieving metadata associated with ML development.
It is used to store artifacts, metrics, etc.
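The kind of bookkeeping described here can be sketched with an in-memory SQLite database. This is not the MLMD API: the table layout and the helpers (`create_store`, `put_artifact`, `put_execution`, `put_event`, `inputs_of`) are hypothetical, loosely modeled on the idea of artifacts, executions, and events that link them. The bucket URIs are placeholders.

```python
import sqlite3

def create_store():
    """Create an in-memory store with artifacts, executions, and linking events."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE artifacts (id INTEGER PRIMARY KEY, type TEXT, uri TEXT);
        CREATE TABLE executions (id INTEGER PRIMARY KEY, type TEXT);
        CREATE TABLE events (execution_id INTEGER, artifact_id INTEGER, kind TEXT);
    """)
    return conn

def put_artifact(conn, type_, uri):
    cur = conn.execute("INSERT INTO artifacts (type, uri) VALUES (?, ?)", (type_, uri))
    return cur.lastrowid

def put_execution(conn, type_):
    return conn.execute("INSERT INTO executions (type) VALUES (?)", (type_,)).lastrowid

def put_event(conn, execution_id, artifact_id, kind):
    """Record that an execution consumed (INPUT) or produced (OUTPUT) an artifact."""
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", (execution_id, artifact_id, kind))

def inputs_of(conn, artifact_id):
    """Return URIs of the artifacts consumed by the execution that produced artifact_id."""
    rows = conn.execute("""
        SELECT a.uri
        FROM events AS produced
        JOIN events AS consumed
          ON consumed.execution_id = produced.execution_id AND consumed.kind = 'INPUT'
        JOIN artifacts AS a ON a.id = consumed.artifact_id
        WHERE produced.artifact_id = ? AND produced.kind = 'OUTPUT'
        ORDER BY a.id
    """, (artifact_id,)).fetchall()
    return [row[0] for row in rows]

store = create_store()
dataset = put_artifact(store, "Dataset", "s3://example-bucket/train.csv")
model = put_artifact(store, "Model", "s3://example-bucket/model.pkl")
run = put_execution(store, "Trainer")
put_event(store, run, dataset, "INPUT")
put_event(store, run, model, "OUTPUT")
print(inputs_of(store, model))  # ['s3://example-bucket/train.csv']
```

Linking artifacts to executions through events is what makes lineage questions such as "which dataset produced this model?" a simple query.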
KServe
KServe is a highly scalable, standards-based model inference platform on Kubernetes.
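As a rough illustration of what "standards-based" means in practice, the sketch below builds (but does not send) a prediction request in the style of KServe's V1 inference protocol, where a JSON body with an `instances` list is posted to `/v1/models/<name>:predict`. The host name, the model name `sklearn-iris`, and the `build_predict_request` helper are assumptions for illustration. A real deployment has its own URL and may require authentication headers.

```python
import json
from urllib.request import Request

def build_predict_request(host, model_name, instances):
    """Build a V1-protocol style predict request without sending it."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical host and model name; replace with the values of your deployment.
req = build_predict_request(
    "sklearn-iris.example.com", "sklearn-iris", [[6.8, 2.8, 4.8, 1.4]]
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it. That step is left out here because it needs a running inference service.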
Examples
Kubeflow
- Kubeflow overview and implementation - Example of using the Kubeflow framework in production.
Pipelines
- Kubeflow Examples - Official Kubeflow examples repository.
- Kubeflow pipeline termination notification
- Kubeflow MLOps: Automatic pipeline deployment with CI / CD / CT
- From Notebook to Kubeflow Pipelines with HP Tuning
- Periodic or Recurring runs of Kubeflow pipelines
KServe
Plugins
- Arena CLI - Arena is a command-line interface that lets data scientists run and monitor machine learning training jobs and check their results easily.
- Kale - KALE (Kubeflow Automated pipeLines Engine) is a project that aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.
Community
GitHub repo link.