# 100 Days of MLOps Challenge - Digital Checklist

Use this Markdown file to track your progress in VS Code, Obsidian, or GitHub: tick a checkbox, or change `[ ]` to `[x]` in the source where the viewer does not make checkboxes clickable.

## Module 1: Environment & Tooling
- [x] **Day 1**: Create a Python Virtual Environment for ML
- [x] **Day 2**: Set Up and Configure Jupyter Notebook Server
- [x] **Day 3**: Fix a Broken uv Lockfile Specification
- [x] **Day 4**: Create a Standard ML Project Structure
- [x] **Day 5**: Create a Makefile for ML Workflow Automation
- [x] **Day 6**: Set Up Code Quality Tools for ML Code
- [x] **Day 7**: Package an ML Project as Installable Python Package
- [x] **Day 8**: Configure Pre-Commit Hooks for ML Repository
- [x] **Day 9**: Create a Custom ML Project Template with Cookiecutter

## Module 2: Data Versioning (DVC)
- [x] **Day 10**: Install and Initialize DVC in an ML Project
- [x] **Day 11**: Track a Dataset with DVC
- [x] **Day 12**: Configure a DVC Remote Storage
- [x] **Day 13**: Pull DVC-Tracked Data from Remote
- [x] **Day 14**: Create a DVC Pipeline for Data Processing
- [x] **Day 15**: Parameterize a DVC Pipeline
- [x] **Day 16**: Track ML Metrics with DVC
- [x] **Day 17**: Run and Compare DVC Experiments
- [x] **Day 18**: Version Datasets and Models Across Git Branches
- [x] **Day 19**: Build Complete DVC ML Pipeline with Remote Storage and Experiments

## Module 3: Experiment Tracking (MLflow)
- [x] **Day 20**: Install and Start the MLflow Tracking Server
- [x] **Day 21**: Log an ML Experiment to MLflow
- [x] **Day 22**: Create and Organize MLflow Experiments
- [x] **Day 23**: Search and Query MLflow Runs
- [x] **Day 24**: Enable MLflow Autologging
- [x] **Day 25**: Register, Version, and Manage Model Lifecycle
- [ ] **Day 26**: Compare Model Runs and Select the Best
- [ ] **Day 27**: Load Model from Registry with Custom Preprocessing
- [ ] **Day 28**: Fix a Broken MLflow Project and Re-Run It
- [ ] **Day 29**: Configure MLflow with Remote Tracking Server and Artifact Store
- [ ] **Day 30**: End-to-End MLflow Lifecycle: Train, Register, Serve, Monitor

## Module 4: Model Training & Tuning
- [ ] **Day 31**: Train a Scikit-Learn Model with Reproducible Script
- [ ] **Day 32**: Manage Training Configuration with YAML
- [ ] **Day 33**: Evaluate a Trained Model and Generate Classification Report
- [ ] **Day 34**: Implement Cross-Validation for Model Selection
- [ ] **Day 35**: Hyperparameter Tuning with Optuna
- [ ] **Day 36**: Automated Model Selection with FLAML AutoML
- [ ] **Day 37**: Distributed Model Training with Joblib Parallelization
- [ ] **Day 38**: Build Modular Training Pipeline with Config-Driven Stages
- [ ] **Day 39**: Train a PyTorch Model with GPU Support and Checkpointing
- [ ] **Day 40**: Production Training System: Tracking, Tuning, and Model Selection

## Module 5: Data Quality & Secrets
- [ ] **Day 41**: Install and Initialize a Feast Feature Store
- [ ] **Day 42**: Define Feature Views in Feast
- [ ] **Day 43**: Materialize Features to the Online Store
- [ ] **Day 44**: Store MLflow's Admin Password in HashiCorp Vault
- [ ] **Day 45**: Fix a Broken Vault KV Policy for the MLflow Reader
- [ ] **Day 46**: Author Data-Quality Expectations with Great Expectations
- [ ] **Day 47**: Debug a Failing Great Expectations Checkpoint
- [ ] **Day 48**: Publish Great Expectations Data Docs as a CI Artifact
- [ ] **Day 49**: Secrets + Data-Quality Integration Capstone

## Module 6: Docker Containerization
- [ ] **Day 50**: Create Docker Image for ML Training Environment
- [ ] **Day 51**: Create Multi-Stage Docker Build for ML Serving
- [ ] **Day 52**: Set Up Local ML Dev Environment with Docker Compose
- [ ] **Day 53**: Create GPU-Enabled Docker Image for Deep Learning
- [ ] **Day 54**: Push ML Model Images to Container Registry
- [ ] **Day 55**: Add Health Checks and Graceful Shutdown to ML Containers
- [ ] **Day 56**: Automate ML Docker Image Building in CI Pipeline

## Module 7: Model Serving
- [ ] **Day 57**: Serve an ML Model with Flask
- [ ] **Day 58**: Serve an ML Model with FastAPI
- [ ] **Day 59**: Run Batch Predictions on a Dataset
- [ ] **Day 60**: Package a Model as a BentoML Service
- [ ] **Day 61**: Containerize an ML Model API with Docker
- [ ] **Day 62**: Implement A/B Testing for Model Deployment
- [ ] **Day 63**: Implement Async Batch Prediction with Task Queue
- [ ] **Day 64**: Serve Multiple Models Behind Unified API Gateway
- [ ] **Day 65**: Implement Canary Deployment for Model Updates
- [ ] **Day 66**: Production Model Serving with Docker Compose

## Module 8: Monitoring & Observability
- [ ] **Day 67**: Add Prometheus as a Grafana Data Source
- [ ] **Day 68**: Generate a Model Performance Report
- [ ] **Day 69**: Generate a Data Quality Report
- [ ] **Day 70**: Create Automated Tests with Evidently Test Suites
- [ ] **Day 71**: Set Up Evidently Monitoring Dashboard
- [ ] **Day 72**: Set Up Drift Detection Alerts
- [ ] **Day 73**: Automatic Retraining Triggered by Drift Detection
- [ ] **Day 74**: Monitor Custom Business Metrics Alongside ML Metrics
- [ ] **Day 75**: End-to-End Monitoring: Prometheus, Grafana, Evidently

## Module 9: CI/CD
- [ ] **Day 76**: Create CI Pipeline for ML Code Linting and Testing
- [ ] **Day 77**: Add Data Validation to CI Pipeline
- [ ] **Day 78**: Add Model Validation Tests to CI
- [ ] **Day 79**: Generate Model Performance Reports in CI with CML
- [ ] **Day 80**: Automate Model Registration in CI/CD
- [ ] **Day 81**: Automate Model Deployment with CD Pipeline
- [ ] **Day 82**: End-to-End ML CI/CD Pipeline
- [ ] **Day 83**: Automated Model Rollback with Health Checks
- [ ] **Day 84**: Production ML CI/CD with Multi-Environment Promotion

## Module 10: Orchestration & Kubernetes
- [ ] **Day 85**: Install Argo Workflows on Kubernetes
- [ ] **Day 86**: Create a Basic ML Training Workflow in Argo
- [ ] **Day 87**: Pass Data Between Argo Steps with Output Parameters and Branching
- [ ] **Day 88**: Create an ML Pipeline with Prefect
- [ ] **Day 89**: Parallel Model Training with Argo withItems Fan-Out
- [ ] **Day 90**: Automated Retraining with Argo CronWorkflow
- [ ] **Day 91**: Production ML Pipeline: Argo Workflows + MLflow on Kubernetes
- [ ] **Day 92**: Deploy an ML Model on Kubernetes
- [ ] **Day 93**: Configure HPA for ML Serving Deployment
- [ ] **Day 94**: Deploy a Model with KServe InferenceService
- [ ] **Day 95**: Kubeflow Pipelines - Install and Run a Basic KFP Pipeline
- [ ] **Day 96**: GitOps Model Deployment with ArgoCD

## Module 11: Capstone Projects
- [ ] **Day 97**: Capstone (1/4): End-to-End ML System - Train, Register, Serve
- [ ] **Day 98**: Capstone (2/4): Monitoring and Automated Retraining
- [ ] **Day 99**: Capstone (3/4): Orchestrate the Full MLOps Loop with Argo Workflows
- [ ] **Day 100**: Capstone (4/4): Close the Loop with Prometheus + Grafana Observability

