Machine Learning projects involve much more than simply training models. In real-world environments, organizations require reproducible workflows, experiment tracking, data versioning, automated pipelines, model management, and continuous deployment to ensure reliable and scalable AI solutions. Building ML Pipelines using MLflow & DVC is a comprehensive hands-on course designed to teach learners how to implement modern MLOps practices using industry-leading tools such as MLflow, DVC (Data Version Control), Docker, Flask, GitHub Actions, and AWS.
The course begins with the fundamentals of machine learning project development by covering data collection, project planning, and data preprocessing. Learners will understand how to organize datasets, perform exploratory data analysis (EDA), clean and prepare data, and establish structured workflows that improve collaboration and reproducibility throughout the machine learning lifecycle.
As the course progresses, students will learn how to implement MLflow for experiment tracking, model versioning, and performance monitoring. They will build and improve baseline machine learning models using feature engineering techniques such as Bag of Words (BoW) and TF-IDF, optimize feature selection, handle imbalanced datasets, perform hyperparameter tuning, and develop ensemble models using stacking techniques. Throughout the process, MLflow will be used to compare experiments, monitor model performance, and manage the complete experimentation lifecycle.
The course then introduces Data Version Control (DVC) to build scalable, modular, and reproducible machine learning pipelines. Learners will develop reusable pipeline components for data ingestion, preprocessing, model training, model evaluation, and model registration while understanding how DVC integrates with Git to manage datasets and automate ML workflows. These practices enable teams to collaborate efficiently and maintain consistent machine learning projects across different environments.
Moving beyond model development, the course demonstrates how to transform trained machine learning models into production-ready applications. Learners will build REST APIs using Flask, integrate machine learning models with a Google Chrome Extension, and create complete end-to-end AI solutions capable of delivering real-time predictions through user-friendly interfaces.
Finally, the course focuses on modern deployment and automation using Docker, GitHub Actions, and AWS. Students will containerize machine learning applications, automate testing and deployment through CI/CD pipelines, and deploy production-ready ML solutions on the cloud. By combining MLOps, DevOps, cloud computing, and automation, learners will gain practical experience in managing the complete machine learning lifecycle from data collection to production deployment.
By the end of this course, learners will possess the skills required to build scalable, reproducible, and production-ready machine learning pipelines using modern MLOps tools and industry best practices. Whether you’re an aspiring Machine Learning Engineer, Data Scientist, or MLOps Engineer, this course provides the practical knowledge needed to develop, manage, and deploy enterprise-grade machine learning systems.
What You’ll Learn
- Understand the complete machine learning lifecycle.
- Plan and organize end-to-end ML projects.
- Perform Exploratory Data Analysis (EDA) and data preprocessing.
- Track machine learning experiments using MLflow.
- Build and optimize baseline machine learning models.
- Apply feature engineering techniques including BoW and TF-IDF.
- Handle imbalanced datasets and improve model performance.
- Perform hyperparameter tuning and model stacking.
- Build modular ML pipelines using DVC.
- Version datasets and automate machine learning workflows.
- Develop reusable pipeline components for data ingestion, preprocessing, training, evaluation, and model registration.
- Build REST APIs using Flask for model serving.
- Integrate machine learning models into Google Chrome extensions.
- Containerize applications using Docker.
- Implement CI/CD pipelines with GitHub Actions.
- Deploy machine learning applications on AWS.
- Apply MLOps and DevOps best practices for production environments.
Target Audience
- Machine Learning Engineers
- Data Scientists
- MLOps Engineers
- AI Engineers
- Python Developers
- Software Developers
- Data Engineers
- DevOps Engineers
- Computer Science Students
- Anyone interested in building production-ready machine learning systems
Prerequisites
- Basic knowledge of Python programming.
- Familiarity with machine learning concepts is recommended.
- Basic understanding of Git and GitHub is helpful.
- A computer with Python installed.
- An AWS account (or willingness to create one) for cloud deployment exercises.
- A desire to learn modern MLOps tools and build scalable machine learning pipelines from development to production.
1. Introduction and Project Planning
This lesson introduces project planning strategies for building organized, scalable, and reproducible machine learning workflows.
2. Data Preprocessing (EDA)
This lesson covers Exploratory Data Analysis (EDA) and preprocessing techniques used to clean, understand, and prepare data for machine learning models.
3. Set Up MLflow Server on AWS
This lesson demonstrates how to deploy and configure an MLflow Tracking Server on AWS for centralized experiment management.
1. Building Baseline Model
This lesson focuses on developing an initial baseline machine learning model to establish performance benchmarks.
2. Improving Baseline Model - BoW and TF-IDF
This lesson introduces Bag of Words (BoW) and TF-IDF feature extraction techniques to improve text-based machine learning models.
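To make the two techniques concrete, here is a minimal standard-library sketch, not the course's own code. It assumes whitespace tokenization and the textbook weighting `count * log(N / document_frequency)`; libraries such as scikit-learn apply a smoothed variant, so their numbers will differ. The sample documents and function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; real projects usually use a richer tokenizer.
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of documents."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each raw count by log(N / document frequency) of its term."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weighted = [[row[j] * idf[j] for j in range(len(vocab))] for row in counts]
    return vocab, weighted


# Hypothetical sample documents for illustration only.
docs = ["good video", "bad video", "good good content"]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
```

BoW keeps raw counts, so frequent words dominate. TF-IDF scales each count down for terms that appear in many documents: in the second document, "bad" (one document) receives a higher weight than "video" (two documents).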
3. Improving Baseline Model - Max Features
This lesson explains how limiting the vocabulary to a maximum number of features can improve model performance and reduce computational cost.
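As a rough sketch of the idea, the snippet below keeps only the most frequent terms in the corpus and builds count vectors over that reduced vocabulary. It is a simplified standard-library illustration under assumed whitespace tokenization; the function names and sample documents are hypothetical.

```python
from collections import Counter


def top_k_vocabulary(docs, max_features):
    """Keep the max_features most frequent terms across the whole corpus."""
    totals = Counter(tok for doc in docs for tok in doc.lower().split())
    # Sort by descending frequency; break ties alphabetically so the result is deterministic.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return sorted(term for term, _ in ranked[:max_features])


def vectorize(doc, vocab):
    """Count only the terms that survived the vocabulary cut."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocab]


# Hypothetical sample documents for illustration only.
docs = ["good video", "bad video", "good good content"]
small_vocab = top_k_vocabulary(docs, max_features=2)
vector = vectorize("good good content", small_vocab)
```

A smaller vocabulary yields shorter vectors and faster training, at the cost of discarding rare terms such as "content" here, so the limit is usually tuned by comparing validation scores.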
4. Improving Baseline Model - Handling Imbalanced Data
This lesson explores techniques for handling imbalanced datasets so that under-represented classes are predicted more reliably.
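One common technique is to weight each class inversely to its frequency. The sketch below uses the `n_samples / (n_classes * class_count)` heuristic, the same formula scikit-learn documents for `class_weight="balanced"`. Whether the course uses class weights or resampling is not stated, so treat this as one illustrative option. The label distribution is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced labels: class 0 dominates, class 2 is rare.
labels = [0] * 6 + [1] * 3 + [2] * 1
class_weights = balanced_class_weights(labels)
```

Rare classes receive larger weights, so misclassifying them costs more during training. The weights are scaled so that the total weight over all samples still equals the number of samples.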
5. Hyperparameter Tuning with Multiple Models
This lesson demonstrates how to optimize machine learning models through hyperparameter tuning and comparative evaluation.
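The simplest form of tuning is an exhaustive grid search: evaluate every parameter combination and keep the best one. The sketch below shows that loop with the standard library only. `toy_score` is a hypothetical stand-in for "train a model with these settings and return its validation score", and the parameter names are assumptions rather than the course's actual search space.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Try every combination in param_grid and return the best (params, score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, max_depth):
    # Hypothetical scoring function; a real one would train and validate a model.
    return 1.0 - abs(learning_rate - 0.1) - 0.05 * abs(max_depth - 4)


param_grid = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [2, 4, 8]}
best_params, best_score = grid_search(toy_score, param_grid)
```

Running the same search over several model types, each with its own grid and scoring function, gives a like-for-like comparison. Logging every combination to an experiment tracker keeps that comparison reproducible.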
6. Improving Baseline Model - Stacking Models
This lesson introduces model stacking techniques to combine multiple machine learning models and improve predictive performance.
1. Building ML Pipeline using DVC
This lesson introduces Data Version Control (DVC) and demonstrates how to build a reproducible end-to-end machine learning pipeline.
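The reproducibility idea behind such a pipeline can be illustrated conceptually: hash each stage's dependencies, record the hashes, and rerun a stage only when a hash changes. The sketch below is not DVC's implementation or file format. The `run_stage` helper, the in-memory `lock` dictionary, and the file names are all hypothetical.

```python
import hashlib
import os
import tempfile


def file_hash(path):
    """Content hash of a file; any change to the bytes changes the hash."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def run_stage(name, deps, command, lock):
    """Run command only if a dependency hash differs from the recorded one."""
    current = {dep: file_hash(dep) for dep in deps}
    if lock.get(name) == current:
        return False  # Dependencies unchanged: the stage is up to date.
    command()
    lock[name] = current
    return True


# Hypothetical raw data file in a temporary directory.
workdir = tempfile.mkdtemp()
raw_path = os.path.join(workdir, "raw.csv")
with open(raw_path, "w") as handle:
    handle.write("text,label\nhello,1\n")

lock = {}
executed = []
first = run_stage("preprocess", [raw_path], lambda: executed.append("preprocess"), lock)
second = run_stage("preprocess", [raw_path], lambda: executed.append("preprocess"), lock)

# Changing the data invalidates the recorded hash, so the stage runs again.
with open(raw_path, "a") as handle:
    handle.write("world,0\n")
third = run_stage("preprocess", [raw_path], lambda: executed.append("preprocess"), lock)
```

In a real DVC project, stages and their dependencies are declared in `dvc.yaml`, the recorded hashes live in `dvc.lock`, and `dvc repro` performs the comparison.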
2. Data Ingestion Component
This lesson focuses on building the data ingestion component responsible for collecting and preparing data for the machine learning pipeline.
3. Data Preprocessing Component
This lesson demonstrates how to develop a preprocessing component that cleans and transforms raw data for model training.
4. Model Building Component
This lesson explains how to implement a reusable model training component within a machine learning pipeline.
5. Model Evaluation Component
This lesson demonstrates how to evaluate trained models using performance metrics within an automated ML pipeline.
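As an illustration of what an evaluation step might compute, the sketch below derives accuracy, precision, recall, and F1 for a binary task and serializes them as JSON, a format pipeline tools can track as a metrics file. The labels and predictions are made-up placeholders, and the function name is hypothetical.

```python
import json


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Placeholder labels and predictions for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
metrics_json = json.dumps(metrics, indent=2)
```

Writing `metrics_json` to a file at the end of the stage lets each pipeline run leave behind a record that can be compared across experiments.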
6. Model Registration Component
This lesson explains how to register, version, and manage trained machine learning models for production environments.
1. Flask API Implementation
This lesson demonstrates how to develop a Flask API that connects machine learning models with external applications.
2. Implementation of Chrome Plugin
This lesson demonstrates how to integrate the Flask API with a Google Chrome extension to create a complete machine learning application.
1. Dockerization
This lesson introduces Docker and demonstrates how to containerize machine learning applications for consistent deployment across different environments.
2. AWS CI/CD Deployment with GitHub Actions
This lesson demonstrates how to automate deployment of machine learning applications to AWS using a GitHub Actions CI/CD pipeline.