Getting Started
This guide will help you set up your environment and run your first ML pipeline.
Prerequisites
- Python 3.8 or higher
- pip package manager
- Git
- (Optional) Docker for containerized execution
- (Optional) AWS CLI for cloud deployments
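Before installing, it can help to confirm that the tools listed above are visible from your shell. The following is a minimal sketch using only the Python standard library; the function name `check_prerequisites` is illustrative and not part of this project, and the executable names `docker` and `aws` are assumed to be the usual ones for Docker and the AWS CLI.

```python
import importlib.util
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def check_prerequisites():
    """Return a mapping of prerequisite name -> True if it appears available."""
    return {
        "python": sys.version_info >= MIN_PYTHON,
        # pip is checked as a module importable by the current interpreter
        "pip": importlib.util.find_spec("pip") is not None,
        # the remaining tools are looked up on PATH
        "git": shutil.which("git") is not None,
        "docker (optional)": shutil.which("docker") is not None,
        "aws (optional)": shutil.which("aws") is not None,
    }


for name, found in check_prerequisites().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

A missing optional tool only matters if you plan to use containerized execution or cloud deployments.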
Installation
1. Clone the Repository
git clone https://bitbucket.org/wilsonify/mlops-with-mlflow.git
cd mlops-with-mlflow
2. Create Virtual Environment
python3 -m venv .venv
source .venv/bin/activate # Linux/macOS
# or
.venv\Scripts\activate # Windows
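To confirm that the activation took effect, you can ask the interpreter whether it is running inside a virtual environment. This is a small sketch, not part of the project; `in_virtualenv` is an illustrative name, and the check relies on `sys.prefix` differing from `sys.base_prefix`, which holds for environments created with `python3 -m venv`.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

If the first line reports `False`, rerun the activation command for your platform before installing dependencies.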
3. Install Dependencies
python -m pip install --upgrade pip
pip install -r requirements.txt
4. Install Local Libraries
The project includes several shared libraries under src/. Install each one in editable mode:
pip install -e src/doe-library
pip install -e src/feature-library
pip install -e src/io-library
pip install -e src/metrics-library
pip install -e src/plot-library
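The five commands above can also be driven from a short script, which is convenient when setting up a fresh environment repeatedly. The sketch below is an assumption-laden convenience, not something shipped with the repository: the helper names `build_install_commands` and `install_all` are hypothetical, and it assumes you run it from the project root so that `src/` resolves correctly. By default it only prints the commands.

```python
import subprocess
import sys
from pathlib import Path

# Library directories as listed in this guide
LIBRARIES = [
    "doe-library",
    "feature-library",
    "io-library",
    "metrics-library",
    "plot-library",
]


def build_install_commands(src_dir="src"):
    """Build one editable-install command per shared library."""
    return [
        [sys.executable, "-m", "pip", "install", "-e", str(Path(src_dir) / name)]
        for name in LIBRARIES
    ]


def install_all(src_dir="src", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_install_commands(src_dir):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failed install
            subprocess.run(cmd, check=True)


install_all(dry_run=True)  # prints the commands without running them
```

Calling `install_all(dry_run=False)` runs the installs with the same interpreter that executes the script, so the packages land in the active virtual environment.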
Running Your First Pipeline
Option 1: Scikit-learn Pipeline (Fraud Detection)
cd src/mlflow-sklearn
# Install the package
pip install -e .
# Run the full pipeline
make all
This runs the following pipeline stages:
- Create the training dataset
- Convert CSV to Parquet
- Preprocess the data
- Train the model
- Score
- Evaluate
- Validate
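The general pattern behind a staged pipeline like this can be sketched as a list of named steps run one after another, stopping at the first failure. The code below is purely illustrative: the Makefile's actual targets are not shown in this guide, so the stage names are hypothetical labels that mirror the list above, and each stage is a placeholder that does nothing.

```python
# Hypothetical stage labels mirroring the list in this guide;
# the real Makefile targets may be named differently.
STAGE_NAMES = [
    "create_dataset",
    "csv_to_parquet",
    "preprocess",
    "train",
    "score",
    "evaluate",
    "validate",
]


def run_stages(stages):
    """Run (name, callable) pairs in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, func in stages:
        try:
            func()
        except Exception as exc:
            print(f"stage {name!r} failed: {exc}")
            break
        completed.append(name)
    return completed


def _placeholder():
    """Stand-in for real stage logic."""
    return None


stages = [(name, _placeholder) for name in STAGE_NAMES]
print(run_stages(stages))
```

Stopping at the first failure keeps later stages from running on missing or stale outputs from an earlier one.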
Option 2: TensorFlow Pipeline (Image Classification)
cd src/mlflow-tf
# Install the package
pip install -e .
# Run the full pipeline
python run_pipeline.py all
Starting MLflow UI
To view your experiments:
# From the project root
mlflow ui --port 5000
Then open http://localhost:5000 in your browser.
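If the page does not load, a quick way to tell whether the server is listening is to request the URL from Python. This is a minimal sketch using only the standard library; `is_ui_up` is an illustrative helper name, and the default URL assumes the port used in the command above.

```python
import http.client
import urllib.request


def is_ui_up(url="http://localhost:5000", timeout=2.0):
    """Return True if an HTTP GET to the URL succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error responses
        return False


print("MLflow UI reachable:", is_ui_up())
```

A `False` result usually means the `mlflow ui` process is not running or is bound to a different port.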