Contributing
We welcome contributions to the MLOps platform! This guide explains how to contribute code, documentation, and bug fixes.
Getting Started
1. Fork the Repository
# Clone your fork
git clone git@bitbucket.org:YOUR_USERNAME/mlops-with-mlflow.git
cd mlops-with-mlflow
# Add upstream remote
git remote add upstream git@bitbucket.org:wilsonify/mlops-with-mlflow.git
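You can confirm that both remotes are set up correctly with:
git remote -v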
2. Set Up Development Environment
This project uses uv for Python dependency management. See the UV Guide for comprehensive documentation.
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Sync all dependencies and create virtual environment
uv sync
# This installs all dependencies and packages in development mode
That’s it! No need to manually create venvs or install requirements.
Note: For detailed uv usage, see the UV Package Manager Guide.
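Commands run inside the project environment via uv run, so there is no need to activate the virtual environment manually; for example:
# Run any command inside the project environment
uv run python --version
uv run pytest --version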
3. Create a Feature Branch
git checkout -b feature/my-new-feature
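Branch prefixes that mirror the commit types used below (feature/, fix/, docs/) are a common convention; the names here are only examples:
git checkout -b fix/training-memory-leak
git checkout -b docs/update-contributing-guide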
Code Style
Python Style
We follow PEP 8 with some modifications:
- Line length: 120 characters (configured in pyproject.toml)
- Use type hints for all function signatures
- Docstrings: Google style
- Python 3.13+ required
import pandas as pd

def process_data(
input_path: str,
output_path: str,
batch_size: int = 32
) -> pd.DataFrame:
"""Process input data and save results.
Args:
input_path: Path to input data file.
output_path: Path to save processed data.
batch_size: Number of samples per batch.
Returns:
Processed DataFrame.
Raises:
FileNotFoundError: If input file doesn't exist.
"""
pass
Linting and Formatting
We use ruff for linting and formatting (configured in root pyproject.toml).
# Format code
uv run ruff format .
# Check code
uv run ruff check .
# Fix auto-fixable issues
uv run ruff check --fix .
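To verify formatting without changing any files (handy right before pushing), ruff also supports a check-only mode:
# Verify formatting without modifying files
uv run ruff format --check .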
Testing
Running Tests
# Run all tests
make test
# or
uv run pytest
# Run specific test file
uv run pytest tests/test_feature.py
# Run specific test function
uv run pytest tests/test_feature.py::test_function_name
# Run with coverage
uv run pytest --cov=src --cov-report=html
# Run tests in a specific package
cd src/doe-library
uv run pytest tests/
Writing Tests
All tests must:
- Use pytest framework
- Have meaningful test names (prefixed with test_)
- Include docstrings explaining what is tested
- Achieve minimum 80% code coverage for new code (see the coverage command below)
- Test both success and failure cases
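To check the 80% threshold locally, pytest-cov's --cov-fail-under flag can be used (adjust the --cov target to the package you are working on):
uv run pytest --cov=src --cov-fail-under=80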
Example test structure:
import pytest
from my_package import my_function
class TestMyFunction:
"""Tests for my_function."""
def test_returns_expected_value_for_valid_input(self):
"""Test that function returns expected value for valid input."""
result = my_function("valid_input")
assert result == "expected_output"
def test_raises_error_for_invalid_input(self):
"""Test that function raises ValueError for invalid input."""
with pytest.raises(ValueError, match="Invalid input"):
my_function("invalid_input")
@pytest.mark.parametrize("input,expected", [
("a", 1),
("b", 2),
("c", 3),
])
def test_handles_multiple_inputs(self, input, expected):
"""Test function with multiple input values."""
assert my_function(input) == expected
Test Organization
src/my-package/
├── my_package/
│ ├── __init__.py
│ └── module.py
└── tests/
├── __init__.py
├── test_module.py
└── conftest.py # Shared fixtures
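As a sketch of how conftest.py is typically used (the fixture and data here are illustrative, not part of the codebase):
# tests/conftest.py
import pandas as pd
import pytest

@pytest.fixture
def sample_dataframe():
    """Small DataFrame shared across tests in this package (illustrative)."""
    return pd.DataFrame({"feature1": [1, 2, 3], "target": [0, 1, 0]})
Any test in the package can then accept sample_dataframe as an argument without importing it.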
Fixtures can also be defined directly on a test class, as in this example from the mlflow_sklearn package:
import pandas as pd
import pytest
from mlflow_sklearn.s04_train import train_model
class TestTraining:
@pytest.fixture
def sample_data(self):
"""Create sample training data."""
return pd.DataFrame({
'feature1': [1, 2, 3],
'feature2': [4, 5, 6],
'target': [0, 1, 0]
})
def test_train_model_returns_model(self, sample_data):
"""Test that training returns a valid model."""
model = train_model(sample_data)
assert model is not None
assert hasattr(model, 'predict')
Pull Request Process
1. Update Your Branch
git fetch upstream
git rebase upstream/master
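If the rebase stops on conflicts, resolve them, stage the files, and continue; after rebasing a previously pushed branch you will need to force-push (the branch name is just an example):
git add <resolved-files>
git rebase --continue
git push --force-with-lease origin feature/my-new-feature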
2. Run Tests
# Run all tests
pytest
# Run linters
make lint
3. Create Pull Request
- Write a clear title and description
- Reference any related issues
- Add screenshots for UI changes
- Update documentation if needed
4. Code Review
- Address reviewer feedback
- Keep commits clean and atomic
- Squash commits before merge if requested (see the sketch below)
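A typical way to squash is an interactive rebase against the upstream branch; this is only a sketch:
git rebase -i upstream/master
# in the editor, keep the first commit as "pick" and mark the rest as "squash" or "fixup"
git push --force-with-lease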
Commit Messages
Follow conventional commits:
type(scope): description
[optional body]
[optional footer]
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- style: Code style changes
- refactor: Code refactoring
- test: Adding tests
- chore: Maintenance tasks
Examples:
feat(sklearn): add hyperparameter optimization
fix(tf): resolve memory leak in training loop
docs(readme): update installation instructions
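A message with a body and footer might look like this (the details and issue number are purely illustrative):
fix(tf): resolve memory leak in training loop

Release dataset iterators at the end of each epoch so GPU memory
stays flat across long training runs.

Closes #123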
Adding a New Pipeline
1. Create Package Structure
mkdir -p src/mlflow-newframework/mlflow_newframework/{pipeline,utils}
touch src/mlflow-newframework/mlflow_newframework/__init__.py
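The new package will also need a tests directory (see Test Organization above) and, most likely, its own pyproject.toml; copying the layout of an existing package such as src/doe-library is a reasonable starting point:
mkdir -p src/mlflow-newframework/tests
touch src/mlflow-newframework/tests/__init__.py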
2. Implement Pipeline Classes
# mlflow_newframework/pipeline/base.py
from mlflow_tf.pipeline.base import Pipeline
class NewFrameworkPipeline(Pipeline):
def run(self):
# Implementation
pass
3. Add Tests
# tests/test_pipeline.py
from mlflow_newframework.pipeline.base import NewFrameworkPipeline

def test_pipeline_runs():
    config = {}  # minimal placeholder config for the example
    pipeline = NewFrameworkPipeline(config)
    result = pipeline.run()
    assert result is not None
4. Add Documentation
Create documentation in docs/content/users/newframework-pipeline.md.
Getting Help
- Questions: Open a discussion on Bitbucket
- Bugs: Create an issue with reproduction steps
- Features: Propose in discussions first