Shared Libraries

The MLOps platform includes several shared libraries that provide common functionality across pipelines.

Overview

Library           Purpose
doe-library       Design of Experiments utilities
feature-library   Feature engineering functions
io-library        Input/Output utilities
metrics-library   Metrics computation
plot-library      Visualization utilities

doe-library

Design of Experiments library for hyperparameter optimization.

Installation

pip install -e src/doe-library

Usage

from doe_library import create_experiment_grid

# Create parameter grid
grid = create_experiment_grid(
    learning_rate=[0.001, 0.01, 0.1],
    batch_size=[16, 32, 64],
    epochs=[10, 20, 50]
)

# Run experiments
for params in grid:
    run_experiment(params)
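For readers who want the same grid behavior outside the library, the expansion is a Cartesian product over the keyword lists. The sketch below is a hypothetical stand-in (`make_param_grid` is not the doe-library implementation) built on the standard library:

```python
from itertools import product

def make_param_grid(**param_lists):
    """Expand keyword lists into a list of parameter dicts (Cartesian product).

    Illustrative stand-in for create_experiment_grid, not the library code.
    """
    keys = list(param_lists)
    return [dict(zip(keys, values)) for values in product(*param_lists.values())]

grid = make_param_grid(
    learning_rate=[0.001, 0.01, 0.1],
    batch_size=[16, 32, 64],
)
print(len(grid))  # 3 x 3 = 9 combinations
```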

feature-library

Feature engineering and data splitting utilities.

Installation

pip install -e src/feature-library

Usage

from feature_library.get_splits import get_all_splits

# Split data into train/test/validate
(x_train, y_train), (x_test, y_test), (x_val, y_val) = get_all_splits(data)

Functions

Function               Description
get_all_splits()       Split data into train/test/validate
create_features()      Generate engineered features
encode_categorical()   Encode categorical variables
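The three-way split can be illustrated with a minimal sketch. This is not feature-library's implementation; `split_three_ways` and its fraction defaults are assumptions for illustration:

```python
import random

def split_three_ways(x, y, test_frac=0.2, val_frac=0.1, seed=0):
    """Shuffle indices, then carve off test and validate slices; the rest is train.

    Hypothetical sketch of get_all_splits-style behavior, not the library code.
    """
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(x) * test_frac)
    n_val = int(len(x) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]

    def pick(seq, ids):
        return [seq[i] for i in ids]

    return ((pick(x, train_idx), pick(y, train_idx)),
            (pick(x, test_idx), pick(y, test_idx)),
            (pick(x, val_idx), pick(y, val_idx)))
```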

io-library

Input/Output utilities for S3 and local filesystem.

Installation

pip install -e src/io-library

Usage

from io_library.read_from_s3 import read_csv_from_s3, read_parquet_from_s3
from io_library.write_to_s3 import write_csv_to_s3, write_parquet_to_s3

# Read from S3
df = read_csv_from_s3("my-bucket", "path/to/file.csv")

# Write to S3
write_parquet_to_s3(df, "my-bucket", "path/to/output.parquet")

Classes

ConfigDataClass

Configuration management with JSON serialization:

from io_library.ConfigDataClass import Config

# Create config
config = Config(
    n_splits=5,
    learning_rate=0.001
)

# Save to JSON
config.to_json("config.json")

# Load from S3
config = Config.from_json_s3("bucket", "config.json")
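The JSON round-trip pattern behind ConfigDataClass can be sketched with the standard library's dataclasses module. `SimpleConfig` below is a hypothetical stand-in (it writes to the local filesystem only, with no S3 support):

```python
import json
import tempfile
from dataclasses import dataclass, asdict

@dataclass
class SimpleConfig:
    """Minimal sketch of dataclass-based config with JSON serialization."""
    n_splits: int = 5
    learning_rate: float = 0.001

    def to_json(self, path):
        # Serialize all dataclass fields to a JSON file
        with open(path, "w") as f:
            json.dump(asdict(self), f, indent=2)

    @classmethod
    def from_json(cls, path):
        # Rebuild the dataclass from the stored field values
        with open(path) as f:
            return cls(**json.load(f))

# Round-trip demo
cfg = SimpleConfig(n_splits=5, learning_rate=0.001)
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    path = f.name
cfg.to_json(path)
loaded = SimpleConfig.from_json(path)
```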

metrics-library

Metrics computation for model evaluation.

Installation

pip install -e src/metrics-library

Usage

from metrics_library import compute_classification_metrics

metrics = compute_classification_metrics(y_true, y_pred)
# Returns: accuracy, precision, recall, f1_score

Functions

Function                           Description
compute_classification_metrics()   Classification metrics
compute_regression_metrics()       Regression metrics (MSE, MAE, R²)
compute_confusion_matrix()         Confusion matrix
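To make the four returned values concrete, here is how they are defined for binary labels. `classification_metrics` is an illustrative from-scratch sketch, not the metrics-library implementation:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (0/1).

    Illustrative only; metrics-library's compute_classification_metrics
    may differ in signature and handling of edge cases.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1_score": f1}
```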

plot-library

Visualization utilities for model analysis.

Installation

pip install -e src/plot-library

Usage

from plot_library import plot_confusion_matrix, plot_roc_curve

# Plot confusion matrix
fig = plot_confusion_matrix(y_true, y_pred, classes=['Normal', 'Fraud'])
fig.savefig('confusion_matrix.png')

# Plot ROC curve
fig = plot_roc_curve(y_true, y_scores)
fig.savefig('roc_curve.png')
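A confusion-matrix plot of the kind returned above can be sketched directly with matplotlib. `confusion_matrix_figure` is a hypothetical stand-in, not plot-library's code, and assumes integer class labels indexing into `classes`:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

def confusion_matrix_figure(y_true, y_pred, classes):
    """Render a confusion matrix as an annotated heatmap; illustrative sketch."""
    n = len(classes)
    cm = np.zeros((n, n), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1  # rows: true class, columns: predicted class
    fig, ax = plt.subplots()
    ax.imshow(cm, cmap="Blues")
    ax.set_xticks(range(n), classes)
    ax.set_yticks(range(n), classes)
    for i in range(n):
        for j in range(n):
            ax.text(j, i, cm[i, j], ha="center", va="center")
    ax.set_xlabel("Predicted")
    ax.set_ylabel("True")
    return fig

fig = confusion_matrix_figure([0, 1, 1], [0, 1, 0], ["Normal", "Fraud"])
fig.savefig("confusion_matrix_sketch.png")
```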

Available Plots

  • Confusion matrix
  • ROC curve
  • Precision-Recall curve
  • Feature importance
  • Learning curves
  • Distribution plots

Extending Libraries

Adding New Functions

  1. Create a new module in the library:

     # src/metrics-library/metrics_library/new_metric.py
     def compute_custom_metric(y_true, y_pred):
         """Compute custom metric."""
         return custom_value

  2. Export it in __init__.py:

     from .new_metric import compute_custom_metric

  3. Add tests:

     # tests/test_new_metric.py
     def test_custom_metric():
         result = compute_custom_metric([0, 1], [0, 1])
         assert result == expected_value

  4. Update documentation.
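Putting the steps above together, a worked example might look like the following. The metric (balanced accuracy) and all names are illustrative; they are not part of metrics-library:

```python
# Hypothetical new module: metrics_library/balanced_accuracy.py
def compute_balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary labels (0/1); example metric only."""
    recalls = []
    for cls in (0, 1):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total if total else 0.0)
    return sum(recalls) / len(recalls)

# Matching test (would live in tests/test_balanced_accuracy.py)
def test_balanced_accuracy():
    # class 0 recall = 1.0, class 1 recall = 0.5 -> mean 0.75
    assert compute_balanced_accuracy([0, 0, 1, 1], [0, 0, 1, 0]) == 0.75

test_balanced_accuracy()
```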