Security

This guide covers security best practices for the MLOps platform.

Overview

Security considerations:

  1. Access Control: Who can access what?
  2. Network Security: Protecting communication
  3. Data Security: Protecting sensitive data
  4. Secrets Management: Handling credentials

AWS IAM

Principle of Least Privilege

The pipeline IAM role has minimal permissions:

# S3 access - only to specific bucket
resource "aws_iam_role_policy" "mlops_s3_access" {
  name = "mlops-s3-access"
  role = aws_iam_role.mlops_pipeline.id  # pipeline role; resource name assumed

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ]
      Resource = [
        "arn:aws:s3:::064592191516-mlflow",
        "arn:aws:s3:::064592191516-mlflow/*"
      ]
    }]
  })
}

IAM Best Practices

  1. Use roles, not users, for EC2 instances
  2. Enable MFA for console access
  3. Rotate credentials regularly (see the sketch after this list)
  4. Audit with CloudTrail
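
Item 3 can be checked automatically. Below is a minimal boto3 sketch that flags access keys past a rotation threshold; it assumes credentials with iam:ListUsers and iam:ListAccessKeys, and the 90-day threshold is an example, not a platform setting:

import boto3
from datetime import datetime, timezone

MAX_AGE_DAYS = 90  # example threshold; match it to your rotation policy

iam = boto3.client("iam")

# Pagination omitted for brevity; use get_paginator("list_users") at scale
for user in iam.list_users()["Users"]:
    keys = iam.list_access_keys(UserName=user["UserName"])["AccessKeyMetadata"]
    for key in keys:
        age = (datetime.now(timezone.utc) - key["CreateDate"]).days
        if age > MAX_AGE_DAYS:
            print(f"{user['UserName']}: key {key['AccessKeyId']} is {age} days old")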

Network Security

Security Groups

Restrict inbound traffic:

resource "aws_security_group" "mlops_pipeline" {
  # SSH - restrict to your IP
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["YOUR_IP/32"]
  }

  # MLflow - internal only or VPN
  ingress {
    from_port   = 5000
    to_port     = 5000
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]  # VPC only
  }

  # Outbound - allow all
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

VPN Access

For production, use a VPN or a bastion host:

                    Internet
                        │
                        ▼
            ┌───────────────────┐
            │   Bastion Host    │
            │   (Public Subnet) │
            └─────────┬─────────┘
                      │ SSH
                      ▼
            ┌───────────────────┐
            │   Pipeline EC2    │
            │  (Private Subnet) │
            └───────────────────┘

SSH Key Management

# Generate a strong key at the path used below
ssh-keygen -t ed25519 -f ~/.ssh/mlops-pipeline-key -C "mlops-pipeline"

# Set proper permissions
chmod 600 ~/.ssh/mlops-pipeline-key

# Use SSH agent
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/mlops-pipeline-key

Data Security

S3 Bucket Security

# Enable versioning
resource "aws_s3_bucket_versioning" "mlflow" {
  bucket = aws_s3_bucket.mlflow_artifacts.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Enable encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "mlflow" {
  bucket = aws_s3_bucket.mlflow_artifacts.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Block public access
resource "aws_s3_bucket_public_access_block" "mlflow" {
  bucket = aws_s3_bucket.mlflow_artifacts.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

EBS Encryption

All EBS volumes are encrypted. The block below belongs inside an aws_launch_template resource:

block_device_mappings {
  device_name = "/dev/sda1"
  ebs {
    volume_size           = 100
    volume_type           = "gp3"
    encrypted             = true
    delete_on_termination = true
  }
}
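
As a hedged audit sketch, describe_volumes supports an "encrypted" filter, so any volume this prints violates the policy above:

import boto3

ec2 = boto3.client("ec2")

# Find volumes that are NOT encrypted; the result should be empty
resp = ec2.describe_volumes(Filters=[{"Name": "encrypted", "Values": ["false"]}])
for vol in resp["Volumes"]:
    print("Unencrypted volume:", vol["VolumeId"])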

Data at Rest

  • S3: Server-side encryption (AES-256)
  • EBS: Encrypted volumes
  • MLflow DB: SQLite file on an encrypted EBS volume

Data in Transit

  • HTTPS for all API calls
  • SSH for remote access
  • TLS for MLflow (when exposed; see the sketch below)
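
When MLflow is exposed, terminate TLS in front of the tracking server (for example with a reverse proxy) and point clients at the https endpoint; the hostname below is a placeholder:

import mlflow

# Placeholder hostname; substitute the TLS endpoint in front of your server
mlflow.set_tracking_uri("https://mlflow.example.internal")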

Secrets Management

AWS Secrets Manager

Store sensitive credentials:

# Create secret
aws secretsmanager create-secret \
    --name mlops/database-password \
    --secret-string "your-password"

# Retrieve in code
import boto3

def get_secret(secret_name):
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_name)
    return response['SecretString']
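
Usage, with the secret name from the CLI example above; if the secret holds JSON key/value pairs rather than a bare string, parse the result with json.loads:

# Example usage; the secret name matches the create-secret call above
db_password = get_secret("mlops/database-password")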

Environment Variables

Don’t hardcode secrets:

# Bad
API_KEY = "sk-1234567890"

# Good
import os
API_KEY = os.environ.get("API_KEY")
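
If you prefer failing fast over passing None around, index the environment directly (a stylistic variant, not a platform requirement):

# Stricter - crash at startup if the variable is missing
import os
API_KEY = os.environ["API_KEY"]  # raises KeyError when unset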

.gitignore

Ensure secrets aren’t committed:

# Secrets
.env
*.pem
credentials.json
secrets/

Audit and Compliance

CloudTrail

Enable CloudTrail for audit logs:

aws cloudtrail create-trail \
    --name mlops-audit \
    --s3-bucket-name 064592191516-audit-logs

# A newly created trail does not record events until logging is started
aws cloudtrail start-logging --name mlops-audit

Access Logging

Enable S3 access logging:

resource "aws_s3_bucket_logging" "mlflow" {
  bucket = aws_s3_bucket.mlflow_artifacts.id

  target_bucket = aws_s3_bucket.logs.id
  target_prefix = "s3-access-logs/"
}

Security Scanning

Scan for vulnerabilities:

# Scan Python dependencies
pip install safety
safety check -r requirements.txt

# Scan Docker images
trivy image mlflow-sklearn:latest

# Scan Terraform
tfsec deploy/aws/064592191516/us-east-1/01-infrastructure/

Checklist

Before Deployment

  • Review IAM permissions
  • Restrict security group rules
  • Enable encryption (S3, EBS)
  • Configure CloudTrail
  • Set up VPN/bastion if public

Regular Maintenance

  • Rotate SSH keys quarterly
  • Review CloudTrail logs weekly
  • Update dependencies monthly
  • Security scan before releases
  • Audit IAM permissions quarterly

Incident Response

  1. Detect: Monitor CloudWatch alerts
  2. Contain: Isolate affected resources
  3. Investigate: Review CloudTrail logs (sketch below)
  4. Remediate: Fix vulnerability
  5. Document: Update runbooks
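
For step 3, a minimal boto3 sketch pulling recent CloudTrail management events; the ConsoleLogin filter is only an example, so adjust it to the incident at hand:

import boto3
from datetime import datetime, timedelta, timezone

cloudtrail = boto3.client("cloudtrail")

# Example: console logins in the last 24 hours
events = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "ConsoleLogin"}],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=24),
)
for event in events["Events"]:
    print(event["EventTime"], event.get("Username", "?"), event["EventName"])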