# Security

This guide covers security best practices for the MLOps platform.

## Overview

Security considerations:
- Access Control: Who can access what?
- Network Security: Protecting communication
- Data Security: Protecting sensitive data
- Secrets Management: Handling credentials
## AWS IAM

### Principle of Least Privilege
The pipeline IAM role has minimal permissions:
```hcl
# S3 access - only to the MLflow artifact bucket
resource "aws_iam_role_policy" "mlops_s3_access" {
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ]
      Resource = [
        "arn:aws:s3:::064592191516-mlflow",
        "arn:aws:s3:::064592191516-mlflow/*"
      ]
    }]
  })
}
```
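To confirm the role is as narrow as intended, you can dry-run it against the IAM policy simulator. A minimal sketch using boto3's `simulate_principal_policy`; the role ARN below is an assumption, so substitute the actual pipeline role:

```python
import boto3

# Hypothetical ARN - replace with the real pipeline role in this account.
ROLE_ARN = "arn:aws:iam::064592191516:role/mlops-pipeline"

iam = boto3.client("iam")

# Evaluate a few representative S3 calls against the role's attached policies.
results = iam.simulate_principal_policy(
    PolicySourceArn=ROLE_ARN,
    ActionNames=["s3:GetObject", "s3:PutObject"],
    ResourceArns=["arn:aws:s3:::064592191516-mlflow/*"],
)

for result in results["EvaluationResults"]:
    # EvalDecision is "allowed", "explicitDeny", or "implicitDeny".
    print(f'{result["EvalActionName"]}: {result["EvalDecision"]}')
```

Anything outside the listed actions or bucket should come back as `implicitDeny`.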
### IAM Best Practices

- Use IAM roles, not IAM users, for EC2 instances
- Enable MFA for console access
- Rotate credentials regularly (see the key-age check below)
- Audit API activity with CloudTrail
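Rotation is easy to forget, so it helps to automate the check. A sketch, assuming credentials with `iam:ListUsers` and `iam:ListAccessKeys` and an assumed 90-day rotation policy, that flags stale access keys:

```python
from datetime import datetime, timezone

import boto3

MAX_KEY_AGE_DAYS = 90  # Assumed rotation policy; adjust to your own.

iam = boto3.client("iam")
now = datetime.now(timezone.utc)

# Flag any active access key older than the rotation window.
for user in iam.list_users()["Users"]:
    keys = iam.list_access_keys(UserName=user["UserName"])["AccessKeyMetadata"]
    for key in keys:
        age = (now - key["CreateDate"]).days
        if key["Status"] == "Active" and age > MAX_KEY_AGE_DAYS:
            print(f'{user["UserName"]}: key {key["AccessKeyId"]} is {age} days old')
```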
## Network Security

### Security Groups
Restrict inbound traffic:
resource "aws_security_group" "mlops_pipeline" {
# SSH - restrict to your IP
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["YOUR_IP/32"]
}
# MLflow - internal only or VPN
ingress {
from_port = 5000
to_port = 5000
protocol = "tcp"
cidr_blocks = ["10.0.0.0/8"] # VPC only
}
# Outbound - allow all
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
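Security group rules tend to drift over time. A read-only audit sketch using boto3's `describe_security_groups` that flags any ingress rule open to the entire internet:

```python
import boto3

ec2 = boto3.client("ec2")

# Flag ingress rules that allow traffic from anywhere.
for sg in ec2.describe_security_groups()["SecurityGroups"]:
    for rule in sg["IpPermissions"]:
        for ip_range in rule.get("IpRanges", []):
            if ip_range["CidrIp"] == "0.0.0.0/0":
                port = rule.get("FromPort", "all")
                print(f'{sg["GroupId"]} ({sg["GroupName"]}): port {port} open to 0.0.0.0/0')
```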
### VPN Access

For production, use a VPN or a bastion host:
```
      Internet
          │
          ▼
┌───────────────────┐
│   Bastion Host    │
│  (Public Subnet)  │
└─────────┬─────────┘
          │ SSH
          ▼
┌───────────────────┐
│   Pipeline EC2    │
│ (Private Subnet)  │
└───────────────────┘
```
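With the pipeline host in a private subnet, it should never carry a public IP. A quick verification sketch, assuming the instance is tagged `Name=mlops-pipeline` (adjust the tag to whatever you actually use):

```python
import boto3

ec2 = boto3.client("ec2")

# Assumed tag filter - change to match how the pipeline instance is tagged.
response = ec2.describe_instances(
    Filters=[{"Name": "tag:Name", "Values": ["mlops-pipeline"]}]
)

for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        if "PublicIpAddress" in instance:
            print(f'{instance["InstanceId"]} has a public IP: {instance["PublicIpAddress"]}')
        else:
            print(f'{instance["InstanceId"]} is private-only')
```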
### SSH Key Management

```bash
# Generate a strong key
ssh-keygen -t ed25519 -f ~/.ssh/mlops-pipeline-key -C "mlops-pipeline"

# Set proper permissions
chmod 600 ~/.ssh/mlops-pipeline-key

# Use the SSH agent
eval $(ssh-agent -s)
ssh-add ~/.ssh/mlops-pipeline-key
```
## Data Security

### S3 Bucket Security
```hcl
# Enable versioning
resource "aws_s3_bucket_versioning" "mlflow" {
  bucket = aws_s3_bucket.mlflow_artifacts.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Enable encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "mlflow" {
  bucket = aws_s3_bucket.mlflow_artifacts.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Block public access
resource "aws_s3_bucket_public_access_block" "mlflow" {
  bucket = aws_s3_bucket.mlflow_artifacts.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```
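After applying the Terraform, it is worth verifying the settings actually took effect. A read-only sketch using boto3's `get_bucket_encryption` and `get_public_access_block` against the bucket used above:

```python
import boto3
from botocore.exceptions import ClientError

BUCKET = "064592191516-mlflow"

s3 = boto3.client("s3")

# Check default server-side encryption.
try:
    rules = s3.get_bucket_encryption(Bucket=BUCKET)[
        "ServerSideEncryptionConfiguration"
    ]["Rules"]
    algo = rules[0]["ApplyServerSideEncryptionByDefault"]["SSEAlgorithm"]
    print(f"Default encryption: {algo}")
except ClientError:
    print("WARNING: no default encryption configured")

# Check that all four public-access blocks are enabled.
block = s3.get_public_access_block(Bucket=BUCKET)["PublicAccessBlockConfiguration"]
if not all(block.values()):
    print(f"WARNING: public access block incomplete: {block}")
```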
### EBS Encryption
All EBS volumes are encrypted:
```hcl
block_device_mappings {
  device_name = "/dev/sda1"

  ebs {
    volume_size           = 100
    volume_type           = "gp3"
    encrypted             = true
    delete_on_termination = true
  }
}
```
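To catch any volume created outside Terraform, you can also sweep the account for unencrypted volumes. A minimal sketch using the `encrypted` filter of `describe_volumes`:

```python
import boto3

ec2 = boto3.client("ec2")

# List any EBS volumes that were created without encryption.
volumes = ec2.describe_volumes(
    Filters=[{"Name": "encrypted", "Values": ["false"]}]
)["Volumes"]

for volume in volumes:
    print(f'Unencrypted volume: {volume["VolumeId"]} ({volume["Size"]} GiB)')
```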
### Data at Rest

- S3: Server-side encryption (AES-256)
- EBS: Encrypted volumes
- MLflow DB: SQLite stored on an encrypted EBS volume
### Data in Transit
- HTTPS for all API calls
- SSH for remote access
- TLS for MLflow (when exposed)
## Secrets Management

### AWS Secrets Manager

Store sensitive credentials in Secrets Manager rather than in code or config files:
```bash
# Create the secret
aws secretsmanager create-secret \
    --name mlops/database-password \
    --secret-string "your-password"
```

Retrieve it in code:

```python
import boto3

def get_secret(secret_name: str) -> str:
    """Return the secret string stored under the given name."""
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_name)
    return response["SecretString"]
```
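Secrets Manager calls add latency and cost, so if the same secret is read repeatedly it can be worth memoizing the helper. One simple option (a common pattern, not part of the original setup) is `functools.lru_cache`:

```python
from functools import lru_cache

import boto3

@lru_cache(maxsize=None)
def get_secret_cached(secret_name: str) -> str:
    """Fetch a secret once per process and reuse the value afterwards."""
    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_name)["SecretString"]
```

The trade-off is that a rotated secret is only picked up after the process restarts or the cache is cleared with `get_secret_cached.cache_clear()`.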
### Environment Variables

Don't hardcode secrets:

```python
# Bad
API_KEY = "sk-1234567890"

# Good
import os

API_KEY = os.environ.get("API_KEY")
```
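For local development it is common to keep those variables in a `.env` file (which the `.gitignore` below excludes from version control) and load them at startup. A sketch assuming the third-party `python-dotenv` package is installed:

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

# Read KEY=value pairs from a local .env file into the process environment.
load_dotenv()

API_KEY = os.environ.get("API_KEY")
```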
### .gitignore

Ensure secrets aren't committed:

```gitignore
# Secrets
.env
*.pem
credentials.json
secrets/
```
## Audit and Compliance

### CloudTrail
Enable CloudTrail for audit logs:
```bash
aws cloudtrail create-trail \
    --name mlops-audit \
    --s3-bucket-name 064592191516-audit-logs

# Trails do not record events until logging is started
aws cloudtrail start-logging --name mlops-audit
```
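For the weekly log review in the checklist below, the CloudTrail API can be queried directly instead of digging through the S3 bucket. A sketch using boto3's `lookup_events` to list recent console logins:

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Pull the most recent console logins recorded by CloudTrail.
events = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "ConsoleLogin"}],
    MaxResults=10,
)["Events"]

for event in events:
    print(f'{event["EventTime"]} {event.get("Username", "unknown")}')
```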
### Access Logging
Enable S3 access logging:
resource "aws_s3_bucket_logging" "mlflow" {
bucket = aws_s3_bucket.mlflow_artifacts.id
target_bucket = aws_s3_bucket.logs.id
target_prefix = "s3-access-logs/"
}
### Security Scanning
Scan for vulnerabilities:
```bash
# Scan Python dependencies
pip install safety
safety check -r requirements.txt

# Scan Docker images
trivy image mlflow-sklearn:latest

# Scan Terraform
tfsec deploy/aws/064592191516/us-east-1/01-infrastructure/
```
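If these scans run in CI, a thin wrapper that fails the build when any scanner reports a problem keeps the gate enforceable. A sketch using `subprocess` around the same commands shown above (paths and image name as above; adjust if yours differ):

```python
import subprocess
import sys

# The same scanners invoked above; a non-zero exit means findings or a tool error.
SCANS = [
    ["safety", "check", "-r", "requirements.txt"],
    ["trivy", "image", "mlflow-sklearn:latest"],
    ["tfsec", "deploy/aws/064592191516/us-east-1/01-infrastructure/"],
]

failed = False
for command in SCANS:
    print(f"Running: {' '.join(command)}")
    if subprocess.run(command).returncode != 0:
        failed = True

# Fail the CI job if any scan did not pass.
sys.exit(1 if failed else 0)
```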
## Checklist

### Before Deployment
- Review IAM permissions
- Restrict security group rules
- Enable encryption (S3, EBS)
- Configure CloudTrail
- Set up VPN/bastion if public
### Regular Maintenance
- Rotate SSH keys quarterly
- Review CloudTrail logs weekly
- Update dependencies monthly
- Security scan before releases
- Audit IAM permissions quarterly
### Incident Response

1. Detect: Monitor CloudWatch alerts
2. Contain: Isolate affected resources (see the quarantine sketch below)
3. Investigate: Review CloudTrail logs
4. Remediate: Fix the vulnerability
5. Document: Update runbooks
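For the containment step, one common tactic is to swap a compromised instance onto a deny-all "quarantine" security group so it can be investigated without further network access. A sketch, assuming such a group already exists; both IDs below are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Placeholder IDs - substitute the affected instance and your quarantine SG.
INSTANCE_ID = "i-0123456789abcdef0"
QUARANTINE_SG = "sg-0123456789abcdef0"

# Replace all security groups on the instance with the deny-all quarantine group.
ec2.modify_instance_attribute(InstanceId=INSTANCE_ID, Groups=[QUARANTINE_SG])
```

Taking an EBS snapshot of the instance's volumes before any remediation preserves evidence for the investigation step.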