Cost Optimization Guide for Bedrock Budgeteer

Overview

This guide describes the cost optimization strategies implemented in the production environment to minimize AWS costs while maintaining performance and reliability.

Production Architecture Cost Strategy

Focus

Predictable costs with auto-scaling performance for enterprise workloads

DynamoDB Optimization

  • Billing Mode: PROVISIONED (predictable baseline costs)
  • Auto-Scaling: Enabled with table-specific configurations
  • Encryption: AWS-managed encryption (SSE) by default, optional customer-managed KMS
  • Point-in-Time Recovery: Enabled (data protection)
  • Log Retention: 6 months (compliance and troubleshooting)

Auto-Scaling Configuration

# Table-specific scaling limits
scaling_configs = {
    "user_budgets": {
        "min_read": 5, "max_read": 40,
        "min_write": 5, "max_write": 40
    },
    "usage_tracking": {
        "min_read": 10, "max_read": 100,
        "min_write": 20, "max_write": 200  # Higher for usage events
    },
    "budget_alerts": {
        "min_read": 3, "max_read": 25,
        "min_write": 3, "max_write": 25
    },
    "audit_logs": {
        "min_read": 10, "max_read": 100,
        "min_write": 25, "max_write": 250  # Higher for audit events
    }
}
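A quick sanity check on limits like these can catch inverted or non-positive bounds before deployment. The sketch below is illustrative only: `validate_scaling_configs` and `total_capacity` are hypothetical helpers, not part of the project, and the dictionary simply repeats the values above so the example runs on its own.

```python
from typing import Dict, List

# Same limits as listed above, repeated so the example is self-contained.
scaling_configs: Dict[str, Dict[str, int]] = {
    "user_budgets": {"min_read": 5, "max_read": 40, "min_write": 5, "max_write": 40},
    "usage_tracking": {"min_read": 10, "max_read": 100, "min_write": 20, "max_write": 200},
    "budget_alerts": {"min_read": 3, "max_read": 25, "min_write": 3, "max_write": 25},
    "audit_logs": {"min_read": 10, "max_read": 100, "min_write": 25, "max_write": 250},
}


def validate_scaling_configs(configs: Dict[str, Dict[str, int]]) -> List[str]:
    """Return a list of problems; an empty list means the limits are consistent."""
    problems: List[str] = []
    for table, limits in configs.items():
        for kind in ("read", "write"):
            low, high = limits[f"min_{kind}"], limits[f"max_{kind}"]
            if low < 1:
                problems.append(f"{table}: min_{kind} must be at least 1")
            if low > high:
                problems.append(f"{table}: min_{kind} ({low}) exceeds max_{kind} ({high})")
    return problems


def total_capacity(configs: Dict[str, Dict[str, int]], bound: str, kind: str) -> int:
    """Sum one bound ("min" or "max") of one kind ("read" or "write") across all tables."""
    return sum(limits[f"{bound}_{kind}"] for limits in configs.values())


print(validate_scaling_configs(scaling_configs))         # []
print(total_capacity(scaling_configs, "min", "read"))    # 28
print(total_capacity(scaling_configs, "max", "write"))   # 515
```

The summed minimums give the always-on baseline across the four tables, and the summed maximums give the ceiling that auto-scaling can reach. Global secondary indexes scale separately and are not included in these totals.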

Auto-Scaling Strategy

Performance Targets

  • Target Utilization: 70% (optimal cost/performance balance)
  • Scale-Out Cooldown: 60 seconds (quick response to load)
  • Scale-In Cooldown: 60 seconds (prevent thrashing)
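To make the 70% target concrete, the following sketch models the steady-state arithmetic behind target tracking: provision enough capacity that consumption sits at the target, then clamp to the table's limits. It is a simplified model under stated assumptions, not the actual Application Auto Scaling algorithm, which also depends on CloudWatch alarms and the cooldowns above. The function name `desired_capacity` is hypothetical.

```python
import math


def desired_capacity(
    consumed_units: float,
    min_capacity: int,
    max_capacity: int,
    target_utilization_percent: int = 70,
) -> int:
    """Capacity at which `consumed_units` equals the target utilization, clamped to the limits."""
    if not 0 < target_utilization_percent <= 100:
        raise ValueError("target_utilization_percent must be in (0, 100]")
    # Work in percent to avoid floating-point drift from dividing by 0.70.
    needed = math.ceil(consumed_units * 100 / target_utilization_percent)
    return max(min_capacity, min(max_capacity, needed))


# user_budgets limits from this guide: min 5, max 40.
print(desired_capacity(14, 5, 40))   # 20 (14 consumed units is 70% of 20)
print(desired_capacity(1, 5, 40))    # 5  (the floor keeps the baseline)
print(desired_capacity(100, 5, 40))  # 40 (the ceiling caps cost)
```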

Scaling Behavior

  • Read Capacity: Scales based on consumed read capacity units
  • Write Capacity: Scales based on consumed write capacity units
  • Global Secondary Indexes: Auto-scaled independently

Cost Benefits

  • Baseline Costs: Predictable minimum capacity for budgeting
  • Peak Handling: Automatic scaling for traffic spikes
  • Cost Control: Maximum limits prevent runaway costs
  • Efficiency: 70% target utilization optimizes cost per operation

Cost Monitoring and Alerts

System Budget

Production Environment: $5000/month
  - No automatic cleanup (compliance requirement)
  - Alert at 80% ($4000)
  - Critical alert at 95% ($4750)
  - Emergency controls at 100% ($5000)
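The threshold ladder above can be expressed as a small classification function. This is a minimal sketch; the level names (`"ok"`, `"alert"`, `"critical"`, `"emergency"`) and the function name are assumptions made for illustration, not identifiers from the project.

```python
from typing import List, Tuple

MONTHLY_BUDGET_USD = 5000

# (percent of budget, level), checked from highest to lowest.
THRESHOLDS: List[Tuple[int, str]] = [
    (100, "emergency"),
    (95, "critical"),
    (80, "alert"),
]


def budget_level(spend_usd: float, budget_usd: float = MONTHLY_BUDGET_USD) -> str:
    """Return the highest threshold level that the current spend has reached."""
    for percent, level in THRESHOLDS:
        if spend_usd >= budget_usd * percent / 100:
            return level
    return "ok"


print(budget_level(3999.99))  # ok
print(budget_level(4000))     # alert
print(budget_level(4750))     # critical
print(budget_level(5000))     # emergency
```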

Cost Allocation Tags

All resources tagged for cost tracking:

  • Environment: production
  • Component: data-storage, security, monitoring, event-ingestion, workflow-orchestration
  • CostCenter: engineering-ops
  • BillingProject: bedrock-budget-control
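A tagging policy like this is easier to keep consistent when it is checked automatically. The sketch below validates a tag dictionary against the keys and values listed above; `tag_problems` is a hypothetical helper, and treating every other value as invalid is an assumption of the example.

```python
from typing import Dict, List

# Expected fixed values, taken from the list above.
FIXED_TAGS: Dict[str, str] = {
    "Environment": "production",
    "CostCenter": "engineering-ops",
    "BillingProject": "bedrock-budget-control",
}

ALLOWED_COMPONENTS = {
    "data-storage",
    "security",
    "monitoring",
    "event-ingestion",
    "workflow-orchestration",
}


def tag_problems(tags: Dict[str, str]) -> List[str]:
    """Return a list of missing or unexpected cost allocation tags."""
    problems: List[str] = []
    for key, expected in FIXED_TAGS.items():
        if key not in tags:
            problems.append(f"missing tag: {key}")
        elif tags[key] != expected:
            problems.append(f"unexpected value for {key}: {tags[key]!r}")
    component = tags.get("Component")
    if component is None:
        problems.append("missing tag: Component")
    elif component not in ALLOWED_COMPONENTS:
        problems.append(f"unexpected value for Component: {component!r}")
    return problems


good = dict(FIXED_TAGS, Component="data-storage")
print(tag_problems(good))                          # []
print(tag_problems({"Environment": "production"}))
```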

Implementation Details

DynamoDB Cost Optimization

from aws_cdk import aws_dynamodb as dynamodb

# Production billing mode
def _get_billing_mode(self) -> dynamodb.BillingMode:
    # Production always uses provisioned capacity; auto-scaling is attached separately
    return dynamodb.BillingMode.PROVISIONED

# Auto-scaling always enabled for production
self._add_auto_scaling(table, table_type)

Point-in-Time Recovery

def _get_point_in_time_recovery(self) -> bool:
    # Production environment always enables PITR for data protection
    return True

Encryption Configuration

from typing import Any, Dict

def _get_encryption_config(self) -> Dict[str, Any]:
    """Return table encryption settings, preferring a customer-managed KMS key when one is set."""
    if self.kms_key:
        # Use customer-managed KMS key if provided
        return {
            "encryption": dynamodb.TableEncryption.CUSTOMER_MANAGED,
            "encryption_key": self.kms_key
        }
    else:
        # Default to AWS-managed encryption (SSE)
        return {
            "encryption": dynamodb.TableEncryption.AWS_MANAGED
        }

Cost Optimization Best Practices

1. Right-Sizing Resources

  • Production: Provisioned capacity with auto-scaling for predictable workloads
  • Baseline: Conservative minimum capacity to handle normal load
  • Scaling: Automatic adjustment for traffic spikes and valleys

2. Data Lifecycle Management

  • Log Retention: 6-month retention for compliance and troubleshooting
  • Backup Strategy: PITR enabled for all tables
  • Archive Strategy: S3 lifecycle policies for long-term storage

3. Feature Configuration

  • Encryption: AWS-managed encryption by default, optional customer-managed KMS keys
  • Monitoring: Comprehensive monitoring for production reliability
  • Backup: Multi-layer backup strategy with PITR and automated snapshots

4. Resource Tagging

  • Cost Allocation: Clear attribution of costs to components and projects
  • Lifecycle Management: Automated management based on tags
  • Budget Tracking: Component-level cost monitoring

S3 Cost Optimization

Lifecycle Management

Active Logs (0-30 days): Standard Storage
Archive Transition (30-90 days): Infrequent Access
Cold Archive (90-365 days): Glacier
Deep Archive (365+ days): Deep Archive
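The tiers above map onto an S3 lifecycle rule with three transitions. The sketch below builds the configuration as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts, without calling AWS. The rule ID and the empty prefix are placeholders, and the age boundaries assume that an object moves to the next tier on the day it reaches the stated age.

```python
from typing import Any, Dict, List, Tuple

# (minimum age in days, S3 storage class), from the tiers above.
TRANSITIONS: List[Tuple[int, str]] = [
    (30, "STANDARD_IA"),
    (90, "GLACIER"),
    (365, "DEEP_ARCHIVE"),
]


def build_lifecycle_configuration(prefix: str = "") -> Dict[str, Any]:
    """Build a lifecycle configuration with one rule covering all three transitions."""
    return {
        "Rules": [
            {
                "ID": "log-archive-tiers",  # placeholder rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                    for days, storage_class in TRANSITIONS
                ],
            }
        ]
    }


def storage_class_for_age(age_days: int) -> str:
    """Return the storage class an object of the given age is expected to be in."""
    current = "STANDARD"
    for days, storage_class in TRANSITIONS:
        if age_days >= days:
            current = storage_class
    return current


print(storage_class_for_age(10))   # STANDARD
print(storage_class_for_age(120))  # GLACIER
```

The resulting dictionary could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration` together with a bucket name; the bucket itself is outside the scope of this sketch.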

Storage Classes

  • Standard: Active logs and frequently accessed data
  • Infrequent Access: Monthly reports and historical data
  • Glacier: Long-term compliance archives
  • Deep Archive: Regulatory compliance data (7+ years)

Monitoring Cost Optimization

Key Metrics

  • DynamoDB Consumed Capacity: Track utilization vs provisioned
  • Auto-Scaling Events: Monitor scaling frequency and triggers
  • S3 Storage Costs: Track storage class transitions and access patterns
  • Lambda Duration: Monitor function efficiency and cold starts

Optimization Opportunities

  • Unused Provisioned Capacity: Reduce baseline where consistently under-utilized
  • Inefficient Query Patterns: Optimize access patterns to reduce RCU/WCU
  • Over-Provisioned Auto-Scaling: Adjust max limits based on actual usage
  • S3 Lifecycle Policies: Optimize transition timing based on access patterns
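As an illustration of the first opportunity, the sketch below flags tables whose average utilization stays well under the 70% target. The 35% cut-off and the sample figures are made up for the example, and `underutilized_tables` is a hypothetical helper; in practice the inputs would come from the consumed and provisioned capacity metrics in CloudWatch.

```python
from typing import Dict, List, Tuple


def utilization_percent(consumed_units: float, provisioned_units: float) -> float:
    """Consumed capacity as a percentage of provisioned capacity."""
    if provisioned_units <= 0:
        raise ValueError("provisioned_units must be positive")
    return consumed_units * 100 / provisioned_units


def underutilized_tables(
    samples: Dict[str, Tuple[float, float]],
    threshold_percent: float = 35.0,  # assumed cut-off: half of the 70% target
) -> List[str]:
    """Return tables whose (average consumed, provisioned) pair is below the threshold."""
    return sorted(
        table
        for table, (consumed, provisioned) in samples.items()
        if utilization_percent(consumed, provisioned) < threshold_percent
    )


# Hypothetical weekly averages: (consumed units, provisioned units).
weekly = {
    "user_budgets": (1.0, 5),     # 20% utilized
    "usage_tracking": (14.0, 20), # 70% utilized
    "audit_logs": (5.0, 25),      # 20% utilized
}
print(underutilized_tables(weekly))  # ['audit_logs', 'user_budgets']
```

A table that appears in this list week after week is a candidate for a lower minimum capacity.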

Cost Savings Strategies

Auto-Scaling Benefits

  • Baseline Optimization: 20-40% savings vs peak-provisioned capacity
  • Optimal Utilization: 70% target maximizes cost efficiency
  • Burst Handling: Automatic scaling prevents over-provisioning
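The savings from auto-scaling depend on how far the typical load sits below the peak. The worked example below compares capacity-hours under idealized auto-scaling with capacity-hours when provisioning for the peak all day. The load profile is invented for illustration, scaling lag and cooldowns are ignored, and both function names are hypothetical, so the result shows the arithmetic rather than a measured saving.

```python
import math
from typing import Sequence


def provisioned_for(consumed: float, min_capacity: int, max_capacity: int,
                    target_percent: int = 70) -> int:
    """Capacity that puts `consumed` at the target utilization, clamped to the limits."""
    needed = math.ceil(consumed * 100 / target_percent)
    return max(min_capacity, min(max_capacity, needed))


def savings_vs_peak_percent(hourly_consumed: Sequence[float], min_capacity: int,
                            max_capacity: int, target_percent: int = 70) -> float:
    """Percentage of capacity-hours saved versus holding peak capacity for every hour."""
    autoscaled = sum(
        provisioned_for(c, min_capacity, max_capacity, target_percent)
        for c in hourly_consumed
    )
    peak = provisioned_for(max(hourly_consumed), min_capacity, max_capacity,
                           target_percent) * len(hourly_consumed)
    return (peak - autoscaled) * 100 / peak


# Invented daily profile: 18 quiet hours at 14 units, 6 busy hours at 28 units.
profile = [14] * 18 + [28] * 6
print(savings_vs_peak_percent(profile, min_capacity=5, max_capacity=100))  # 37.5
```

Here auto-scaling uses 18 × 20 + 6 × 40 = 600 capacity-hours against 24 × 40 = 960 for peak provisioning, a 37.5% reduction for this particular profile. A flatter profile yields less, a spikier one more.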

Storage Optimization

  • Lifecycle Policies: 50-80% savings on long-term storage costs
  • Compression: GZIP compression reduces storage and transfer costs
  • Data Partitioning: Optimized partition keys reduce query costs
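The effect of GZIP on structured logs is easy to check locally with the standard library. The sketch below compresses a batch of JSON log lines and reports the sizes; the record fields are invented for the example, and the ratio achieved on real logs depends on how repetitive they are.

```python
import gzip
import json
from typing import Iterable


def compress_log_lines(lines: Iterable[str]) -> bytes:
    """Join log lines with newlines and GZIP-compress the UTF-8 bytes."""
    raw = "\n".join(lines).encode("utf-8")
    return gzip.compress(raw)


def decompress_log_lines(blob: bytes) -> list:
    """Inverse of compress_log_lines: return the original list of lines."""
    return gzip.decompress(blob).decode("utf-8").split("\n")


# Invented sample records with a repetitive structure, as is typical for logs.
records = [
    json.dumps({"event": "usage", "user": f"user-{i % 20}", "tokens": 100 + i})
    for i in range(500)
]
blob = compress_log_lines(records)
raw_size = len("\n".join(records).encode("utf-8"))

print(f"raw: {raw_size} bytes, gzip: {len(blob)} bytes")
assert decompress_log_lines(blob) == records  # round trip is lossless
```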

Regular Optimization Tasks

Weekly

  • Review DynamoDB capacity utilization
  • Check for unused provisioned capacity
  • Monitor cost trends and anomalies
  • Analyze auto-scaling patterns

Monthly

  • Optimize baseline capacity settings
  • Review and adjust budget alerts
  • Analyze S3 storage class distribution
  • Review Lambda function performance metrics

Quarterly

  • Evaluate new AWS cost optimization features
  • Review access patterns for optimization opportunities
  • Update capacity planning based on growth trends
  • Assess DynamoDB reserved capacity opportunities

Annually

  • Comprehensive cost optimization review
  • Reserved capacity planning for predictable workloads
  • Architecture review for cost efficiency improvements
  • Cost allocation model updates

Emergency Cost Controls

The system includes built-in cost protection mechanisms:

  • Budget Thresholds: Automatic alerts at 80%, 90%, and 95%
  • Emergency Stop: System-wide halt capability at 100% budget
  • Suspension Controls: Automatic user suspension for budget violations
  • Admin Override: Manual controls for emergency situations

Cost Monitoring Integration

  • Real-time Alerts: SNS notifications for budget threshold violations
  • Dashboard Monitoring: CloudWatch dashboards for cost tracking
  • Automated Responses: Step Functions workflows for cost control actions
  • Audit Trail: Complete audit log of all cost-related actions

Bedrock Budgeteer is an open-source project licensed under MIT.
