Deployment Strategies
Methods for safely deploying new versions without downtime.
Blue-Green Deployment
Prepare two identical environments (Blue/Green) and switch traffic all at once.
Pros: Fast rollback | Cons: Requires 2x resources
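To make the all-at-once switch concrete, here is a minimal sketch of the idea. `BlueGreenRouter` is a hypothetical toy class, not part of any cloud SDK; in a real setup the switch would be a load balancer or DNS change.

```python
from dataclasses import dataclass


@dataclass
class BlueGreenRouter:
    """Toy router (hypothetical): all traffic goes to exactly one environment."""
    active: str = "blue"  # environment currently serving 100% of traffic

    @property
    def idle(self) -> str:
        # The environment that is not serving traffic right now.
        return "green" if self.active == "blue" else "blue"

    def switch(self) -> str:
        # Cut all traffic over to the idle environment in a single step.
        self.active = self.idle
        return self.active


router = BlueGreenRouter()
router.switch()                 # release: green (new version) takes all traffic
after_release = router.active
router.switch()                 # rollback: flip back to blue (old version)
after_rollback = router.active
```

Rollback is the same one-step operation as the release, which is why it is fast. The cost is that both environments must stay provisioned.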
Rolling Update
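Replaces instances with the new version in sequence. Cloud Run's default deployment works in a similar spirit: once a new revision is ready, traffic shifts to it and instances of the old revision shut down as they drain.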
Pros: Resource efficient | Cons: Rollback is the same as a new deployment
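The sequential replacement can be sketched as a simple simulation. The function name `rolling_update` and the `batch_size` parameter are illustrative assumptions; real platforms also run health checks between batches, which this sketch omits.

```python
def rolling_update(instances: list[str], new_version: str, batch_size: int = 1) -> list[list[str]]:
    """Replace instances batch by batch; return the fleet state after each step."""
    fleet = list(instances)
    history = []
    for start in range(0, len(fleet), batch_size):
        # Replace one batch of old instances with the new version.
        for i in range(start, min(start + batch_size, len(fleet))):
            fleet[i] = new_version
        history.append(list(fleet))
    return history


steps = rolling_update(["v1", "v1", "v1"], "v2")
for state in steps:
    print(state)
```

At every intermediate step both versions serve traffic at once, so the new version has to stay compatible with the old one (for example, in its database schema).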
Rollback Method
# Rollback to a previous version on Cloud Run
gcloud run services update-traffic my-api \
--to-revisions=my-api-00001-abc=100 \
--region=asia-northeast3
# Cloudflare Pages rollback (from the dashboard)
# Deployments → Previous deployment → Rollback to this deploy
Monitoring Basics
Log Collection
Cloud Run automatically collects logs in Cloud Logging.
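# Output structured logs in FastAPI
import json
import logging

from fastapi import FastAPI

app = FastAPI()
# Log the bare message so that each output line is valid JSON
logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger(__name__)

@app.get("/api/users/{user_id}")
def get_user(user_id: int):
    # One JSON object per line, so the fields can be parsed as structured data
    logger.info(json.dumps({
        "action": "get_user",
        "user_id": user_id,
        "status": "success",
    }))
    return {"id": user_id}
Key Metrics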
- Response Time: check p50, p95, and p99. Target: p95 < 200ms
- Error Rate: percentage of 5xx errors. Target: < 0.1%
- Request Count: requests per minute/hour, to understand traffic patterns
- Resource Usage: CPU and memory, used as scaling criteria
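As a rough illustration of how the response-time and error-rate metrics are computed, the sketch below uses the nearest-rank percentile method on raw samples. The helper names `percentile` and `error_rate` are assumptions for this example; monitoring systems such as Cloud Monitoring compute these from aggregated distributions and may give slightly different values.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def error_rate(status_codes: list[int]) -> float:
    """Share of responses with a 5xx status code, as a percentage."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return 100 * errors / len(status_codes)


latencies_ms = [float(n) for n in range(1, 101)]  # synthetic samples: 1..100 ms
p50, p95, p99 = (percentile(latencies_ms, p) for p in (50, 95, 99))
meets_latency_target = p95 < 200                   # target: p95 < 200ms
```

Percentiles are preferred over the average because a handful of very slow requests can hide behind a healthy mean.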
Alert Setup
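# Create alert policy on GCP (CLI)
# Counts 5xx responses; the threshold and duration are illustrative. A true error-rate (ratio) condition needs --policy-from-file.
gcloud alpha monitoring policies create \
--display-name="High Error Rate Alert" \
--condition-display-name="5xx responses above threshold" \
--condition-filter='resource.type="cloud_run_revision" AND metric.type="run.googleapis.com/request_count" AND metric.labels.response_code_class="5xx"' \
--if='> 5' \
--duration='300s'
Recommended Alert Channels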
- Slack: Instant alerts to team channels
- Email: Daily summary reports
- PagerDuty: On-call escalation for critical incidents
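One way to read the list above is as a routing rule keyed on severity. The sketch below is an assumption-heavy illustration: the source only says that PagerDuty is for critical incidents, so sending every alert to Slack is a choice made here, and `channels_for` is a hypothetical helper. Email is left out because it carries the daily summary rather than individual alerts.

```python
SEVERITIES = {"critical", "high", "medium", "low"}


def channels_for(severity: str) -> list[str]:
    """Pick notification channels for one alert (illustrative routing only)."""
    level = severity.lower()
    if level not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    if level == "critical":
        # Critical incidents page the on-call engineer and notify the team.
        return ["pagerduty", "slack"]
    # Everything else goes to the team channel only (assumption).
    return ["slack"]


print(channels_for("Critical"))
print(channels_for("low"))
```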
Incident Response
Incident Response Process
1. Detect: monitoring alerts or user reports
2. Classify: determine severity (Critical/High/Medium/Low)
3. Mitigate: roll back, scale up, block traffic, etc.
4. Resolve: identify and fix the root cause
5. Retrospect: write a postmortem and establish prevention measures
Rollback Decision Criteria
- Error rate rises by more than 1 percentage point
- p95 response time more than doubles
- Core functionality is down for more than 5 minutes
- Estimated fix time is more than 15 minutes
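The criteria above can be encoded as a small check. This is a sketch under two assumptions the source does not state: any single criterion is enough to trigger a rollback, and the 1% error-rate increase means one percentage point over the pre-deploy baseline. `IncidentSnapshot` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IncidentSnapshot:
    baseline_error_rate_pct: float  # error rate before the deploy, in percent
    current_error_rate_pct: float
    baseline_p95_ms: float          # p95 latency before the deploy
    current_p95_ms: float
    core_down_minutes: float        # how long core functionality has been down
    estimated_fix_minutes: float    # best guess for a forward fix


def rollback_reasons(s: IncidentSnapshot) -> list[str]:
    """Return every rollback criterion that the snapshot satisfies."""
    reasons = []
    if s.current_error_rate_pct - s.baseline_error_rate_pct > 1.0:
        reasons.append("error rate up by more than 1 percentage point")
    if s.current_p95_ms > 2 * s.baseline_p95_ms:
        reasons.append("p95 response time more than doubled")
    if s.core_down_minutes > 5:
        reasons.append("core functionality down for more than 5 minutes")
    if s.estimated_fix_minutes > 15:
        reasons.append("estimated fix time over 15 minutes")
    return reasons


def should_rollback(s: IncidentSnapshot) -> bool:
    # Assumption: one matching criterion is enough.
    return bool(rollback_reasons(s))


snapshot = IncidentSnapshot(0.05, 0.3, 120, 310, 0, 10)
decision = should_rollback(snapshot)
```

Returning the list of reasons, not just a boolean, gives the postmortem a record of why the rollback was triggered.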
Cost Management
Leveraging Free Tiers
| Service | Free Allowance |
|---|---|
| Cloudflare Pages | Unlimited requests, 500 builds/month |
| Cloud Run | 2M requests/month, 180,000 vCPU-seconds/month |
| Supabase | 500MB DB, 1GB storage |
| Neon | 0.5GB storage, 191 hours/month compute |
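A quick back-of-the-envelope check shows whether a workload fits the Cloud Run allowance in the table. The sketch assumes CPU is billed only while a request is being processed and that each request occupies the full vCPU, ignoring concurrency, memory, and billing rounding; `cloud_run_usage` is a hypothetical helper, so treat the result as a rough estimate.

```python
FREE_REQUESTS = 2_000_000     # requests per month (from the table above)
FREE_VCPU_SECONDS = 180_000   # vCPU-seconds per month (from the table above)


def cloud_run_usage(requests_per_month: int, avg_seconds_per_request: float, vcpu: float = 1.0) -> dict:
    """Estimate monthly usage and compare it with the free allowance."""
    vcpu_seconds = requests_per_month * avg_seconds_per_request * vcpu
    return {
        "requests": requests_per_month,
        "vcpu_seconds": vcpu_seconds,
        "within_free_tier": (
            requests_per_month <= FREE_REQUESTS
            and vcpu_seconds <= FREE_VCPU_SECONDS
        ),
    }


small = cloud_run_usage(500_000, 0.2)    # about 100,000 vCPU-seconds
large = cloud_run_usage(1_000_000, 0.2)  # about 200,000 vCPU-seconds
```

Under these assumptions the second workload stays below the request limit but exceeds the vCPU-seconds allowance, so compute time can run out before the request count does.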
Budget Alert Setup
# GCP budget alert setup (via console)
# 1. Billing → Budgets & alerts → Create budget
# 2. Set the budget amount (for example $10, $50, or $100 per month)
# 3. Email alerts at 50%, 90%, 100% of budget
Watch out for unexpected costs
- Data transfer (Egress) costs can be surprisingly high
- Log storage costs are not negligible
- Don't forget to shut down development instances
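The budget alert thresholds from the setup steps earlier in this section (50%, 90%, 100% of budget) amount to a simple comparison, sketched below. `crossed_thresholds` is a hypothetical helper for illustration; GCP evaluates the real alerts on its side.

```python
def crossed_thresholds(spend: float, budget: float, thresholds_pct=(50, 90, 100)) -> list[int]:
    """Return the alert thresholds (in percent of budget) that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used_pct = 100 * spend / budget
    return [t for t in thresholds_pct if used_pct >= t]


alerts = crossed_thresholds(spend=46, budget=50)  # 92% of a $50 budget
print(alerts)
```

A budget alert only notifies; it does not stop spending, so the items in the list above still need attention.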
Operations Runbook Extension
Congratulations!
You have completed all learning cycles! You now have the fundamentals to build, deploy, and operate a web service. Try starting a real project using Agent Recipes.