Responsibilities
- Help build and maintain dashboards, alerts, and log queries for critical services
- Keep incident timelines, action items, and runbooks accurate and discoverable
- Shadow on-call, help separate noisy alerts from actionable ones, and document learnings
- Support SLO/error-budget reviews by gathering metrics and preparing simple charts
- Help run failover tests and game-day drills alongside senior SREs