Overview

This endpoint returns all alerts that have been triggered by a specific check. Use it to analyze an individual check's alert history and alerting patterns, and to understand its reliability and performance trends over time.
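
A minimal request sketch using fetch, assuming the base URL, path, and headers shown in the curl examples later on this page (the token, account ID, and check ID are placeholders):

// Fetch the alert history for a single check (placeholder credentials)
const checkId = 'check_api_health';

const res = await fetch(`https://api.checklyhq.com/v1/check-alerts/${checkId}`, {
  headers: {
    'Authorization': 'Bearer cu_1234567890abcdef',               // placeholder API key
    'X-Checkly-Account': '550e8400-e29b-41d4-a716-446655440000'  // placeholder account ID
  }
});

const { data, meta } = await res.json(); // alerts plus pagination and summary metadata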

Response Example

{
  "data": [
    {
      "id": "alert_789123456",
      "checkId": "check_api_health",
      "checkName": "Production API Health Check",
      "checkType": "api",
      "alertType": "check_failure",
      "severity": "critical",
      "status": "resolved",
      "message": "Check failed: HTTP 500 Internal Server Error",
      "triggeredAt": "2024-01-25T14:30:00.000Z",
      "resolvedAt": "2024-01-25T14:45:00.000Z",
      "duration": 900000,
      "location": {
        "id": "us-east-1",
        "name": "N. Virginia, USA"
      },
      "checkResult": {
        "id": "result_123456789",
        "responseTime": 5240,
        "statusCode": 500,
        "error": "Internal Server Error",
        "url": "https://api.production.com/health",
        "headers": {
          "content-type": "application/json",
          "server": "nginx/1.18.0"
        },
        "body": "{\"error\": \"Database connection failed\"}"
      },
      "alertChannel": {
        "id": "channel_email_prod",
        "name": "Production Team Email",
        "type": "email",
        "target": "prod-team@example.com"
      },
      "escalationLevel": 2,
      "acknowledgedBy": {
        "id": "user_456",
        "name": "Sarah DevOps",
        "email": "sarah@example.com"
      },
      "acknowledgedAt": "2024-01-25T14:32:00.000Z",
      "resolutionNotes": "Database connection pool exhausted. Restarted database service.",
      "tags": ["api", "production", "critical", "database"]
    },
    {
      "id": "alert_456789123",
      "checkId": "check_api_health",
      "checkName": "Production API Health Check",
      "checkType": "api",
      "alertType": "performance_degradation",
      "severity": "medium",
      "status": "resolved",
      "message": "Response time exceeded threshold: 2.1s > 2.0s",
      "triggeredAt": "2024-01-24T09:15:00.000Z",
      "resolvedAt": "2024-01-24T09:22:00.000Z",
      "duration": 420000,
      "location": {
        "id": "eu-west-1",
        "name": "Ireland, Europe"
      },
      "checkResult": {
        "id": "result_456789123",
        "responseTime": 2100,
        "statusCode": 200,
        "url": "https://api.production.com/health"
      },
      "alertChannel": {
        "id": "channel_slack_ops",
        "name": "Operations Slack",
        "type": "slack",
        "target": "#ops-alerts"
      },
      "escalationLevel": 0,
      "acknowledgedBy": null,
      "acknowledgedAt": null,
      "resolutionNotes": "Performance returned to normal after CDN cache warming.",
      "tags": ["api", "production", "performance"]
    }
  ],
  "meta": {
    "checkId": "check_api_health",
    "checkName": "Production API Health Check",
    "totalAlerts": 47,
    "currentPage": 1,
    "totalPages": 5,
    "limit": 10,
    "timeframe": {
      "from": "2024-01-01T00:00:00.000Z",
      "to": "2024-01-31T23:59:59.999Z"
    },
    "summary": {
      "totalFailures": 12,
      "totalRecoveries": 12,
      "totalPerformanceDegradations": 23,
      "averageResolutionTime": 480000,
      "longestOutage": 1800000,
      "uptimePercentage": 99.1,
      "mttr": 480,
      "mttd": 120
    }
  }
}

Common Use Cases

Check Reliability Assessment

Evaluate reliability and uptime of individual checks
GET /v1/check-alerts/{checkId}?period=30d&includeSummary=true

Performance Trend Analysis

Analyze performance degradation patterns over time
GET /v1/check-alerts/{checkId}?alertType=performance_degradation

Incident Response Analysis

Evaluate team response times and incident handling
GET /v1/check-alerts/{checkId}?severity=critical&status=resolved

Alert Optimization

Optimize alert thresholds and reduce false positives
GET /v1/check-alerts/{checkId}?period=90d&limit=100

Query Parameters

Time Range

  • from (string): Start date for alerts (ISO 8601 format)
  • to (string): End date for alerts (ISO 8601 format)
  • period (string): Predefined time period (24h, 7d, 30d, 90d)

Example: ?from=2024-01-01T00:00:00Z&to=2024-01-31T23:59:59Z
Default: Last 30 days if no time parameters are provided

Filtering

  • alertType (string): Filter by alert type (check_failure, check_recovery, performance_degradation)
  • severity (string): Filter by severity level (low, medium, high, critical)
  • status (string): Filter by alert status (triggered, acknowledged, resolved)
  • location (string): Filter by monitoring location

Example: ?alertType=check_failure&severity=critical&status=resolved

Pagination and Sorting

  • page (integer): Page number (default: 1)
  • limit (integer): Number of alerts per page (default: 10, max: 100)
  • sortBy (string): Sort field (triggeredAt, duration, severity)
  • sortOrder (string): Sort order (asc, desc; default: desc)
  • includeSummary (boolean): Include alert summary statistics (default: true)

Default: Returns the first 10 alerts sorted by most recent
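
As a sketch of how these parameters combine, the snippet below builds a filtered, sorted, paginated query string; the parameter names come from the list above, and the check ID is the placeholder used elsewhere on this page:

// Combine time range, filter, and pagination parameters into a query string
const params = new URLSearchParams({
  period: '90d',
  alertType: 'check_failure',
  severity: 'critical',
  status: 'resolved',
  sortBy: 'triggeredAt',
  sortOrder: 'desc',
  limit: '100'
});

// e.g. period=90d&alertType=check_failure&severity=critical&status=resolved&...
const url = `https://api.checklyhq.com/v1/check-alerts/check_api_health?${params}`;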

Alert Summary Metrics

Reliability Metrics

  • Uptime Percentage: Overall availability during the time period
  • Total Failures: Number of failure alerts
  • Total Recoveries: Number of recovery alerts
  • MTTR: Mean Time To Recovery (seconds)
  • MTTD: Mean Time To Detection (seconds)

Performance Metrics

  • Performance Degradations: Number of performance alerts
  • Average Resolution Time: Mean time to resolve issues
  • Longest Outage: Duration of the longest outage
  • Escalation Rate: Percentage of alerts that escalated

Alert Patterns

  • Peak Alert Hours: Times when most alerts occur
  • Location Breakdown: Alerts by monitoring location
  • Severity Distribution: Breakdown of alert severities
  • Recovery Trends: How quickly issues are resolved
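
The summary object in the response example above does not include the location and severity breakdowns directly, but they can be derived client-side; a minimal sketch, assuming alerts holds the data array from the response:

// Group alerts by severity and by monitoring location (client-side breakdowns)
const bySeverity = {};
const byLocation = {};

for (const alert of alerts) {
  bySeverity[alert.severity] = (bySeverity[alert.severity] || 0) + 1;
  const loc = alert.location ? alert.location.name : 'unknown';
  byLocation[loc] = (byLocation[loc] || 0) + 1;
}

console.log('Severity distribution:', bySeverity); // e.g. { critical: 1, medium: 1 }
console.log('Location breakdown:', byLocation);    // e.g. { 'N. Virginia, USA': 1, 'Ireland, Europe': 1 }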

Team Response

  • Acknowledgment Rate: Percentage of alerts acknowledged
  • Response Time: Time from alert to acknowledgment
  • Resolution Notes: Documentation of fixes applied
  • Escalation Paths: How alerts were escalated

Alert Analysis Use Cases

Evaluate the reliability of individual checks:
  • Calculate uptime percentages for SLA reporting
  • Identify patterns in check failures
  • Track improvement trends after optimizations
  • Compare reliability across different time periods
// Example: Calculate check reliability metrics
const alerts = response.data;
const meta = response.meta;
const summary = meta.summary;

const reliabilityScore = {
  uptime: summary.uptimePercentage,
  mttr: summary.mttr / 60, // convert seconds to minutes
  mttd: summary.mttd / 60, // convert seconds to minutes
  failureRate: (summary.totalFailures / meta.totalAlerts) * 100 // totalAlerts lives on meta, not summary
};

console.log(`Check reliability: ${reliabilityScore.uptime}% uptime`);
console.log(`Average recovery time: ${reliabilityScore.mttr} minutes`);
Analyze performance degradation patterns:
  • Track response time alert frequency
  • Identify performance regression patterns
  • Monitor seasonal performance trends
  • Detect gradual performance degradation
# Example: Analyze performance degradation trends
from collections import defaultdict
from datetime import datetime

# alerts is the `data` array from the API response
performance_alerts = [
    alert for alert in alerts
    if alert['alertType'] == 'performance_degradation'
]

# Group by time of day to find performance patterns
hourly_performance_issues = defaultdict(int)

for alert in performance_alerts:
    hour = datetime.fromisoformat(alert['triggeredAt'].replace('Z', '+00:00')).hour
    hourly_performance_issues[hour] += 1

peak_hours = sorted(hourly_performance_issues.items(), key=lambda x: x[1], reverse=True)[:3]
print(f"Peak performance issue hours: {peak_hours}")
Evaluate incident response effectiveness:
  • Measure team response times
  • Track escalation patterns
  • Analyze resolution approaches
  • Identify areas for process improvement
# Example: Get critical alerts with response times
curl -X GET "https://api.checklyhq.com/v1/check-alerts/check_api_health?severity=critical&includeSummary=true" \
  -H "Authorization: Bearer cu_1234567890abcdef" \
  -H "X-Checkly-Account: 550e8400-e29b-41d4-a716-446655440000"
Optimize alert thresholds and configurations:
  • Analyze false positive rates
  • Evaluate alert noise levels
  • Optimize severity classifications
  • Fine-tune alert thresholds
// Example: Analyze alert patterns for optimization
const alertsByType = alerts.reduce((acc, alert) => {
  const key = `${alert.alertType}_${alert.severity}`;
  if (!acc[key]) {
    acc[key] = { count: 0, avgDuration: 0, escalations: 0 };
  }
  acc[key].count++;
  acc[key].avgDuration += alert.duration || 0;
  if (alert.escalationLevel > 0) acc[key].escalations++;
  return acc;
}, {});

// Calculate optimization insights
Object.keys(alertsByType).forEach(key => {
  const data = alertsByType[key];
  data.avgDuration = data.avgDuration / data.count;
  data.escalationRate = (data.escalations / data.count) * 100;
  
  if (data.escalationRate < 5 && data.avgDuration < 300000) {
    console.log(`Consider raising threshold for ${key}: low escalation rate and quick resolution`);
  }
});

Additional Examples

curl -X GET "https://api.checklyhq.com/v1/check-alerts/check_api_health?period=30d&includeSummary=true" \
  -H "Authorization: Bearer cu_1234567890abcdef" \
  -H "X-Checkly-Account: 550e8400-e29b-41d4-a716-446655440000"
This endpoint provides detailed alert history for individual checks, enabling deep analysis of check reliability, performance patterns, and incident response effectiveness.