Services Management

Overview

Services in Warrn represent the core components of your infrastructure that you want to monitor. Each service can have multiple integrations, health monitoring, and automated triage capabilities.

Service Registry

Centralized registry for all your services with rich metadata and categorization.

Health Monitoring

Real-time health status tracking with customizable alerting and escalation.

Service Registry

Service Structure

Each service in Warrn contains comprehensive metadata for effective monitoring and management:

interface Service {
  id: string;
  name: string;
  team_id: string;
  organization_id: string;
  description?: string;
  
  // Technology Stack
  language?: string;
  runtime?: string;
  cloud_provider?: string;
  
  // Infrastructure
  database?: string;
  cache_layer?: string;
  
  // Monitoring & Observability
  logs_provider?: string;
  tracing_provider?: string;
  alerting_provider?: string;
  monitoring_tool?: string;
  
  // Deployment
  deployment_tool?: string;
  
  // Configuration
  auto_triage_enabled: boolean;
  
  // Metrics (computed)
  health_status: "healthy" | "degraded" | "critical" | "unknown";
  alert_count: number;
  incident_count: number;
}

Service Creation

Basic Information

Define the service name and description, and assign the service to a team to establish ownership.

Technology Stack

Specify programming language, runtime, and cloud provider for better categorization.

Infrastructure Setup

Configure database, caching, and other infrastructure components.

Monitoring Integration

Connect logging, tracing, alerting, and monitoring tools.

Enable Auto-Triage

Configure AI-powered automatic triage for faster incident resolution.
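
The steps above map onto fields of the Service interface. As a minimal sketch, a creation payload could be assembled as follows. It assumes the create endpoint accepts the same optional field names as the interface; only name, team_id, description, language, and cloud_provider appear in the API Reference example on this page, so the remaining fields are an assumption. `NewServiceInput` and `buildServicePayload` are hypothetical names used for illustration.

```typescript
// Hypothetical input type: the writable subset of the Service interface.
// Computed fields (health_status, alert_count, incident_count) are omitted.
interface NewServiceInput {
  name: string;
  team_id: string;
  description?: string;
  language?: string;
  runtime?: string;
  cloud_provider?: string;
  database?: string;
  cache_layer?: string;
  logs_provider?: string;
  tracing_provider?: string;
  alerting_provider?: string;
  monitoring_tool?: string;
  deployment_tool?: string;
  auto_triage_enabled?: boolean;
}

// Trim the required fields, reject empty ones, and drop undefined or blank
// optional fields so the body only carries metadata that was filled in.
function buildServicePayload(input: NewServiceInput): Record<string, string | boolean> {
  const name = input.name.trim();
  const teamId = input.team_id.trim();
  if (name === "") throw new Error("Service name is required");
  if (teamId === "") throw new Error("Every service must be assigned to a team");

  const payload: Record<string, string | boolean> = {};
  for (const [key, value] of Object.entries({ ...input, name, team_id: teamId })) {
    if (value === undefined) continue;
    if (typeof value === "string" && value.trim() === "") continue;
    payload[key] = value;
  }
  return payload;
}

const payload = buildServicePayload({
  name: "API Gateway",
  team_id: "team-uuid",
  description: "Main API gateway service",
  language: "Node.js",
  cloud_provider: "AWS",
  auto_triage_enabled: true,
});
console.log(JSON.stringify(payload, null, 2));
```

Dropping empty optional fields keeps the request body small and avoids overwriting metadata with blank strings.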

Service Categories

Services can be categorized using technology stack information such as language, runtime, and cloud provider.

Health Monitoring

Health Status Levels

Healthy

All systems operational, no active alerts.

Degraded

Some issues detected, service partially affected.

Critical

Major issues, service significantly impacted.

Unknown

Health status cannot be determined.

Health Determination Logic

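type HealthStatus = Service["health_status"];

// Assumption: alerts are attached to the service object. The Service
// interface above does not declare this field.
type ServiceWithAlerts = Service & {
  alerts?: { severity: string; status: string }[];
};

const determineHealthStatus = (service: ServiceWithAlerts): HealthStatus => {
  // Only open alerts affect health; a missing alert list counts as no alerts
  const openAlerts = service.alerts?.filter(alert => alert.status === 'open') ?? [];
  const criticalAlerts = openAlerts.filter(alert => alert.severity === 'critical').length;

  if (criticalAlerts > 0) return "critical";     // any open critical alert
  if (openAlerts.length > 2) return "degraded";  // three or more open alerts
  if (openAlerts.length === 0) return "healthy"; // no open alerts
  return "unknown";                              // one or two open non-critical alerts
};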
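
To make the thresholds concrete, the sketch below reproduces the logic above in self-contained form and runs it on a few sample alert lists. The `Alert` shape and the `healthFromAlerts` name are assumptions for illustration, since the Service interface does not declare an alerts field. Note that, as written, one or two open non-critical alerts fall through to "unknown".

```typescript
type HealthStatus = "healthy" | "degraded" | "critical" | "unknown";

// Assumed alert shape; only the two fields the logic reads are modeled.
interface Alert {
  severity: "critical" | "warning" | "info";
  status: "open" | "resolved";
}

// Same rules as determineHealthStatus, taking the alert list directly.
function healthFromAlerts(alerts: Alert[] = []): HealthStatus {
  const open = alerts.filter(a => a.status === "open");
  const criticalOpen = open.filter(a => a.severity === "critical").length;

  if (criticalOpen > 0) return "critical";
  if (open.length > 2) return "degraded";
  if (open.length === 0) return "healthy";
  return "unknown"; // one or two open non-critical alerts
}

const warn: Alert = { severity: "warning", status: "open" };

console.log(healthFromAlerts([]));                                             // healthy
console.log(healthFromAlerts([warn]));                                         // unknown
console.log(healthFromAlerts([warn, warn, warn]));                             // degraded
console.log(healthFromAlerts([{ severity: "critical", status: "open" }]));     // critical
console.log(healthFromAlerts([{ severity: "critical", status: "resolved" }])); // healthy
```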

Service Integrations

Integration Types

Services support multiple integration types for comprehensive monitoring, including alert collectors, logs, and tracing.

Integration Management

// Create a new integration for a service.
// `api` is assumed to be a preconfigured HTTP client (axios-style: `post` resolves to `{ data }`).
const createIntegration = async (serviceId: string, integrationData: {
  type: 'alert_collector' | 'logs' | 'tracing';
  provider: string;
  config: Record<string, unknown>;
}) => {
  const response = await api.post(`/services/${serviceId}/integrations`, {
    integration_type: integrationData.type,
    provider: integrationData.provider,
    configuration: integrationData.config,
    is_active: true
  });
  
  return response.data;
};
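
A usage sketch for the function above. It assumes `api` is an axios-style client whose `post` resolves to an object with a `data` property; a stub stands in here so the example runs without a network. The stub, the `IntegrationInput` and `ApiClient` type names, the "datadog" provider value, and the echoed `id` are all hypothetical.

```typescript
type IntegrationType = "alert_collector" | "logs" | "tracing";

interface IntegrationInput {
  type: IntegrationType;
  provider: string;
  config: Record<string, unknown>;
}

// Minimal client interface assumed by createIntegration (axios-like).
interface ApiClient {
  post(path: string, body: unknown): Promise<{ data: unknown }>;
}

// Stub client: records each call and echoes the body back with a fake id.
const calls: Array<{ path: string; body: unknown }> = [];
const api: ApiClient = {
  async post(path, body) {
    calls.push({ path, body });
    return { data: { id: "integration-1", ...(body as object) } };
  },
};

async function createIntegration(serviceId: string, input: IntegrationInput) {
  const response = await api.post(`/services/${serviceId}/integrations`, {
    integration_type: input.type,
    provider: input.provider,
    configuration: input.config,
    is_active: true,
  });
  return response.data;
}

createIntegration("service-123", {
  type: "logs",
  provider: "datadog",
  config: { index: "main" },
}).then(created => console.log(calls[0]?.path, created));
```

Typing the client as a small interface makes it straightforward to swap the stub for a real HTTP client in application code.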

Service Dashboard

Overview Cards

The service detail view provides comprehensive insights through organized sections:

Service Overview

Basic service information
Technology stack details
Team ownership
Health status indicator

Quick Stats

Active alert count
Recent incident count
Integration status
Auto-triage status

Service Tabs

Service Search & Filtering

Advanced Search

The service management interface supports case-insensitive search across service name, description, and owning team name:

// Normalize the query once instead of on every comparison
const query = searchQuery.toLowerCase();

const filteredServices = services.filter(service => {
  const teamName = mappings.teams[service.team_id] || 'Unknown Team';

  // Match against service name, optional description, and team name
  return (
    service.name.toLowerCase().includes(query) ||
    (service.description?.toLowerCase().includes(query) ?? false) ||
    teamName.toLowerCase().includes(query)
  );
});

Sorting Options

Name

Alphabetical sorting by service name.

Created Date

Sort by service creation date.

Health Status

Sort by current health status.

Alert Count

Sort by number of active alerts.
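
The four sort options can be sketched as a table of comparators. This is an illustration under assumptions: `created_at` is not part of the Service interface shown earlier and is assumed to be an ISO 8601 timestamp, and the sort directions (newest first, most severe first, most alerts first) are guesses rather than documented behavior. `SortableService`, `healthRank`, and `sortServices` are hypothetical names.

```typescript
type HealthStatus = "healthy" | "degraded" | "critical" | "unknown";
type SortKey = "name" | "created_at" | "health_status" | "alert_count";

// Only the fields needed for sorting; created_at is an assumed field.
interface SortableService {
  name: string;
  created_at: string;
  health_status: HealthStatus;
  alert_count: number;
}

// Assumed ordering: most severe first, undetermined status last.
const healthRank: Record<HealthStatus, number> = {
  critical: 0,
  degraded: 1,
  healthy: 2,
  unknown: 3,
};

const comparators: Record<SortKey, (a: SortableService, b: SortableService) => number> = {
  name: (a, b) => a.name.localeCompare(b.name),
  created_at: (a, b) => Date.parse(b.created_at) - Date.parse(a.created_at), // newest first
  health_status: (a, b) => healthRank[a.health_status] - healthRank[b.health_status],
  alert_count: (a, b) => b.alert_count - a.alert_count, // most alerts first
};

// Returns a sorted copy so the caller's array is left untouched.
function sortServices(services: SortableService[], key: SortKey): SortableService[] {
  return [...services].sort(comparators[key]);
}

const sample: SortableService[] = [
  { name: "billing", created_at: "2024-03-01T00:00:00Z", health_status: "healthy", alert_count: 0 },
  { name: "api-gateway", created_at: "2024-01-15T00:00:00Z", health_status: "critical", alert_count: 4 },
  { name: "search", created_at: "2024-02-10T00:00:00Z", health_status: "unknown", alert_count: 1 },
];

console.log(sortServices(sample, "name").map(s => s.name));          // api-gateway, billing, search
console.log(sortServices(sample, "health_status").map(s => s.name)); // api-gateway, billing, search
console.log(sortServices(sample, "alert_count").map(s => s.name));   // api-gateway, search, billing
```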

Multi-tenant Architecture

All service operations respect multi-tenant boundaries and ensure complete data isolation between organizations.

Organization Isolation

// Service queries automatically scope to user's organization
const { data: services } = useServices(); // Only returns current org's services

// Team mappings respect organization boundaries
const { mappings } = useEntityMappings(); // Only includes current org's teams
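
The internals of these hooks are not shown here. Purely as an illustration of what organization scoping means for the data, the sketch below filters services and builds a team-name map for a single organization. The `Team` shape, `scopeToOrganization`, and the sample ids are hypothetical; isolation of this kind is typically enforced on the server rather than in client code.

```typescript
interface OrgService {
  id: string;
  name: string;
  team_id: string;
  organization_id: string;
}

// Assumed team shape for this sketch.
interface Team {
  id: string;
  name: string;
  organization_id: string;
}

// Keep only the records that belong to the given organization and build
// a team id -> team name lookup shaped like `mappings.teams`.
function scopeToOrganization(orgId: string, services: OrgService[], teams: Team[]) {
  const scopedServices = services.filter(s => s.organization_id === orgId);

  const teamNames: Record<string, string> = {};
  for (const team of teams) {
    if (team.organization_id === orgId) teamNames[team.id] = team.name;
  }

  return { services: scopedServices, mappings: { teams: teamNames } };
}

const scoped = scopeToOrganization(
  "org-a",
  [
    { id: "svc-1", name: "API Gateway", team_id: "team-1", organization_id: "org-a" },
    { id: "svc-2", name: "Ledger", team_id: "team-9", organization_id: "org-b" },
  ],
  [
    { id: "team-1", name: "Platform", organization_id: "org-a" },
    { id: "team-9", name: "Finance", organization_id: "org-b" },
  ],
);

console.log(scoped.services.map(s => s.name)); // only org-a services
console.log(scoped.mappings.teams);            // only org-a teams
```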

Team-based Access Control

Services are associated with teams for proper access control and ownership:

Team Assignment: Each service must be assigned to a team
Role-based Access: Team members have different permission levels
Cross-team Visibility: Services can be made visible across teams when needed

Best Practices

Service Organization

Logical Grouping

Group related services by business function or technical domain.

Consistent Naming

Use consistent naming conventions across services for better discoverability.

Rich Metadata

Fill out all relevant metadata fields for better categorization and searching.

Team Ownership

Ensure every service has clear team ownership for accountability.

Monitoring Strategy

Performance Optimization

Lazy Loading: Service details are loaded on-demand to improve initial page load
Caching: Entity mappings are cached with appropriate stale times
Pagination: Large service lists are paginated for better performance
Optimistic Updates: UI updates immediately while API calls complete in background
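
As one illustration of the pagination point, here is a minimal client-side paging helper. The pagination parameters of the actual API are not documented in this section, so `paginate` and the `Page` shape are hypothetical.

```typescript
interface Page<T> {
  items: T[];
  page: number;      // 1-based page number actually returned
  pageSize: number;
  totalItems: number;
  totalPages: number;
}

// Slice one page out of a full list, clamping the requested page number
// into the valid range so out-of-range requests still return a page.
function paginate<T>(all: T[], page: number, pageSize: number): Page<T> {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const totalPages = Math.max(1, Math.ceil(all.length / pageSize));
  const requested = Number.isFinite(page) ? Math.trunc(page) : 1;
  const current = Math.min(Math.max(1, requested), totalPages);
  const start = (current - 1) * pageSize;

  return {
    items: all.slice(start, start + pageSize),
    page: current,
    pageSize,
    totalItems: all.length,
    totalPages,
  };
}

const serviceNames = ["auth", "billing", "cart", "search", "gateway", "mailer", "reports"];
console.log(paginate(serviceNames, 1, 3).items); // first three names
console.log(paginate(serviceNames, 3, 3).items); // last page holds the remaining name
```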

Integration Examples

Alert Collector Setup

# Create a Prometheus alert collector
POST /api/services/{serviceId}/integrations
{
  "integration_type": "alert_collector",
  "provider": "prometheus",
  "configuration": {
    "endpoint": "https://prometheus.example.com/api/v1/alerts",
    "interval_minutes": 5,
    "authentication": {
      "type": "bearer_token",
      "token": "your-prometheus-token"
    }
  },
  "is_active": true
}

Health Check Integration

# Configure service health endpoint
POST /api/services/{serviceId}/health-checks
{
  "endpoint": "https://api.example.com/health",
  "method": "GET",
  "interval_seconds": 30,
  "timeout_seconds": 10,
  "expected_status": 200
}
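
How Warrn itself evaluates a health check is not specified here. As a sketch of how the fields in this configuration could be interpreted, the function below judges a single probe against `expected_status` and `timeout_seconds`. `ProbeResult` and `evaluateProbe` are hypothetical names, and the pass/fail rule is an assumption.

```typescript
// Mirrors the request body shown above.
interface HealthCheckConfig {
  endpoint: string;
  method: string;
  interval_seconds: number;
  timeout_seconds: number;
  expected_status: number;
}

// Hypothetical outcome of one probe: status is null when no response arrived.
interface ProbeResult {
  status: number | null;
  elapsedMs: number;
}

// Assumed rule: a probe passes only if a response arrived within the
// timeout and its status code equals expected_status.
function evaluateProbe(config: HealthCheckConfig, result: ProbeResult): "pass" | "fail" {
  if (result.status === null) return "fail";
  if (result.elapsedMs > config.timeout_seconds * 1000) return "fail";
  return result.status === config.expected_status ? "pass" : "fail";
}

const check: HealthCheckConfig = {
  endpoint: "https://api.example.com/health",
  method: "GET",
  interval_seconds: 30,
  timeout_seconds: 10,
  expected_status: 200,
};

console.log(evaluateProbe(check, { status: 200, elapsedMs: 120 }));   // pass
console.log(evaluateProbe(check, { status: 503, elapsedMs: 80 }));    // fail
console.log(evaluateProbe(check, { status: 200, elapsedMs: 12000 })); // fail
```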

API Reference

Core Endpoints

# List services
GET /api/services

# Get service details
GET /api/services/{id}

# Create service
POST /api/services
{
  "name": "API Gateway",
  "team_id": "team-uuid",
  "description": "Main API gateway service",
  "language": "Node.js",
  "cloud_provider": "AWS"
}

# Update service
PATCH /api/services/{id}
{
  "auto_triage_enabled": true,
  "monitoring_tool": "Datadog"
}

# Delete service
DELETE /api/services/{id}
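
The endpoints above can be wrapped in small typed helpers. The sketch below only builds request descriptions (method, path, body) and sends nothing; authentication and the base URL are not documented in this section, so they are left out. `RequestSpec` and `servicesApi` are hypothetical names.

```typescript
interface RequestSpec {
  method: "GET" | "POST" | "PATCH" | "DELETE";
  path: string;
  body?: Record<string, unknown>;
}

// Fields taken from the create example above; name and team_id are
// treated as required here, which is an assumption.
interface CreateServiceBody {
  name: string;
  team_id: string;
  description?: string;
  language?: string;
  cloud_provider?: string;
}

// Encode the id so unusual characters cannot alter the path.
const servicePath = (id: string) => `/api/services/${encodeURIComponent(id)}`;

const servicesApi = {
  list: (): RequestSpec => ({ method: "GET", path: "/api/services" }),
  get: (id: string): RequestSpec => ({ method: "GET", path: servicePath(id) }),
  create: (body: CreateServiceBody): RequestSpec => ({
    method: "POST",
    path: "/api/services",
    body: { ...body },
  }),
  update: (id: string, patch: Record<string, unknown>): RequestSpec => ({
    method: "PATCH",
    path: servicePath(id),
    body: patch,
  }),
  remove: (id: string): RequestSpec => ({ method: "DELETE", path: servicePath(id) }),
};

console.log(servicesApi.update("svc-1", { auto_triage_enabled: true }));
console.log(servicesApi.remove("svc-1"));
```

Keeping request construction separate from transport makes the helpers easy to unit test and lets any HTTP client perform the actual call.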

See the Services API documentation for complete endpoint details and examples.
