Fleet-Scale Operations

Consistent execution across hundreds or thousands of nodes. Run commands, push configs, and roll out deployments across the fleet using canary and rolling strategies, failure thresholds, and per-agent tracking.

5-Step Deployment Wizard

1. Select Operation: Choose from 34+ operation types
2. Configure: Operation-specific forms with validation
3. Select Targets: All agents, groups, or individual agents
4. Strategy: All-at-once, Canary, or Rolling Waves
5. Review & Deploy: Pre-execution summary

34+ Operation Types

Deployment & Config
  • Config Deploy with dry-run + SafeApply
  • Command Exec with live streaming
  • Script Exec (Bash/Python/Ruby/Perl)
  • File Operations
  • App Deploy
Service & Package
  • Service Control
  • Package Management
  • Patch & Update + rollback
  • Cron & Timer management
System Admin
  • User Management
  • Group Management
  • System Config (hostname/timezone/locale/NTP)
  • Disk & Storage
  • Repository Management
  • Network Config
Security
  • SSH Key deployment
  • SSH Hardening with auto-rollback
  • Firewall rules (nftables/firewalld)
  • TLS Certificate deployment
  • CIS Benchmark scanning + auto-fix
  • Fail2Ban
  • Sudoers validation
  • Password policy
  • Port scan detection
DevOps
  • Docker/Podman management
  • Reverse proxy config
  • Environment variables
  • Git Deploy
  • Health Checks
  • Process Control
  • Backup
  • Database maintenance
  • Log shipping
Advanced
  • Config Drift detection
  • Kernel Tuning (sysctl/modules/GRUB)
  • Webhooks
  • Templates

Deployment Strategies

All-at-once
Immediate parallel execution on all targets
Canary
Test on 1-N agents first, promote if healthy
Rolling Waves
Progressive waves (25% → 50% → 75% → 100%) with a pause between waves and stop-on-failure
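
To make the Rolling Waves strategy concrete, here is a minimal sketch of how cumulative wave percentages could map onto a list of agents, and how stop-on-failure could halt the rollout. The names plan_waves, run_rolling, and the execute callback are hypothetical and chosen for illustration; they are not the product's API, and the pause between waves is only marked with a comment.

```python
from typing import Callable, List, Sequence, Tuple


def plan_waves(agents: Sequence[str],
               cumulative_pcts: Sequence[int] = (25, 50, 75, 100)) -> List[List[str]]:
    """Split agents into waves; each wave raises coverage to the next cumulative percentage."""
    waves: List[List[str]] = []
    done = 0
    for pct in cumulative_pcts:
        # Ceiling division, so small fleets still get a non-empty first wave.
        target = -(-len(agents) * pct // 100)
        if target > done:
            waves.append(list(agents[done:target]))
            done = target
    return waves


def run_rolling(agents: Sequence[str],
                execute: Callable[[str], bool],
                cumulative_pcts: Sequence[int] = (25, 50, 75, 100),
                ) -> Tuple[List[str], List[str], List[str]]:
    """Run waves in order and stop after the first wave that contains a failure.

    Returns (succeeded, failed, not_started).
    """
    succeeded: List[str] = []
    failed: List[str] = []
    waves = plan_waves(agents, cumulative_pcts)
    for index, wave in enumerate(waves):
        for agent in wave:
            (succeeded if execute(agent) else failed).append(agent)
        if failed:
            # Stop-on-failure: later waves are never started.
            not_started = [a for later in waves[index + 1:] for a in later]
            return succeeded, failed, not_started
        # Assumption: a real rollout would pause here before the next wave.
    return succeeded, failed, []
```

With ten agents and the default percentages, this sketch produces waves of 3, 2, 3, and 2 agents. A canary run can be seen as the same idea with a single small first wave that must be healthy before the rest is promoted.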

Health Check Gates

  • Service Restart: Verify service starts successfully
  • HTTP Probe: HTTP endpoint returns 2xx
  • Command Exec: Custom command returns exit code 0
  • File Exists: Verify file presence
  • Process Running: Check process is alive
  • TCP Port: Verify port is accepting connections
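
As an illustration of what several of these gates check, the sketch below implements HTTP Probe, Command Exec, File Exists, and TCP Port using only the Python standard library. The function names and the run_gates helper are hypothetical, not the product's implementation. Service Restart and Process Running are left out; on many systems they could be expressed through the command gate (for example with a service manager's status command), but that mapping is an assumption.

```python
import os
import socket
import subprocess
import sys
import urllib.error
import urllib.request
from typing import Callable, Dict, List, Sequence


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """HTTP Probe: pass only if the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, timeouts, refused connections, and malformed URLs.
        return False


def command_ok(argv: Sequence[str], timeout: float = 30.0) -> bool:
    """Command Exec: pass only if the command exits with code 0."""
    try:
        result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout)
        return result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False


def file_exists(path: str) -> bool:
    """File Exists: pass if the path is present."""
    return os.path.exists(path)


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """TCP Port: pass if a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_gates(gates: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every gate and return the names of those that failed (empty means healthy)."""
    return [name for name, check in gates.items() if not check()]


if __name__ == "__main__":
    failed = run_gates({
        "interpreter present": lambda: file_exists(sys.executable),
        "trivial command": lambda: command_ok([sys.executable, "-c", "pass"]),
    })
    print("failed gates:", failed)
```

Returning the list of failed gate names, rather than a single boolean, keeps enough detail for per-agent reporting.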

Safety Features

  • Dry-run preview before execution
  • Config diff visualization (side-by-side)
  • SafeApply with automatic rollback on failure
  • Circuit breaker — auto-abort on repeated failures
  • Per-agent retry (max 3 attempts)
  • Abort in-progress operations
  • Retry failed agents without re-running successful ones
  • Time-travel snapshots for rollback capability
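
The retry and circuit-breaker behaviour above can be sketched as follows. This is a simplified, sequential model under stated assumptions: "repeated failures" is interpreted as a number of consecutive failed agents (failure_threshold, a hypothetical parameter), and all function names are illustrative rather than the product's API. Only the limit of 3 attempts per agent comes from the list above.

```python
from typing import Callable, Dict, List, Sequence

MAX_ATTEMPTS = 3  # from "Per-agent retry (max 3 attempts)"


def run_with_retry(agent: str, execute: Callable[[str], bool],
                   max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Try one agent up to max_attempts times; stop at the first success."""
    for _ in range(max_attempts):
        if execute(agent):
            return True
    return False


def run_fleet(agents: Sequence[str], execute: Callable[[str], bool],
              failure_threshold: int = 2) -> Dict[str, str]:
    """Return a per-agent status: "success", "failed", or "aborted".

    Assumption: the circuit breaker trips after failure_threshold consecutive
    agents have failed, and every remaining agent is then marked "aborted".
    """
    status: Dict[str, str] = {}
    consecutive_failures = 0
    tripped = False
    for agent in agents:
        if tripped:
            status[agent] = "aborted"
            continue
        if run_with_retry(agent, execute):
            status[agent] = "success"
            consecutive_failures = 0
        else:
            status[agent] = "failed"
            consecutive_failures += 1
            tripped = consecutive_failures >= failure_threshold
    return status


def agents_to_retry(status: Dict[str, str]) -> List[str]:
    """Select only agents that did not succeed, so successful ones are not re-run."""
    return [agent for agent, state in status.items() if state != "success"]
```

Passing the result of agents_to_retry back into run_fleet re-runs only the failed and aborted agents, which mirrors the "retry failed agents without re-running successful ones" item.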

Fleet Users & SSH Management

  • Create/modify/delete users across all agents or groups
  • Group management with member assignment
  • SSH key deployment, removal, replacement per user
  • Operation history and tracking

Fleet Monitoring

  • Live monitor with real-time job tracking and progress bars
  • Agent-level progress: per-agent success/failure/in-progress
  • Log streaming with timestamps and agent identification
  • Job history: searchable, filterable (Running/Success/Failed/Scheduled)

Fleet Scheduling

Schedule Types
  • Once or Recurring
  • Hourly/Daily/Weekly/Monthly/Custom
Configuration
  • Enable/disable
  • Skip-if-running
  • Retry-on-failure
  • Tracking (next run, last run, success/failure counts)
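
The scheduling options above interact in a way that a small model can make clearer. The sketch below is a hypothetical, single-threaded Schedule class: the field names, the tick method, and the choice of exactly one immediate retry for retry-on-failure are all assumptions for illustration, not the product's behaviour.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class Schedule:
    interval: timedelta                  # e.g. hourly, daily, or a custom period
    next_run: datetime
    enabled: bool = True
    skip_if_running: bool = True
    retry_on_failure: bool = False       # assumption: one immediate retry
    running: bool = False                # set while a previous run is in flight
    last_run: Optional[datetime] = None
    success_count: int = 0
    failure_count: int = 0

    def tick(self, now: datetime, job: Callable[[], bool]) -> str:
        """Evaluate the schedule at `now`; returns idle, skipped, success, or failure."""
        if not self.enabled or now < self.next_run:
            return "idle"
        self.next_run = now + self.interval
        if self.running and self.skip_if_running:
            # The previous run has not finished, so this occurrence is dropped.
            return "skipped"
        self.running = True
        try:
            ok = job()
            if not ok and self.retry_on_failure:
                ok = job()
        finally:
            self.running = False
        self.last_run = now
        if ok:
            self.success_count += 1
        else:
            self.failure_count += 1
        return "success" if ok else "failure"
```

A one-off schedule fits the same shape by disabling the schedule after its first run. The tracking fields (next_run, last_run, success_count, failure_count) correspond to the Tracking item in the list.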