Documentation

Cloud-hosted ScriptMesh — deploy a Docker agent on any server and start running scripts via REST API or dashboard in minutes.

Quick Start

ScriptMesh is cloud-hosted at api.getscriptmesh.com. No server setup needed. The four-step flow is: create account → deploy agent → register agent → run first script.

01  Create account
02  Deploy agent
03  Register agent
04  Run script

Architecture

ScriptMesh follows a hub-and-spoke model. A central orchestrator (cloud-hosted) controls a fleet of lightweight agents deployed wherever your workloads run — cloud servers, on-prem machines, edge nodes, or CI runners.

scriptmesh architecture (diagram)

Clients:       Dashboard, REST API, CI/CD
Orchestrator:  api.getscriptmesh.com (JWT Auth, Job Queue, APScheduler, Event Bus, Metrics)
Agents:        prod-us-east (production, /scripts)
               prod-eu-west (production, /scripts)
               staging-1 (staging, /scripts)

Agents send heartbeats every 60s (outbound only, no inbound ports).

Outbound-only agents

Agents poll the orchestrator for jobs and send heartbeats. No inbound ports. No VPN. Deploy behind any NAT or firewall.

Tenant isolation

Every agent, job, schedule, and API key is scoped to your tenant. Multi-tenant at the DB level — no data bleeds between workspaces.

Async job queue

Scripts execute asynchronously: the orchestrator queues the job, the agent picks it up, and stdout/stderr are returned when the job completes.

Deploy an Agent

Agents are lightweight Docker containers you run on any server — cloud, on-prem, or edge. They connect outbound to the orchestrator. No inbound ports, no VPN needed.

docker run — any server
docker run -d \
  --name scriptmesh-agent \
  --restart unless-stopped \
  -e AGENT_NAME=prod-server-1 \
  -e ORCHESTRATOR_URL=https://api.getscriptmesh.com \
  -e API_KEY=sm_live_YOUR_API_KEY \
  -v /path/to/scripts:/scripts \
  scriptmesh/agent:latest

Replace sm_live_YOUR_API_KEY with an API key from your dashboard under Settings → API Keys. Generate a dedicated key per agent for isolated revocation.

Register the Agent

When the agent container starts, it automatically registers with the orchestrator. You can also manually register via the REST API:

curl — register agent
curl -X POST https://api.getscriptmesh.com/register-agent \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_name": "prod-server-1",
    "url": "https://prod-server-1.internal:8001",
    "api_key": "agent-key-here",
    "tags": ["production", "backend"]
  }'

# Response
{ "status": "ok", "agent": "prod-server-1" }

Run Your First Script

Trigger a script on a registered agent. The script must be listed in the agent's manifest.

curl — trigger script
# Trigger async execution
curl -X POST https://api.getscriptmesh.com/trigger-script \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "prod-server-1",
    "run_script": "backup.sh",
    "params": { "target": "s3://my-bucket" },
    "async_exec": true
  }'

# Returns immediately with a job ID
{ "job_id": "job_a3f8c9d2", "status": "pending" }

# Poll the job result
curl https://api.getscriptmesh.com/jobs/job_a3f8c9d2 \
  -H "Authorization: Bearer $TOKEN"

# When complete
{
  "job_id": "job_a3f8c9d2",
  "status": "success",
  "exit_code": 0,
  "stdout": "Backup complete. 2.4GB archived to s3://my-bucket",
  "stderr": "",
  "duration_ms": 4231,
  "completed_at": "2026-03-10T14:32:11Z"
}
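
The poll step above can be wrapped in a small helper. What follows is a minimal sketch, not an official client: wait_for_job and fetch_job_status are hypothetical names, and the 5-second interval and 300-second ceiling are arbitrary defaults. It assumes TOKEN is set, jq is installed, and that pending and running are the only non-terminal statuses, as the examples in this guide suggest.

```shell
#!/bin/bash
# Hypothetical helper: print the status field of one job.
# Assumes TOKEN holds a valid bearer token and jq is installed.
fetch_job_status() {
  curl -s "https://api.getscriptmesh.com/jobs/$1" \
    -H "Authorization: Bearer $TOKEN" | jq -r '.status'
}

# Poll a job until it leaves pending/running or the deadline passes.
# Usage: wait_for_job JOB_ID [INTERVAL_SECONDS] [MAX_SECONDS]
# Prints the last observed status; returns 0 only when it is "success".
wait_for_job() {
  local job_id="$1" interval="${2:-5}" max="${3:-300}"
  local waited=0 status="pending"
  while :; do
    status=$(fetch_job_status "$job_id")
    case "$status" in
      pending|running) ;;   # still in flight, keep polling
      *) break ;;           # terminal (or unexpected) status
    esac
    if [ "$waited" -ge "$max" ]; then break; fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$status"
  [ "$status" = "success" ]
}
```

A call such as wait_for_job job_a3f8c9d2 10 600 would poll every 10 seconds for up to 10 minutes, and its exit code can gate the next step of a pipeline.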

Fan-Out Execution

Fan-out lets you run the same script across your entire fleet, or a filtered subset, at the same time. In the pattern shown here, one trigger-script call is issued per target agent, so every agent gets its own job record that you can track, diff, and audit individually.

Fan-out pattern — trigger script on all production agents
# 1. Get all online production agents
AGENTS=$(curl -s https://api.getscriptmesh.com/agents \
  -H "Authorization: Bearer $TOKEN" | \
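  jq -r 'to_entries[]
         | select(.value.status == "online")
         | select(.value.tags | index("production"))
         | .key')
# index() matches the tag exactly and emits each agent once,
# even if several of its tags contain the word "production"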

# 2. Fan out — fire all in parallel, collect job IDs
JOB_IDS=()
for AGENT in $AGENTS; do
  JOB_ID=$(curl -s -X POST https://api.getscriptmesh.com/trigger-script \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{
      \"agent\": \"$AGENT\",
      \"run_script\": \"deploy.sh\",
      \"params\": { \"version\": \"v2.4.1\" },
      \"async_exec\": true
    }" | jq -r '.job_id')
  JOB_IDS+=("$JOB_ID")
  echo "Triggered $AGENT → $JOB_ID"
done

# 3. Check the current status of each job (re-run until none are pending or running)
for JOB_ID in "${JOB_IDS[@]}"; do
  STATUS=$(curl -s https://api.getscriptmesh.com/jobs/$JOB_ID \
    -H "Authorization: Bearer $TOKEN" | jq -r '.status')
  echo "$JOB_ID: $STATUS"
done

Fan-out is built for workflows like rolling deploys, fleet-wide config pushes, and parallel health checks. Each job is independently tracked — if two agents succeed and one fails, you can pinpoint exactly which one without wading through interleaved logs.
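
Because each agent has its own job record, a fan-out run can be reduced to a pass/fail summary. The sketch below assumes you have already collected one "agent status" pair per line, for example by extending the loop above. tally_fanout is a hypothetical helper, not part of ScriptMesh.

```shell
#!/bin/bash
# Hypothetical helper: summarize fan-out results.
# Reads "AGENT STATUS" pairs on stdin (one per line), prints a summary line,
# then one line per agent whose status is not "success".
# Returns non-zero if any such agent exists.
tally_fanout() {
  local agent status ok=0 bad=0 report=""
  # The "|| [ -n ... ]" clause keeps a final line that lacks a trailing newline.
  while read -r agent status || [ -n "$agent" ]; do
    [ -z "$agent" ] && continue          # skip blank lines
    if [ "$status" = "success" ]; then
      ok=$((ok + 1))
    else
      bad=$((bad + 1))
      report="${report}not ok: ${agent}:${status}
"
    fi
  done
  echo "succeeded=$ok failed=$bad"
  if [ "$bad" -gt 0 ]; then
    printf '%s' "$report"
    return 1
  fi
}
```

Piping the collected pairs into tally_fanout at the end of a deploy script lets the script's own exit code reflect whether every agent succeeded.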

Rolling deploys

Push a new version to every prod agent simultaneously, monitor exit codes per node.

Fleet config push

Distribute updated config files across 50+ agents in one API call.

Parallel health checks

Run diagnostics across regions simultaneously — first-class per-agent result diffing.

Real-World Examples

End-to-end examples showing ScriptMesh in common production scenarios.

Nightly backup across all servers

Schedule a backup script to run at 2 AM on every production agent. A Slack notification fires when any job completes or fails.

backup.sh (on agent)
#!/bin/bash
# /scripts/backup.sh, mounted into the agent container
set -euo pipefail

# Use the "target" variable if it is set, otherwise a per-host prefix
TARGET="${target:-s3://my-backups/$(hostname)}"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
ARCHIVE="/tmp/backup_${TIMESTAMP}.tar.gz"

echo "Archiving /var/app/data to ${TARGET}/${TIMESTAMP}.tar.gz ..."
tar czf "$ARCHIVE" /var/app/data/

# Record the size before the local archive is deleted
SIZE=$(du -sh "$ARCHIVE" | cut -f1)

aws s3 cp "$ARCHIVE" "${TARGET}/${TIMESTAMP}.tar.gz" --storage-class STANDARD_IA
rm -f "$ARCHIVE"

echo "Done. Size: ${SIZE}"

curl — create schedule for all prod agents
# Repeat for each production agent
for AGENT in prod-us-east prod-eu-west prod-ap-south; do
  curl -X POST https://api.getscriptmesh.com/schedules \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{
      \"agent\": \"$AGENT\",
      \"script\": \"backup.sh\",
      \"cron_expression\": \"0 2 * * *\",
      \"params\": { \"target\": \"s3://my-backups/$AGENT\" },
      \"webhook_url\": \"https://hooks.slack.com/services/...\"
    }"
done

Multi-region deploy rollout

Trigger a canary-style rollout: run deploy.sh on staging, check that the job status is success, then fan out to production only if it is.

deploy-pipeline.sh
#!/bin/bash
TOKEN="$SM_TOKEN"
VERSION="${1:?usage: deploy-pipeline.sh VERSION}"

# Step 1: deploy to staging first
echo "→ Deploying $VERSION to staging..."
# Inner quotes are escaped so the JSON body reaches curl intact
JOB=$(curl -s -X POST https://api.getscriptmesh.com/trigger-script \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"agent\":\"staging-1\",\"run_script\":\"deploy.sh\",
       \"params\":{\"version\":\"$VERSION\"},\"async_exec\":true}" | jq -r '.job_id')

# Poll until complete (60 polls x 5s = max 5 mins)
for i in $(seq 1 60); do
  STATUS=$(curl -s "https://api.getscriptmesh.com/jobs/$JOB" \
    -H "Authorization: Bearer $TOKEN" | jq -r '.status')
  [[ "$STATUS" != "pending" && "$STATUS" != "running" ]] && break
  sleep 5
done

# A job that is still pending or running after 5 minutes also lands here
if [[ "$STATUS" != "success" ]]; then
  echo "✗ Staging deploy failed: aborting production rollout"
  exit 1
fi

# Step 2: fan-out to all production agents
echo "✓ Staging OK → deploying $VERSION to production..."
for AGENT in prod-us-east prod-eu-west prod-ap-south; do
  curl -s -X POST https://api.getscriptmesh.com/trigger-script \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"agent\":\"$AGENT\",\"run_script\":\"deploy.sh\",
         \"params\":{\"version\":\"$VERSION\"},\"async_exec\":true}" | \
    jq -r --arg agent "$AGENT" '"Triggered \(.job_id) on \($agent)"'
done

Fleet health check + PagerDuty alert

Run a diagnostic script across the fleet every 5 minutes. Any non-zero exit code triggers a PagerDuty incident via the integration event bus.

curl — schedule health check with PagerDuty webhook
# Create a PagerDuty integration first
curl -X POST https://api.getscriptmesh.com/integrations \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "pagerduty",
    "name": "fleet-alerts",
    "config": {
      "routing_key": "pd_live_...",
      "events": ["job.failed"]
    }
  }'

# Schedule health_check.py every 5 mins on each agent
for AGENT in prod-us-east prod-eu-west prod-ap-south; do
  curl -X POST https://api.getscriptmesh.com/schedules \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{
      \"agent\": \"$AGENT\",
      \"script\": \"health_check.py\",
      \"cron_expression\": \"*/5 * * * *\"
    }"
done

Script Manifest

Each agent has a script_manifest.json that whitelists which scripts it may execute. Any request for an unlisted script is rejected without execution.

script_manifest.json
{
  "scripts": [
    {
      "name": "backup.sh",
      "path": "/scripts/backup.sh",
      "description": "Archive logs to S3"
    },
    {
      "name": "deploy.sh",
      "path": "/scripts/deploy.sh",
      "description": "Pull and restart latest container"
    },
    {
      "name": "health_check.py",
      "path": "/scripts/health_check.py",
      "description": "Check service endpoints"
    }
  ]
}

Mount the manifest file into the agent container at /agent/script_manifest.json. The agent reloads the manifest on every heartbeat cycle (every 60 seconds).
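
Since unlisted scripts are rejected, it can help to check a local copy of the manifest before triggering a job. The following is a rough sketch under stated assumptions: manifest_has_script is a hypothetical helper, it does a plain-text match rather than parsing JSON, and it assumes script names contain no spaces or quotes.

```shell
#!/bin/bash
# Hypothetical pre-flight check: is SCRIPT_NAME listed in MANIFEST_FILE?
# Not a JSON parser: it strips all whitespace from the file and looks for
# the literal pair "name":"<script>".
# Returns 0 if found, 1 if not found, 2 if the manifest cannot be read.
manifest_has_script() {
  local manifest="$1" name="$2" flat
  [ -r "$manifest" ] || return 2
  flat=$(tr -d '[:space:]' < "$manifest")
  case "$flat" in
    *"\"name\":\"${name}\""*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, manifest_has_script ./script_manifest.json backup.sh succeeds for the manifest shown above, while a misspelled name such as Backup.sh fails because the match is exact and case-sensitive.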

Authentication

ScriptMesh supports two authentication methods:

JWT Bearer Tokens

For human users and short-lived sessions. Tokens expire in 15 minutes. Use the refresh token to obtain new access tokens.

get JWT token
# Login
curl -X POST https://api.getscriptmesh.com/auth/login \
  -H "Content-Type: application/json" \
  -d '{ "email": "you@company.com", "password": "your_password" }'

# Response
{
  "access_token": "eyJhbGci...",
  "refresh_token": "eyJhbGci...",
  "token_type": "bearer"
}

# Use in requests
curl https://api.getscriptmesh.com/agents \
  -H "Authorization: Bearer eyJhbGci..."
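
With a 15-minute lifetime, long-running scripts may want to know when to refresh. The sketch below reads the expiry time out of a token locally. It assumes the access token is a standard three-part JWT whose payload carries a numeric exp claim; that is conventional for JWTs but not confirmed here for ScriptMesh. jwt_exp and jwt_expiring are hypothetical names, and the signature is not verified.

```shell
#!/bin/bash
# Hypothetical helper: print the "exp" claim (Unix seconds) of a JWT.
# Fails (non-zero, no output) if the payload has no readable exp claim.
jwt_exp() {
  local payload
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # base64url omits padding; restore it so base64 can decode.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d 2>/dev/null \
    | grep -oE '"exp"[[:space:]]*:[[:space:]]*[0-9]+' \
    | grep -oE '[0-9]+$'
}

# Return 0 if the token expires within the next SKEW seconds (default 60),
# or if no exp claim can be read.
# Usage: jwt_expiring TOKEN [SKEW_SECONDS]
jwt_expiring() {
  local exp now
  exp=$(jwt_exp "$1") || return 0
  now=$(date +%s)
  [ "$exp" -le $(( now + ${2:-60} )) ]
}
```

A script could call jwt_expiring "$TOKEN" before each batch of requests and refresh the token when it returns 0.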

API Keys

For programmatic access, CI/CD pipelines, and agent authentication. Keys are prefixed sm_live_ and shown only once at creation.

create API key
# Create a key
curl -X POST https://api.getscriptmesh.com/auth/api-keys \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "name": "CI Pipeline", "expires_days": 90 }'

# Returns the raw key ONCE — store it securely
{
  "id": "key_...",
  "key": "sm_live_xyz...",
  "key_prefix": "sm_live_xyz",
  "warning": "Store this key securely. It will not be shown again."
}

API Reference

Interactive docs are available at https://api.getscriptmesh.com/docs (Swagger UI). Key endpoints:

POST  /auth/register      Create account and tenant
POST  /auth/login         Authenticate and get JWT
POST  /auth/verify-email  Verify email with 6-digit code
POST  /register-agent     Register an agent with the orchestrator
GET   /agents             List all agents and their health status
POST  /trigger-script     Execute a script asynchronously
GET   /jobs               List job executions with filters
GET   /jobs/{job_id}      Get full result for a specific job
GET   /schedules          List cron schedules
POST  /schedules          Create a new cron schedule
GET   /metrics/json       Agent health and execution metrics

Environment Variables

Variable            Required  Description
AGENT_NAME          Yes       Unique name for this agent. Shown in dashboard.
ORCHESTRATOR_URL    Yes       URL of the orchestrator. Default: https://api.getscriptmesh.com
API_KEY             Yes       ScriptMesh API key (sm_live_...) for authentication
AGENT_PORT          No        Port the agent listens on. Default: 8001
HEARTBEAT_INTERVAL  No        Seconds between health heartbeats. Default: 60
SCRIPTS_DIR         No        Directory containing executable scripts. Default: /scripts
JWT_SECRET          No        Orchestrator JWT signing secret. Set in cloud; auto-managed on Pro.
RESEND_API_KEY      No        Resend API key for transactional email. Auto-managed on Pro.
Scheduling

Create cron schedules via the API. Schedules run via APScheduler with missed-fire detection.

curl — create schedule
curl -X POST https://api.getscriptmesh.com/schedules \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "prod-server-1",
    "script": "backup.sh",
    "cron_expression": "0 2 * * *",
    "params": { "target": "s3://my-bucket" },
    "webhook_url": "https://hooks.example.com/job-done",
    "timeout": 600
  }'

# Response
{
  "id": "sched_abc123",
  "cron_expression": "0 2 * * *",
  "next_fire": "2026-03-11T02:00:00Z"
}

Cron expressions use standard 5-field format: minute hour day-of-month month day-of-week. The webhook URL receives a POST with the full job result when the script completes.
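
A quick local check can catch expressions with the wrong number of fields, such as a 6-field expression with a seconds column, before they are sent to the API. The sketch below only counts fields; it does not validate ranges or step syntax, and is_five_field_cron is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical sanity check: does EXPR have exactly five
# whitespace-separated fields? The function body runs in a subshell,
# so the "set" calls do not affect the caller.
is_five_field_cron() (
  set -f        # disable globbing so "*" stays literal
  set -- $1     # intentional word splitting on whitespace
  [ "$#" -eq 5 ]
)
```

For instance, is_five_field_cron "0 2 * * *" succeeds, while is_five_field_cron "0 0 2 * * *" fails because it has six fields.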

Integrations

Configure integrations in the dashboard under Settings → Integrations, or via the API. All integrations are per-tenant and scoped to specific event types.

curl — create Slack integration
curl -X POST https://api.getscriptmesh.com/integrations \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "slack",
    "name": "ops-alerts",
    "config": {
      "webhook_url": "https://hooks.slack.com/services/...",
      "events": ["job.failed", "agent.offline"]
    }
  }'

Supported integration types: slack, discord, teams, pagerduty, datadog, splunk, prometheus, webhook.

Security Best Practices

Use one API key per agent

Generate a dedicated API key for each agent. If an agent is compromised, revoke only its key without affecting the rest of your fleet.

Store API keys in secrets managers

Never hardcode API keys. Use AWS Secrets Manager, HashiCorp Vault, Docker secrets, or Kubernetes Secrets to inject keys at runtime.
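
One way to keep the key out of scripts and shell history is to read it from a file that your secrets tooling provides. The following is a minimal sketch, assuming the key has been written to a file by whichever secrets manager you use; read_api_key and the file path in the usage note are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read an API key from a file (for example a mounted
# secret), strip whitespace such as the trailing newline, and print it.
# Fails if the file is unreadable or empty.
read_api_key() {
  local file="$1" key
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  key=$(tr -d '[:space:]' < "$file")
  [ -n "$key" ] || { echo "$file is empty" >&2; return 1; }
  printf '%s' "$key"
}
```

A launcher could then run export API_KEY="$(read_api_key /run/secrets/scriptmesh_api_key)" and start the container with -e API_KEY (no value), which tells docker run to copy the variable from the calling environment instead of placing the key on the command line.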

Validate your script manifest

Keep your manifest minimal — only whitelist scripts you actively use. Review it regularly to remove scripts that are no longer needed.

Rotate compromised keys immediately

If an API key is exposed, revoke it from the dashboard immediately. Generate a new key and update the agent environment variable. The old key stops working instantly.

Use TLS for agent-to-orchestrator traffic

On Pro, all traffic goes over HTTPS to api.getscriptmesh.com with TLS 1.3. On self-hosted, always put a reverse proxy with a valid TLS certificate in front of the orchestrator.

Troubleshooting

Common issues and how to fix them.

Agent shows as offline immediately after starting

Possible causes

  • ORCHESTRATOR_URL is wrong or unreachable from the agent host
  • API_KEY is invalid or has been revoked
  • Agent container exited (check docker logs scriptmesh-agent)

fix
# Check agent logs
docker logs scriptmesh-agent --tail 50

# Verify the orchestrator is reachable from the agent host
curl -I https://api.getscriptmesh.com/health

# Confirm the API key works
curl https://api.getscriptmesh.com/agents \
  -H "Authorization: Bearer sm_live_YOUR_KEY"

Trigger returns 403 — script not in manifest

Possible causes

  • Script name in the request doesn't match the name in script_manifest.json exactly
  • The manifest file isn't mounted at /agent/script_manifest.json
  • Agent hasn't reloaded the manifest yet (wait up to 60s for next heartbeat)

fix
# Check which scripts the agent knows about
# (the URL is quoted so the shell does not treat "?" as a glob character)
curl "https://api.getscriptmesh.com/get-scripts?agent=prod-server-1" \
  -H "Authorization: Bearer $TOKEN"

# If empty, the manifest isn't loaded — verify the Docker mount
docker inspect scriptmesh-agent | jq '.[0].Mounts'

Job stays in 'pending' status indefinitely

Possible causes

  • Agent went offline after the job was queued
  • Script is blocking on stdin or waiting for input
  • Timeout was set too low and the job was silently killed

fix
# Check the agent's current status
curl https://api.getscriptmesh.com/agents \
  -H "Authorization: Bearer $TOKEN" | jq '.["prod-server-1"].status'

# Check stderr for clues
curl https://api.getscriptmesh.com/jobs/job_abc123 \
  -H "Authorization: Bearer $TOKEN" | jq '.stderr'

Webhook not receiving job completion events

Possible causes

  • webhook_url is not publicly reachable (localhost URLs won't work)
  • The receiving endpoint doesn't return a 2xx status code
  • Job failed before reaching the webhook dispatch step

fix
# Use a service like https://webhook.site to test
# Update the schedule with a test webhook URL
curl -X POST https://api.getscriptmesh.com/schedules \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ ..., "webhook_url": "https://webhook.site/your-uuid" }'

401 Unauthorized — token expired

Possible causes

  • Access tokens expire after 15 minutes
  • Refresh token has also expired (7 day TTL)

fix
# Refresh your access token (valid for 15 min)
curl -X POST https://api.getscriptmesh.com/auth/refresh \
  -H "Content-Type: application/json" \
  -d '{ "refresh_token": "eyJhbGci..." }'

# If refresh token is also expired, log in again
curl -X POST https://api.getscriptmesh.com/auth/login \
  -H "Content-Type: application/json" \
  -d '{ "email": "you@company.com", "password": "..." }'