Documentation
Cloud-hosted ScriptMesh — deploy a Docker agent on any server and start running scripts via REST API or dashboard in minutes.
Quick Start
ScriptMesh is cloud-hosted at api.getscriptmesh.com. No server setup needed. The four-step flow is: create account → deploy agent → register agent → run first script.
Architecture
ScriptMesh follows a hub-and-spoke model. A central orchestrator (cloud-hosted) controls a fleet of lightweight agents deployed wherever your workloads run — cloud servers, on-prem machines, edge nodes, or CI runners.
Outbound-only agents
Agents poll the orchestrator for jobs and send heartbeats. No inbound ports. No VPN. Deploy behind any NAT or firewall.
Tenant isolation
Every agent, job, schedule, and API key is scoped to your tenant. Multi-tenant at the DB level — no data bleeds between workspaces.
Async job queue
Scripts execute asynchronously: the orchestrator queues the job, the agent picks it up, and stdout and stderr are sent back when the script completes.
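Because execution is asynchronous, a caller usually needs a small loop that waits for a job to leave the queue. The following is a minimal sketch, not an official client: it assumes $TOKEN holds a valid bearer token, that jq is installed, and that "pending" and "running" are the only in-progress statuses (the two used elsewhere in this guide). The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: poll a job until it reaches a terminal state.
# Assumes $TOKEN is set and jq is available.

API="https://api.getscriptmesh.com"

# Return 0 if the status is terminal. "pending" and "running" are treated as
# in progress; any other value is assumed to be final.
is_terminal_status() {
  case "$1" in
    pending|running) return 1 ;;
    *) return 0 ;;
  esac
}

# wait_for_job JOB_ID [MAX_ATTEMPTS] [SLEEP_SECONDS]
# Prints the last observed status; returns 0 only when it is "success".
wait_for_job() {
  local job_id="$1" max="${2:-60}" pause="${3:-5}" status="" i
  for ((i = 0; i < max; i++)); do
    status=$(curl -s "$API/jobs/$job_id" \
      -H "Authorization: Bearer $TOKEN" | jq -r '.status')
    if is_terminal_status "$status"; then
      break
    fi
    sleep "$pause"
  done
  echo "$status"
  [[ "$status" == "success" ]]
}
```

For example, `wait_for_job job_a3f8c9d2 60 5` would poll for up to about five minutes. If the attempts run out, the function prints the last in-progress status and returns non-zero.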
Deploy an Agent
Agents are lightweight Docker containers you run on any server — cloud, on-prem, or edge. They connect outbound to the orchestrator. No inbound ports, no VPN needed.
docker run -d \
  --name scriptmesh-agent \
  --restart unless-stopped \
  -e AGENT_NAME=prod-server-1 \
  -e ORCHESTRATOR_URL=https://api.getscriptmesh.com \
  -e API_KEY=sm_live_YOUR_API_KEY \
  -v /path/to/scripts:/scripts \
  scriptmesh/agent:latest
Replace sm_live_YOUR_API_KEY with an API key from your dashboard under Settings → API Keys. Generate a dedicated key per agent for isolated revocation.
Register the Agent
When the agent container starts, it automatically registers with the orchestrator. You can also manually register via the REST API:
curl -X POST https://api.getscriptmesh.com/register-agent \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent_name": "prod-server-1",
"url": "https://prod-server-1.internal:8001",
"api_key": "agent-key-here",
"tags": ["production", "backend"]
}'
# Response
{ "status": "ok", "agent": "prod-server-1" }
Run Your First Script
Trigger a script on a registered agent. The script must be listed in the agent's manifest.
# Trigger async execution
curl -X POST https://api.getscriptmesh.com/trigger-script \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "prod-server-1",
"run_script": "backup.sh",
"params": { "target": "s3://my-bucket" },
"async_exec": true
}'
# Returns immediately with a job ID
{ "job_id": "job_a3f8c9d2", "status": "pending" }
# Poll the job result
curl https://api.getscriptmesh.com/jobs/job_a3f8c9d2 \
-H "Authorization: Bearer $TOKEN"
# When complete
{
"job_id": "job_a3f8c9d2",
"status": "success",
"exit_code": 0,
"stdout": "Backup complete. 2.4GB archived to s3://my-bucket",
"stderr": "",
"duration_ms": 4231,
"completed_at": "2026-03-10T14:32:11Z"
}
Fan-Out Execution
Fan-out lets you run the same script across your entire fleet simultaneously — or a filtered subset — with a single API call. Every target agent gets its own job record, so you can track, diff, and audit each individually.
# 1. Get all online production agents
AGENTS=$(curl -s https://api.getscriptmesh.com/agents \
-H "Authorization: Bearer $TOKEN" | \
jq -r 'to_entries[] | select(.value.status=="online") |
select(any(.value.tags[]; . == "production")) | .key')
# 2. Fan out — fire all in parallel, collect job IDs
JOB_IDS=()
for AGENT in $AGENTS; do
JOB_ID=$(curl -s -X POST https://api.getscriptmesh.com/trigger-script \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"agent\": \"$AGENT\",
\"run_script\": \"deploy.sh\",
\"params\": { \"version\": \"v2.4.1\" },
\"async_exec\": true
}" | jq -r '.job_id')
JOB_IDS+=("$JOB_ID")
echo "Triggered $AGENT → $JOB_ID"
done
# 3. Check the current status of each job (unfinished jobs report pending or running)
for JOB_ID in "${JOB_IDS[@]}"; do
STATUS=$(curl -s https://api.getscriptmesh.com/jobs/$JOB_ID \
-H "Authorization: Bearer $TOKEN" | jq -r '.status')
echo "$JOB_ID: $STATUS"
done
Fan-out is built for workflows like rolling deploys, fleet-wide config pushes, and parallel health checks. Each job is tracked independently: if two agents succeed and one fails, you can pinpoint exactly which one without wading through interleaved logs.
Rolling deploys
Push a new version to every prod agent simultaneously, monitor exit codes per node.
Fleet config push
Distribute updated config files across 50+ agents in one API call.
Parallel health checks
Run diagnostics across regions simultaneously — first-class per-agent result diffing.
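To act on per-agent results after a fan-out, it can help to reduce them to a single pass/fail summary. The sketch below is one possible approach, not part of ScriptMesh: it assumes you have already written one "agent status" pair per line (a format chosen here for illustration), and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: summarize per-agent job results after a fan-out.
# Input on stdin: one "agent status" pair per line (an assumed format).

summarize_fanout() {
  local agent status ok=0 failed=0 failed_agents=()
  while read -r agent status; do
    [[ -z "$agent" ]] && continue          # skip blank lines
    if [[ "$status" == "success" ]]; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))
      failed_agents+=("$agent")
    fi
  done
  echo "ok=$ok failed=$failed"
  if ((failed > 0)); then
    echo "failed agents: ${failed_agents[*]}"
    return 1                               # non-zero so callers can abort
  fi
}
```

Piping `printf 'prod-us-east success\nprod-eu-west failed\n'` into `summarize_fanout` prints the counts, names prod-eu-west as failed, and returns a non-zero exit code.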
Real-World Examples
End-to-end examples showing ScriptMesh in common production scenarios.
Nightly backup across all servers
Schedule a backup script to run at 2 AM on every production agent. A Slack notification fires when any job completes or fails.
#!/bin/bash
# /scripts/backup.sh — mounted into agent container
set -euo pipefail
TARGET="${target:-s3://my-backups/$(hostname)}"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
ARCHIVE="/tmp/backup_${TIMESTAMP}.tar.gz"
echo "Archiving /var/app/data to ${TARGET}/${TIMESTAMP}.tar.gz ..."
tar czf "$ARCHIVE" /var/app/data/
SIZE=$(du -sh "$ARCHIVE" | cut -f1)  # measure before the archive is deleted
aws s3 cp "$ARCHIVE" "${TARGET}/${TIMESTAMP}.tar.gz" --storage-class STANDARD_IA
rm -f "$ARCHIVE"
echo "Done. Size: $SIZE"

# Repeat for each production agent
for AGENT in prod-us-east prod-eu-west prod-ap-south; do
curl -X POST https://api.getscriptmesh.com/schedules \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"agent\": \"$AGENT\",
\"script\": \"backup.sh\",
\"cron_expression\": \"0 2 * * *\",
\"params\": { \"target\": \"s3://my-backups/$AGENT\" },
\"webhook_url\": \"https://hooks.slack.com/services/...\"
}"
done
Multi-region deploy rollout
Trigger a canary deploy: run deploy.sh on staging, validate exit code, then fan-out to production only on success.
#!/bin/bash
TOKEN="$SM_TOKEN"
VERSION="$1"
# Step 1: deploy to staging first
echo "→ Deploying $VERSION to staging..."
JOB=$(curl -s -X POST https://api.getscriptmesh.com/trigger-script \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"agent\":\"staging-1\",\"run_script\":\"deploy.sh\",
\"params\":{\"version\":\"$VERSION\"},\"async_exec\":true}" | jq -r '.job_id')
# Poll until complete (max 5 mins)
for i in $(seq 1 60); do
STATUS=$(curl -s https://api.getscriptmesh.com/jobs/$JOB \
-H "Authorization: Bearer $TOKEN" | jq -r '.status')
[[ "$STATUS" != "pending" && "$STATUS" != "running" ]] && break
sleep 5
done
if [[ "$STATUS" != "success" ]]; then
echo "✗ Staging deploy failed — aborting production rollout"
exit 1
fi
# Step 2: fan-out to all production agents
echo "✓ Staging OK → deploying $VERSION to production..."
for AGENT in prod-us-east prod-eu-west prod-ap-south; do
curl -s -X POST https://api.getscriptmesh.com/trigger-script \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"agent\":\"$AGENT\",\"run_script\":\"deploy.sh\",
\"params\":{\"version\":\"$VERSION\"},\"async_exec\":true}" | \
jq -r --arg agent "$AGENT" '"Triggered \(.job_id) on \($agent)"'
done
Fleet health check + PagerDuty alert
Run a diagnostic script across the fleet every 5 minutes. Any non-zero exit code triggers a PagerDuty incident via the integration event bus.
# Create a PagerDuty integration first
curl -X POST https://api.getscriptmesh.com/integrations \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"type": "pagerduty",
"name": "fleet-alerts",
"config": {
"routing_key": "pd_live_...",
"events": ["job.failed"]
}
}'
# Schedule health_check.py every 5 mins on each agent
for AGENT in prod-us-east prod-eu-west prod-ap-south; do
curl -X POST https://api.getscriptmesh.com/schedules \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"agent\": \"$AGENT\",
\"script\": \"health_check.py\",
\"cron_expression\": \"*/5 * * * *\"
}"
done
Script Manifest
Each agent has a script_manifest.json that whitelists which scripts it may execute. Any request for an unlisted script is rejected without execution.
{
"scripts": [
{
"name": "backup.sh",
"path": "/scripts/backup.sh",
"description": "Archive logs to S3"
},
{
"name": "deploy.sh",
"path": "/scripts/deploy.sh",
"description": "Pull and restart latest container"
},
{
"name": "health_check.py",
"path": "/scripts/health_check.py",
"description": "Check service endpoints"
}
]
}
Mount the manifest file into the agent container at /agent/script_manifest.json. The agent reloads the manifest on every heartbeat cycle (every 60 seconds by default).
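As an illustration of the mount described above, the deploy command can be extended with a second volume for the manifest. This is a sketch under assumptions: the host paths are placeholders, the read-only `:ro` flag assumes the agent only reads the manifest, and the DOCKER override and the `start_agent` name are hypothetical conveniences for previewing the command.

```shell
#!/usr/bin/env bash
# Sketch: start the agent with the manifest mounted read-only.
# Set DOCKER=echo to print the command instead of running it.
# Assumes API_KEY is already set in the environment.

# start_agent AGENT_NAME SCRIPTS_DIR MANIFEST_FILE
start_agent() {
  local name="$1" scripts_dir="$2" manifest="$3"
  "${DOCKER:-docker}" run -d \
    --name scriptmesh-agent \
    --restart unless-stopped \
    -e AGENT_NAME="$name" \
    -e ORCHESTRATOR_URL=https://api.getscriptmesh.com \
    -e API_KEY="$API_KEY" \
    -v "$scripts_dir":/scripts \
    -v "$manifest":/agent/script_manifest.json:ro \
    scriptmesh/agent:latest
}
```

Running `DOCKER=echo start_agent prod-server-1 /path/to/scripts /path/to/script_manifest.json` prints the full command (including the key value) so the mounts can be checked before starting the container.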
Authentication
ScriptMesh supports two authentication methods:
JWT Bearer Tokens
For human users and short-lived sessions. Tokens expire in 15 minutes. Use the refresh token to obtain new access tokens.
# Login
curl -X POST https://api.getscriptmesh.com/auth/login \
-H "Content-Type: application/json" \
-d '{ "email": "you@company.com", "password": "your_password" }'
# Response
{
"access_token": "eyJhbGci...",
"refresh_token": "eyJhbGci...",
"token_type": "bearer"
}
# Use in requests
curl https://api.getscriptmesh.com/agents \
-H "Authorization: Bearer eyJhbGci..."
API Keys
For programmatic access, CI/CD pipelines, and agent authentication. Keys are prefixed sm_live_ and shown only once at creation.
# Create a key
curl -X POST https://api.getscriptmesh.com/auth/api-keys \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{ "name": "CI Pipeline", "expires_days": 90 }'
# Returns the raw key ONCE — store it securely
{
"id": "key_...",
"key": "sm_live_xyz...",
"key_prefix": "sm_live_xyz",
"warning": "Store this key securely. It will not be shown again."
}
API Reference
Interactive docs are available at https://api.getscriptmesh.com/docs (Swagger UI). Key endpoints:
| Endpoint | Description |
|---|---|
| /auth/register | Create account and tenant |
| /auth/login | Authenticate and get JWT |
| /auth/verify-email | Verify email with 6-digit code |
| /register-agent | Register an agent with the orchestrator |
| /agents | List all agents and their health status |
| /trigger-script | Execute a script asynchronously |
| /jobs | List job executions with filters |
| /jobs/{job_id} | Get full result for a specific job |
| /schedules | List cron schedules |
| /schedules | Create a new cron schedule |
| /metrics/json | Agent health and execution metrics |

Environment Variables

| Variable | Required | Description |
|---|---|---|
| AGENT_NAME | Yes | Unique name for this agent. Shown in dashboard. |
| ORCHESTRATOR_URL | Yes | URL of the orchestrator. Default: https://api.getscriptmesh.com |
| API_KEY | Yes | ScriptMesh API key (sm_live_...) for authentication |
| AGENT_PORT | No | Port the agent listens on. Default: 8001 |
| HEARTBEAT_INTERVAL | No | Seconds between health heartbeats. Default: 60 |
| SCRIPTS_DIR | No | Directory containing executable scripts. Default: /scripts |
| JWT_SECRET | No | Orchestrator JWT signing secret. Set in cloud; auto-managed on Pro. |
| RESEND_API_KEY | No | Resend API key for transactional email. Auto-managed on Pro. |
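
Before starting a container, a wrapper script can confirm that the variables marked as required in the table are actually set. The following is a small sketch of that check, using a hypothetical helper name. It only tests for non-empty values and does not validate them.

```shell
#!/usr/bin/env bash
# Sketch: fail fast when a required agent variable is missing.

# check_required_env NAME...
# Reports each unset or empty variable on stderr; returns 1 if any is missing.
check_required_env() {
  local name missing=0
  for name in "$@"; do
    if [[ -z "${!name:-}" ]]; then   # indirect expansion: value of $name's variable
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `check_required_env AGENT_NAME ORCHESTRATOR_URL API_KEY || exit 1` at the top of a deploy script.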
Scheduling
Create cron schedules via the API. Schedules run via APScheduler with missed-fire detection.
curl -X POST https://api.getscriptmesh.com/schedules \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "prod-server-1",
"script": "backup.sh",
"cron_expression": "0 2 * * *",
"params": { "target": "s3://my-bucket" },
"webhook_url": "https://hooks.example.com/job-done",
"timeout": 600
}'
# Response
{
"id": "sched_abc123",
"cron_expression": "0 2 * * *",
"next_fire": "2026-03-11T02:00:00Z"
}
Cron expressions use the standard 5-field format: minute hour day-of-month month day-of-week. The webhook URL receives a POST with the full job result when the script completes.
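Since the API expects exactly five fields, a client-side check can catch 6-field (seconds-first) or truncated expressions before they are sent. This sketch only counts fields and does not validate the values in them. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: verify that a cron expression has exactly five whitespace-separated fields.
# This checks the field count only, not whether each field is a valid cron value.

is_five_field_cron() {
  local -a fields
  read -r -a fields <<< "$1"     # split on whitespace without glob expansion
  [[ "${#fields[@]}" -eq 5 ]]
}
```

For example, `is_five_field_cron "0 2 * * *"` succeeds, while `is_five_field_cron "0 0 2 * * *"` fails because it has six fields.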
Integrations
Configure integrations in the dashboard under Settings → Integrations, or via the API. All integrations are per-tenant and scoped to specific event types.
curl -X POST https://api.getscriptmesh.com/integrations \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"type": "slack",
"name": "ops-alerts",
"config": {
"webhook_url": "https://hooks.slack.com/services/...",
"events": ["job.failed", "agent.offline"]
}
}'
Supported integration types: slack, discord, teams, pagerduty, datadog, splunk, prometheus, webhook.
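When integrations are created from scripts, checking the type against the supported list first avoids a round trip for an obvious typo. The sketch below hardcodes the list from this page, so it would need updating if the list changes. The variable and function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: validate an integration type locally before calling the API.
# The list is copied from this page and may lag behind the service.

SUPPORTED_INTEGRATIONS="slack discord teams pagerduty datadog splunk prometheus webhook"

is_supported_integration() {
  local candidate="$1" t
  for t in $SUPPORTED_INTEGRATIONS; do
    [[ "$t" == "$candidate" ]] && return 0
  done
  return 1
}
```

A script could guard its request with `is_supported_integration "$TYPE" || { echo "unsupported type: $TYPE" >&2; exit 1; }`.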
Security Best Practices
Use one API key per agent
Generate a dedicated API key for each agent. If an agent is compromised, revoke only its key without affecting the rest of your fleet.
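One way to follow this practice is to create a key per agent through the /auth/api-keys endpoint shown in the Authentication section. The sketch below assumes $TOKEN is set, that agent names contain no characters needing JSON escaping, and uses an "agent-" naming prefix and a 90-day expiry purely as illustrative choices. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: request one dedicated API key per agent.
# Assumes $TOKEN is set and agent names need no JSON escaping.

API="https://api.getscriptmesh.com"

# key_request_body AGENT_NAME [EXPIRES_DAYS]
# Prints the JSON body; the "agent-" prefix is an illustrative convention.
key_request_body() {
  printf '{ "name": "agent-%s", "expires_days": %d }' "$1" "${2:-90}"
}

# create_agent_key AGENT_NAME [EXPIRES_DAYS]
# Prints the API response, which contains the raw key exactly once.
create_agent_key() {
  curl -s -X POST "$API/auth/api-keys" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "$(key_request_body "$1" "${2:-90}")"
}
```

Because the raw key is shown only once, the response of `create_agent_key prod-us-east` should go straight into a secrets store rather than a terminal log.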
Store API keys in secrets managers
Never hardcode API keys. Use AWS Secrets Manager, HashiCorp Vault, Docker secrets, or Kubernetes Secrets to inject keys at runtime.
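As a sketch of runtime injection, the key can be read from a file that a secrets manager places on disk instead of being written into the command or image. The path used in the usage note is an assumption (Docker secrets are typically mounted under /run/secrets), and the prefix check relies only on the documented sm_live_ prefix. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: load the API key from a secret file instead of hardcoding it.

# load_api_key FILE
# Prints the key; fails if the file is unreadable or the key lacks the sm_live_ prefix.
load_api_key() {
  local file="$1" key
  [[ -r "$file" ]] || { echo "cannot read $file" >&2; return 1; }
  key=$(tr -d '[:space:]' < "$file")     # drop trailing newline and stray whitespace
  [[ "$key" == sm_live_* ]] || { echo "unexpected key format in $file" >&2; return 1; }
  printf '%s\n' "$key"
}
```

A launcher script might then run `API_KEY=$(load_api_key /run/secrets/scriptmesh_api_key) || exit 1` before starting the agent, where the secret name is a placeholder.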
Validate your script manifest
Keep your manifest minimal — only whitelist scripts you actively use. Review it regularly to remove scripts that are no longer needed.
Rotate compromised keys immediately
If an API key is exposed, revoke it from the dashboard immediately. Generate a new key and update the agent environment variable. The old key stops working instantly.
Use TLS for agent-to-orchestrator traffic
On Pro, all traffic goes over HTTPS to api.getscriptmesh.com with TLS 1.3. On self-hosted, always put a reverse proxy with a valid TLS certificate in front of the orchestrator.
Troubleshooting
Common issues and how to fix them.
Agent shows as offline immediately after starting
Possible causes
- ORCHESTRATOR_URL is wrong or unreachable from the agent host
- API_KEY is invalid or has been revoked
- Agent container exited; check docker logs scriptmesh-agent
# Check agent logs
docker logs scriptmesh-agent --tail 50
# Verify the orchestrator is reachable from the agent host
curl -I https://api.getscriptmesh.com/health
# Confirm the API key works
curl https://api.getscriptmesh.com/agents \
  -H "Authorization: Bearer sm_live_YOUR_KEY"
Trigger returns 403: script not in manifest
Possible causes
- Script name in the request doesn't match the name in script_manifest.json exactly
- The manifest file isn't mounted at /agent/script_manifest.json
- Agent hasn't reloaded the manifest yet (wait up to 60s for next heartbeat)
# Check which scripts the agent knows about
curl "https://api.getscriptmesh.com/get-scripts?agent=prod-server-1" \
  -H "Authorization: Bearer $TOKEN"
# If empty, the manifest isn't loaded; verify the Docker mount
docker inspect scriptmesh-agent | jq '.[0].Mounts'
Job stays in 'pending' status indefinitely
Possible causes
- Agent went offline after the job was queued
- Script is blocking on stdin or waiting for input
- Timeout was set too low and the job was silently killed
# Check the agent's current status
curl https://api.getscriptmesh.com/agents \
  -H "Authorization: Bearer $TOKEN" | jq '.["prod-server-1"].status'
# Check stderr for clues
curl https://api.getscriptmesh.com/jobs/job_abc123 \
  -H "Authorization: Bearer $TOKEN" | jq '.stderr'
Webhook not receiving job completion events
Possible causes
- webhook_url is not publicly reachable (localhost URLs won't work)
- The receiving endpoint doesn't return a 2xx status code
- Job failed before reaching the webhook dispatch step
# Use a service like https://webhook.site to test
# Update the schedule with a test webhook URL
curl -X POST https://api.getscriptmesh.com/schedules \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{ ..., "webhook_url": "https://webhook.site/your-uuid" }'
401 Unauthorized: token expired
Possible causes
- Access tokens expire after 15 minutes
- Refresh token has also expired (7-day TTL)
# Refresh your access token (valid for 15 min)
curl -X POST https://api.getscriptmesh.com/auth/refresh \
-H "Content-Type: application/json" \
-d '{ "refresh_token": "eyJhbGci..." }'
# If refresh token is also expired, log in again
curl -X POST https://api.getscriptmesh.com/auth/login \
-H "Content-Type: application/json" \
-d '{ "email": "you@company.com", "password": "..." }'
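Long-running scripts can avoid 401 responses by refreshing shortly before the 15-minute access token lifetime runs out. The sketch below assumes the script records the epoch time at which it obtained the token. The 60-second safety margin and the names used are illustrative choices, not ScriptMesh settings.

```shell
#!/usr/bin/env bash
# Sketch: decide whether an access token should be refreshed.

ACCESS_TOKEN_TTL=900   # 15 minutes, as stated in this guide
REFRESH_MARGIN=60      # assumed safety margin before expiry

# token_needs_refresh ISSUED_AT_EPOCH [NOW_EPOCH]
# Returns 0 when the token is within REFRESH_MARGIN seconds of expiry or past it.
token_needs_refresh() {
  local issued_at="$1" now="${2:-$(date +%s)}"
  (( now - issued_at >= ACCESS_TOKEN_TTL - REFRESH_MARGIN ))
}
```

A script would store `ISSUED_AT=$(date +%s)` after login and call `token_needs_refresh "$ISSUED_AT"` before each request, invoking /auth/refresh when it returns success.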