KB Labs Docs

Overview

Last updated April 7, 2026


Workflows, jobs, and steps — the declarative orchestration layer.

Workflows are how KB Labs orchestrates multi-step work. A workflow is a declarative spec — written in JSON, YAML, or TypeScript — that the workflow engine turns into a sequence of jobs and steps running inside sandboxed execution environments.

A plugin command handles "do one thing". A workflow handles "do several things in sequence, with dependencies, retries, approval gates, and artifacts". Use plugin commands for single operations that the user invokes directly; use workflows when you need orchestration across multiple operations.

Source of truth: platform/kb-labs-workflow/packages/workflow-contracts/src/schemas.ts.

The three things

Workflow  ─── declarative spec (trigger, jobs, inputs, env)
   │
   ├─► Job      ─── unit of scheduling (target, isolation, concurrency, artifacts)
   │    │
   │    └─► Step ─── unit of execution (uses, with, if, timeout)
   │
   └─► Job
        ├─► Step
        └─► Step

  • A workflow is the spec you author. It declares triggers (how it starts), inputs (what it accepts), env (non-secret config), secrets (sensitive config), and one or more jobs. It's not a runtime object — it's a template.
  • A job is one unit of scheduling. Each job runs on a target (local or sandbox), gets its own isolation profile, and can depend on other jobs via needs. Jobs produce and consume artifacts.
  • A step is one unit of execution. Steps run sequentially inside a job, can be conditional (if), and invoke either a built-in handler (builtin:shell, builtin:approval, builtin:gate) or a plugin-provided handler.

When you trigger a workflow, the engine creates a run, which is the live instance of that spec. Runs have IDs, timestamps, and a status, and are persisted in the run store. Jobs within a run become job runs; steps within a job become step runs. Each run, job run, and step run tracks its own status independently.
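The template-versus-instance split can be sketched as follows. This is a minimal illustration, not the engine's actual data model: the interface names, the createdAt field, and the instantiateRun helper are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the real contracts live in schemas.ts.
type Status = 'queued' | 'running' | 'success' | 'failed' | 'cancelled' | 'skipped' | 'dlq';

interface StepSpec { name: string; uses?: string }
interface JobSpec { steps: StepSpec[] }
interface WorkflowSpec { name: string; jobs: Record<string, JobSpec> }

interface StepRun { name: string; status: Status }
interface JobRun { jobId: string; status: Status; steps: StepRun[] }
interface WorkflowRun { id: string; workflow: string; status: Status; createdAt: string; jobs: JobRun[] }

// Expand the template into a live instance. Every level carries its own status,
// and (by assumption) everything starts out queued.
function instantiateRun(spec: WorkflowSpec, id: string, now: Date = new Date()): WorkflowRun {
  return {
    id,
    workflow: spec.name,
    status: 'queued',
    createdAt: now.toISOString(),
    jobs: Object.keys(spec.jobs).map((jobId): JobRun => ({
      jobId,
      status: 'queued',
      steps: spec.jobs[jobId].steps.map((step): StepRun => ({ name: step.name, status: 'queued' })),
    })),
  };
}

const exampleRun = instantiateRun(
  { name: 'hello', jobs: { greet: { steps: [{ name: 'Say hi', uses: 'builtin:shell' }] } } },
  'run-1',
);
```

The same spec can be instantiated any number of times; each call yields a separate run with its own ID and history.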

Workflow vs. plugin command

Both run plugin code. The difference is what they're optimized for:

Aspect           Plugin command             Workflow
Entry point      pnpm kb <command>          Workflow daemon (trigger, API)
Shape            One function               N jobs, M steps per job
Dependencies     Linear execution           DAG via needs
State            Ephemeral                  Persisted runs with history
Retries          Caller's responsibility    Built-in retries policy
Concurrency      One at a time              Concurrency groups, priorities
Artifacts        Out-of-band                First-class, with merge strategies
Approval gates   Not supported              builtin:approval step
Triggers         Manual                     Manual, push, schedule, webhook

Rule of thumb: if the work takes one call and you just want to surface it in the CLI, write a plugin command. If it has multiple phases, needs retry semantics, or a human has to approve something in the middle, write a workflow.

Triggers

A workflow declares when it starts via the on field:

TypeScript
on: {
  manual?: boolean;                      // triggered by CLI / API
  push?: boolean;                        // triggered by git push
  webhook?: boolean | {
    secret?: string;
    path?: string;
    headers?: Record<string, string>;
  };
  schedule?: {
    cron: string;                        // cron expression
    timezone?: string;
  };
}

At least one trigger must be set; a refine check on the Zod schema enforces this. A single workflow can combine multiple triggers. For example, manual: true plus schedule: { cron: '0 3 * * *' } gives you "run nightly at 3am, or whenever I ask".
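The "at least one trigger" rule can be approximated with a plain predicate. This is a hand-written stand-in for the schema-level check, not the Zod refinement itself; the hasAtLeastOneTrigger name is hypothetical, and treating manual: false as "not set" is an assumption.

```typescript
interface TriggerSpec {
  manual?: boolean;
  push?: boolean;
  webhook?: boolean | { secret?: string; path?: string; headers?: Record<string, string> };
  schedule?: { cron: string; timezone?: string };
}

// True when any trigger is enabled. Assumption: a trigger explicitly set to
// false counts the same as an absent one.
function hasAtLeastOneTrigger(on: TriggerSpec): boolean {
  return Boolean(on.manual || on.push || on.webhook || on.schedule);
}

const nightlyOrOnDemand: TriggerSpec = { manual: true, schedule: { cron: '0 3 * * *' } };
const acceptsNightly = hasAtLeastOneTrigger(nightlyOrOnDemand); // true
const acceptsEmpty = hasAtLeastOneTrigger({});                  // false
```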

See Triggers for the full details.

Inputs

Workflows accept typed inputs declared in the spec:

TypeScript
inputs: {
  version: {
    type: 'string',
    description: 'Version to release',
    required: true,
  },
  dryRun: {
    type: 'boolean',
    default: false,
  },
}

Types are 'string' | 'number' | 'boolean' — that's the entire type system. Complex payloads go through webhook triggers with custom handlers instead.
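Because the type system is this small, checking a payload against the declared inputs takes only a few lines. The sketch below is an illustration under assumptions: resolveInputs is a hypothetical helper, and the rules it applies (defaults fill missing values, a missing required input is an error, a type mismatch is an error) are a plausible reading of the spec rather than the engine's confirmed behavior.

```typescript
type InputType = 'string' | 'number' | 'boolean';
type InputValue = string | number | boolean;

interface InputSpec {
  type: InputType;
  description?: string;
  required?: boolean;
  default?: InputValue;
}

// Resolve a trigger payload against the declared inputs.
function resolveInputs(
  specs: Record<string, InputSpec>,
  payload: Record<string, unknown>,
): Record<string, InputValue> {
  const resolved: Record<string, InputValue> = {};
  for (const key of Object.keys(specs)) {
    const spec = specs[key];
    // Prefer the supplied value, fall back to the declared default
    const value = payload[key] !== undefined ? payload[key] : spec.default;
    if (value === undefined) {
      if (spec.required) throw new Error(`Missing required input "${key}"`);
      continue; // optional input with no default: leave it out
    }
    if (typeof value !== spec.type) {
      throw new Error(`Input "${key}" must be a ${spec.type}, got ${typeof value}`);
    }
    resolved[key] = value as InputValue;
  }
  return resolved;
}

const releaseInputs = resolveInputs(
  {
    version: { type: 'string', description: 'Version to release', required: true },
    dryRun: { type: 'boolean', default: false },
  },
  { version: '1.0.0' },
);
// releaseInputs is { version: '1.0.0', dryRun: false }
```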

Inputs are passed when creating a run:

TypeScript
await engine.createRun({
  spec,
  trigger: { type: 'manual', actor: 'alice', payload: { version: '1.0.0', dryRun: true } },
});

Inside steps, inputs are reachable through ${{ trigger.payload.version }} in if expressions and step with blocks.

Jobs

Every workflow has at least one job. Jobs are keyed by ID inside the spec:

TypeScript
jobs: {
  build: {
    runsOn: 'local',
    steps: [
      { name: 'Install deps', uses: 'builtin:shell', with: { command: 'pnpm install' } },
      { name: 'Build',        uses: 'builtin:shell', with: { command: 'pnpm build' } },
    ],
  },
  test: {
    runsOn: 'sandbox',
    needs: ['build'],
    steps: [
      { name: 'Run tests', uses: 'builtin:shell', with: { command: 'pnpm test' } },
    ],
  },
}

Each job has:

  • runsOn — 'local' (in the workflow daemon's process) or 'sandbox' (isolated execution). See Concepts → Execution Model.
  • needs — IDs of jobs that must complete before this one starts. Forms the DAG.
  • target — optional ExecutionTarget overriding the workflow-level target (environment/workspace/namespace/workdir).
  • isolation — 'strict' | 'balanced' | 'relaxed' isolation profile.
  • concurrency — { group, cancelInProgress? } to serialize or cancel runs by group.
  • retries — retry policy with exp or lin backoff.
  • timeoutMs — hard cap in milliseconds (max 24 hours).
  • if — gate expression; the job is skipped if it evaluates false.
  • env, secrets — job-level overrides of workflow-level values.
  • artifacts — produce/consume lists + merge config.
  • hooks — pre/post/onFailure/onSuccess step lists.
  • priority — 'high' | 'normal' | 'low' scheduling priority.
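To see how needs turns a flat map of jobs into an execution order, here is a minimal sketch that groups jobs into "waves": every job in a wave has all of its dependencies satisfied by earlier waves. The planWaves helper is hypothetical, and the real scheduler also weighs concurrency groups and priority, which this ignores.

```typescript
interface JobNode { needs?: string[] }

// Group job IDs into waves; jobs within one wave have no unmet dependencies.
function planWaves(jobs: Record<string, JobNode>): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let pending = Object.keys(jobs);
  while (pending.length > 0) {
    const ready = pending.filter((id) =>
      (jobs[id].needs ?? []).every((dep) => done.indexOf(dep) !== -1),
    );
    // Nothing is ready but work remains: the graph has a cycle or a dangling needs entry
    if (ready.length === 0) {
      throw new Error(`Cycle or unknown dependency among: ${pending.join(', ')}`);
    }
    waves.push(ready);
    done.push(...ready);
    pending = pending.filter((id) => ready.indexOf(id) === -1);
  }
  return waves;
}

const exampleWaves = planWaves({
  build: {},
  test: { needs: ['build'] },
  lint: { needs: ['build'] },
});
// exampleWaves is [['build'], ['test', 'lint']]
```

Jobs that land in the same wave have no ordering constraint between them, which is what makes the needs graph a DAG rather than a fixed sequence.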

See Jobs for the full reference.

Steps

Steps are the actual units of execution. Every step has a name and usually a uses:

TypeScript
{
  name: 'Deploy',
  uses: 'plugin:release:deploy',
  with: { env: 'production' },
  if: "${{ steps.tests.outputs.passed == 'true' }}",
  timeoutMs: 600_000,
  continueOnError: false,
}

Three built-in step types live in the engine:

  • builtin:shell — runs a shell command. with: { command: '...' }.
  • builtin:approval — pauses the run until a human approves via Studio or the REST API.
  • builtin:gate — a decision router that reads a value from previous step outputs and routes the pipeline (continue, fail, or restart from an earlier step).

Custom step types come from plugins: plugin:<plugin-id>:<handler-id> resolves to a workflow handler declared in the plugin's manifest. There's also workflow:<workflow-id> for invoking another workflow as a step (with mode: 'wait' | 'fire-and-forget').
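The three uses formats can be told apart with a small parser. This is an illustrative sketch only: parseUses and the StepTarget type are hypothetical names, and it assumes IDs never contain a colon.

```typescript
type StepTarget =
  | { kind: 'builtin'; handler: string }
  | { kind: 'plugin'; pluginId: string; handlerId: string }
  | { kind: 'workflow'; workflowId: string };

// Split a uses reference into its scheme and IDs.
function parseUses(uses: string): StepTarget {
  const parts = uses.split(':');
  const scheme = parts[0];
  if (scheme === 'builtin' && parts.length === 2) {
    return { kind: 'builtin', handler: parts[1] };
  }
  if (scheme === 'plugin' && parts.length === 3) {
    return { kind: 'plugin', pluginId: parts[1], handlerId: parts[2] };
  }
  if (scheme === 'workflow' && parts.length === 2) {
    return { kind: 'workflow', workflowId: parts[1] };
  }
  throw new Error(`Unrecognized uses reference: ${uses}`);
}

const deployTarget = parseUses('plugin:release:deploy');
// deployTarget is { kind: 'plugin', pluginId: 'release', handlerId: 'deploy' }
```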

Steps can also have:

  • id — addressable ID for ${{ steps.<id>.outputs.* }} references.
  • with — parameters passed to the handler.
  • env, secrets — step-level overrides.
  • continueOnError — if true, a failed step doesn't fail the job.
  • Presentation fields — summary, phase, progress, artifacts — consumed by Studio and CLI for human output.

See Steps for the full reference.

Run lifecycle

A workflow run goes through well-defined states defined in workflow-constants:

TypeScript
type RunState = 'queued' | 'running' | 'success' | 'failed' | 'cancelled' | 'skipped' | 'dlq';

  • queued — created but not yet started. The engine is waiting for a scheduler slot or pending dependencies.
  • running — at least one job is in-flight.
  • success — all jobs completed successfully.
  • failed — at least one non-optional job failed.
  • cancelled — explicitly cancelled via engine.cancelRun().
  • skipped — not started because an if evaluated false.
  • dlq — "dead letter queue": failed permanently, retried to exhaustion, parked for manual inspection.

Jobs have the same states plus 'interrupted' (restart mid-flight). Steps add 'waiting_approval', used while a builtin:approval step waits for a human decision.

Expression language

A tiny embedded expression language runs inside if conditions and ${{ … }} interpolation in string fields. Source: expressions.ts.

Supported:

  • Operators: ==, !=, &&, ||, !, parentheses.
  • Functions: contains(s, sub), startsWith(s, pre), endsWith(s, suf).
  • Contexts: env.*, trigger.*, steps.<id>.outputs.*, matrix.*.

Examples:

TypeScript
if: "${{ env.DEPLOY_ENV == 'production' }}"
if: "${{ trigger.type == 'schedule' || trigger.actor == 'admin' }}"
if: "${{ steps.tests.outputs.passed == 'true' && !contains(env.SKIP, 'deploy') }}"

Expressions are boolean-only in if. String interpolation is evaluated separately: ${{ trigger.payload.version }} inside a step's with.command resolves to its value at runtime, before the step executes.
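The interpolation half can be sketched in a few lines. This is a simplified illustration, not the code in expressions.ts: it handles only plain dotted paths (no operators or functions), the lookupPath and interpolate helpers are hypothetical, and rendering an unknown path as an empty string is an assumption.

```typescript
type Context = { [key: string]: unknown };

// Walk a dotted path such as "trigger.payload.version" through nested objects.
function lookupPath(context: Context, path: string): unknown {
  let current: unknown = context;
  for (const key of path.split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Context)[key];
  }
  return current;
}

// Replace each ${{ path }} occurrence with the value found in the context.
// Assumption: paths that resolve to nothing render as an empty string.
function interpolate(template: string, context: Context): string {
  return template.replace(/\$\{\{\s*([\w.]+)\s*\}\}/g, (_match: string, path: string) => {
    const value = lookupPath(context, path);
    return value === undefined || value === null ? '' : String(value);
  });
}

const greeting = interpolate('echo Hello, ${{ trigger.payload.name }}!', {
  trigger: { type: 'manual', payload: { name: 'Alice' } },
});
// greeting === 'echo Hello, Alice!'
```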

Artifacts, briefly

Artifacts are named outputs that pass between steps and jobs. The workflow engine stores them, merges them across runs, and surfaces them in Studio and the CLI.

  • Steps declare which artifacts they produce via artifacts: { <key>: StepArtifact }, typed as 'markdown' | 'issues' | 'table' | 'diff' | 'log' | 'json' | 'link'.
  • Jobs declare produce and consume lists plus optional merge config (strategies: 'append' | 'overwrite' | 'json-merge').
  • Workflows carry the aggregate across runs in run.artifacts: string[].
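One plausible reading of the three merge strategies is sketched below. The semantics are assumptions made for illustration (append concatenates, overwrite keeps only the newest value, json-merge merges objects recursively), and mergeArtifact is a hypothetical helper; the Artifacts reference is the authority on actual behavior.

```typescript
type MergeStrategy = 'append' | 'overwrite' | 'json-merge';
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Combine a previous artifact value with a newly produced one.
function mergeArtifact(previous: Json, incoming: Json, strategy: MergeStrategy): Json {
  switch (strategy) {
    case 'overwrite':
      return incoming; // newest value wins outright
    case 'append': {
      // Treat non-array values as single-element lists, then concatenate
      const toList = (value: Json): Json[] => (Array.isArray(value) ? value : [value]);
      return [...toList(previous), ...toList(incoming)];
    }
    case 'json-merge': {
      // Anything that is not a pair of objects falls back to overwrite
      if (!isPlainObject(previous) || !isPlainObject(incoming)) return incoming;
      const merged: { [key: string]: Json } = { ...previous };
      for (const key of Object.keys(incoming)) {
        merged[key] = key in merged
          ? mergeArtifact(merged[key], incoming[key], 'json-merge')
          : incoming[key];
      }
      return merged;
    }
  }
}

const mergedReport = mergeArtifact(
  { passed: 10, suites: { unit: 'ok' } },
  { suites: { e2e: 'ok' } },
  'json-merge',
);
// mergedReport is { passed: 10, suites: { unit: 'ok', e2e: 'ok' } }
```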

See Artifacts for the full details including merge strategies.

A minimal workflow

JSON
{
  "name": "hello",
  "version": "1",
  "on": { "manual": true },
  "inputs": {
    "name": { "type": "string", "default": "world" }
  },
  "jobs": {
    "greet": {
      "runsOn": "local",
      "steps": [
        {
          "name": "Say hi",
          "uses": "builtin:shell",
          "with": { "command": "echo Hello, ${{ trigger.payload.name }}!" }
        }
      ]
    }
  }
}

Trigger with:

Bash
pnpm kb workflow:run --workflow-id=hello --inputs='{"name":"Alice"}'

This creates a run, schedules the greet job, executes the shell step, prints Hello, Alice!, and finishes with status success.

Non-trivial example: build + test with dependencies

JSON
{
  "name": "ci",
  "version": "1",
  "on": { "push": true, "manual": true },
  "env": {
    "NODE_ENV": "test"
  },
  "jobs": {
    "build": {
      "runsOn": "sandbox",
      "isolation": "balanced",
      "steps": [
        { "name": "Install",  "uses": "builtin:shell", "with": { "command": "pnpm install --frozen-lockfile" } },
        { "name": "Build",    "uses": "builtin:shell", "with": { "command": "pnpm build" } }
      ],
      "artifacts": {
        "produce": ["build-output"]
      },
      "timeoutMs": 900000
    },
    "test": {
      "runsOn": "sandbox",
      "needs": ["build"],
      "steps": [
        { "name": "Test", "uses": "builtin:shell", "with": { "command": "pnpm test" } }
      ],
      "retries": {
        "max": 2,
        "backoff": "exp",
        "initialIntervalMs": 5000
      }
    },
    "deploy": {
      "runsOn": "local",
      "needs": ["build", "test"],
      "if": "${{ trigger.type == 'manual' && trigger.actor == 'release-bot' }}",
      "steps": [
        { "name": "Approve", "uses": "builtin:approval", "with": { "message": "Deploy to prod?" } },
        { "name": "Deploy",  "uses": "plugin:release:deploy", "with": { "env": "production" } }
      ]
    }
  }
}

What happens:

  1. build runs first (no dependencies).
  2. test waits for build; retries up to 2 times on failure with exponential backoff.
  3. deploy waits for both build and test, and evaluates its if condition — only runs for manual triggers by release-bot.
  4. deploy pauses at the approval step. A human approves in Studio. Then the release plugin's deploy handler runs with env: 'production'.

Each job has its own isolation, its own retry policy, its own environment. The engine orchestrates the DAG and surfaces progress through Studio.
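For the test job's retry policy, the wait before each retry could be computed as in the sketch below. The field names max, backoff, and initialIntervalMs come from the example above; the retryDelayMs helper is hypothetical, and both the doubling factor for 'exp' and the constant step for 'lin' are assumptions.

```typescript
interface RetryPolicy {
  max: number;
  backoff: 'exp' | 'lin';
  initialIntervalMs: number;
}

// Delay before retry number `attempt` (1-based), or null once retries are exhausted.
function retryDelayMs(policy: RetryPolicy, attempt: number): number | null {
  if (attempt < 1 || attempt > policy.max) return null;
  return policy.backoff === 'exp'
    ? policy.initialIntervalMs * Math.pow(2, attempt - 1) // 1x, 2x, 4x, ...
    : policy.initialIntervalMs * attempt;                 // 1x, 2x, 3x, ...
}

const testPolicy: RetryPolicy = { max: 2, backoff: 'exp', initialIntervalMs: 5000 };
const firstDelay = retryDelayMs(testPolicy, 1);  // 5000
const secondDelay = retryDelayMs(testPolicy, 2); // 10000
const thirdDelay = retryDelayMs(testPolicy, 3);  // null: no retries left
```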
