Execution Model
Last updated April 7, 2026
How plugin code actually runs — in-process, worker pool, container, or remote.
When a user runs pnpm kb commit:commit, something has to actually execute the commit plugin's handler function. Where that happens is a deployment-time choice, not a plugin-time choice. The same handler code can run in-process (fastest, no isolation), in a worker pool (fault isolation, single-node production), in a Docker container (full OS isolation, multi-tenant), or on a remote executor (distributed fleet, roadmap).
Plugin code doesn't know or care which one it is. The SDK sandboxes every operation through permission-gated shims, so the plugin sees the same ctx.runtime.fs, ctx.runtime.fetch, and ctx.runtime.env regardless of the underlying execution backend. This page covers how the four modes differ and how to pick one.
Source of truth: ExecutionConfig in core-runtime/src/config.ts.
The four modes
```
mode?: 'auto' | 'in-process' | 'worker-pool' | 'remote' | 'container';
```

- `in-process` — the plugin handler runs in the same Node process as the service (REST API, workflow daemon, CLI) that invoked it. Fastest path, no process boundary, no isolation. Use for dev and trusted first-party plugins.
- `worker-pool` — plugins run in a pool of worker threads. Each worker has its own V8 isolate. A plugin crash takes down one worker, not the whole service. Default for single-node production.
- `container` — plugins run inside Docker containers provisioned on demand. Full OS-level isolation, enforced resource limits, network namespaces. Use for multi-tenant deployments or untrusted plugins. Requires `container.gatewayDispatchUrl` and `container.gatewayInternalSecret`.
- `remote` — the type is defined in the config schema, but the adapters are not shipped. Roadmap for distributed fleets.
- `auto` — default. Detects the mode from environment variables (`EXECUTION_MODE`, `KUBERNETES_SERVICE_HOST`). Picks `container` in Kubernetes, `worker-pool` elsewhere.
In-process
The handler file is dynamically imported into the calling service's process. handler.execute(ctx, input) is a plain async function call.
```
Service process (REST API)
│
└─► import('./dist/cli/commands/commit.js')
    │
    └─► handler.execute(ctx, input)
        │
        └─► runs here, same process
```

Pros:
- Zero overhead. No IPC, no serialization, no process boundary.
- Fastest startup. Module imports happen on first call and are cached.
- Easy debugging. Single process, single stack trace.
Cons:
- No isolation. A plugin throwing or crashing takes down the whole service.
- No resource limits. `quotas.memoryMb` is advisory; plugins can use as much memory as the Node heap allows.
- Shared module state. Two plugins loaded in-process share the same V8 heap, which can leak state between them in subtle ways.
When to use: dev mode, CI smoke tests, trusted first-party plugins in single-tenant deployments where you know every plugin is safe. Never use for untrusted or marketplace plugins.
Worker pool
The service maintains a pool of worker threads. Each worker is a pre-initialized Node isolate. When a handler is invoked, the pool picks an idle worker, sends the invocation via postMessage, and awaits the result.
```
Service process (REST API)
│
├─► Worker pool (N workers)
│   │
│   ├─► Worker 1 (idle)
│   ├─► Worker 2 (running plugin A)
│   └─► Worker 3 (running plugin B)
│
└─► Dispatch: send message to an idle worker
```

Configured under `platform.execution.workerPool`:

```
interface WorkerPoolConfig {
  min?: number;                    // default 2
  max?: number;                    // default 10
  maxRequestsPerWorker?: number;   // default 1000
  maxUptimeMsPerWorker?: number;   // default 1800000 (30 min)
  maxConcurrentPerPlugin?: number; // optional per-plugin cap
  warmup?: {
    mode?: 'none' | 'top-n' | 'marked'; // default 'none'
    topN?: number;        // default 5
    maxHandlers?: number; // default 20
  };
}
```

Pros:
- Fault isolation. A plugin crash kills the worker, not the service. The pool replaces crashed workers automatically.
- Resource recycling. Workers are recycled after `maxRequestsPerWorker` or `maxUptimeMsPerWorker` — bounds memory leaks in long-running plugins.
- Concurrency control. `maxConcurrentPerPlugin` prevents one chatty plugin from monopolizing the pool.
- Warmup support. Pre-initialize hot handlers (`warmup.mode: 'top-n'` or `'marked'`) to avoid cold-start latency on the first call.
Cons:
- IPC overhead. Every call serializes input and output across the worker boundary. Large payloads become a bottleneck.
- No OS-level sandboxing. Workers share the process's filesystem access, network access, and env vars. Permissions are enforced by the runtime shims, not by the OS.
- Harder to debug. Stack traces cross worker boundaries; symptoms from one worker can show up in others via shared resources.
When to use: single-node production deployments with trusted plugins. This is the default mode in auto selection outside Kubernetes.
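The recycling behavior described above can be sketched as a small policy function. This is an illustrative sketch based on the `maxRequestsPerWorker` and `maxUptimeMsPerWorker` config fields, not the actual pool implementation; the `WorkerStats` shape is an assumption.

```typescript
// Illustrative sketch of the worker-recycling policy implied by
// maxRequestsPerWorker / maxUptimeMsPerWorker. Not the real pool code.
interface WorkerStats {
  requestsServed: number; // invocations handled by this worker so far
  startedAtMs: number;    // when the worker was spawned
}

interface RecycleLimits {
  maxRequestsPerWorker: number; // default 1000
  maxUptimeMsPerWorker: number; // default 1800000 (30 min)
}

function shouldRecycle(w: WorkerStats, limits: RecycleLimits, nowMs: number): boolean {
  // Recycle once either bound is hit, whichever comes first.
  if (w.requestsServed >= limits.maxRequestsPerWorker) return true;
  if (nowMs - w.startedAtMs >= limits.maxUptimeMsPerWorker) return true;
  return false;
}
```

Either limit alone bounds leaks: a busy worker is bounded by request count, an idle-but-long-lived worker by uptime.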
Warmup
Cold starts are the worst-case latency — the first time a plugin runs, the worker has to import the handler module, which for large plugins can take 200–500ms. Warmup pre-imports handlers during startup so the first real invocation is instant.
```
warmup: {
  mode: 'top-n',
  topN: 5,         // pre-warm the 5 most-used handlers from analytics
  maxHandlers: 20, // safety cap
}
```

'top-n' reads recent analytics events to pick the hottest handlers; 'marked' warms up handlers that declare `warmup: true` in their manifest (if supported). Use 'top-n' for deployments with stable traffic patterns; 'marked' for known-critical handlers regardless of traffic.
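The 'top-n' selection can be sketched as a ranking over recent invocation counts. The analytics shape below is an assumption for illustration; only the `topN` and `maxHandlers` semantics come from the config above.

```typescript
// Hypothetical sketch of 'top-n' warmup selection: rank handlers by
// recent invocation counts, capped by maxHandlers. Analytics shape assumed.
function pickWarmupHandlers(
  invocationCounts: Record<string, number>, // handlerRef -> recent call count
  topN: number,
  maxHandlers: number,
): string[] {
  const limit = Math.min(topN, maxHandlers); // maxHandlers is a safety cap
  return Object.entries(invocationCounts)
    .sort((a, b) => b[1] - a[1]) // hottest first
    .slice(0, limit)
    .map(([handlerRef]) => handlerRef);
}
```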
Container
Each handler invocation runs in a freshly-provisioned Docker container. The container has its own filesystem, network namespace, memory limits, and CPU limits. When the handler finishes, the container is torn down.
```
Service process (REST API)
│
└─► Gateway /internal/dispatch (POST)
    │
    └─► Docker environment adapter: create + start container
        │
        └─► Container (runtime server)
            │
            └─► handler.execute(ctx, input)
```

Configured under `platform.execution`:

```
{
  mode: 'container',
  container: {
    gatewayDispatchUrl: string;    // e.g. http://gateway:4000/internal/dispatch
    gatewayInternalSecret: string; // must match gateway's GATEWAY_INTERNAL_SECRET
  }
}
```

With adapter options for the Docker environment provider configured under `platform.adapterOptions.environment`:

```
{
  "environment": {
    "defaultImage": "kb-runtime-server:local",
    "autoRemove": true,
    "mountWorkspace": true,
    "workspaceMountPath": "/workspace",
    "network": "kb-labs",
    "gateway": {
      "wsUrl": "ws://host.docker.internal:4000",
      "jwtSecret": "${GATEWAY_JWT_SECRET}",
      "dispatchSecret": "${GATEWAY_INTERNAL_SECRET}",
      "dispatchUrl": "http://host.docker.internal:4000/internal/dispatch"
    }
  }
}
```

Pros:
- Full OS isolation. Kernel-enforced process, filesystem, and network namespaces. A plugin can't escape its container except through declared mounts and network rules.
- Hard resource limits. cgroups enforce `memoryMb` and `cpuMs` from the permission spec.
- Per-tenant isolation. Each container is fresh, so multi-tenant deployments don't share state between tenants.
- Untrusted code is safe. Containers are designed to run code you don't trust.
Cons:
- Latency overhead. Container startup is 100–500ms depending on image size and Docker setup. Per-invocation cost.
- Operational complexity. Requires a Docker daemon on the host, image management, container orchestration.
- Resource overhead. Each container reserves memory even while idle (until torn down).
- Requires the gateway. Container-mode dispatch goes through the gateway's `/internal/dispatch` endpoint, so the gateway has to be up and reachable.
When to use: multi-tenant deployments, untrusted marketplace plugins, deployments where regulatory or security requirements mandate OS-level isolation. Also useful for plugins that need unusual system tools that don't fit in the host's base image.
Gateway dispatch
In container mode, the flow is:
- Service receives the invocation (CLI, REST, workflow step).
- Service calls `POST /internal/dispatch` on the gateway with `{ handlerRef, input, context }`.
- Gateway mints a short-lived JWT for the container to call back with.
- Gateway asks the Docker environment adapter to provision a container with the JWT in its env.
- Container starts, connects to the gateway via WebSocket using the JWT, imports the handler, runs it.
- Handler results stream back through the gateway to the calling service.
- Container is torn down.
See Gateway → Architecture → Internal dispatch endpoint for details.
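The service-side dispatch call in the flow above can be sketched as a plain HTTP POST. The payload fields (`handlerRef`, `input`, `context`) come from the flow description; the auth header name is an assumption, not the gateway's documented contract.

```typescript
// Hypothetical sketch of the service-side dispatch call.
// Payload fields come from the flow above; the header name is assumed.
interface DispatchRequest {
  handlerRef: string;               // e.g. 'commit:commit'
  input: unknown;                   // handler input; must be JSON-serializable
  context: Record<string, unknown>; // invocation context forwarded to the container
}

function buildDispatch(url: string, internalSecret: string, req: DispatchRequest) {
  return {
    url,
    method: 'POST' as const,
    headers: {
      'content-type': 'application/json',
      // Assumed header; the gateway validates it against GATEWAY_INTERNAL_SECRET.
      'x-internal-secret': internalSecret,
    },
    body: JSON.stringify(req),
  };
}
```

Note that `input` crossing this boundary is why container mode (like worker-pool mode) requires JSON-serializable handler inputs.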
Remote
The 'remote' mode is defined in the config schema but no adapter ships today. The intended use case is a dedicated execution service that handles all plugin invocations for a distributed fleet of platform instances — think "a serverless function runtime for plugin handlers".
```
remote?: {
  endpoint?: string;
}
```

Right now, attempting to use this mode fails at startup because no adapter is registered. Treat it as "coming eventually, don't plan around it".
Auto-detection
Default is mode: 'auto'. The runtime inspects the environment:
- `process.env.EXECUTION_MODE` — explicit override. If set, uses that mode directly.
- `process.env.KUBERNETES_SERVICE_HOST` — if present, assumes Kubernetes and picks `container`.
- Otherwise — picks `worker-pool`.
In practice: set an explicit mode in kb.config.json for production. Auto-detect is a convenience for dev.
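The resolution order above amounts to a three-step fallback, sketched here for illustration (function name and shape are assumptions, not the actual runtime code):

```typescript
// Sketch of the 'auto' resolution order described above. Illustrative only.
type ExecutionMode = 'in-process' | 'worker-pool' | 'remote' | 'container';

function resolveMode(env: Record<string, string | undefined>): ExecutionMode {
  // 1. Explicit override always wins.
  const explicit = env.EXECUTION_MODE as ExecutionMode | undefined;
  if (explicit) return explicit;
  // 2. Kubernetes injects KUBERNETES_SERVICE_HOST into every pod.
  if (env.KUBERNETES_SERVICE_HOST) return 'container';
  // 3. Everything else gets the worker pool.
  return 'worker-pool';
}
```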
Isolation profiles
Workflows have an orthogonal concept of isolation profile on each job:
```
type IsolationProfile = 'strict' | 'balanced' | 'relaxed';
```

This is separate from the execution mode. Isolation profile is a hint from the workflow author about how much isolation the job needs; the execution backend translates it into concrete choices (worker vs container, shared vs dedicated workspace, etc.).

- `strict` — full isolation. Requires both workspace and environment adapters. Each job runs in a fresh container with a fresh workspace. No sharing with other jobs.
- `balanced` — default. Requires a workspace adapter. Jobs run in their own workspace but may share worker threads or containers if the backend allows.
- `relaxed` — no workspace required. Jobs run in-process against the shared cwd. Fastest, least isolated.
The workflow daemon warns at startup if the workspace or environment adapters are missing for the isolation levels declared in workflow specs. See Services → Workflow Daemon for the warning behavior.
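The per-profile adapter requirements can be summarized in a small lookup, sketched here under the assumption that adapters are identified by kind strings like `'workspace'` and `'environment'` (the function names are illustrative, not the daemon's actual code):

```typescript
// Sketch of adapter requirements per isolation profile, as listed above.
type IsolationProfile = 'strict' | 'balanced' | 'relaxed';

function requiredAdapters(profile: IsolationProfile): string[] {
  switch (profile) {
    case 'strict':   return ['workspace', 'environment']; // fresh container + fresh workspace
    case 'balanced': return ['workspace'];                // own workspace, possibly shared execution
    case 'relaxed':  return [];                           // shared cwd, nothing required
  }
}

// The daemon's startup warning amounts to computing this diff per declared profile.
function missingAdapters(profile: IsolationProfile, available: string[]): string[] {
  return requiredAdapters(profile).filter((kind) => !available.includes(kind));
}
```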
Execution target
Every workflow job (and some REST requests) can specify an ExecutionTarget:
```
interface ExecutionTarget {
  environmentId?: string; // specific environment to use
  workspaceId?: string;   // specific workspace to attach
  namespace?: string;     // namespace for routing
  workdir?: string;       // working directory inside the workspace
}
```

Targets are pins. A job with `target.environmentId: 'prod-env-1'` runs specifically in that environment, not in a freshly provisioned one. Useful for jobs that need access to a persistent environment (e.g. a long-lived browser session, a stateful service, a pre-warmed ML model).
Without a target, the execution backend provisions fresh resources per invocation (or picks an idle one from the pool).
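The pin-vs-provision decision can be sketched as a short-circuit on the pinned id. This is illustrative; the real backend also handles workspace attachment and pool reuse.

```typescript
// Sketch of the pin-vs-provision decision described above. Illustrative only.
interface ExecutionTarget {
  environmentId?: string;
  workspaceId?: string;
  namespace?: string;
  workdir?: string;
}

function pickEnvironment(
  target: ExecutionTarget | undefined,
  provisionFresh: () => string, // returns the id of a newly provisioned environment
): string {
  // A pinned environmentId short-circuits provisioning entirely.
  if (target?.environmentId) return target.environmentId;
  return provisionFresh();
}
```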
What plugin code sees
Plugin handlers don't see the execution mode directly. They see a standard ctx: PluginContextV3 with:
- `ctx.runtime.fs` — sandboxed filesystem access, permission-gated.
- `ctx.runtime.fetch` — sandboxed network access, permission-gated.
- `ctx.runtime.env` — sandboxed env var access, permission-gated.
- `ctx.platform.*` — platform services (LLM, cache, storage, etc.), which may be RPC proxies in container mode.
The shims enforce declared permissions identically in every mode. ctx.runtime.fs.readFile('/etc/passwd') throws a permission error in-process and in a container — the only difference is that in a container, the /etc/passwd read wouldn't have been allowed by the OS either, so you get defense-in-depth.
In container mode, ctx.platform.llm.complete(...) is an RPC call back through the gateway to the parent service, which calls the real LLM adapter and streams the response back. The plugin doesn't know.
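A minimal sketch of what a permission-gated `fs` shim looks like, to show why plugin code behaves identically across modes. The real shim lives in the SDK; the permission shape and prefix-matching rule here are assumptions for illustration.

```typescript
// Minimal sketch of a permission-gated fs shim. Names and permission
// shape are assumed; the real implementation is in the SDK.
interface FsPermissions {
  readPaths: string[]; // path prefixes the plugin declared it may read
}

function makeFsShim(perms: FsPermissions, realRead: (path: string) => string) {
  return {
    readFile(path: string): string {
      // This same check runs in-process, in a worker, and inside a container,
      // so the plugin observes identical behavior in every mode.
      if (!perms.readPaths.some((prefix) => path.startsWith(prefix))) {
        throw new Error(`Permission denied: fs.read ${path}`);
      }
      return realRead(path);
    },
  };
}
```

In container mode the OS would also block the read, which is the defense-in-depth mentioned above; the shim is simply the layer that is always present.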
Picking a mode
Decision tree:
- Dev environment? → `in-process`. Fast iteration, easy debugging.
- Single-node production, trusted plugins? → `worker-pool`. Fault isolation without Docker.
- Multi-tenant or untrusted plugins? → `container`. Full OS isolation.
- Kubernetes? → `auto` (which will pick `container`) or explicit `container`.
- Don't know yet? → start with `worker-pool` and switch if you need more isolation.
Changing modes
Mode is set in kb.config.json under platform.execution.mode. Changing it requires restarting every service that executes plugin handlers (REST API, workflow daemon, CLI). Plugins don't need to change — the same handler code works across all modes.
```
{
  "platform": {
    "execution": {
      "mode": "worker-pool",
      "workerPool": {
        "min": 2,
        "max": 10,
        "warmup": { "mode": "top-n", "topN": 5 }
      }
    }
  }
}
```

Switch to container:
```
{
  "platform": {
    "execution": {
      "mode": "container",
      "container": {
        "gatewayDispatchUrl": "http://gateway:4000/internal/dispatch",
        "gatewayInternalSecret": "${GATEWAY_INTERNAL_SECRET}"
      }
    }
  }
}
```

Restart:

```
kb-dev restart
```

What to read next
- Concepts → Plugin System — the lifecycle plugins go through.
- Configuration → kb.config.json → execution — the full execution config schema.
- Gateway → Architecture → Internal dispatch — container-mode dispatch details.
- Services → Workflow Daemon — isolation profile warnings at startup.
- Adapters → IEnvironmentProvider — how container environments are provisioned.