KB Labs Docs

Multi-Tenancy

Last updated April 7, 2026


Tenant isolation, quotas, and rate limiting — how one platform serves many users.

KB Labs supports multi-tenant deployments — a single platform instance serving multiple logical tenants (customers, teams, environments) with per-tenant quotas, isolation, and rate limiting. The default single-tenant deployment uses a fixed tenant ID ('default') and no enforcement, so you can ignore all of this until you need it.

This page covers the tenant model, how tenant IDs flow through the stack, where isolation actually happens, and when you need multi-tenancy (and when you don't).

What a tenant is

A tenant is a logical boundary around:

  • Stored data — analytics events, cache entries, storage files, workflow runs are all tagged with a tenant ID.
  • Quotas — rate limits, memory limits, concurrency limits are enforced per tenant.
  • Attribution — logs, metrics, and analytics events carry the tenant ID for filtering and billing.
  • Access control — authenticated requests carry a tenant claim; the platform refuses cross-tenant access.

Tenants are not:

  • Database namespaces. There's no schema-level isolation; tenants share the same tables.
  • Separate processes. Every service runs one copy; tenants share code.
  • Separate configuration. There's one kb.config.json per deployment; tenant-specific config would go in a per-tenant product config under profiles[].products.

Multi-tenancy is a logical boundary enforced by the platform's middleware and adapter layers, not by OS-level or network-level isolation. If you need hard isolation between tenants (compliance, security, regulatory), run separate deployments instead.

Tenant IDs

A tenant is identified by a string:

TypeScript
type TenantId = string;
// Schema: z.string().min(1).max(64).regex(/^[a-zA-Z0-9_-]+$/)

Regex-enforced: alphanumeric, underscore, hyphen. 1–64 characters. Use meaningful IDs — customer names, team slugs, environment identifiers (acme-corp, team-alpha, production-us-east).

The default single-tenant ID is 'default'. Every deployment starts with this as the fallback; explicit tenant IDs only matter when multi-tenancy is enabled.
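The schema above can be mirrored with a plain regex check, useful when you need to validate tenant IDs in your own middleware before they reach the platform (a sketch; the platform's actual validation lives in its schema layer):

```typescript
// Sketch: validate a tenant ID against the documented constraints
// (1-64 characters; alphanumeric, underscore, hyphen only).
const TENANT_ID_RE = /^[a-zA-Z0-9_-]{1,64}$/;

function isValidTenantId(id: string): boolean {
  return TENANT_ID_RE.test(id);
}

isValidTenantId('acme-corp');   // true
isValidTenantId('bad tenant!'); // false
```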

Where the tenant ID comes from

HTTP requests

Incoming REST requests carry the tenant via header:

X-Tenant-ID: acme-corp

The REST API middleware extracts the header and attaches it to the request context. See rest-api/src/middleware/rate-limit.ts:

TypeScript
const tenantId =
  (request.headers['x-tenant-id'] as string) ||
  process.env.KB_TENANT_ID ||
  'default';

Order of precedence:

  1. X-Tenant-ID header on the request.
  2. KB_TENANT_ID env var on the service process (the default when the header isn't set).
  3. 'default' as the final fallback.

For single-tenant deployments, set KB_TENANT_ID=<tenant-id> on every service and don't require the header on incoming requests. For multi-tenant deployments, require the header and validate it against the authenticated identity.
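The precedence and the multi-tenant "require the header" rule can be sketched together in one helper (the function name and option are hypothetical, not platform API):

```typescript
// Sketch (helper name and option are hypothetical): resolve the effective
// tenant ID for a request, mirroring the documented precedence, with an
// opt-in requirement for multi-tenant deployments.
interface IncomingRequest {
  headers: Record<string, string | undefined>;
}

function resolveTenantId(
  request: IncomingRequest,
  env: Record<string, string | undefined>,
  requireHeader = false,
): string {
  const fromHeader = request.headers['x-tenant-id'];
  if (requireHeader && !fromHeader) {
    // Multi-tenant deployments reject requests without an explicit tenant.
    throw new Error('X-Tenant-ID header is required');
  }
  return fromHeader || env['KB_TENANT_ID'] || 'default';
}
```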

CLI invocations

The CLI reads KB_TENANT_ID at startup and threads it through every downstream call. There's no per-command tenant flag — if you want to switch tenants from the CLI, export the env var.

Workflow runs

Workflow runs carry a tenantId field in the WorkflowRun record. When a workflow is triggered via the workflow daemon HTTP API, the caller's tenant ID propagates into the run and into every job spawned from it. Jobs inherit the tenant from the run; steps inherit from the job.
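The inheritance chain can be sketched as follows (the record shapes are assumptions, trimmed to the fields that matter here):

```typescript
// Sketch: jobs inherit the tenant from the run; steps inherit from the job.
interface WorkflowRun { id: string; tenantId: string }
interface Job { id: string; runId: string; tenantId: string }
interface Step { id: string; jobId: string; tenantId: string }

function spawnJob(run: WorkflowRun, jobId: string): Job {
  return { id: jobId, runId: run.id, tenantId: run.tenantId };
}

function spawnStep(job: Job, stepId: string): Step {
  return { id: stepId, jobId: job.id, tenantId: job.tenantId };
}
```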

Plugin handlers

Inside a handler, ctx.tenantId is populated with the effective tenant ID for this invocation:

TypeScript
async execute(ctx, input) {
  const logger = useLogger();
  logger.info('request', { tenantId: ctx.tenantId });
 
  // Tenant-scoped cache key:
  const cache = useCache();
  if (cache) {
    await cache.set(`${ctx.tenantId}:query:${input.id}`, result, 60_000);
  }
}

Plugin code that deals with multi-tenant data must key its operations on ctx.tenantId — the platform doesn't automatically scope cache/storage keys per tenant.
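One way to make that scoping hard to forget is a tiny key helper (hypothetical, not platform API):

```typescript
// Sketch (hypothetical helper): prefix every cache/storage key with the
// current tenant so plugin code can't accidentally omit the scoping.
function tenantKey(tenantId: string, ...parts: string[]): string {
  return [tenantId, ...parts].join(':');
}

tenantKey('acme-corp', 'query', 'abc123'); // 'acme-corp:query:abc123'
```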

Where tenant isolation actually happens

Rate limiting

The REST API's rate-limit middleware tracks requests per tenant per time window. A tenant exceeding its limit gets HTTP 429 for the rest of the window. Limits are configured via the resource broker:

JSON
{
  "platform": {
    "core": {
      "resourceBroker": {
        "distributed": true,
        "llm": {
          "rateLimits": "pro",
          "maxRetries": 3,
          "timeout": 60000
        }
      }
    }
  }
}

rateLimits accepts either a preset name ('free' | 'pro' | 'enterprise') or a custom RateLimitConfig. Presets set sensible defaults for typical SaaS tiers; custom config gives you fine control.

Quotas

Plugin handlers carry per-invocation quotas (timeoutMs, memoryMb, cpuMs) from the permission spec. For multi-tenant deployments, you can override these per tenant via core.resources.defaultQuotas:

JSON
{
  "platform": {
    "core": {
      "resources": {
        "defaultQuotas": {
          "maxConcurrentCalls": 10,
          "maxRequestsPerMinute": 100
        }
      }
    }
  }
}

defaultQuotas applies to tenants that don't have explicit per-tenant quotas configured. For tenant-specific overrides, the quota resolver walks per-tenant config in the state broker. (Configuring explicit per-tenant quotas is roadmap — current deployments use defaultQuotas as a single platform-wide cap.)

Analytics attribution

Every analytics event carries the tenant ID in its AnalyticsContext:

TypeScript
interface AnalyticsContext {
  source: { product: string; version: string };
  runId: string;
  actor?: { type: 'user' | 'agent' | 'ci'; id?: string; name?: string };
  tenantId?: string;
  ctx?: Record<string, string | number | boolean | null>;
}

Analytics adapters persist events with the tenant ID so dashboards can slice costs, token usage, and request counts per tenant. This is the foundation for billing in multi-tenant SaaS deployments.
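A billing pipeline consuming those events boils down to grouping by tenant ID. A minimal sketch (the event shape beyond tenantId is an assumption):

```typescript
// Sketch: slice token usage per tenant from a batch of analytics events,
// the way a billing dashboard would.
interface UsageEvent {
  tenantId?: string;
  tokens: number;
}

function tokensByTenant(events: UsageEvent[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const event of events) {
    // Events without a tenant fall back to the single-tenant default.
    const tenant = event.tenantId ?? 'default';
    totals.set(tenant, (totals.get(tenant) ?? 0) + event.tokens);
  }
  return totals;
}
```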

State broker scoping

The state broker (used for distributed rate limits, locks, and shared state) prefixes keys with the tenant ID automatically. Two tenants can use the same key name without colliding:

acme-corp:rate-limit:llm:2024-01-15T10:30
team-alpha:rate-limit:llm:2024-01-15T10:30

Plugin code doesn't construct these keys directly — the state broker API takes a tenant ID and handles the prefix internally.

Workspace isolation

Workspaces are inherently per-tenant when materialized by the workspace adapter. Each tenant gets a fresh workspace; workspaces aren't shared across tenants. The workspace ID typically encodes the tenant (ws-acme-corp-abc123).

For git-worktree-based workspaces, each tenant has its own worktree. For container-mode execution, each tenant gets its own container. For the local-fs adapter in single-tenant dev, everybody shares the same workspace — multi-tenancy is effectively disabled.

Tiered rate limiting

The RateLimitPreset type accepts a few built-in tier names:

  • free — conservative limits for free-tier users.
  • pro — higher limits for paying users.
  • enterprise — very high or unlimited for enterprise accounts.

The exact numbers are defined in the @kb-labs/core-resource-broker package. They're starting points — override with a custom RateLimitConfig if your business model doesn't fit the free/pro/enterprise pattern.

To assign a tier per tenant, you'd typically:

  1. Store tenant → tier mapping somewhere (your own database, a JWT claim).
  2. On every request, look up the tier for the incoming tenant.
  3. Apply the tier's rate limit config to the resource broker.

This middleware integration isn't built into the platform — you wire it up in your own auth middleware. The platform provides the primitives; you own the tiering logic.
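The three steps above can be sketched as follows (all names are hypothetical; the mapping would live in your database or a JWT claim, not an in-memory record):

```typescript
// Sketch (all names hypothetical): the tenant -> tier lookup you would wire
// into your own auth middleware before applying the tier's rate limits.
type Tier = 'free' | 'pro' | 'enterprise';

// Step 1: the tenant -> tier mapping. An in-memory record stands in for
// your database or JWT claim here.
const tierByTenant: Record<string, Tier> = {
  'acme-corp': 'enterprise',
  'team-alpha': 'pro',
};

// Step 2: resolve the tier per request, defaulting unknown tenants to the
// most restrictive preset.
function tierFor(tenantId: string): Tier {
  return tierByTenant[tenantId] ?? 'free';
}

// Step 3 is applying the resolved tier's rate limit config to the resource
// broker; that integration is yours to build.
```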

When to enable multi-tenancy

You need multi-tenancy if:

  • You're running KB Labs as a SaaS product with multiple paying customers.
  • You need to bill or attribute resource usage per customer.
  • You need per-customer rate limits or quotas.
  • Different customers have different configurations (different LLM providers, different scopes, different tools).

You don't need multi-tenancy if:

  • You're running KB Labs for a single team or organization.
  • Everyone in your deployment is using the same config.
  • You don't need to bill or limit usage per user.
  • You're running KB Labs locally on a developer laptop.

For single-tenant setups, leave KB_TENANT_ID=default (or unset) and don't implement any tenant-aware logic in your plugins. The platform behaves identically to a non-multi-tenant tool.

Enabling multi-tenancy: checklist

If you decide you need it:

  1. Set up authentication. Every request needs an identity you can map to a tenant. The gateway's JWT model carries hostId / namespaceId, but you may need to add tenant claims on top.
  2. Require X-Tenant-ID on incoming requests. Reject requests without it.
  3. Validate the header against the authenticated identity. Don't trust the caller to claim arbitrary tenant IDs.
  4. Enable distributed state. Set platform.core.resourceBroker.distributed: true and run the state daemon so quotas work across service instances.
  5. Configure rate limits. Pick a preset or write a custom RateLimitConfig.
  6. Set core.resources.defaultQuotas. The baseline for tenants without explicit overrides.
  7. Update plugins to use ctx.tenantId. Every cache key, storage path, state broker key that stores per-tenant data needs tenant scoping.
  8. Set up billing / monitoring. Analytics events are tagged with tenant IDs — wire them into your billing system or dashboards.

The platform handles steps 4–6 via config. Steps 1–3 and 7–8 are your responsibility.

What's NOT isolated

Be aware of these gaps:

  • kb.config.json is single-config. There's no per-tenant config file. If different tenants need different LLM models, you configure one LLM adapter and let plugins select models per tier.
  • Plugin installs are platform-wide. Every tenant sees the same installed plugins. Per-tenant plugin lists would require rebuilding the discovery layer.
  • In-process execution shares memory. If you run mode: 'in-process' in a multi-tenant deployment, tenants share the same V8 heap. A buggy plugin can leak data between tenants. Use worker-pool or container for multi-tenant.
  • Logs mix tenants. The log stream contains events from every tenant. Filtering to a specific tenant requires querying by the tenantId field — not a separate log per tenant.
  • Database tables are shared. SQL schemas don't carry per-tenant prefixes. Add tenant ID as a column to your tables if you're doing your own SQL access.

For hard isolation, run separate deployments.

Future directions

Roadmap items explicitly called out in the source or the multi-tenancy ADR:

  • Per-tenant plugin configuration (different plugin sets per tenant).
  • Per-tenant adapter selection (tenant X uses OpenAI, tenant Y uses Anthropic).
  • Per-tenant workspace templates.
  • First-class tiering middleware on the gateway.
  • Automatic analytics breakdowns per tenant in Studio dashboards.

None of these are shipped today. Current multi-tenancy is "tenant-aware primitives that you compose in your own middleware".
