Multi-Tenancy
Last updated April 7, 2026
Tenant isolation, quotas, and rate limiting — how one platform serves many users.
KB Labs supports multi-tenant deployments — a single platform instance serving multiple logical tenants (customers, teams, environments) with per-tenant quotas, isolation, and rate limiting. The default single-tenant deployment uses a fixed tenant ID ('default') and no enforcement, so you can ignore all of this until you need it.
This page covers the tenant model, how tenant IDs flow through the stack, where isolation actually happens, and when you need multi-tenancy (and when you don't).
What a tenant is
A tenant is a logical boundary around:
- Stored data — analytics events, cache entries, storage files, workflow runs are all tagged with a tenant ID.
- Quotas — rate limits, memory limits, concurrency limits are enforced per tenant.
- Attribution — logs, metrics, and analytics events carry the tenant ID for filtering and billing.
- Access control — authenticated requests carry a tenant claim; the platform refuses cross-tenant access.
Tenants are not:
- Database namespaces. There's no schema-level isolation; tenants share the same tables.
- Separate processes. Every service runs one copy; tenants share code.
- Separate configuration. There's one kb.config.json per deployment; tenant-specific config would go in a per-tenant product config under profiles[].products.
Multi-tenancy is a logical boundary enforced by the platform's middleware and adapter layers, not by OS-level or network-level isolation. If you need hard isolation between tenants (compliance, security, regulatory), run separate deployments instead.
Tenant IDs
A tenant is identified by a string:
type TenantId = string;
// Schema: z.string().min(1).max(64).regex(/^[a-zA-Z0-9_-]+$/)
Regex-enforced: alphanumeric, underscore, hyphen. 1–64 characters. Use meaningful IDs: customer names, team slugs, environment identifiers (acme-corp, team-alpha, production-us-east).
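If you need the same check in code that doesn't pull in a schema library, the rule can be sketched with a plain regular expression. The helper name `isValidTenantId` is illustrative and not part of the platform.

```typescript
// Hypothetical helper mirroring the documented schema:
// 1 to 64 characters; letters, digits, underscore, hyphen only.
const TENANT_ID_PATTERN = /^[a-zA-Z0-9_-]{1,64}$/;

function isValidTenantId(value: unknown): value is string {
  return typeof value === 'string' && TENANT_ID_PATTERN.test(value);
}
```

Folding the length bounds into the pattern keeps the check to a single test; anything that isn't a string is rejected before the regex runs.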
The default single-tenant ID is 'default'. Every deployment starts with this as the fallback; explicit tenant IDs only matter when multi-tenancy is enabled.
Where the tenant ID comes from
HTTP requests
Incoming REST requests carry the tenant via header:
X-Tenant-ID: acme-corp
The REST API middleware extracts the header and attaches it to the request context. See rest-api/src/middleware/rate-limit.ts:
const tenantId =
  (request.headers['x-tenant-id'] as string) ||
  process.env.KB_TENANT_ID ||
  'default';
Order of precedence:
1. X-Tenant-ID header on the request.
2. KB_TENANT_ID env var on the service process (the default when the header isn't set).
3. 'default' as the final fallback.
For single-tenant deployments, set KB_TENANT_ID=<tenant-id> on every service and don't require the header on incoming requests. For multi-tenant deployments, require the header and validate it against the authenticated identity.
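The precedence chain can be pulled out into a standalone function for unit testing. This is a sketch rather than the middleware's actual code: `resolveTenantId` and `HeaderValue` are hypothetical names, and the array handling is an added assumption for Node-style header maps where a repeated header may arrive as a string array.

```typescript
// Hypothetical standalone version of the documented precedence:
// header, then KB_TENANT_ID, then 'default'.
type HeaderValue = string | string[] | undefined;

function resolveTenantId(
  headers: Record<string, HeaderValue>,
  env: Record<string, string | undefined>,
): string {
  const raw = headers['x-tenant-id'];
  // Assumption: if the header was sent more than once, use the first value.
  const fromHeader = Array.isArray(raw) ? raw[0] : raw;
  // `||` (not `??`) so that an empty header or env value also falls through.
  return fromHeader || env['KB_TENANT_ID'] || 'default';
}
```

Passing the environment in as an argument, instead of reading `process.env` directly, makes each branch of the fallback easy to exercise in isolation.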
CLI invocations
The CLI reads KB_TENANT_ID at startup and threads it through every downstream call. There's no per-command tenant flag — if you want to switch tenants from the CLI, export the env var.
Workflow runs
Workflow runs carry a tenantId field in the WorkflowRun record. When a workflow is triggered via the workflow daemon HTTP API, the caller's tenant ID propagates into the run and into every job spawned from it. Jobs inherit the tenant from the run; steps inherit from the job.
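The inheritance chain can be pictured as each level copying the tenant from its parent. The shapes below are simplified and hypothetical; only the tenantId field on WorkflowRun comes from this page, and `spawnJob` / `spawnStep` are illustrative names.

```typescript
// Simplified, hypothetical record shapes for illustration only.
interface WorkflowRun { id: string; tenantId: string }
interface Job { id: string; runId: string; tenantId: string }
interface Step { id: string; jobId: string; tenantId: string }

// A job takes its tenant from the run that spawned it.
function spawnJob(run: WorkflowRun, id: string): Job {
  return { id, runId: run.id, tenantId: run.tenantId };
}

// A step takes its tenant from its job, and so transitively from the run.
function spawnStep(job: Job, id: string): Step {
  return { id, jobId: job.id, tenantId: job.tenantId };
}
```

The point of the sketch is that nothing below the run chooses a tenant on its own; the value is set once at trigger time and copied downward.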
Plugin handlers
Inside a handler, ctx.tenantId is populated with the effective tenant ID for this invocation:
async execute(ctx, input) {
  const logger = useLogger();
  logger.info('request', { tenantId: ctx.tenantId });
  // Tenant-scoped cache key:
  const cache = useCache();
  if (cache) {
    // `result` stands for whatever the handler computed for this input (omitted here).
    await cache.set(`${ctx.tenantId}:query:${input.id}`, result, 60_000);
  }
}
Plugin code that deals with multi-tenant data must key its operations on ctx.tenantId, because the platform doesn't automatically scope cache/storage keys per tenant.
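To avoid hand-building key strings in every handler, a plugin could centralize the prefixing in one small helper. This is a sketch; `tenantKey` is a hypothetical name and not part of the SDK.

```typescript
// Hypothetical plugin-side helper; not provided by the platform.
function tenantKey(tenantId: string, ...parts: Array<string | number>): string {
  // Re-check the documented tenant ID format so a malformed ID
  // can never end up as a key prefix.
  if (!/^[a-zA-Z0-9_-]{1,64}$/.test(tenantId)) {
    throw new Error(`invalid tenant ID: ${JSON.stringify(tenantId)}`);
  }
  return [tenantId, ...parts].join(':');
}
```

With this helper, the cache key in the handler above would be `tenantKey(ctx.tenantId, 'query', input.id)`. Because the tenant ID format excludes `:`, the first segment of the key is always exactly one tenant ID.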
Where tenant isolation actually happens
Rate limiting
The REST API's rate-limit middleware tracks requests per tenant per time window. A tenant exceeding its limit gets HTTP 429 for the rest of the window. Limits are configured via the resource broker:
{
  "platform": {
    "core": {
      "resourceBroker": {
        "distributed": true,
        "llm": {
          "rateLimits": "pro",
          "maxRetries": 3,
          "timeout": 60000
        }
      }
    }
  }
}
rateLimits accepts either a preset name ('free' | 'pro' | 'enterprise') or a custom RateLimitConfig. Presets set sensible defaults for typical SaaS tiers; custom config gives you fine control.
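As a mental model for "requests per tenant per time window", here is a minimal in-memory fixed-window counter. It is an assumption about the general technique, not the middleware's actual algorithm; `FixedWindowLimiter` is a hypothetical name, and a real multi-instance deployment would keep the counters in shared state rather than in process memory.

```typescript
// Illustrative fixed-window limiter keyed by tenant ID.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly counts = new Map<string, { windowStart: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if the request is allowed; false means "respond with 429". */
  allow(tenantId: string, now: number = Date.now()): boolean {
    // Align the timestamp to the start of its window.
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    let entry = this.counts.get(tenantId);
    if (!entry || entry.windowStart !== windowStart) {
      // First request of a new window: reset this tenant's counter.
      entry = { windowStart, count: 0 };
      this.counts.set(tenantId, entry);
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

Each tenant has its own counter, so one tenant exhausting its window does not affect another, and a rejected tenant is admitted again as soon as the next window starts.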
Quotas
Plugin handlers carry per-invocation quotas (timeoutMs, memoryMb, cpuMs) from the permission spec. For multi-tenant deployments, you can also set tenant-level defaults via core.resources.defaultQuotas:
{
  "platform": {
    "core": {
      "resources": {
        "defaultQuotas": {
          "maxConcurrentCalls": 10,
          "maxRequestsPerMinute": 100
        }
      }
    }
  }
}
defaultQuotas applies to tenants that don't have explicit per-tenant quotas configured. For tenant-specific overrides, the quota resolver walks per-tenant config in the state broker. (Configuring explicit per-tenant quotas is on the roadmap; current deployments use defaultQuotas as a single platform-wide cap.)
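The fallback rule described here (use a tenant's explicit quotas where present, otherwise the defaults) can be sketched as a merge. The field names match the defaultQuotas example; the override map and the `resolveQuotas` function are hypothetical, since per-tenant overrides are not configurable today.

```typescript
// Field names taken from the defaultQuotas example above.
interface Quotas {
  maxConcurrentCalls: number;
  maxRequestsPerMinute: number;
}

// Hypothetical resolver: per-tenant overrides win field by field,
// and anything not overridden falls back to the defaults.
function resolveQuotas(
  tenantId: string,
  defaults: Quotas,
  overrides: ReadonlyMap<string, Partial<Quotas>> = new Map(),
): Quotas {
  return { ...defaults, ...(overrides.get(tenantId) ?? {}) };
}
```

With no overrides supplied, every tenant resolves to the same defaults, which corresponds to the single platform-wide cap that current deployments get.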
Analytics attribution
Every analytics event carries the tenant ID in its AnalyticsContext:
interface AnalyticsContext {
  source: { product: string; version: string };
  runId: string;
  actor?: { type: 'user' | 'agent' | 'ci'; id?: string; name?: string };
  tenantId?: string;
  ctx?: Record<string, string | number | boolean | null>;
}
Analytics adapters persist events with the tenant ID so dashboards can slice costs, token usage, and request counts per tenant. This is the foundation for billing in multi-tenant SaaS deployments.
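Slicing usage per tenant then comes down to grouping persisted events by tenantId. The sketch below uses a hypothetical, simplified event shape: only the optional tenantId mirrors AnalyticsContext, while the `tokens` field and the choice to count untagged events under 'default' are assumptions made for illustration.

```typescript
// Hypothetical simplified event; real events carry a full AnalyticsContext.
interface UsageEvent {
  tenantId?: string;
  tokens: number; // assumed payload field
}

function tokensByTenant(events: readonly UsageEvent[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const event of events) {
    // Assumption: events without a tenant are attributed to 'default'.
    const tenant = event.tenantId ?? 'default';
    totals.set(tenant, (totals.get(tenant) ?? 0) + event.tokens);
  }
  return totals;
}
```

A billing job could run an aggregation like this over a time range and hand the per-tenant totals to an invoicing system.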
State broker scoping
The state broker (used for distributed rate limits, locks, and shared state) prefixes keys with the tenant ID automatically. Two tenants can use the same key name without colliding:
acme-corp:rate-limit:llm:2024-01-15T10:30
team-alpha:rate-limit:llm:2024-01-15T10:30
Plugin code doesn't construct these keys directly; the state broker API takes a tenant ID and handles the prefix internally.
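To illustrate what "handles the prefix internally" means for callers, here is an in-memory stand-in with the same idea. The class and its method signatures are hypothetical and are not the real state broker API.

```typescript
// Hypothetical in-memory stand-in for a tenant-aware key-value store.
class TenantScopedStore {
  private readonly data = new Map<string, unknown>();

  // The prefix is applied in one private place; callers never see it.
  private fullKey(tenantId: string, key: string): string {
    return `${tenantId}:${key}`;
  }

  set(tenantId: string, key: string, value: unknown): void {
    this.data.set(this.fullKey(tenantId, key), value);
  }

  get(tenantId: string, key: string): unknown {
    return this.data.get(this.fullKey(tenantId, key));
  }
}
```

Two tenants writing to the same logical key, such as `rate-limit:llm`, end up with separate entries, which is the non-colliding behavior described above.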
Workspace isolation
Workspaces are inherently per-tenant when materialized by the workspace adapter. Each tenant gets a fresh workspace; workspaces aren't shared across tenants. The workspace ID typically encodes the tenant (ws-acme-corp-abc123).
For git-worktree-based workspaces, each tenant has its own worktree. For container-mode execution, each tenant gets its own container. For the local-fs adapter in single-tenant dev, everybody shares the same workspace — multi-tenancy is effectively disabled.
Tiered rate limiting
The RateLimitPreset type accepts a few built-in tier names:
- free: conservative limits for free-tier users.
- pro: higher limits for paying users.
- enterprise: very high or unlimited for enterprise accounts.
The exact numbers are defined in the @kb-labs/core-resource-broker package. They're starting points — override with a custom RateLimitConfig if your business model doesn't fit the free/pro/enterprise pattern.
To assign a tier per tenant, you'd typically:
- Store tenant → tier mapping somewhere (your own database, a JWT claim).
- On every request, look up the tier for the incoming tenant.
- Apply the tier's rate limit config to the resource broker.
This middleware integration isn't built into the platform — you wire it up in your own auth middleware. The platform provides the primitives; you own the tiering logic.
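The lookup step might look like the sketch below. The preset union is redeclared locally from the names listed on this page; `tierFor`, the in-memory map, and the choice of 'free' as the fallback are assumptions. How the resulting preset is applied to the resource broker is left out because that API isn't covered here.

```typescript
// Redeclared locally for the sketch; the real type lives in the platform packages.
type RateLimitPreset = 'free' | 'pro' | 'enterprise';

// Hypothetical lookup. In practice the mapping would come from your own
// database or from a claim on the authenticated token.
function tierFor(
  tenantId: string,
  tiers: ReadonlyMap<string, RateLimitPreset>,
  fallback: RateLimitPreset = 'free',
): RateLimitPreset {
  return tiers.get(tenantId) ?? fallback;
}

const exampleTiers = new Map<string, RateLimitPreset>([
  ['acme-corp', 'enterprise'],
  ['team-alpha', 'pro'],
]);
```

Defaulting unknown tenants to the most conservative tier means a missing mapping fails safe instead of granting enterprise limits by accident.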
When to enable multi-tenancy
You need multi-tenancy if:
- You're running KB Labs as a SaaS product with multiple paying customers.
- You need to bill or attribute resource usage per customer.
- You need per-customer rate limits or quotas.
- Different customers have different configurations (different LLM providers, different scopes, different tools).
You don't need multi-tenancy if:
- You're running KB Labs for a single team or organization.
- Everyone in your deployment is using the same config.
- You don't need to bill or limit usage per user.
- You're running KB Labs locally on a developer laptop.
For single-tenant setups, leave KB_TENANT_ID=default (or unset) and don't implement any tenant-aware logic in your plugins. The platform behaves identically to a non-multi-tenant tool.
Enabling multi-tenancy: checklist
If you decide you need it:
1. Set up authentication. Every request needs an identity you can map to a tenant. The gateway's JWT model carries hostId/namespaceId, but you may need to add tenant claims on top.
2. Require X-Tenant-ID on incoming requests. Reject requests without it.
3. Validate the header against the authenticated identity. Don't trust the caller to claim arbitrary tenant IDs.
4. Enable distributed state. Set platform.core.resourceBroker.distributed: true and run the state daemon so quotas work across service instances.
5. Configure rate limits. Pick a preset or write a custom RateLimitConfig.
6. Set core.resources.defaultQuotas. The baseline for tenants without explicit overrides.
7. Update plugins to use ctx.tenantId. Every cache key, storage path, state broker key that stores per-tenant data needs tenant scoping.
8. Set up billing / monitoring. Analytics events are tagged with tenant IDs; wire them into your billing system or dashboards.
The platform handles steps 4–6 via config. Steps 1–3 and 7–8 are your responsibility.
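The header checks you own (require X-Tenant-ID, then validate it against the authenticated identity) could be sketched as a single pure function that your auth middleware calls. Everything here is hypothetical: the `tenants` claim is an assumed custom claim, not part of the gateway's JWT model, and the 400/403 status choices are one reasonable convention.

```typescript
// Assumed claim shape: `tenants` lists the tenant IDs this identity may act for.
interface AuthClaims {
  sub: string;
  tenants: string[];
}

type TenantCheck =
  | { ok: true; tenantId: string }
  | { ok: false; status: 400 | 403; reason: string };

function checkTenant(header: string | undefined, claims: AuthClaims): TenantCheck {
  // Require the header.
  if (!header) {
    return { ok: false, status: 400, reason: 'missing X-Tenant-ID' };
  }
  // Reject values that don't match the documented tenant ID format.
  if (!/^[a-zA-Z0-9_-]{1,64}$/.test(header)) {
    return { ok: false, status: 400, reason: 'malformed X-Tenant-ID' };
  }
  // The caller may only name a tenant that their identity is bound to.
  if (!claims.tenants.includes(header)) {
    return { ok: false, status: 403, reason: 'tenant not permitted for this identity' };
  }
  return { ok: true, tenantId: header };
}
```

Keeping the check free of framework types makes it easy to test and to reuse in whichever HTTP layer sits in front of the platform.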
What's NOT isolated
Be aware of these gaps:
- kb.config.json is single-config. There's no per-tenant config file. Different tenants using different LLM models means configuring one LLM adapter and letting plugins pick via tiers.
- Plugin installs are platform-wide. Every tenant sees the same installed plugins. Per-tenant plugin lists would require rebuilding the discovery layer.
- In-process execution shares memory. If you run mode: 'in-process' in a multi-tenant deployment, tenants share the same V8 heap. A buggy plugin can leak data between tenants. Use worker-pool or container for multi-tenant.
- Logs mix tenants. The log stream contains events from every tenant. Filtering to a specific tenant requires querying by the tenantId field; there is no separate log per tenant.
- Database tables are shared. SQL schemas don't carry per-tenant prefixes. Add tenant ID as a column to your tables if you're doing your own SQL access.
For hard isolation, run separate deployments.
Future directions
Roadmap items explicitly called out in the source or the multi-tenancy ADR:
- Per-tenant plugin configuration (different plugin sets per tenant).
- Per-tenant adapter selection (tenant X uses OpenAI, tenant Y uses Anthropic).
- Per-tenant workspace templates.
- First-class tiering middleware on the gateway.
- Automatic analytics breakdowns per tenant in Studio dashboards.
None of these are shipped today. Current multi-tenancy is "tenant-aware primitives that you compose in your own middleware".
What to read next
- Configuration → kb.config.json → core.resourceBroker: rate-limit presets and distributed state.
- Gateway → Authentication: where tenant identity is established.
- SDK → Handler Context: ctx.tenantId and how to use it.
- Adapters → IAnalytics: how tenant attribution works in analytics events.
- Operations → Security: multi-tenancy as part of the broader security model.