Quickstart
Last updated April 7, 2026
Install @kb-labs/platform-client, authenticate, and make your first call in 5 minutes.
This is the fastest path to a working @kb-labs/platform-client integration. Install, construct, call.
Prerequisites
- A running KB Labs gateway reachable from your app (`http://localhost:4000` for local dev).
- A bearer token for the gateway. For local dev, use one of the `gateway.staticTokens` from `kb.config.json`. For production, see Authentication.
- Node 18+ (for native `fetch`), or any runtime with `fetch` support.
Step 1 — Install the package
```shell
pnpm add @kb-labs/platform-client
```

No peer dependencies, no setup scripts, no optional deps. The package is a single class and ~300 lines of code.
Step 2 — Construct the client
```ts
import { KBPlatform } from '@kb-labs/platform-client';

const platform = new KBPlatform({
  endpoint: 'http://localhost:4000',
  apiKey: 'dev-studio-token',
});
```

That's it. The endpoint is the gateway URL; the apiKey is a bearer token the gateway validates on every request.
Optional fields:
```ts
const platform = new KBPlatform({
  endpoint: 'http://localhost:4000',
  apiKey: 'dev-studio-token',
  defaultTags: {
    source: 'my-backend',
    env: 'dev',
  },
  onError: (err) => {
    console.error('[platform-client] background error:', err);
  },
});
```

defaultTags is merged into every telemetry event; onError catches background failures (telemetry flushes) so they don't fail your process silently.
Step 3 — Call an LLM
```ts
const response = await platform.llm.complete(
  'Write a one-line Python function that reverses a string.',
);

console.log(response.content);
console.log(`Used ${response.usage.promptTokens + response.usage.completionTokens} tokens`);
console.log(`Model: ${response.model}`);
```

The call goes to POST /platform/v1/llm/complete on the gateway. The gateway routes to whichever LLM adapter is configured in the server's kb.config.json, runs the completion, and returns a typed LLMResponse.
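If you want to see the wire call behind the proxy, the same request can be built with the platform `Request` API. The path and bearer auth come from the contract above; the exact JSON body fields are an assumption here, not the documented wire format:

```typescript
// Sketch of the raw HTTP call behind platform.llm.complete.
// POST /platform/v1/llm/complete with a Bearer token is per the text above;
// the body shape ({ prompt }) is an assumption for illustration.
const endpoint = 'http://localhost:4000';
const apiKey = 'dev-studio-token';

const req = new Request(`${endpoint}/platform/v1/llm/complete`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    prompt: 'Write a one-line Python function that reverses a string.',
  }),
});

console.log(req.method, new URL(req.url).pathname);
// const res = await fetch(req);  // requires a running gateway
```

`Request` and `fetch` are globals in Node 18+, which is why the package needs no HTTP dependency.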
With options
```ts
const response = await platform.llm.complete('Summarize this text...', {
  temperature: 0.3,
  maxTokens: 200,
  systemPrompt: 'You are a concise summarizer.',
});
```

LLMOptions is a small set of generic LLM parameters — model, temperature, max tokens, stop sequences, system prompt. See Overview → Quick API reference.
Step 4 — Use the cache
```ts
// Store a value with a 60-second TTL
await platform.cache.set('query:my-unique-key', { data: 'cached result' }, 60_000);

// Retrieve it later
const cached = await platform.cache.get<{ data: string }>('query:my-unique-key');
if (cached) {
  console.log(cached.data);
}

// Delete
await platform.cache.delete('query:my-unique-key');
```

cache.get returns T | null. Use the generic to type the return value; the client doesn't validate the shape — it trusts the server.
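Because the client trusts the server, a defensive caller can narrow the value at runtime before using it. A minimal sketch — the `isCachedResult` guard is hypothetical, not part of the package:

```typescript
// Hypothetical runtime guard for the cached shape. The client performs no
// validation, so this narrows an untyped read safely before use.
interface CachedResult {
  data: string;
}

function isCachedResult(value: unknown): value is CachedResult {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { data?: unknown }).data === 'string'
  );
}

// Usage: fetch untyped, then narrow.
// const raw = await platform.cache.get<unknown>('query:my-unique-key');
// const cached = isCachedResult(raw) ? raw : null;

console.log(isCachedResult({ data: 'cached result' })); // true
console.log(isCachedResult({ data: 42 }));              // false
```

This matters most when the cached shape evolves between deploys: stale entries written by old code simply fail the guard instead of blowing up downstream.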
Step 5 — Search a vector store
```ts
// If the gateway has a configured vectorStore adapter:
const results = await platform.vectorStore.search({
  query: 'authentication middleware',
  limit: 5,
});
console.log(`Found ${results.length} results`);

// Get total document count:
const total = await platform.vectorStore.count();
console.log(`Index has ${total} documents`);
```

The search query is a plain object — the shape depends on the vector store adapter you're talking to. Qdrant accepts { vector, filter, limit }; other adapters may differ. Check the server-side adapter's docs.
Step 6 — Emit telemetry
```ts
// Fire-and-forget events (batched, flushed every 5s or at 50 events)
platform.telemetry.event('user.action', { action: 'clicked-button', buttonId: 'cta-1' });
platform.telemetry.metric('page_load_ms', 342);
platform.telemetry.log('info', 'Processing completed', { items: 42 });

// Force a flush if you need to see events immediately
await platform.telemetry.flush();
```

Telemetry calls are synchronous — they just push into an internal buffer. The buffer flushes every 5 seconds or when it reaches 50 events, whichever comes first. Call flush() explicitly if you need events to be sent immediately (e.g. right before a process exit).
For unbuffered, immediate events, use track() or identify() instead:
```ts
// Goes straight to the server
await platform.telemetry.track('subscription.upgraded', { plan: 'pro' });
await platform.telemetry.identify('user-123', { email: 'alice@example.com' });
```

See Telemetry for the batching model and when to use which.
Step 7 — Use the generic call for anything else
When you need to hit an adapter or method the typed proxies don't cover:
// Trigger a workflow run:
```ts
// Trigger a workflow run:
const run = await platform.call('workflows', 'run', {
  workflowId: 'nightly-cleanup',
  inputs: { dryRun: false },
});
console.log(`Run started: ${run.id}`);

// Call a sorted-set operation the CacheProxy doesn't expose:
await platform.call('cache', 'zadd', 'queue', Date.now(), 'job-42');
const due = await platform.call<string[]>('cache', 'zrangebyscore', 'queue', 0, Date.now());
```

call<T>(adapter, method, ...args) is typed on the return value (you provide the generic). The args are untyped — pass whatever the server-side method expects.
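Since the generic only asserts a type, one way to keep call-sites honest is to wrap a specific call() invocation in a helper that validates the result. A sketch under that assumption — the wrapper and its injected `call` parameter are hypothetical, not part of the package:

```typescript
// Hypothetical typed wrapper around the generic call() for one server-side
// method. `call` is injected so the sketch can be exercised without a gateway;
// in a real app you would pass platform.call.bind(platform).
type GenericCall = (
  adapter: string,
  method: string,
  ...args: unknown[]
) => Promise<unknown>;

function makeDueJobs(call: GenericCall) {
  return async (queue: string, until: number): Promise<string[]> => {
    const res = await call('cache', 'zrangebyscore', queue, 0, until);
    if (!Array.isArray(res) || !res.every((x) => typeof x === 'string')) {
      throw new TypeError('expected string[] from cache.zrangebyscore');
    }
    return res;
  };
}

// Exercise it with a stub in place of platform.call:
(async () => {
  const stub: GenericCall = async () => ['job-42'];
  const due = await makeDueJobs(stub)('queue', Date.now());
  console.log(due);
})();
```

The payoff is that a server-side change that breaks the assumed shape throws loudly at the boundary instead of propagating a mistyped value.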
See Workflows for the full workflow trigger pattern.
Step 8 — Shutdown cleanly
In long-running processes (servers, daemons), call shutdown() before exit to flush buffered telemetry:
```ts
process.on('SIGTERM', async () => {
  console.log('Shutting down, flushing telemetry...');
  await platform.shutdown();
  process.exit(0);
});
```

Skipping this means the last few buffered telemetry events are lost when the process exits.
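SIGINT (Ctrl-C in a terminal) deserves the same treatment as SIGTERM. A small helper — sketched here, not part of the package — registers a one-shot flush-and-exit handler per signal:

```typescript
// Hypothetical helper: flush once on the first of SIGINT/SIGTERM.
// process.once avoids double-flushing if the signal is delivered twice.
function registerShutdown(shutdown: () => Promise<void>): void {
  for (const sig of ['SIGINT', 'SIGTERM'] as const) {
    process.once(sig, async () => {
      console.log(`${sig} received, flushing telemetry...`);
      await shutdown();
      process.exit(0);
    });
  }
}

// Usage in a real app:
// registerShutdown(() => platform.shutdown());
```

Note that signal handlers don't fire on a plain `process.exit()` or an uncaught exception, so short-lived scripts should still call shutdown() (or telemetry.flush()) on their normal exit path.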
Complete example
Putting it all together — a minimal backend handler:
```ts
import { KBPlatform } from '@kb-labs/platform-client';

const platform = new KBPlatform({
  endpoint: process.env.KB_GATEWAY_URL!,
  apiKey: process.env.KB_API_KEY!,
  defaultTags: { source: 'my-backend', env: process.env.NODE_ENV ?? 'dev' },
  onError: (err) => console.error('[platform-client]', err),
});

export async function summarize(text: string, userId: string): Promise<string> {
  // Check cache first
  const cacheKey = `summary:${hashString(text)}`;
  const cached = await platform.cache.get<string>(cacheKey);
  if (cached) {
    platform.telemetry.event('summary.cache.hit', { userId });
    return cached;
  }

  // Call LLM
  const start = Date.now();
  const response = await platform.llm.complete(
    `Summarize in one sentence:\n\n${text}`,
    { temperature: 0.3, maxTokens: 100 },
  );
  const summary = response.content.trim();

  // Cache for an hour
  await platform.cache.set(cacheKey, summary, 3_600_000);

  // Emit metrics
  platform.telemetry.metric('summary.duration_ms', Date.now() - start);
  platform.telemetry.event('summary.generated', {
    userId,
    tokens: response.usage.promptTokens + response.usage.completionTokens,
    model: response.model,
  });

  return summary;
}

// Clean up on exit
process.on('SIGTERM', async () => {
  await platform.shutdown();
  process.exit(0);
});

function hashString(s: string): string {
  // Any hash function — e.g. built-in crypto.createHash('sha256')
  return Buffer.from(s).toString('base64').slice(0, 16);
}
What to read next
- Overview — the full package surface and the Unified Platform API shape.
- Authentication — how to get a real `apiKey` for production.
- LLM, Cache, Vector Store — the typed proxies in detail.
- Workflows — triggering workflows via `call()`.
- Telemetry — event batching and flush.
- Error handling — errors, retries, shutdown.