KB Labs Docs

Quickstart

Last updated April 7, 2026


Install @kb-labs/platform-client, authenticate, and make your first call in 5 minutes.

This is the fastest path to a working @kb-labs/platform-client integration. Install, construct, call.

Prerequisites

  • A running KB Labs gateway reachable from your app (http://localhost:4000 for local dev).
  • A bearer token for the gateway. For local dev, use one of the gateway.staticTokens from kb.config.json. For production, see Authentication.
  • Node 18+ (for native fetch), or any runtime with fetch support.
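For local dev, the static token might be declared like this in the gateway's kb.config.json (a minimal sketch — the exact config shape is defined by the gateway; check its docs):

```json
{
  "gateway": {
    "staticTokens": ["dev-studio-token"]
  }
}
```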

Step 1 — Install the package

Bash
pnpm add @kb-labs/platform-client

No peer dependencies, no setup scripts, no optional deps. The package is a single class and ~300 lines of code.

Step 2 — Construct the client

TypeScript
import { KBPlatform } from '@kb-labs/platform-client';
 
const platform = new KBPlatform({
  endpoint: 'http://localhost:4000',
  apiKey: 'dev-studio-token',
});

That's it. The endpoint is the gateway URL; the apiKey is a bearer token the gateway will validate on every request.

Optional fields:

TypeScript
const platform = new KBPlatform({
  endpoint: 'http://localhost:4000',
  apiKey: 'dev-studio-token',
  defaultTags: {
    source: 'my-backend',
    env: 'dev',
  },
  onError: (err) => {
    console.error('[platform-client] background error:', err);
  },
});

defaultTags is merged into every telemetry event; onError catches background failures (telemetry flushes) so they don't crash your process silently.

Step 3 — Call an LLM

TypeScript
const response = await platform.llm.complete(
  'Write a one-line Python function that reverses a string.',
);
 
console.log(response.content);
console.log(`Used ${response.usage.promptTokens + response.usage.completionTokens} tokens`);
console.log(`Model: ${response.model}`);

The call goes to POST /platform/v1/llm/complete on the gateway. The gateway routes to whichever LLM adapter is configured in the server's kb.config.json, runs the completion, and returns a typed LLMResponse.
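Based on the fields the examples in this guide touch, the response carries at least the following — a sketch for orientation, not the package's exported type:

```typescript
// Rough shape of LLMResponse as used in this guide. The package exports the
// real type; this sketch only covers the fields the examples above touch.
interface LLMResponseSketch {
  content: string;
  model: string;
  usage: { promptTokens: number; completionTokens: number };
}

// Total token usage, as computed in the example above.
function totalTokens(r: LLMResponseSketch): number {
  return r.usage.promptTokens + r.usage.completionTokens;
}
```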

With options

TypeScript
const response = await platform.llm.complete('Summarize this text...', {
  temperature: 0.3,
  maxTokens: 200,
  systemPrompt: 'You are a concise summarizer.',
});

LLMOptions is a small set of generic LLM parameters — model, temperature, max tokens, stop sequences, system prompt. See Overview → Quick API reference.

Step 4 — Use the cache

TypeScript
// Store a value with a 60-second TTL
await platform.cache.set('query:my-unique-key', { data: 'cached result' }, 60_000);
 
// Retrieve it later
const cached = await platform.cache.get<{ data: string }>('query:my-unique-key');
if (cached) {
  console.log(cached.data);
}
 
// Delete
await platform.cache.delete('query:my-unique-key');

cache.get returns T | null. Use the generic to type the return value; the client doesn't validate the shape — it trusts the server.
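A common pattern built on these two calls is cache-aside: try the cache, compute on a miss, store the result. A minimal sketch, assuming only the get/set signatures shown above (which `platform.cache` satisfies):

```typescript
// Structural interface matching the get/set calls shown above.
interface CacheLike {
  get<T>(key: string): Promise<T | null>;
  set(key: string, value: unknown, ttlMs: number): Promise<void>;
}

// Cache-aside helper: return the cached value if present, otherwise
// compute it, store it with the given TTL, and return it.
async function getOrCompute<T>(
  cache: CacheLike,
  key: string,
  ttlMs: number,
  compute: () => Promise<T>,
): Promise<T> {
  const hit = await cache.get<T>(key);
  if (hit !== null) return hit;       // cache hit: skip the expensive call
  const value = await compute();      // cache miss: compute fresh
  await cache.set(key, value, ttlMs); // store for later callers
  return value;
}
```

Usage: `const result = await getOrCompute(platform.cache, 'query:my-unique-key', 60_000, () => expensiveQuery());`. Note that a legitimately cached `null` is indistinguishable from a miss here, so wrap nullable values in an object if that matters.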

Step 5 — Search a vector store

TypeScript
// If the gateway has a configured vectorStore adapter:
const results = await platform.vectorStore.search({
  query: 'authentication middleware',
  limit: 5,
});
 
console.log(`Found ${results.length} results`);
 
// Get total document count:
const total = await platform.vectorStore.count();
console.log(`Index has ${total} documents`);

The search query is a plain object — the shape depends on the vector store adapter you're talking to. Qdrant accepts { vector, filter, limit }; other adapters may differ. Check the server-side adapter's docs.

Step 6 — Emit telemetry

TypeScript
// Fire-and-forget events (batched, flushed every 5s or at 50 events)
platform.telemetry.event('user.action', { action: 'clicked-button', buttonId: 'cta-1' });
platform.telemetry.metric('page_load_ms', 342);
platform.telemetry.log('info', 'Processing completed', { items: 42 });
 
// Force a flush if you need to see events immediately
await platform.telemetry.flush();

Telemetry calls are synchronous — they just push into an internal buffer. The buffer flushes every 5 seconds or when it reaches 50 events, whichever comes first. Call flush() explicitly if you need events to be sent immediately (e.g. right before a process exit).

For unbuffered, immediate events, use track() or identify() instead:

TypeScript
// Goes straight to the server
await platform.telemetry.track('subscription.upgraded', { plan: 'pro' });
await platform.telemetry.identify('user-123', { email: 'alice@example.com' });

See Telemetry for the batching model and when to use which.

Step 7 — Use the generic call for anything else

When you need to hit an adapter or method the typed proxies don't cover:

TypeScript
// Trigger a workflow run:
const run = await platform.call('workflows', 'run', {
  workflowId: 'nightly-cleanup',
  inputs: { dryRun: false },
});
console.log(`Run started: ${run.id}`);
 
// Call a sorted-set operation the CacheProxy doesn't expose:
await platform.call('cache', 'zadd', 'queue', Date.now(), 'job-42');
const due = await platform.call<string[]>('cache', 'zrangebyscore', 'queue', 0, Date.now());

call<T>(adapter, method, ...args) is typed on the return value (you provide the generic). The args are untyped — pass whatever the server-side method expects.
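Since the args are untyped, it can help to wrap frequently used generic calls in a small typed facade so the unchecked strings live in one place. A sketch — `Caller` mirrors the `call<T>` signature described above, and the `'workflows'`/`'run'` method and `WorkflowRun` shape are illustrative assumptions, not a documented contract:

```typescript
// Structural interface matching the generic call<T> described above.
interface Caller {
  call<T>(adapter: string, method: string, ...args: unknown[]): Promise<T>;
}

// Assumed response shape for a workflow run, for illustration only.
interface WorkflowRun {
  id: string;
  status: string;
}

// Typed facade: callers get a WorkflowRun back instead of an untyped value.
// As with cache.get, the generic is an assertion — the server is trusted.
function runWorkflow(
  platform: Caller,
  workflowId: string,
  inputs: Record<string, unknown>,
): Promise<WorkflowRun> {
  return platform.call<WorkflowRun>('workflows', 'run', { workflowId, inputs });
}
```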

See Workflows for the full workflow trigger pattern.

Step 8 — Shutdown cleanly

In long-running processes (servers, daemons), call shutdown() before exit to flush buffered telemetry:

TypeScript
process.on('SIGTERM', async () => {
  console.log('Shutting down, flushing telemetry...');
  await platform.shutdown();
  process.exit(0);
});

Skipping this means the last few telemetry events are lost when the process exits.

Complete example

Putting it all together — a minimal backend handler:

TypeScript
import { KBPlatform } from '@kb-labs/platform-client';
 
const platform = new KBPlatform({
  endpoint: process.env.KB_GATEWAY_URL!,
  apiKey: process.env.KB_API_KEY!,
  defaultTags: { source: 'my-backend', env: process.env.NODE_ENV ?? 'dev' },
  onError: (err) => console.error('[platform-client]', err),
});
 
export async function summarize(text: string, userId: string): Promise<string> {
  // Check cache first
  const cacheKey = `summary:${hashString(text)}`;
  const cached = await platform.cache.get<string>(cacheKey);
  if (cached) {
    platform.telemetry.event('summary.cache.hit', { userId });
    return cached;
  }
 
  // Call LLM
  const start = Date.now();
  const response = await platform.llm.complete(
    `Summarize in one sentence:\n\n${text}`,
    { temperature: 0.3, maxTokens: 100 },
  );
 
  const summary = response.content.trim();
 
  // Cache for an hour
  await platform.cache.set(cacheKey, summary, 3_600_000);
 
  // Emit metrics
  platform.telemetry.metric('summary.duration_ms', Date.now() - start);
  platform.telemetry.event('summary.generated', {
    userId,
    tokens: response.usage.promptTokens + response.usage.completionTokens,
    model: response.model,
  });
 
  return summary;
}
 
// Clean up on exit
process.on('SIGTERM', async () => {
  await platform.shutdown();
  process.exit(0);
});
 
// (imports are hoisted; in a real module this goes at the top of the file)
import { createHash } from 'node:crypto';
 
function hashString(s: string): string {
  // Use a real hash — slicing a base64 encoding of the input would make
  // texts with the same first bytes collide on the same cache key.
  return createHash('sha256').update(s).digest('hex').slice(0, 16);
}

Three calls — one cache read, one LLM call, one cache write — plus background telemetry. 50 lines, no dependencies, works in any Node runtime.