Testing Plugins
Last updated April 7, 2026
Unit tests, integration tests, and mocking platform services in plugin code.
Plugins are regular TypeScript packages — they're tested with whatever test runner you already use (Vitest is the convention in the KB Labs monorepo). The SDK ships `@kb-labs/sdk/testing` with helpers for constructing test contexts and mocking platform services, so handler code is testable without booting the full platform.
This guide covers three layers: unit tests for pure logic, unit tests for handler logic with mocked platform services, and integration tests against a real platform instance.
For the full reference of test utilities, see SDK → Testing. This page is the practical workflow.
The test layers
```
┌─────────────────────────────────────────┐
│  Integration tests                      │
│  (real platform, real adapters, slow)   │
├─────────────────────────────────────────┤
│  Handler unit tests                     │
│  (createTestContext + mocks, fast)      │
├─────────────────────────────────────────┤
│  Pure logic unit tests                  │
│  (no platform at all, instant)          │
└─────────────────────────────────────────┘
```

Most of your tests should be at the bottom two layers. Integration tests are slow and brittle — save them for the critical paths.
Setup
Add Vitest to your plugin package:
```json
{
  "devDependencies": {
    "vitest": "^3.0.0",
    "@vitest/ui": "^3.0.0"
  },
  "scripts": {
    "test": "vitest run",
    "test:watch": "vitest",
    "test:ui": "vitest --ui"
  }
}
```

`vitest.config.ts`:
```typescript
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'node',
    globals: false,
    include: ['src/**/*.test.ts'],
  },
});
```

Keep tests next to the code they test: `src/cli/commands/hello.ts` → `src/cli/commands/hello.test.ts`.
Layer 1 — Pure logic tests
If your handler can be decomposed into pure functions, test those directly. No platform, no context, no mocks.
```typescript
// src/lib/parse-scope.ts
export function parseScope(raw: string): string[] {
  return raw.split(',').map(s => s.trim()).filter(Boolean);
}
```

```typescript
// src/lib/parse-scope.test.ts
import { describe, it, expect } from 'vitest';
import { parseScope } from './parse-scope';

describe('parseScope', () => {
  it('splits comma-separated values', () => {
    expect(parseScope('a,b,c')).toEqual(['a', 'b', 'c']);
  });

  it('trims whitespace', () => {
    expect(parseScope('a , b , c')).toEqual(['a', 'b', 'c']);
  });

  it('drops empty entries', () => {
    expect(parseScope('a,,b, ,c')).toEqual(['a', 'b', 'c']);
  });
});
```

Pure tests run in milliseconds and have no side effects. If a function can be tested this way, it should be.
Layer 2 — Handler unit tests
Handlers call platform services through hooks. You can't test them as pure functions because they need a context and a platform singleton. The SDK's `createTestContext` + mock builders swap both out with fakes.
```typescript
// src/cli/commands/hello.test.ts
import { describe, it, expect, afterEach } from 'vitest';
import { testCommand, mockLLM } from '@kb-labs/sdk/testing';
import helloHandler from './hello';

describe('hello:greet', () => {
  let cleanup: () => void;
  afterEach(() => cleanup?.());

  it('returns a canned greeting without --ai', async () => {
    const result = await testCommand(helloHandler, {
      flags: { name: 'Alice' },
    });
    cleanup = result.cleanup;

    expect(result.exitCode).toBe(0);
    expect(result.result).toEqual({
      greeting: 'Hello, Alice!',
      source: 'deterministic',
    });
  });

  it('uses LLM when --ai is passed', async () => {
    const llm = mockLLM()
      .onAnyComplete()
      .respondWith('Hi Alice, great to see you!');

    const result = await testCommand(helloHandler, {
      flags: { name: 'Alice', ai: true },
      platform: { llm },
    });
    cleanup = result.cleanup;

    expect(result.exitCode).toBe(0);
    expect(result.result?.source).toBe('llm');
    expect(result.result?.greeting).toContain('Alice');
    expect(llm.complete).toHaveBeenCalled();
  });

  it('falls back to canned greeting when LLM is unavailable and --ai is passed', async () => {
    // No llm in the platform override — useLLM() returns undefined
    const result = await testCommand(helloHandler, {
      flags: { name: 'Alice', ai: true },
    });
    cleanup = result.cleanup;

    expect(result.exitCode).toBe(0);
    expect(result.result?.source).toBe('deterministic');
  });
});
```

Three things to internalize:

- `testCommand` wraps your handler in a test context and runs it.
- `platform.llm` in the options becomes what `useLLM()` returns. You control the mock; the handler sees it as the real thing.
- `cleanup()` in `afterEach` resets the platform singleton between tests. Without it, state leaks between tests.
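If assigning `cleanup` by hand in every test feels error-prone, one pattern is to collect cleanups centrally and flush them in `afterEach`. This is a sketch of my own, not part of `@kb-labs/sdk/testing` — the helper names are hypothetical:

```typescript
// Hypothetical helper — not part of the SDK. Collects cleanup callbacks
// so a forgotten `result.cleanup` can't leak platform state between tests.
const cleanups: Array<() => void> = [];

// Wrap any result that carries a cleanup function; returns it unchanged.
export function trackCleanup<T extends { cleanup: () => void }>(result: T): T {
  cleanups.push(result.cleanup);
  return result;
}

// Call from afterEach: runs and clears all pending cleanups,
// newest first, so later setups tear down before earlier ones.
export function flushCleanups(): void {
  while (cleanups.length) cleanups.pop()!();
}
```

Register `afterEach(flushCleanups)` once per file and wrap each run, e.g. `const result = trackCleanup(await testCommand(handler, options));`.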
See SDK → Testing for the full API of `testCommand`, `createTestContext`, and every mock builder.
Mocking the cache
```typescript
import { testCommand, mockCache } from '@kb-labs/sdk/testing';

it('caches results', async () => {
  const cache = mockCache();

  const result = await testCommand(handler, {
    flags: { query: 'test' },
    platform: { cache },
  });
  cleanup = result.cleanup;

  // First call should miss and set
  expect(cache.get).toHaveBeenCalledWith('query:test');
  expect(cache.set).toHaveBeenCalledWith(
    'query:test',
    expect.any(Object),
    60_000,
  );

  // Value persists in the mock's in-memory store
  expect(await cache.get('query:test')).toBeDefined();
});
```

`mockCache()` is a real in-memory `ICache` implementation: `get`/`set` work as expected, and the spies let you assert on call counts and arguments.
Mocking storage
```typescript
import { mockStorage } from '@kb-labs/sdk/testing';

it('writes a report', async () => {
  const storage = mockStorage();

  const result = await testCommand(handler, {
    flags: { output: 'reports/summary.md' },
    platform: { storage },
  });
  cleanup = result.cleanup;

  expect(storage.write).toHaveBeenCalled();
  expect(await storage.read('reports/summary.md')).not.toBeNull();
});
```

Mocking tool calls
For handlers that use LLM tool-calling:
```typescript
import { mockLLM, mockTool } from '@kb-labs/sdk/testing';

it('handles tool calls', async () => {
  const searchTool = mockTool('search')
    .withInput({ query: 'hello' })
    .returning({ results: [{ id: 1 }] });

  const llm = mockLLM()
    .onAnyChatWithTools()
    .respondWith({
      content: 'Calling search',
      toolCalls: [{ id: '1', name: 'search', input: { query: 'hello' } }],
    });

  const result = await testCommand(handler, {
    flags: { query: 'hello' },
    platform: { llm },
  });
  cleanup = result.cleanup;

  expect(llm.chatWithTools).toHaveBeenCalled();
});
```

Layer 3 — Integration tests
When you need to test real adapter behavior — actually talking to Qdrant, actually calling OpenAI, actually writing to SQLite — spin up a real platform instance.
Option A — Test workspace with kb.config.json
Create a test-specific workspace:
```
tests/fixtures/workspace/
├── .kb/
│   └── kb.config.json      # test config with in-memory adapters
└── packages/
    └── your-plugin/        # linked via marketplace
```

Test config:
```json
{
  "platform": {
    "adapters": {
      "llm": null,
      "cache": null,
      "storage": "@kb-labs/adapters-fs"
    },
    "adapterOptions": {
      "storage": { "basePath": ".kb/test-storage" }
    },
    "execution": { "mode": "in-process" }
  }
}
```

`null` explicitly installs the NoOp adapter for tokens you don't want to configure. `in-process` keeps everything in the test process so you can assert on internal state without IPC.
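For intuition about what a NoOp adapter does, here is a rough sketch of the behavior for the cache token — hypothetical shape, not the platform's actual class. Every read misses and every write is silently dropped, so handler code that guards on `undefined` keeps working with no backend configured:

```typescript
// Hypothetical NoOp cache — illustrative only, not the platform's code.
export const noopCache = {
  async get(_key: string): Promise<unknown> {
    return undefined; // always a miss
  },
  async set(_key: string, _value: unknown, _ttlMs?: number): Promise<void> {
    // write is dropped
  },
  async delete(_key: string): Promise<void> {
    // nothing was stored, nothing to delete
  },
};
```

This is why `null` is safe in test configs: the plugin's cache-aware code paths still execute, they just always take the cache-miss branch.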
Run your handler against this workspace from your test file:
```typescript
import { createServiceBootstrap } from '@kb-labs/core-runtime';
import { join } from 'path';

beforeAll(async () => {
  await createServiceBootstrap({
    appId: 'test',
    repoRoot: join(__dirname, 'fixtures/workspace'),
  });
});
```

Option B — Live services + HTTP assertions
For higher-fidelity integration tests, run the REST API as a subprocess and hit it over HTTP:
```typescript
import { spawn, type ChildProcess } from 'child_process';

let proc: ChildProcess;

beforeAll(async () => {
  proc = spawn('node', ['path/to/rest-api/dist/index.js'], {
    env: { ...process.env, PORT: '15050' },
  });

  // Wait for /health
  for (let i = 0; i < 30; i++) {
    try {
      await fetch('http://localhost:15050/api/v1/health');
      return;
    } catch {
      await new Promise(r => setTimeout(r, 200));
    }
  }
  throw new Error('REST API did not start');
});

afterAll(() => {
  // Kill the subprocess so the test run doesn't leak it
  proc?.kill();
});

it('calls the plugin over HTTP', async () => {
  const response = await fetch('http://localhost:15050/api/v1/plugins/hello/greet', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Authorization': 'Bearer dev-token' },
    body: JSON.stringify({ name: 'Alice' }),
  });

  expect(response.status).toBe(200);
  const data = await response.json();
  expect(data.greeting).toBeDefined();
});
```

Integration tests are slow (service startup takes seconds) but give you confidence that the full plugin → REST → platform → adapter chain works end-to-end.
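The ad-hoc `/health` polling loop generalizes to a small helper you could reuse across integration suites. A sketch — `waitFor` and its option names are mine, not part of the SDK:

```typescript
// Poll an async condition until it holds or attempts run out.
export async function waitFor(
  check: () => Promise<boolean>,
  { attempts = 30, intervalMs = 200 } = {},
): Promise<void> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await check()) return; // condition met
    } catch {
      // treat errors (e.g. connection refused) as "not ready yet"
    }
    await new Promise(r => setTimeout(r, intervalMs));
  }
  throw new Error(`Condition not met after ${attempts} attempts`);
}
```

With it, the readiness wait becomes `await waitFor(async () => (await fetch('http://localhost:15050/api/v1/health')).ok)`.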
What NOT to test
- Don't test that the SDK works. `defineCommand` has its own tests; you don't need to verify that it wraps your handler correctly.
- Don't test the hooks. `useLLM`/`useCache`/`useStorage` have their own tests. Mock them and assert on your handler's behavior with the mock's return value.
- Don't test Ant Design components. If you're writing Studio page tests, mock the hooks and assert on the data flow, not on the rendered HTML.
- Don't test the host guard. `defineCommand` throws if called from the wrong host — that's tested in the SDK. You don't need to re-verify it.
Running tests
```sh
pnpm test          # one-shot
pnpm test:watch    # watch mode
pnpm test:ui       # Vitest UI
```

In CI:

```yaml
- run: pnpm install --frozen-lockfile
- run: pnpm build
- run: pnpm test
- run: pnpm type-check
- run: pnpm lint
```

Test organization
Structure I recommend for plugin repos:
```
src/
├── cli/
│   └── commands/
│       ├── hello.ts
│       └── hello.test.ts        ← handler unit tests
├── lib/
│   ├── parse-scope.ts
│   └── parse-scope.test.ts      ← pure logic unit tests
└── rest/
    └── handlers/
        ├── greet.ts
        └── greet.test.ts        ← REST handler unit tests
tests/
├── integration/
│   ├── fixtures/
│   │   └── workspace/
│   └── greet.test.ts            ← integration tests
└── e2e/
    └── smoke.test.ts            ← optional smoke tests against real env
```

Unit tests live next to the code. Integration tests live in a separate directory because they share fixtures and have a different lifecycle (longer startup, network deps).
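One way to wire this split into the runner is a Vitest workspace file, so unit tests stay the fast default and integration tests run only when asked. A sketch — the project names are arbitrary, and newer Vitest versions also support the same split via `test.projects` in a single config:

```typescript
// vitest.workspace.ts — one possible unit/integration split (sketch)
import { defineWorkspace } from 'vitest/config';

export default defineWorkspace([
  {
    test: {
      name: 'unit',
      environment: 'node',
      include: ['src/**/*.test.ts'],
    },
  },
  {
    test: {
      name: 'integration',
      environment: 'node',
      include: ['tests/integration/**/*.test.ts'],
      testTimeout: 60_000, // service startup is slow
    },
  },
]);
```

Then `vitest --project unit` runs just the fast layer locally, while CI can run both projects.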
What to read next
- SDK → Testing — complete API reference for `createTestContext`, `testCommand`, and every mock builder.
- SDK → Handler Context — the shape of `ctx` that `createTestContext` constructs.
- SDK → Hooks — the hooks that the mock builders swap out.