schema-stream

v3.2.2

Published

5 months ago

<div align="center"> <h1>schema-stream</h1> </div> <br />

Downloads

63,736

0High
0Medium
0Low

dimitrikennedy

schema-stream is a foundational streaming JSON parser that enables immediate data access through structured stubs. Built on Zod schema validation, it provides type-safe parsing and progressive data access for JSON streams.

Key Features

🔄 Stream JSON data with immediate access to partial results
🔑 Path completion tracking for complex objects
📝 Default value support via schema or explicit defaults
🌳 Deep nested object and array support
⚡ Zero dependencies except Zod
🔍 TypeScript types inferred from schema

Installation

# npm
npm install schema-stream zod

# pnpm
pnpm add schema-stream zod

# bun
bun add schema-stream zod

Basic Usage

import { SchemaStream } from 'schema-stream';
import { z } from 'zod';

// Define your schema
const schema = z.object({
  users: z.array(z.object({
    name: z.string(),
    age: z.number()
  })),
  metadata: z.object({
    total: z.number(),
    page: z.number()
  })
});

// Create parser with optional defaults
const parser = new SchemaStream(schema, {
  metadata: { total: 0, page: 1 }
});

// Track completion paths
parser.onKeyComplete(({ completedPaths }) => {
  console.log('Completed:', completedPaths);
});

// Parse streaming data
const stream = parser.parse();
response.body.pipeThrough(stream);

// Read results with full type inference
const reader = stream.readable.getReader();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  
  const result = JSON.parse(decoder.decode(value));
  // result is fully typed as z.infer<typeof schema>
  console.log(result);
}

Validation and Error Handling

schema-stream focuses solely on streaming JSON parsing with type stubs - it intentionally does not perform full Zod schema validation during parsing. This design choice enables:

Faster parsing without validation overhead
Immediate access to partial data
Flexibility for downstream validation

For full schema validation and error handling, consider using:

zod-stream: Adds validation, OpenAI integration, and structured error handling
instructor: Complete solution for validated LLM extraction

Example of schema-stream being used by zod-stream:

const streamParser = new SchemaStream(response_model.schema, {
  typeDefaults: {
    string: null,
    number: null,
    boolean: null
  },
  onKeyComplete: ({ activePath, completedPaths }) => {
    _activePath = activePath;
    _completedPaths = completedPaths;
  }
});

// Create parser with validation stream
const parser = streamParser.parse({
  handleUnescapedNewLines: true
});

// Add validation in transform stream
const validationStream = new TransformStream({
  transform: async (chunk, controller) => {
    try {
      const parsedChunk = JSON.parse(decoder.decode(chunk));
      const validation = await schema.safeParseAsync(parsedChunk);
      
      controller.enqueue(encoder.encode(JSON.stringify({
        ...parsedChunk,
        _meta: {
          _isValid: validation.success,
          _activePath,
          _completedPaths
        }
      })));
    } catch (e) {
      controller.error(e);
    }
  }
});

// Chain streams
stream
  .pipeThrough(parser)
  .pipeThrough(validationStream);

Real-World Examples

Progressive UI Updates

const schema = z.object({
  analysis: z.object({
    sentiment: z.string(),
    keywords: z.array(z.string()),
    summary: z.string()
  }),
  metadata: z.object({
    processedAt: z.string(),
    wordCount: z.number()
  })
});

const parser = new SchemaStream(schema, {
  // Show loading states initially
  defaultData: {
    analysis: {
      sentiment: "analyzing...",
      keywords: ["loading..."],
      summary: "generating summary..."
    }
  },
  onKeyComplete({ activePath, completedPaths }) {
    // Update UI loading states based on completion
    updateLoadingStates(activePath, completedPaths);
  }
});

Nested Data Processing

const schema = z.object({
  users: z.array(z.object({
    id: z.string(),
    profile: z.object({
      name: z.string(),
      email: z.string(),
      preferences: z.object({
        theme: z.string(),
        notifications: z.boolean()
      })
    }),
    activity: z.array(z.object({
      timestamp: z.string(),
      action: z.string()
    }))
  }))
});

const parser = new SchemaStream(schema);

// Track specific paths for business logic
parser.onKeyComplete(({ activePath, completedPaths }) => {
  const path = activePath.join('.');
  
  // Process user profiles as they complete
  if (path.match(/users\.\d+\.profile$/)) {
    processUserProfile(/* ... */);
  }
  
  // Process activity logs in batches
  if (path.match(/users\.\d+\.activity\.\d+$/)) {
    batchActivityLog(/* ... */);
  }
});

API Reference

`SchemaStream`

class SchemaStream<T extends ZodObject<any>> {
  constructor(
    schema: T,
    options?: {
      defaultData?: NestedObject;
      typeDefaults?: {
        string?: string | null | undefined;
        number?: number | null | undefined;
        boolean?: boolean | null | undefined;
      };
      onKeyComplete?: (info: {
        activePath: (string | number | undefined)[];
        completedPaths: (string | number | undefined)[][];
      }) => void;
    }
  )

  // Create a stub instance of the schema with defaults
  getSchemaStub<T extends ZodRawShape>(
    schema: SchemaType<T>, 
    defaultData?: NestedObject
  ): z.infer<typeof schema>;

  // Parse streaming JSON data
  parse(options?: {
    stringBufferSize?: number;
    handleUnescapedNewLines?: boolean;
  }): TransformStream;
}

Constructor Options

schema: Zod schema defining the structure of your data
options:
- defaultData: Initial values for schema properties
- typeDefaults: Default values for primitive types
- onKeyComplete: Callback for tracking parsing progress

Working with Defaults

There are two ways to provide default values in schema-stream:

1. Via Zod Schema

const schema = z.object({
  // Default via schema
  count: z.number().default(0),
  status: z.string().default('pending'),
  settings: z.object({
    enabled: z.boolean().default(true)
  }),
  tags: z.array(z.string()).default(['default'])
});

const parser = new SchemaStream(schema);

2. Via Constructor Options

// Global type defaults
const parser = new SchemaStream(schema, {
  typeDefaults: {
    string: "",      // Default for all strings
    number: 0,       // Default for all numbers
    boolean: false   // Default for all booleans
  }
});

// Specific property defaults
const parser = new SchemaStream(schema, {
  defaultData: {
    count: 100,
    status: 'ready',
    settings: {
      enabled: true
    }
  }
});

Priority order:

Explicit defaultData values
Zod schema defaults
Global typeDefaults
null (if no other default is found)

Completion Tracking

Track the progress of parsing with path information:

const parser = new SchemaStream(schema, {
  onKeyComplete({ activePath, completedPaths }) {
    // activePath: Current path being processed
    // completedPaths: Array of all completed paths
    console.log('Currently parsing:', activePath);
    console.log('Completed paths:', completedPaths);
  }
});

Parse Options

stringBufferSize: Size of the buffer for string values (default: 0)
handleUnescapedNewLines: Handle unescaped newlines in JSON (default: true)

Schema Stub Utility

Create a typed stub of your schema with defaults:

const schema = z.object({
  users: z.array(z.object({
    name: z.string(),
    age: z.number()
  }))
});

const parser = new SchemaStream(schema);
const stub = parser.getSchemaStub(schema, {
  users: [{ name: "default", age: 0 }]
});
// stub is fully typed as z.infer<typeof schema>

Integration with Island AI

schema-stream is designed as a foundational package that other tools build upon:

zod-stream: Adds validation and OpenAI integration

// Example of zod-stream using schema-stream
const zodStream = new ZodStream();
const extraction = await zodStream.create({
  completionPromise: stream,
  response_model: { 
    schema: yourSchema,
    name: "Extract" 
  }
});

instructor: High-level extraction

const client = Instructor({
  client: oai,
  mode: "TOOLS"
});
  
const result = await client.chat.completions.create({
  response_model: { schema: yourSchema }
  // ...
});

stream-hooks: React hooks for JSON streams
llm-polyglot: Universal LLM client
evalz: LLM output evaluation

Contributing

We welcome contributions! Check out:

Credits: Internal JSON parser logic adapted from streamparser-json.