TypeScript Patterns I Actually Use in Production AI Apps

AI SDK types are a mess if you let them be. Here are the TypeScript patterns I've settled on for handling streaming responses, tool calls, and structured LLM output without losing type safety.

Afzal Zubair · March 8, 2026 · 6 min read

TypeScript and AI SDKs have an awkward relationship. The SDKs give you types, but the actual structure of what an LLM returns — especially with tool calls, streaming chunks, and structured output — can feel like you're fighting the type system more than it's helping you.

After shipping several production AI features, I've landed on a set of patterns that keep things sane. None of these are revolutionary. They're just the decisions I had to make and now don't think about anymore.

Type Your LLM Outputs at the Boundary

The single most important habit: define what you expect the LLM to return before you write any prompt code, and validate it at the boundary.

Don't do this:

const result = await llm.complete(prompt)
const data = JSON.parse(result) // `any` all the way down

Do this:

import { z } from 'zod'
 
const ClinicalNoteSchema = z.object({
  subjective: z.string(),
  objective: z.string(),
  assessment: z.string(),
  plan: z.string(),
  confidence: z.enum(['high', 'medium', 'low']),
})
 
type ClinicalNote = z.infer<typeof ClinicalNoteSchema>
 
async function generateNote(transcript: string): Promise<ClinicalNote> {
  const raw = await llm.complete(buildPrompt(transcript))
 
  // Parse and validate — throws if the LLM returned something unexpected
  return ClinicalNoteSchema.parse(JSON.parse(raw))
}

The schema is your contract with the model. When the model breaks it (and it will, occasionally), you find out immediately at the parse step instead of three function calls later when something explodes trying to access .confidence.toUpperCase() on undefined.

If you're using the Vercel AI SDK or Anthropic's SDK, use generateObject with a Zod schema directly — it handles the JSON coercion and retries for you:

import { generateObject } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
 
const { object } = await generateObject({
  model: anthropic('claude-3-5-sonnet-20241022'),
  schema: ClinicalNoteSchema,
  prompt: buildPrompt(transcript),
})
 
// `object` is fully typed as ClinicalNote

Discriminated Unions for Streaming State

Streaming responses have four meaningful states: idle, streaming (partial content), complete, and error. Most codebases represent these as two or three separate booleans, which leads to impossible states like isLoading: true, isComplete: true.

A discriminated union makes the states explicit and exhaustive:

type StreamState<T> =
  | { status: 'idle' }
  | { status: 'streaming'; partial: string }
  | { status: 'complete'; data: T }
  | { status: 'error'; error: Error }
 
// In a React component
function useStream<T>(schema: z.ZodSchema<T>) {
  const [state, setState] = useState<StreamState<T>>({ status: 'idle' })
 
  const run = async (prompt: string) => {
    setState({ status: 'streaming', partial: '' })
    try {
      let accumulated = ''
      for await (const chunk of streamCompletion(prompt)) {
        accumulated += chunk
        setState({ status: 'streaming', partial: accumulated })
      }
      const parsed = schema.parse(JSON.parse(accumulated))
      setState({ status: 'complete', data: parsed })
    } catch (error) {
      setState({
        status: 'error',
        error: error instanceof Error ? error : new Error(String(error)),
      })
    }
  }
 
  return { state, run }
}

Now in the component, the switch is exhaustive and TypeScript will tell you if you forget a case:

function NoteDisplay({ state }: { state: StreamState<ClinicalNote> }) {
  switch (state.status) {
    case 'idle':
      return <StartButton />
    case 'streaming':
      return <PartialText text={state.partial} />
    case 'complete':
      return <NoteCard note={state.data} />   // `state.data` is typed as ClinicalNote
    case 'error':
      return <ErrorMessage error={state.error} />
  }
}
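The exhaustiveness check above works because every branch returns from a function with a declared return type. In functions where TypeScript can't infer that guarantee, a `never` helper makes it explicit. A minimal sketch — `assertNever` and `statusLabel` are illustrative names, not part of any SDK:

```typescript
type StreamStatus = 'idle' | 'streaming' | 'complete' | 'error'

// If a new status is ever added to the union, the `never` parameter here
// stops compiling at every switch that forgot to handle it.
function assertNever(value: never): never {
  throw new Error(`Unhandled case: ${String(value)}`)
}

function statusLabel(status: StreamStatus): string {
  switch (status) {
    case 'idle':
      return 'Ready'
    case 'streaming':
      return 'Generating…'
    case 'complete':
      return 'Done'
    case 'error':
      return 'Failed'
    default:
      return assertNever(status)
  }
}
```

The `default` branch is unreachable today, which is exactly the point: it only becomes reachable (and a compile error) when the union grows.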

Type-Safe Tool Definitions

Tool calls are where types get murky fast. The LLM decides which tool to call and with what arguments — you need to validate those arguments the same way you'd validate user input.

import { z } from 'zod'
import { tool } from 'ai'
 
// Define tools with their input schemas
const tools = {
  searchDocuments: tool({
    description: 'Search the knowledge base for relevant documents',
    parameters: z.object({
      query: z.string().describe('The search query'),
      maxResults: z.number().int().min(1).max(10).default(5),
      filters: z.object({
        tags: z.array(z.string()).optional(),
        dateAfter: z.string().datetime().optional(),
      }).optional(),
    }),
    execute: async ({ query, maxResults, filters }) => {
      // Arguments are fully typed here — no casting needed
      return searchKnowledgeBase(query, { maxResults, filters })
    },
  }),
 
  createNote: tool({
    description: 'Save a structured note to the database',
    parameters: ClinicalNoteSchema,
    execute: async (note) => {
      // `note` is typed as ClinicalNote
      return db.notes.create(note)
    },
  }),
}

This approach keeps the input validation co-located with the tool definition, and TypeScript infers the execute argument types from the schema. No manual casting anywhere.
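A related trick that needs no SDK at all: derive the tool-name union from the handler map itself, so adding a tool automatically updates every type that mentions tool names. A dependency-free sketch — `handlers` and `isToolName` are illustrative names:

```typescript
// Each handler keeps its input type inline; the map is the single source of truth.
const handlers = {
  searchDocuments: (input: { query: string }) => `results for ${input.query}`,
  createNote: (input: { plan: string }) => `saved note: ${input.plan}`,
}

// 'searchDocuments' | 'createNote' — derived, never written by hand
type ToolName = keyof typeof handlers

// Narrow an untrusted string (e.g. a tool name the model proposed) to the union
function isToolName(name: string): name is ToolName {
  return name in handlers
}
```

When the model proposes a tool by name, `isToolName` narrows the string before dispatch, the same way the parameter schemas narrow the arguments.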

A Generic Retry Wrapper for Flaky LLM Calls

LLM APIs fail. Rate limits, timeouts, occasional 500s — in production you need retries. This is the wrapper I copy into every project:

interface RetryOptions {
  maxAttempts?: number
  initialDelayMs?: number
  shouldRetry?: (error: unknown) => boolean
}
 
async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {}
): Promise<T> {
  const {
    maxAttempts = 3,
    initialDelayMs = 500,
    shouldRetry = () => true,
  } = options
 
  let lastError: unknown
 
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn()
    } catch (error) {
      lastError = error
      if (attempt === maxAttempts - 1 || !shouldRetry(error)) throw error
 
      const delay = initialDelayMs * Math.pow(2, attempt) // Exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
 
  throw lastError
}
 
// Usage
const note = await withRetry(
  () => generateNote(transcript),
  {
    maxAttempts: 3,
    shouldRetry: (error) => {
      // Retry rate limits and server errors, not invalid inputs
      if (error instanceof APIError) return error.status === 429 || error.status >= 500
      return false
    },
  }
)
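`APIError` above is whatever error class your SDK throws, and not every failure will be an instance of it. A structural fallback keeps `shouldRetry` honest when errors arrive as unknown shapes — the `{ status: number }` shape checked here is an assumption to adapt to your client:

```typescript
// Pull an HTTP status out of an unknown error, if one exists.
function getHttpStatus(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'status' in error) {
    const status = (error as { status: unknown }).status
    if (typeof status === 'number') return status
  }
  return undefined
}

// Retry rate limits (429) and server errors (5xx); give up on everything else.
function isRetryable(error: unknown): boolean {
  const status = getHttpStatus(error)
  return status === 429 || (status !== undefined && status >= 500)
}
```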

Keep AI Logic Out of Components

This one is more architecture than TypeScript, but TypeScript enforces it nicely. AI calls shouldn't live in React components — they should live in service functions with clear input and output types. The component just calls the service and renders state.

// lib/ai/clinical-notes.ts
export async function generateClinicalNote(
  transcript: string,
  options: { model?: string; temperature?: number } = {}
): Promise<ClinicalNote> {
  // All the LLM wiring, retry logic, and validation lives here
}
 
// app/notes/page.tsx
function NotesPage({ transcript }: { transcript: string }) {
  const { state, run } = useStream(ClinicalNoteSchema)
 
  return (
    <>
      <button onClick={() => run(transcript)}>Generate Note</button>
      <NoteDisplay state={state} />
    </>
  )
}

When you need to swap models, change prompts, or add caching, you do it in one place. The component stays dumb and the service stays testable.
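"Testable" is concrete here: if the service takes its completion function as a parameter, tests can exercise the validation path without ever touching an API. A sketch under that assumption — `Complete`, `summarize`, and `stubComplete` are hypothetical names, not from the AI SDK:

```typescript
// The injected dependency: anything that maps a prompt to a completion.
type Complete = (prompt: string) => Promise<string>

async function summarize(transcript: string, complete: Complete): Promise<string> {
  const raw = await complete(`Summarize: ${transcript}`)
  const trimmed = raw.trim()
  if (trimmed.length === 0) throw new Error('Model returned an empty completion')
  return trimmed
}

// In tests, `complete` is a stub; in production, it wraps the real client.
const stubComplete: Complete = async () => '  Patient is stable.  '
```

The production wiring passes a closure over the real SDK call; the unit tests pass stubs that return canned strings, empty strings, or malformed JSON to hit every branch.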

None of these patterns are surprising if you've written TypeScript for a while. The shift is applying the same rigour you'd give to a payment service or auth system to LLM code — which historically gets treated as "just a string in, string out" until it becomes a maintenance problem.
