Afzal Zubair

Posts tagged "llms"

4 articles

The Hidden Cost of Context Windows: Managing Tokens in Production
AI, LLMs, Python

128k tokens sounds like infinite space until you're paying $0.40 per conversation and users are hitting limits mid-session. Here's how I actually manage context in long-running AI applications.

March 18, 2026 · 5 min read

Prompts Are Code: How I Manage Them Like a Senior Engineer
AI, LLMs, Prompt Engineering

A prompt buried in a string literal is a bug waiting to happen. Here's how I version, test, and deploy prompts with the same rigour I'd apply to any production code.

February 10, 2026 · 5 min read

RAG is Not Magic: Honest Lessons from Production Retrieval Systems
AI, LLMs, RAG

Every RAG demo looks impressive. Production RAG is a different story. Here's what actually breaks, why naive chunking destroys quality, and how I structure retrieval pipelines that hold up under real load.

January 28, 2026 · 6 min read

Getting Started with AI and LLMs in Your Web App
AI, LLMs, Next.js

Learn how to integrate large language models into your Next.js application using the Vercel AI SDK, with streaming responses and a clean API design.

December 15, 2025 · 3 min read

AI & full-stack engineering. Thoughts on LLMs, voice AI, and modern web development.

© 2026 Afzal Zubair. Built with Next.js & Tailwind CSS.