RAG-Based Knowledge Retrieval System
A scalable Retrieval-Augmented Generation system indexing 100K+ embeddings, built with LangChain, FastAPI, and Weaviate. Retrieval accuracy tuned through chunking strategies and context-window management.
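A minimal sketch of the two levers mentioned above, chunking and context-window management. Function names, sizes, and the character-based budget are illustrative stand-ins for the production token-level logic, not the actual implementation:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap,
    so a retrieval hit keeps surrounding context across chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
    return chunks

def fit_context_window(ranked_chunks: list[str], budget_chars: int) -> list[str]:
    """Greedily pack top-ranked chunks into a fixed budget: a simplified
    stand-in for token-counted context-window management."""
    selected, used = [], 0
    for chunk in ranked_chunks:
        if used + len(chunk) > budget_chars:
            break
        selected.append(chunk)
        used += len(chunk)
    return selected

# Example: 1200 chars -> three chunks, consecutive chunks share 50 chars.
text = "".join(chr(65 + i % 26) for i in range(1200))
chunks = chunk_text(text)
context = fit_context_window(chunks, budget_chars=1000)
```

The overlap is what keeps a sentence split across a boundary recoverable from either chunk; the budget cap is what stops a generous retriever from blowing past the model's context window.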
If it's not in prod, it doesn't exist.
Latency is the silent killer of user trust.
RAG is easy. Good RAG is hard.
Context engineering > prompt engineering.
Agents are just while loops with ambition.
Trace everything. Assume nothing.
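The "while loops with ambition" line is nearly literal. Stripped of any framework, an agent is a bounded loop that asks a model for the next action, runs it as a tool call, and feeds the result back. A sketch with a scripted stand-in for the LLM (all names here are hypothetical):

```python
def run_agent(goal: str, model, tools: dict, max_steps: int = 5):
    """Minimal agent loop: model proposes an action, we execute it,
    append the observation to history, and repeat until 'finish'."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):          # ambition, but bounded
        action, arg = model(history)
        if action == "finish":
            return arg
        observation = tools[action](arg)
        history.append(f"{action}({arg!r}) -> {observation!r}")  # trace everything
    return None                         # step budget exhausted

# Scripted "model": a deterministic policy standing in for an LLM call.
def scripted_model(history):
    if len(history) == 1:
        return "search", "hybrid search"
    return "finish", "answer based on: " + history[-1]

tools = {"search": lambda q: f"3 docs about {q}"}
result = run_agent("explain hybrid search", scripted_model, tools)
```

Everything a real framework adds, retries, structured tool schemas, tracing, sits on top of this loop.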

GenAI Engineer specializing in LLM systems, RAG pipelines, multi-agent orchestration, and production-grade AI infrastructure on AWS and GCP.
Performance metrics from production AI systems
From prototype to production-grade infrastructure
Deep dives into products I've shipped from 0 to 1
Multi-agent workflow orchestration using LangGraph and LangChain for complex task execution. Integrates LLMs with external systems via APIs, messaging, and automation platforms.
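The core pattern behind this kind of orchestration is a state graph: agents are nodes that transform shared state, and edges decide who runs next. A framework-free sketch of that idea (not LangGraph's API; every name below is illustrative):

```python
from typing import Callable

State = dict

def build_graph(nodes: dict[str, Callable[[State], State]],
                edges: dict[str, str],
                entry: str) -> Callable[[State], State]:
    """Tiny state-graph runner: each node updates the shared state,
    then control follows the edge map until it reaches 'END'."""
    def run(state: State) -> State:
        current = entry
        while current != "END":
            state = nodes[current](state)
            current = edges[current]
        return state
    return run

# Hypothetical agents cooperating through one state dict.
def planner(state):   return {**state, "plan": ["retrieve", "draft"]}
def retriever(state): return {**state, "docs": ["doc1", "doc2"]}
def writer(state):    return {**state, "answer": f"summary of {len(state['docs'])} docs"}

workflow = build_graph(
    nodes={"plan": planner, "retrieve": retriever, "write": writer},
    edges={"plan": "retrieve", "retrieve": "write", "write": "END"},
    entry="plan",
)
result = workflow({"task": "answer a question"})
```

Real graphs add conditional edges (route on the state) and cycles (agent loops), but the shared-state-plus-edge-map shape is the same.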
End-to-end document processing system using AWS Lambda, SQS, S3, and OCR engines. Implements pipelines for document ingestion, classification, validation, and structured data extraction.
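The stage chain here (ingest, classify, validate, extract) can be simulated locally with a plain queue standing in for SQS and a drain loop standing in for the Lambda consumer. The classifiers and extractors below are keyword stand-ins for the real OCR/model stages; all names are illustrative:

```python
import queue

def classify(doc: dict) -> str:
    """Document-type classification; a keyword check stands in for a model."""
    return "invoice" if "invoice" in doc["text"].lower() else "other"

def validate(doc: dict) -> bool:
    """Reject documents the OCR stage left empty."""
    return bool(doc["text"].strip())

def extract(doc: dict) -> dict:
    """Structured-field extraction stand-in."""
    return {"id": doc["id"], "type": doc["type"], "chars": len(doc["text"])}

def process_queue(q: "queue.Queue") -> list[dict]:
    """Drain the queue Lambda-style: each message is one document
    flowing through classify -> validate -> extract."""
    records = []
    while not q.empty():
        doc = q.get()            # in production: an SQS message pointing at S3
        doc["type"] = classify(doc)
        if not validate(doc):
            continue             # in production: route to a dead-letter queue
        records.append(extract(doc))
    return records

q = queue.Queue()
q.put({"id": "a1", "text": "Invoice #42 total $99"})
q.put({"id": "a2", "text": "   "})   # failed OCR -> dropped by validation
records = process_queue(q)
```

Decoupling stages behind a queue is what lets ingestion spikes buffer instead of dropping documents, and lets each stage scale and fail independently.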
Building production AI systems from day one
Xperion (Remote)
Texagon
Product leadership capabilities I bring to every team
Tools and platforms I use to ship AI products
Looking to build production AI systems or need a GenAI engineer? Let's talk architecture.
Trusted By
Companies I've helped build and scale products
Wall of Love
Feedback from colleagues and collaborators