Hi, I'm Ahmed
I design and ship backends for complex systems.
Software Engineer — Backend · Platform · AI Systems.
I prototype in code; I care about latency budgets, reliability, and clean system design.
Experience
5+ years building production web systems
Specializes in
Frontend systems • Design systems • Distributed architecture • AI product UX
Open to
Senior Frontend • Staff Engineer • Platform Engineer roles
Production-grade ML pipeline for real-time fraud detection on financial transactions, with model training, drift monitoring, and a scoring API.
Multi-tenant SaaS orchestrating AI workflows and integrations at scale.
Backend platform powering dynamic site generation and multi-tenant publishing.
Conversational AI backend with RAG pipeline and vector search.
How I work
Prototype in code, not mockups
I reach for a coded prototype before a finished design. It surfaces real constraints earlier.
Latency is a feature
I track render budgets and hydration cost. Perf profiling is part of the design review, not a post-launch task.
Write for the next engineer
Readable code, documented trade-offs, decision logs. Systems outlive their authors.
Projects
High-throughput streaming pipeline for ingesting, transforming, and routing millions of events per second.
Token-bucket and sliding-window rate limiting as a standalone service, supporting multi-region consistency.
Research implementation of eight real-world schema evolution scenarios across a three-service microservices architecture, covering PostgreSQL migrations, REST API versioning, and event schema evolution with backward compatibility patterns.
CDC pipeline that streams database changes into an event log, with support for consumers, replay, and schema evolution, plus a demo consumer that builds projections.
A tool for testing and evaluating RAG retrieval pipelines by comparing chunking strategies, embedding models, and reranking methods using metrics like Precision@K and nDCG.
Production-quality research system comparing six idempotency strategies for a payments API domain, built with FastAPI, PostgreSQL, Redis, and RabbitMQ.
Production-grade research system for evaluating, benchmarking, and mitigating hallucinations in enterprise LLM applications with multiple RAG variants and guardrail frameworks.
Centralized configuration service used across internal systems for managing application settings and feature flags.
Platform simulating microservice failures to evaluate retries, circuit breakers, bulkheads, and idempotency. Measures reliability, latency, and duplicate prevention to guide resilient system design.
Comprehensive empirical study examining how structural design decisions in Terraform infrastructure-as-code affect long-term maintainability, drift susceptibility, and change management complexity.
Production-ready unified API gateway for routing requests across multiple LLM providers with built-in rate limiting, response caching, cost tracking, and OpenTelemetry observability.
Focus
Event-Driven Architecture
Async events, queues, reliable delivery. Designing systems where out-of-order messages and partial failures are expected.
Distributed Systems
Retries, idempotency, circuit breakers, eventual consistency. Building for failure rather than hoping for success.
Multi-Tenant SaaS
Schema-per-tenant isolation, billing metering, usage tracking, scoped workspaces.
AI Infrastructure
Ingestion pipelines, vector search, RAG architectures, model orchestration. Latency and cost budgets matter.
Platform Engineering
Developer tooling, build systems, CI/CD abstractions, internal platforms that reduce toil.
Production Reliability
Observability, SLOs, alerting, incident response. Systems fail — the question is how they fail.
Writing
Technical papers and deep-dives on systems I've built and problems I've solved in production.
Hierarchical Chunking Strategies for Production RAG Systems: Balancing Retrieval Precision and Context Coherence
2024 · Internal Technical Report
Retrieval-Augmented Generation systems degrade in precision as knowledge bases grow. This paper examines chunking strategies — fixed-size, paragraph-level, and hierarchical parent–child — across corpora of varying size and domain density. We introduce a re-ranking layer using cross-encoder models and show it recovers precision lost at scale while remaining compatible with standard vector-search backends. Benchmarks are run against a golden dataset of 2,400 support queries across four enterprise tenants.
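To make the parent-child idea concrete, here is a minimal sketch of hierarchical chunking. It is an illustration under stated assumptions, not the pipeline from the report: the names `Chunk`, `build_hierarchy`, and `retrieve_parent` are hypothetical, paragraphs stand in for parents, fixed word windows stand in for children, and a toy keyword-overlap score stands in for vector search and cross-encoder re-ranking.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    parent_id: Optional[str]  # None marks a parent (paragraph-level) chunk
    text: str


def build_hierarchy(doc: str, child_words: int = 8) -> List[Chunk]:
    """Split a document into parent paragraphs and small child windows."""
    chunks: List[Chunk] = []
    paragraphs = [p.strip() for p in doc.split("\n\n") if p.strip()]
    for p_idx, para in enumerate(paragraphs):
        parent_id = f"p{p_idx}"
        chunks.append(Chunk(parent_id, None, para))
        words = para.split()
        # Children are fixed-size word windows that point back to their parent.
        for c_idx, start in enumerate(range(0, len(words), child_words)):
            child_text = " ".join(words[start:start + child_words])
            chunks.append(Chunk(f"{parent_id}.c{c_idx}", parent_id, child_text))
    return chunks


def retrieve_parent(chunks: List[Chunk], query: str) -> Optional[str]:
    """Match on small child chunks, then return the full parent text."""
    terms = set(query.lower().split())
    children = [c for c in chunks if c.parent_id is not None]
    if not children:
        return None
    # Assumption: keyword overlap is a placeholder for an embedding similarity.
    best = max(children, key=lambda c: len(terms & set(c.text.lower().split())))
    parents = {c.chunk_id: c for c in chunks if c.parent_id is None}
    return parents[best.parent_id].text
```

The design point is that matching happens on small children, where similarity is sharper, while the model receives the larger parent, which keeps the surrounding context coherent.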
Multi-Tenant Event Sourcing at Scale: Schema Isolation, Replay Semantics, and Operational Lessons
2024 · Internal Technical Report
Event sourcing in multi-tenant SaaS systems introduces tension between tenant isolation and operational simplicity. We describe our experience migrating a 30-tenant workflow platform from a shared event log to a namespace-isolated architecture, covering schema-per-tenant trade-offs, aggregate snapshot strategies to bound replay time, and the tooling required to safely replay tenant event streams without cross-tenant interference.
Exactly-Once Delivery in Heterogeneous Sink Pipelines: Lessons from a High-Throughput Kafka Consumer Fleet
2025 · Internal Technical Report
Exactly-once semantics in streaming pipelines are well-studied within a single system but become subtle when events must be durably committed to multiple heterogeneous sinks — analytics stores, billing aggregators, and alerting systems — in a single logical transaction. We detail the rebalance-listener pattern, idempotency key design, and per-sink commit protocols that enabled zero duplicate charges across 40M+ daily events on a Kafka-backed pipeline.
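One way to picture the idempotency-key part of this is the sketch below. The key layout (topic, partition, offset, sink name) and the `BillingSink` class are assumptions made for illustration; the report's actual key design and commit protocols are not described here. An in-memory set stands in for a durable store.

```python
import hashlib


def idempotency_key(topic: str, partition: int, offset: int, sink: str) -> str:
    """Derive a stable key per (event, sink) pair so each sink dedupes on its own."""
    raw = f"{topic}:{partition}:{offset}:{sink}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class BillingSink:
    """Toy sink that applies each charge at most once per idempotency key."""

    def __init__(self) -> None:
        self.applied = set()  # assumption: a durable store in a real system
        self.total_cents = 0

    def apply(self, key: str, amount_cents: int) -> bool:
        # A redelivered event (for example after a consumer rebalance)
        # carries the same key and is skipped instead of charged again.
        if key in self.applied:
            return False
        self.applied.add(key)
        self.total_cents += amount_cents
        return True
```

Because the sink name is part of the key, a retry against one sink does not interfere with the record of what another sink has already committed.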
Clock-Independent Rate Limiting: Eliminating Skew Drift in Distributed Token-Bucket Implementations
2025 · Internal Technical Report
Token-bucket rate limiters that compute refill amounts using client-side timestamps accumulate systematic drift when hosts have clock skew. This paper quantifies the drift under realistic NTP conditions and proposes using authoritative server-side timestamps — specifically Redis server time via Lua scripts — to eliminate client clock dependence entirely. We compare bucket accuracy across five implementations under 50ms and 200ms of injected skew.
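The effect of skew on refill can be sketched in a few lines. This is a simplified, in-process illustration, not the Redis and Lua implementation from the paper: the `TokenBucket` class is hypothetical, and the `now` argument stands in for whichever clock supplies the timestamp (a skewed client clock, or an authoritative server clock such as the one Redis exposes).

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float
    refill_per_sec: float
    tokens: float
    last_refill: float  # timestamp of the last refill, in seconds

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        """Refill based on elapsed time since last_refill, then try to spend."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        # Never move the refill marker backwards if a stale timestamp arrives.
        self.last_refill = max(self.last_refill, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Both buckets are empty and were last refilled at t = 100.0 s.
skewed = TokenBucket(capacity=5, refill_per_sec=10, tokens=0, last_refill=100.0)
authoritative = TokenBucket(capacity=5, refill_per_sec=10, tokens=0, last_refill=100.0)

# No real time has passed, but one client clock runs 200 ms ahead.
granted_with_client_clock = skewed.try_consume(now=100.2)         # True
granted_with_server_clock = authoritative.try_consume(now=100.0)  # False
```

With the client-supplied timestamp, 200 ms of skew is read as elapsed time and credits tokens that were never earned. With a single authoritative clock, every caller computes the same elapsed time, so the skew has no effect on refill.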
About
Former IC / lead at B2B platforms. I work end-to-end: interaction models, design systems, perf profiling.
My work spans event-driven architectures, multi-tenant SaaS products, and AI-powered systems. I care about systems that scale cleanly, fail gracefully, and are a pleasure for teams to operate.
I've worked across the full lifecycle of production systems — initial design through to observability and incident response.
React + TypeScript (SSR, hydration budgets) • Systems design for client data flows • Design systems governance.
Stack
Languages & Runtimes
Node.js + TypeScript (primary), Python for ML pipelines, Go for high-throughput services.
Frontend
React + TypeScript (SSR, hydration budgets), Next.js, design systems governance.
Databases
PostgreSQL as default, Redis for ephemeral state and rate-limiting, vector DBs (Pinecone, Weaviate) for similarity search.
Infrastructure
Docker-first local dev, Kubernetes for orchestration, AWS / GCP for managed services.
AI / ML
OpenAI API integration, LangChain for RAG orchestration, embedding pipelines, hallucination mitigation patterns.
Queues & Observability
BullMQ + RabbitMQ for async work, Kafka for streaming. Prometheus + Grafana for metrics; OpenTelemetry traces.
Contact
Let's talk.
Building something in backend infrastructure, platform engineering, or AI systems? I'd like to hear about it.
I respond within a few days.