Principal Engineer · AI-Accelerated Systems

I modernize the systems businesses run on — at AI speed.

Hi, I'm Jeremy — a Principal Engineer pairing 25 years of systems architecture with applied ML and AI-accelerated delivery: production RAG systems, prompt engineering, and high-velocity modernization for Amazon, Starbucks, T-Mobile, and F5.

25 years shipping
12+ ML models benchmarked
6 mo → 1 wk deploy readiness
Jeremy Veleber
Available now

Where would you like to start?

Four quick doors in — take whichever fits why you're here.

About

A builder who's spent more than two decades making complex systems calmer.

I move fluidly between Python, Java/Spring Boot, C#, Node.js, and Go — choosing the right tool for the problem instead of forcing every solution into one stack. That flexibility has paid off across e-commerce, telecom, and cloud infrastructure.

My sweet spot is modernization: refactoring legacy monoliths into microservices, migrating on-prem systems to cloud-native architectures, and weaving in modern observability and deployment practices — without the downtime that usually comes with it.

These days I lead with applied ML and AI-accelerated engineering — building production RAG systems with ChromaDB, sentence-transformers, and MMR-tuned retrieval; engineering prompts that lift output quality 20–60%; and using AI to translate and migrate legacy codebases with 100% data parity. Old-school rigor, new-school speed.

Selected work

Things I've built recently.

A mix of production platforms and developer tooling. Open any card for the full case study.

Production RAG · Open source

claude-code-search

A production RAG system that lets Claude Code retrieve exactly the right code with minimal token usage.

Python · ChromaDB · sentence-transformers · MMR · RAG
Problem

Claude Code burns tokens reading whole codebases to find where functionality lives — broad scans with no guarantees.

Solution

A ChromaDB vector store chunked by class and function, with retrieval tuned by Maximal Marginal Relevance (MMR) to balance semantic relevance against diversity — so Claude never reads redundant or irrelevant code. A file watcher keeps the index live on every change.
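
To make the relevance-versus-diversity trade-off concrete, here is a minimal MMR sketch in plain Python. It is an illustration under stated assumptions, not the project's actual code: the function names, the `lambda_mult` parameter, and the toy two-dimensional embeddings are all hypothetical, and a real index would pull vectors from ChromaDB rather than a dict.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, candidates, k=3, lambda_mult=0.5):
    """Pick up to k chunk ids by Maximal Marginal Relevance.

    candidates maps a (hypothetical) chunk id to its embedding.
    lambda_mult=1.0 ranks purely by relevance; lower values penalize
    chunks that resemble ones already selected.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:

        def score(chunk_id):
            relevance = cosine(query_vec, remaining[chunk_id])
            # Similarity to the closest already-selected chunk (0 if none yet).
            redundancy = max(
                (cosine(remaining[chunk_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With a near-duplicate pair of chunks in the candidate set, a low `lambda_mult` skips the duplicate in favor of a less similar chunk, while `lambda_mult=1.0` falls back to plain relevance ranking.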

Stack

ChromaDB persistent index, sentence-transformers embeddings benchmarked across 12+ HuggingFace models (semantic accuracy, inference speed, memory footprint), Python, CLI/API integration for Claude Code workflows.

Outcome

Open source with an active developer user base, and a major win in token efficiency. Evaluating LangChain for agent orchestration and self-learning indexes for future iterations.

Open source · GitHub

api-gateway-hub

A Backend-for-Frontend API aggregator with Redis caching and resilient retry logic.

Python · FastAPI · Redis · PostgreSQL
Problem

Frontends calling many external APIs create latency bottlenecks and expose keys — with no caching, retries, or rate-limit protection.

Solution

A BFF service aggregating three external APIs with intelligent Redis caching, exponential-backoff retry, and stale-cache fallback on failure.
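
The interplay of cache-aside reads, exponential-backoff retry, and stale fallback can be sketched roughly as follows. This is a simplified illustration, not the service's implementation: an in-memory dict stands in for Redis, hand-written retry stands in for tenacity, and the class and function names, TTL values, and injectable `clock` and `sleep` parameters are all assumptions made so the example runs on the standard library alone.

```python
import time


class StaleFallbackCache:
    """In-memory stand-in for Redis with a fresh TTL and a longer stale TTL."""

    def __init__(self, fresh_ttl=60.0, stale_ttl=3600.0, clock=time.monotonic):
        self._store = {}  # key -> (value, stored_at)
        self.fresh_ttl = fresh_ttl
        self.stale_ttl = stale_ttl
        self._clock = clock

    def get(self, key, allow_stale=False):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        age = self._clock() - stored_at
        limit = self.stale_ttl if allow_stale else self.fresh_ttl
        return value if age <= limit else None

    def set(self, key, value):
        self._store[key] = (value, self._clock())


def fetch_with_retry(fetch, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call fetch(), doubling the wait after each failure; re-raise on the last."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def get_cached(cache, key, fetch, **retry_kwargs):
    """Cache-aside read: fresh hit, else upstream with retry, else stale copy."""
    fresh = cache.get(key)
    if fresh is not None:
        return fresh
    try:
        value = fetch_with_retry(fetch, **retry_kwargs)
    except Exception:
        stale = cache.get(key, allow_stale=True)
        if stale is not None:
            return stale  # upstream is down; serve the last known value
        raise
    cache.set(key, value)
    return value
```

Keeping a separate, longer stale window means an upstream outage degrades to slightly old data instead of an error, as long as the key was fetched at least once within that window.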

Stack

FastAPI 0.115 async, Redis 7.2 with tiered TTLs, httpx + tenacity for retries, PostgreSQL 16 for request logging.

Outcome

One unified endpoint, a 95%+ cache hit rate via cache-aside with stale fallback, and 84% test coverage using testcontainers.

Open source · GitHub

authkit

A production-ready authentication microservice with JWT and OAuth2 social login.

TypeScript · Node.js · PostgreSQL · OAuth2
Problem

Hand-rolled auth invites vulnerabilities; managed services like Auth0 add cost and vendor lock-in.

Solution

A self-hosted service with a dual-token pattern (15-min JWT + 7-day refresh), OAuth2 social login via Passport.js, and rate-limited endpoints — a drop-in Auth0/Clerk replacement.
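
As a rough picture of the dual-token pattern, the sketch below issues a 15-minute signed access token alongside a 7-day refresh token. It is written in Python with the standard library purely for illustration; authkit itself is TypeScript, and everything here is an assumption rather than its actual design: the `TokenService` name, the opaque server-side refresh token, the rotate-on-use behavior, and the in-memory dict standing in for a PostgreSQL table.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

ACCESS_TTL = 15 * 60          # 15-minute access token
REFRESH_TTL = 7 * 24 * 3600   # 7-day refresh token


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_jwt(payload: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT from a payload dict."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_jwt(token: str, secret: bytes, now: int) -> dict:
    """Check the signature and expiry; return the payload or raise ValueError."""
    signing_input, _, sig = token.rpartition(".")
    expected = _b64url(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        raise ValueError("bad signature")
    payload_b64 = signing_input.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload


class TokenService:
    def __init__(self, secret: bytes, clock=time.time):
        self._secret = secret
        self._clock = clock
        self._refresh = {}  # refresh token -> (user_id, expires_at)

    def issue(self, user_id: str):
        now = int(self._clock())
        access = sign_jwt({"sub": user_id, "iat": now, "exp": now + ACCESS_TTL}, self._secret)
        refresh = secrets.token_urlsafe(32)  # opaque, stored server-side
        self._refresh[refresh] = (user_id, now + REFRESH_TTL)
        return access, refresh

    def verify_access(self, token: str) -> str:
        return verify_jwt(token, self._secret, int(self._clock()))["sub"]

    def rotate(self, refresh: str):
        """Exchange a refresh token for a new pair; the old one is single-use."""
        user_id, expires_at = self._refresh.pop(refresh, (None, 0))
        if user_id is None or expires_at <= self._clock():
            raise ValueError("invalid or expired refresh token")
        return self.issue(user_id)
```

The short access lifetime limits how long a leaked token is useful, while the longer-lived refresh token spares users from logging in every 15 minutes.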

Stack

Node 20 + Express 5 in strict-mode TypeScript, bcrypt + Zod validation, express-rate-limit, PostgreSQL 16 with cascade integrity.

Outcome

100+ tests and over 70% coverage using real-database testcontainers, Docker-Compose ready, and no auth vendor costs.

Open source · GitHub

eventstream-api

A real-time event aggregator streaming GitHub, HackerNews, and Reddit via Server-Sent Events.

Python · FastAPI · PostgreSQL · SSE
Problem

Real-time dashboards mean polling many APIs, normalizing different schemas, and pushing updates — direct client polling is slow and duplicative.

Solution

Background polling of three external APIs, a unified event schema, and SSE streaming to connected clients, with PostgreSQL deduplication by external ID.
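
One way to picture the unified schema and deduplication by external ID is the sketch below. It makes several assumptions: the `Event` fields and normalizer functions are hypothetical, the raw payload shapes are simplified, and an in-memory SQLite database stands in for PostgreSQL so the example runs with the standard library only. The `ON CONFLICT ... DO NOTHING` clause shown here is also valid PostgreSQL syntax.

```python
import sqlite3
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class Event:
    """Unified shape every source is normalized into (hypothetical fields)."""
    source: str
    external_id: str
    title: str
    url: str


def from_github(raw):
    repo = raw["repo"]["name"]
    return Event("github", str(raw["id"]), f'{raw["type"]} on {repo}', f"https://github.com/{repo}")


def from_hackernews(raw):
    return Event("hackernews", str(raw["id"]), raw["title"], raw.get("url", ""))


def from_reddit(raw):
    return Event("reddit", str(raw["id"]), raw["title"], raw["url"])


def open_store():
    """Create an in-memory table with a uniqueness rule per source and external ID."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE events (
            source TEXT NOT NULL,
            external_id TEXT NOT NULL,
            title TEXT NOT NULL,
            url TEXT NOT NULL,
            UNIQUE (source, external_id)
        )
        """
    )
    return db


def store_if_new(db, event):
    """Insert the event; return False if this source/external_id was seen before."""
    cur = db.execute(
        "INSERT INTO events (source, external_id, title, url) VALUES (?, ?, ?, ?) "
        "ON CONFLICT (source, external_id) DO NOTHING",
        astuple(event),
    )
    return cur.rowcount == 1
```

Letting the database enforce uniqueness means repeated polls of the same upstream item are dropped at insert time, and only events that return `True` need to be pushed to connected clients.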

Stack

FastAPI 0.115 full-async (SQLAlchemy + httpx), SSE with a broadcast connection manager, PostgreSQL 16, Docker-Compose with k8s manifests.
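
A broadcast connection manager for SSE could look something like the sketch below. It is an assumption-laden illustration rather than the project's code: the `Broadcaster` class, the per-client `asyncio.Queue`, and the policy of dropping clients whose queue is full are all hypothetical choices. Only the wire format (`id:`, `event:`, and `data:` lines ending in a blank line) comes from the Server-Sent Events standard.

```python
import asyncio
import json


def format_sse(data, event=None, event_id=None):
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


class Broadcaster:
    """Fan each published message out to every subscribed client queue."""

    def __init__(self, max_queue=100):
        self._subscribers = set()
        self._max_queue = max_queue

    @property
    def subscriber_count(self):
        return len(self._subscribers)

    def subscribe(self):
        queue = asyncio.Queue(maxsize=self._max_queue)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._subscribers.discard(queue)

    def publish(self, message):
        for queue in list(self._subscribers):
            try:
                queue.put_nowait(message)
            except asyncio.QueueFull:
                # Assumed policy: drop clients that cannot keep up.
                self._subscribers.discard(queue)
```

In a FastAPI app, each connection would typically call `subscribe()`, wrap an async generator that awaits `queue.get()` in a `StreamingResponse` with `media_type="text/event-stream"`, and call `unsubscribe()` when the client disconnects.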

Outcome

A streaming API with over 70% test coverage, per-source health checks, rate limiting, and optional API-key auth, ready to scale behind a load balancer.

Experience

Twenty-five years, five chapters.

Feb 2025 — Jun 2026
Latest

Sirrus7

Principal Engineer / Modernization Architect

Led platform modernization for T-Mobile, Starbucks, and an educational game platform — cutting deployment readiness from six months to under a week with a standardized Kubernetes + Istio platform, and developer onboarding from weeks to days.

Aug 2024 — Feb 2025

Ascendion / Starbucks

Senior Manager / Lead Software Design Engineer

Designed a unified data integration platform consolidating product data, and led modernization of transaction processing with zero-downtime migration strategies.

Nov 2018 — Apr 2024

Nortal

Lead Software Design Engineer

Modernized enterprise systems into hybrid legacy/modern operational models, migrating hundreds of APIs to modern frameworks with zero downtime.

2015 — 2018

F5 Networks

Senior Java Developer

Redesigned state-machine software for improved responsiveness, built advanced content-caching systems, and strengthened CI/CD pipelines.

2010 — 2014

Amazon

Senior Software Design Engineer

Designed a scalable photo pipeline bridging legacy systems with modern web platforms, and built BI platforms on Kinesis, Redshift, and EMR.

Toolkit

What I reach for.

AI & ML Engineering

RAG systems · ChromaDB · sentence-transformers · MMR retrieval · Prompt engineering · Embeddings

Languages & Backend

Python · Java / Spring Boot · C# · Node.js · Go

Cloud & Infra

AWS · GCP · Azure · Kubernetes · Docker · Istio

Data & DevOps

PostgreSQL · MongoDB · Kafka · Redis · CI/CD · Terraform
Open to roles & freelance

Let's build something solid.

Hiring for a senior or principal backend role — or have a modernization problem that needs a steady hand? I'd love to hear about it.