
Real Profitable HFT Market Making Lives or Dies by Infrastructure – The Brutal Truth
You backtested your shiny Avellaneda-Stoikov (or Guéant-Lehalle-Fernandez, or whatever flavour) model and got a Sharpe of 15. Congrats!
Now try running it live with 500 μs latency while Jane Street is operating at 87 nanoseconds in the same book.
You’re not competing. You’re free alpha for everyone faster than you.
In high-frequency market making, the model is maybe 10–20% of the battle. The other 80–90% is pure infrastructure and devops on steroids.
Here’s the no-BS breakdown of what actually matters in 2025.
> FIG: Market Making: Avellaneda–Stoikov model
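For reference, the model itself really is the small part. Here is a minimal sketch of the usual closed-form Avellaneda-Stoikov approximation: reservation price r = s − q·γ·σ²·(T − t), total spread γ·σ²·(T − t) + (2/γ)·ln(1 + γ/k). The names `ASParams` and `as_quotes` and every parameter value are illustrative assumptions, not tuned numbers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ASParams:
    gamma: float  # risk aversion (assumed value, strategy-specific)
    sigma: float  # mid-price volatility per unit of time
    k: float      # decay of fill intensity with distance from the mid


def as_quotes(mid, inventory, time_left, p):
    """Return (bid, ask) from the closed-form Avellaneda-Stoikov approximation."""
    # Reservation price: skew the mid against the current inventory.
    reservation = mid - inventory * p.gamma * p.sigma ** 2 * time_left
    # Total spread quoted around the reservation price.
    spread = (p.gamma * p.sigma ** 2 * time_left
              + (2.0 / p.gamma) * math.log(1.0 + p.gamma / p.k))
    return reservation - spread / 2.0, reservation + spread / 2.0


if __name__ == "__main__":
    params = ASParams(gamma=0.1, sigma=2.0, k=1.5)  # hypothetical values
    print("flat :", as_quotes(100.0, 0.0, 1.0, params))
    print("long :", as_quotes(100.0, 5.0, 1.0, params))  # both quotes shift down
```

Note what is missing: nothing in these formulas knows about latency, queue position, or who is on the other side of the fill. That is the rest of this post.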
The Latency Hierarchy of Pain
| Latency Range | Who lives here | What it feels like | Profitable MM possible? |
|---|---|---|---|
| < 150 ns (tick-to-trade) | Top-tier HFT shops (Jane Street, Jump, Citadel, HRT, Flow Traders) | You are the predator | Yes |
| 150 ns – 1 μs | Serious independent firms & prop teams | You can still eat, but you have to be smart | Yes, selectively |
| 1 μs – 50 μs | Retail “HFT” bots, most crypto snipers | You’re mostly eating crumbs and getting picked off | Only on illiquid venues |
| 50 μs – 1 ms | Traditional algo desks | Adverse selection hell | No (you lose money) |
| > 1 ms | Your laptop + Python + Binance API | Free money for everyone else | LOL no |
If you’re not in the first two rows on liquid instruments… just don’t.
The Real Stack – What Winners Actually Use
| Component | Why it exists | Cost (rough) | Latency saved | Mandatory for profit? |
|---|---|---|---|---|
| Colocation / Proximity | Be physically next to the matching engine | $10k–$100k+/month | 1–10 μs round-trip | Yes |
| FPGA everything | Parse feeds, calculate quotes, risk-check in hardware | $100k–$2M+ dev cost | 50–300 ns tick-to-trade | Yes for top tier |
| Kernel bypass (Solarflare Onload, ef_vi, DPDK) | Skip Linux kernel networking stack | Free–$20k/year | 500–1500 ns per packet | Yes |
| Microwave / Laser links | Light travels roughly 1.5× faster in air than in glass fiber (CHI↔NY route) | $300k–$1M+/year | 2–3 ms saved one-way on that route | For cross-venue arb |
| Custom NICs / SmartNICs | Inline pre-trade risk checks in silicon | $15k–$50k per card | Avoids CPU bounce | Yes for safety |
| Raw UDP / Exchange binary protocols | No FIX overhead, direct binary parsing | – | Tens of μs saved | Yes |
| Deterministic OS (Linux + tuned realtime kernel) | No scheduler hiccups, stray interrupts, or page faults on the hot path (isolated, pinned cores) | – | Predictability | Yes |
| Queue position tracking | Know exactly where you are in the LOB queue | Custom code + exchange depth feed | Changes quoting logic completely | YES |
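The microwave row is just physics, and a back-of-envelope check is easy. The sketch below compares one-way propagation delay through fiber and through air over the same path. The 1,200 km path length and the refractive indices are assumed round numbers, and `one_way_latency_ms` is a hypothetical helper; real fiber routes are longer than a straight line, so the real-world gap tends to be bigger than this idealized figure.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum


def one_way_latency_ms(path_km, refractive_index):
    """Propagation delay in milliseconds over path_km in a given medium."""
    return path_km * refractive_index / C_VACUUM_KM_PER_S * 1_000.0


# Assumptions (round numbers, not surveyed route data):
PATH_KM = 1_200.0  # roughly straight Chicago-to-New-Jersey path
N_FIBER = 1.47     # typical refractive index of silica fiber
N_AIR = 1.0003     # refractive index of air near sea level

fiber_ms = one_way_latency_ms(PATH_KM, N_FIBER)
air_ms = one_way_latency_ms(PATH_KM, N_AIR)
saved_ms = fiber_ms - air_ms

if __name__ == "__main__":
    print(f"fiber: {fiber_ms:.2f} ms, air: {air_ms:.2f} ms, saved: {saved_ms:.2f} ms")
```

Under these assumptions the saving is a bit under 2 ms one-way before counting any fiber detours, which is an eternity next to everything else in the table.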
Real 2025 example stack for a profitable independent shop on Nasdaq/CME:
> FIG: Tick-to-trade
Total tick-to-trade: 80–150 nanoseconds on a good day.
Language Choice – Where Python Dies
- Python → research, backtesting, crypto toys only
- C++ / Rust → production quoting engine (if not on FPGA)
- Verilog / SystemVerilog → the real winners write the entire strategy in hardware
A single 200 μs garbage-collection pause at the wrong moment can wipe out your entire day’s P&L.
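If you want to see the jitter for yourself, here is a small measurement harness, assuming CPython. It times the gaps between iterations of a loop that creates cyclic garbage, with the cycle collector on and then paused via `gc.disable()`. The helper names (`gc_paused`, `loop_gaps_ns`, `summarize`) are made up for this sketch, and the numbers you get depend entirely on your machine and Python version.

```python
import gc
import time
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic garbage collector inside the block, then restore it."""
    was_enabled = gc.isenabled()
    gc.collect()  # pay the collection cost up front, off the hot path
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


def loop_gaps_ns(iterations):
    """Return the time gaps (ns) between consecutive loop iterations."""
    gaps = []
    last = time.perf_counter_ns()
    for _ in range(iterations):
        node = []          # self-referencing list: only the cycle
        node.append(node)  # collector can reclaim it
        now = time.perf_counter_ns()
        gaps.append(now - last)
        last = now
    return gaps


def summarize(gaps):
    """Median, 99th percentile, and worst-case gap."""
    ordered = sorted(gaps)
    return {
        "p50": ordered[len(ordered) // 2],
        "p99": ordered[int(len(ordered) * 0.99)],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    print("gc on :", summarize(loop_gaps_ns(200_000)))
    with gc_paused():
        print("gc off:", summarize(loop_gaps_ns(200_000)))
```

Look at the max, not the median. Pausing the collector only hides the problem while memory grows, which is one reason the production path ends up in C++, Rust, or hardware.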
The Hidden Killer: Queue Position & Adverse Selection
Even with perfect latency, if you don’t track your exact position in the price-time priority queue, your model is lying to you.
Example:
- You think you’re first in line at the bid → quote aggressively
- Actually you’re 50th → you only get filled when a seller is big enough to chew through the 49 orders ahead of you, which is exactly the informed flow → you get run over
Top shops track every add/cancel on the wire and maintain their own shadow book with sub-microsecond accuracy.
Without this, a backtest of Avellaneda-Stoikov (or any model) can overstate profits by 5–20× compared with real markets, because it assumes fills you would never have received.
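A toy version of the shadow-book idea for a single price level, assuming an order-by-order feed that reports every add, cancel, and execution with an order ID (on venues that only publish aggregated depth, queue position has to be estimated instead). `LevelQueue` and its methods are hypothetical names for this sketch; a production version tracks every level and lives in C++ or hardware.

```python
from collections import OrderedDict


class LevelQueue:
    """FIFO queue of resting orders at one price level (price-time priority)."""

    def __init__(self):
        # order_id -> remaining quantity, oldest order first.
        self._orders = OrderedDict()

    def add(self, order_id, qty):
        """A new order joins the back of the queue (IDs assumed unique)."""
        self._orders[order_id] = qty

    def cancel(self, order_id):
        """A cancel removes the order wherever it sits in the queue."""
        self._orders.pop(order_id, None)

    def execute(self, qty):
        """Apply an aggressive trade of qty; return {order_id: filled_qty}."""
        fills = {}
        while qty > 0 and self._orders:
            order_id, resting = next(iter(self._orders.items()))
            take = min(qty, resting)
            fills[order_id] = take
            qty -= take
            if take == resting:
                del self._orders[order_id]
            else:
                # Updating an existing key keeps its place in the queue.
                self._orders[order_id] = resting - take
        return fills

    def qty_ahead(self, order_id):
        """Total resting quantity that must trade before order_id is reached."""
        ahead = 0
        for oid, resting in self._orders.items():
            if oid == order_id:
                return ahead
            ahead += resting
        raise KeyError(order_id)


if __name__ == "__main__":
    level = LevelQueue()
    level.add("a", 100)
    level.add("b", 50)
    level.add("me", 10)
    level.add("c", 200)
    print(level.qty_ahead("me"))  # 150
    level.cancel("b")
    print(level.qty_ahead("me"))  # 100
    print(level.execute(30))      # {'a': 30}
    print(level.qty_ahead("me"))  # 70
```

The number that feeds the quoting logic is `qty_ahead`: a quote with 70 lots in front of it and a quote at the head of the queue are not the same trade, even at the same price.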
Crypto vs Traditional – Slightly Different Rules
Crypto is more forgiving because:
- 24/7 markets (no end-of-day inventory panic)
- Higher volatility → wider spreads → more room for latency slop
- Many venues are still slow (Binance spot can be profitable with 5–20 ms)
But the big boys (Wintermute, Jump Crypto, Cumberland) are already running the exact same FPGA/microwave/colocation game on centralized exchanges, and its on-chain equivalent too (MEV, Solana Jito bundles, etc.).
Bottom Line – Can a Solo Dev Win?
Yes… but only in niches:
- Illiquid altcoins
- Emerging L2s / new perp DEXs
- Geographic arbitrage (e.g., Korea premium)
- On-chain market making where latency is measured in block times
On BTC/USD or ES futures? Forget it unless you have a $5M–$50M war chest for infrastructure and a team of ex-Citadel FPGA wizards.
Final Reality Check
Beautiful model + mediocre infrastructure = consistent losses
Decent model + god-tier infrastructure = printing money
Choose wisely where you spend your next 12 months.
Next post: “How I built a sub-millisecond market maker in Rust for under $10k” (spoiler: it makes $3/day on some random altcoin… but it’s fun).
Stay fast, or stay home 🚀