LLMs

3 articles tagged with “LLMs”

AI as a Worker, Not an Engineer: The Hidden Ceilings Nobody Talks About

A distinguished-architect deep dive into why AI coding agents are exceptional workers but not engineers — exposing the hidden limitations of LLMs, agents, benchmarks, hardware, and the governance gap that separates patch production from engineering accountability.

Hazem Ali · 1 hr read
When Your LLM Trips the MMU: Page Faults, TLB Shootdowns, and the Hidden Virtual-Memory Tax of AI Inference

A distinguished-architect deep dive into GPU virtual memory internals, MMU fault pipelines, TLB shootdown mechanics, page-table walks, Unified Memory/HMM coherence, ATS, and why page migration turns your p99 into a hardware problem nobody on the team budgeted for.

Hazem Ali · 45 min read
Kernel Dynamics: The Real Bottleneck of AI

Why LLM inference speed is dominated by kernel execution, memory traffic, and runtime scheduling — not raw FLOPS. A deep technical guide to prefill vs decode, the Roofline model, memory walls, FlashAttention, KV cache paging, warp mechanics, and GPU pipeline design.

Hazem Ali · 35 min read