<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Hazem Ali — Blog</title>
    <link>https://drhazemali.com/blog</link>
    <description>Deep technical articles on AI architecture, LLM runtime behavior, production systems, and performance engineering by Hazem Ali.</description>
    <language>en-US</language>
    <managingEditor>hazem@drhazemali.com (Hazem Ali)</managingEditor>
    <webMaster>hazem@drhazemali.com (Hazem Ali)</webMaster>
    <lastBuildDate>Thu, 26 Feb 2026 19:50:57 GMT</lastBuildDate>
    <atom:link href="https://drhazemali.com/feed.xml" rel="self" type="application/rss+xml" />
    <image>
      <url>https://drhazemali.com/images/hazem-ali.png</url>
      <title>Hazem Ali — Blog</title>
      <link>https://drhazemali.com/blog</link>
    </image>
    <item>
      <title><![CDATA[The Silent Collapse: Deep-Stack Hardware–Software Failure Modes That Corrupt AI Systems Without a Trace]]></title>
      <link>https://drhazemali.com/blog/the-silent-collapse-deep-stack-hardware-software-failure-modes</link>
      <guid isPermaLink="true">https://drhazemali.com/blog/the-silent-collapse-deep-stack-hardware-software-failure-modes</guid>
      <description><![CDATA[A distinguished-architect deep dive into the 12 most dangerous failure modes in AI infrastructure — from silent data corruption in GPU silicon to compiler cache poisoning, memory allocator drift, and kernel-launch corruption. Includes x86/PTX assembly analysis, Mermaid flow diagrams, a full comparative triage matrix, and a 12-month engineering roadmap with new observability primitives.]]></description>
      <pubDate>Thu, 26 Feb 2026 00:00:00 GMT</pubDate>
      <author>hazem@drhazemali.com (Hazem Ali)</author>
      <category>AI Infrastructure</category>
      <category>GPU</category>
      <category>Silent Data Corruption</category>
      <category>CUDA</category>
      <category>Memory Architecture</category>
      <category>Hardware Security</category>
      <category>Compilers</category>
      <category>Observability</category>
      <category>Systems Architecture</category>
      <category>Zero Trust</category>
    </item>
    <item>
      <title><![CDATA[From Silicon to Pixels: Why No AI Agent Can Ship a Production Browser — A 35-Million-Line Engineering Autopsy]]></title>
      <link>https://drhazemali.com/blog/from-silicon-to-pixels-why-no-ai-agent-can-ship-a-production-browser</link>
      <guid isPermaLink="true">https://drhazemali.com/blog/from-silicon-to-pixels-why-no-ai-agent-can-ship-a-production-browser</guid>
      <description><![CDATA[A distinguished-architect's silicon-to-pixel dissection of why production browsers remain categorically beyond AI agent capabilities. Spanning GPU command buffer validation, TDR fault recovery, seccomp-BPF syscall confinement, the Unicode bidirectional algorithm, OpenType GPOS shaping tables, QUIC transport internals, WebAssembly sandboxing, accessibility tree construction, image decoder attack surfaces, and the formal verification boundaries that separate plausible code generation from provably correct systems software. Grounded in peer-reviewed research, hardware specifications, W3C/WHATWG conformance data, and two decades of shipping systems that survive adversarial production.]]></description>
      <pubDate>Mon, 23 Feb 2026 00:00:00 GMT</pubDate>
      <author>hazem@drhazemali.com (Hazem Ali)</author>
      <category>AI</category>
      <category>Browser Engineering</category>
      <category>Systems Architecture</category>
      <category>GPU</category>
      <category>Security</category>
      <category>Rendering</category>
      <category>AI Agents</category>
      <category>Low-Level Systems</category>
      <category>WebAssembly</category>
      <category>Accessibility</category>
      <category>Networking</category>
      <category>QUIC</category>
    </item>
    <item>
      <title><![CDATA[AI as a Worker, Not an Engineer: The Hidden Ceilings Nobody Talks About]]></title>
      <link>https://drhazemali.com/blog/ai-as-worker-not-engineer</link>
      <guid isPermaLink="true">https://drhazemali.com/blog/ai-as-worker-not-engineer</guid>
      <description><![CDATA[A distinguished-architect deep dive into why AI coding agents are exceptional workers but not engineers — exposing the hidden limitations of LLMs, agents, benchmarks, hardware, and the governance gap that separates patch production from engineering accountability.]]></description>
      <pubDate>Sat, 21 Feb 2026 00:00:00 GMT</pubDate>
      <author>hazem@drhazemali.com (Hazem Ali)</author>
      <category>AI</category>
      <category>LLMs</category>
      <category>Software Engineering</category>
      <category>AI Agents</category>
      <category>GPU</category>
      <category>Benchmarks</category>
      <category>Architecture</category>
      <category>Governance</category>
    </item>
    <item>
      <title><![CDATA[QSAF: Qorvex Security AI Framework]]></title>
      <link>https://drhazemali.com/blog/qsaf-qorvex-security-ai-framework</link>
      <guid isPermaLink="true">https://drhazemali.com/blog/qsaf-qorvex-security-ai-framework</guid>
      <description><![CDATA[A comprehensive AI security framework featuring 63 controls across 9 domains — covering prompt injection, role manipulation, plugin abuse, output risk, behavioral anomaly detection, payload integrity, RAG monitoring, data governance, and cross-environment defense.]]></description>
      <pubDate>Sun, 15 Feb 2026 00:00:00 GMT</pubDate>
      <author>hazem@drhazemali.com (Hazem Ali)</author>
      <category>AI Security</category>
      <category>LLMs</category>
      <category>Framework</category>
      <category>QSAF</category>
      <category>Prompt Injection</category>
      <category>RAG</category>
      <category>Data Governance</category>
      <category>Cybersecurity</category>
    </item>
    <item>
      <title><![CDATA[When Your LLM Trips the MMU: Page Faults, TLB Shootdowns, and the Hidden Virtual-Memory Tax of AI Inference]]></title>
      <link>https://drhazemali.com/blog/when-your-llm-trips-the-mmu</link>
      <guid isPermaLink="true">https://drhazemali.com/blog/when-your-llm-trips-the-mmu</guid>
      <description><![CDATA[A distinguished-architect deep dive into GPU virtual memory internals, MMU fault pipelines, TLB shootdown mechanics, page-table walks, Unified Memory/HMM coherence, ATS, and why page migration turns your p99 into a hardware problem nobody on the team budgeted for.]]></description>
      <pubDate>Thu, 12 Feb 2026 00:00:00 GMT</pubDate>
      <author>hazem@drhazemali.com (Hazem Ali)</author>
      <category>LLMs</category>
      <category>GPU</category>
      <category>Virtual Memory</category>
      <category>CUDA</category>
      <category>Inference</category>
      <category>MMU</category>
      <category>Page Faults</category>
      <category>Systems Architecture</category>
    </item>
    <item>
      <title><![CDATA[Kernel Dynamics: The Real Bottleneck of AI]]></title>
      <link>https://drhazemali.com/blog/kernel-dynamics-the-real-bottleneck-of-ai</link>
      <guid isPermaLink="true">https://drhazemali.com/blog/kernel-dynamics-the-real-bottleneck-of-ai</guid>
      <description><![CDATA[Why LLM inference speed is dominated by kernel execution, memory traffic, and runtime scheduling — not raw FLOPS. A deep technical guide to prefill vs decode, the Roofline model, memory walls, FlashAttention, KV cache paging, warp mechanics, and GPU pipeline design.]]></description>
      <pubDate>Sun, 01 Feb 2026 00:00:00 GMT</pubDate>
      <author>hazem@drhazemali.com (Hazem Ali)</author>
      <category>LLMs</category>
      <category>GPU</category>
      <category>Kernel Optimization</category>
      <category>Memory Architecture</category>
      <category>CUDA</category>
      <category>Inference</category>
      <category>FlashAttention</category>
    </item>
  </channel>
</rss>