Co-Authors: Dr. Yasir Mehmood, Dr. Muhammad Zeeshan Baig, Dr. Muhammad Aatif, Dr. Muhammad Aziz Ul Haq, Kamal Noor, Hazem Ali, Nadeem Shahzad, Jamel Abed
What is QSAF?
The Qorvex Security AI Framework (QSAF) is a comprehensive security control framework designed to protect AI systems — particularly those powered by Large Language Models (LLMs) — against emerging threats such as prompt injection, role manipulation, plugin abuse, and data leakage.
QSAF introduces 63 controls across 9 domains, categorized into three enforcement types:
- 37 Auditable — Designed for compliance verification and forensic review
- 22 Real-Time Agent-Based — Enforced dynamically during inference via intelligent agents
- 4 Hybrid — Combining both auditable and real-time characteristics
The 9 Domains
- Prompt Injection Protection
- Role & Context Manipulation
- Plugin Abuse Monitoring
- Output Risk & Response Control
- Behavioral Anomaly Detection
- Payload Integrity & Signing
- Source Attribution & RAG Monitoring
- Data Governance & Retention
- Cross-Environment Defense
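The control IDs used in the catalog (for example QSAF-PI-001) carry a two-letter domain code, so each control can be modeled as a small record. The sketch below is one possible representation and not part of QSAF itself. The class names `Enforcement` and `Control` and the `DOMAIN_CODES` mapping are illustrative assumptions; only the IDs, domain names, and enforcement types come from the framework.

```python
from dataclasses import dataclass
from enum import Enum


class Enforcement(Enum):
    AUDITABLE = "Auditable"
    REAL_TIME = "Real-Time"
    HYBRID = "Hybrid"


# Domain codes as they appear inside the control IDs (QSAF-<code>-<number>).
DOMAIN_CODES = {
    "PI": "Prompt Injection Protection",
    "RC": "Role & Context Manipulation",
    "PA": "Plugin Abuse Monitoring",
    "OR": "Output Risk & Response Control",
    "BA": "Behavioral Anomaly Detection",
    "PY": "Payload Integrity & Signing",
    "SA": "Source Attribution & RAG Monitoring",
    "DG": "Data Governance & Retention",
    "CE": "Cross-Environment Defense",
}


@dataclass(frozen=True)
class Control:
    control_id: str          # e.g. "QSAF-PI-001"
    title: str
    enforcement: Enforcement

    @property
    def domain(self) -> str:
        # "QSAF-PI-001" -> "PI" -> "Prompt Injection Protection"
        _, code, _ = self.control_id.split("-")
        return DOMAIN_CODES[code]


example = Control(
    "QSAF-PI-001", "Static pattern blacklist detection", Enforcement.AUDITABLE
)
```

A registry of such records could then be filtered by domain or enforcement type when preparing an audit checklist.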
Domain 1: Prompt Injection Protection
This domain addresses the growing threat of prompt injection attacks — where adversarial inputs manipulate LLM behavior. Controls range from static pattern matching to dynamic LLM-based analysis.
- QSAF-PI-001: Static pattern blacklist detection (Auditable) — Detects known injection patterns using static blacklists.
- QSAF-PI-002: Dynamic LLM prompt analysis (Real-Time) — Uses a secondary LLM to analyze prompts for injection attempts.
- QSAF-PI-003: Input tokenization anomaly detection (Real-Time) — Identifies unusual tokenization patterns in inputs.
- QSAF-PI-004: Prompt boundary enforcement (Auditable) — Enforces boundaries to prevent prompt leakage.
- QSAF-PI-005: Injection risk scoring engine (Auditable) — Scores prompts for injection risk.
- QSAF-PI-006: Recursive prompt unrolling (Auditable) — Unrolls recursive prompts to detect nested injections.
- QSAF-PI-007: Prompt injection simulation (red-team) logging (Auditable) — Logs red-team simulation results for prompt injection.
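As an illustration of how static pattern detection (QSAF-PI-001) could feed a risk score (QSAF-PI-005), here is a minimal sketch. The patterns, weights, and the function name `injection_risk` are hypothetical examples chosen for this article; QSAF does not prescribe a specific blacklist or scoring formula.

```python
import re

# Hypothetical blacklist: (pattern, weight). A real deployment would
# maintain and version its own list.
BLACKLIST = [
    (re.compile(r"ignore (all )?(previous|prior) instructions", re.I), 0.6),
    (re.compile(r"reveal (your )?system prompt", re.I), 0.5),
    (re.compile(r"you are now", re.I), 0.3),
]


def injection_risk(prompt: str) -> tuple[float, list[str]]:
    """Return a risk score capped at 1.0 and the patterns that matched."""
    matched = [(p.pattern, w) for p, w in BLACKLIST if p.search(prompt)]
    score = min(1.0, sum(w for _, w in matched))
    return score, [pattern for pattern, _ in matched]
```

Returning the matched patterns alongside the score keeps the result explainable, which suits an auditable control. Static lists are easy to evade, which is why the domain pairs them with dynamic analysis.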
Domain 2: Role & Context Manipulation
This domain protects against attempts to manipulate the AI's role or context. Controls include enforcement of system prompts, context window monitoring, and identity verification.
- QSAF-RC-001: System prompt override detection (Real-Time) — Detects attempts to override system prompts.
- QSAF-RC-002: Role boundary enforcement (Auditable) — Enforces defined role boundaries.
- QSAF-RC-003: Context window integrity monitor (Real-Time) — Monitors context window integrity for tampering.
- QSAF-RC-004: Identity assertion for LLM personas (Auditable) — Asserts identity for LLM-based personas.
- QSAF-RC-005: Multi-turn context drift scoring (Auditable) — Scores context drift across multi-turn conversations.
- QSAF-RC-006: System prompt versioning & audit log (Auditable) — Versions and audits system prompt changes.
- QSAF-RC-007: Context reset trigger & session isolation (Real-Time) — Triggers context resets and isolates sessions.
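System prompt versioning (QSAF-RC-006) can be sketched as an append-only log keyed by a content hash. The class `PromptAuditLog`, its field names, and the choice of SHA-256 are assumptions made for illustration, not details taken from the framework.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _digest(prompt: str) -> str:
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


@dataclass
class PromptAuditLog:
    """Append-only history of system prompt versions (illustrative)."""
    entries: list[dict] = field(default_factory=list)

    def record(self, prompt: str, author: str) -> dict:
        digest = _digest(prompt)
        # An unchanged prompt does not create a new version.
        if self.entries and self.entries[-1]["sha256"] == digest:
            return self.entries[-1]
        entry = {
            "version": len(self.entries) + 1,
            "sha256": digest,
            "author": author,
            "recorded_at": datetime.now(timezone.utc).isoformat(),
        }
        self.entries.append(entry)
        return entry

    def is_current(self, prompt: str) -> bool:
        """True if the prompt in use matches the latest recorded version."""
        return bool(self.entries) and self.entries[-1]["sha256"] == _digest(prompt)
```

A runtime check such as `is_current` is one simple way to notice that the deployed system prompt no longer matches the approved version.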
Domain 3: Plugin Abuse Monitoring
This domain monitors third-party plugin interactions for abuse. Controls include permission auditing, execution sandboxing, and data exfiltration detection.
- QSAF-PA-001: Plugin permission audit trail (Auditable) — Logs plugin permission changes for auditing.
- QSAF-PA-002: Plugin execution sandboxing (Real-Time) — Sandboxes plugin execution to prevent abuse.
- QSAF-PA-003: Input/output schema validation for plugins (Auditable) — Validates plugin input/output schemas.
- QSAF-PA-004: Data exfiltration via plugin detection (Real-Time) — Detects data exfiltration through plugins.
- QSAF-PA-005: Plugin rate limiter (Real-Time) — Rate-limits plugin calls to prevent abuse.
- QSAF-PA-006: Unauthorized plugin call logging (Auditable) — Logs unauthorized plugin call attempts.
- QSAF-PA-007: Plugin trust score (Auditable) — Assigns trust scores to plugins based on behavior history.
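The plugin rate limiter (QSAF-PA-005) could be implemented in several ways. The sketch below uses a sliding window per plugin. The class name `PluginRateLimiter` and its parameters are hypothetical, and the sliding-window technique is this article's choice rather than something the framework mandates.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class PluginRateLimiter:
    """Allow at most max_calls per plugin within window_seconds."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self._calls = defaultdict(deque)  # plugin name -> call timestamps

    def allow(self, plugin: str, now: Optional[float] = None) -> bool:
        # `now` is injectable so the limiter can be tested deterministically.
        now = time.monotonic() if now is None else now
        calls = self._calls[plugin]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

Rejected calls are natural candidates for the unauthorized-call log (QSAF-PA-006) and for lowering a plugin's trust score (QSAF-PA-007).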
Domain 4: Output Risk & Response Control
This domain manages the risk associated with AI-generated outputs. Controls include content filtering, hallucination detection, watermarking, and sensitivity scoring.
- QSAF-OR-001: Content filter for jailbreak or illegal content (Real-Time) — Filters out jailbreak attempts or illegal content in responses.
- QSAF-OR-002: Flag hallucinated facts in responses (Real-Time) — Identifies potentially inaccurate or fabricated facts.
- QSAF-OR-003: Token-based watermarking for response traceability (Auditable) — Embeds traceable watermarks in AI responses.
- QSAF-OR-004: Sensitivity scoring of LLM responses (Real-Time) — Assigns sensitivity scores to responses based on content.
- QSAF-OR-005: Block or reroute risky content (Real-Time) — Blocks or redirects high-risk responses to moderators.
- QSAF-OR-006: Prompt-response correlation analysis (Hybrid) — Analyzes correlation between prompts and responses for consistency.
- QSAF-OR-007: Tone and sentiment deviation tracking (Real-Time) — Monitors deviations in response tone and sentiment.
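Sensitivity scoring (QSAF-OR-004) and block-or-reroute decisions (QSAF-OR-005) can be chained as shown in this minimal sketch. The regular expressions, weights, thresholds, and the names `sensitivity_score` and `route` are illustrative assumptions; a production system would likely use trained classifiers rather than a few patterns.

```python
import re
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REROUTE = "reroute"   # e.g. send to a human moderator
    BLOCK = "block"


# Hypothetical detectors: name -> (pattern, weight).
SENSITIVE_PATTERNS = {
    "email": (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), 0.4),
    "card_number": (re.compile(r"\b(?:\d[ -]?){13,16}\b"), 0.8),
    "secret_keyword": (re.compile(r"\b(password|api[_ ]key)\b", re.I), 0.5),
}


def sensitivity_score(response: str) -> float:
    """Sum the weights of all detectors that fire, capped at 1.0."""
    total = sum(w for p, w in SENSITIVE_PATTERNS.values() if p.search(response))
    return min(1.0, total)


def route(response: str, reroute_at: float = 0.4, block_at: float = 0.8) -> Action:
    score = sensitivity_score(response)
    if score >= block_at:
        return Action.BLOCK
    if score >= reroute_at:
        return Action.REROUTE
    return Action.ALLOW
```

Keeping scoring and routing separate lets the thresholds be tuned per deployment without touching the detectors.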
Domain 5: Behavioral Anomaly Detection
This domain detects anomalous user or system behavior. All controls are real-time, leveraging agent-based monitoring for immediate response.
- QSAF-BA-001: Session entropy score (Real-Time) — Measures session entropy to detect irregular behavior.
- QSAF-BA-002: Prompt embedding drift detector (Real-Time) — Tracks drift in prompt embeddings to identify anomalies.
- QSAF-BA-003: Response volatility monitor (Real-Time) — Monitors volatility in AI responses for unexpected changes.
- QSAF-BA-004: Repeated intent mutation heuristic (Real-Time) — Detects repeated attempts to alter intent maliciously.
- QSAF-BA-005: Time-based usage anomalies (Real-Time) — Identifies unusual usage patterns based on time.
- QSAF-BA-006: Plugin execution pattern deviance (Real-Time) — Detects deviations in plugin execution patterns.
- QSAF-BA-007: Unified behavioral risk score (Real-Time) — Aggregates behavioral metrics into a unified risk score.
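QSAF does not define how session entropy (QSAF-BA-001) or the unified risk score (QSAF-BA-007) are computed. One plausible reading, sketched here as an assumption, is Shannon entropy over the tokens seen in a session and a weighted mean of normalized signals. The function names and the weighting scheme are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(tokens: list[str]) -> float:
    """Shannon entropy in bits of the token distribution."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def unified_risk(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of signals, each assumed to be normalized to [0, 1]."""
    total_weight = sum(weights[name] for name in signals)
    if total_weight == 0:
        return 0.0
    return sum(signals[name] * weights[name] for name in signals) / total_weight
```

For example, a session that repeats a single token has entropy 0, while four distinct tokens used equally often give 2 bits. An unusually high or low value relative to a baseline could be one input to the unified score.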
Domain 6: Payload Integrity & Signing
This domain ensures the integrity of prompts and responses through cryptographic measures. Six of the seven controls are auditable to support compliance verification; the seventh, invalid signature escalation, is hybrid.
- QSAF-PY-001: Prompt hash signing (Auditable) — Signs prompts with cryptographic hashes for integrity.
- QSAF-PY-002: Response payload signing (Auditable) — Signs AI responses to ensure authenticity.
- QSAF-PY-003: Plugin request signature enforcement (Auditable) — Enforces signatures on plugin requests.
- QSAF-PY-004: Signature verification middleware (Auditable) — Verifies signatures through middleware before processing.
- QSAF-PY-005: Nonce/replay token control (Auditable) — Uses nonces to prevent replay attacks.
- QSAF-PY-006: Hash chain lineage (Auditable) — Maintains a hash chain for tracking payload lineage.
- QSAF-PY-007: Invalid signature escalation (Hybrid) — Escalates invalid signatures for review and action.
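To make the signing, replay, and lineage controls concrete, the sketch below combines payload signing (QSAF-PY-001/002), nonce-based replay protection (QSAF-PY-005), and a hash chain (QSAF-PY-006). The framework does not name an algorithm; HMAC-SHA256 with a shared key is an assumption made here, and `SECRET_KEY`, `sign`, `Verifier`, and `chain_hash` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# Demo key only (assumption). In practice, load it from a secret manager.
SECRET_KEY = b"demo-key-do-not-use"


def sign(payload: bytes, nonce: str) -> str:
    """HMAC-SHA256 over the nonce and payload, as a hex string."""
    message = nonce.encode("utf-8") + b"." + payload
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


class Verifier:
    """Checks signatures and rejects nonces that were already used."""

    def __init__(self):
        self._seen_nonces = set()

    def verify(self, payload: bytes, nonce: str, signature: str) -> bool:
        expected = sign(payload, nonce)
        # Constant-time comparison avoids leaking timing information.
        if not hmac.compare_digest(expected, signature):
            return False
        if nonce in self._seen_nonces:
            return False  # replayed message
        self._seen_nonces.add(nonce)
        return True


def chain_hash(previous_hash: str, payload: bytes) -> str:
    """Link a payload to its predecessor so history cannot be silently edited."""
    return hashlib.sha256(previous_hash.encode("utf-8") + payload).hexdigest()


nonce = secrets.token_hex(16)
signature = sign(b"user prompt", nonce)
```

A failed `verify` call is the point where invalid signature escalation (QSAF-PY-007) would take over. An asymmetric scheme would be preferable where verifiers should not be able to sign.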
Domain 7: Source Attribution & RAG Monitoring
This domain ensures accurate source attribution in Retrieval-Augmented Generation (RAG) systems. Three of the seven controls are auditable and four are real-time.
- QSAF-SA-001: Track document source in RAG systems (Auditable) — Logs document sources used in RAG responses.
- QSAF-SA-002: Compare LLM response to top-K retrieved docs (Real-Time) — Verifies response alignment with retrieved documents.
- QSAF-SA-003: Calculate hallucination likelihood score (Real-Time) — Scores responses for potential hallucinations.
- QSAF-SA-004: Flag mismatch between retrieval and output (Real-Time) — Flags discrepancies between retrieved data and outputs.
- QSAF-SA-005: Log and alert for non-attributable responses (Auditable) — Logs and alerts on responses lacking attribution.
- QSAF-SA-006: Embed source trust rating into response (Auditable) — Embeds trust ratings for sources in responses.
- QSAF-SA-007: Auto-disable RAG pipeline upon anomaly (Real-Time) — Disables RAG pipeline when anomalies are detected.
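Comparing a response with the top-K retrieved documents (QSAF-SA-002) and flagging mismatches (QSAF-SA-004) can be approximated with simple token overlap, as in this sketch. The overlap metric, the 0.5 threshold, and the function names are assumptions for illustration; embedding similarity or an entailment model would be more robust in practice.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(response: str, retrieved_docs: list[str]) -> float:
    """Fraction of response tokens found in the best-matching retrieved doc."""
    response_tokens = tokens(response)
    if not response_tokens or not retrieved_docs:
        return 0.0
    return max(
        len(response_tokens & tokens(doc)) / len(response_tokens)
        for doc in retrieved_docs
    )


def flag_mismatch(response: str, retrieved_docs: list[str], threshold: float = 0.5) -> bool:
    """True when the response is poorly supported by every retrieved doc."""
    return support_score(response, retrieved_docs) < threshold
```

Because common words inflate overlap, a score like this is better treated as a coarse first filter than as a hallucination verdict.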
Domain 8: Data Governance & Retention
This domain enforces data governance and retention policies. All seven controls are auditable to support compliance with regulations such as GDPR.
- QSAF-DG-001: Prompt & response TTL policies (Auditable) — Enforces time-to-live policies for prompts and responses.
- QSAF-DG-002: Embedding store expiry rules (Auditable) — Sets expiry rules for stored embeddings.
- QSAF-DG-003: Data classification tagging (Auditable) — Tags data based on sensitivity classifications.
- QSAF-DG-004: Log retention governance (Auditable) — Governs retention periods for system logs.
- QSAF-DG-005: Right-to-erase (GDPR) compliance (Auditable) — Ensures compliance with GDPR right-to-erase requests.
- QSAF-DG-006: Retention-aware monitoring (Auditable) — Monitors data retention compliance.
- QSAF-DG-007: Sensitive data auto-deletion (Auditable) — Automatically deletes sensitive data per policy.
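A time-to-live policy (QSAF-DG-001) driven by classification tags (QSAF-DG-003) might look like the following sketch. The TTL values, the `Record` type, and the `purge` function are hypothetical; actual retention periods depend on the organization's legal and policy requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per classification tag.
TTL_BY_CLASS = {
    "public": timedelta(days=365),
    "internal": timedelta(days=90),
    "sensitive": timedelta(days=7),
}


@dataclass
class Record:
    record_id: str
    classification: str
    created_at: datetime


def expired(record: Record, now: datetime) -> bool:
    return now - record.created_at >= TTL_BY_CLASS[record.classification]


def purge(records: list[Record], now: datetime) -> tuple[list[Record], list[str]]:
    """Return the records to keep and the IDs that were deleted."""
    kept = [r for r in records if not expired(r, now)]
    deleted_ids = [r.record_id for r in records if expired(r, now)]
    return kept, deleted_ids
```

Returning the deleted IDs gives an auditor evidence that the policy actually ran, which matches the auditable nature of this domain.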
Domain 9: Cross-Environment Defense
This domain secures AI systems across multiple environments. Five of the seven controls are auditable; shadow agent heartbeat detection is real-time, and coordinated alert response is hybrid.
- QSAF-CE-001: Federated agent sync (Auditable) — Synchronizes agents across federated environments.
- QSAF-CE-002: Tenant-aware log routing (Auditable) — Routes logs based on tenant-specific policies.
- QSAF-CE-003: Isolated risk scoring per tenant (Auditable) — Calculates risk scores isolated by tenant.
- QSAF-CE-004: Cross-node signature validation (Auditable) — Validates signatures across distributed nodes.
- QSAF-CE-005: Shadow agent heartbeat detection (Real-Time) — Detects shadow agent activity via heartbeats.
- QSAF-CE-006: Coordinated alert response (Hybrid) — Coordinates alert responses across environments.
- QSAF-CE-007: Multi-cloud policy synchronization (Auditable) — Synchronizes security policies across cloud platforms.
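The catalog does not spell out how shadow agent heartbeat detection (QSAF-CE-005) works. One possible interpretation, sketched here as an assumption, compares heartbeats against a registry: agents that send heartbeats without being registered are treated as shadow agents, and registered agents that have gone quiet are treated as stale. The function name and the 30-second timeout are hypothetical.

```python
def classify_agents(
    registered: set[str],
    last_seen: dict[str, float],
    now: float,
    timeout: float = 30.0,
) -> dict[str, set[str]]:
    """Split agents into 'shadow' (unregistered) and 'stale' (silent)."""
    # Heartbeats from agents that are not in the registry.
    shadow = set(last_seen) - registered
    # Registered agents with no heartbeat inside the timeout window.
    stale = {
        agent
        for agent in registered
        if now - last_seen.get(agent, float("-inf")) > timeout
    }
    return {"shadow": shadow, "stale": stale}
```

Either set being non-empty could feed the coordinated alert response (QSAF-CE-006).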



