Latest Posts

OpenClaw at Work: Prompt Injection Risks
Red Teaming

In a controlled lab, a malicious webpage got OpenClaw to enumerate tools, read local documents, write artifacts, and send unauthorized messages to loopback sinks.

Konstantine Kahadze · Mar 12, 2026
McKinsey's Lilli Looks More Like an API Security Failure Than a Model Jailbreak
Security Vulnerability

Public reporting points to exposed API surface, unsafe SQL construction, and broken object-level authorization.

Michael D'Angelo · Mar 10, 2026
Open-Sourcing ModelAudit: Security Scanner for ML Model Files
Company Update

Promptfoo ModelAudit scans 42+ ML model formats for unsafe loading behaviors, known CVEs, and suspicious artifacts.

Yash Chhabria · Mar 3, 2026
Indirect Prompt Injection in Web-Browsing Agents
Red Teaming

Test whether AI browsing agents follow malicious instructions or leak data with the indirect-web-pwn strategy.

Yash Chhabria · Feb 6, 2026
How AI Regulation Changed in 2025
AI Policy

Why AI compliance questions multiplied in 2025.

Michael D'Angelo · Dec 15, 2025
Why Attack Success Rate (ASR) Isn't Comparable Across Jailbreak Papers Without a Shared Threat Model
Red Teaming

Attack Success Rate (ASR) is the most commonly reported metric for LLM red teaming, but it changes with attempt budget, prompt sets, and judge choice.

Michael D'Angelo · Dec 12, 2025
GPT-5.2 Initial Trust and Safety Assessment
Red Teaming

Day-0 red team results for GPT-5.2.

Michael D'Angelo · Dec 11, 2025
Your model upgrade just broke your agent's safety
Security Vulnerability

Model upgrades can change refusal, instruction-following, and tool-use behavior.

Guangshuo Zang · Dec 8, 2025
Real-Time Fact Checking for LLM Outputs
Feature Announcement

Promptfoo now supports web search in assertions, so you can verify time-sensitive information like stock prices, weather, and case citations during testing.

Michael D'Angelo · Nov 28, 2025
How to replicate the Claude Code attack with Promptfoo
AI Security

A recent cyber espionage campaign revealed how state actors weaponized Anthropic's Claude Code: not through traditional hacking, but by convincing the AI itself to carry out malicious operations.

Ian Webster · Nov 17, 2025
Will agents hack everything?
Security Vulnerability

The first state-level AI cyberattack raises hard questions: Can we stop AI agents from helping attackers? Should we?

Dane Schneider · Nov 14, 2025
When AI becomes the attacker: The rise of AI-orchestrated cyberattacks
Security Vulnerability

Google's November 2025 discovery of PROMPTFLUX and PROMPTSTEAL confirms Anthropic's August threat intelligence findings on AI-orchestrated attacks.

Michael D'Angelo · Nov 10, 2025
Reinforcement Learning with Verifiable Rewards Makes Models Faster, Not Smarter
Technical Guide

RLVR trains reasoning models with programmatic verifiers instead of human labels.

Michael D'Angelo · Oct 24, 2025
Top 10 Open Datasets for LLM Safety, Toxicity & Bias Evaluation
AI Safety

A comprehensive guide to the most important open-source datasets for evaluating LLM safety, including toxicity detection, bias measurement, and truthfulness benchmarks.

Ian Webster · Oct 6, 2025
Testing AI’s “Lethal Trifecta” with Promptfoo
Security Vulnerability

Learn what the lethal trifecta is and how to use Promptfoo red teaming to detect prompt injection and data exfiltration risks in AI agents.

Ian Webster · Sep 28, 2025
Autonomy and agency in AI: We should secure LLMs with the same fervor spent realizing AGI
AI Safety

Exploring the critical need to secure LLMs with the same urgency and resources dedicated to achieving AGI, focusing on autonomy and agency in AI systems.

Tabs Fakier · Sep 2, 2025
Prompt Injection vs Jailbreaking: What's the Difference?
Security Vulnerability

Learn the critical difference between prompt injection and jailbreaking attacks, with real CVEs, production defenses, and test configurations.

Michael D'Angelo · Aug 18, 2025
AI Safety vs AI Security in LLM Applications: What Teams Must Know
AI Security

AI safety vs AI security for LLM apps.

Michael D'Angelo · Aug 17, 2025