LangWatch Launches Open-Source AI Red-Teaming Tool

LangWatch has officially introduced LangWatch Scenario, an open-source framework designed to strengthen AI security through automated red-teaming and penetration testing. Notably, the Amsterdam-based company developed this tool to help organisations that deploy AI applications in production environments identify vulnerabilities that traditional testing methods often overlook.

To begin with, LangWatch Scenario focuses on evaluating AI agents such as customer service chatbots and data analytics platforms. These systems, especially in industries like banking, insurance, and software, frequently process sensitive information and support critical operations. Therefore, ensuring their resilience against advanced cyber threats has become increasingly essential.

Unlike conventional testing approaches that rely on single prompts or isolated penetration tests, LangWatch Scenario introduces a more dynamic methodology. Specifically, it simulates multi-turn attack scenarios, where attackers gradually build trust with AI systems over time. As a result, this approach mirrors real-world cyberattack strategies, where malicious actors engage in extended conversations before attempting to extract sensitive data or trigger unsafe responses.

Furthermore, the framework executes a sequence of structured scenarios. Initially, it begins with low-risk interactions and progressively advances toward more complex queries and authority-based prompts. Simultaneously, a secondary model monitors the interaction and adjusts the attack path in real time. Consequently, this adaptive mechanism enables the system to uncover hidden vulnerabilities that may only surface after multiple conversational exchanges.

According to LangWatch, such weaknesses often remain undetected in traditional testing environments because AI systems may only fail after prolonged interaction. Addressing this gap, Rogerio Chaves, Co-founder and Chief Technology Officer at LangWatch, said, “An AI agent that rejects every single prompt gives you a false sense of security. In practice, cybercriminals do not work with a single direct question. They have dozens of relaxed conversations, build trust, and when the agent is in a cooperative mode after twenty turns, a request that would have been rejected in turn one suddenly becomes no problem at all.”

In addition, the framework incorporates what LangWatch calls the Crescendo strategy—a four-phase escalation process. This process starts with exploratory discussions, transitions into hypothetical scenarios, moves toward authority-based claims such as compliance checks, and ultimately applies direct pressure on the AI system. At each phase, the framework evaluates whether the AI becomes more vulnerable to unsafe actions or data exposure.

Equally important, LangWatch Scenario integrates seamlessly into existing development and continuous integration workflows. This allows teams to conduct ongoing testing as they update AI models, refine prompts, and introduce new features. Consequently, organisations can shift from one-time security assessments to continuous risk evaluation.

Meanwhile, the launch arrives amid growing scrutiny around AI-related risks. While public discussions often highlight concerns like deepfakes, misinformation, and privacy, LangWatch emphasises the hidden risks within internal AI systems used by businesses. These include internal assistants, customer-facing bots, and analytics tools that interact directly with proprietary data and systems.

Moreover, LangWatch revealed that companies such as Backbase, Buy It Direct, Ask Vinny, Visma, Skai, and PagBank are already leveraging its broader platform and are now expanding into automated red-team testing. Although the company has not disclosed specific commercial details, the open-source nature of Scenario is expected to drive wider adoption across engineering and security communities.

Finally, Manouk Draisma, Co-founder and Chief Executive Officer at LangWatch, highlighted the evolving nature of cyber threats. “It is rarely about a single spectacular hack. It is about patience and context. A cybercriminal who interacts calmly and systematically with an AI agent for twenty minutes can extract sensitive information that a direct attack would never reveal. LangWatch Red-Teaming makes these hidden risks visible before damage occurs,” she said.

Recommended Cyber Technology News:

To participate in our interviews, please write to our CyberTech Media Room at info@intentamplify.com

🔒 Login or Register to continue reading