It’s not hard to imagine. You have a highly intelligent assistant who is fast, multi-lingual, and indefatigable. What happens when that assistant starts to help both the defenders and the attackers? That is what is occurring in cybersecurity today with large language models (LLMs).

These AI systems have been trained on multiple terabytes of text data and are already being used in threat discovery, code analysis, and automated red team (offensive testing). They are also being used sometimes inadvertently, sometimes purposefully, to create malware, simulate hacks, and generate phishing content. Are LLMs good? Bad? A combination of both?

In this chapter, we will explore both sides of the story and hopefully provide some illustrated examples of them. Expect to see real-world data and mostly fun examples to create thought-provoking takeaways, and practical ways of how you and your professional colleagues can responsibly take advantage of LLMs without losing sleep or security.

What Are LLMs and Why Do They Matter in Cybersecurity

LLMs (Large Language Models) like GPT-4o, LLaMA 3, and Claude are AI systems that can recognize and generate human language built from billions of words sourced from across the web. The task of LLMs is to recognize, understand, and generate human language. However, when incorporated into a cybersecurity workflow, the functionalities of LLMs go far beyond summarizing an email or translating a document.

Examples of real-world uses include:

Threat intelligence: LLMs scrape and summarize massive amounts of security reports, blogs, and social feeds to find new threats and indicators of compromise.

Code security: LLMs show developers sensitive vulnerabilities such as SQL injection or insecure authentication patterns, and can sometimes fix them on their own.

Incident response: LLMs leverage the AI’s computing power to rapidly summarize logs and correlate events across multiple instances, driving the root cause analysis faster.

Red teaming: Advanced agents are capable of simulating hacker behavior tailored to that organization, and logically mimicking thought processes to test what defenses that organization may have in place.

These examples are not just thoughts and ideas. Existing tools like Microsoft Security Copilot and Google SeaPlum already leverage LLMs for summarization and triage of threats in real-time.

Recent Data and Research: LLMs Continue to Be Powerful – But Not Perfect

Let’s match the buzz with some facts.

In 2025, a study by Veracode covering 100+ LLMs produced data indicating that nearly 45% of AI-generated code was insecure. The large proportion of applications with cross-site scripting (XSS), log injection attacks, and faulty cryptographic practices was commonplace, particularly in Java and JavaScript.

Recently, Carnegie Mellon and Anthropic researchers created autonomous LLM agents that performed the 22017 Equifaxbreach, from planning to scanning and exploitation, there wasn’t a human in the loop – they even performed the lateral movement and pivoted into internal systems.

A Qwen-2.5 model, which fine-tuned the AI with reinforcement learning, was experimentally making malware that evaded Microsoft Defender 8% of the time.  With a total cost of training of less than $1,600.

LLM cybersecurity as a global market is on fire – $3.88B in 2024 and predicted $6.07 in 2025, and over $36B by 2029 (source: EINPresswire).

So what’s the takeaway?  LLMs can provide value, but also be disturbingly competent in the wrong hands since a practitioner has not put guardrails in place.

Friend: How LLMs Supercharge Defenders

When you’re looking at thousands of log entries, threat alerts, or vulnerability disclosures, human eyes aren’t going to be able to keep up. LLMs can help make the process more efficient in speed, context, and accuracy.

To that end, here’s how LLMs empower defenders:

Triage and alerts: LLMs can screen for aberrations, highlight the most important alerts, and eliminate noise that leads to analyst burnout.

Code reviews: AI assistants like GitHub Copilot Security automatically highlight and patch dangerous snippets of code.

Simulated hacking: LLM agents perform reconnaissance, probe ports, and simulate phishing—all of which show security teams the path attackers might take in the real world.

Anecdote: A prominent fintech recently used an LLM scanner on their microservices architecture to identify a deeply nested logging issue that could have led to the exposure of sensitive PII. Fixing it took them 30 minutes. Manually? It would have taken them days.

LLMs are like high-performing digital bloodhounds—sniffing out threats faster than humans ever could.

Foe: How LLMs Are Being Weaponized 

Of course, the same intelligence can be subverted. Just as a lockpicking expert understands how locks work, bad actors can utilize LLMs to productively extract or even automate hostile cyber activities. 

How are malicious actors leveraging LLMs? 

Malware development – with a few prompts, they can create polymorphic malware that can evolve to bypass traditional antivirus tools. 

Phishing Content – LLM can write grammatically correct emails at any scale in any language, and is refined to fool even the most savvy target. 

Deepfakes and social engineering – threat groups are now utilizing AI to create fake LinkedIn profiles, duplicate voices, and nail job interviews. 

Prompt Injection – bad actors are embedding bad prompts into data inputs, including embedding malicious instructions via embedded prompts to “jailbreak” LLMs to forget their security rules. 

The scariest example? Researchers found that some LLMs “hallucinated” software packages – inventing fake libraries, and if downloaded, they were highly compromised supply chain links. That is now known as “slopsquatting.”

Both: Why LLMs Are Not Black Or White Hat

So are LLMs friend or foe? The answer is complicated. Like fire or electricity, they are a powerful tool. Whether they prove useful or harmful depends entirely upon how we design, select, prompt, deploy, and monitor them.

Remember these practical facts:

LLMs decrease defenders’ time-to-respond but also reduce the defenders’ entry barrier to attacking as well. 

You have to treat them as unreliable; they hallucinate, misunderstand prompts, and can be misdirected if not properly tuned.

Security needs to be advanced and preventative, meaning adversarial testing, zero-trust design, and human validation in the loop present or otherwise.

Within this duality lies an important reductionist truth: LLMs mirror us. Our inputs, our ethics, our watchguards, or lack thereof.

Effective Strategies for Cyber Teams Utilizing LLMs

Are you preparing to deploy LLMs in your organization? Fantastic. Get it right.

Here are strategies to stay safe:

Sanitize all inputs: Never feed user data directly into an LLM.

Use Retrieval-Augmented Generation (RAG): Keep your model fettered to all the feed, CVE data, that you can get your hands on.

Establish role-based access and API logs: Control who queries what, t—and keep a complete audit log.

Run red team simulations regularly: Test the model itself. Try and jailbreak it. Try and confuse it. No better way to discover vulnerabilities.

Apply zero-trust principles to AI systems: There should be no model that unconditionally accesses without proper authentication and observability.

You wouldn’t deploy a firewall without configuration. Don’t deploy LLMs without security constraints.

An Important Conclusion: Proceeding From Here

Think of it this way: giving a teenager superpowers: fast learning, great recall, tremendous curiosity, and…. No impulse control. This is the position we’re in with LLMs in cybersecurity.

They are powerful, but not yet mature. They are competent, but not yet wise. 

If we raise them correctly—with guardrails, ethics by design, and human supervision—they can become trusted defenders. But if we do not attend to the foreshadowed risks? They could not only partner with attackers, but they could actually become the attackers. 

LLMs are not heroes or villains; they are mirrors. What do we want them to mirror? It’s up to us.

FAQs

1. What are the primary security threats with LLMs?

Prompt injection, package hallucination, model hallucination, and agency attack are the main threats. OWASP’s 2025 LLM Top 10 lists prompt injection as the primary threat.

2. Can LLMs be used to automate cyberattacks?

Correct. In a lab environment, agentic LLMs have carried out multi-staged breaches by planning, scanning, exploiting the breach, and adapting, without a command from a human operator.

3. What can I do to secure LLM applications at my organization?

Utilize input validation, RAG for current data, staged authentication, output monitoring, and adversarial examination. Use human oversight, particularly for more sensitive or operational outputs.

4. What does the market for LLMs look like in cybersecurity?

Rapid growth. The market is projected to grow from $3.88B (2024) to $36B+ (2029) through SOC automation, Threat protagonist detection, a nd Altai response platforms.

5. Are there any compliance frameworks for the secure use of LLMs?

Yes. You can use compiled guidelines from NIST, OWASP, UK NCSC, and an internal audit firm to ensure that your AI apps work with zero-trust and secure-by-design principles.

For deeper insights on agentic AI governance, identity controls, and real‑world breach data, visit Cyber Tech Insights.

To participate in upcoming interviews, please reach out to our CyberTech Media Room at sudipto@intentamplify.com.