AI in Cyber Security: Offensive vs Defensive

Introduction

The integration of Artificial Intelligence (AI) into the digital world has fundamentally altered Cyber Security. Over the last decade in particular, the industry has transitioned from a manual signature-based approach to a dynamic environment defined by autonomous cyber operations. As AI models become more sophisticated, they can be leveraged as instruments for increasingly complex exploits.

The current landscape is divided between the automation of exploitation and the automation of response.

The Offensive Shift

On the offensive side, AI has significantly lowered the barrier of entry to sophisticated operations. Through the utilisation of Large Language Models (LLMs) and agentic frameworks, threat actors can execute a large volume of hyper-personalised social engineering campaigns and develop adaptive malware that bypasses legacy detection systems. This has essentially industrialised the attack lifecycle, moving it from specialised human effort to a scalable and often automated process.

The Defensive Response

Conversely, AI has become practically mandatory for the modern Security Operations Centre (SOC). With AI and automation narrowing the window between initial access and data exfiltration, human-led monitoring is no longer sufficient for enterprise-grade protection. AI now has the defensive capability for Behavioural Analytics and Automated Incident Response, allowing for real-time identification and containment of threats. These systems go beyond merely reacting; they use predictive modelling to identify architectural weaknesses before they are exploited.

Vulnerability Management: Race to Remediation

In traditional models of cyber security, vulnerability management was a static, linear process: scan, identify, patch, and repeat. However, the volume of Common Vulnerabilities and Exposures (CVEs) has rendered this approach obsolete. With tens of thousands of new vulnerabilities reported annually, human teams are unable to process the sheer volume of alerts. As they have no way to know which reports require immediate attention without reviewing it takes too long to validate and process critical signals amongst the noise.

AI can cut through the noise by shifting the goal from finding everything to prioritising risk. By filtering our non-critical alerts, AI allows both attacks and defenders to ignore the distractions and race towards the few security gaps that are actually exploitable.

The Supply Chain Vulnerability: Shadow AI

While security teams race to patch software vulnerabilities (CVEs), a new non-technical vulnerability has emerged known as Shadow AI. This refers to the unauthorised use of third-party generative AI tools by employees, creating a massive data privacy blind spot that traditional vulnerability scanners cannot detect.

The Leakage

Employees, driving for efficiency, often paste proprietary code, sensitive strategy documents, or Personal Identifiable Information (PII) into public LLMs for summarisation or debugging.

The Risk

Most public AI models operate on a “learning” basis, meaning input data can be used to retrain future versions of the model. This risk was highlighted, in 2023, when engineers at a major electronics manufacturer (Samsung) inadvertently leaked confidential source code by uploading it to a public chatbot.

Third-Party Integration

Beyond direct employee use, the software supply chain is increasingly embedding AI features. If a third-party vendor processes your data using an unsecured API connection to an external AI provider, your data sovereignty is compromised—meaning your data may be processed in foreign jurisdictions, subjecting it to different laws and privacy standards.

The Mitigation

Vulnerability management for AI requires a shift from “patching” to governance. This involves implementing “sanctioned sandboxes” which are private, enterprise grade AI instances that do not train on user data. Further, more Data Loss Prevention (DLP) policies are being updated to detect and block the transmission of proprietary data to known AI domains.

The Offensive Edge

Threat actors are no longer manually sifting through code to find weaknesses. AI-driven tools are accelerating the “Zero-Day” discovery process through three primary automated vectors:

Smart Fuzzing

Traditional fuzzing involves inputting random data into software to cause unexpected behaviour. AI-enhanced fuzzing can use genetic algorithms to “learn” the structure of the software, generating inputs that are statistically more likely to trigger edge-case errors and deep logic vulnerabilities.

Code Analysis at Scale

LLMs can ingest vast repositories of open-source code, analysing them for known insecure patterns or logic flaws much faster than a human code audit.

Exploit Generation

Once a vulnerability is identified, AI agents can assist in drafting the exploit code, lowering the technical threshold for converting a theoretical vulnerability into a weaponised payload.

The Defensive Evolution

For defenders, the goal is no longer to patch every vulnerability but to patch the right ones immediately. AI transforms Vulnerability management into Risk-Based Vulnerability Management (RBVM).

Contextual Prioritisation

AI models analyse vulnerabilities not just by their CVSS score, but also the context within the environment. AI uses context to rank vulnerabilities by true business risk.

Predictive Patching

Advanced models can predict which vulnerabilities are most likely to be weaponised in the near future based on historical trends and threat intelligence. This allows teams to patch pre-emptively.

Automated Remediation

In mature environments AI is moving beyond notification to action. “Self-healing” systems can isolate affected endpoints or apply virtual patches automatically when a critical vulnerability is detected, mitigating risk whilst a permanent fix is tested.

Adversarial AI

Organisations are increasingly embedding Artificial Intelligence into the core of their defensive infrastructure; the AI models themselves have become high-value targets. This has given rise to Adversarial AI techniques designed to deceive, manipulate or blind machine learning systems.

For the modern CISO, it is crucial to understand that AI is not perfect; it is software, susceptible to its own unique class of vulnerabilities. These attacks do not exploit code bugs in the traditional sense but rather exploit the mathematical logic and learning patterns of the target model.

Data Poisoning

Data poisoning attacks occur during the AI’s training phase. If an attacker can inject malicious data into the training set, they can subtly alter the model’s behaviour. This is effectively the “Trojan horse” of cyber security – creating a sleeper agent within the system.

The Mechanism

An attacker inserts samples into the training data that are labelled incorrectly.

The Impact

An endpoint protection system might look at a malicious script, however because of the inserted noise, it classifies it as a harmless system update with 99% confidence. A real-world demonstration of this occurred when stickers were placed on stop signs, causing autonomous vehicle vision systems to misidentify them as speed limit signs.

Model Extraction and Inversion

These attacks target the confidentiality of the model and the data used to create it.

Model Extraction

By querying an AI system repeatedly and analysing the outputs, an attacker can essentially reverse engineer the model. This allows them to create a “clone” of a proprietary defence model, which they can then use offline to test their malware until they find something that bypasses it.

Model Inversion

This technique reconstructs the training data from the model itself. If a model was trained on sensitive PII (Personally Identifiable Information) or confidential medical records, inversion attacks can effectively cause a data breach without ever accessing the underlying database.

Large Language Model (LLM) Injection

With the rise of generative AI, Prompt Injection has emerged as a primary vector for exploitation. There are two main methods of injection:

Direct Injection

“Jailbreaking” the model by using specific prompts that override its safety alignment. This may consist of asking the AI to ignore previous instructions and reveal secrets.

Indirect Injection

The AI processes a document or webpage containing hidden, malicious instructions. When the LLM summarises the document, it executes the hidden command, potentially sending data to an external server for manipulating the summary to spread misinformation.

Strategic Implementation: The Need for AI Security

The emergence of these threats necessitates a new domain of security operations: AI Security (AISec). Organisations cannot simply deploy “black box” models and assume they are secure. Security teams must now implement:

Input Sanitisation: Ensuring data fed into LLMs is scrutinised for injection patterns.

Robustness Testing: Subjecting models to adversarial attacks during development.

Drift Detection: Monitoring models in production for sudden changes in behaviour that might indicate poisoning.

The Future of Offensive and Defensive AI in Cyber Security

AI will almost certainly continue to develop and provide increasing amounts of assistance to cyber threat actors, making more sophisticated attacks more efficient, effective and accessible. Thus, leading to an increase in frequency and intensity of cyber threats.

Conversely, defensive AI is developing in parallel, significantly increasing system resilience by enabling existing teams to operate with the speed and scale of a much larger workforce.