Harnessing Artificial Intelligence for a Superior Web Application Firewall

Written by

Maxim Zavodchik and Tom Emmons

February 03, 2025

Maxim Zavodchik

Maxim Zavodchik is an experienced security research leader with a proven track record in establishing, growing, and defining strategic vision for Threat Research and Data Science teams in Web Application Security and API Protection. When he’s not protecting life online, you can find him being a super dad and/or watching Studio Ghibli movies.

Tom Emmons

Tom Emmons is a data enthusiast who leads a team focused on machine learning and automation at Akamai. His areas of security expertise are in DDoS and application security.

Rather than merely reacting to threats, we are using AI to stay ahead of attackers.

Artificial intelligence (AI) has been a blessing for attackers. AI not only enables attackers to increase the speed and innovation of their initial attacks on a company, but also allows more sophisticated attackers to pivot rapidly when their attacks are detected, evolving and launching new attacks in minutes instead of hours or days.

The good news is that AI and machine learning (ML) aren’t just being used by attackers. Akamai has been at the forefront of ML and advanced heuristic innovations since our inception, and we continuously experiment with AI and ML to stay ahead of adversaries and eliminate vulnerabilities.

The shift toward AI-driven web application security solutions

Security leaders increasingly expect their security services vendors to incorporate ML and AI in fortifying cybersecurity defenses. By integrating AI into their security solutions, organizations enhance their ability to detect and respond to a wide spectrum of web application cyberthreats, including those beyond the OWASP Top 10. These organizations also gain a competitive edge in the cybersecurity face-off.

This shift toward AI-driven web application security solutions marks a significant evolution from traditional rule-based systems, enabling more dynamic, proactive, and efficient defense mechanisms for application security and more. AI offers an advanced, adaptive approach to security. Because AI can analyze vast datasets to identify and respond to potential threats with remarkable accuracy, security solutions that incorporate AI have a better chance of detecting zero-day threats and can automate protection updates in real time, which lessens the burden on security teams.

This blog post explores the technical details of how an AI-driven web application firewall (WAF) stands apart by offering unmatched, automated protection and insights to ensure that your applications remain secure.

WAF static detections are not enough

Traditional cloud-based WAFs don’t detect or stop new attacks quickly, which puts organizations at unnecessary risk. Attackers can do significant damage in the time between initial detection and the delayed mitigation.

This is because other providers’ web application firewalls rely heavily on static WAF rulesets and signature-based detection mechanisms, which essentially hardcode their pattern recognition. This legacy approach increases risk by introducing detection gaps whenever an attacker makes even the smallest change. Detection methods that rely on hardcoded pattern recognition must be manually updated for each new attack pattern.

Although effective against known threats, these systems struggle to handle sophisticated zero-day attacks, or dynamic and polymorphic attack vectors. For example, regex-based rules — a cornerstone of many legacy systems — are inherently limited and prone to vulnerabilities. They require meticulous manual curation and can lead to false positives when they’re overly broad.

Consider the following example:

  • SQL injection pattern

'UNION SELECT+user_pass+FROM+wp_users+WHERE+ID=

  • Benign pattern

Select from the following menu options where available

If a regex rule looks for select.*from.*where (matched case-insensitively), both patterns match, so the benign text produces a false positive. This makes it difficult to craft forward-looking regexes that aren’t susceptible to such issues. Zero-day attacks, in particular, often necessitate the creation of additional rules, which can lead to delays in threat mitigation.
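The false positive above is easy to reproduce. The following is a minimal sketch that assumes a hypothetical legacy-style rule compiled with Python's re module and case-insensitive matching; the rule and the variable names are illustrative and are not taken from any real WAF ruleset.

```python
import re

# Hypothetical legacy-style rule: the three SQL keywords in order,
# matched case-insensitively, with anything in between.
RULE = re.compile(r"select.*from.*where", re.IGNORECASE)

samples = {
    "sqli": "'UNION SELECT+user_pass+FROM+wp_users+WHERE+ID=",
    "benign": "Select from the following menu options where available",
}


def rule_matches(text: str) -> bool:
    """Return True if the static rule fires anywhere in the text."""
    return RULE.search(text) is not None


# Both the attack payload and the harmless sentence trigger the rule.
results = {label: rule_matches(text) for label, text in samples.items()}
print(results)
```

Tightening the rule (for example, requiring + separators) would clear this one benign sentence, but it would also miss the same payload written with spaces or comments, which is the detection gap described earlier.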

Artificial intelligence benefits

In response to the challenges of static detections, modern WAFs use AI to automate detections and mitigation. There are several ways AI does this, including:

  • Pattern recognition
  • Adaptive learning
  • Zero-day and anomaly detection
  • Enhanced response time

Pattern recognition

Machine learning algorithms are adept at identifying patterns and anomalies in data that traditional systems miss. For example, they can augment or replace regex-based approaches by identifying nuanced patterns that static rules cannot. AI-based algorithms are also particularly good at seeing patterns in massive datasets that might be too granular for static techniques to see. Pattern recognition is important for both positive security (recognizing what good inputs look like) and negative security (recognizing a bad input).

Adaptive learning

AI-enabled systems can continuously adapt, optimize, and learn as attacks emerge and automatically improve detection and response capabilities. Akamai App & API Protector is continuously learning from the Akamai platform’s trillions of requests and uses its improved knowledge to make recommendations for your specific protections (seen in our adaptive self-tuning) while also applying automatic updates for web app and API protection.

Zero-day and anomaly detection

AI models identify deviations from normal behavior, often in ways that static models can’t. For instance, a sudden change in web traffic patterns could indicate an attack, even if it doesn’t match known signatures. Negative security, in particular, benefits significantly from AI, because traditionally the rulesets and signatures defined what a bad input looks like, meaning that something that was bad but not seen before would not trigger a mitigation. AI algorithms can recognize a new behavior as bad even if it’s not been seen before.
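To make the idea of a deviation from normal behavior concrete, here is a deliberately simple statistical stand-in, not one of the models described in this post. It assumes a hypothetical history of requests per minute for one client and flags a new observation whose z-score exceeds a threshold. The function name, the threshold, and the numbers are all illustrative assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, current, threshold=3.0):
    """Flag `current` if it lies more than `threshold` standard
    deviations away from the mean of the historical observations."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold


# Hypothetical requests-per-minute baseline for a single client.
baseline = [120, 115, 130, 125, 118, 122, 127, 119]

print(is_anomalous(baseline, 124))  # close to the usual rate
print(is_anomalous(baseline, 900))  # sudden spike
```

No signature is involved: the spike is flagged only because it departs from what the baseline says is normal, which is why this style of detection can react to behavior that has not been seen before.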

Enhanced response time

Artificial intelligence processes and analyzes data faster than humanly possible. Akamai’s use of AI speeds up the alerts on detections and also speeds up the time to deliver automatic updates and proactive, adaptive recommendations to your account. Our newly available rapid threat detection capability enables swift deployment of protections against emerging threats and high-profile CVEs. Customers can then validate their protections with our CVE lookup tool.

Multilayer machine learning for web security

Akamai employs a multilayer strategy to secure web applications effectively. This separation of stages offers several advantages, such as pinpointing specific text patterns responsible for triggering alerts and accurately classifying multi-vector attacks. This multilayer approach detects known attack patterns with low false positives and can learn new patterns during training without explicit instruction.

The multilayer strategy:

  • Tokenizes HTTP requests into n-grams with natural language processing (NLP)
  • Transforms n-grams into vectors using well-known algorithms
  • Projects vectors onto the attack space (SQLi, XSS, etc.) with principal component analysis (PCA)
  • Assigns attack probabilities using the gradient boosted trees (GBT) model
  • Classifies attack types (SQLi, XSS, etc.) using a neural network model

Tokenizes HTTP requests into n-grams with natural language processing (NLP)

We tokenize HTTP requests to create key-value pairs and extract n-grams from each value (e.g., “Give,” “the,” “cat,” “/etc/password”).
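As a rough illustration of this step, the sketch below parses a query string into key-value pairs and extracts n-grams from each value. It assumes whitespace tokenization and a made-up example URL; the actual tokenizer is not described in this post, so the function names and splitting rules here are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def request_to_ngrams(url, n=1):
    """Map each query parameter name to the n-grams of its value.

    Assumption: values are split on whitespace, so a path-like token
    such as "/etc/password" stays in one piece. If a key repeats,
    the last occurrence wins in this simplified version.
    """
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return {key: ngrams(value.split(), n) for key, value in pairs}


url = "https://example.com/search?q=Give+the+cat+/etc/password"
unigrams = request_to_ngrams(url, n=1)
bigrams = request_to_ngrams(url, n=2)
print(unigrams)
print(bigrams)
```

With n=1 the output is the four single tokens from the example above; with n=2 each token is paired with its right-hand neighbor, which preserves some local word order for the later stages.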

Transforms n-grams into vectors using well-known algorithms

We obtain vector representations of n-grams using well-known techniques like the Skip-gram Word2Vec model, enabling deeper contextual understanding of requests (Figure 1).

Fig. 1: Graphical representation of the CBOW model and Skip-gram model (Source: https://arxiv.org/pdf/1309.4168v1)
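Skip-gram learns its vectors by predicting the neighbors of each token. The sketch below shows only how the (center, context) training pairs are generated from a token sequence. It does not train a model, and the window size and function name are illustrative assumptions rather than details of the production pipeline.

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) pairs as used by Skip-gram training.

    For every position, each token within `window` positions to the
    left or right becomes a context word for the center token.
    """
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["Give", "the", "cat", "/etc/password"], window=1)
for center, context in pairs:
    print(center, "->", context)
```

A Word2Vec implementation would then adjust the vector of each center token so that it scores its observed context tokens highly, which is how tokens that appear in similar surroundings end up with similar vectors.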

Projects vectors onto the attack space with principal component analysis (PCA)

We extract attack signatures by projecting the vectors onto the attack space using PCA, a well-known dimensionality reduction technique that is useful in both data analysis and preprocessing.
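The sketch below illustrates generic PCA only: it finds the first principal component of a few toy vectors with power iteration and projects each vector onto it. How the attack space itself is defined is not covered in this post, so the toy data, the single-component projection, and all function names are assumptions made for illustration.

```python
import math


def center_rows(rows):
    """Subtract the per-dimension mean from every row."""
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
    return [[r[d] - means[d] for d in range(dims)] for r in rows], means


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1)
             for b in range(dims)] for a in range(dims)]


def first_component(cov, iterations=200):
    """Approximate the top eigenvector of `cov` by power iteration."""
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(dims)) for a in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, component, means):
    """Coordinate of each row along the principal component."""
    return [sum((r[d] - means[d]) * component[d] for d in range(len(component)))
            for r in rows]


# Toy 3-dimensional "embeddings": most variance lies along dims 0 and 1.
vectors = [[2.0, 1.9, 0.1], [1.0, 1.1, 0.0], [-1.0, -0.9, 0.1], [-2.0, -2.1, 0.0]]
centered, means = center_rows(vectors)
pc1 = first_component(covariance(centered))
scores = project(vectors, pc1, means)
print(pc1)
print(scores)
```

Each three-dimensional vector is reduced to a single score along the direction of greatest variance. A real pipeline would typically keep several components and use an optimized linear algebra library rather than hand-written loops.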

Assigns attack probabilities using the gradient boosted trees (GBT) model

Our GBT classifier combines multiple decision trees optimized through cross-validation (Figure 2). This approach improves classification accuracy and assigns attack probabilities more effectively.

Fig. 2: The equations represent the iterative learning process of a GBT model, where weak decision trees are sequentially optimized to minimize loss
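The iterative process can be sketched in a few lines. The example below is a toy gradient boosting loop with logistic loss, using one-split trees (stumps) on a single hypothetical feature. It omits cross-validation and everything else a production classifier would need, and the data, learning rate, and function names are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Find the one-split tree that best fits the residuals (squared error)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def train(xs, ys, rounds=50, learning_rate=0.5):
    """Sequentially add stumps, each fit to the negative gradient of the
    log loss (label minus predicted probability). Assumes both classes
    are present in `ys`."""
    base_rate = sum(ys) / len(ys)
    bias = math.log(base_rate / (1.0 - base_rate))
    scores = [bias] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        scores = [s + learning_rate * (left_value if x <= threshold else right_value)
                  for x, s in zip(xs, scores)]
    return bias, stumps, learning_rate


def attack_probability(model, x):
    """Sum the contributions of all stumps and squash to a probability."""
    bias, stumps, learning_rate = model
    score = bias + sum(learning_rate * (lv if x <= t else rv)
                       for t, lv, rv in stumps)
    return sigmoid(score)


# Hypothetical single feature (e.g., a projected score) and labels (1 = attack).
xs = [0.1, 0.2, 0.3, 0.4, 1.6, 1.8, 2.0, 2.2]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = train(xs, ys)
print(attack_probability(model, 0.2), attack_probability(model, 2.0))
```

Each round fits a weak learner to what the current ensemble still gets wrong, so the predicted probabilities for the two groups move steadily toward 0 and 1 as rounds accumulate.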

Classifies attack types (SQLi, XSS, etc.) using a neural network classifier

We use a multiclass classifier to assign attack types, such as Structured Query Language injection (SQLi), cross-site scripting (XSS), and local file inclusion (LFI). Our multilayered neural network model ensures accurate classification by selecting the attack group with the highest probability.
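Selecting the attack group with the highest probability usually means applying a softmax to the network's final-layer outputs and taking the largest entry. The sketch below assumes three hypothetical classes and made-up output scores (logits); it shows only that final selection step, not the network itself.

```python
import math

# Hypothetical class order for the network's output layer.
ATTACK_TYPES = ["SQLi", "XSS", "LFI"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable attack type and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ATTACK_TYPES[best], probs[best]


# Made-up final-layer scores for a single request.
label, confidence = classify([2.0, 0.5, -1.0])
print(label, round(confidence, 3))
```

Keeping the full probability vector, rather than only the winning label, can also help when a request combines several attack vectors.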

Extending AI-based app and API security: AI-driven Client Reputation integration

Although cloud-based WAFs are the core of web app and API security, protecting apps and APIs from malicious traffic requires a broader set of AI capabilities.

A key example of expanded solutions within Akamai’s broad portfolio is our AI-driven Client Reputation system, powered by XGBoost for shared IP address detection and specialized models for categories like web attackers, DDoS attacks, bots, scrapers, and scanners. By gathering insights from historical attacks across industries, these models provide faster, more extensive, and more accurate cross-product identification of attacker infrastructure.

AI is only as good as the data it trains on – and the safety protocols guiding it

Akamai's artificial intelligence and machine learning models are trained on billions of daily events, ensuring unmatched accuracy and speed. Our AI and ML platform adheres to MLOps best practices — enabling consistent, scalable, and secure deployment of models. 

However, AI needs guardrails and confirmation of its findings. Our approach to AI implementation is built on four layers that are designed to ensure both accuracy and safety.

  1. Research — Facilitates experimentation, model persistence, and performance validation

  2. Production — Includes pre-processing transformations, feature stores, and model version control

  3. KPI — Monitors model performance using extensive visualization tools

  4. Automated retraining — Ensures models remain effective by retraining with validated datasets

Stay secure. Stay ahead.

At Akamai, we are committed to shaping the future of cybersecurity by leveraging generative artificial intelligence and machine learning to redefine how web applications are secured. Rather than merely reacting to threats, we are using AI to stay ahead of attackers, proactively prevent threats, and ensure that our customers remain protected from ever-evolving risks.


