
Or Zuckerman is a Senior Security Researcher at Akamai, working in various security fields, including incident response, forensic research, and detection method development.

Written by

Ido Sakazi

Ido Sakazi is a Data Scientist and Senior Security Researcher at Akamai, focusing on cyber and network security data science projects. His expertise is in leveraging advanced data analysis and machine learning techniques to detect and mitigate potential threats.

Written by

Yarin Ozery

Yarin Ozery is a Security Researcher and Software Engineer at Akamai, working on innovative data-driven solutions to various security problems. His passion lies in developing end-to-end scalable high-performance security solutions.

Written by

Gil Goren

Gil Goren is a Security Researcher Lead at Akamai, leading the threat hunting research and development in Akamai Hunt. His expertise is around Data Analysis, Incident response and Detection engineering.


Introduction

Lateral movement poses a significant risk to organizations, as it allows attackers to navigate through a network undetected, increasing the chances of data breaches, unauthorized access, and system compromise. To compound this issue, it’s exceedingly difficult to detect. Traditional security measures are often ill-equipped for the task, leaving organizations exposed to sophisticated attacks.

This blog post will explore how machine learning can be used to detect the ways in which adversaries move throughout victim networks during the course of their campaigns. We’ll show how Akamai Hunt applies machine learning to detect, investigate, and remediate anomalous attack behaviors and advanced threats that often evade security tools.

Latest tools for enhanced threat detection

As part of our ongoing effort to introduce effective threat-hunting tools, the Akamai Hunt team recently developed two powerful machine learning models designed to detect the lateral movement of the most evasive threats.

Graph neural networks

Detecting assets with anomalous neighbors is a key way to uncover compromises and threats. For this, we employ graph neural networks (GNNs) to learn how an enterprise’s assets interact. We then use link prediction to forecast potential edges between nodes based on the established patterns.

This allows us to compare the actual connections with the expected connections, such that any new edge that significantly deviates from the expected connections is flagged as anomalous. Assets that establish multiple unusual or suspicious connections are then flagged as potentially anomalous neighbors.

Organizations can combine Akamai Guardicore Segmentation and Akamai Hunt to form a robust security ecosystem that addresses the risks associated with lateral movement and many other threats.

Don’t trust your neighbor … at least by default

The term anomalous neighbors refers to network assets that exhibit unusual or suspicious behavior compared with their neighboring assets. An asset with anomalous neighbors may be an infected asset that is trying to communicate with other assets, a behavior often observed when an attacker attempts lateral movement.

GNNs are a valuable tool for detecting anomalous nodes and edges within networks. They are used to model and analyze relationships between entities, learn their baseline behavior, and identify deviations.

Traditionally, GNNs were primarily used for static graphs in which the structure and features remained constant. However, recent advancements have expanded their capabilities to handle dynamic (temporal) graphs. When coupled with sequence encoding methods, such as recurrent neural networks (RNNs), they can explore relationships over time and track graph changes.

Modeling lateral movement with temporal link prediction

One significant application of GNNs is temporal link prediction, which can be particularly relevant in the context of lateral movement detection. Temporal link prediction aims to predict future or missing links in a temporal graph based on its historical information. It addresses questions such as, "Will a connection between two entities occur in the future?" or "Is there a missing relationship between two nodes at a particular time step?"

To model lateral movement as a temporal link problem, we capture snapshots of network logs at regular intervals. Each snapshot represents the interactions between entities in the network. We then create a graph for each snapshot, where the nodes represent the entities observed and the edges represent the interactions between them.

We can construct a graph $G_t = (V_t, E_t)$ that abstracts the snapshot at time $t$, such that the nodes $V_t$ are the entities observed in the snapshot and the edges $E_t$ are the interactions between those entities (Figure 1).
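To make the snapshot construction concrete, here is a minimal Python sketch. It assumes log records arrive as (timestamp, source, destination) tuples and that one snapshot covers one day; the record format, the function name build_snapshots, and the sample connections are illustrative assumptions, not the format Akamai Hunt uses.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def build_snapshots(records, start, interval):
    """Group (timestamp, src, dst) records into per-interval graphs G_t = (V_t, E_t)."""
    snapshots = defaultdict(lambda: {"nodes": set(), "edges": set()})
    for ts, src, dst in records:
        t = (ts - start) // interval          # index of the snapshot that contains ts
        snapshots[t]["nodes"].update((src, dst))
        snapshots[t]["edges"].add((src, dst))
    return dict(snapshots)

# Hypothetical log records: (timestamp, source asset, destination asset)
records = [
    (datetime(2023, 6, 15, 9, 0), "win2008r2-1", "DC-01"),
    (datetime(2023, 6, 16, 10, 0), "win2008r2-1", "DC-01"),
    (datetime(2023, 6, 16, 10, 5), "win2008r2-1", "win2016-2"),
]
snapshots = build_snapshots(
    records, start=datetime(2023, 6, 15), interval=timedelta(days=1)
)
for t in sorted(snapshots):
    print(t, sorted(snapshots[t]["edges"]))
```

Each key of the returned dictionary is a time step, and each value holds the node set and edge set for that step.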

Fig. 1: Temporal graph networks

The goal of the temporal link prediction model is to learn what “normal” activity looks like based on historical snapshots. It considers both the structure of the snapshot graphs and how they change over time. Once we have this baseline, the model can predict future links by assigning likelihood scores to potential edges.
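The scoring and flagging logic can be sketched without a neural network. The Python example below is a deliberately simple stand-in, not a GNN: it scores each edge by how often it appeared in historical snapshots, then flags assets that open several low-scoring edges. The thresholds, function names, and asset names other than win2008r2-1, DC-01, and win2016-2 are assumptions made for illustration.

```python
from collections import Counter

def edge_scores(history):
    """Score each edge by the fraction of historical snapshots that contain it."""
    counts = Counter(edge for edges in history for edge in edges)
    return {edge: n / len(history) for edge, n in counts.items()}

def anomalous_assets(history, current, score_threshold=0.2, min_edges=2):
    """Return source assets with at least `min_edges` low-likelihood edges."""
    scores = edge_scores(history)
    flagged = Counter(
        src for src, dst in current
        if scores.get((src, dst), 0.0) < score_threshold  # unseen edges score 0.0
    )
    return {asset for asset, n in flagged.items() if n >= min_edges}

# Hypothetical edge sets: two historical snapshots and the snapshot under test
history = [
    {("win2008r2-1", "DC-01")},
    {("win2008r2-1", "DC-01"), ("client-7", "DC-01")},
]
current = {
    ("win2008r2-1", "DC-01"),
    ("win2008r2-1", "win2016-2"),
    ("win2008r2-1", "file-srv-3"),
    ("client-7", "DC-01"),
}
print(anomalous_assets(history, current))  # {'win2008r2-1'}
```

A learned model would replace edge_scores with likelihoods predicted from graph structure and history, but the comparison of observed edges against expected ones follows the same shape.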

Although GNNs do a good job of learning the structure of a single snapshot graph, to consider how the graph changes over time, we must combine them with sequence encoders such as RNNs.

The model can be enriched by adding node features based on the entities, such as the role of the entity within the network (domain controller server, client endpoint, etc.), as well as edge features based on the interactions between the entities, such as ports, related processes, and the ratio of failed connections to the total number of connections between the two entities. These features allow for more accurate and informative detections.
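As an illustration of such edge features, the following Python sketch aggregates the failed-connection ratio and the set of ports per edge. The input tuple layout (source, destination, port, succeeded) and the sample connections are assumptions made for this example.

```python
from collections import defaultdict

def edge_features(connections):
    """Aggregate per-edge features from (src, dst, port, succeeded) records."""
    stats = defaultdict(lambda: {"total": 0, "failed": 0, "ports": set()})
    for src, dst, port, succeeded in connections:
        s = stats[(src, dst)]
        s["total"] += 1
        s["failed"] += 0 if succeeded else 1
        s["ports"].add(port)
    return {
        edge: {
            "failed_ratio": s["failed"] / s["total"],  # failed / total connections
            "ports": sorted(s["ports"]),
        }
        for edge, s in stats.items()
    }

# Hypothetical connection records
connections = [
    ("win2008r2-1", "DC-01", 445, True),
    ("win2008r2-1", "DC-01", 445, True),
    ("win2008r2-1", "DC-01", 3389, False),
    ("win2008r2-1", "DC-01", 445, True),
]
features = edge_features(connections)
print(features[("win2008r2-1", "DC-01")])
# {'failed_ratio': 0.25, 'ports': [445, 3389]}
```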

GNNs in practice

To showcase the efficacy of our method, we simulated real scenarios within computer networks. For example, we examined the connections of a Windows Server 2008 R2 machine (win2008r2-1) on two different days in 2023: June 15 and June 16 (Figure 2).

Fig. 2: Neighbors connections of win2008r2-1 on June 15 and June 16

As the left panel of Figure 2 shows, on June 15, win2008r2-1 had only one internal neighbor, namely DC-01, indicating a stable network environment.

However, things got interesting on June 16 when we saw a significant change in the connectivity pattern of win2008r2-1. We observed multiple internal connections and new neighbors from various labels and sections within the organization. These changes suggested the possibility of lateral movement and a potential security breach.
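The day-over-day change described here can be checked with a plain set difference before any model is involved. In the Python sketch below, the June 16 neighbor names are hypothetical placeholders, since the figure's actual asset names are not listed in the text.

```python
def neighbors(edges, asset):
    """Return the set of assets directly connected to `asset`."""
    outgoing = {dst for src, dst in edges if src == asset}
    incoming = {src for src, dst in edges if dst == asset}
    return outgoing | incoming

def new_neighbors(previous_edges, current_edges, asset):
    """Return neighbors present in the current snapshot but not the previous one."""
    return neighbors(current_edges, asset) - neighbors(previous_edges, asset)

june_15 = {("win2008r2-1", "DC-01")}
# Hypothetical June 16 edges; "hr-ws-12" and "fin-db-02" are invented names
june_16 = {
    ("win2008r2-1", "DC-01"),
    ("win2008r2-1", "hr-ws-12"),
    ("fin-db-02", "win2008r2-1"),
}
print(sorted(new_neighbors(june_15, june_16, "win2008r2-1")))
# ['fin-db-02', 'hr-ws-12']
```

A raw difference like this flags every new neighbor equally; the link prediction model adds the judgment of which new edges are unlikely given the learned baseline.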

By using the GNN-based anomalous neighbors detection method, Akamai Hunt swiftly identified abnormal connection patterns and offered proactive mitigation with Akamai Guardicore Segmentation.

Detecting process spreading anomaly with TF-IDF

The detection of suspicious processes spreading over the network is crucial to ensure the security and integrity of a system. Suspicious processes refer to those running on a system that exhibit unusual or potentially malicious behavior. These processes may attempt to propagate themselves across the network, exploit vulnerabilities, access unauthorized resources, or engage in other activities indicative of malicious intent.

To analyze and detect suspicious processes, we used term frequency–inverse document frequency (TF-IDF) analysis, a statistical technique that allows us to assess the significance of processes within a collection of assets. High scores are assigned to processes that are relatively unique to a specific asset. By employing TF-IDF to identify processes with high scores, we can prioritize our investigation efforts and examine the highest-scoring processes first.

TF-IDF was first introduced in natural language processing, where it evaluates the importance of a word in a document relative to a corpus. It is used in various domains, such as information retrieval, sentiment analysis, text classification, recommendation systems, and social network analysis.

Using TF-IDF to detect process spreading

TF-IDF for a word in a document is calculated by multiplying two metrics.

Term frequency — How many times a word appears in a document
Inverse document frequency — How common or rare a word is in the entire document set

Here’s how we applied the TF-IDF method for detecting process spreading across the network:

The process represents the word.

The asset represents the document.

$\text{TF-IDF}(p, a) = \text{TF}(p, a) \cdot \text{IDF}(p)$

TF measures the frequency of a process $p$ within an asset $a$.

IDF measures the rarity of a process $p$ across all assets.
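The mapping of processes to words and assets to documents can be sketched in a few lines of Python. The exact TF and IDF definitions used by the team are not given in the text, so this example assumes the textbook variants: TF is the share of an asset's process executions belonging to the process, and IDF is the natural log of the number of assets divided by the number of assets running the process. The sample data and the asset name win2016-1 are hypothetical.

```python
import math
from collections import Counter

def tf_idf(process_log):
    """Compute TF-IDF(p, a) = TF(p, a) * IDF(p) for every (process, asset) pair.

    process_log maps each asset to the list of process names observed on it.
    """
    n_assets = len(process_log)
    # Document frequency: number of assets on which each process was seen
    df = Counter(p for procs in process_log.values() for p in set(procs))
    scores = {}
    for asset, procs in process_log.items():
        for p, count in Counter(procs).items():
            tf = count / len(procs)              # assumed TF: relative frequency
            idf = math.log(n_assets / df[p])     # assumed IDF: log(N / df)
            scores[(p, asset)] = tf * idf
    return scores

# Hypothetical process observations per asset
process_log = {
    "win2016-1": ["svchost.exe", "svchost.exe", "explorer.exe"],
    "win2016-2": ["svchost.exe", "worm.exe", "worm.exe", "worm.exe"],
    "win2008r2-1": ["svchost.exe", "explorer.exe"],
}
scores = tf_idf(process_log)
top = max(scores, key=scores.get)
print(top)  # ('worm.exe', 'win2016-2')
```

Under this textbook variant, a process seen on every asset gets an IDF of zero, so detecting a process that has already spread everywhere requires computing rarity against a baseline period rather than the current day alone.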

Advantages for detecting processes with TF-IDF

TF-IDF analysis offers several advantages for detecting processes spreading over the network.

Proven in social media analysis
Researchers have employed the TF-IDF method in the social network domain to analyze messages spreading on Twitter. Since our objective of identifying processes that spread through the network shares similarities with this approach, we also chose to use the TF-IDF method.
Applies to various data sources
TF-IDF is a versatile technique that can be applied to various data formats, including network logs, flow data, and process execution records. Moreover, it scales well to large datasets, making it suitable for real-time or retrospective analysis of network processes. As our Akamai Hunt team collects data from various sources, this method proves extremely useful.
Produces weighted, interpretable scores
TF-IDF quantifies the significance of processes by assigning weights based on their frequency and occurrence within the network data. These weightings generate interpretable scores that help IT managers understand why specific processes are identified as potentially malicious or suspicious. This clear and detailed explanation facilitates a deeper understanding of the potential threats present in the network environment, enabling informed decision-making and effective response strategies.
Captures process frequency
TF-IDF’s term frequency component allows us to quantify the occurrence of each process in network data, capturing the frequency with which they appear. Assigning a higher weight to more frequent processes helps highlight potential anomalies or patterns that may require further investigation.
Detects uncommon processes
TF-IDF's inverse document frequency component effectively identifies rare or infrequently occurring processes. By emphasizing these uncommon processes, we can uncover suspicious activities that have previously gone unnoticed, improving overall network security.

Process spreading in practice

In our simulations, we focused on all connections on June 17 and compared them with the week before (June 10 – June 16).

We ran our TF-IDF algorithm and received the following output (Table).

Table: TF-IDF output, June 17

The table shows a very high anomaly score for worm.exe, the spreading process in our simulation. We then examined how this malicious process spread on June 17. Figure 3 shows the spread starting from asset win2016-2 and propagating to all other assets. Visualizing the spread in this way allows organizations to identify the affected assets and take immediate action to prevent further damage.
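One simple way to recover a spread path like this from raw events is to order assets by the first time the process was seen on each. The Python sketch below assumes events are (timestamp, asset, process) tuples; the timestamps and the asset name win2016-1 are hypothetical.

```python
from datetime import datetime

def spread_order(events, process):
    """Return assets ordered by the first time `process` was seen on each."""
    first_seen = {}
    for ts, asset, proc in events:
        if proc == process and (asset not in first_seen or ts < first_seen[asset]):
            first_seen[asset] = ts
    return sorted(first_seen, key=first_seen.get)

# Hypothetical process events on June 17
events = [
    (datetime(2023, 6, 17, 8, 0), "win2016-2", "worm.exe"),
    (datetime(2023, 6, 17, 8, 7), "win2016-1", "worm.exe"),
    (datetime(2023, 6, 17, 8, 3), "win2016-1", "svchost.exe"),
    (datetime(2023, 6, 17, 8, 15), "win2008r2-1", "worm.exe"),
    (datetime(2023, 6, 17, 8, 20), "win2016-2", "worm.exe"),
]
order = spread_order(events, "worm.exe")
print(order)  # ['win2016-2', 'win2016-1', 'win2008r2-1']
```

The first asset in the list is the earliest observed host, which is a reasonable starting point for investigation but not proof of the original infection source.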

Fig. 3: The spread of worm.exe from win2016-2, June 17

Akamai Hunt’s process spreading detection method leverages the power of TF-IDF analysis to systematically identify and prioritize potentially risky or malicious processes. This approach helps organizations enhance overall network security and safeguard critical information.

Conclusion

The practical application of machine learning in cybersecurity becomes apparent as we see how GNNs and TF-IDF can help detect real threats when traditional detection mechanisms fail. Through simulations of real-life scenarios, we demonstrated how these methods can enhance security and provide valuable insights into potential threats.

The anomalous neighbors detection method, which combines GNN models with sequence encoders, proved effective in identifying abnormal connection patterns that might otherwise go unnoticed.
The process spreading anomaly detection method, using TF-IDF analysis, excelled in identifying and understanding suspicious processes propagating throughout the network.

The Akamai Hunt team continues to apply machine learning, among many other mechanisms, to protect customers’ critical environments — with the aim of uncovering the most evasive threats.

Find out more

To learn more about our managed threat hunting service, visit our Akamai Hunt page.


Written by

Or Zuckerman, Ido Sakazi, Yarin Ozery, and Gil Goren

October 20, 2023



Detect and Remediate Attacks: Practical Applications for Machine Learning
