Cloud GPUs on Akamai
Access on‑demand NVIDIA GPUs with predictable pricing, low egress costs, and fast provisioning to accelerate AI training and inference, HPC, rendering, and media workloads.
- See current hourly rates and regions on the Akamai Cloud pricing page. View GPU pricing
- New: NVIDIA RTX PRO 6000 Blackwell Server Edition for high‑throughput inference. Join the waitlist
- Get started fast with docs, API/CLI, and Terraform. View documentation
Why GPUs on Akamai Cloud
- Dedicated GPU performance for consistent throughput on AI and HPC workloads
- Low, predictable egress — save up to 90% vs. typical hyperscalers ($0.005/GB in most regions)
- On‑demand billing with no long‑term contracts and easy scale up/down
- Deploy close to users and data to reduce latency and improve real‑time experiences
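The egress savings compound quickly at scale. A rough back-of-the-envelope comparison (the $0.05/GB comparison rate is an assumption chosen to illustrate the "up to 90%" figure; actual hyperscaler egress rates vary by provider and tier):

```python
# Illustrative monthly egress cost comparison.
AKAMAI_RATE = 0.005  # $/GB in most Akamai Cloud regions (per this page)
OTHER_RATE = 0.05    # $/GB, assumed comparison rate for a typical hyperscaler

def monthly_egress_cost(gb_per_month: float, rate_per_gb: float) -> float:
    """Return the monthly egress bill in dollars."""
    return gb_per_month * rate_per_gb

gb = 50_000  # e.g. 50 TB of model responses / media served per month
akamai = monthly_egress_cost(gb, AKAMAI_RATE)
other = monthly_egress_cost(gb, OTHER_RATE)
print(f"Akamai: ${akamai:,.0f}/mo  vs  ${other:,.0f}/mo")
print(f"Savings: {1 - akamai / other:.0%}")  # → Savings: 90%
```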
Available GPU options
- NVIDIA RTX PRO 6000 Blackwell Server Edition
- 96 GB GDDR7; designed for high‑concurrency, edge‑proximate inference and agentic AI
- Access via NVIDIA AI Enterprise toolchain (NIM, NeMo) on Akamai Inference Cloud
- Availability is rolling out globally. Join the waitlist, and see the NVIDIA RTX PRO 6000 Blackwell data sheet for a product overview
- NVIDIA RTX 4000 Ada Generation
- Balanced price/performance for ML inference, analytics, and media processing
- Built‑in media engines: 2x encode, 2x decode, 1x AV1 encode/decode per card
- Plans start at $0.52/hour. See pricing
- NVIDIA Quadro RTX 6000
- 4,608 CUDA cores; dedicated encode/decode engines for hardware transcoding
- Great for transcoding, rendering, and visualization
Need larger clusters or edge‑native inference? Akamai Inference Cloud supports high‑performance inference stacks, including multi‑GPU configurations (up to 8x RTX PRO 6000 Blackwell Server Edition GPUs per node), NVIDIA BlueField DPUs, and high‑memory/NVMe profiles to optimize time to first token (TTFT) and tokens per second (TPS). Explore Akamai Inference Cloud
Performance and cost highlights
- Up to 60% lower latency and up to 3x higher throughput for AI workloads compared to equivalent hyperscaler GPUs
- Up to 86% lower inference cost demonstrated with Stable Diffusion on Akamai Cloud
- Edge‑aware routing helps deliver more consistent real‑time responses for interactive apps and agents
Learn how to optimize cost and performance. Download the AI inference cost optimization white paper
Platform capabilities
- Compute and orchestration
- Dedicated GPU VMs; add GPU node pools to Linode Kubernetes Engine (LKE)
- Pre‑integrated CNCF tooling on App Platform for LKE (KServe, Kubeflow Pipelines, vLLM); NVIDIA NIM and NeMo on Akamai Inference Cloud
- Storage and data
- Block and Object Storage; automated Backups and snapshots
- Networking and security
- Private networking, load balancing (NodeBalancers), and Cloud Firewall
- Low‑cost egress ($0.005/GB in most regions)
- Tooling and support
- Provision via UI, API, CLI, and Terraform
- Full docs to install the NVIDIA CUDA toolkit and drivers. Install CUDA
- 24/7/365 support by email and phone
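After installing the driver and CUDA toolkit, a quick sanity check that the GPU is actually visible can save debugging time later. A minimal sketch that falls back gracefully on machines where the driver is not yet installed:

```python
import shutil
import subprocess

def gpu_status() -> str:
    """Report whether the NVIDIA driver is installed and the GPU is reachable."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found: install the NVIDIA driver and CUDA toolkit first"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return f"driver installed but not responding: {result.stderr.strip()}"
    return result.stdout.strip()  # one "name, memory" line per GPU

print(gpu_status())
```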
Provisioning: from zero to GPU in minutes
- Create an Akamai Cloud account. Sign up
- Choose a region and GPU plan (RTX PRO 6000 Blackwell, RTX 4000 Ada, or Quadro RTX 6000). See pricing and regions
- Select your OS image and deploy the instance.
- Install NVIDIA drivers and CUDA (or use a prepared image). Follow the CUDA guide
- Attach Block Storage and configure backups/snapshots as needed.
- Configure networking (private networking, firewall rules, and load balancer).
- Optional: Add GPU node pools to an LKE cluster for scalable inference/training.
- Deploy your application and monitor performance via the UI, API, or your observability stack.
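The steps above can also be scripted against the Akamai Cloud (Linode) API. A minimal sketch using only the standard library; the region, GPU plan ID, and image below are illustrative (list current GPU plan IDs with `linode-cli linodes types` or see the pricing page), and the request is only sent when a `LINODE_TOKEN` environment variable is set:

```python
import json
import os
import urllib.request

API = "https://api.linode.com/v4/linode/instances"

def build_gpu_request(label: str, region: str, gpu_type: str, root_pass: str) -> dict:
    """Assemble the instance-create payload for the Linode API."""
    return {
        "label": label,
        "region": region,          # pick a region that offers GPU plans
        "type": gpu_type,          # GPU plan ID; check current plans before using
        "image": "linode/ubuntu22.04",
        "root_pass": root_pass,
        "backups_enabled": True,
    }

def create_instance(token: str, payload: dict) -> dict:
    """POST the payload; returns the created instance's JSON description."""
    req = urllib.request.Request(
        API,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

payload = build_gpu_request("gpu-inference-01", "us-ord", "g1-gpu-rtx6000-1", "ChangeMe123!")
token = os.environ.get("LINODE_TOKEN")
if token:
    print(create_instance(token, payload)["status"])
else:
    print("Dry run (set LINODE_TOKEN to provision). Payload type:", payload["type"])
```

From here, driver installation, storage, and networking follow the same steps as the UI flow above.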
Popular workloads
- AI training and fine‑tuning, batch and real‑time inference, RAG, and agentic systems
- HPC and scientific computing
- Rendering, 3D visualization, and media transcoding (AV1 support)
Next steps
Have a specific model or pipeline in mind? We can help map it to the right GPU, storage, and network profile so you can deploy with confidence.