Cloud GPUs on Akamai
Access on‑demand NVIDIA GPUs with predictable pricing, low egress costs, and fast provisioning to accelerate AI training and inference, HPC, rendering, and media workloads.
- See current hourly rates and regions on the Akamai Cloud pricing page. View GPU pricing
- New: NVIDIA RTX PRO 6000 Blackwell Server Edition for high‑throughput inference. Join the waitlist
- Get started fast with docs, API/CLI, and Terraform. View documentation
Why GPUs on Akamai Cloud
- Dedicated GPU performance for consistent throughput on AI and HPC workloads
- Low, predictable egress — save up to 90% vs. typical hyperscalers ($0.005/GB in most regions)
- On‑demand billing with no long‑term contracts and easy scale up/down
- Deploy close to users and data to reduce latency and improve real‑time experiences
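The egress savings compound quickly at scale. A rough back-of-the-envelope comparison (the $0.05/GB comparison rate is an assumption chosen to illustrate the "up to 90%" figure; actual hyperscaler egress rates vary by provider and tier):

```python
# Illustrative monthly egress cost comparison.
AKAMAI_RATE = 0.005  # $/GB in most Akamai Cloud regions (per this page)
OTHER_RATE = 0.05    # $/GB, assumed comparison rate for a typical hyperscaler

def monthly_egress_cost(gb_per_month: float, rate_per_gb: float) -> float:
    """Return the monthly egress bill in dollars."""
    return gb_per_month * rate_per_gb

gb = 50_000  # e.g. 50 TB of model responses / media served per month
akamai = monthly_egress_cost(gb, AKAMAI_RATE)
other = monthly_egress_cost(gb, OTHER_RATE)
print(f"Akamai: ${akamai:,.0f}/mo  vs  ${other:,.0f}/mo")
print(f"Savings: {1 - akamai / other:.0%}")  # → Savings: 90%
```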
Available GPU options
- NVIDIA RTX PRO 6000 Blackwell Server Edition
- 96 GB GDDR7; designed for high‑concurrency, edge‑proximate inference and agentic AI
- Access via NVIDIA AI Enterprise toolchain (NIM, NeMo) on Akamai Inference Cloud
- Availability is rolling out globally. Join the waitlist, and see the NVIDIA RTX PRO 6000 Blackwell data sheet for a product overview
- NVIDIA RTX 4000 Ada Generation
- Balanced price/performance for ML inference, analytics, and media processing
- Built‑in media engines: 2x encode, 2x decode, 1x AV1 encode/decode per card
- Plans start at $0.52/hour. See pricing
- NVIDIA Quadro RTX 6000
- 4,608 CUDA cores; dedicated encode/decode engines for hardware transcoding
- Great for transcoding, rendering, and visualization
Need larger clusters or edge‑native inference? Akamai Inference Cloud supports high‑performance inference stacks, including multi‑GPU configurations (up to 8x RTX PRO 6000 Blackwell Server Edition GPUs per node), NVIDIA BlueField DPUs, and high‑memory/NVMe profiles to optimize time to first token (TTFT) and tokens per second (TPS). Explore Akamai Inference Cloud
Performance and cost highlights
- Up to 60% lower latency and up to 3x higher throughput for AI workloads compared to equivalent hyperscaler GPUs
- Up to 86% lower inference cost demonstrated with Stable Diffusion on Akamai Cloud
- Edge‑aware routing helps deliver more consistent real‑time responses for interactive apps and agents
Learn how to optimize cost and performance. Download the AI inference cost optimization white paper
Platform capabilities
- Compute and orchestration
- Dedicated GPU VMs; add GPU node pools to Linode Kubernetes Engine (LKE)
- Pre‑integrated CNCF tooling on App Platform for LKE (KServe, Kubeflow Pipelines, vLLM); NVIDIA NIM and NeMo on Akamai Inference Cloud
- Storage and data
- Block and Object Storage; automated Backups and snapshots
- Networking and security
- Private networking, load balancing (NodeBalancers), and Cloud Firewall
- Low‑cost egress ($0.005/GB in most regions)
- Tooling and support
- Provision via UI, API, CLI, and Terraform
- Full docs to install the NVIDIA CUDA toolkit and drivers. Install CUDA
- 24/7/365 support by email and phone
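After installing the driver and CUDA toolkit, a quick sanity check that the GPU is actually visible can save debugging time later. A minimal sketch that falls back gracefully on machines where the driver is not yet installed:

```python
import shutil
import subprocess

def gpu_status() -> str:
    """Report whether the NVIDIA driver is installed and the GPU is reachable."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found: install the NVIDIA driver and CUDA toolkit first"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return f"driver installed but not responding: {result.stderr.strip()}"
    return result.stdout.strip()  # one "name, memory" line per GPU

print(gpu_status())
```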
Provisioning: from zero to GPU in minutes
- Create an Akamai Cloud account. Sign up
- Choose a region and GPU plan (RTX PRO 6000 Blackwell, RTX 4000 Ada, or Quadro RTX 6000). See pricing and regions
- Select your OS image and deploy the instance.
- Install NVIDIA drivers and CUDA (or use a prepared image). Follow the CUDA guide
- Attach Block Storage and configure backups/snapshots as needed.
- Configure networking (private networking, firewall rules, and load balancer).
- Optional: Add GPU node pools to an LKE cluster for scalable inference/training.
- Deploy your application and monitor performance via the UI, API, or your observability stack.
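The steps above can also be scripted against the Akamai Cloud (Linode) API. A minimal sketch using only the standard library; the region, GPU plan ID, and image below are illustrative (list current GPU plan IDs with `linode-cli linodes types` or see the pricing page), and the request is only sent when a `LINODE_TOKEN` environment variable is set:

```python
import json
import os
import urllib.request

API = "https://api.linode.com/v4/linode/instances"

def build_gpu_request(label: str, region: str, gpu_type: str, root_pass: str) -> dict:
    """Assemble the instance-create payload for the Linode API."""
    return {
        "label": label,
        "region": region,          # pick a region that offers GPU plans
        "type": gpu_type,          # GPU plan ID; check current plans before using
        "image": "linode/ubuntu22.04",
        "root_pass": root_pass,
        "backups_enabled": True,
    }

def create_instance(token: str, payload: dict) -> dict:
    """POST the payload; returns the created instance's JSON description."""
    req = urllib.request.Request(
        API,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

payload = build_gpu_request("gpu-inference-01", "us-ord", "g1-gpu-rtx6000-1", "ChangeMe123!")
token = os.environ.get("LINODE_TOKEN")
if token:
    print(create_instance(token, payload)["status"])
else:
    print("Dry run (set LINODE_TOKEN to provision). Payload type:", payload["type"])
```

From here, driver installation, storage, and networking follow the same steps as the UI flow above.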
Popular workloads
- AI training and fine‑tuning, batch and real‑time inference, RAG, and agentic systems
- HPC and scientific computing
- Rendering, 3D visualization, and media transcoding (AV1 support)
Next steps
Have a specific model or pipeline in mind? We can help map it to the right GPU, storage, and network profile so you can deploy with confidence.