GPU compute

GPU VMs and GPU-enabled managed Kubernetes. One governance model.

GPU VMs across five providers. GPU-enabled managed Kubernetes across three hyperscalers. Pre-validated NVIDIA/CUDA images. Your data scientists self-serve. Your governance model covers everything they provision.

AWS · GCP · Azure · emma · Nebius
Full control or fully managed. Same platform. Same governance.
The same task. A different experience.

Before emma

  • Data scientists wait days for GPU VMs or clusters
  • Driver conflicts endemic
  • Different workflow per cloud
  • GPU resources outside governance perimeter
  • Cost siloed per provider. No unified inventory

After emma

  • GPU VM or K8s cluster in minutes
  • Pre-validated images — no driver debugging
  • Governance applied automatically
  • Self-service within guardrails
  • Unified cost per team, project, cloud
Source GPU capacity across providers — from a single control plane.

emma surfaces GPU availability, pricing, and instance types across all connected providers. Your team compares options, provisions from the best source, and governs everything through one interface — instead of navigating five provider consoles to find the right GPU.

Cross-provider comparison

Compare GPU instance types, availability, and pricing across AWS, GCP, Azure, emma, and Nebius before provisioning.

Spot and on-demand

Surface spot GPU capacity across providers. When spot dries up on one provider, find the next option — without leaving the platform.
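Under the hood, that fallback is a simple selection problem: prefer the cheapest available spot offer, and fall back to on-demand when spot capacity is gone. A toy sketch in Python — the offer data, field names, and prices are invented for illustration, not emma's actual data model:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GpuOffer:
    provider: str          # e.g. "aws", "gcp", "nebius"
    gpu_type: str          # e.g. "H100"
    kind: str              # "spot" or "on-demand"
    price_per_hour: float  # invented example prices
    available: bool

def pick_offer(offers: list[GpuOffer], gpu_type: str) -> Optional[GpuOffer]:
    """Prefer the cheapest available spot offer; fall back to on-demand."""
    candidates = [o for o in offers if o.gpu_type == gpu_type and o.available]
    spot = [o for o in candidates if o.kind == "spot"]
    pool = spot or candidates  # spot dried up everywhere -> consider on-demand
    return min(pool, key=lambda o: o.price_per_hour, default=None)

offers = [
    GpuOffer("aws", "H100", "spot", 4.10, available=False),  # spot dried up
    GpuOffer("gcp", "H100", "spot", 4.40, available=True),
    GpuOffer("azure", "H100", "on-demand", 9.80, available=True),
]
best = pick_offer(offers, "H100")
print(best.provider, best.kind)  # gcp spot
```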

Unified cost attribution

GPU spend attributed per team, project, and provider. One cost view regardless of where the GPU workload runs.

Your GPUs. At your fingertips.

Access the latest NVIDIA and AMD accelerators through emma's unified platform.

  • NVIDIA H200 · 140 / 280 / 420 / 560 GB & 1.1 TB · LLM training, 100B+ models
  • NVIDIA H100 · 80 GB · Training & inference up to 70B
  • NVIDIA A100 · 40 GB / 80 GB · Fine-tuning, batch inference
  • NVIDIA L40S · 48 GB GDDR6 · Video AI, graphics + inference
  • NVIDIA L4 · 24 GB GDDR6 · Cost-effective inference
  • NVIDIA A10G · 24 GB GDDR6 · Inference, rendering
  • NVIDIA T4 · 16 GB GDDR6 · Lightweight inference
  • AMD Radeon Pro V520 · 8 GB GDDR6 · Graphics, rendering
NVLink-ready · Multi-GPU configurations · Reserved instances available · Pricing varies by provider and region
Cross-provider GPU comparison view — instance types, pricing, and availability
GPU VMs vs GPU managed Kubernetes — a decision guide.

Both are governed by the same platform. The choice depends on how you're operating today. If you already run workloads on Kubernetes, add GPUs to your existing environment with emma's Managed Kubernetes. If not, start directly with GPU VMs.

  • Best for · GPU VMs: training runs, experimentation, full-stack control · GPU managed Kubernetes (mk8s): containerized workloads, orchestration, team scaling
  • Abstraction level · GPU VMs: you manage the VM (OS, packages, runtime) · mk8s: the cluster is managed, you deploy containers
  • Providers · GPU VMs: AWS, GCP, Azure, emma, Nebius · mk8s: AWS (EKS), Azure (AKS), GCP (GKE)
  • Provisioning speed · GPU VMs: minutes (VM wizard or API) · mk8s: under 5 minutes (UI or CLI)
  • Images · GPU VMs: pre-validated NVIDIA driver-optimized OS · mk8s: pre-validated CUDA container images
  • Monitoring · GPU VMs: 3 GPU metrics (utilization, vRAM usage, vRAM utilization) · mk8s: 9 GPU metrics (incl. power, temperature, clock speed)
  • Inference workflows · GPU VMs: yes, via governed templates · mk8s: on the roadmap
  • Governance · both: same RBAC, tagging, cost attribution, and audit

Not sure? Both are available on the same platform. Many teams use VMs for experimentation and training, and mk8s for production inference. emma's governance model covers both.

From training to production — every workload governed from the start.

Model training and fine-tuning

Fine-tune foundation models or train custom models on H200, H100, or A100 GPUs. Pre-validated images mean your training job starts on first boot. VMs for full control, mk8s for orchestrated multi-node runs.

Real-time inference

Serve predictions in production on cost-effective GPUs (L40S, L4, T4). Deploy via Inference Workflow templates for governed, repeatable endpoints — or run serving containers on mk8s for Kubernetes-native inference.
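The mk8s path relies on standard Kubernetes GPU scheduling: a serving Deployment requests the accelerator through the `nvidia.com/gpu` resource, and the scheduler places the pod on a GPU node. A minimal sketch — the name, image, and port below are placeholders, not emma defaults:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-server              # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels: {app: inference-server}
  template:
    metadata:
      labels: {app: inference-server}
    spec:
      containers:
        - name: server
          image: registry.example.com/llm-server:latest  # placeholder image
          ports:
            - containerPort: 8080
          resources:
            limits:
              nvidia.com/gpu: 1       # standard Kubernetes GPU resource request
```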

GPU-accelerated data processing

Run RAPIDS, Spark on GPU, or custom ETL pipelines. Spin up GPU VMs for the job, shut them down when complete. Cost attributed per team and project — no surprise bills.

AI/ML experimentation

Data science teams self-serve GPU VMs within governance guardrails. Experiment freely without creating shadow infrastructure, ungoverned cloud spend, or driver-debugging tickets.

GPU compute is layer one. Here's what's built on top of it.

Provisioning a GPU is the start. What makes emma different is what happens after — monitoring, networking, deployment, and governance applied to every GPU resource from the moment it boots.

GPU monitoring

Utilization, vRAM, power, temperature — visible in the same interface as CPU and network. No agents. No third-party tools. Available for both VMs and mk8s clusters.

Explore Monitoring →

Cross-Cloud Networking

Training data on one provider, inference on another — connected through emma's 400 Gbps backbone. No VPC peering. No manual config. Up to 70% egress reduction.

Explore Networking →

Inference workflows

Governed templates for deploying inference environments on GPU VMs. Platform teams define the standard. Application teams self-serve. Every deployment is governed and auditable.

Explore Workflows →

GPU compute on emma
Should I use GPU VMs or GPU managed Kubernetes?

It generally depends on your current setup and how you run workloads. GPU VMs give you full control — ideal for training runs, experimentation, and workloads where you manage the full stack. GPU mk8s is better for containerized workloads that need orchestration and cluster-level management. Both are governed by the same platform. Many teams use both.

Which GPU types are available?

NVIDIA H200, H100, A100, L40S, L4, A10G, T4, and AMD Radeon Pro V520. Availability and pricing vary by provider and region.

Can I provision GPU resources via API?

Yes. Both GPU VMs and mk8s clusters can be provisioned via the emma API or the GUI. The same API works across all providers — no provider-specific SDKs required.
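As an illustration of what a single cross-provider call might carry, here is a hypothetical request body for creating a GPU VM. Every field name is an assumption made for the sketch, not emma's documented schema:

```json
{
  "name": "train-h100-01",
  "provider": "gcp",
  "gpu": { "type": "H100", "count": 8 },
  "image": "nvidia-driver-optimized-ubuntu",
  "tags": { "team": "ml-research", "project": "llm-finetune" }
}
```

In a provider-agnostic API, moving the same workload to another cloud would mean changing only the `provider` value.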

Do I need to install NVIDIA drivers?

No. GPU VMs launch with pre-validated, NVIDIA driver-optimized OS images. mk8s clusters use pre-validated CUDA container images. Your team gets a working GPU environment from first boot — no driver debugging.

How does cost attribution work?

GPU spend is attributed per team, project, and provider from the moment of provisioning. One dashboard for all GPU compute — VMs and mk8s — regardless of which cloud the workload runs on.
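Conceptually, unified attribution reduces to aggregating consistently tagged usage records, whatever cloud they came from. A toy sketch with invented records and numbers:

```python
from collections import defaultdict

# Invented usage records; in practice these come from each provider's billing feed.
records = [
    {"team": "ml-research", "project": "llm-finetune", "provider": "aws",    "cost": 412.50},
    {"team": "ml-research", "project": "llm-finetune", "provider": "nebius", "cost": 198.00},
    {"team": "platform",    "project": "inference",    "provider": "gcp",    "cost": 87.25},
]

def attribute(records: list[dict], key: str) -> dict[str, float]:
    """Sum GPU spend along one attribution dimension: team, project, or provider."""
    totals: dict[str, float] = defaultdict(float)
    for r in records:
        totals[r[key]] += r["cost"]
    return dict(totals)

print(attribute(records, "team"))
# {'ml-research': 610.5, 'platform': 87.25}
```

Because every record carries the same tag keys regardless of provider, the same aggregation answers "per team", "per project", or "per cloud" questions from one dataset.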

Is GPU autoscaling available?

GPU node autoscaling for mk8s is not included in the current release. You configure node count at cluster creation. Autoscaling is on the roadmap.

Can I connect GPU workloads across providers?

Yes. GPU VMs and mk8s clusters connect to workloads across providers through emma's 400 Gbps networking backbone. Training on one cloud, inference on another — connected architecturally. Learn more →

What monitoring is available for GPU resources?

GPU VMs: 3 metrics (utilization, vRAM usage, vRAM utilization). mk8s clusters: 9 metrics including power, temperature, and clock speed. No agents required. Learn more →

See GPU provisioning across providers from a single control plane.

45-minute walkthrough. VMs and managed K8s. Infrastructure, running.

Get a demo