Architecture Reference

The infrastructure layer below your models and above your cloud accounts.

emma sits between your raw cloud accounts and the frameworks your teams use. It provides GPU compute, observability, networking, and governed deployment as an integrated stack — without replacing anything your ML teams already run.

4 operational layers · 5 GPU providers · 400 Gbps private backbone
What emma operates. What emma doesn’t replace.

Your ML frameworks on top. Your cloud accounts underneath. emma governs everything in between.

YOUR MODELS & FRAMEWORKS: PyTorch · TensorFlow · Kubeflow · MLflow · Hugging Face · Argo · Your code

API / SDK

THE EMMA PLATFORM
  • Layer 4, Inference workflows: Governed templates · Self-service · Audit trail
  • Layer 3, GPU monitoring: VM + cluster · Utilization, memory, power, temp · Zero agents
  • Layer 2, Cross-Cloud Networking: 400 Gbps backbone · On-demand vNets · 70% egress reduction
  • Layer 1, GPU compute (VMs + Managed Kubernetes): 5 providers · Pre-validated images
  • Governance: RBAC · Tagging · Cost attribution · Audit trail

YOUR CLOUD ACCOUNTS: AWS · GCP · Azure · emma · Nebius
Each layer solves a different problem. Together, they eliminate the integration tax.

Layer 1: GPU compute

The substrate. GPU VMs across 5 providers, managed K8s across 3 hyperscalers. Two levels of abstraction — full control (VMs) or fully managed (mk8s) — with the same governance model.

Explore GPU compute →
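As a rough illustration of what self-service provisioning through the API could look like, here is a minimal Python sketch. The base URL, endpoint path, payload fields, and response shape are assumptions for illustration, not emma's documented interface.

```python
import requests

# Illustrative only: base URL, endpoint, fields, and response shape are assumptions.
EMMA_API = "https://api.emma.example/v1"       # placeholder base URL
HEADERS = {"Authorization": "Bearer <token>"}  # assumed auth scheme

# Request a GPU VM on one of the supported providers, with governance tags
# attached at creation time so cost attribution works from the start.
payload = {
    "provider": "nebius",                      # AWS, GCP, Azure, emma, or Nebius
    "region": "eu-north1",                     # assumed region identifier
    "gpu": "H100",                             # assumed GPU SKU label
    "image": "nvidia-driver-optimized",        # pre-validated image
    "tags": {"team": "ml-platform", "project": "llm-finetune"},
}

resp = requests.post(f"{EMMA_API}/compute/gpu-vms", json=payload,
                     headers=HEADERS, timeout=30)
resp.raise_for_status()
vm = resp.json()
print(vm.get("id"), vm.get("status"))          # assumed response fields
```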

Layer 2: Cross-Cloud Networking

emma's 400 Gbps private backbone connects GPU workloads across providers. On-demand virtual networks. Private IPs. No VPC peering. Up to 70% egress reduction.

Explore Networking →
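To make "on-demand virtual networks" concrete, a hedged sketch of requesting a vNet that joins a data source and a GPU workload over the backbone. Endpoint path, field names, and member identifiers are assumptions, not emma's documented API.

```python
import requests

# Illustrative only: endpoint, payload shape, and member identifiers are assumptions.
EMMA_API = "https://api.emma.example/v1"
HEADERS = {"Authorization": "Bearer <token>"}

# Ask for a private network spanning two providers; workloads then communicate
# over private IPs on the backbone instead of public egress paths.
vnet_request = {
    "name": "train-to-data",
    "cidr": "10.42.0.0/24",                                        # assumed addressing input
    "members": [
        {"provider": "aws",    "resource": "s3-data-endpoint"},    # data side (assumed id)
        {"provider": "nebius", "resource": "gpu-vm-training-01"},  # compute side (assumed id)
    ],
    "tags": {"team": "ml-platform"},
}

resp = requests.post(f"{EMMA_API}/networking/vnets", json=vnet_request,
                     headers=HEADERS, timeout=30)
resp.raise_for_status()
print(resp.json().get("status"))                                   # assumed response field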

Layer 3: GPU monitoring

GPU metrics at VM level and mk8s cluster level — utilization, memory, power, temperature, clock. No agents. No exporters. Metrics appear automatically.

Explore Monitoring →
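Because metrics appear without agents or exporters, consuming them can be pictured as a read-only query. The endpoint, query parameters, metric names, and response layout below are assumptions for illustration.

```python
import requests

# Illustrative only: endpoint, parameters, and metric names are assumptions.
EMMA_API = "https://api.emma.example/v1"
HEADERS = {"Authorization": "Bearer <token>"}

params = {
    "metrics": "gpu_utilization,gpu_memory_used,power_draw,temperature",
    "window": "15m",                           # assumed time-window parameter
}

resp = requests.get(f"{EMMA_API}/monitoring/gpu-vms/vm-123/metrics",
                    params=params, headers=HEADERS, timeout=30)
resp.raise_for_status()

for point in resp.json().get("series", []):    # assumed response layout
    print(point["metric"], point["latest"])
```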

Layer 4: Inference workflows

Governed templates for deploying inference on GPU VMs. Platform teams define standards. Application teams self-serve. Every deployment versioned and auditable.

Explore Workflows →
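One way to picture the split: the platform team publishes a version-pinned template, and an application team instantiates it with a handful of parameters. The template name, endpoint, parameters, and artifact URI below are assumptions, not emma's documented schema.

```python
import requests

# Illustrative only: template name, endpoint, and parameters are assumptions.
EMMA_API = "https://api.emma.example/v1"
HEADERS = {"Authorization": "Bearer <token>"}

deployment = {
    "template": "inference-gpu-vm@1.4.0",       # version-pinned, platform-approved
    "parameters": {
        "provider": "azure",
        "gpu": "A100",
        "model_uri": "https://models.example/churn-v3",  # placeholder artifact URI
        "replicas": 2,
    },
    "tags": {"team": "recommendations", "env": "prod"},
}

resp = requests.post(f"{EMMA_API}/workflows/deployments", json=deployment,
                     headers=HEADERS, timeout=30)
resp.raise_for_status()
print(resp.json().get("endpoint"))              # assumed: the live inference URL
```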

emma is additive. Your ML teams keep their tooling.

Your teams keep using

  • PyTorch, TensorFlow, JAX
  • Kubeflow, MLflow, Weights & Biases
  • Hugging Face, Argo Workflows
  • Helm charts, operators, CI/CD pipelines
  • Standard Kubernetes API
  • Your existing model registries and feature stores

emma handles

  • Cross-cloud GPU provisioning (VMs + K8s)
  • GPU observability at VM and cluster level
  • Cross-cloud networking backbone
  • Governed inference deployment templates
  • RBAC, tagging, cost attribution, audit trails
No vendor lock-in. No new dependency. No data path.

Five GPU VM providers. Three hyperscalers for mk8s. One governance model.

GPU VMs

  • AWS, GCP, Azure, emma, Nebius
  • Full VM lifecycle: create, delete, start, stop
  • Pre-validated NVIDIA driver-optimized images

GPU mk8s

  • AWS (EKS), Azure (AKS), GCP (GKE)
  • Pre-validated CUDA container images
  • Centralized cost dashboards per project/team

Most tools solve one layer. emma operates across all four.

The AI infrastructure market is full of point solutions. Each solves one problem well. None solve the integration problem — and that's the problem that blocks your platform team.

Workflow orchestration

Can model a pipeline. Cannot provision a GPU VM. No control plane, no networking, no governed deployment.

AI application layer

Operates above infrastructure. Different budget line. Doesn't address provisioning, networking, or GPU observability.

GPU compute specialists

Raw capacity — often one provider only. No cross-cloud networking. No governance model.

emma doesn't compete with any of these. It operates at a different layer — the infrastructure substrate that all of them need but none of them provide.

A typical multi-cloud AI architecture on emma. (A hedged end-to-end sketch follows the steps.)

  1. Data lives on AWS S3
     Your training dataset is on AWS. It stays there. No data migration required.

  2. GPU training runs on Nebius
     Best GPU pricing for your training job. emma provisions the GPU VM with pre-validated images. The backbone connects it to your data on AWS — private, low-latency.

  3. Monitoring tracks the training run
     GPU utilization, vRAM, temperature — visible in emma while the job runs. No agents. No separate dashboard.

  4. Model artifact moves to Azure
     Your serving infrastructure is on Azure. The trained model moves through the backbone — governed, observable, low egress cost.

  5. Inference deployed via workflow template
     An approved template provisions a GPU VM on Azure, installs the inference server, loads the model. The endpoint is live — governed, monitored, cost-attributed.
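The same five steps, compressed into one hedged Python sketch. Every endpoint, field, and identifier is an assumption used to show the sequence, not emma's documented API.

```python
import requests

# Illustrative end-to-end sequence; all endpoints, fields, and IDs are assumptions.
EMMA_API = "https://api.emma.example/v1"
HEADERS = {"Authorization": "Bearer <token>"}

def post(path, body):
    r = requests.post(f"{EMMA_API}{path}", json=body, headers=HEADERS, timeout=30)
    r.raise_for_status()
    return r.json()

# 1. Data stays on AWS S3: no migration step needed.

# 2. Provision a training GPU VM on Nebius and a vNet back to the data on AWS.
vm = post("/compute/gpu-vms", {"provider": "nebius", "gpu": "H100",
                               "image": "nvidia-driver-optimized",
                               "tags": {"project": "llm-finetune"}})
post("/networking/vnets", {"name": "train-to-data",
                           "members": [{"provider": "aws", "resource": "s3-data"},
                                       {"provider": "nebius", "resource": vm["id"]}]})

# 3. Metrics are already flowing; poll them while the job runs.
metrics = requests.get(f"{EMMA_API}/monitoring/gpu-vms/{vm['id']}/metrics",
                       headers=HEADERS, timeout=30).json()
print(len(metrics.get("series", [])), "metric series")   # assumed response layout

# 4. Move the trained artifact to Azure over the backbone (assumed transfer API).
post("/networking/transfers", {"source": "nebius", "destination": "azure",
                               "artifact": "models/llm-finetune/latest"})

# 5. Deploy inference on Azure from an approved, versioned template.
deployment = post("/workflows/deployments", {"template": "inference-gpu-vm@1.4.0",
                                             "parameters": {"provider": "azure",
                                                            "gpu": "A100"}})
print(deployment.get("endpoint"))                          # assumed: live inference URL
```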

Architecture FAQ
Does emma replace my ML frameworks?

No. emma operates at the infrastructure layer — below your frameworks. PyTorch, TensorFlow, Kubeflow, MLflow, Hugging Face, and your custom code all run on emma's governed GPU infrastructure.

Does emma sit in my data path?

No. emma manages infrastructure provisioning, networking, monitoring, and deployment. Your data flows between GPU workloads through emma's backbone, but emma doesn't process, store, or inspect your data.

Does using emma create vendor lock-in?

No. emma provisions standard cloud resources — EC2 instances, EKS clusters, Azure VMs. If you stop using emma, your infrastructure continues running on the underlying providers. No proprietary abstractions.

Can I use emma for non-GPU workloads?

Yes. emma is a cloud operations platform for distributed infrastructure — not just GPU. CPU VMs, managed Kubernetes, networking, monitoring, and governance apply to all workload types.

How does governance work across the four layers?

One governance model spans all four layers. RBAC, tagging, cost attribution, and audit trails are applied consistently — whether the resource is a GPU VM, a K8s cluster, a network connection, or an inference deployment.
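As a toy illustration of "one governance model", the same tag set and role scoping can travel with every resource request, whatever the layer. The field names and values below are assumptions, not emma's schema.

```python
# Illustrative only: a single governance block reused across resource types.
governance = {
    "tags": {"team": "ml-platform", "cost_center": "CC-1042", "env": "prod"},
    "rbac": {"role": "gpu-operator", "scope": "project:llm-finetune"},
}

# The same block attaches to a GPU VM, a managed K8s cluster, and an inference
# deployment, so cost attribution and audit trails line up across all four layers.
gpu_vm_request       = {"provider": "nebius", "gpu": "H100", **governance}
k8s_cluster_request  = {"provider": "aws", "service": "eks", **governance}
inference_deployment = {"template": "inference-gpu-vm@1.4.0", **governance}

print(gpu_vm_request["tags"] == inference_deployment["tags"])  # True: shared governance
```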

Which cloud providers does emma support?

emma governs 15+ cloud providers. For GPU: VMs on AWS, GCP, Azure, emma, Nebius. Managed K8s on AWS (EKS), Azure (AKS), GCP (GKE). Networking backbone connects all hyperscalers.

Is emma an IaC tool?

No. emma is a cloud operations platform — it manages the full lifecycle of infrastructure through a unified interface and API. You don't write HCL or YAML to use emma.

See the architecture running.

45-minute demo. GPU provisioning, networking, monitoring, and governed inference — live.

Book a demo