Supported GPUs

WAVE targets four GPU vendor families through dedicated compiler backends. This page lists the GPUs that have been verified, those expected to work based on architecture compatibility, and the emulator fallback for development without a GPU.

The GPUs listed below have passed the full WAVE spec-verification test suite, confirming correct behavior for all 11 primitive categories.

| Vendor | GPU | Architecture | Backend | Status |
| --- | --- | --- | --- | --- |
| Apple | M4 Pro | Apple GPU (Metal 3) | wave-metal | Verified |
| NVIDIA | T4 | Turing (SM 7.5) | wave-ptx | Verified |
| AMD | MI300X | CDNA 3 | wave-hip | Verified |

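The GPUs listed below have not yet been tested with the full spec-verification suite. Those marked Expected share an architecture family with verified hardware and are expected to work. Those marked Pending have a complete backend but are still waiting on hardware access for verification.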

| Vendor | GPU | Architecture | Backend | Status |
| --- | --- | --- | --- | --- |
| Apple | M1 | Apple GPU (Metal 3) | wave-metal | Expected |
| Apple | M1 Pro / Max / Ultra | Apple GPU (Metal 3) | wave-metal | Expected |
| Apple | M2 | Apple GPU (Metal 3) | wave-metal | Expected |
| Apple | M2 Pro / Max / Ultra | Apple GPU (Metal 3) | wave-metal | Expected |
| Apple | M3 | Apple GPU (Metal 3) | wave-metal | Expected |
| Apple | M3 Pro / Max / Ultra | Apple GPU (Metal 3) | wave-metal | Expected |
| Apple | M4 | Apple GPU (Metal 3) | wave-metal | Expected |
| Apple | M4 Max / Ultra | Apple GPU (Metal 3) | wave-metal | Expected |
| NVIDIA | RTX 2060-2080 Ti | Turing (SM 7.5) | wave-ptx | Expected |
| NVIDIA | RTX 3060-3090 Ti | Ampere (SM 8.6) | wave-ptx | Expected |
| NVIDIA | A100 | Ampere (SM 8.0) | wave-ptx | Expected |
| NVIDIA | RTX 4060-4090 | Ada Lovelace (SM 8.9) | wave-ptx | Expected |
| NVIDIA | H100 | Hopper (SM 9.0) | wave-ptx | Expected |
| AMD | RX 7900 XTX | RDNA 3 | wave-hip | Expected |
| AMD | RX 7600 | RDNA 3 | wave-hip | Expected |
| AMD | MI250X | CDNA 2 | wave-hip | Expected |
| Intel | Arc A770 | Xe HPG (Alchemist) | wave-sycl | Pending |
| Intel | Data Center GPU Max (Ponte Vecchio) | Xe HPC | wave-sycl | Pending |

Status definitions:

  • Verified - all 11 primitive categories pass the spec-verification test suite on this hardware.
  • Expected - same architecture family as a verified GPU; not yet tested.
  • Pending - backend implementation is complete but hardware access for verification is not yet available.

When no supported GPU is detected, WAVE automatically falls back to wave-emu, a CPU-based instruction-level emulator. The emulator executes the same .wbin binary that would run on a GPU, so a kernel is expected to produce the same results in emulation as it does on hardware.

The emulator is useful for:

  • Development - write and debug kernels on a laptop without a discrete GPU.
  • CI/CD - run the full test suite in cloud environments that lack GPU instances.
  • Correctness testing - compare emulator output against hardware output to detect backend translation bugs.
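The correctness-testing use case can be sketched as a small comparison helper. This is a minimal illustration, not part of any WAVE SDK: the function name `first_mismatch` is hypothetical, the two lists stand in for kernel results that have already been copied back to the host, and the tolerances are an assumption (set both to 0 to require exact equality).

```python
import math


def first_mismatch(emulated, hardware, rel_tol=1e-6, abs_tol=1e-9):
    """Return the index of the first differing element, or None if all match.

    Hypothetical helper: `emulated` and `hardware` are plain sequences of
    floats holding the output of the same kernel on wave-emu and on a GPU.
    """
    if len(emulated) != len(hardware):
        raise ValueError("outputs have different lengths")
    for i, (e, h) in enumerate(zip(emulated, hardware)):
        if not math.isclose(e, h, rel_tol=rel_tol, abs_tol=abs_tol):
            return i
    return None


# Placeholder data standing in for real kernel output.
emu_out = [1.0, 2.0, 3.0]
gpu_out = [1.0, 2.0, 3.5]
print(first_mismatch(emu_out, gpu_out))  # prints 2
```

A mismatch index gives a starting point for narrowing down which instruction or primitive the backend translated differently.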

To force emulator mode even when a GPU is available:

# Environment variable (works with all SDKs)
WAVE_BACKEND=emulator python my_kernel.py

# Python SDK - explicit backend selection
import wave_gpu
device = wave_gpu.device(backend="emulator")

// Rust SDK - explicit backend selection
let dev = wave_sdk::device::with_backend(wave_sdk::Backend::Emulator);

The emulator runs single-threaded by default. Set WAVE_EMU_THREADS to enable multi-threaded emulation for larger workloads:

WAVE_EMU_THREADS=8 WAVE_BACKEND=emulator python my_kernel.py
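In a CI job it can be convenient to set both variables from a launcher script rather than on the command line. The sketch below relies only on the two environment variables named above. The helper name `run_with_emulator` is hypothetical, and the child process here just echoes the variables back so that the example runs without WAVE installed.

```python
import os
import subprocess
import sys


def run_with_emulator(argv, threads=None):
    """Run a command with WAVE_BACKEND=emulator set in its environment.

    Hypothetical helper, not part of any WAVE SDK. `threads`, if given,
    is passed through as WAVE_EMU_THREADS.
    """
    env = dict(os.environ, WAVE_BACKEND="emulator")
    if threads is not None:
        env["WAVE_EMU_THREADS"] = str(threads)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# Stand-in for `python my_kernel.py`: the child prints the variables it sees.
child_code = (
    "import os; "
    "print(os.environ['WAVE_BACKEND'], os.environ.get('WAVE_EMU_THREADS'))"
)
result = run_with_emulator([sys.executable, "-c", child_code], threads=8)
print(result.stdout.strip())  # prints: emulator 8
```

Replacing the stand-in command with `[sys.executable, "my_kernel.py"]` would give the same effect as the shell one-liner.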

Every SDK provides a detection function that reports the selected backend:

# CLI
wave-emu --detect

# Python
python -c "import wave_gpu; print(wave_gpu.detect_backend())"

// Rust (in a binary)
wave_sdk::device::detect().map(|d| println!("{:?}", d.backend()));

Possible output values: metal, ptx, hip, sycl, emulator.
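A script that consumes the detection result might map each value back to the vendor families in the tables on this page, for example to make a GPU CI job fail loudly when it has silently fallen back to the emulator. The following is a sketch under assumptions: the detection value is taken to arrive as one of the plain strings listed above, and `describe_backend` is a hypothetical helper, not an SDK function.

```python
# Backend value -> vendor family, taken from the tables on this page.
# The emulator has no GPU vendor, so it maps to None.
BACKEND_VENDORS = {
    "metal": "Apple",
    "ptx": "NVIDIA",
    "hip": "AMD",
    "sycl": "Intel",
    "emulator": None,
}


def describe_backend(name):
    """Return a human-readable description of a detected backend value."""
    if name not in BACKEND_VENDORS:
        raise ValueError(f"unexpected backend value: {name!r}")
    vendor = BACKEND_VENDORS[name]
    if vendor is None:
        return "CPU emulator (wave-emu), no GPU in use"
    return f"{vendor} GPU via wave-{name}"


print(describe_backend("ptx"))       # prints: NVIDIA GPU via wave-ptx
print(describe_backend("emulator"))  # prints: CPU emulator (wave-emu), no GPU in use
```

Rejecting unknown values explicitly means a future backend name is reported as an error instead of being silently treated as one of the existing ones.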

Next: Introduction to the ISA - learn how WAVE instructions are encoded and executed.