Decisions

Project-level calls that separate serving architecture from kernel optimization evidence.

Choose a project decision page to view its status dashboard, grouped decision matrix, related experiments, and the evidence still needed.

Serving infrastructure: Architecture Decisions (9 decisions)

GPU Inference Decision Lab

Architecture calls derived from EKS/vLLM serving measurements, with each decision tied back to the experiment evidence that supports, rejects, or bounds it.

Domains: Admission + readiness, Long-context scheduling, Cost + autoscaling, Quantization + hardware.

Status: 4 supported, 2 partial, 2 rejected, 1 blocked.

Kernel optimization: Kernel Optimization Decisions (9 decisions)

CUDA Kernel Lab

Kernel optimization calls derived from CUDA/Triton benchmark and profiler evidence across A10G and H200, with caveats kept beside the experiment that produced them.

Domains: Fusion wins, Memory/reduction boundaries, Matmul/Tensor Core gaps, Decode replay caveats.

Status: 2 supported, 1 measured, 5 caveated, 1 rejected.