Skills
Python
End-to-end ML pipelines, diffusion model training, dataset tooling, and distributed systems integration.
C++
High-performance graphics systems, custom data structures, memory management, and real-time rendering engines.
CUDA
Custom kernels, GPU memory optimization, stream compaction, spatial data structures, and performance tuning.
PyTorch
Custom training loops, multi-GPU distributed training, model fine-tuning, and architecture experimentation.
WebGPU
Forward+/clustered rendering, compute pipelines, G-buffer systems, and browser-based GPU acceleration.
ONNX Runtime
Cross-platform deployment, WebGPU and CUDA execution providers, external data handling, and inference optimization.
WGSL
Compute shaders, custom rendering pipelines, parallel algorithms, and GPU-side data processing.
JavaScript
High-performance WebGPU applications, asynchronous systems, visualization tooling, and interactive GPU workflows.