Research Record

Undergraduate Researcher, HPC Forge @ UC Irvine

Qiwen Xiao / UC Irvine / Undergraduate Researcher

Published: 2026

Keywords: HPC, Triton, LLM Inference, GPU Kernels

TL;DR

Studied fused W4A16 GEMM kernels (INT4 weight-only quantization with 16-bit activations) written in Triton for LLM inference, with implementations and benchmarks covering both the decode and prefill regimes.
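
For readers unfamiliar with the term, the following is a minimal pure-Python reference sketch of what a W4A16 weight-only GEMM computes. It is not the Triton code from this project. The function names, the low-nibble-first packing order, the per-group scale-and-minimum quantization scheme, and the group size are all assumptions chosen for illustration. Python floats also stand in for the 16-bit activations.

```python
# Illustrative reference only: all names and layout choices are hypothetical.

def pack_int4(codes):
    """Pack 4-bit codes (0..15) two per byte, low nibble first (assumed order)."""
    assert len(codes) % 2 == 0
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def unpack_int4(packed):
    """Inverse of pack_int4: recover the list of 4-bit codes."""
    codes = []
    for b in packed:
        codes.append(b & 0xF)  # low nibble
        codes.append(b >> 4)   # high nibble
    return codes


def quantize_column(col, group_size):
    """Quantize one weight column (length K) in groups along K.

    Returns (packed_bytes, scales, mins) such that w ~= code * scale + min.
    """
    assert len(col) % group_size == 0
    codes, scales, mins = [], [], []
    for start in range(0, len(col), group_size):
        group = col[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / 15 or 1.0  # 15 steps for 4 bits; guard constant groups
        codes += [min(15, round((w - lo) / scale)) for w in group]
        scales.append(scale)
        mins.append(lo)
    return pack_int4(codes), scales, mins


def w4a16_gemm(x, qweight, group_size):
    """Compute y = x @ W, dequantizing W on the fly inside the accumulation.

    x is a list of M activation rows of length K; qweight is a list of N
    quantized columns as returned by quantize_column. Dequantizing inside the
    loop, rather than materializing W first, is what "fused" refers to here.
    """
    y = []
    for row in x:
        out_row = []
        for packed, scales, mins in qweight:
            acc = 0.0
            for k, (a, q) in enumerate(zip(row, unpack_int4(packed))):
                g = k // group_size
                acc += a * (q * scales[g] + mins[g])
            out_row.append(acc)
        y.append(out_row)
    return y


if __name__ == "__main__":
    cols = [[0.1, -0.2, 0.3, 0.4, 0.0, 0.25, -0.4, 0.15],
            [1.0, 0.5, -0.5, 0.0, 0.2, -0.1, 0.3, 0.7]]
    qweight = [quantize_column(c, group_size=4) for c in cols]
    x = [[1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]]  # one row: decode-like
    print(w4a16_gemm(x, qweight, group_size=4))
```

In this picture, decode roughly corresponds to very few rows in `x` (often one per sequence), where loading the packed weights tends to dominate, while prefill corresponds to many rows, where the multiply-accumulate work tends to dominate. That difference is the usual reason the two regimes are implemented and benchmarked separately.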

Figure

Submission Content

Details coming soon.

Advised by Prof. Aparna Chandramowlishwaran under EECS 199.

This page should eventually include: