OpenQiwen.net

Research Record

Undergraduate Researcher, HPC Forge @ UC Irvine

Qiwen Xiao / UC Irvine / Undergraduate Researcher

Published: 2026

Keywords: HPC, Triton, LLM Inference, GPU Kernels

TL;DR

Studied fused W4A16 GEMM (INT4 weight-only quantization with 16-bit activations) in Triton for LLM inference, implementing and benchmarking kernels for both the decode and prefill regimes.
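For readers unfamiliar with the term, the arithmetic behind a W4A16 matmul can be sketched in plain Python. This is only an illustrative reference under stated assumptions, not the kernel from this project: the packing order (low nibble first), the per-group scale and zero-point layout, and all function names here are hypothetical. It also materializes the dequantized weights before multiplying, whereas a fused kernel would dequantize inside the GEMM loop.

```python
# Illustrative, non-fused W4A16 reference in pure Python.
# Assumptions (not taken from the project): unsigned 4-bit codes, two codes
# per byte with the low nibble first, and one scale / zero-point per group
# of `group_size` weights along the K (reduction) dimension.
# Activations are plain Python floats standing in for 16-bit values.

def pack_int4(codes):
    """Pack an even-length list of 4-bit codes (0..15) into bytes."""
    assert len(codes) % 2 == 0 and all(0 <= c <= 15 for c in codes)
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))

def unpack_int4(packed):
    """Inverse of pack_int4: recover the list of 4-bit codes."""
    out = []
    for b in packed:
        out.append(b & 0xF)  # low nibble holds the first code
        out.append(b >> 4)   # high nibble holds the second code
    return out

def dequantize_column(packed, scales, zeros, group_size):
    """Map codes to real weights: w = (code - zero) * scale, per group."""
    codes = unpack_int4(packed)
    return [(c - zeros[k // group_size]) * scales[k // group_size]
            for k, c in enumerate(codes)]

def w4a16_matmul(x, packed_cols, scales, zeros, group_size):
    """x is M x K; packed_cols holds N packed weight columns of K codes each."""
    cols = [dequantize_column(p, s, z, group_size)
            for p, s, z in zip(packed_cols, scales, zeros)]
    return [[sum(a * w for a, w in zip(row, col)) for col in cols] for row in x]

# Tiny example: K = 4, N = 2, group_size = 2.
packed_cols = [pack_int4([8, 9, 7, 8]), pack_int4([0, 15, 8, 12])]
scales = [[0.5, 0.25], [1.0, 0.5]]
zeros = [[8, 8], [8, 8]]
result = w4a16_matmul([[1.0, 2.0, 3.0, 4.0]], packed_cols, scales, zeros, 2)
```

With these inputs, `result` is `[[0.25, 14.0]]`. Each weight column of four codes occupies two bytes, which is where the memory-bandwidth saving over 16-bit weights comes from.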

Submission Content

Details coming soon.

Advised by Prof. Aparna Chandramowlishwaran under EECS 199.

This page should eventually include:

  • problem setting
  • kernel design
  • benchmarking setup
  • main results
  • report link
  • repository link