Pull requests: ROCm/vllm

Pull requests list

Fixing P3L incompatibility with cython.
#200 opened Sep 20, 2024 by Alexei-V-Ivanov-AMD
Update run-amd-test.sh
#192 opened Sep 17, 2024 by Alexei-V-Ivanov-AMD
multi-gpu fused_moe tuning support
#143 opened Aug 16, 2024 by divakar-amd
[DO NOT MERGE] Vinayak/moe final hashem
#127 opened Aug 11, 2024 by carlushuang
Add max-batch-size to benchmark_throughput.py
#122 opened Aug 7, 2024 by dllehr-amd
Add truncate to all files after json dump
#117 opened Aug 2, 2024 by jpvillam-amd
[Misc] Use main triton branch
#115 opened Aug 1, 2024 by binarman
Adding SHM broadcast to ROCm/vllm
#113 opened Jul 31, 2024 by Lzy17
optimizations for process output step
#104 opened Jul 25, 2024 by sanyalington
Update QueueLLM
#97 opened Jul 22, 2024 by gyulaz-htec
Add benchmark_latency_batched.py
#96 opened Jul 22, 2024 by dllehr-amd
New LLM for MLPerf Server scenario serving
#94 opened Jul 19, 2024 by gyulaz-htec
Add VLLM_SCHED_PREFILL_KVC_FREEPCT
#89 opened Jul 18, 2024 by sanyalington
Torchrun api server
#71 opened Jun 27, 2024 by gshtras
Use tgemm for mi300 only
#48 opened Jun 13, 2024 by ppalaniappan-amd
Update on naive_attn module
#21 opened May 28, 2024 by seungrokj