Pull requests: sgl-project/sglang
Pull requests list
Fix tool_choice stall on multi-call prompts (#17998)
#19848
opened Mar 4, 2026 by
AjAnubolu
[AMD] Use bfloat16 for correction_bias in AITER FP8 path to avoid runtime dtype conversion for dsv3
deepseek
run-ci
#19843
opened Mar 4, 2026 by
inkcherry
5 tasks
[NPU][Feature] eliminate dsv3 redundant rotary embed calculation
#19842
opened Mar 4, 2026 by
liupeng374
5 tasks
[AMD][Bugfix] Fix get_global_server_args NameError on ROCm in fused_moe
#19840
opened Mar 4, 2026 by
michaelzhang-ai
Draft
5 tasks done
[diffusion] runtime: introduce generation service facade
diffusion
SGLang Diffusion
#19839
opened Mar 4, 2026 by
richl9
5 tasks
[diffusion] docs: add diffusion runtime architecture boundary guide
documentation
Improvements or additions to documentation
#19838
opened Mar 4, 2026 by
richl9
5 tasks
[OPT]Skip the first delayer to maximize the BS of the decoding.
run-ci
#19836
opened Mar 4, 2026 by
chenxu214
5 tasks
fix cuda graph capturing error in sm120 mxfp8 triton path
#19835
opened Mar 4, 2026 by
wolfcomos
5 tasks
[AMD] CI - Add MI35x nightly tests for kv-cache-fp8 and allreduce-fusion (DeepSeek)
amd
deepseek
#19834
opened Mar 4, 2026 by
yctseng0211
Draft
5 tasks
feature: support X-Data-Parallel-Rank header to specific dp-rank.
#19832
opened Mar 4, 2026 by
doujiang24
5 tasks
Add support for InstantTensor
documentation
Improvements or additions to documentation
#19830
opened Mar 4, 2026 by
arlo-aisys
4 of 5 tasks
[NPU][Bug fix] context parallel bug fix
deepseek
npu
#19820
opened Mar 4, 2026 by
liupeng374
5 tasks
Fix torch.compile/dynamo crash with Qwen3 QK-norm in piecewise CUDA g…
#19818
opened Mar 4, 2026 by
jasperjiaguo
5 tasks
[diffusion][WIP] support realtime krea diffusion
diffusion
SGLang Diffusion
#19817
opened Mar 4, 2026 by
IPostYellow
Draft
5 tasks
[FIX] Disable fused shared experts on pre-Hopper GPUs
deepseek
#19816
opened Mar 4, 2026 by
xueliangyang-oeuler
5 tasks
[AMD] Add bf16 MoE weights padding
quant
LLM Quantization
#19814
opened Mar 4, 2026 by
Emmanuel0612
5 tasks
[BUGFIX]Releasing redundant memory of w13_weight and nz when the FUSEEP feature is enabled
run-ci
#19813
opened Mar 4, 2026 by
chenxu214
5 tasks
Fix Qwen3.5/Qwen3Next MTP EPLB compatibility
#19812
opened Mar 4, 2026 by
AjAnubolu
5 tasks
[PD] Add graceful RDMA shutdown for Mooncake/NIXL disaggregation connectors
#19810
opened Mar 4, 2026 by
Kangyan-Zhou
Draft
4 tasks
Add priority scheduling for PD disagg with a header support for x-sglang-request-priority.
model-gateway
run-ci
#19808
opened Mar 4, 2026 by
harrisonlimh
1 of 5 tasks
Fix issue 19717 by making q0_indptr uniform strided instead of packed
run-ci
#19807
opened Mar 4, 2026 by
kpham-sgl
5 tasks done