CVE-2026-5760 -- CVSS 9.8 Vulnerability Briefing
CVE-2026-5760 | CVSS 9.8 (Critical) | Exploit: No known exploit
What Is It
CVE-2026-5760 is a Remote Code Execution vulnerability in SGLang's reranking API endpoint (/v1/rerank), triggered through unsafe rendering of Jinja2 chat templates embedded in malicious model tokenizer configuration files.
Technical Detail
The flaw exists because SGLang renders Jinja2 chat templates from tokenizer.chat_template fields without adequate sandboxing or input sanitization, allowing arbitrary template expressions to execute in the server's runtime context. An attacker who can cause the SGLang server to load a crafted model file containing a malicious tokenizer.chat_template value can achieve full Remote Code Execution on the host system when the /v1/rerank endpoint processes a request. The attack surface includes any deployment where model files are sourced from untrusted repositories or where external parties can influence which model is loaded, a common scenario in shared inference infrastructure and model-serving pipelines.
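The reason unsandboxed expression evaluation amounts to code execution can be illustrated with a simplified model of attribute resolution. The sketch below is plain Python, not SGLang or Jinja2 code; the names unguarded_lookup, guarded_lookup, and UnsafeAttributeError are hypothetical and exist only for illustration. It shows that a resolver which follows any attribute name can walk from an ordinary string to the root of Python's class hierarchy, while a resolver that rejects underscore-prefixed names stops at the first step.

```python
class UnsafeAttributeError(Exception):
    """Raised when a lookup touches an underscore-prefixed attribute."""


def unguarded_lookup(obj, dotted_path):
    """Resolve a dotted attribute path with no restrictions (illustrative only)."""
    for name in dotted_path.split("."):
        obj = getattr(obj, name)
    return obj


def guarded_lookup(obj, dotted_path):
    """Resolve a dotted path, refusing any underscore-prefixed attribute."""
    for name in dotted_path.split("."):
        if name.startswith("_"):
            raise UnsafeAttributeError(f"access to {name!r} is blocked")
        obj = getattr(obj, name)
    return obj


# A harmless string is enough to reach the root of the class hierarchy.
root = unguarded_lookup("hello", "__class__.__base__")
print(root is object)  # True

# The guarded resolver stops at the first underscore-prefixed step.
try:
    guarded_lookup("hello", "__class__.__base__")
except UnsafeAttributeError as exc:
    print(f"blocked: {exc}")
```

Jinja2 ships jinja2.sandbox.SandboxedEnvironment and ImmutableSandboxedEnvironment, which apply restrictions of this general kind when rendering untrusted templates. Whether and how a given SGLang release uses them is not stated in this briefing and should be checked against the vendor's advisory.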
Exploitation Status
No exploit code has been publicly observed as of April 27, 2026, and the vulnerability is not currently listed in the CISA Known Exploited Vulnerabilities catalog. Exploit maturity is accordingly assessed as "no known exploit". However, the underlying primitive, unsandboxed Jinja2 template injection leading to RCE, belongs to a well-understood and historically exploited class of vulnerability, which lowers the barrier to independent exploit development.
Who Is Targeting This
No specific threat actor has been attributed to this CVE at this time, and no campaigns or targeted sectors have been confirmed in association with it. Given the nature of the vulnerability and the growing use of SGLang in AI/ML inference deployments, organizations operating public-facing or multi-tenant model-serving infrastructure should treat this as an elevated-risk exposure even in the absence of confirmed targeting.
What To Do
Apply any available SGLang patch that addresses this vulnerability immediately, given the critical CVSS score of 9.8. If no patch is available yet, restrict access to the /v1/rerank endpoint to trusted internal networks and enforce strict controls over which model files and repositories the SGLang server is permitted to load. Audit current deployments for any externally sourced or community-contributed model files that include tokenizer.chat_template fields, and treat those files as untrusted until reviewed. Detection should focus on unexpected child processes spawned by the SGLang server process and on unexpected outbound network connections from the inference host. Organizations whose automated model pipelines pull from public hubs such as Hugging Face should implement pre-load scanning or template validation before models are served in production.
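A pre-load scan of the kind recommended above could look roughly like the following sketch. It assumes the template lives under a chat_template key in a tokenizer_config.json file, either as a single string or as a list of entries with a template field. The function names and the pattern list are hypothetical and deliberately small. A denylist like this is a triage heuristic that will miss obfuscated payloads, so it complements patching and sandboxed rendering rather than replacing them.

```python
import json
import re
from pathlib import Path

# Illustrative patterns only; tune for your environment. Ordinary chat
# templates rarely need dunder attributes or the attr() filter.
SUSPICIOUS_PATTERNS = {
    "dunder attribute": re.compile(r"__\w+__"),
    "attr filter": re.compile(r"\|\s*attr\s*\("),
    "execution helper": re.compile(r"\b(?:popen|subprocess|eval|exec)\b"),
}


def extract_templates(config):
    """Return every chat template string found in a tokenizer config dict."""
    raw = config.get("chat_template")
    if isinstance(raw, str):
        return [raw]
    if isinstance(raw, list):
        return [
            entry["template"]
            for entry in raw
            if isinstance(entry, dict) and isinstance(entry.get("template"), str)
        ]
    return []


def scan_template(template):
    """Return the labels of all suspicious patterns present in one template."""
    return [
        label
        for label, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(template)
    ]


def scan_tokenizer_config(path):
    """Load a tokenizer config file and collect findings across its templates."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    findings = []
    for template in extract_templates(config):
        findings.extend(scan_template(template))
    return findings
```

A pipeline step might call scan_tokenizer_config on each downloaded model directory and quarantine the model for manual review whenever the returned list is non-empty.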