llm-d

llm-d/llm-d-inference-sim

A lightweight vLLM simulator for mocking out replicas.

43 stars, 27 forks, 43 watchers, Go, Apache License 2.0

incubating

3 open issues need help. Last updated: Sep 12, 2025

Print real merged configuration to log on simulator initialization (2 months ago)

good first issue

Add retries to connect to ZMQ (2 months ago)

good first issue

Support --max-model-len config parameter (3 months ago)

AI Summary: Implement a new command-line parameter, `--max-model-len`, in the vLLM simulator. The parameter defines the model's maximum context window size in tokens. A request that exceeds this limit should return a 400 Bad Request error with a specific message indicating that the context length was exceeded.

Complexity: 4/5

enhancement, good first issue
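
The check described in the summary above could look roughly like the following Go sketch. It is not the simulator's actual code. The helper names `countTokens` and `checkContextLength`, the whitespace-based token count, the choice to add the requested completion tokens to the prompt length, and the error wording are all assumptions made for illustration. Only the `--max-model-len` flag name and the 400 status come from the issue.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"strings"
)

// countTokens is a hypothetical stand-in for a real tokenizer:
// it simply counts whitespace-separated words.
func countTokens(prompt string) int {
	return len(strings.Fields(prompt))
}

// checkContextLength returns the HTTP status and an error message for a request.
// Assumption: the limit applies to prompt tokens plus requested completion tokens.
// A maxModelLen of zero or less is treated as "no limit" in this sketch.
func checkContextLength(prompt string, maxTokens, maxModelLen int) (int, string) {
	requested := countTokens(prompt) + maxTokens
	if maxModelLen > 0 && requested > maxModelLen {
		// Illustrative wording only; the issue asks for a specific message
		// but the listing does not quote it.
		return http.StatusBadRequest, fmt.Sprintf(
			"maximum context length is %d tokens, but the request needs %d tokens",
			maxModelLen, requested)
	}
	return http.StatusOK, ""
}

func main() {
	// The default of 8 is arbitrary and chosen only to keep the demo small.
	maxModelLen := flag.Int("max-model-len", 8, "maximum context window size in tokens")
	flag.Parse()

	// Five prompt words plus five requested completion tokens.
	status, msg := checkContextLength("one two three four five", 5, *maxModelLen)
	fmt.Println(status, msg)
}
```

Keeping the check in a pure function that returns a status and a message makes it easy to unit test separately from whatever HTTP handler would call it.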
