Conversation

@prernanookala-ai

No description provided.

tensor_parallel_size: "{{ .Values.tensor_parallel_size }}"
pipeline_parallel_size: "{{ .Values.pipeline_parallel_size }}"

"Qwen/Qwen2.5-VL-7B-Instruct":
Collaborator


Thanks for adding support for this VL model. For better performance on Xeon, include these additional environment variables and extra command arguments. Also, the tensor parallel size is calculated dynamically based on the system configuration of the hosts where the models are deployed.

configMapValues:
  VLLM_CPU_KVCACHE_SPACE: "40"
  VLLM_RPC_TIMEOUT: "100000"
  VLLM_ALLOW_LONG_MAX_MODEL_LEN: "1"
  VLLM_ENGINE_ITERATION_TIMEOUT_S: "120"
  VLLM_CPU_NUM_OF_RESERVED_CPU: "0"
  VLLM_CPU_SGL_KERNEL: "1"
  HF_HUB_DISABLE_XET: "1"
extraCmdArgs:
  [
    "--block-size", "128",
    "--dtype", "bfloat16",
    "--distributed_executor_backend", "mp",
    "--enable_chunked_prefill",
    "--enforce-eager",
    "--max-model-len", "33024",
    "--max-num-batched-tokens", "2048",
    "--max-num-seqs", "256",
  ]
tensor_parallel_size: "{{ .Values.tensor_parallel_size }}"
pipeline_parallel_size: "{{ .Values.pipeline_parallel_size }}"
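
For reference, a minimal sketch of how this block might sit under the model entry in xeon-values.yaml. The top-level layout (a map keyed by model ID) is an assumption inferred from the diff context above, so the exact nesting in the chart may differ.

"Qwen/Qwen2.5-VL-7B-Instruct":
  configMapValues:
    # vLLM CPU-backend tuning; KV-cache space is specified in GiB.
    VLLM_CPU_KVCACHE_SPACE: "40"
    VLLM_RPC_TIMEOUT: "100000"
    VLLM_ALLOW_LONG_MAX_MODEL_LEN: "1"
    VLLM_ENGINE_ITERATION_TIMEOUT_S: "120"
    VLLM_CPU_NUM_OF_RESERVED_CPU: "0"
    VLLM_CPU_SGL_KERNEL: "1"
    HF_HUB_DISABLE_XET: "1"
  extraCmdArgs:
    # Passed through to the vLLM server command line.
    [
      "--block-size", "128",
      "--dtype", "bfloat16",
      "--distributed_executor_backend", "mp",
      "--enable_chunked_prefill",
      "--enforce-eager",
      "--max-model-len", "33024",
      "--max-num-batched-tokens", "2048",
      "--max-num-seqs", "256",
    ]
  # Parallelism sizes stay templated so the chart can fill them in
  # from the detected system configuration at deploy time.
  tensor_parallel_size: "{{ .Values.tensor_parallel_size }}"
  pipeline_parallel_size: "{{ .Values.pipeline_parallel_size }}"

Keeping tensor_parallel_size and pipeline_parallel_size templated matches the note above that tensor parallelism is computed from the host configuration rather than hard-coded per model.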

Author


Thanks for the suggestions!
I’ve updated xeon-values.yaml to include the additional configMap values and extra command arguments as suggested.
Please let me know if anything else needs adjustment.

@prernanookala-ai
Author

Updated the PR per the review comments. Ready for another look.

prernanookala-ai added 2 commits January 30, 2026 18:10
Signed-off-by: prernanookala-ai <prerna.nookala@cld2labs.com>
Signed-off-by: prernanookala-ai <prerna.nookala@cld2labs.com>
@prernanookala-ai force-pushed the feature/prerna-model-config branch from af44a36 to 17173c5 on January 31, 2026 at 00:10