Qwen3.6-27B LiteLLM / Claude Code 配置基準
Backend: vLLM @ http://<VLLM_HOST>:18000/v1
Model: qwen3.6-27b-fp8
Alias 總覽
Section titled “Alias 總覽”| Alias | 用途 | thinking | temp | top_p | max_out | max_in | timeout |
|---|---|---|---|---|---|---|---|
qwen3.6-27b-code-act | Claude Code / OpenCode / Codex 改檔 | off | 0.5 | 0.9 | 16384 | 245760 | 600 |
qwen3.6-27b-code-think | 規劃、debug、架構分析 | on | 0.6 | 0.95 | 24576 | 237568 | 600 |
qwen3.6-27b-stable | OpenClaw、通用分析 | on | 0.6 | 0.95 | 16384 | 245760 | 600 |
qwen3.6-27b-strict | JSON / tool / 風控規則 | off | 0.5 | 0.9 | 8192 | 245760 | 600 |
qwen3.6-27b-fast | 摘要、翻譯、快速問答 | off | 0.7 | 0.8 | 4096 | 245760 | 600 |
qwen3.6-27b-report | 長篇研究、報告 | on | 1.0 | 0.95 | 32768 | 229376 | 900 |
統一價格: input $0.32 / 1M,output $3.20 / 1M
Backend context cap: 262144(所有 alias 滿足 max_in + max_out ≤ 262144)
LiteLLM 配置範例
Section titled “LiteLLM 配置範例”{ "model": "vllm/qwen3.6-27b-fp8", "api_base": "http://<VLLM_HOST>:18000/v1", "custom_llm_provider": "hosted_vllm", "max_tokens": 16384, "max_input_tokens": 245760, "temperature": 0.5, "top_p": 0.9, "timeout": 600, "extra_body": { "top_k": 20, "chat_template_kwargs": { "enable_thinking": false } }}{ "model": "vllm/qwen3.6-27b-fp8", "api_base": "http://<VLLM_HOST>:18000/v1", "custom_llm_provider": "hosted_vllm", "max_tokens": 24576, "max_input_tokens": 237568, "temperature": 0.6, "top_p": 0.95, "timeout": 600, "extra_body": { "chat_template_kwargs": { "enable_thinking": true } }}{ "model": "vllm/qwen3.6-27b-fp8", "api_base": "http://<VLLM_HOST>:18000/v1", "custom_llm_provider": "hosted_vllm", "max_tokens": 32768, "max_input_tokens": 229376, "temperature": 1.0, "top_p": 0.95, "timeout": 900, "stream_timeout": 180, "extra_body": { "chat_template_kwargs": { "enable_thinking": true } }}完整 6 個 alias 的 JSON 見 repo 內 configs/litellm/。
Claude Code .zshrc
Section titled “Claude Code .zshrc”claude-qwen3.6-27b() { unset ANTHROPIC_API_KEY OPENAI_BASE_URL OPENAI_API_KEY unset AWS_BEARER_TOKEN_BEDROCK CLAUDE_CODE_USE_BEDROCK CLAUDE_CODE_USE_VERTEX
export ANTHROPIC_BASE_URL="${LITELLM_BASE_URL}" export ANTHROPIC_API_KEY="${LITELLM_API_KEY}" export ANTHROPIC_AUTH_TOKEN="${LITELLM_API_KEY}"
export ANTHROPIC_MODEL="qwen3.6-27b-code-act" export ANTHROPIC_SMALL_FAST_MODEL="qwen3.6-27b-fast" export ANTHROPIC_DEFAULT_OPUS_MODEL="qwen3.6-27b-code-think" export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3.6-27b-code-act" export ANTHROPIC_DEFAULT_HAIKU_MODEL="qwen3.6-27b-fast"
export API_TIMEOUT_MS=900000 export CLAUDE_CODE_MAX_OUTPUT_TOKENS=16384 export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
claude "$@"}
alias cct='ANTHROPIC_MODEL=qwen3.6-27b-code-think CLAUDE_CODE_MAX_OUTPUT_TOKENS=24576 claude'alias ccs='ANTHROPIC_MODEL=qwen3.6-27b-stable CLAUDE_CODE_MAX_OUTPUT_TOKENS=16384 claude'alias ccf='ANTHROPIC_MODEL=qwen3.6-27b-fast CLAUDE_CODE_MAX_OUTPUT_TOKENS=4096 claude'部署 Checklist
Section titled “部署 Checklist”-
LiteLLM UI 建立 6 個 alias,參數對齊上表
-
確認
LITELLM_BASE_URL/LITELLM_API_KEY已設於 shell env -
.zshrc加入 function 與三個 alias -
反向代理
proxy_read_timeout ≥ 900s(支援report) -
測試
claude-qwen3.6-27b能連線、cct/ccs/ccf切換有效 -
vLLM 服務確認
--max-model-len 262144已啟用