Running BFCL tests against vLLM

Clone the gorilla repo and install BFCL dependencies:

git clone https://github.com/ShishirPatil/gorilla.git
cd gorilla/berkeley-function-call-leaderboard
python -m venv venv
source venv/bin/activate
pip install -e .
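
To confirm the install worked, the bfcl CLI entry point (used by the generate step below) should now be available inside the venv:

bfcl --help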

Run Qwen3-0.6B in vLLM:

Adjust args as necessary - this is how I do it locally on my Mac with a CPU-only build of vLLM from latest main as of Sept 3rd, 2025.

vllm serve Qwen/Qwen3-0.6B \
  --max-model-len 8192 \
  --max-num-batched-tokens 8192 \
  --reasoning-parser deepseek_r1 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
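
Before pointing BFCL at the server, it's worth a quick smoke test that the OpenAI-compatible endpoint is up and that tool calls are being parsed. A minimal sketch using curl — the get_weather tool here is just an illustrative stub, not anything from BFCL:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-0.6B",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'

If tool calling is wired up correctly, the response message should contain a tool_calls entry rather than plain text.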

Edit BFCL's model_config.py as needed - see bfcl.diff below for an example. You need a MODEL_CONFIG_MAPPING entry for each model you want to test, with model_handler set to OpenAICompletionsHandler and is_fc_model=True. Note that the example diff appends a second MODEL_CONFIG_MAPPING assignment at the end of the file, which replaces the upstream mapping entirely, so only the models listed in it remain runnable.


Run the live_simple subset of the BFCL tests first, just to confirm everything works against your vLLM server and deployed model. Make sure OPENAI_BASE_URL points at your vLLM server's OpenAI-compatible endpoint:

OPENAI_BASE_URL="http://localhost:8000/v1" \
OPENAI_API_KEY="fake" \
bfcl generate --model Qwen/Qwen3-0.6B --test-category live_simple
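
Generation only produces model responses; scoring is a separate step. Once the run finishes, evaluate the results with the matching subcommand (same model and category flags as generate):

bfcl evaluate --model Qwen/Qwen3-0.6B --test-category live_simple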
bfcl.diff:

diff --git a/berkeley-function-call-leaderboard/bfcl_eval/constants/model_config.py b/berkeley-function-call-leaderboard/bfcl_eval/constants/model_config.py
index 2466966..fb77b41 100644
--- a/berkeley-function-call-leaderboard/bfcl_eval/constants/model_config.py
+++ b/berkeley-function-call-leaderboard/bfcl_eval/constants/model_config.py
@@ -2065,3 +2065,19 @@ MODEL_CONFIG_MAPPING = {
 
 # Uncomment to get the supported_models.py file contents
 # print(repr(list(MODEL_CONFIG_MAPPING.keys())))
+
+
+MODEL_CONFIG_MAPPING = {
+    "Qwen/Qwen3-0.6B": ModelConfig(
+        model_name="Qwen/Qwen3-0.6B",
+        display_name="Qwen3-0.6B (FC) (vLLM)",
+        url="https://huggingface.co/Qwen/Qwen3-0.6B",
+        org="Qwen",
+        license="apache-2.0",
+        model_handler=OpenAICompletionsHandler,
+        input_price=None,
+        output_price=None,
+        is_fc_model=True,
+        underscore_to_dot=False,
+    ),
+}
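
As a quick check that the config edit took, the model should show up when listing the models the CLI knows about. Recent BFCL versions have a models subcommand for this; failing that, the commented-out print in model_config.py (visible in the diff above) dumps the same mapping keys:

bfcl models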