alonsosilvaallende/understanding_function_calling.ipynb

## understanding_function_calling.ipynb
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/gist/alonsosilvaallende/78b48345cc6c84f9ab2ca9625239db24/understanding_function_calling.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "33dace9d-dca0-41a1-9813-aa75f2ce79f3",
      "metadata": {
        "id": "33dace9d-dca0-41a1-9813-aa75f2ce79f3"
      },
      "source": [
        "Language models take **text** as input and predict which **text** should come next. Taking that into consideration, what does _function calling_ even mean?\n",
        "\n",
        "In this post, I start by providing a basic example to motivate function calling, then I give a slightly more complex example by allowing a small language model to use Python. After that, I explain the *conversational response as a tool* trick. Finally, I explain how to do function calling in WebAssembly (optional but fun if you want to try function calling in the browser)."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "import torch\n",
        "\n",
        "# Auto select device (CUDA > MPS > CPU)\n",
        "if torch.cuda.is_available():\n",
        "    device = torch.device(\"cuda\")\n",
        "elif hasattr(torch.backends, \"mps\") and torch.backends.mps.is_available():\n",
        "    device = torch.device(\"mps\")\n",
        "else:\n",
        "    device = torch.device(\"cpu\")\n",
        "assert device == torch.device(\"cuda\"), \"In Runtime, Change runtime type to GPU\""
      ],
      "metadata": {
        "id": "2a3Uz7KeH9N0"
      },
      "id": "2a3Uz7KeH9N0",
      "execution_count": 1,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "id": "3bd313ac-564a-40c4-8e2c-1ca5ad875f54",
      "metadata": {
        "id": "3bd313ac-564a-40c4-8e2c-1ca5ad875f54"
      },
      "source": [
        "## Basic example"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "dc903bbf-25e0-45c1-9b78-e7018b55876f",
      "metadata": {
        "id": "dc903bbf-25e0-45c1-9b78-e7018b55876f"
      },
      "source": [
        "Let's start with a basic example. Imagine that we ask a language model to perform the multiplication of `1234567` times `8765432` whose result is:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "8cd894aa-e1db-488c-affd-60f683161a9c",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:02.564763Z",
          "iopub.status.busy": "2025-07-16T13:33:02.564231Z",
          "iopub.status.idle": "2025-07-16T13:33:02.599138Z",
          "shell.execute_reply": "2025-07-16T13:33:02.598220Z",
          "shell.execute_reply.started": "2025-07-16T13:33:02.564702Z"
        },
        "id": "8cd894aa-e1db-488c-affd-60f683161a9c",
        "outputId": "55666062-6a47-49a6-9add-34cd2d6b7abb",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "10,821,513,087,944\n"
          ]
        }
      ],
      "source": [
        "print(f\"{1234567*8765432:,}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "6631518c-5028-4f91-9dbb-7311fcfde2c1",
      "metadata": {
        "id": "6631518c-5028-4f91-9dbb-7311fcfde2c1"
      },
      "source": [
        "This is the answer of the language model:"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer\n",
        "from threading import Thread\n",
        "\n",
        "model_id = \"Qwen/Qwen3-0.6B\"\n",
        "tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
        "model = AutoModelForCausalLM.from_pretrained(model_id).to(device)\n",
        "streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)"
      ],
      "metadata": {
        "id": "DcHESC_1JGat"
      },
      "id": "DcHESC_1JGat",
      "execution_count": 3,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "id": "163b0eb2-9aa8-4ce4-adcb-bf601a91d91c",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:02.601188Z",
          "iopub.status.busy": "2025-07-16T13:33:02.600275Z",
          "iopub.status.idle": "2025-07-16T13:33:34.972004Z",
          "shell.execute_reply": "2025-07-16T13:33:34.971257Z",
          "shell.execute_reply.started": "2025-07-16T13:33:02.601141Z"
        },
        "id": "163b0eb2-9aa8-4ce4-adcb-bf601a91d91c",
        "outputId": "d82097f6-37cb-4238-d333-0d662f61f7a3",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "To find the product of **1234567 × 8765432**, we can use a calculator or a multiplication table. However, since this is a large number, it's best to use a calculator or a computational tool.\n",
            "\n",
            "### Final Answer:\n",
            "$$\n",
            "1234567 \\times 8765432 = \\boxed{1099999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999"
          ]
        }
      ],
      "source": [
        "def generate_response(messages, tools=[], enable_thinking=False):\n",
        "\n",
        "    prompt = tokenizer.apply_chat_template(\n",
        "        messages, tools=tools, tokenize=False, add_generation_prompt=True, enable_thinking=enable_thinking\n",
        "    )\n",
        "\n",
        "    model_inputs = tokenizer(prompt, return_tensors=\"pt\").to(device)\n",
        "\n",
        "    generation_kwargs = dict(\n",
        "        model_inputs,\n",
        "        streamer=streamer,\n",
        "        max_new_tokens=4 * 1024,\n",
        "        do_sample=False,\n",
        "        temperature=1.0,\n",
        "        top_p=1.0,\n",
        "        top_k=50,\n",
        "    )\n",
        "\n",
        "    thread = Thread(target=model.generate, kwargs=generation_kwargs)\n",
        "    thread.start()\n",
        "\n",
        "    assistant_response = \"\"\n",
        "    for chunk in streamer:\n",
        "        if tokenizer.eos_token in chunk or tokenizer.pad_token in chunk:\n",
        "              chunk = chunk.split(tokenizer.eos_token)[0]\n",
        "              chunk = chunk.split(tokenizer.pad_token)[0]\n",
        "        assistant_response += chunk\n",
        "        print(chunk, end=\"\")\n",
        "\n",
        "    thread.join()\n",
        "\n",
        "    return assistant_response\n",
        "\n",
        "user_input = \"What's 1234567 times 8765432?\"\n",
        "messages = [{\"role\": \"user\", \"content\": user_input}]\n",
        "assistant_response = generate_response(messages)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "053f009d-56bc-48db-a4ee-50eb7c919c88",
      "metadata": {
        "id": "053f009d-56bc-48db-a4ee-50eb7c919c88"
      },
      "source": [
        "This model is not particularly good but even GPT-4o-mini fails without using function calling:"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "f3ec4275-3d51-40f0-a0c9-eecda0537cc5",
      "metadata": {
        "id": "f3ec4275-3d51-40f0-a0c9-eecda0537cc5"
      },
      "source": [
        "![](https://github.com/alonsosilvaallende/alonsosilvaallende.github.io/blob/main/blog/posts/2025-07-05-Understanding-Function-Calling/ChatGPT-multiplication.jpg?raw=1)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "66716533-8301-46d0-8af0-c4c9dbba1b1b",
      "metadata": {
        "id": "66716533-8301-46d0-8af0-c4c9dbba1b1b"
      },
      "source": [
        "Here is the [link to that conversation](https://chatgpt.com/s/t_686901e99a648191815170ed3c83083b)."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "801908ff-32f9-4d69-962e-53230d824dd4",
      "metadata": {
        "id": "801908ff-32f9-4d69-962e-53230d824dd4"
      },
      "source": [
        "An analogy of function calling is that to ask a language model directly to perform a computation is similar to ask someone to make that computation in his head. It's hard! However, if we gave him a calculator, he could solve it quite easily. He needs to know only two things:\n",
        "\n",
        "1. which operation (or function) to use\n",
        "2. which numbers (or arguments) to use"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "8ab52489-ab4b-4d90-8615-bdab768d5a9e",
      "metadata": {
        "id": "8ab52489-ab4b-4d90-8615-bdab768d5a9e"
      },
      "source": [
        "The same is true for doing function calling with a language model. It only needs to know which function and which arguments to use."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "818723e6-3807-46a8-bf81-13e11734745b",
      "metadata": {
        "id": "818723e6-3807-46a8-bf81-13e11734745b"
      },
      "source": [
        "Let's provide a description of a function to the language model so it knows what's the function name and which are the arguments the function expects.\n",
        "\n",
        "You can provide the description manually (after all it's just **text**) but I will use Pydantic to do that. Here is the description of the `multiply` function"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "id": "2f348e9c-c6c7-405a-b01d-a329e438d652",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:34.973446Z",
          "iopub.status.busy": "2025-07-16T13:33:34.973211Z",
          "iopub.status.idle": "2025-07-16T13:33:35.524063Z",
          "shell.execute_reply": "2025-07-16T13:33:35.523335Z",
          "shell.execute_reply.started": "2025-07-16T13:33:34.973431Z"
        },
        "id": "2f348e9c-c6c7-405a-b01d-a329e438d652",
        "outputId": "e6e106af-89c9-413d-d0e6-5dd4329331e5",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "{\n",
            "    \"type\": \"function\",\n",
            "    \"function\": {\n",
            "        \"name\": \"multiply\",\n",
            "        \"strict\": true,\n",
            "        \"parameters\": {\n",
            "            \"description\": \"Multiply two integers together.\",\n",
            "            \"properties\": {\n",
            "                \"a\": {\n",
            "                    \"description\": \"First integer\",\n",
            "                    \"title\": \"A\",\n",
            "                    \"type\": \"integer\"\n",
            "                },\n",
            "                \"b\": {\n",
            "                    \"description\": \"Second integer\",\n",
            "                    \"title\": \"B\",\n",
            "                    \"type\": \"integer\"\n",
            "                }\n",
            "            },\n",
            "            \"required\": [\n",
            "                \"a\",\n",
            "                \"b\"\n",
            "            ],\n",
            "            \"title\": \"multiply\",\n",
            "            \"type\": \"object\",\n",
            "            \"additionalProperties\": false\n",
            "        },\n",
            "        \"description\": \"Multiply two integers together.\"\n",
            "    }\n",
            "}\n"
          ]
        }
      ],
      "source": [
        "from pydantic import BaseModel, Field\n",
        "import json\n",
        "from openai import pydantic_function_tool\n",
        "\n",
        "class multiply(BaseModel):\n",
        "    \"\"\"Multiply two integers together.\"\"\"\n",
        "\n",
        "    a: int = Field(..., description=\"First integer\")\n",
        "    b: int = Field(..., description=\"Second integer\")\n",
        "\n",
        "tool = pydantic_function_tool(multiply)\n",
        "print(json.dumps(tool, indent=4))"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "0985bb23-1063-4102-841e-3335a6c2c873",
      "metadata": {
        "id": "0985bb23-1063-4102-841e-3335a6c2c873"
      },
      "source": [
        "After we have provided a function/tool description, we can see if the model knows what to do with it:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "id": "8aaee000-e81d-44fa-a6d2-3cf4f71f3e39",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:35.525387Z",
          "iopub.status.busy": "2025-07-16T13:33:35.524942Z",
          "iopub.status.idle": "2025-07-16T13:33:36.446751Z",
          "shell.execute_reply": "2025-07-16T13:33:36.446065Z",
          "shell.execute_reply.started": "2025-07-16T13:33:35.525344Z"
        },
        "id": "8aaee000-e81d-44fa-a6d2-3cf4f71f3e39",
        "outputId": "b5b2e31f-d79c-46bc-b948-15a91f69cfdd",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "<tool_call>\n",
            "{\"name\": \"multiply\", \"arguments\": {\"a\": 1234567, \"b\": 8765432}}\n",
            "</tool_call>"
          ]
        }
      ],
      "source": [
        "assistant_response = generate_response(messages, tools=[tool])"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "5ab4a8c1-2aea-4c65-a5c1-3aa24d0a61f7",
      "metadata": {
        "id": "5ab4a8c1-2aea-4c65-a5c1-3aa24d0a61f7"
      },
      "source": [
        "Even this small model is able to get correctly:\n",
        "\n",
        "- the function name (`multiply`)\n",
        "- the arguments to provide to the function (`1234567` and `8765432`)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "33cda6b3-d889-4dfd-aa81-2a3ae44e2246",
      "metadata": {
        "id": "33cda6b3-d889-4dfd-aa81-2a3ae44e2246"
      },
      "source": [
        "Notice that we have not yet implemented the function itself (!), which we might or might not do (as we will see in other examples next)."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "262d82a5-ea58-4fea-bc8c-60ad6427c9b8",
      "metadata": {
        "id": "262d82a5-ea58-4fea-bc8c-60ad6427c9b8"
      },
      "source": [
        "We can get the assistant response and simply do the multiplication ourselves. This is important, what we call function calling in reality is the model telling us which function to call and which arguments to provide to that function. It is up to us to run or not that function. In this case, we will run it:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "id": "3c138f21-a482-4d7d-9aa5-045587ff299a",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:36.447373Z",
          "iopub.status.busy": "2025-07-16T13:33:36.447224Z",
          "iopub.status.idle": "2025-07-16T13:33:36.452174Z",
          "shell.execute_reply": "2025-07-16T13:33:36.451560Z",
          "shell.execute_reply.started": "2025-07-16T13:33:36.447360Z"
        },
        "id": "3c138f21-a482-4d7d-9aa5-045587ff299a",
        "outputId": "dfec312f-78df-4781-ab62-5f1b166a6328",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "10,821,513,087,944\n"
          ]
        }
      ],
      "source": [
        "assistant_response_clean = assistant_response.split(\"<tool_call>\")[-1].split(\"</tool_call>\")[0]\n",
        "assistant_response_json = json.loads(assistant_response_clean)\n",
        "\n",
        "def execute_function_call(assistant_response_json):\n",
        "    if assistant_response_json['name'] == 'multiply':\n",
        "        tool_response = assistant_response_json['arguments']['a']*assistant_response_json['arguments']['b']\n",
        "    else:\n",
        "        tool_response = assistant_response_clean\n",
        "    return tool_response\n",
        "\n",
        "tool_response = execute_function_call(assistant_response_json)\n",
        "print(f\"{tool_response:,}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "783cf8a7-3491-472a-ab43-7dd1576add51",
      "metadata": {
        "id": "783cf8a7-3491-472a-ab43-7dd1576add51"
      },
      "source": [
        "This is great. We already got the correct answer! However, we usually want to have a more \"human\" response. To do that we can append two messages (a message for the function call and another for the tool response:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "id": "4689cbf2-80cb-4f5d-a03a-a547bbb18929",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:36.453133Z",
          "iopub.status.busy": "2025-07-16T13:33:36.452841Z",
          "iopub.status.idle": "2025-07-16T13:33:36.491568Z",
          "shell.execute_reply": "2025-07-16T13:33:36.490366Z",
          "shell.execute_reply.started": "2025-07-16T13:33:36.453112Z"
        },
        "id": "4689cbf2-80cb-4f5d-a03a-a547bbb18929"
      },
      "outputs": [],
      "source": [
        "messages.append({\n",
        "    \"role\": \"assistant\",\n",
        "    \"content\": \"\",\n",
        "    \"function_call\": None,\n",
        "    \"tool_calls\": [{\n",
        "        \"name\": assistant_response_json[\"name\"],\n",
        "        \"arguments\": assistant_response_json[\"arguments\"],\n",
        "    }],\n",
        "})\n",
        "\n",
        "messages.append(\n",
        "    {\n",
        "        \"role\": \"tool\",\n",
        "        \"content\": f\"{tool_response}\",\n",
        "    }\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "5bf1f563-cc90-4a10-a04b-713fd395de94",
      "metadata": {
        "id": "5bf1f563-cc90-4a10-a04b-713fd395de94"
      },
      "source": [
        "Since we have full control in this setting, we can actually see what's the **text** the language model is receiving as input:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "id": "9ed40908-724e-43c7-8909-8545dba7e46a",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:36.493814Z",
          "iopub.status.busy": "2025-07-16T13:33:36.492987Z",
          "iopub.status.idle": "2025-07-16T13:33:36.518563Z",
          "shell.execute_reply": "2025-07-16T13:33:36.517550Z",
          "shell.execute_reply.started": "2025-07-16T13:33:36.493776Z"
        },
        "id": "9ed40908-724e-43c7-8909-8545dba7e46a",
        "outputId": "84a5acf2-626e-4bd5-9c64-70bac57290c3",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "<|im_start|>system\n",
            "# Tools\n",
            "\n",
            "You may call one or more functions to assist with the user query.\n",
            "\n",
            "You are provided with function signatures within <tools></tools> XML tags:\n",
            "<tools>\n",
            "{\"type\": \"function\", \"function\": {\"name\": \"multiply\", \"strict\": true, \"parameters\": {\"description\": \"Multiply two integers together.\", \"properties\": {\"a\": {\"description\": \"First integer\", \"title\": \"A\", \"type\": \"integer\"}, \"b\": {\"description\": \"Second integer\", \"title\": \"B\", \"type\": \"integer\"}}, \"required\": [\"a\", \"b\"], \"title\": \"multiply\", \"type\": \"object\", \"additionalProperties\": false}, \"description\": \"Multiply two integers together.\"}}\n",
            "</tools>\n",
            "\n",
            "For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n",
            "<tool_call>\n",
            "{\"name\": <function-name>, \"arguments\": <args-json-object>}\n",
            "</tool_call><|im_end|>\n",
            "<|im_start|>user\n",
            "What's 1234567 times 8765432?<|im_end|>\n",
            "<|im_start|>assistant\n",
            "<tool_call>\n",
            "{\"name\": \"multiply\", \"arguments\": {\"a\": 1234567, \"b\": 8765432}}\n",
            "</tool_call><|im_end|>\n",
            "<|im_start|>user\n",
            "<tool_response>\n",
            "10821513087944\n",
            "</tool_response><|im_end|>\n",
            "<|im_start|>assistant\n",
            "<think>\n",
            "\n",
            "</think>\n",
            "\n",
            "\n"
          ]
        }
      ],
      "source": [
        "prompt = tokenizer.apply_chat_template(\n",
        "    messages,\n",
        "    tools=[tool],\n",
        "    tokenize=False,\n",
        "    add_generation_prompt=True,\n",
        "    enable_thinking=False,\n",
        ")\n",
        "print(prompt)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "4ec4337c-c86b-468d-a0b0-3f54e70566f2",
      "metadata": {
        "id": "4ec4337c-c86b-468d-a0b0-3f54e70566f2"
      },
      "source": [
        "We can then make a call to the model to provide a \"human\" response:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "id": "0a551d55-6fff-42bb-bdff-b66b589b5681",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:36.520134Z",
          "iopub.status.busy": "2025-07-16T13:33:36.519736Z",
          "iopub.status.idle": "2025-07-16T13:33:37.366360Z",
          "shell.execute_reply": "2025-07-16T13:33:37.365415Z",
          "shell.execute_reply.started": "2025-07-16T13:33:36.520098Z"
        },
        "id": "0a551d55-6fff-42bb-bdff-b66b589b5681",
        "outputId": "e27293ae-bfaa-423c-83bc-bdc2aa2dc4c6",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "1234567 × 8765432 = 108215130879441234567 × 8765432 = 10821513087944\n"
          ]
        }
      ],
      "source": [
        "assistant_response = generate_response(messages, tools=[tool])\n",
        "print(assistant_response)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "c6740919-17ce-4da5-9a68-57e34873738f",
      "metadata": {
        "id": "c6740919-17ce-4da5-9a68-57e34873738f"
      },
      "source": [
        "OK, this response is correct and more \"human\". It could be improved but that's because it's a very small model."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "03090f17-e0c4-40df-9e31-ef158f03cd13",
      "metadata": {
        "id": "03090f17-e0c4-40df-9e31-ef158f03cd13"
      },
      "source": [
        "I hope you realize that even a small language model (in this example, with 0.6B parameters) provided with a function/tool can answer better that question than a powerful model such as GPT-4o-mini.\n",
        "\n",
        "That's powerful!"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "d2f27e52-acb9-428c-a87a-f4062819d76f",
      "metadata": {
        "id": "d2f27e52-acb9-428c-a87a-f4062819d76f"
      },
      "source": [
        "## Language Model Using Python"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9e32f01b-c61f-4cd8-803c-918811eed246",
      "metadata": {
        "id": "9e32f01b-c61f-4cd8-803c-918811eed246"
      },
      "source": [
        "In our basic example, we used a very tailored function because I wanted to show that function calling can use several arguments. We can however provided with a Python REPL which will allow us to use a pletora of tools in Python. Here is a description of a Python REPL:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "id": "8b2b26ad-d658-418d-b064-70407da244b7",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:37.367109Z",
          "iopub.status.busy": "2025-07-16T13:33:37.366959Z",
          "iopub.status.idle": "2025-07-16T13:33:37.372427Z",
          "shell.execute_reply": "2025-07-16T13:33:37.371772Z",
          "shell.execute_reply.started": "2025-07-16T13:33:37.367095Z"
        },
        "id": "8b2b26ad-d658-418d-b064-70407da244b7",
        "outputId": "5e04b76c-3261-46b1-d4c5-5fa0cf0fad0a",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "{\n",
            "    \"type\": \"function\",\n",
            "    \"function\": {\n",
            "        \"name\": \"Python_REPL\",\n",
            "        \"strict\": true,\n",
            "        \"parameters\": {\n",
            "            \"description\": \"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\",\n",
            "            \"properties\": {\n",
            "                \"python_code\": {\n",
            "                    \"description\": \"Valid python command.\",\n",
            "                    \"title\": \"Python Code\",\n",
            "                    \"type\": \"string\"\n",
            "                }\n",
            "            },\n",
            "            \"required\": [\n",
            "                \"python_code\"\n",
            "            ],\n",
            "            \"title\": \"Python_REPL\",\n",
            "            \"type\": \"object\",\n",
            "            \"additionalProperties\": false\n",
            "        },\n",
            "        \"description\": \"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\"\n",
            "    }\n",
            "}\n"
          ]
        }
      ],
      "source": [
        "class Python_REPL(BaseModel):\n",
        "    \"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\"\n",
        "\n",
        "    python_code: str = Field(..., description=\"Valid python command.\")\n",
        "\n",
        "tool = pydantic_function_tool(Python_REPL)\n",
        "print(json.dumps(tool, indent=4))"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "53d01d89-5781-47be-b3b5-07d231b1dd7e",
      "metadata": {
        "id": "53d01d89-5781-47be-b3b5-07d231b1dd7e"
      },
      "source": [
        "We can ask our model to, for example, `make a bar plot`. In this case, the language model couldn't figure out what to do:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "id": "2dd18fa2-688c-4349-95a4-512f46974b06",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:37.373025Z",
          "iopub.status.busy": "2025-07-16T13:33:37.372893Z",
          "iopub.status.idle": "2025-07-16T13:33:38.161201Z",
          "shell.execute_reply": "2025-07-16T13:33:38.160494Z",
          "shell.execute_reply.started": "2025-07-16T13:33:37.373013Z"
        },
        "id": "2dd18fa2-688c-4349-95a4-512f46974b06",
        "outputId": "59e8ab6d-e064-40e4-fe5a-400d9d3725d3",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "I cannot make a bar plot directly, but I can help you create one using Python. Could you please provide the data you want to plot?I cannot make a bar plot directly, but I can help you create one using Python. Could you please provide the data you want to plot?\n"
          ]
        }
      ],
      "source": [
        "messages = [\n",
        "    {\"role\": \"user\", \"content\": \"Make a bar plot\"},\n",
        "]\n",
        "assistant_response = generate_response(messages, tools=[tool])\n",
        "print(assistant_response)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9a083efc-a0f1-43b1-be7d-e55050183e48",
      "metadata": {
        "id": "9a083efc-a0f1-43b1-be7d-e55050183e48"
      },
      "source": [
        "The model was unable to call the tool but we can help it by adding it directly to the prompt (notice that this is equivalent to `add_generation_prompt=True` when we apply the chat template). This is probably [what OpenAI does](https://platform.openai.com/docs/guides/function-calling/function-calling-behavior?api-mode=chat#additional-configurations) with the option `tool_choice=required` and we could even impose which tool to call which is probably equivalent to the forced function option that OpenAI provides). Let's add the tool call directly to the prompt:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "id": "d21ebcca-9fc8-4d23-bdbb-7295fcb51628",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:38.161917Z",
          "iopub.status.busy": "2025-07-16T13:33:38.161771Z",
          "iopub.status.idle": "2025-07-16T13:33:39.493370Z",
          "shell.execute_reply": "2025-07-16T13:33:39.492665Z",
          "shell.execute_reply.started": "2025-07-16T13:33:38.161904Z"
        },
        "id": "d21ebcca-9fc8-4d23-bdbb-7295fcb51628",
        "outputId": "7f70f688-6677-4915-974f-f6a7247bead6",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "<tool_call>\n",
            "{\"name\": \"Python_REPL\", \"arguments\": {\"python_code\": \"import matplotlib.pyplot as plt\\nplt.bar([1, 2, 3], [10, 20, 30])\\nplt.show()\"}}\n",
            "</tool_call>"
          ]
        }
      ],
      "source": [
        "def generate_response_with_tool_call_token(messages, tools=[], enable_thinking=False):\n",
        "\n",
        "    prompt = tokenizer.apply_chat_template(\n",
        "        messages, tools=tools, tokenize=False, add_generation_prompt=True, enable_thinking=enable_thinking\n",
        "    )\n",
        "    prompt += \"<tool_call>\" # added directly to the prompt\n",
        "    model_inputs = tokenizer(prompt, return_tensors=\"pt\").to(device)\n",
        "\n",
        "    generation_kwargs = dict(\n",
        "        model_inputs,\n",
        "        streamer=streamer,\n",
        "        max_new_tokens=4 * 1024,\n",
        "        do_sample=False,\n",
        "        temperature=1.0,\n",
        "        top_p=1.0,\n",
        "        top_k=50,\n",
        "    )\n",
        "\n",
        "    thread = Thread(target=model.generate, kwargs=generation_kwargs)\n",
        "    thread.start()\n",
        "\n",
        "    assistant_response = \"<tool_call>\" # it was added to the prompt\n",
        "    print(assistant_response, end=\"\")\n",
        "    for chunk in streamer:\n",
        "        if tokenizer.eos_token in chunk or tokenizer.pad_token in chunk:\n",
        "              chunk = chunk.split(tokenizer.eos_token)[0]\n",
        "              chunk = chunk.split(tokenizer.pad_token)[0]\n",
        "        assistant_response += chunk\n",
        "        print(chunk, end=\"\")\n",
        "\n",
        "    thread.join()\n",
        "\n",
        "    return assistant_response\n",
        "\n",
        "assistant_response = generate_response_with_tool_call_token(messages, tools=[tool])"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "815f6b53-84e2-4e46-b796-e9cefcb4c22d",
      "metadata": {
        "id": "815f6b53-84e2-4e46-b796-e9cefcb4c22d"
      },
      "source": [
        "This small model was able to provide the code to do a bar plot. It is up to us to run or not that code. In this case, we will run it:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "id": "9fcd891b-058c-4b33-b1d0-f57b7550024d",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:39.494963Z",
          "iopub.status.busy": "2025-07-16T13:33:39.494810Z",
          "iopub.status.idle": "2025-07-16T13:33:39.948150Z",
          "shell.execute_reply": "2025-07-16T13:33:39.947508Z",
          "shell.execute_reply.started": "2025-07-16T13:33:39.494950Z"
        },
        "id": "9fcd891b-058c-4b33-b1d0-f57b7550024d",
        "outputId": "cc27ae14-04c6-4aae-9819-c2fda3339cd7",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 430
        }
      },
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "<Figure size 640x480 with 1 Axes>"
            ],
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAiMAAAGdCAYAAADAAnMpAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAG+JJREFUeJzt3X9s3HX9wPFXx2gHsnYWWLtmHQzQ8csNnTALigMmYxLCdEZAo0NRlHTEsSiuCYIVk4IaQc0cJMrmr4mibkSQTRiuC7qhFBbGr4XNISOsRVHaUeRY1s/3D8N9ObaOXXvdu+0ej+ST7O4+97lX3/ns+szdtS3LsiwLAIBERqQeAAA4sIkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABIamTqAd6sp6cnnn/++Rg9enSUlZWlHgcA2AdZlsWOHTuirq4uRowo7rWOQRcjzz//fNTX16ceAwDog23btsX48eOLus+gi5HRo0dHxP++mMrKysTTAAD7oqurK+rr6/Pfx4sx6GLk9bdmKisrxQgADDF9+YiFD7ACAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAIKmiYmTx4sUxefLk/K9qb2hoiHvuuSd/+6uvvhqNjY1x+OGHx2GHHRZz5syJjo6Okg8NAAwfRcXI+PHj44Ybboi2trZ46KGH4uyzz44LL7wwHn/88YiIuOqqq+L3v/993HHHHdHa2hrPP/98fPSjHx2QwQGA4aEsy7KsPweorq6Ob3/72/Gxj30sjjzyyFi2bFl87GMfi4iIp556Kk444YRYt25dvO9979un43V1dUVVVVV0dnb6Q3kAMET05/t3nz8zsmvXrrj99tuju7s7Ghoaoq2tLXbu3BkzZszI73P88cfHhAkTYt26db0eJ5fLRVdXV8EGABw4RhZ7h40bN0ZDQ0O8+uqrcdhhh8Xy5cvjxBNPjA0bNkR5eXmMGTOmYP+amppob2/v9XgtLS3R3Nxc9OAA9M3RC+9OPQKJPXPD+alHKFD0KyOTJk2KDRs2xIMPPhhXXHFFzJ07N5544ok+D9DU1BSdnZ35bdu2bX0+FgAw9BT9ykh5eXkcd9xxERExderU+Nvf/hbf+9734qKLLorXXnstXnrppYJXRzo6OqK2trbX41VUVERFRUXxkwMAw0K/f89IT09P5HK5mDp1ahx88MGxevXq/G2bNm2KZ599NhoaGvr7MADAMFXUKyNNTU0xa9asmDBhQuzYsSOWLVsWa9asiVWrVkVVVVVcdtllsWDBgqiuro7Kysq48soro6GhYZ9/kgYAOPAUFSMvvPBCfPrTn47t27dHVVVVTJ48OVatWhUf+tCHIiLipptuihEjRsScOXMil8vFzJkz44c//OGADA4ADA/9/j0jpeb3jAAMLD9Nw0D8NE2S3zMCAFAKYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkioqRlpaWuLUU0+N0aNHx9ixY2P27NmxadOmgn2mT58eZWVlBdsXv/jFkg4NAAwfRcVIa2trNDY2xvr16+Pee++NnTt3xrnnnhvd3d0F+33+85+P7du357dvfetbJR0aABg+Rhaz88qVKwsuL126NMaOHRttbW1x5pln5q8/9NBDo7a2tjQTAgDDWr8+M9LZ2RkREdXV1QXX/+IXv4gjjjgiTj755GhqaopXXnml12Pkcrno6uoq2ACAA0dRr4y8UU9PT8yfPz/OOOOMOPnkk/PXf+ITn4ijjjoq6urq4tFHH42vfvWrsWnTpvjd7363x+O0tLREc3NzX8cAAIa4sizLsr7c8Yorroh77rknHnjggRg/fnyv+91///1xzjnnxObNm+PYY4/d7fZcLhe5XC5/uaurK+rr66OzszMqKyv7MhoAe3H0wrtTj0Biz9xwfsmP2dXVFVVVVX36/t2nV0bmzZsXd911V6xdu3avIRIRMW3atIiIXmOkoqIiKioq+jIGADAMFBUjWZbFlVdeGcuXL481a9bExIkT3/I+GzZsiIiIcePG9WlAAGB4KypGGhsbY9myZXHnnXfG6NGjo729PSIiqqqq4pBDDoktW7bEsmXL4sMf/nAcfvjh8eijj8ZVV10VZ555ZkyePHlAvgAAYGgrKkYWL14cEf/7xWZvtGTJkrj00kujvLw87rvvvrj55puju7s76uvrY86cOXHNNdeUbGAAYHgp+m2avamvr4/W1tZ+DQQAHFj8bRoAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASKqoGGlpaYlTTz01Ro8eHWPHjo3Zs2fHpk2bCvZ59dVXo7GxMQ4//PA47LDDYs6cOdHR0VHSoQGA4aOoGGltbY3GxsZYv3593HvvvbFz584499xzo7u7O7/PVVddFb///e/jjjvuiNbW1nj++efjox/9aMkHBwCGh5HF7Lxy5cqCy0uXLo2xY8dGW1tbnHnmmdHZ2Rk//vGPY9myZXH22WdHRMSSJUvihBNOiPXr18f73ve+0k0OAAwL/frMSGdnZ0REVFdXR0REW1tb7Ny5M2bMmJHf5/jjj48JEybEunXr+vNQAMAwVdQrI2/U09MT8+fPjzPOOCNOPvnkiIhob2+P8vLyGDNmTMG+NTU10d7evsfj5HK5yOVy+ctdXV19HQkAGIL6HCONjY3x2GOPxQMPPNCvAVpaWqK5ublfx4Ch5OiFd6cegcSeueH81CPAoNKnt2nmzZsXd911V/zpT3+K8ePH56+vra2N1157LV566aWC/Ts6OqK2tnaPx2pqaorOzs78tm3btr6MBAAMUUXFSJZlMW/evFi+fHncf//9MXHixILbp06dGgcffHCsXr06f92mTZvi2WefjYaGhj0es6KiIiorKws2AODAUdTbNI2NjbFs2bK48847Y/To0fnPgVRVVcUhhxwSVVVVcdlll8WCBQuiuro6Kisr48orr4yGhgY/SQMA7FFRMbJ48eKIiJg+fXrB9UuWLIlLL700IiJuuummGDFiRMyZMydyuVzMnDkzfvjDH5ZkWABg+CkqRrIse8t9Ro0aFYsWLYpFixb1eSgA4MDhb9MAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABIqugYWbt2bVxwwQVRV1cXZWVlsWLFioLbL7300igrKyvYzjvvvFLNCwAMM0XHSHd3d0yZMiUWLVrU6z7nnXdebN++Pb/98pe/7NeQAMDwNbLYO8yaNStmzZq1130qKiqitra2z0MBAAeOAfnMyJo1a2Ls2LExadKkuOKKK+LFF1/sdd9cLhddXV0FGwBw4Ch5jJx33nnx05/+NFavXh033nhjtLa2xqxZs2LXrl173L+lpSWqqqryW319falHAgAGsaLfpnkrF198cf7f73rXu2Ly5Mlx7LHHxpo1a+Kcc87Zbf+mpqZYsGBB/nJXV5cgAYADyID/aO8xxxwTRxxxRGzevHmPt1dUVERlZWXBBgAcOAY8Rp577rl48cUXY9y4cQP9UADAEFT02zQvv/xywascW7dujQ0bNkR1dXVUV1dHc3NzzJkzJ2pra2PLli1x9dVXx3HHHRczZ84s6eAAwPBQdIw89NBDcdZZZ+Uvv/55j7lz58bixYvj0UcfjZ/85Cfx0ksvRV1dXZx77rlx/fXXR0VFRemmBgCGjaJjZPr06ZFlWa+3r1q1ql8DAQAHFn+bBgBISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJFx8jatWvjggsuiLq6uigrK4sVK1YU3J5lWVx77bUxbty4OOSQQ2LGjBnx9NNPl2peAGCYKTpGuru7Y8qUKbFo0aI93v6tb30rvv/978ctt9wSDz74YLztbW+LmTNnxquvvtrvYQGA4WdksXeYNWtWzJo1a4+3ZVkWN998c1xzzTVx4YUXRkTET3/606ipqYkVK1bExRdf3L9pAYBhp6SfGdm6dWu0t7fHjBkz8tdVVVXFtGnTYt26dXu8Ty6Xi66uroINADhwFP3KyN60t7dHRERNTU3B9TU1Nfnb3qylpSWam5tLOcZeHb3w7v32WAxOz9xwfuoRAHiD5D9N09TUFJ2dnflt27ZtqUcCAPajksZIbW1tRER0dHQUXN/R0ZG/7c0qKiqisrKyYAMADhwljZGJEydGbW1trF69On9dV1dXPPjgg9HQ0FDKhwIAhomiPzPy8ssvx+bNm/OXt27dGhs2bIjq6uqYMGFCzJ8/P775zW/GO97xjpg4cWJ87Wtfi7q6upg9e3Yp5wYAhomiY+Shhx6Ks846K395wYIFERExd+7cWLp0aVx99dXR3d0dl19+ebz00kvx/ve/P1auXBmjRo0q3dQAwLBRdIxMnz49sizr9faysrL4xje+Ed/4xjf6NRgAcGBI/tM0AMCBTYwAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSKnmMfP3rX4+ysrKC7fjjjy/1wwAAw8TIgTjoSSedFPfdd9//P8jIAXkYAGAYGJBKGDlyZNTW1g7EoQGAYWZAPjPy9NNPR11dXRxzzDHxyU9+Mp599tmBeBgAYBgo+Ssj06ZNi6VLl8akSZNi+/bt0dzcHB/4wAfisccei9GjR++2fy6Xi1wul7/c1dVV6pEAgEGs5DEya9as/L8nT54c06ZNi6OOOip+/etfx2WXXbbb/i0tLdHc3FzqMQCAIWLAf7R3zJgx8c53vjM2b968x9ubmpqis7Mzv23btm2gRwIABpEBj5GXX345tmzZEuPGjdvj7RUVFVFZWVmwAQAHjpLHyJe//OVobW2NZ555Jv7yl7/ERz7ykTjooIPikksuKfVDAQDDQMk/M/Lcc8/FJZdcEi+++GIceeSR8f73vz/Wr18fRx55ZKkfCgAYBkoeI7fffnupDwkADGP+Ng0AkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkxAgAkJQYAQCSEiMAQFJiBABISowAAEmJEQAgKTECACQlRgCApMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUmIEAEhKjAAASYkRACApMQIAJCVGAICkBixGFi1aFEcffXSMGjUqpk2bFn/9618H6qEAgCFsQGLkV7/6VSxYsCCuu+66ePjhh2PKlCkxc+bMeOGFFwbi4QCAIWxAYuS73/1ufP7zn4/PfOYzceKJJ8Ytt9wShx56aNx2220D8XAAwBA2stQHfO2116KtrS2ampry140YMSJmzJgR69at223/XC4XuVwuf7mzszMiIrq6uko9WkRE9OReGZDjMnQM1Lm1r5yDOAdJbSDOwdePmWVZ0fcteYz861//il27dkVNTU3B9TU1NfHUU0/ttn9LS0s0Nzfvdn19fX2pR4OIiKi6OfUEHOicg6Q2kOfgjh07oqqqqqj7lDxGitXU1BQLFizIX+7p6Yl///vfcfjhh0dZWVnBvl1dXVFfXx/btm2LysrK/T3qkGf9+s8a9o/16z9r2D/Wr/96W8Msy2LHjh1RV1dX9DFLHiNHHHFEHHTQQdHR0VFwfUdHR9TW1u62f0VFRVRUVBRcN2bMmL0+RmVlpZOoH6xf/1nD/rF+/WcN+8f69d+e1rDYV0ReV/IPsJaXl8fUqVNj9erV+et6enpi9erV0dDQUOqHAwCGuAF5m2bBggUxd+7ceO973xunnXZa3HzzzdHd3R2f+cxnBuLhAIAhbEBi5KKLLop//vOfce2110Z7e3uccsopsXLlyt0+1FqsioqKuO6663Z7W4d9Y/36zxr2j/XrP2vYP9av/wZiDcuyvvwMDgBAifjbNABAUmIEAEhKjAAASYkRACCpQRcjixYtiqOPPjpGjRoV06ZNi7/+9a+97rt06dIoKysr2EaNGrUfpx1c1q5dGxdccEHU1dVFWVlZrFix4i3vs2bNmnjPe94TFRUVcdxxx8XSpUsHfM7Bqtj1W7NmzW7nX1lZWbS3t++fgQeZlpaWOPXUU2P06NExduzYmD17dmzatOkt73fHHXfE8ccfH6NGjYp3vetd8Yc//GE/TDs49WUNPQ8WWrx4cUyePDn/C7kaGhrinnvu2et9nIP/r9j1K9X5N6hi5Fe/+lUsWLAgrrvuunj44YdjypQpMXPmzHjhhRd6vU9lZWVs3749v/3jH//YjxMPLt3d3TFlypRYtGjRPu2/devWOP/88+Oss86KDRs2xPz58+Nzn/tcrFq1aoAnHZyKXb/Xbdq0qeAcHDt27ABNOLi1trZGY2NjrF+/Pu69997YuXNnnHvuudHd3d3rff7yl7/EJZdcEpdddlk88sgjMXv27Jg9e3Y89thj+3HywaMvaxjhefCNxo8fHzfccEO0tbXFQw89FGeffXZceOGF8fjjj+9xf+dgoWLXL6JE5182iJx22mlZY2Nj/vKuXbuyurq6rKWlZY/7L1myJKuqqtpP0w0tEZEtX758r/tcffXV2UknnVRw3UUXXZTNnDlzACcbGvZl/f70pz9lEZH95z//2S8zDTUvvPBCFhFZa2trr/t8/OMfz84///yC66ZNm5Z94QtfGOjxhoR9WUPPg2/t7W9/e/ajH/1oj7c5B9/a3tavVOffoHll5LXXXou2traYMWNG/roRI0bEjBkzYt26db3e7+WXX46jjjoq6uvr37LeKLRu3bqC9Y6ImDlz5l7Xm92dcsopMW7cuPjQhz4Uf/7zn1OPM2h0dnZGRER1dXWv+zgH925f1jDC82Bvdu3aFbfffnt0d3f3+udInIO925f1iyjN+TdoYuRf//pX7Nq1a7ff0lpTU9Pre/CTJk2K2267Le688874+c9/Hj09PXH66afHc889tz9GHvLa29v3uN5dXV3x3//+N9FUQ8e4cePilltuid/+9rfx29/+Nurr62P69Onx8MMPpx4tuZ6enpg/f36cccYZcfLJJ/e6X2/n4IH6uZs32tc19Dy4u40bN8Zhhx0WFRUV8cUvfjGWL18eJ5544h73dQ7urpj1K9X5NyC/Dn5/aWhoKKi1008/PU444YS49dZb4/rrr084GQeCSZMmxaRJk/KXTz/99NiyZUvcdNNN8bOf/SzhZOk1NjbGY489Fg888EDqUYasfV1Dz4O7mzRpUmzYsCE6OzvjN7/5TcydOzdaW1t7/YZKoWLWr1Tn36CJkSOOOCIOOuig6OjoKLi+o6Mjamtr9+kYBx98cLz73e+OzZs3D8SIw05tbe0e17uysjIOOeSQRFMNbaeddtoB/w143rx5cdddd8XatWtj/Pjxe923t3NwX//PD1fFrOGbeR7831+PP+644yIiYurUqfG3v/0tvve978Wtt966277Owd0Vs35v1tfzb9C8TVNeXh5Tp06N1atX56/r6emJ1atX7/W9qjfatWtXbNy4McaNGzdQYw4rDQ0NBesdEXHvvffu83qzuw0bNhyw51+WZTFv3rxYvnx53H///TFx4sS3vI9zsFBf1vDNPA/urqenJ3K53B5vcw6+tb2t35v1+fzr90dgS+j222/PKioqsqVLl2ZPPPFEdvnll2djxozJ2tvbsyzLsk996lPZwoUL8/s3Nzdnq1atyrZs2ZK1tbVlF198cTZq1Kjs8ccfT/UlJLVjx47skUceyR555JEsIrLvfve72SOPPJL94x//yLIsyxYuXJh96lOfyu//97//PTv00EOzr3zlK9mTTz6ZLVq0KDvooIOylStXpvoSkip2/W666aZsxYoV2dNPP51t3Lgx+9KXvpSNGDEiu++++1J9CUldccUVWVVVVbZmzZps+/bt+e2VV17J7/Pm/8N//vOfs5EjR2bf+c53sieffDK77rrrsoMPPjjbuHFjii8hub6soefBQgsXLsxaW1uzrVu3Zo8++mi2cOHCrKysLPvjH/+YZZlz8K0Uu36lOv8GVYxkWZb94Ac/yCZMmJCVl5dnp512WrZ+/fr8bR/84AezuXPn5i/Pnz8/v29NTU324Q9/OHv44YcTTD04vP6jpm/eXl+zuXPnZh/84Ad3u88pp5ySlZeXZ8ccc0y2ZMmS/T73YFHs+t14443Zsccem40aNSqrrq7Opk+fnt1///1phh8E9rR2EVFwTr35/3CWZdmvf/3r7J3vfGdWXl6enXTSSdndd9+9fwcfRPqyhp4HC332s5/NjjrqqKy8vDw78sgjs3POOSf/jTTLnINvpdj1K9X5V5ZlWVbcaykAAKUzaD4zAgAcmMQIAJCUGAEAkhIjAEBSYgQASEqMAABJiREAICkxAgAkJUYAgKTECACQlBgBAJISIwBAUv8HR2DNVRGaD+IAAAAASUVORK5CYII=\n"
          },
          "metadata": {}
        }
      ],
      "source": [
        "assistant_response_clean = assistant_response.split(\"<tool_call>\")[-1].split(\"</tool_call>\")[0]\n",
        "assistant_response_json = json.loads(assistant_response_clean)\n",
        "\n",
        "def execute_function_call(assistant_response_json):\n",
        "    if assistant_response_json['name'] == 'Python_REPL':\n",
        "        tool_response = exec(assistant_response_json['arguments']['python_code'])\n",
        "    else:\n",
        "        tool_response = assistant_response_json\n",
        "    return tool_response\n",
        "\n",
        "execute_function_call(assistant_response_json)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "4c245aac-a9bc-4f05-ace1-eb26f39062a8",
      "metadata": {
        "id": "4c245aac-a9bc-4f05-ace1-eb26f39062a8"
      },
      "source": [
        "By imposing that the language model needs to call a tool, we cannot let the model to respond to a conversational question. For example to `tell us a joke`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "id": "0c737216-ce27-4b3b-954e-7056a72ec053",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:39.949344Z",
          "iopub.status.busy": "2025-07-16T13:33:39.948900Z",
          "iopub.status.idle": "2025-07-16T13:33:40.610275Z",
          "shell.execute_reply": "2025-07-16T13:33:40.609556Z",
          "shell.execute_reply.started": "2025-07-16T13:33:39.949321Z"
        },
        "id": "0c737216-ce27-4b3b-954e-7056a72ec053",
        "outputId": "fa88824c-b3a0-4c19-8444-2a9de299e804",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "<tool_call>\n",
            "{\"name\": \"Python_REPL\", \"arguments\": {\"python_code\": \"print('Hello, world!')\"}}\n",
            "</tool_call>"
          ]
        }
      ],
      "source": [
        "messages = [\n",
        "    {\"role\": \"user\", \"content\": \"Tell me a joke\"},\n",
        "]\n",
        "assistant_response = generate_response_with_tool_call_token(messages, tools=[tool])"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "e1108f58-121e-45b6-963a-a51d750120c8",
      "metadata": {
        "id": "e1108f58-121e-45b6-963a-a51d750120c8"
      },
      "source": [
        "We could use a trick. We could provide a \"tool\" to get back that behavior. Let's give a description of that \"tool\":"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 17,
      "id": "11ed5f67-a15e-489a-8666-3c4b97f7c640",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:40.611006Z",
          "iopub.status.busy": "2025-07-16T13:33:40.610856Z",
          "iopub.status.idle": "2025-07-16T13:33:40.616089Z",
          "shell.execute_reply": "2025-07-16T13:33:40.615488Z",
          "shell.execute_reply.started": "2025-07-16T13:33:40.610993Z"
        },
        "id": "11ed5f67-a15e-489a-8666-3c4b97f7c640",
        "outputId": "ce342695-1444-4e49-ceaa-62ca731291b7",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "{\n",
            "    \"type\": \"function\",\n",
            "    \"function\": {\n",
            "        \"name\": \"ConversationalResponse\",\n",
            "        \"strict\": true,\n",
            "        \"parameters\": {\n",
            "            \"description\": \"Respond in a conversational manner. Be kind and helpful.\",\n",
            "            \"properties\": {\n",
            "                \"response\": {\n",
            "                    \"description\": \"A conversational response to the user's query\",\n",
            "                    \"title\": \"Response\",\n",
            "                    \"type\": \"string\"\n",
            "                }\n",
            "            },\n",
            "            \"required\": [\n",
            "                \"response\"\n",
            "            ],\n",
            "            \"title\": \"ConversationalResponse\",\n",
            "            \"type\": \"object\",\n",
            "            \"additionalProperties\": false\n",
            "        },\n",
            "        \"description\": \"Respond in a conversational manner. Be kind and helpful.\"\n",
            "    }\n",
            "}\n"
          ]
        }
      ],
      "source": [
        "class ConversationalResponse(BaseModel):\n",
        "    \"\"\"Respond in a conversational manner. Be kind and helpful.\"\"\"\n",
        "\n",
        "    response: str = Field(description=\"A conversational response to the user's query\")\n",
        "tool_conversational_response = pydantic_function_tool(ConversationalResponse)\n",
        "print(json.dumps(tool_conversational_response, indent=4))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 18,
      "id": "f19663d4-5721-4871-9581-7c8edbcc5318",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:40.616617Z",
          "iopub.status.busy": "2025-07-16T13:33:40.616485Z",
          "iopub.status.idle": "2025-07-16T13:33:42.289695Z",
          "shell.execute_reply": "2025-07-16T13:33:42.288989Z",
          "shell.execute_reply.started": "2025-07-16T13:33:40.616604Z"
        },
        "id": "f19663d4-5721-4871-9581-7c8edbcc5318",
        "outputId": "9ebab9bd-9a8c-4b84-e0ae-291aaef673ba",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "<tool_call>\n",
            "{\"name\": \"ConversationalResponse\", \"arguments\": {\"response\": \"Here's a joke for you: A man in a hat with a hat on it, a hat with a hat on it, and a hat with a hat on it... It's a hat with a hat on it!\"}}\n",
            "</tool_call>"
          ]
        }
      ],
      "source": [
        "assistant_response = generate_response_with_tool_call_token(messages, tools=[tool, tool_conversational_response])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 19,
      "id": "e3cb925e-49d1-4788-8f64-346282b9b193",
      "metadata": {
        "execution": {
          "iopub.execute_input": "2025-07-16T13:33:42.290470Z",
          "iopub.status.busy": "2025-07-16T13:33:42.290320Z",
          "iopub.status.idle": "2025-07-16T13:33:42.294636Z",
          "shell.execute_reply": "2025-07-16T13:33:42.294292Z",
          "shell.execute_reply.started": "2025-07-16T13:33:42.290457Z"
        },
        "id": "e3cb925e-49d1-4788-8f64-346282b9b193",
        "outputId": "9461114b-2482-4825-af0e-121c0ebb6df5",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "\"Here's a joke for you: A man in a hat with a hat on it, a hat with a hat on it, and a hat with a hat on it... It's a hat with a hat on it!\""
            ],
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            }
          },
          "metadata": {},
          "execution_count": 19
        }
      ],
      "source": [
        "assistant_response_clean = assistant_response.split(\"<tool_call>\")[-1].split(\"</tool_call>\")[0]\n",
        "assistant_response_json = json.loads(assistant_response_clean)\n",
        "\n",
        "def execute_function_call(assistant_response_json):\n",
        "    if assistant_response_json['name'] == 'Python_REPL':\n",
        "        tool_response = exec(assistant_response_json['arguments']['python_code'])\n",
        "    elif assistant_response_json['name'] == 'ConversationalResponse':\n",
        "        tool_response = assistant_response_json['arguments']['response']\n",
        "    else:\n",
        "        tool_response = assistant_response_json\n",
        "    return tool_response\n",
        "\n",
        "execute_function_call(assistant_response_json)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "5393a3b8-4739-4e27-a572-dc57280675f3",
      "metadata": {
        "id": "5393a3b8-4739-4e27-a572-dc57280675f3"
      },
      "source": [
        "Not clear that it's a good joke but we got back the behavior we were expecting."
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "## WebAssembly\n",
        "\n",
        "For the WebAssembly part, run the code in the [WebAssembly section of the post](https://alonsosilvaallende.github.io/blog/posts/2025-07-05-Understanding-Function-Calling/Understanding_Function_Calling.html#webassembly)"
      ],
      "metadata": {
        "id": "jD5LnbV4QhCA"
      },
      "id": "jD5LnbV4QhCA"
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.12.4"
    },
    "colab": {
      "provenance": [],
      "gpuType": "T4",
      "include_colab_link": true
    },
    "accelerator": "GPU"
  },
  "nbformat": 4,
  "nbformat_minor": 5
}
No results found