@aurotripathy
Last active December 16, 2024 21:11
{
"cells": [
{
"attachments": {
"image.png": {
"image/png": ""
}
},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tool Calling with Llama 3.1 and Furiosa RNGD\n",
"This notebook highlights the tool-calling capabilities of the Llama 3.1 models running on the \\\n",
"Furiosa RNGD (pronounced Renegade) LLM accelerator card. \\\n",
"Our goal is show the flow of a tool-calling LLM system. \\\n",
"We use the Llama-3.2-8B-Instruct model (running on RNGD) to demonstrate this. \n",
"\n",
"This notebook is inspired by the notebook at the link below \\\n",
"https://github.com/huggingface/huggingface-llama-recipes/blob/main/tool_calling/tool_calling.ipynb\n",
"\n",
"The LLM tool calling capability is important for Enterprise AI Apps. \\\n",
"To aid their generation, LLMs must be able to compute functions, make database-dips,\\\n",
"read rows/cols in spreadsheets, and draw charts & graphs (among other deterministic transformation on private data).\\\n",
"The sequence diagram below captures the flow of the code.\n",
"\n",
"![image.png](attachment:image.png)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from furiosa_llm import LLM, SamplingParams"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a custom tool that Llama will suggest you use to solve your task\n",
"We'll create a custom tool that adds two integer numbers.\\\n",
"Note, Llama will decide whether to invoke this tool (based on the prompt you provide) \\\n",
"You can create any custom tool; for demo purposes, here, we created one that adds two integers."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# our custom tool\n",
"def add_two_integers(x: int, y: int):\n",
" \"\"\"\n",
" Adds two integers\n",
"\n",
" Args:\n",
" x: An integer\n",
" y: An integer\n",
" \"\"\"\n",
" return x + y"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create the System Prompt \n",
"Below, we define the system_prompt.\n",
"We could use a chat template, however we chose to expose the details of the prompt to illustrate the steps.\n",
"\n",
"The system prompt for tool calling as two components:\n",
"\n",
"`system instruction for tool calling`: This is the default system-level prompt. \\\n",
"It describes the tool-calling functionality and outlines its behaviour (including failure modes). \n",
"\n",
"`custom tool spec in JSON format`: This is the lilst of tools (function) the model can access.\\\n",
" In our example there's just one function `add_two_integers` with two input paramters. \\\n",
"\n",
"These two parts are combined to form the full system_prompt."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"system_instruction_for_tool_calling = \"\"\"\\\n",
"<|start_header_id|>system<|end_header_id|>\n",
"\n",
"You are an expert in composing functions. You are given a question and a set of possible functions.\n",
"Based on the question, you will need to make one or more function/tool calls to achieve the purpose.\n",
"If none of the function can be used, point it out. If the given question lacks the parameters required by the function,\n",
"also point it out. You should only return the function call in tools call sections.\n",
"\n",
"If you decide to invoke any of the function(s), you MUST put it in the format of [func_name1(params_name1=params_value1, params_name2=params_value2...), func_name2(params)]\n",
"You SHOULD NOT include any other text in the response.\n",
"\n",
"Here is a list of functions in JSON format that you can invoke.\"\"\"\n",
"\n",
"custom_tool = \"\"\"\\\n",
"[\n",
" {\n",
" \"name\": \"add_two_integers\",\n",
" \"description\": \"Adds two integer numerals\",\n",
" \"parameters\": {\n",
" \"type\": \"dict\",\n",
" \"required\": [\"x\", \"y\"],\n",
" \"properties\": {\n",
" \"x\": {\n",
" \"type\": \"integer\",\n",
" \"description\": \"An integer\"\n",
" },\n",
" \"y\": {\n",
" \"type\": \"integer\",\n",
" \"description\": \"An integer\"\n",
" },\n",
" }\n",
" }\n",
" }\n",
"]\"\"\"\n",
"\n",
"\n",
"system_prompt = f\"{system_instruction_for_tool_calling}{custom_tool}<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\n\"\n",
"\n",
"\n",
"print(f'*** Tool calling system prompt + custom tool spec: ***\\n{system_prompt}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add the user prompt\n",
"This is the question you, the user, will pose to the LLM and expect it to invoke the tool calling capability.\\\n",
"In this example, its \"What is the result of 12322 added to 1242453\"."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"user_prompt = \"What is the result of 12322 added to 1242453\"\n",
"\n",
"prompt = f\"{system_prompt}{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n\"\n",
"\n",
"print(f'*** Prompt after adding the user prompt: ***\\n{prompt}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Use Llama 3.1 to generate the completion based on the prompt\n",
"Now you're ready to execute step 1 of the LLM generation.\\\n",
"This step completes the prompt you just created\\\n",
"and should return the function call with the appropriate paramters filled in."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path = \"./Llama-3.1-8B-Instruct\"\n",
"llm = LLM.from_artifacts(path)\n",
"\n",
"# step 1 of 2\n",
"sampling_params = SamplingParams(max_tokens=20, top_p=0.3, top_k=100)\n",
"responses = llm.generate([prompt], sampling_params)\n",
"\n",
"for response in responses:\n",
" print(f'*** Response from step 1 ***:\\n{response.outputs[0].text}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Execute the function call locally\n",
"Now that the LLM has handed your app the function call; execute it locally on your server.\\\n",
"Note, the LLM has no means to execute it. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model_tool_call_response = responses[0].outputs[0].text\n",
"tool_call = model_tool_call_response[1:-1] # remove outer square brackets\n",
"print(f'*** Call to be executed locally: ***\\n{tool_call}')\n",
"\n",
"# execute function locally\n",
"executed_response = eval(tool_call)\n",
"print(f'*** Local function execution result: ***\\n{executed_response}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Assemble the entire prompt\n",
"Now that you executed the extracted function locally and executed it, you're ready to construct the entire prompt and elicit a completion.\n",
"\n",
"We can combine all the components into the complete prompt for the final output. This includes:\n",
"\n",
"* The initial system prompt\n",
"* The user input\n",
"* The tool's response (including local execution of the response)\n",
"\n",
"These elements together form the full interaction for the model.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Step 2 of 2\n",
"prompt = f\"{prompt}<|python_tag|>{model_tool_call_response}<|start_header_id|>ipython<|end_header_id|>\\n\\n\"\n",
"prompt = prompt + f'\"{executed_response}\"<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n'\n",
"print(f'***Print the assembled prompt from step 2: ***\\n\\n{prompt}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Finally, let's have the LLM generate a cogent answer to the user question\n",
"From the prompt above, have the LLM generate a cogent answer to the user question,\\\n",
"\"What is the result of 12322 added to 1242453\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Generation in step 2\n",
"print(f'Completing the step 2 prompt...')\n",
"responses = llm.generate([prompt], sampling_params)\n",
"for response in responses:\n",
" print(response.outputs[0].text)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that, in spite of LLM's unpredictable hallucinations, this tool-calling answer is repeatable \\\n",
"which is a requirement in Enterpise AI applications as they go about computing functions \\\n",
"making database-dips, extracting rows/columns/cells from spreadsheets, drawing charts and graphs, etc."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
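
### Optional: parse the tool call without `eval`
The notebook runs the model's reply through `eval`. As a minimal sketch of a stricter alternative, the code below parses the bracketed call list with Python's standard `ast` module and dispatches only to functions in an explicit registry. The names `ALLOWED_TOOLS` and `run_tool_calls` are hypothetical helpers introduced for illustration, and `sample_output` is a hand-written stand-in for the model's reply, not captured output from RNGD.

```python
import ast


def add_two_integers(x: int, y: int) -> int:
    """Adds two integers (same tool as in the notebook)."""
    return x + y


# Hypothetical registry: the only callables the model is allowed to invoke.
ALLOWED_TOOLS = {"add_two_integers": add_two_integers}


def run_tool_calls(model_text: str, tools: dict) -> list:
    """Parse '[func(a=1, b=2), ...]' and run each call through the registry."""
    tree = ast.parse(model_text.strip(), mode="eval")
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a bracketed list of calls")

    results = []
    for node in tree.body.elts:
        # Accept only plain calls such as name(...), not attribute access etc.
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            raise ValueError("expected a plain function call")
        if node.func.id not in tools:
            raise ValueError(f"unknown tool: {node.func.id}")
        if node.args or any(kw.arg is None for kw in node.keywords):
            raise ValueError("only explicit keyword arguments are supported")
        # literal_eval accepts literals only, so arguments cannot execute code.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        results.append(tools[node.func.id](**kwargs))
    return results


# Assumed shape of the step 1 reply, following the format in the system prompt.
sample_output = "[add_two_integers(x=12322, y=1242453)]"
results = run_tool_calls(sample_output, ALLOWED_TOOLS)
print(results)
```

Because the reply is only parsed, never evaluated, a call to anything outside the registry raises `ValueError` instead of running. The function returns one result per call, so it also covers replies that contain several tool calls.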