@NassimElH01
Created November 26, 2024 11:52
ChatGPT_Structured_Output.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/NassimElH01/22b9dc4552e2571a57974fe38d686ac1/chatgpt_structured_output.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"#Structured output from prompts"
],
"metadata": {
"id": "wRaQ4yNKXyFA"
}
},
{
"cell_type": "markdown",
"source": [
"###Install OpenAI components"
],
"metadata": {
"id": "R_M3C873eegq"
}
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"id": "mW4Wx25zsIPC",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "f3b40a39-ca7d-430c-c211-5f896ed0bfd0"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Requirement already satisfied: openai in /usr/local/lib/python3.10/dist-packages (1.54.4)\n",
"Requirement already satisfied: anyio<5,>=3.5.0 in /usr/local/lib/python3.10/dist-packages (from openai) (3.7.1)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /usr/local/lib/python3.10/dist-packages (from openai) (1.9.0)\n",
"Requirement already satisfied: httpx<1,>=0.23.0 in /usr/local/lib/python3.10/dist-packages (from openai) (0.27.2)\n",
"Requirement already satisfied: jiter<1,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from openai) (0.7.1)\n",
"Requirement already satisfied: pydantic<3,>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from openai) (2.9.2)\n",
"Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from openai) (1.3.1)\n",
"Requirement already satisfied: tqdm>4 in /usr/local/lib/python3.10/dist-packages (from openai) (4.66.6)\n",
"Requirement already satisfied: typing-extensions<5,>=4.11 in /usr/local/lib/python3.10/dist-packages (from openai) (4.12.2)\n",
"Requirement already satisfied: idna>=2.8 in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai) (3.10)\n",
"Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai) (1.2.2)\n",
"Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx<1,>=0.23.0->openai) (2024.8.30)\n",
"Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx<1,>=0.23.0->openai) (1.0.7)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx<1,>=0.23.0->openai) (0.14.0)\n",
"Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1.9.0->openai) (0.7.0)\n",
"Requirement already satisfied: pydantic-core==2.23.4 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1.9.0->openai) (2.23.4)\n"
]
}
],
"source": [
"pip install openai"
]
},
{
"cell_type": "code",
"source": [
"pip install python-dotenv"
],
"metadata": {
"id": "zoG7LD58sV6F",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "edc8a3e6-dc7a-4f05-85c9-3e162b412e0c"
},
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Requirement already satisfied: python-dotenv in /usr/local/lib/python3.10/dist-packages (1.0.1)\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"###Mount Google Drive"
],
"metadata": {
"id": "nuqf1lJ3gHVD"
}
},
{
"cell_type": "code",
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')"
],
"metadata": {
"id": "umDyiesRsj8U",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "50272610-0914-430f-8bee-2662df6844ca"
},
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"from dotenv import load_dotenv"
],
"metadata": {
"id": "Np47jgufsvjC"
},
"execution_count": 12,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"###Load .env file"
],
"metadata": {
"id": "mFHzjazGgvq2"
}
},
{
"cell_type": "code",
"source": [
"load_dotenv('drive/My Drive/Colab Notebooks/env')"
],
"metadata": {
"id": "0fFuJL4Vs7Kx",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "47d24a48-f985-4695-fd91-99f98d0ba168"
},
"execution_count": 13,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {},
"execution_count": 13
}
]
},
{
"cell_type": "markdown",
"source": [
"###Initialize OpenAI client"
],
"metadata": {
"id": "nnOzY4EkhAQG"
}
},
{
"cell_type": "code",
"source": [
"from openai import OpenAI as openai"
],
"metadata": {
"id": "2gSFihVstLPH"
},
"execution_count": 14,
"outputs": []
},
{
"cell_type": "code",
"source": [
"import os\n",
"# OpenAI() reads OPENAI_API_KEY from the environment, so only confirm that it is set; never echo the key.\n",
"\"OPENAI_API_KEY\" in os.environ"
],
"metadata": {
"id": "EiQPc5lgtUMo",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "8c9ff34f-ce84-4386-f363-87703b35d460"
},
"execution_count": 15,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {},
"execution_count": 15
}
]
},
{
"cell_type": "code",
"source": [
"client = openai()\n"
],
"metadata": {
"id": "ieGjr5HqtfWv"
},
"execution_count": 16,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"###ChatGPT routine"
],
"metadata": {
"id": "G84WlYpIiyzM"
}
},
{
"cell_type": "code",
"source": [
"def chat_completion(prompt, model=\"gpt-4\", temperature=0):\n",
"    \"\"\"Send a single-turn prompt to the chat API, print the reply, and return it.\"\"\"\n",
"    res = client.chat.completions.create(\n",
"        model=model,\n",
"        messages=[{\"role\": \"user\", \"content\": prompt}],\n",
"        temperature=temperature,  # 0 keeps the output as repeatable as possible\n",
"    )\n",
"    # Take the text of the first returned choice.\n",
"    result = res.choices[0].message.content\n",
"    print(result)\n",
"    return result\n"
],
"metadata": {
"id": "ayNbA97PvXIM"
},
"execution_count": 17,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"###Prompt for structured JSON chat response"
],
"metadata": {
"id": "r3eKRNCvRcBy"
}
},
{
"cell_type": "code",
"source": [
"prompt = \"\"\"\n",
"Give a JSON output with 10 names of animals and number of legs. The output must be accepted\n",
"by json.loads.\n",
"\"\"\"\n",
"\n",
"\n"
],
"metadata": {
"id": "Y-Yk-S4ayy6j"
},
"execution_count": 18,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"###Generate chat completion"
],
"metadata": {
"id": "Wi2vlB06AJgA"
}
},
{
"cell_type": "code",
"source": [
"result = chat_completion(prompt, model='gpt-4')"
],
"metadata": {
"id": "0Eh1veJuy5qn",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "9ae3863f-af74-4c41-991e-9431199fdf6e"
},
"execution_count": 19,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"{\n",
" \"animals\": [\n",
" {\n",
" \"name\": \"Dog\",\n",
" \"legs\": 4\n",
" },\n",
" {\n",
" \"name\": \"Cat\",\n",
" \"legs\": 4\n",
" },\n",
" {\n",
" \"name\": \"Spider\",\n",
" \"legs\": 8\n",
" },\n",
" {\n",
" \"name\": \"Chicken\",\n",
" \"legs\": 2\n",
" },\n",
" {\n",
" \"name\": \"Horse\",\n",
" \"legs\": 4\n",
" },\n",
" {\n",
" \"name\": \"Fish\",\n",
" \"legs\": 0\n",
" },\n",
" {\n",
" \"name\": \"Elephant\",\n",
" \"legs\": 4\n",
" },\n",
" {\n",
" \"name\": \"Kangaroo\",\n",
" \"legs\": 2\n",
" },\n",
" {\n",
" \"name\": \"Duck\",\n",
" \"legs\": 2\n",
" },\n",
" {\n",
" \"name\": \"Snake\",\n",
" \"legs\": 0\n",
" }\n",
" ]\n",
"}\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"###Load response into JSON"
],
"metadata": {
"id": "lgpDI2V4AnK9"
}
},
{
"cell_type": "code",
"source": [
"import json\n",
"\n",
"loadedJSON=json.loads(result)"
],
"metadata": {
"id": "dewgMb985Kti"
},
"execution_count": 20,
"outputs": []
},
{
"cell_type": "code",
"source": [
"animals=loadedJSON[\"animals\"]"
],
"metadata": {
"id": "F8i_o7jw8SUQ"
},
"execution_count": 21,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"###Displaying data for first animal"
],
"metadata": {
"id": "aZb2Y9wjBga4"
}
},
{
"cell_type": "code",
"source": [
"animals[0]"
],
"metadata": {
"id": "5OTNNNMZHPsR",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "a34d1f24-94d3-48fb-f62d-651489cc1c65"
},
"execution_count": 22,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'name': 'Dog', 'legs': 4}"
]
},
"metadata": {},
"execution_count": 22
}
]
},
{
"cell_type": "markdown",
"source": [
"###Displaying data in a formatted way"
],
"metadata": {
"id": "VukyM8FHB6vz"
}
},
{
"cell_type": "code",
"source": [
"for a in animals:\n",
"    print(\"Details on \" + a[\"name\"]+\":\")\n",
"    #print(\" Continent: \" + a[\"continent\"])\n",
"    print(\" Legs: \" + str(a[\"legs\"]))"
],
"metadata": {
"id": "pO9M0gX_IJc4",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "025a89a0-5a6f-462a-f071-7cc0b291d79a"
},
"execution_count": 23,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Details on Dog:\n",
" Legs: 4\n",
"Details on Cat:\n",
" Legs: 4\n",
"Details on Spider:\n",
" Legs: 8\n",
"Details on Chicken:\n",
" Legs: 2\n",
"Details on Horse:\n",
" Legs: 4\n",
"Details on Fish:\n",
" Legs: 0\n",
"Details on Elephant:\n",
" Legs: 4\n",
"Details on Kangaroo:\n",
" Legs: 2\n",
"Details on Duck:\n",
" Legs: 2\n",
"Details on Snake:\n",
" Legs: 0\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"#Exercises"
],
"metadata": {
"id": "5OxJTp9I9Cqc"
}
},
{
"cell_type": "markdown",
"source": [
"\n",
"\n",
"1. Try to add more animal attributes (e.g. continent, nutrition, ...) to the prompt.\n",
"2. Try varying the number of animals.\n",
"3. Does it always work?\n",
"4. Briefly describe how you would incorporate a feature like the one you developed above into an app. Not technically, but conceptually, taking into account that the chat response may not always have an appropriate structure for formatting.\n",
"\n"
],
"metadata": {
"id": "lAont_HU9GiN"
}
},
{
"cell_type": "code",
"source": [
"# Prompt for structured JSON chat response (extended with more attributes)\n",
"prompt = \"\"\"\n",
"Give a JSON output with 10 names of animals, the number of legs, their continent of origin, and their primary nutrition type. The output must be accepted by json.loads.\n",
"\"\"\""
],
"metadata": {
"id": "zzfd55PTTQsA"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Exercises 1-3: run the extended prompt and check whether the response parses.\n",
"\n",
"# Generate chat completion\n",
"result = chat_completion(prompt, model='gpt-4')\n",
"# Load response into JSON\n",
"\n",
"try:\n",
"    loadedJSON=json.loads(result)\n",
"    animals=loadedJSON[\"animals\"]\n",
"    # Display the data in a formatted way\n",
"    for a in animals:\n",
"        print(\"Details on \" + a[\"name\"]+\":\")\n",
"        print(\" Continent: \" + a[\"continent\"])\n",
"        print(\" Legs: \" + str(a[\"legs\"]))\n",
"        print(\" Nutrition: \" + a[\"nutrition\"])\n",
"except json.JSONDecodeError as e:\n",
"    print(f\"Invalid JSON received from the API: {e}\")\n",
"    print(f\"Raw API response: {result}\")\n",
"except KeyError as e:\n",
"    print(f\"Missing key in JSON response: {e}\")\n",
"    print(f\"Raw API response: {result}\")"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "WljxNVp2SBCL",
"outputId": "5d1a7792-76fd-4e01-988e-a8857e0d58fa"
},
"execution_count": 27,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"[\n",
" {\n",
" \"name\": \"Elephant\",\n",
" \"legs\": 4,\n",
" \"origin\": \"Africa\",\n",
" \"nutrition\": \"Herbivore\"\n",
" },\n",
" {\n",
" \"name\": \"Kangaroo\",\n",
" \"legs\": 2,\n",
" \"origin\": \"Australia\",\n",
" \"nutrition\": \"Herbivore\"\n",
" },\n",
" {\n",
" \"name\": \"Penguin\",\n",
" \"legs\": 2,\n",
" \"origin\": \"Antarctica\",\n",
" \"nutrition\": \"Carnivore\"\n",
" },\n",
" {\n",
" \"name\": \"Polar Bear\",\n",
" \"legs\": 4,\n",
" \"origin\": \"Arctic\",\n",
" \"nutrition\": \"Carnivore\"\n",
" },\n",
" {\n",
" \"name\": \"Giraffe\",\n",
" \"legs\": 4,\n",
" \"origin\": \"Africa\",\n",
" \"nutrition\": \"Herbivore\"\n",
" },\n",
" {\n",
" \"name\": \"Lion\",\n",
" \"legs\": 4,\n",
" \"origin\": \"Africa\",\n",
" \"nutrition\": \"Carnivore\"\n",
" },\n",
" {\n",
" \"name\": \"Panda\",\n",
" \"legs\": 4,\n",
" \"origin\": \"Asia\",\n",
" \"nutrition\": \"Herbivore\"\n",
" },\n",
" {\n",
" \"name\": \"Koala\",\n",
" \"legs\": 4,\n",
" \"origin\": \"Australia\",\n",
" \"nutrition\": \"Herbivore\"\n",
" },\n",
" {\n",
" \"name\": \"Wolf\",\n",
" \"legs\": 4,\n",
" \"origin\": \"North America\",\n",
" \"nutrition\": \"Carnivore\"\n",
" },\n",
" {\n",
" \"name\": \"Jaguar\",\n",
" \"legs\": 4,\n",
" \"origin\": \"South America\",\n",
" \"nutrition\": \"Carnivore\"\n",
" }\n",
"]\n"
]
},
{
"output_type": "error",
"ename": "TypeError",
"evalue": "list indices must be integers or slices, not str",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-27-9dc1eb570268>\u001b[0m in \u001b[0;36m<cell line: 7>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0mloadedJSON\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mjson\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mloads\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresult\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 9\u001b[0;31m \u001b[0manimals\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mloadedJSON\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"animals\"\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 10\u001b[0m \u001b[0;31m# ###Displaying data in a formattet way\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 11\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0ma\u001b[0m \u001b[0;32min\u001b[0m \u001b[0manimals\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mTypeError\u001b[0m: list indices must be integers or slices, not str"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"No, it does not always work. ChatGPT can sometimes return JSON that is malformed or lacks the expected keys. In the run above, for instance, the model returned a top-level list instead of an object with an `animals` key, which raised a `TypeError` that neither `except` clause caught. The error handling in the code tries to cover such situations, but unforeseen problems can still occur with external APIs. To improve reliability, one could implement more robust error handling (e.g. additional `try-except` blocks or input validation) or repeat the request to the API when an error occurs. It may also be necessary to give more specific instructions in the prompt."
],
"metadata": {
"id": "6d_xyb9gUEht"
}
},
{
"cell_type": "markdown",
"source": [
"In an application, I would integrate this feature through a user interface where the user can enter a prompt. The app would then send this prompt to the ChatGPT API, as in the example.\n",
"\n",
"The critical part is handling ChatGPT's response. Instead of simply printing the response, the app should first try to parse it as JSON. If parsing succeeds, the data can be displayed neatly in the app, e.g. in a table or a list.\n",
"\n",
"If JSON parsing fails (as in the example, where `json.loads()` can fail), the app should show an error message to the user. It is important to include the raw API response in the error message so that the user or developer can diagnose the problem. It may also be useful to show an input field where the user can edit the prompt to improve its format.\n",
"\n",
"Ideally, the app would attempt to repair malformed JSON. This could be done by using a separate library for repairing JSON or by implementing logic that recognizes common errors in ChatGPT's responses and corrects them. This would improve the user experience considerably. Alternatively, the application could show the user the raw response and let them correct any structural errors manually.\n"
],
"metadata": {
"id": "uZwKAxhPTKA4"
}
}
]
}
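
The answers to exercises 3 and 4 describe validating the reply, tolerating common structural deviations, and repeating the request on failure. A minimal sketch of that idea follows, under stated assumptions: the names `parse_animals`, `ask_for_animals`, `KEY_ALIASES`, and `REQUIRED_KEYS` are hypothetical and not part of the notebook, the alias table is only an illustration, and `ask` stands for any callable that takes a prompt and returns the reply text.

```python
import json

# Hypothetical alias table: alternative key names mapped to the names the display code expects.
KEY_ALIASES = {"origin": "continent", "diet": "nutrition"}
REQUIRED_KEYS = ("name", "legs", "continent", "nutrition")
FENCE = "`" * 3  # Markdown code-fence marker that models sometimes wrap around JSON


def parse_animals(raw):
    """Parse a model reply into a list of animal dicts, or raise ValueError."""
    text = raw.strip()
    # Drop a surrounding Markdown code fence if the reply has one.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError as e:
        raise ValueError(f"invalid JSON: {e}") from e
    # Accept both {"animals": [...]} and a bare top-level list.
    if isinstance(data, dict):
        data = data.get("animals")
    if not isinstance(data, list):
        raise ValueError("expected a list of animals")
    animals = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError(f"expected an object, got {item!r}")
        # Rename known alternative keys, then check that nothing is missing.
        animal = {KEY_ALIASES.get(k, k): v for k, v in item.items()}
        missing = [k for k in REQUIRED_KEYS if k not in animal]
        if missing:
            raise ValueError(f"missing keys {missing} in {item!r}")
        animals.append(animal)
    return animals


def ask_for_animals(ask, prompt, max_attempts=3):
    """Call ask(prompt) until the reply parses, at most max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_animals(ask(prompt))
        except ValueError as e:
            last_error = e  # keep the reason so the final error is diagnosable
    raise RuntimeError(f"no usable reply after {max_attempts} attempts: {last_error}")
```

In the notebook this could be called as `ask_for_animals(chat_completion, prompt)`. Note that with `temperature=0` a repeated request is likely to return much the same reply, so a retry may only help if the prompt is also made more specific or the temperature is raised.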