Self-reasoning framework
# Self-Discover Framework Implementation

This notebook implements the Self-Discover framework for enabling LLMs to autonomously discover reasoning structures for complex tasks, as described in the research paper ("Self-Discover: Large Language Models Self-Compose Reasoning Structures", Zhou et al., 2024).

The framework consists of two main stages:

1. Discover an intrinsic reasoning structure for the task
2. Use the discovered structure to solve problem instances

Stage 1 composes the reasoning structure through three actions:

1. SELECT relevant reasoning modules
2. ADAPT the selected modules to be task-specific
3. IMPLEMENT the adapted modules into an explicit structure

Stage 2 simply follows the reasoning structure to solve each problem instance.
```python
import openai

def Self_Discover(task, reasoning_modules):
    """Main Self-Discover framework.

    Args:
        task (list): Problem instances of the task to solve
        reasoning_modules (list): List of available reasoning modules

    Returns:
        list: Solutions to the task instances
    """
    # The problem instances double as unlabelled task examples for Stage 1
    task_examples = task

    # Stage 1: Discover a reasoning structure at the task level

    # SELECT relevant reasoning modules
    selected_modules = SELECT(reasoning_modules, task_examples)

    # ADAPT selected modules to be task-specific
    adapted_modules = ADAPT(selected_modules, task_examples)

    # IMPLEMENT adapted modules into a reasoning structure
    reasoning_structure = IMPLEMENT(adapted_modules, task_examples)

    # Stage 2: Solve each problem instance using the discovered structure
    solutions = []
    for problem_instance in task:
        # Follow the reasoning structure to solve the instance
        solution = SOLVE(problem_instance, reasoning_structure)
        solutions.append(solution)

    return solutions
```
```python
def SELECT(reasoning_modules, task_examples):
    """Use the LLM to select a subset of relevant reasoning modules.

    Args:
        reasoning_modules (list): List of available reasoning modules
        task_examples (list): Examples of the task

    Returns:
        list: Selected subset of reasoning modules
    """
    # Generate prompt for the LLM to select modules
    prompt = generate_select_prompt(reasoning_modules, task_examples)

    # Query the LLM to select relevant modules (legacy pre-1.0 OpenAI completions API)
    selected_modules = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=100
    )

    # The completion is expected to be a comma-separated list of module names
    return selected_modules.choices[0].text.strip().split(',')
```
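The SELECT step calls `generate_select_prompt`, which is never defined in this notebook. Below is a minimal sketch of how such a prompt builder could look, assuming the selection is requested as a comma-separated list (to match the `split(',')` above); the prompt wording is illustrative, not taken from the paper.

```python
def generate_select_prompt(reasoning_modules, task_examples):
    """Hypothetical prompt builder for the SELECT step (not part of the original notebook)."""
    modules_text = "\n".join(f"- {module}" for module in reasoning_modules)
    examples_text = "\n".join(task_examples[:3])  # a few unlabelled task examples suffice
    return (
        "Select the reasoning modules that are crucial for solving the task below.\n\n"
        f"Available reasoning modules:\n{modules_text}\n\n"
        f"Task examples:\n{examples_text}\n\n"
        "Answer with a comma-separated list of the selected module names."
    )
```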
```python
def ADAPT(selected_modules, task_examples):
    """Use LLM to rephrase selected modules to be task-specific

    Args:
        selected_modules (list): Subset of reasoning modules
        task_examples (list): Examples of the task

    Returns:
        list: Adapted task-specific module descriptions
    """
    # Generate prompt for LLM to adapt modules
    prompt = generate_adapt_prompt(selected_modules, task_examples)

    # Query LLM to rephrase modules to be task-specific
    adapted_modules = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=200
    )

    return adapted_modules.choices[0].text.strip().split('\n')
```
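`generate_adapt_prompt` is likewise undefined; here is a comparable sketch, assuming the adapted modules are returned one per line to match the `split('\n')` above. The wording is again an illustration, not the paper's prompt.

```python
def generate_adapt_prompt(selected_modules, task_examples):
    """Hypothetical prompt builder for the ADAPT step (not part of the original notebook)."""
    modules_text = "\n".join(f"- {module}" for module in selected_modules)
    examples_text = "\n".join(task_examples[:3])
    return (
        "Rephrase and specialize each reasoning module below so that it directly\n"
        "helps solve the task shown in the examples.\n\n"
        f"Selected modules:\n{modules_text}\n\n"
        f"Task examples:\n{examples_text}\n\n"
        "Answer with one adapted module description per line."
    )
```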
```python
def IMPLEMENT(adapted_modules, task_examples):
    """Use LLM to implement adapted modules into a reasoning structure

    Args:
        adapted_modules (list): Task-specific module descriptions
        task_examples (list): Examples of the task

    Returns:
        str: JSON string representing the reasoning structure
    """
    # Retrieve example of human-written reasoning structure
    human_example_structure = retrieve_human_example_structure()

    # Generate prompt for LLM to implement structure
    prompt = generate_implement_prompt(human_example_structure,
                                       adapted_modules,
                                       task_examples)

    # Query LLM to generate reasoning structure
    reasoning_structure = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=500
    )

    return reasoning_structure.choices[0].text.strip()
```
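`retrieve_human_example_structure` and `generate_implement_prompt` are also left undefined. The sketch below hard-codes a small demonstration structure and asks the model to emit a fillable JSON plan; both the demonstration and the prompt wording are assumptions made here for illustration.

```python
def retrieve_human_example_structure():
    """Return a hard-coded human-written reasoning structure used as a demonstration.

    The structure below is a placeholder invented for this sketch, not the
    demonstration used in the paper.
    """
    return (
        '{"Step 1: Restate the problem": "", '
        '"Step 2: List the relevant facts": "", '
        '"Step 3: Reason step by step towards the answer": "", '
        '"Final answer": ""}'
    )

def generate_implement_prompt(human_example_structure, adapted_modules, task_examples):
    """Hypothetical prompt builder for the IMPLEMENT step (not part of the original notebook)."""
    modules_text = "\n".join(f"- {module}" for module in adapted_modules)
    examples_text = "\n".join(task_examples[:3])
    return (
        "Operationalize the reasoning modules below into a step-by-step reasoning plan,\n"
        "written as a JSON object whose values are left empty to be filled in later.\n\n"
        f"Example of a reasoning structure:\n{human_example_structure}\n\n"
        f"Adapted reasoning modules:\n{modules_text}\n\n"
        f"Task examples:\n{examples_text}\n\n"
        "Answer with the JSON reasoning structure only."
    )
```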
```python
def SOLVE(problem_instance, reasoning_structure):
    """Use LLM to follow reasoning structure and solve problem instance

    Args:
        problem_instance (str): A single instance of the task
        reasoning_structure (str): JSON string of reasoning structure

    Returns:
        str: The solution to the problem instance
    """
    # Generate prompt for LLM to solve instance
    prompt = generate_solve_prompt(problem_instance, reasoning_structure)

    # Query LLM to follow structure and solve
    solution = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=200
    )

    return solution.choices[0].text.strip()
```
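Finally, `generate_solve_prompt` can be sketched the same way: it pairs the discovered structure with a single problem instance and asks the model to fill the structure in. Again, the wording is an assumption.

```python
def generate_solve_prompt(problem_instance, reasoning_structure):
    """Hypothetical prompt builder for the instance-level solve step (not part of the original notebook)."""
    return (
        "Follow the reasoning structure below step by step, filling in every value,\n"
        "and then state the final answer.\n\n"
        f"Reasoning structure:\n{reasoning_structure}\n\n"
        f"Problem:\n{problem_instance}"
    )
```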
## Example Usage

Here's an example of using the Self-Discover framework to solve a complex reasoning task.
```python
task = [
    "Instance 1 of the complex task",
    "Instance 2 of the complex task"
]

reasoning_modules = [
    "break problem into steps",
    "use logical reasoning",
    "leverage world knowledge",
    "think creatively"
]

solutions = Self_Discover(task, reasoning_modules)

print(solutions)
```
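Note that the cells above use the legacy pre-1.0 `openai` Python API (`openai.Completion.create` with the deprecated `text-davinci-002` engine). With the current 1.x client, the equivalent calls go through the chat completions endpoint instead; here is a minimal sketch of a drop-in helper, with the model name chosen purely as an assumption.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def query_llm(prompt, max_tokens=200, model="gpt-4o-mini"):
    """Replacement for the openai.Completion.create calls above (openai>=1.0).

    The model name is an assumption; substitute any chat model you have access to.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content.strip()
```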
Considering how methods like Pass@K training and Dr. GRPO are de-biasing LLM thinking, I wonder whether these methods also carry over to prompt engineering. The art would be knowing which method to use for which specific task types. https://github.com/RUCAIBox/Passk_Training https://github.com/sail-sg/understand-r1-zero