Building MiroMind: A Local Research Agent with MiroThinker 1.5

A complete guide to building a research agent that uses MiroThinker 1.5 (running locally via LM Studio) with SearXNG for web search capabilities.

What We're Building

MiroMind is a research agent that:

  • Takes your research questions
  • Searches the web using SearXNG (self-hosted, no API limits)
  • Fetches and reads relevant webpages
  • Synthesizes findings into comprehensive answers
  • Runs up to 200 tool-calling iterations per research session in deep mode (MiroThinker's strength)

Interfaces:

  • Web UI at localhost:8000
  • CLI for terminal use
  • REST API for integration

Prerequisites

1. LM Studio

Download from lmstudio.ai and install.

2. MiroThinker 1.5 Model

  • Open LM Studio
  • Search for "MiroThinker" in the model browser
  • Download the GGUF version (30B recommended for local use)
  • Load the model and start the server on port 1234

Verify it's running:

curl http://localhost:1234/v1/models

3. Docker

Required for running SearXNG. Install Docker Desktop or Docker Engine.

4. Python 3.11+

python --version  # Should be 3.11 or higher

Project Setup

Create Project Structure

mkdir miromind && cd miromind
git init

Create Requirements

requirements.txt:

fastapi>=0.109.0
uvicorn[standard]>=0.27.0
httpx>=0.26.0
beautifulsoup4>=4.12.0
lxml>=5.1.0
pyyaml>=6.0.1
typer>=0.9.0
rich>=13.7.0
sse-starlette>=2.0.0
pydantic>=2.0.0

Create Configuration

config.yaml:

lm_studio:
  url: "http://localhost:1234/v1"
  model: "mirothinker-v1.5"

searxng:
  url: "http://localhost:8081"

agent:
  max_iterations_quick: 10
  max_iterations_deep: 200
  context_keep_recent: 5

server:
  host: "0.0.0.0"
  port: 8000

Create Docker Compose for SearXNG

docker-compose.yaml:

services:
  searxng:
    image: searxng/searxng:latest
    container_name: miromind-searxng
    ports:
      - "8081:8080"
    volumes:
      - ./searxng:/etc/searxng:rw
    environment:
      - SEARXNG_BASE_URL=http://localhost:8081/
    restart: unless-stopped

Create SearXNG Configuration

mkdir searxng

searxng/settings.yml:

use_default_settings: true

server:
  secret_key: "change-this-in-production"
  limiter: false
  image_proxy: false

search:
  safe_search: 0
  autocomplete: ""
  default_lang: "en"
  formats:
    - html
    - json

engines:
  - name: google
    disabled: false
  - name: bing
    disabled: false
  - name: duckduckgo
    disabled: false
  - name: wikipedia
    disabled: false

outgoing:
  request_timeout: 10.0

Core Implementation

Directory Structure

miromind/
├── src/
│   ├── __init__.py
│   ├── config.py      # Configuration loader
│   ├── models.py      # Pydantic models
│   ├── agent.py       # Core agent with tool-calling loop
│   ├── main.py        # FastAPI application
│   └── tools/
│       ├── __init__.py
│       ├── search.py  # SearXNG search tool
│       └── fetch.py   # Webpage fetch tool
├── static/
│   └── index.html     # Web UI
└── cli.py             # Command-line interface
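
To scaffold this layout before filling in the files (the two __init__.py files can start empty; src/tools/__init__.py gets its content in a later step):

mkdir -p src/tools static
touch src/__init__.py src/tools/__init__.py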

Config Loader (src/config.py)

from pathlib import Path
from typing import Optional
import yaml
from pydantic import BaseModel


class LMStudioConfig(BaseModel):
    url: str = "http://localhost:1234/v1"
    model: str = "mirothinker-v1.5"


class SearXNGConfig(BaseModel):
    url: str = "http://localhost:8081"


class AgentConfig(BaseModel):
    max_iterations_quick: int = 10
    max_iterations_deep: int = 200
    context_keep_recent: int = 5


class ServerConfig(BaseModel):
    host: str = "0.0.0.0"
    port: int = 8000


class Config(BaseModel):
    lm_studio: LMStudioConfig = LMStudioConfig()
    searxng: SearXNGConfig = SearXNGConfig()
    agent: AgentConfig = AgentConfig()
    server: ServerConfig = ServerConfig()


_config: Optional[Config] = None


def load_config(path: Path = Path("config.yaml")) -> Config:
    global _config
    if _config is not None:
        return _config

    if path.exists():
        with open(path) as f:
            data = yaml.safe_load(f) or {}
        _config = Config(**data)
    else:
        _config = Config()

    return _config


def get_config() -> Config:
    if _config is None:
        return load_config()
    return _config

Pydantic Models (src/models.py)

from typing import Optional, Literal
from pydantic import BaseModel, Field
from enum import Enum


class ResearchMode(str, Enum):
    QUICK = "quick"
    DEEP = "deep"


class ResearchRequest(BaseModel):
    query: str = Field(..., description="The research query")
    mode: ResearchMode = Field(default=ResearchMode.QUICK)


class ResearchStatus(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


class ResearchResponse(BaseModel):
    id: str
    status: ResearchStatus
    query: str
    answer: Optional[str] = None
    tool_calls_count: int = 0
    error: Optional[str] = None


class StreamEvent(BaseModel):
    event: Literal["thinking", "tool_call", "tool_result", "answer", "error"]
    data: str

Search Tool (src/tools/search.py)

import httpx
from src.config import get_config


async def web_search(query: str, num_results: int = 10) -> str:
    """Search the web using SearXNG."""
    config = get_config()

    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.get(
            f"{config.searxng.url}/search",
            params={"q": query, "format": "json", "pageno": 1}
        )
        response.raise_for_status()
        data = response.json()

    results = data.get("results", [])[:num_results]

    if not results:
        return f"No results found for: {query}"

    formatted = []
    for i, r in enumerate(results, 1):
        title = r.get("title", "No title")
        url = r.get("url", "")
        snippet = r.get("content", "No description")
        formatted.append(f"{i}. {title}\n   URL: {url}\n   {snippet}")

    return "\n\n".join(formatted)


SEARCH_TOOL_DEFINITION = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for information.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"}
            },
            "required": ["query"]
        }
    }
}
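
A minimal sketch for smoke-testing the search tool on its own, run from the project root and assuming SearXNG is already up on the configured port (the query string is just an example):

import asyncio
from src.tools.search import web_search

async def main():
    # Prints the formatted result list returned by the tool
    print(await web_search("open source search engines", num_results=3))

asyncio.run(main())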

Fetch Tool (src/tools/fetch.py)

import httpx
from bs4 import BeautifulSoup


async def fetch_page(url: str, max_length: int = 8000) -> str:
    """Fetch and parse a webpage, extracting main text content."""
    headers = {"User-Agent": "Mozilla/5.0 (compatible; MiroMind/1.0)"}

    try:
        async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
            response = await client.get(url, headers=headers)
            response.raise_for_status()
            html = response.text
    except httpx.HTTPStatusError as e:
        return f"Error fetching {url}: HTTP {e.response.status_code}"
    except httpx.RequestError as e:
        return f"Error fetching {url}: {str(e)}"

    soup = BeautifulSoup(html, "lxml")

    for tag in soup(["script", "style", "nav", "footer", "header", "aside"]):
        tag.decompose()

    main = soup.find("main") or soup.find("article") or soup.find("body")

    if main is None:
        return f"Could not extract content from {url}"

    text = main.get_text(separator="\n", strip=True)
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    content = "\n".join(lines)

    if len(content) > max_length:
        content = content[:max_length] + "\n\n[Content truncated...]"

    return f"Content from {url}:\n\n{content}"


FETCH_TOOL_DEFINITION = {
    "type": "function",
    "function": {
        "name": "fetch_page",
        "description": "Fetch and read the content of a webpage.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The URL to fetch"}
            },
            "required": ["url"]
        }
    }
}

Tools Init (src/tools/__init__.py)

from src.tools.search import web_search, SEARCH_TOOL_DEFINITION
from src.tools.fetch import fetch_page, FETCH_TOOL_DEFINITION

TOOLS = [SEARCH_TOOL_DEFINITION, FETCH_TOOL_DEFINITION]

TOOL_EXECUTORS = {
    "web_search": web_search,
    "fetch_page": fetch_page,
}
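
A quick sketch of the dispatch pattern the agent will use: look up the executor by tool name and call it with keyword arguments (run from the project root; example.com is only a placeholder URL):

import asyncio
from src.tools import TOOL_EXECUTORS

async def main():
    # Dispatch by name, exactly as the agent's tool-calling loop will
    result = await TOOL_EXECUTORS["fetch_page"](url="https://example.com")
    print(result[:300])

asyncio.run(main())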

Agent Core (src/agent.py)

This is the heart of MiroMind: the tool-calling loop that orchestrates research.

import json
import uuid
import httpx
from typing import AsyncGenerator, Optional
from src.config import get_config
from src.models import (
    ResearchRequest, ResearchResponse, ResearchStatus, ResearchMode,
    StreamEvent
)
from src.tools import TOOLS, TOOL_EXECUTORS


SYSTEM_PROMPT = """You are a research agent with access to web search and webpage fetching tools.

Your task is to thoroughly research the user's query and provide a comprehensive, well-sourced answer.

Guidelines:
- Use web_search to find relevant information
- Use fetch_page to read detailed content from promising URLs
- Cite your sources with URLs
- Synthesize information from multiple sources when possible
- Be thorough but concise in your final answer
- If you cannot find reliable information, say so clearly"""


SYNTHESIS_PROMPT = """Maximum research iterations reached. Please provide a comprehensive answer based on all the research you have gathered so far.

Synthesize the information from your searches and page fetches into a complete, well-organized response. Include relevant sources and citations.

If you found partial information, clearly indicate what was found and what remains uncertain."""


class Agent:
    def __init__(self):
        self.config = get_config()
        self.client = httpx.AsyncClient(timeout=120.0)

    async def close(self):
        await self.client.aclose()

    async def research(self, request: ResearchRequest) -> ResearchResponse:
        """Execute a research query and return the final response."""
        research_id = str(uuid.uuid4())

        max_iterations = (
            self.config.agent.max_iterations_deep
            if request.mode == ResearchMode.DEEP
            else self.config.agent.max_iterations_quick
        )

        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": request.query}
        ]

        tool_calls_count = 0

        try:
            for iteration in range(max_iterations):
                response = await self._chat_completion(messages)
                assistant_message = response["choices"][0]["message"]
                messages.append(assistant_message)

                tool_calls = assistant_message.get("tool_calls")

                if not tool_calls:
                    return ResearchResponse(
                        id=research_id,
                        status=ResearchStatus.COMPLETED,
                        query=request.query,
                        answer=assistant_message.get("content", ""),
                        tool_calls_count=tool_calls_count
                    )

                for tool_call in tool_calls:
                    tool_calls_count += 1
                    result = await self._execute_tool(tool_call)
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call["id"],
                        "content": result
                    })

                messages = self._manage_context(messages)

            # Max iterations - synthesize findings
            synthesis = await self._synthesize_research(messages)
            answer = synthesis if synthesis else "Research reached maximum iterations."

            return ResearchResponse(
                id=research_id,
                status=ResearchStatus.COMPLETED,
                query=request.query,
                answer=answer,
                tool_calls_count=tool_calls_count
            )

        except Exception as e:
            return ResearchResponse(
                id=research_id,
                status=ResearchStatus.FAILED,
                query=request.query,
                error=str(e),
                tool_calls_count=tool_calls_count
            )

    async def research_stream(
        self, request: ResearchRequest
    ) -> AsyncGenerator[StreamEvent, None]:
        """Execute a research query with streaming events."""
        max_iterations = (
            self.config.agent.max_iterations_deep
            if request.mode == ResearchMode.DEEP
            else self.config.agent.max_iterations_quick
        )

        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": request.query}
        ]

        try:
            for iteration in range(max_iterations):
                response = await self._chat_completion(messages)
                assistant_message = response["choices"][0]["message"]
                messages.append(assistant_message)

                content = assistant_message.get("content")
                if content:
                    yield StreamEvent(event="thinking", data=content)

                tool_calls = assistant_message.get("tool_calls")

                if not tool_calls:
                    yield StreamEvent(event="answer", data=content or "")
                    return

                for tool_call in tool_calls:
                    yield StreamEvent(
                        event="tool_call",
                        data=json.dumps({
                            "name": tool_call["function"]["name"],
                            "arguments": tool_call["function"]["arguments"]
                        })
                    )

                    result = await self._execute_tool(tool_call)

                    yield StreamEvent(
                        event="tool_result",
                        data=result[:1000] + ("..." if len(result) > 1000 else "")
                    )

                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call["id"],
                        "content": result
                    })

                messages = self._manage_context(messages)

            # Max iterations - synthesize
            yield StreamEvent(event="thinking", data="Synthesizing research findings...")
            synthesis = await self._synthesize_research(messages)

            if synthesis:
                yield StreamEvent(event="answer", data=synthesis)
            else:
                yield StreamEvent(event="answer", data="Unable to synthesize findings.")

        except Exception as e:
            yield StreamEvent(event="error", data=str(e))

    async def _chat_completion(self, messages: list) -> dict:
        """Call LM Studio chat completion endpoint."""
        response = await self.client.post(
            f"{self.config.lm_studio.url}/chat/completions",
            json={
                "model": self.config.lm_studio.model,
                "messages": messages,
                "tools": TOOLS,
                "tool_choice": "auto"
            }
        )
        response.raise_for_status()
        return response.json()

    async def _chat_completion_no_tools(self, messages: list) -> dict:
        """Call LM Studio without tools for synthesis."""
        response = await self.client.post(
            f"{self.config.lm_studio.url}/chat/completions",
            json={
                "model": self.config.lm_studio.model,
                "messages": messages
            }
        )
        response.raise_for_status()
        return response.json()

    async def _synthesize_research(self, messages: list) -> str:
        """Make a final synthesis call when max iterations is reached."""
        synthesis_messages = messages + [
            {"role": "user", "content": SYNTHESIS_PROMPT}
        ]

        try:
            response = await self._chat_completion_no_tools(synthesis_messages)
            return response["choices"][0]["message"].get("content", "")
        except Exception:
            return ""

    async def _execute_tool(self, tool_call: dict) -> str:
        """Execute a tool call and return the result."""
        name = tool_call["function"]["name"]
        args_str = tool_call["function"]["arguments"]

        try:
            args = json.loads(args_str)
        except json.JSONDecodeError:
            return f"Error: Invalid JSON arguments for {name}"

        executor = TOOL_EXECUTORS.get(name)
        if not executor:
            return f"Error: Unknown tool {name}"

        try:
            return await executor(**args)
        except Exception as e:
            return f"Error executing {name}: {str(e)}"

    def _manage_context(self, messages: list) -> list:
        """Truncate older tool results to manage context window."""
        keep_recent = self.config.agent.context_keep_recent
        tool_indices = [i for i, m in enumerate(messages) if m.get("role") == "tool"]

        if len(tool_indices) <= keep_recent:
            return messages

        for i in tool_indices[:-keep_recent]:
            content = messages[i]["content"]
            if len(content) > 500:
                messages[i]["content"] = content[:500] + "\n[Truncated...]"

        return messages


_agent: Optional[Agent] = None


async def get_agent() -> Agent:
    global _agent
    if _agent is None:
        _agent = Agent()
    return _agent
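
A minimal sketch of driving the agent from a plain script, assuming LM Studio and SearXNG are both running (the query is only an example):

import asyncio
from src.config import load_config
from src.models import ResearchRequest, ResearchMode
from src.agent import Agent

async def main():
    load_config()
    agent = Agent()
    try:
        response = await agent.research(
            ResearchRequest(query="What is SearXNG?", mode=ResearchMode.QUICK)
        )
        print(response.answer)
    finally:
        # Close the shared HTTP client
        await agent.close()

asyncio.run(main())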

FastAPI Application (src/main.py)

from pathlib import Path
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from fastapi.responses import FileResponse
from sse_starlette.sse import EventSourceResponse
from src.config import load_config
from src.models import ResearchRequest, ResearchResponse
from src.agent import get_agent

load_config()

app = FastAPI(
    title="MiroMind",
    description="Research agent powered by MiroThinker 1.5",
    version="0.1.0"
)


@app.get("/health")
async def health():
    return {"status": "healthy"}


@app.post("/research", response_model=ResearchResponse)
async def research(request: ResearchRequest):
    agent = await get_agent()
    return await agent.research(request)


@app.post("/research/stream")
async def research_stream(request: ResearchRequest):
    agent = await get_agent()

    async def event_generator():
        async for event in agent.research_stream(request):
            yield {"event": event.event, "data": event.data}

    return EventSourceResponse(event_generator())


@app.get("/")
async def root():
    return FileResponse("static/index.html")


static_dir = Path("static")
if static_dir.exists():
    app.mount("/static", StaticFiles(directory="static"), name="static")
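
The /research/stream endpoint can also be consumed outside the browser. A minimal sketch using httpx, assuming the server is running on port 8000 (raw SSE frames are printed as-is):

import asyncio
import httpx

async def main():
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream(
            "POST",
            "http://localhost:8000/research/stream",
            json={"query": "What is SearXNG?", "mode": "quick"},
        ) as response:
            async for line in response.aiter_lines():
                if line:  # each SSE frame arrives as "event: ..." / "data: ..." lines
                    print(line)

asyncio.run(main())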

CLI (cli.py)

#!/usr/bin/env python3
import asyncio
import sys
from pathlib import Path
from typing import Optional
import typer
from rich.console import Console
from rich.markdown import Markdown
from rich.panel import Panel

sys.path.insert(0, str(Path(__file__).parent))

from src.config import load_config
from src.models import ResearchRequest, ResearchMode
from src.agent import Agent

app = typer.Typer(help="MiroMind - Research agent powered by MiroThinker 1.5")
console = Console()


@app.command()
def research(
    query: str = typer.Argument(..., help="The research query"),
    deep: bool = typer.Option(False, "--deep", "-d", help="Deep research mode"),
    output: Optional[Path] = typer.Option(None, "--output", "-o", help="Save to file"),
    quiet: bool = typer.Option(False, "--quiet", "-q", help="Only show final answer"),
):
    """Research a topic using MiroThinker."""
    load_config()
    mode = ResearchMode.DEEP if deep else ResearchMode.QUICK

    if not quiet:
        console.print(Panel(
            f"[bold]Query:[/bold] {query}\n[bold]Mode:[/bold] {mode.value}",
            title="MiroMind Research",
            border_style="blue"
        ))

    asyncio.run(_research_async(query, mode, output, quiet))


async def _research_async(query: str, mode: ResearchMode, output: Optional[Path], quiet: bool):
    agent = Agent()
    request = ResearchRequest(query=query, mode=mode)

    try:
        if quiet:
            response = await agent.research(request)
            if response.error:
                console.print(f"[red]Error: {response.error}[/red]")
                raise typer.Exit(1)
            console.print(response.answer)
        else:
            answer = ""
            async for event in agent.research_stream(request):
                if event.event == "thinking":
                    console.print(f"[dim]{event.data[:200]}...[/dim]")
                elif event.event == "tool_call":
                    console.print(f"[yellow]{event.data}[/yellow]")
                elif event.event == "tool_result":
                    console.print(f"[green]{event.data[:100]}...[/green]")
                elif event.event == "answer":
                    answer = event.data
                    console.print(Panel(Markdown(answer), title="Result", border_style="green"))
                elif event.event == "error":
                    console.print(f"[red]{event.data}[/red]")
                    raise typer.Exit(1)

            if output and answer:
                output.write_text(answer)
                console.print(f"\n[dim]Saved to {output}[/dim]")
    finally:
        await agent.close()


@app.command()
def serve(
    host: str = typer.Option("0.0.0.0", "--host", "-h"),
    port: int = typer.Option(8000, "--port", "-p"),
):
    """Start the MiroMind web server."""
    import uvicorn
    from src.main import app as fastapi_app

    console.print(Panel(
        f"Starting MiroMind server at http://{host}:{port}",
        title="MiroMind Server",
        border_style="blue"
    ))

    uvicorn.run(fastapi_app, host=host, port=port)


if __name__ == "__main__":
    app()

Web UI (static/index.html)

Create a simple dark-themed web interface (see the project's static/index.html for the full implementation).

Running MiroMind

1. Install Dependencies

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Start SearXNG

docker compose up -d

Verify it's running:

curl "http://localhost:8081/search?q=test&format=json"

3. Start LM Studio

  • Open LM Studio
  • Load MiroThinker 1.5
  • Start the server (should be on port 1234)

4. Run MiroMind

CLI:

python cli.py research "What are the latest developments in quantum computing?"
python cli.py research --deep "Compare vector databases for production use"

Web UI:

python cli.py serve
# Open http://localhost:8000

API:

curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"query": "Your question here", "mode": "quick"}'

Architecture

User Query
    │
    ▼
┌─────────────────────────────────────────────────────┐
│                    MiroMind Agent                    │
│                                                      │
│  1. Send query + tools to MiroThinker               │
│  2. MiroThinker decides: answer or use tool?        │
│  3. If tool: execute (search/fetch), return result  │
│  4. Repeat until answer or max iterations           │
│  5. If max iterations: synthesize all findings      │
│                                                      │
└─────────────────────────────────────────────────────┘
    │                           │
    ▼                           ▼
┌─────────────┐           ┌─────────────┐
│  LM Studio  │           │   SearXNG   │
│ MiroThinker │           │   (Docker)  │
│  Port 1234  │           │  Port 8081  │
└─────────────┘           └─────────────┘

Key Design Decisions

  1. SearXNG over API services: Self-hosted, no rate limits, aggregates multiple search engines
  2. Tool-calling loop: MiroThinker decides when to search vs answer, up to 200 iterations in deep mode
  3. Context management: Truncates older tool results to fit context window
  4. Synthesis on max iterations: Always produces a final answer, even when hitting limits
  5. Streaming: Real-time visibility into the research process

Extending MiroMind

Adding New Tools

  1. Create a new file in src/tools/
  2. Implement an async function that takes parameters and returns a string
  3. Create a tool definition dict following OpenAI's function calling schema
  4. Add to TOOLS and TOOL_EXECUTORS in src/tools/__init__.py

Example: adding a calculator tool:

# src/tools/calculator.py
async def calculate(expression: str) -> str:
    try:
        result = eval(expression)  # Unsafe: swap in a real expression parser for production
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {e}"

CALCULATOR_TOOL_DEFINITION = {
    "type": "function",
    "function": {
        "name": "calculate",
        "description": "Evaluate a mathematical expression",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "Math expression"}
            },
            "required": ["expression"]
        }
    }
}
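
Then register it in src/tools/__init__.py, extending the file shown earlier:

from src.tools.search import web_search, SEARCH_TOOL_DEFINITION
from src.tools.fetch import fetch_page, FETCH_TOOL_DEFINITION
from src.tools.calculator import calculate, CALCULATOR_TOOL_DEFINITION

TOOLS = [SEARCH_TOOL_DEFINITION, FETCH_TOOL_DEFINITION, CALCULATOR_TOOL_DEFINITION]

TOOL_EXECUTORS = {
    "web_search": web_search,
    "fetch_page": fetch_page,
    "calculate": calculate,
}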

Using Different Models

Change config.yaml:

lm_studio:
  url: "http://localhost:1234/v1"
  model: "your-model-name"

Any model that supports OpenAI-compatible tool calling should work.
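
If you are unsure whether a model emits OpenAI-style tool calls, a quick probe can tell you. This is a sketch: the echo tool exists only for the test, and "your-model-name" should match the model loaded in LM Studio.

import httpx

# A throwaway tool definition used purely to test tool-calling support
probe_tool = {
    "type": "function",
    "function": {
        "name": "echo",
        "description": "Echo the given text back to the user",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
}

response = httpx.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "your-model-name",
        "messages": [{"role": "user", "content": "Use the echo tool to say hi."}],
        "tools": [probe_tool],
    },
    timeout=120.0,
)
message = response.json()["choices"][0]["message"]
print(message.get("tool_calls"))  # a non-empty list means tool calling works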

License
