@seqis
Created February 21, 2026 07:38
# Agent Zero to Telegram Bridge: How-To (AI-Coder Focus)
This guide explains how to wire a Telegram bot to Agent Zero (`/api_message`) with:
- secure auth (no hardcoded LLM keys in the bridge)
- per-chat context continuity
- voice/audio relay to local Whisper
- optional extension pattern for non-audio file attachments
Everything below is based on the bridge implementation in:
- `/a0/usr/telegram/telegram_bridge.py`
- `/a0/usr/telegram/send_message.sh`
## 1) High-level architecture
```text
Telegram user
  -> python-telegram-bot bridge (long polling)
    -> POST http://<agent-zero-host>/api_message
       headers: X-API-KEY: <computed token>
       body: { message, context_id? }
    <- { response, context_id }
  <- Telegram reply
```
Voice/audio path:
```text
Telegram voice/audio attachment
  -> bot.get_file(file_id).download_as_bytearray()
  -> POST multipart to local Whisper: files={"audio": (filename, bytes)}
  <- { "text": "...transcript..." }
  -> send transcript into Agent Zero /api_message
```
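The request shape in the first diagram can be sketched with the standard library alone. This is a minimal illustration, not the bridge's actual code: the function names `build_api_message_request` and `send_api_message` are hypothetical, and the real bridge may use a different HTTP client.

```python
import json
import urllib.request
from typing import Optional


def build_api_message_request(
    base_url: str, api_key: str, message: str, context_id: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) the POST request for Agent Zero's /api_message."""
    payload = {"message": message}
    if context_id:
        # Only include context_id when continuing an existing conversation.
        payload["context_id"] = context_id
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api_message",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )


def send_api_message(req: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON reply ({response, context_id})."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Splitting "build" from "send" keeps the payload logic testable without a running Agent Zero instance.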
## 2) Security model (important)
The bridge should not call OpenAI (or other LLM providers) directly. Let Agent Zero handle model routing.
Do this:
1. Store bot token and other secrets in `/a0/usr/secrets.env`.
2. In `/a0/usr/telegram/telegram-agent0.env`, reference secrets as `SECRET(NAME)`.
3. Compute the Agent Zero API key at runtime from `/a0/usr/.env` (`A0_PERSISTENT_RUNTIME_ID`, `AUTH_LOGIN`, `AUTH_PASSWORD`):
   - `sha256(f"{runtime_id}:{username}:{password}")`
   - base64url-encode the digest and strip `=` padding
   - take the first 16 characters

This matches the bridge's behavior and avoids persisting the API key in plaintext.
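The token derivation above can be sketched as follows. The function name `compute_api_token` is hypothetical, and the sketch assumes the base64url step is applied to the raw SHA-256 digest bytes (not the hex string); verify that against your Agent Zero version before relying on it.

```python
import base64
import hashlib


def compute_api_token(runtime_id: str, username: str, password: str) -> str:
    """Derive the 16-character API token from the three .env values."""
    raw = f"{runtime_id}:{username}:{password}".encode("utf-8")
    # Assumption: the raw digest bytes are encoded, not the hex digest.
    digest = hashlib.sha256(raw).digest()
    encoded = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return encoded[:16]
```

Because the token is recomputed on every start, changing `AUTH_PASSWORD` rotates it without touching the bridge config.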
## 3) Config files
`/a0/usr/telegram/telegram-agent0.env` example:
```env
TELEGRAM_BOT_TOKEN="SECRET(TELEGRAM_BOT_TOKEN)"
TELEGRAM_OWNER_USER_ID="SECRET(TELEGRAM_OWNER_USER_ID)"
A0_API_URL="http://localhost:80"
WHISPER_URL="http://<your-whisper-host>:8765/transcribe"
```
`/a0/usr/secrets.env` example:
```env
TELEGRAM_BOT_TOKEN="<bot-token>"
TELEGRAM_OWNER_USER_ID="<numeric-user-id>"
# Optional:
# RESEND_API_KEY="..."
```
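One way to resolve the `SECRET(NAME)` placeholders when loading the two files is sketched below. `parse_env` and `resolve_secrets` are hypothetical helpers for illustration; the bridge's own loader may differ, and this parser only handles simple `KEY="value"` lines.

```python
import re

_SECRET_REF = re.compile(r"^SECRET\((\w+)\)$")


def parse_env(text: str) -> dict:
    """Parse simple KEY="value" lines; skips blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_secrets(config: dict, secrets: dict) -> dict:
    """Replace SECRET(NAME) placeholders with values from the secrets mapping."""
    resolved = {}
    for key, value in config.items():
        match = _SECRET_REF.match(value)
        if match is None:
            resolved[key] = value
            continue
        name = match.group(1)
        if name not in secrets:
            # Fail loudly at startup rather than running with a placeholder.
            raise KeyError(f"secret {name!r} referenced by {key} is not defined")
        resolved[key] = secrets[name]
    return resolved
```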
Notes:
- DM access is owner-only; group chats are open to group participants (in this implementation).
- If you want strict allow-list behavior for groups, add a group/user ACL gate in `owner_gate()`.
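A stricter gate could look like the sketch below. `is_allowed` is a hypothetical pure helper that `owner_gate()` might call; the allow-list parameters are assumptions, since the bridge's actual `owner_gate()` signature is not shown here. The chat type strings are the ones Telegram uses (`private`, `group`, `supergroup`, `channel`).

```python
from typing import AbstractSet


def is_allowed(
    chat_type: str,
    chat_id: int,
    user_id: int,
    owner_id: int,
    allowed_group_ids: AbstractSet[int] = frozenset(),
    allowed_user_ids: AbstractSet[int] = frozenset(),
) -> bool:
    """Return True if this sender may talk to the bot in this chat."""
    if chat_type == "private":
        # DMs stay owner-only, as in the current implementation.
        return user_id == owner_id
    if chat_type in ("group", "supergroup"):
        if chat_id not in allowed_group_ids:
            return False
        # An empty user allow-list means any member of an allowed group.
        if not allowed_user_ids:
            return True
        return user_id == owner_id or user_id in allowed_user_ids
    # Channels and unknown chat types are rejected.
    return False
```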
## 4) Bridge runtime behavior
### 4.1 Text messages
- Handler: `filters.TEXT & ~filters.COMMAND`
- For each chat, bridge loads `context_id` from `state.json` and sends it to `/api_message`.
- If Agent Zero returns new `context_id`, bridge saves it for continuity.
- On stale context 404, bridge retries once without `context_id`.
- Long replies are chunked to Telegram max length with markdown fallback.
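The context handling and chunking described above can be sketched as two small functions. The names are hypothetical, `call_api` stands in for whatever function performs the `/api_message` request, and `StaleContextError` is an assumed exception that the caller raises on a 404. The markdown fallback is not covered here. The 4096-character limit is Telegram's per-message text limit.

```python
from typing import Callable, Dict, List, Optional

TELEGRAM_MAX_LEN = 4096  # Telegram's per-message text limit


class StaleContextError(Exception):
    """Hypothetical: raised by call_api when Agent Zero answers 404."""


def ask_with_context(
    chat_id: int,
    message: str,
    contexts: Dict[str, str],
    call_api: Callable[[str, Optional[str]], dict],
) -> str:
    """Send a message with the chat's context_id; retry once without it on 404."""
    key = str(chat_id)
    context_id = contexts.get(key)
    try:
        result = call_api(message, context_id)
    except StaleContextError:
        if context_id is None:
            raise  # nothing to drop, so a retry would be identical
        contexts.pop(key, None)
        result = call_api(message, None)
    new_id = result.get("context_id")
    if new_id:
        contexts[key] = new_id  # persist for continuity
    return result.get("response", "")


def split_for_telegram(text: str, limit: int = TELEGRAM_MAX_LEN) -> List[str]:
    """Split text into chunks of at most `limit` chars, preferring newlines."""
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard cut
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```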
### 4.2 Voice/audio messages
- Handler: `filters.VOICE | filters.AUDIO`
- File downloaded to memory only (bytearray), not written to disk.
- Transcribe via local Whisper (`multipart/form-data`, field name: `audio`).
- Prefix strategy before sending to Agent Zero:
  - DM default: `[Voice]: <transcript>`
  - Group default: `[Name via voice]: <transcript>`
  - Caption present: caption overrides pending instruction
  - Pending voice instruction (TTL 300s) used if no caption
- Transcription failures and empty transcript have explicit user-visible errors.
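The prefix rules above might be implemented along these lines. `build_voice_prompt` is a hypothetical helper, and how an instruction (caption or pending text) is combined with the transcript is an assumption: this sketch puts it on its own line before the labelled transcript. The `pending` dict mirrors one `pending_voice` entry from the state file.

```python
import time
from typing import Optional


def build_voice_prompt(
    transcript: str,
    is_group: bool,
    sender_name: str,
    caption: Optional[str] = None,
    pending: Optional[dict] = None,
    now: Optional[float] = None,
) -> str:
    """Build the text sent to Agent Zero for a transcribed voice message."""
    now = time.time() if now is None else now
    instruction = None
    if caption and caption.strip():
        # A caption always wins over a pending instruction.
        instruction = caption.strip()
    elif pending and pending.get("expires_at", 0) > now:
        # Pending instruction is only used while its TTL has not expired.
        instruction = pending.get("text")
    label = f"[{sender_name} via voice]" if is_group else "[Voice]"
    line = f"{label}: {transcript.strip()}"
    # Assumption: the instruction precedes the labelled transcript.
    return f"{instruction}\n{line}" if instruction else line
```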
### 4.3 State file schema
`/a0/usr/telegram/state.json`:
```json
{
  "contexts": {
    "<chat_id>": "<agent_zero_context_id>"
  },
  "pending_voice": {
    "<chat_id>": {
      "text": "<instruction>",
      "expires_at": 1735689600.0
    }
  }
}
```
Use atomic writes (`tempfile + os.replace`) to avoid corruption.
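A minimal sketch of the atomic-write pattern, using hypothetical `save_state` and `load_state` helpers. The temp file is created in the same directory as the target so that `os.replace` stays on one filesystem, which is what makes the swap atomic.

```python
import json
import os
import tempfile


def save_state(path: str, state: dict) -> None:
    """Write state as JSON to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # make sure bytes hit disk before the swap
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_state(path: str) -> dict:
    """Load state, returning empty sections if the file does not exist yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except FileNotFoundError:
        data = {}
    data.setdefault("contexts", {})
    data.setdefault("pending_voice", {})
    return data
```

A reader that opens `state.json` during a write sees either the old file or the new one, never a partial file.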
## 5) Attachment handling status
Current bridge capability:
- Supported: text, voice notes, audio files
- Not supported by default: photos/documents/video relay into Agent Zero content
If someone asks about “file attachments,” be explicit: this bridge currently supports only voice/audio attachments; other file types require an extension.
## 6) Extension pattern for photos/documents
If you want generic file attachments, add handlers for `filters.PHOTO | filters.Document.ALL` and convert files into text the LLM can consume.
Recommended pattern:
1. Download file bytes in memory.
2. Enforce max size + MIME allow-list.
3. Route by MIME:
   - `image/*`: OCR or vision endpoint -> text summary
   - `application/pdf`: OCR/text extraction -> text summary
   - plain text/code files: decode and truncate
4. Build an Agent Zero payload string:
   - metadata (`filename`, `mime`, `size`)
   - extracted text/summary
   - optional user caption/instruction
5. Send through existing `/api_message` flow so memory/tools still work.
Minimal skeleton:
```python
from telegram.ext import MessageHandler, filters


async def handle_file_message(update, context):
    msg = update.effective_message
    # Documents carry file_name/mime_type; photos are a list of sizes (largest last).
    doc = msg.document or (msg.photo[-1] if msg.photo else None)
    if not doc:
        return
    tg_file = await context.bot.get_file(doc.file_id)
    data = await tg_file.download_as_bytearray()  # kept in memory only
    # TODO: validate size/mime, extract text, summarize
    # `or` also covers attributes that exist but are None.
    mime = getattr(doc, "mime_type", None) or "application/octet-stream"
    extracted = extract_to_text(data, mime_type=mime)
    agent_text = (
        f"[Attachment: {getattr(doc, 'file_name', None) or 'photo'}"
        f", mime={getattr(doc, 'mime_type', None) or 'unknown'}]\n{extracted}"
    )
    result = await call_agent_zero(agent_text, get_context_id(msg.chat_id))
    # persist context + reply (reuse existing helpers)


# Register after the handler is defined; `app` is the bridge's existing Application.
app.add_handler(MessageHandler(filters.PHOTO | filters.Document.ALL, handle_file_message))
```
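The skeleton calls `extract_to_text`, which this guide does not define. One possible shape for it, covering steps 2 and 3 of the pattern, is sketched below. The size limit, character limit, allow-list, and `RejectedAttachment` exception are all illustrative assumptions. Image and PDF extraction are left as a placeholder because they depend on whichever OCR or vision service you choose.

```python
MAX_BYTES = 5 * 1024 * 1024  # hypothetical upload limit
MAX_CHARS = 8000  # hypothetical cap on text passed to the agent
ALLOWED_PREFIXES = ("text/", "image/")
ALLOWED_TYPES = {"application/pdf", "application/json"}


class RejectedAttachment(ValueError):
    """Raised when a file fails the size or MIME checks."""


def extract_to_text(data: bytes, mime_type: str) -> str:
    """Validate an attachment and turn it into text the LLM can consume."""
    if len(data) > MAX_BYTES:
        raise RejectedAttachment(f"file too large: {len(data)} bytes")
    if not (mime_type.startswith(ALLOWED_PREFIXES) or mime_type in ALLOWED_TYPES):
        raise RejectedAttachment(f"MIME type not allowed: {mime_type}")

    if mime_type.startswith("text/") or mime_type == "application/json":
        # Plain text/code: decode leniently and truncate.
        text = bytes(data).decode("utf-8", errors="replace")
        if len(text) > MAX_CHARS:
            return text[:MAX_CHARS] + "\n[truncated]"
        return text

    # image/* and application/pdf need an OCR or vision backend,
    # which is outside the scope of this sketch.
    return f"[no extractor configured for {mime_type}; {len(data)} bytes received]"
```

Rejecting by MIME before decoding keeps unexpected binary formats away from both the extractor and the agent.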
## 7) Reliability and ops
Use supervised process restart (not ad-hoc shell sessions):
- `supervisorctl restart telegram_bridge`
- Logs: `/a0/usr/telegram/bridge.log`
Health checks to run after deploy:
1. `docker exec agent-zero supervisorctl status telegram_bridge`
2. Send DM text -> verify Agent Zero response.
3. Send voice note -> verify Whisper transcription + Agent Zero response.
4. Send long message -> verify split reply behavior.
5. Run tests:
   - `python3 -m pytest /a0/usr/telegram/tests -v`
## 8) Practical hardening checklist
- Never commit real secrets (`secrets.env` should stay private).
- Keep the API and Whisper timeouts finite (the bridge uses 120 s and 60 s, respectively).
- Apply retry only for transient upstream errors (already done for 404/502/503 + connect errors).
- Sanitize logs: do not log raw tokens or secret values.
- Keep all customization under `/a0/usr/` so upgrades don’t wipe it.
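The "retry only transient errors" item can be sketched as a small wrapper. `UpstreamError` and `call_with_retry` are hypothetical names, and the attempt count and backoff values are illustrative. The 404 case is left out here because section 4.1 handles it separately, by retrying once without `context_id`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")
RETRYABLE_STATUS = {502, 503}


class UpstreamError(Exception):
    """Hypothetical: raised by the HTTP layer for non-2xx responses."""

    def __init__(self, status: int):
        super().__init__(f"upstream returned HTTP {status}")
        self.status = status


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying with exponential backoff on transient failures only."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except UpstreamError as exc:
            # Non-transient statuses (e.g. 400, 401) are re-raised immediately.
            if exc.status not in RETRYABLE_STATUS or attempt == attempts:
                raise
        except ConnectionError:
            if attempt == attempts:
                raise
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```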