@davidmezzetti
Created November 13, 2025 14:48
Image to parse

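from txtai.pipeline import Textractor

# Use the docling backend and send a browser-like user-agent when downloading the image
textractor = Textractor(backend="docling", headers={"user-agent": "Mozilla/5.0"})

# Extract the text from the image at this URL and print it
text = textractor("https://miro.medium.com/v2/resize:fit:720/format:webp/1*HHPVwIrcxYcLRvjDpwLQyQ.png")
print(text)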

Output

| Model | Parameters | MNLI (acc m/mm) | MRPC (fl/acc) | SST-2 (acc) |
|---|---|---|---|---|
| baseline (bert-tiny) | 4.4M | 0.7114 / 0.7161 | 0.8318 / 0.7353 | 0.8222 |
| bert-hash-femto | 0.243M | 0.5697 / 0.5750 | 0.8122 / 0.6838 | 0.7821 |
| bert hash-pico | 0.448M | 0.6228 / 0.6363 | 0.8205 / 0.7083 | 0.7878 |
| bert-hash-nano | 0.969M | 0.6565 / 0.6670 | 0.8172 / 0.7083 | 0.8131 |
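
Once the table has been extracted as text, it can be turned into records for further processing. The following is a minimal sketch, assuming the extracted text is a Markdown pipe table; `parse_markdown_table` is a hypothetical helper, not part of txtai, and the sample string is a shortened copy of the output above.

```python
def parse_markdown_table(text):
    """Parse a Markdown pipe table into a list of dicts keyed by header."""
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        # Ignore anything that is not a table row
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the separator row (cells made only of dashes and colons)
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Shortened sample of the extracted table (assumed Markdown layout)
sample = """
| Model | Parameters | SST-2 (acc) |
|---|---|---|
| baseline (bert-tiny) | 4.4M | 0.8222 |
| bert-hash-nano | 0.969M | 0.8131 |
"""

records = parse_markdown_table(sample)

# Pick the row with the highest SST-2 accuracy
best = max(records, key=lambda r: float(r["SST-2 (acc)"]))
print(best["Model"], best["Parameters"])
```

Cell values are kept as strings, so columns holding two scores, such as `0.7114 / 0.7161`, would need to be split before numeric comparison.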