Current baseline was the temporal PARSeq-Small (`parseq`) setup (embed_dim=384, enc_depth=12, ~23.8M-parameter model family).
Two approaches are proposed:
- Approach 1: PARSeq-Tiny transfer learning
- Dimitri's temporal PARSeq modifications were reused from the existing pipeline, but the model was switched from PARSeq-Small to PARSeq-Tiny and retrained from the Hugging Face PARSeq-Tiny checkpoint.
- PARSeq-Tiny should be sufficient because the target vocabulary is only digits (0-9) plus control token(s), unlike broader OCR character sets.
- Unseen-number handling via digit-level recognition: supervision remained token-level (0-9 + EOS), not 100 jersey classes. The model learned digit identities and sequence order, so unseen combinations were compositional.
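To make the compositionality point concrete, here is a minimal sketch of a digit-only tokenizer in the PARSeq style (EOS at index 0, BOS/PAD appended after the charset). The names `CHARSET`, `encode`, and `decode` are illustrative assumptions, not the actual PARSeq codebase API:

```python
# Digit-only vocabulary: 10 digit tokens plus control tokens, instead of
# 100 jersey-number classes.
CHARSET = "0123456789"
EOS, BOS, PAD = "[E]", "[B]", "[P]"
ITOS = [EOS] + list(CHARSET) + [BOS, PAD]   # EOS first, PARSeq-style ordering
STOI = {s: i for i, s in enumerate(ITOS)}

def encode(label: str) -> list[int]:
    """Encode a jersey number as BOS + digit ids + EOS (the training target)."""
    return [STOI[BOS]] + [STOI[c] for c in label] + [STOI[EOS]]

def decode(ids: list[int]) -> str:
    """Greedy decode: collect digit tokens until the first EOS."""
    out = []
    for i in ids:
        tok = ITOS[i]
        if tok == EOS:
            break
        if tok in CHARSET:
            out.append(tok)
    return "".join(out)

# A number never seen during training, e.g. "87", needs no dedicated class:
# it is just the sequence [8, 7, EOS], composed from digits the model has
# seen individually.
print(encode("87"))              # [11, 9, 8, 0] under this index ordering
print(decode(encode("87")[1:]))  # "87"
```

Because supervision is over this 13-token space rather than 100 number classes, any two-digit combination is representable at inference time even if absent from the training set.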