# OpenADMET + ExpansionRx Blind Challenge
This is a short description of our latest submission to the OpenADMET + ExpansionRx Blind Challenge.
This submission was made on behalf of the UCT Prague cheminformatics group. The people who contributed to the UCT team's submissions are (alphabetically):
- Joanna Ceklarz
- Ivan Čmelo
- Wim Dehaen
- Valeriia Fil
- Jozef Fülöp
- Lukáš Kerti
- Martin Šícho
- Hunzallah Usmani
## Short description
The approach is based on an ensemble of TabPFN regression models combined with a stacked meta-learner (Ridge).
Each ADMET endpoint is modeled independently, with automatic per-endpoint selection of the best target transform (e.g., linear, log, Box-Cox, Yeo-Johnson, asinh, quantile).
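The exact selection criterion and model wrapper are not spelled out above, so the following is a minimal sketch under assumptions: the transform is chosen by cross-validated MAE on the back-transformed scale (only a subset of the listed transforms is shown), and a Ridge meta-learner is stacked on out-of-fold predictions from TabPFN base models trained on different feature views. Function names such as `select_transform` and `fit_stacked` are illustrative, not the submission code.

```python
# Minimal sketch (assumed setup): per-endpoint target-transform selection and
# Ridge stacking over TabPFN base models. Requires `tabpfn` and scikit-learn.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_predict
from tabpfn import TabPFNRegressor

# a subset of the transforms mentioned in the text: (forward, inverse) pairs
TRANSFORMS = {
    "linear": (lambda y: y, lambda y: y),
    "log": (np.log1p, np.expm1),
    "asinh": (np.arcsinh, np.sinh),
}

def select_transform(X, y, n_splits=5):
    """Pick the transform whose back-transformed CV predictions give the lowest MAE."""
    best_name, best_mae = None, np.inf
    cv = KFold(n_splits, shuffle=True, random_state=0)
    for name, (fwd, inv) in TRANSFORMS.items():
        preds = cross_val_predict(TabPFNRegressor(), X, fwd(y), cv=cv)
        mae = mean_absolute_error(y, inv(preds))
        if mae < best_mae:
            best_name, best_mae = name, mae
    return best_name

def fit_stacked(X_views, y):
    """Fit one TabPFN per feature view, then a Ridge meta-learner on OOF predictions."""
    oof = np.column_stack([
        cross_val_predict(TabPFNRegressor(), X, y, cv=5) for X in X_views
    ])
    base_models = [TabPFNRegressor().fit(X, y) for X in X_views]
    meta = Ridge(alpha=1.0).fit(oof, y)
    return base_models, meta
```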
**Note:** for one endpoint, HLM CLint, an entirely different parallel approach was taken, and its column was spliced into the final submission. This endpoint was modeled with an ensemble of MLPs and XGBoost built on classic features such as Morgan fingerprints, MACCS keys, and RDKit descriptors.
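As a rough illustration of that parallel pipeline, the sketch below (illustrative hyperparameters and function names, not the submission code) trains an MLP and an XGBoost regressor on concatenated Morgan / MACCS / RDKit-descriptor features and averages their predictions.

```python
# Minimal sketch (assumed setup) of the HLM CLint ensemble.
import numpy as np
from rdkit import Chem
from rdkit.Chem import Descriptors, MACCSkeys
from rdkit.Chem.rdFingerprintGenerator import GetMorganGenerator
from sklearn.neural_network import MLPRegressor
from xgboost import XGBRegressor

def featurize(smiles_list):
    """Concatenate Morgan bits, MACCS keys, and RDKit descriptors per molecule."""
    gen = GetMorganGenerator(radius=2, fpSize=2048)
    rows = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        morgan = gen.GetFingerprintAsNumPy(mol).astype(float)
        maccs = np.array(list(MACCSkeys.GenMACCSKeys(mol)), dtype=float)
        rdkit_desc = np.array([fn(mol) for _, fn in Descriptors.descList], dtype=float)
        rows.append(np.concatenate([morgan, maccs, rdkit_desc]))
    return np.nan_to_num(np.vstack(rows))

def fit_hlm_ensemble(X, y):
    """Train both base models; predictions are averaged at inference time."""
    models = [
        MLPRegressor(hidden_layer_sizes=(512, 128), max_iter=500, random_state=0),
        XGBRegressor(n_estimators=500, learning_rate=0.05, random_state=0),
    ]
    return [m.fit(X, y) for m in models]

def predict_hlm_ensemble(models, X):
    return np.mean([m.predict(X) for m in models], axis=0)
```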
We use a diverse feature set (a sketch of the feature computation follows the list):
- MOE descriptors
- CheMeleon features
- MORDRED descriptors
- RDKit descriptors
- RDKit Morgan fingerprints (chiral-aware)
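Two of these feature sets are straightforward to reproduce with open tools; the sketch below (assumed parameters, not the submission code) computes chiral-aware Morgan fingerprints with RDKit and descriptors with Mordred. MOE and CheMeleon features are produced by their own tools and are not shown here.

```python
# Minimal sketch (assumed setup) of two of the open-source feature sets.
import numpy as np
import pandas as pd
from rdkit import Chem
from rdkit.Chem.rdFingerprintGenerator import GetMorganGenerator
from mordred import Calculator, descriptors

def chiral_morgan(smiles_list, radius=2, n_bits=2048):
    # includeChirality=True makes the fingerprint sensitive to stereocenters
    gen = GetMorganGenerator(radius=radius, fpSize=n_bits, includeChirality=True)
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    return np.array([gen.GetFingerprintAsNumPy(m) for m in mols], dtype=float)

def mordred_features(smiles_list):
    calc = Calculator(descriptors, ignore_3D=True)
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    df = calc.pandas(mols)
    # descriptor errors come back as non-numeric objects: coerce to NaN, zero-fill
    return df.apply(pd.to_numeric, errors="coerce").fillna(0.0).to_numpy()
```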
Cross-validation folds are derived from Butina clustering of the training set, so that structurally similar molecules stay within the same fold and scaffold leakage is reduced.
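A minimal sketch of how such splits can be produced with RDKit's Butina implementation follows; the clustering cutoff and fingerprint settings are assumptions, not the values used in the submission. Whole clusters are assigned to folds, so near-duplicates never straddle a fold boundary.

```python
# Minimal sketch (assumed parameters): Butina-cluster-based fold assignment.
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem.rdFingerprintGenerator import GetMorganGenerator
from rdkit.ML.Cluster import Butina

def butina_folds(smiles_list, n_folds=5, cutoff=0.4):
    gen = GetMorganGenerator(radius=2, fpSize=2048)
    fps = [gen.GetFingerprint(Chem.MolFromSmiles(s)) for s in smiles_list]
    # lower-triangle distance list (1 - Tanimoto), as expected by Butina.ClusterData
    dists = []
    for i in range(1, len(fps)):
        sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
        dists.extend(1.0 - s for s in sims)
    clusters = Butina.ClusterData(dists, len(fps), cutoff, isDistData=True)
    # greedily assign the largest clusters to the currently smallest fold
    folds = np.zeros(len(fps), dtype=int)
    fold_sizes = np.zeros(n_folds, dtype=int)
    for cluster in sorted(clusters, key=len, reverse=True):
        f = int(np.argmin(fold_sizes))
        folds[list(cluster)] = f
        fold_sizes[f] += len(cluster)
    return folds
```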
After prediction, we apply the following post-processing steps (see the sketch after the list):
- Post-prediction calibration (linear or isotonic)
- Prediction clipping to training ranges
- Optional multitask residual correction across endpoints
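A minimal sketch of the calibration and clipping steps, assuming the calibrator is fitted on out-of-fold predictions; function names are illustrative and the residual-correction step is omitted.

```python
# Minimal sketch (assumed setup): post-prediction calibration and range clipping.
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LinearRegression

def fit_calibrator(oof_pred, y_true, kind="isotonic"):
    """Fit a monotone (isotonic) or linear map from OOF predictions to targets."""
    if kind == "isotonic":
        return IsotonicRegression(out_of_bounds="clip").fit(oof_pred, y_true)
    return LinearRegression().fit(oof_pred.reshape(-1, 1), y_true)

def calibrate_and_clip(calibrator, preds, y_train):
    if isinstance(calibrator, IsotonicRegression):
        calibrated = calibrator.predict(preds)
    else:
        calibrated = calibrator.predict(preds.reshape(-1, 1))
    # clip to the range observed in training to avoid extreme extrapolation
    return np.clip(calibrated, y_train.min(), y_train.max())
```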
## Performance notes
- Extensive feature ensembling and transform search improved robustness across endpoints.
- TabPFN itself was not fine-tuned; improvements come from target transforms, calibration, ensembling, and residual correction.
- Internal out-of-fold (OOF) performance is consistently predictive of leaderboard ranking, though the absolute metric values are overly optimistic.

When the challenge ends, the code and a more detailed description will be made public at:
https://github.com/lich-uct/openADMET-challenge