| Architecture | Model_I | Model_II | Model_III |
|---|---|---|---|
| Convolutional Vision Transformer | 91.54% | 99.41% | 99.04% |
| CrossFormer | 91.20% | 97.42% | 98.13% |
| LeViT | 91.64% | 97.12% | 97.97% |
| TwinsSVT | 91.08% | 97.44% | 98.48% |
| CCT | 89.71% | 69.68% | 99.48% |
| CrossViT | 84.20% | 91.33% | 81.29% |
| CaiT | 67.93% | 62.97% | 69.38% |
| T2TViT | 88.12% | 94.33% | 77.29% |
| PiT | 40.63% | 33.60% | 34.18% |
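
To make the table easier to compare at a glance, the short sketch below picks the top architecture per column and ranks architectures by their unweighted mean across the three columns. The numbers are copied from the table. The names `results`, `best_per_model`, and `ranking` are illustrative, and averaging across `Model_I` to `Model_III` is only a rough summary that assumes the three columns are equally important.

```python
# Accuracies (%) copied from the table: (Model_I, Model_II, Model_III).
results = {
    "Convolutional Vision Transformer": (91.54, 99.41, 99.04),
    "CrossFormer": (91.20, 97.42, 98.13),
    "LeViT": (91.64, 97.12, 97.97),
    "TwinsSVT": (91.08, 97.44, 98.48),
    "CCT": (89.71, 69.68, 99.48),
    "CrossViT": (84.20, 91.33, 81.29),
    "CaiT": (67.93, 62.97, 69.38),
    "T2TViT": (88.12, 94.33, 77.29),
    "PiT": (40.63, 33.60, 34.18),
}
columns = ("Model_I", "Model_II", "Model_III")

# Best architecture for each column.
best_per_model = {
    name: max(results, key=lambda arch: results[arch][i])
    for i, name in enumerate(columns)
}

# Unweighted mean over the three columns (an assumption: equal weighting).
mean_accuracy = {arch: sum(acc) / len(acc) for arch, acc in results.items()}
ranking = sorted(mean_accuracy, key=mean_accuracy.get, reverse=True)

for name in columns:
    print(f"{name}: best = {best_per_model[name]}")
for arch in ranking:
    print(f"{arch}: mean = {mean_accuracy[arch]:.2f}%")
```

No single architecture wins every column, which is why the per-column view is kept alongside the mean.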

    TWINSSVT_CONFIG = {
        "network_type": "TwinsSVT",
        "pretrained": False,
        "image_size": 224,
        "batch_size": 64,
        "num_epochs": 15,
        "optimizer_config": {
            "name": "AdamW",
            "weight_decay": 0.01,
            "lr": 0.001,


| Hyperparameter | Value |
|---|---|
| Batch size | 64 |
| Epochs | 15 |
| Image channels | 1 |
| Optimizer | AdamW |
| Weight decay | 0.01 |
| Learning rate schedule | Cosine schedule with warmup |
| Initial learning rate | 0.01 |
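
The "cosine schedule with warmup" entry can be illustrated with a small, dependency-free sketch. It is not the training code used here, only one common formulation: linear warmup from zero to the initial learning rate, then cosine decay toward a floor. The function name `cosine_lr_with_warmup`, the 100 steps per epoch, the 10% warmup fraction, and the floor of `0.0` are all assumptions. Only the 15 epochs and the initial learning rate of 0.01 come from the table.

```python
import math


def cosine_lr_with_warmup(step, total_steps, warmup_steps, base_lr=0.01, min_lr=0.0):
    """Learning rate at `step`: linear warmup, then cosine decay (illustrative)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Cosine decay from base_lr down to min_lr.
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# 15 epochs is from the table; 100 steps per epoch is an assumed value.
total_steps = 15 * 100
warmup_steps = 150  # assumed: 10% of training
schedule = [
    cosine_lr_with_warmup(s, total_steps, warmup_steps)
    for s in range(total_steps + 1)
]
```

In a real run the per-step value would normally come from a scheduler attached to the optimizer rather than a hand-written function. The sketch only shows the shape of the curve.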

    from typing import Optional
    from torchvision import transforms
    from PIL import Image
    import albumentations as A
    from albumentations.pytorch import ToTensorV2
    from torch.utils.data import DataLoader, Dataset

    def get_transform_train(
        upsample_size: int, final_size: int, channels: Optional[int] = 1