Created
December 30, 2023 08:04
MistralAttention's CUDA Error
| python my_main.py --model mistralai/Mistral-7B-v0.1 --save out --masks_per_iter 100 | |
| torch 1.10.1 | |
| transformers 4.36.1 | |
| accelerate 0.25.0 | |
| # of gpus: 1 | |
| Namespace(model='mistralai/Mistral-7B-v0.1', seed=0, nsamples=14, sparsity_ratio=0.5, prune_frac=0.1, bsz=14, mlp_attn_ratio=1.0, prune_method='magnitude', cache_dir='llm_weights', use_variant=False, save='out', save_model=None, masks_per_iter=100, tol=0.02, sm_reg_weight='[1e2, 1e-4, 0]', sm_lr_factor='[100, 10, 1, 0.1]', sm_reg_type='l1', sm_lin_model_type='global', sm_bsz='[32, 64, 128]', sm_nepochs=50, wandb_project_name='Prune-No-Backward') | |
| wandb: Currently logged in as: skolawol (shapelyprune). Use `wandb login --relogin` to force relogin | |
| wandb: Tracking run with wandb version 0.16.1 | |
| wandb: Run data is saved locally in /home/skolawol/workspace/my_wanda/wandb/run-20231230_041046-fegweqcc | |
| wandb: Run `wandb offline` to turn off syncing. | |
| wandb: Syncing run nsamp=14_sp=0.5_pfrac=0.1_bsz=14_ma_ratio=1.0_mpi=100_Lin.regtype=l1_pmethod=magnitude_mlp_attn_ratio=1.0_Lin.regweight=100.0-0.0001-0_Lin.lr=100-10-1-0.1_Lin.bsz=32-64-128_Lin.nepochs=50_Lin.type=global_name=Prune-No-Backward | |
| wandb: ⭐️ View project at https://wandb.ai/shapelyprune/Prune-No-Backward | |
| wandb: 🚀 View run at https://wandb.ai/shapelyprune/Prune-No-Backward/runs/fegweqcc | |
| loading llm model mistralai/Mistral-7B-v0.1 | |
| Loading checkpoint shards: 100%|███████████████████████████████████████████| 2/2 [00:14<00:00, 7.41s/it] | |
| Some weights of MistralForCausalLM were not initialized from the model checkpoint at mistralai/Mistral-7B-v0.1 and are newly initialized: ['model.layers.18.self_attn.rotary_emb.inv_freq', 'model.layers.13.self_attn.rotary_emb.inv_freq', 'model.layers.6.self_attn.rotary_emb.inv_freq', 'model.layers.3.self_attn.rotary_emb.inv_freq', 'model.layers.22.self_attn.rotary_emb.inv_freq', 'model.layers.10.self_attn.rotary_emb.inv_freq', 'model.layers.28.self_attn.rotary_emb.inv_freq', 'model.layers.16.self_attn.rotary_emb.inv_freq', 'model.layers.0.self_attn.rotary_emb.inv_freq', 'model.layers.11.self_attn.rotary_emb.inv_freq', 'model.layers.29.self_attn.rotary_emb.inv_freq', 'model.layers.30.self_attn.rotary_emb.inv_freq', 'model.layers.21.self_attn.rotary_emb.inv_freq', 'model.layers.2.self_attn.rotary_emb.inv_freq', 'model.layers.4.self_attn.rotary_emb.inv_freq', 'model.layers.7.self_attn.rotary_emb.inv_freq', 'model.layers.14.self_attn.rotary_emb.inv_freq', 'model.layers.20.self_attn.rotary_emb.inv_freq', 'model.layers.12.self_attn.rotary_emb.inv_freq', 'model.layers.5.self_attn.rotary_emb.inv_freq', 'model.layers.8.self_attn.rotary_emb.inv_freq', 'model.layers.1.self_attn.rotary_emb.inv_freq', 'model.layers.17.self_attn.rotary_emb.inv_freq', 'model.layers.9.self_attn.rotary_emb.inv_freq', 'model.layers.31.self_attn.rotary_emb.inv_freq', 'model.layers.19.self_attn.rotary_emb.inv_freq', 'model.layers.26.self_attn.rotary_emb.inv_freq', 'model.layers.23.self_attn.rotary_emb.inv_freq', 'model.layers.27.self_attn.rotary_emb.inv_freq', 'model.layers.15.self_attn.rotary_emb.inv_freq', 'model.layers.25.self_attn.rotary_emb.inv_freq', 'model.layers.24.self_attn.rotary_emb.inv_freq'] | |
| You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. | |
| evaluating on wikitext2 | |
| nsamples 163 | |
| sample 0 | |
| sample 50 | |
| sample 100 | |
| sample 150 | |
| nsamples 128 | |
| sample 0 | |
| sample 50 | |
| sample 100 | |
| Gathering statistics for pruning | |
| nsamples 14 | |
| sample 0 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 0 PPL = 7.170496463775635 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 0 PPL = 6.812166690826416 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 1 PPL = 6.358439922332764 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 1 PPL = 7.749149799346924 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 2 PPL = 6.306626796722412 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 2 PPL = 8.290532112121582 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 3 PPL = 6.415926933288574 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 3 PPL = 14.613344192504883 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 4 PPL = 6.352543830871582 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 4 PPL = 8.251137733459473 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 5 PPL = 6.292605400085449 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 5 PPL = 7.797684192657471 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 6 PPL = 6.309180736541748 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 6 PPL = 7.813832759857178 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 7 PPL = 7.047357559204102 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 7 PPL = 6.724523067474365 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 8 PPL = 7.289963245391846 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 8 PPL = 7.090325832366943 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 9 PPL = 7.569590091705322 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 9 PPL = 6.6845245361328125 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 10 PPL = 6.966268062591553 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 10 PPL = 7.4620490074157715 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 11 PPL = 7.363272190093994 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 11 PPL = 6.755545139312744 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 12 PPL = 6.282954692840576 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 12 PPL = 7.682135581970215 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 13 PPL = 7.435348033905029 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 13 PPL = 7.827632904052734 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 14 PPL = 7.147010803222656 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 14 PPL = 6.789153099060059 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 15 PPL = 7.13185977935791 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 15 PPL = 6.772646903991699 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 16 PPL = 6.397947311401367 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 16 PPL = 7.833366394042969 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 17 PPL = 6.276535987854004 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 17 PPL = 7.884089946746826 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 18 PPL = 7.197674751281738 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 18 PPL = 6.597900390625 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 19 PPL = 7.330356597900391 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 19 PPL = 6.686861515045166 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 20 PPL = 6.290648937225342 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 20 PPL = 7.843414306640625 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 21 PPL = 6.8761725425720215 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 21 PPL = 7.712000370025635 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 22 PPL = 6.402265548706055 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 22 PPL = 7.8440470695495605 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 23 PPL = 7.224035739898682 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 23 PPL = 6.7227911949157715 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 24 PPL = 6.4420695304870605 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 24 PPL = 7.521695613861084 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 25 PPL = 6.274460792541504 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 25 PPL = 7.820735931396484 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 26 PPL = 6.4466729164123535 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 26 PPL = 7.354723930358887 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 27 PPL = 7.381236553192139 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 27 PPL = 6.672240257263184 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 28 PPL = 6.975872039794922 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 28 PPL = 7.308136463165283 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 29 PPL = 6.268823146820068 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 29 PPL = 7.9021100997924805 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 30 PPL = 6.270018100738525 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 30 PPL = 8.025629997253418 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 31 PPL = 6.234014511108398 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 31 PPL = 7.933075904846191 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 32 PPL = 7.042508125305176 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 32 PPL = 6.848394393920898 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 33 PPL = 6.2845778465271 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 33 PPL = 7.776439666748047 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 34 PPL = 7.355691909790039 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 34 PPL = 6.726131916046143 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 35 PPL = 7.127555847167969 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 35 PPL = 7.14017915725708 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 36 PPL = 7.26483678817749 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 36 PPL = 6.679401874542236 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 37 PPL = 7.134992599487305 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 37 PPL = 6.707190036773682 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 38 PPL = 7.502450466156006 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 38 PPL = 6.622519016265869 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 39 PPL = 6.360657691955566 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 39 PPL = 7.755874156951904 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 40 PPL = 7.178237438201904 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 40 PPL = 6.691136360168457 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 41 PPL = 6.235278129577637 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 41 PPL = 8.105535507202148 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 42 PPL = 6.289116382598877 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 42 PPL = 7.69081449508667 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 43 PPL = 6.388136863708496 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 43 PPL = 8.014190673828125 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 44 PPL = 7.44135856628418 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 44 PPL = 6.70253849029541 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 45 PPL = 7.3979387283325195 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 45 PPL = 6.733610153198242 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 46 PPL = 6.276342391967773 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 46 PPL = 7.7677507400512695 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 47 PPL = 7.16223669052124 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 47 PPL = 6.719339370727539 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 48 PPL = 6.5343122482299805 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 48 PPL = 7.516190052032471 | |
| nsamples 14 | |
| sample 0 | |
| [v1]Iter : 49 PPL = 6.595152378082275 | |
| nsamples 14 | |
| sample 0 | |
| [v2]Iter : 49 PPL = 7.790565013885498 | |
| /home/skolawol/miniconda3/envs/pruneenv_old/lib/python3.9/site-packages/plotly/matplotlylib/renderer.py:609: UserWarning: | |
| I found a path object that I don't think is part of a bar chart. Ignoring. | |
| Prune model | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [1,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [2,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [3,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [4,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [5,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [6,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [7,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [8,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [9,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [10,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [11,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [12,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [13,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [14,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [15,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [16,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [17,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [18,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [19,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [20,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [21,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [22,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [23,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [24,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [25,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [26,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [27,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [28,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [29,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [30,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [92,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed. | |
| Traceback (most recent call last): | |
| File "/home/skolawol/workspace/my_wanda/my_main.py", line 563, in <module> | |
| main() | |
| File "/home/skolawol/workspace/my_wanda/my_main.py", line 541, in main | |
| prune_model(args, model, mask_info, tokenizer) # Do some stuffs here :) | |
| File "/home/skolawol/workspace/my_wanda/my_main.py", line 454, in prune_model | |
| prune_attn(mask_, module) | |
| File "/home/skolawol/workspace/my_wanda/my_main.py", line 416, in prune_attn | |
| new_k_proj = (prune_linear_layer(module.k_proj, updated_indices)).half() | |
| File "/home/skolawol/miniconda3/envs/pruneenv_old/lib/python3.9/site-packages/transformers/pytorch_utils.py", line 69, in prune_linear_layer | |
| W = layer.weight.index_select(dim, index).clone().detach() | |
| RuntimeError: CUDA error: device-side assert triggered | |
| wandb: | |
| wandb: Run history: | |
| wandb: SysStats/pruneruntime ▁ | |
| wandb: SysStats/scoreruntime ▁ | |
| wandb: | |
| wandb: Run summary: | |
| wandb: SysStats/pruneruntime 0.05156 | |
| wandb: SysStats/scoreruntime 4195.80715 | |
| wandb: | |
| wandb: 🚀 View run nsamp=14_sp=0.5_pfrac=0.1_bsz=14_ma_ratio=1.0_mpi=100_Lin.regtype=l1_pmethod=magnitude_mlp_attn_ratio=1.0_Lin.regweight=100.0-0.0001-0_Lin.lr=100-10-1-0.1_Lin.bsz=32-64-128_Lin.nepochs=50_Lin.type=global_name=Prune-No-Backward at: https://wandb.ai/shapelyprune/Prune-No-Backward/runs/fegweqcc | |
| wandb: ️⚡ View job at https://wandb.ai/shapelyprune/Prune-No-Backward/jobs/QXJ0aWZhY3RDb2xsZWN0aW9uOjEyNjMwMTMyMA==/version_details/v2 |
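
The `srcIndex < srcSelectDimSize` assertion above means that `index_select` inside `prune_linear_layer` received at least one index larger than the dimension it was selecting from. The log does not show how `updated_indices` is built, so the following is only a guess at one plausible cause: Mistral-7B-v0.1 uses grouped-query attention (32 query heads, 8 key/value heads), so `k_proj` has 1024 output rows while `q_proj` has 4096, and row indices derived from query heads can overrun `k_proj`. The sketch below is plain Python with hypothetical helper names (`check_index_bounds`, `rows_for_heads`) and a made-up head selection; it only illustrates checking indices on the host so the failure surfaces as a readable `IndexError` instead of a device-side assert.

```python
# Sketch only: helper names and the chosen heads are hypothetical, not taken from my_main.py.

def check_index_bounds(indices, dim_size):
    """Raise IndexError with a readable message if any index is outside [0, dim_size)."""
    bad = [i for i in indices if not 0 <= i < dim_size]
    if bad:
        raise IndexError(
            f"{len(bad)} index value(s) out of range for dimension of size {dim_size}; "
            f"min={min(bad)}, max={max(bad)}"
        )
    return list(indices)


def rows_for_heads(kept_heads, head_dim):
    """Expand head ids into the weight-row indices those heads occupy."""
    rows = []
    for head in kept_heads:
        rows.extend(range(head * head_dim, (head + 1) * head_dim))
    return rows


# Values from the published Mistral-7B-v0.1 config.
HIDDEN_SIZE = 4096
NUM_ATTENTION_HEADS = 32
NUM_KEY_VALUE_HEADS = 8
HEAD_DIM = HIDDEN_SIZE // NUM_ATTENTION_HEADS    # 128

Q_PROJ_ROWS = NUM_ATTENTION_HEADS * HEAD_DIM     # rows in q_proj.weight
K_PROJ_ROWS = NUM_KEY_VALUE_HEADS * HEAD_DIM     # rows in k_proj.weight

# Assumption for illustration: the mask keeps query heads 0 and 20.
query_rows = rows_for_heads([0, 20], HEAD_DIM)

# The same indices are valid for q_proj ...
check_index_bounds(query_rows, Q_PROJ_ROWS)

# ... but head 20 lies beyond the last row of k_proj.
try:
    check_index_bounds(query_rows, K_PROJ_ROWS)
    k_proj_ok = True
except IndexError as err:
    k_proj_ok = False
    k_proj_error = str(err)
    print("k_proj:", k_proj_error)
```

With real tensors, the same check could be applied to `updated_indices.tolist()` and `module.k_proj.weight.size(0)` just before the `prune_linear_layer` call. Rerunning with the environment variable `CUDA_LAUNCH_BLOCKING=1`, or on CPU, is another common way to get a stack trace that points at the exact failing operation.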