I need help with this error.
Error message:
**RuntimeError: Given groups=1, weight of size [32, 4, 3, 3, 3], expected input[1, 1, 960, 240, 155] to have 4 channels, but got 1 channels instead**
```
Traceback (most recent call last):
File "FeTS_Challenge.py", line 581, in <module>
restore_from_checkpoint_folder = restore_from_checkpoint_folder)
File "/home/Challenge/Task_1/fets_challenge/experiment.py", line 459, in run_challenge_experiment
collaborators[col].run_simulation()
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/openfl/component/collaborator/collaborator.py", line 170, in run_simulation
self.do_task(task, round_number)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/openfl/component/collaborator/collaborator.py", line 262, in do_task
**kwargs)
File "/root/.local/workspace/src/fets_challenge_model.py", line 48, in validate
mode="validation")
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/compute/forward_pass.py", line 313, in validate_network
result = step(model, image, label, params, train=True)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/compute/step.py", line 77, in step
output = model(image)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/models/unet.py", line 224, in forward
x1 = self.ins(x)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/models/seg_modules/InitialConv.py", line 81, in forward
x = self.conv0(x)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 620, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 610, in _conv_forward
input, weight, bias, self.stride, self.padding, self.dilation, self.groups
RuntimeError: Given groups=1, weight of size [32, 4, 3, 3, 3], expected input[1, 1, 960, 240, 155] to have 4 channels, but got 1 channels instead
```
The input shape of the training data fed to the model is torch.Size([1, 960, 240, 155]).
Is that correct?
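One way to read the error message: the first convolution's weight has shape [32, 4, 3, 3, 3], so it expects 4 input channels, while the tensor it received has shape [1, 1, 960, 240, 155], i.e. a single channel. Since 960 = 4 × 240, one plausible reading is that the four modalities ended up stacked along the first spatial axis instead of the channel axis. The sketch below is plain Python (no PyTorch needed); the helper name `diagnose_channel_mismatch` is made up for illustration and is not part of GaNDLF or PyTorch. It only inspects shapes and applies a simple heuristic.

```python
def diagnose_channel_mismatch(weight_shape, input_shape, groups=1):
    """Compare a conv weight shape with an input shape and explain a mismatch.

    weight_shape: (out_channels, in_channels // groups, *kernel)
    input_shape:  (batch, channels, *spatial)
    """
    expected = weight_shape[1] * groups
    got = input_shape[1]
    if got == expected:
        return {"ok": True, "expected": expected, "got": got,
                "suggested_shape": None}

    suggested = None
    spatial = list(input_shape[2:])
    # Heuristic (an assumption, not a certainty): if there is a single channel
    # and a spatial axis is an exact multiple of the expected channel count,
    # the channels may have been concatenated along that axis.
    if got == 1:
        for axis, size in enumerate(spatial):
            if size % expected == 0:
                unstacked = spatial.copy()
                unstacked[axis] = size // expected
                suggested = (input_shape[0], expected, *unstacked)
                break
    return {"ok": False, "expected": expected, "got": got,
            "suggested_shape": suggested}


report = diagnose_channel_mismatch((32, 4, 3, 3, 3), (1, 1, 960, 240, 155))
print(report)
```

For the shapes in the error above, the heuristic points at [1, 4, 240, 240, 155] as the shape the model was probably meant to receive.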
Created by SeonYeong An (SeonYeongAN)
OK, awesome, very glad it worked!
I am amazed by your insight.
The code runs properly. Thank you.
I installed PyTorch version 1.8.2 for CUDA 11 and it works properly.
Yes, your dimension is correct. By the way, could you try a PyTorch installation built for CUDA 11 instead? It occurred to me that since this was back in 2022, it might not have CUDA 12 support.
Thank you for your reply.
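To see which CUDA version a PyTorch build was compiled against (as opposed to the driver-side CUDA version that `nvidia-smi` reports), something like the following can help. It relies on `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`; the function name `describe_torch_build` is only illustrative, and the import is guarded so the snippet still runs where PyTorch is not installed.

```python
def describe_torch_build():
    """Return the PyTorch version and the CUDA version it was built with."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None, "built_with_cuda": None, "cuda_available": False}
    return {
        "torch": torch.__version__,
        # None for CPU-only builds, otherwise a string such as "11.1".
        "built_with_cuda": torch.version.cuda,
        "cuda_available": bool(torch.cuda.is_available()),
    }


print(describe_torch_build())
```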
So the input dimension of your data is torch.Size([1, 4, 240, 240, 155])? Yeah, very odd. Thanks for the detailed info. I will need to reach out to some of the original devs to see if we can figure it out, so it might take a while to get back to you. Hang on tight, friend.
```
(/home/Challenge/Task_1/venv) root@asy:/home/Challenge/Task_1# nvidia-smi
Sun Jun 2 20:49:00 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 528.24 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | Off |
| 0% 37C P8 27W / 450W | 1572MiB / 24564MiB | 5% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... On | 00000000:03:00.0 On | Off |
| 0% 34C P8 19W / 450W | 473MiB / 24564MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 33 G /Xwayland N/A |
| 1 N/A N/A 33 G /Xwayland N/A |
+-----------------------------------------------------------------------------+
```
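The table above can also be reduced to "used / total" per GPU before an experiment starts. A minimal sketch, assuming `nvidia-smi` is on the PATH: it uses the `--query-gpu` and `--format=csv,noheader,nounits` options, and the helper names `parse_gpu_memory` and `query_gpu_memory` are illustrative. The sample string reuses the memory figures from the output above.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse lines such as '0, 1572, 24564' into a list of dicts (MiB)."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        rows.append({"gpu": int(index), "used_mib": int(used),
                     "total_mib": int(total)})
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return the parsed per-GPU memory usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


# Offline example using the values from the nvidia-smi table above.
sample = "0, 1572, 24564\n1, 473, 24564\n"
for row in parse_gpu_memory(sample):
    free = row["total_mib"] - row["used_mib"]
    print(f"GPU {row['gpu']}: {row['used_mib']} MiB used, {free} MiB free")
```

Calling `query_gpu_memory()` right before launching the experiment shows whether memory is already held by other processes.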
------------------------------------------GPU 0 is being used.-------------------------------
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | Off |
| 0% 37C P8 27W / 450W | 1572MiB / 24564MiB | 5% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... On | 00000000:03:00.0 On | Off |
| 0% 34C P8 19W / 450W | 473MiB / 24564MiB | 0% Default |
| | | N/A |
It seems 22.04 GB of GPU memory is occupied from previous experiments.
Can you check with the nvidia-smi command what the status is before starting the experiment?
Full error message:
```
********************
Starting validation :
********************
cuda
Using Automatic mixed precision
Looping over validation data: 0%| | 0/1 [00:00, ?it/s]== Current subject: ['FeTS2022_01434']
=== Current patch: 0 , time : 2024/06/02::15:44:16 , location : tensor([[ 0, 0, 0, 64, 64, 64]])
??? ?? image torch.Size([1, 4, 240, 240, 155])
=== Validation shapes : label: torch.Size([1, 240, 240, 155]) , image: torch.Size([1, 4, 240, 240, 155])
|===========================================================================|
| PyTorch CUDA memory summary, device ID 0 |
|---------------------------------------------------------------------------|
| CUDA OOMs: 0 | cudaMalloc retries: 0 |
|===========================================================================|
| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---------------------------------------------------------------------------|
| Allocated memory | 805 MB | 805 MB | 1587 MB | 782 MB |
| from large pool | 775 MB | 775 MB | 1522 MB | 747 MB |
| from small pool | 29 MB | 35 MB | 65 MB | 35 MB |
|---------------------------------------------------------------------------|
| Active memory | 805 MB | 805 MB | 1587 MB | 782 MB |
| from large pool | 775 MB | 775 MB | 1522 MB | 747 MB |
| from small pool | 29 MB | 35 MB | 65 MB | 35 MB |
|---------------------------------------------------------------------------|
| GPU reserved memory | 960 MB | 960 MB | 1220 MB | 266240 KB |
| from large pool | 922 MB | 922 MB | 1170 MB | 253952 KB |
| from small pool | 38 MB | 38 MB | 50 MB | 12288 KB |
|---------------------------------------------------------------------------|
| Non-releasable memory | 45979 KB | 64221 KB | 589218 KB | 543238 KB |
| from large pool | 43254 KB | 58624 KB | 515446 KB | 472192 KB |
| from small pool | 2725 KB | 8416 KB | 73771 KB | 71046 KB |
|---------------------------------------------------------------------------|
| Allocations | 835 | 1033 | 2943 | 2108 |
| from large pool | 72 | 84 | 156 | 84 |
| from small pool | 763 | 949 | 2787 | 2024 |
|---------------------------------------------------------------------------|
| Active allocs | 835 | 1033 | 2943 | 2108 |
| from large pool | 72 | 84 | 156 | 84 |
| from small pool | 763 | 949 | 2787 | 2024 |
|---------------------------------------------------------------------------|
| GPU reserved segments | 58 | 58 | 76 | 18 |
| from large pool | 39 | 39 | 51 | 12 |
| from small pool | 19 | 19 | 25 | 6 |
|---------------------------------------------------------------------------|
| Non-releasable allocs | 20 | 41 | 172 | 152 |
| from large pool | 14 | 17 | 47 | 33 |
| from small pool | 6 | 34 | 125 | 119 |
|---------------------------------------------------------------------------|
| Oversize allocations | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| Oversize GPU segments | 0 | 0 | 0 | 0 |
|===========================================================================|
|===========================================================================|
| CPU Utilization |
Load_Percent : 35.7
MemUtil_Percent: 6.2
|===========================================================================|
step.py image.shape torch.Size([1, 4, 240, 240, 155])
Memory Total : 24.0 GB, Allocated: 0.8 GB, Cached: 0.9 GB
??? ?? ?? torch.Size([1, 4, 240, 240, 155])
Memory Total : 24.0 GB, Allocated: 4.6 GB, Cached: 4.7 GB
Memory Total : 24.0 GB, Allocated: 5.5 GB, Cached: 5.6 GB
Memory Total : 24.0 GB, Allocated: 8.0 GB, Cached: 8.1 GB
??? ?? :prediction shape torch.Size([1, 4, 240, 240, 155])
=== Validation shapes : label: torch.Size([1, 240, 240, 155]) , image: torch.Size([1, 4, 240, 240, 155])
|===========================================================================|
| PyTorch CUDA memory summary, device ID 0 |
|---------------------------------------------------------------------------|
| CUDA OOMs: 0 | cudaMalloc retries: 0 |
|===========================================================================|
| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---------------------------------------------------------------------------|
| Allocated memory | 16660 MB | 17484 MB | 21949 MB | 5288 MB |
| from large pool | 16625 MB | 17448 MB | 21876 MB | 5251 MB |
| from small pool | 35 MB | 35 MB | 72 MB | 37 MB |
|---------------------------------------------------------------------------|
| Active memory | 16660 MB | 17484 MB | 21949 MB | 5288 MB |
| from large pool | 16625 MB | 17448 MB | 21876 MB | 5251 MB |
| from small pool | 35 MB | 35 MB | 72 MB | 37 MB |
|---------------------------------------------------------------------------|
| GPU reserved memory | 16770 MB | 17704 MB | 20972 MB | 4202 MB |
| from large pool | 16732 MB | 17666 MB | 20916 MB | 4184 MB |
| from small pool | 38 MB | 38 MB | 56 MB | 18 MB |
|---------------------------------------------------------------------------|
| Non-releasable memory | 74824 KB | 92262 KB | 811 MB | 756417 KB |
| from large pool | 72436 KB | 89873 KB | 734 MB | 679302 KB |
| from small pool | 2388 KB | 8416 KB | 77 MB | 77115 KB |
|---------------------------------------------------------------------------|
| Allocations | 1018 | 1042 | 3428 | 2410 |
| from large pool | 143 | 144 | 265 | 122 |
| from small pool | 875 | 949 | 3163 | 2288 |
|---------------------------------------------------------------------------|
| Active allocs | 1018 | 1042 | 3428 | 2410 |
| from large pool | 143 | 144 | 265 | 122 |
| from small pool | 875 | 949 | 3163 | 2288 |
|---------------------------------------------------------------------------|
| GPU reserved segments | 108 | 108 | 151 | 43 |
| from large pool | 89 | 89 | 123 | 34 |
| from small pool | 19 | 19 | 28 | 9 |
|---------------------------------------------------------------------------|
| Non-releasable allocs | 64 | 74 | 402 | 338 |
| from large pool | 40 | 41 | 106 | 66 |
| from small pool | 24 | 34 | 296 | 272 |
|---------------------------------------------------------------------------|
| Oversize allocations | 9 | 9 | 10 | 1 |
|---------------------------------------------------------------------------|
| Oversize GPU segments | 9 | 9 | 10 | 1 |
|===========================================================================|
|===========================================================================|
| CPU Utilization |
Load_Percent : 15.8
MemUtil_Percent: 7.4
|===========================================================================|
step.py image.shape torch.Size([1, 4, 240, 240, 155])
Memory Total : 24.0 GB, Allocated: 16.3 GB, Cached: 16.4 GB
??? ?? ?? torch.Size([1, 4, 240, 240, 155])
Memory Total : 24.0 GB, Allocated: 20.1 GB, Cached: 20.2 GB
Memory Total : 24.0 GB, Allocated: 21.0 GB, Cached: 21.1 GB
Looping over validation data: 0%| | 0/1 [00:07, ?it/s]
Traceback (most recent call last):
File "FeTS_Challenge.py", line 581, in <module>
restore_from_checkpoint_folder = restore_from_checkpoint_folder)
File "/home/Challenge/Task_1/fets_challenge/experiment.py", line 459, in run_challenge_experiment
collaborators[col].run_simulation()
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/openfl/component/collaborator/collaborator.py", line 170, in run_simulation
self.do_task(task, round_number)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/openfl/component/collaborator/collaborator.py", line 262, in do_task
**kwargs)
File "/root/.local/workspace/src/fets_challenge_model.py", line 48, in validate
mode="validation")
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/compute/forward_pass.py", line 352, in validate_network
result = step(model, image, label, params, train=True)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/compute/step.py", line 77, in step
output = model(image)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/models/unet.py", line 323, in forward
x = self.us_0(x)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/models/seg_modules/UpsamplingModule.py", line 51, in forward
x = self.conv0(self.interpolate(x))
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/models/seg_modules/Interpolate.py", line 24, in forward
return nn.functional.interpolate(x, **(self.interp_kwargs))
File "/home/Challenge/Task_1/venv/lib/python3.7/site-packages/torch/nn/functional.py", line 3953, in interpolate
return torch._C._nn.upsample_trilinear3d(input, output_size, align_corners, scale_factors)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.07 GiB (GPU 0; 23.99 GiB total capacity; 22.04 GiB already allocated; 0 bytes free; 22.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
Hi Linardos!
Input is torch.Size([1, 4, 240, 240, 155])
1: batch size
4: modalities (T1, T2, T1CE, T2Flair)
240 x 240 x 155 : (data shape)
When a tensor of shape torch.Size([1, 4, 240, 240, 155]) is input to the model,
the following error occurs:
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.07 GiB (GPU 0; 23.99 GiB total capacity; 22.04 GiB already allocated; 0 bytes free; 22.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
It seems to me that the patch code in forward_pass.py is not working properly.
What should be the dimensions of the input data before entering the model?
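As a rough back-of-the-envelope check on why a full 240 × 240 × 155 volume is heavy: the first layer has 32 output channels (from the weight shape [32, 4, 3, 3, 3] in the channel-mismatch error), so a single float32 feature map at full resolution is already a bit over 1 GiB, whereas the same feature map for a 64 × 64 × 64 patch (the extent shown in the `location` line of the validation log) is 32 MiB. The sketch is plain Python and counts one tensor only; it assumes 4 bytes per element (mixed precision would change that) and ignores everything else the network keeps in memory. `tensor_bytes` is an illustrative name.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


GIB = 2 ** 30

shapes = {
    "input, full volume": (1, 4, 240, 240, 155),
    "32-channel feature map, full volume": (1, 32, 240, 240, 155),
    "32-channel feature map, 64^3 patch": (1, 32, 64, 64, 64),
}

for name, shape in shapes.items():
    print(f"{name}: {tensor_bytes(shape) / GIB:.3f} GiB")
```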
Hi SeonYeong,
What is your input data? Are you using all four modalities (T1, T2, T1CE, T2Flair) or just one of them?
forward_pass.py
https://github.com/mlcommons/GaNDLF/blob/92a6c42024488150c572d20ec15815f8f02fcf75/GANDLF/compute/forward_pass.py#L292
Printing `image.shape` just above line 294 of `forward_pass.py` gives
`torch.Size([1, 960, 240, 155])`. Can this be seen as having patches applied?
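For comparison, a patch-based forward pass would normally hand the model tensors shaped like [batch, channels, patch_x, patch_y, patch_z], so a shape of [1, 960, 240, 155] does not look like a patch. The following is a simplified, non-overlapping sliding-window sketch in plain Python, not GaNDLF's actual sampler; the 64-voxel patch size is taken from the `location` line in the validation log, and `patch_origins` is an illustrative name.

```python
import itertools


def patch_origins(volume_shape, patch_shape):
    """Corner indices of non-overlapping patches that cover the volume.

    The last patch along an axis may overhang the volume and would need
    padding or clipping in a real pipeline.
    """
    steps = [range(0, size, patch)
             for size, patch in zip(volume_shape, patch_shape)]
    return list(itertools.product(*steps))


volume = (240, 240, 155)   # spatial size of one modality
patch = (64, 64, 64)       # assumed patch size
channels = 4               # T1, T2, T1CE, T2Flair

origins = patch_origins(volume, patch)
patch_tensor_shape = (1, channels, *patch)

print("number of patches:", len(origins))
print("first origin:", origins[0])
print("shape passed to the model per patch:", patch_tensor_shape)
```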