MLcube Compatibility test - out of shared memory

Hi, I am trying to run a compatibility test using this command:

    mlcube run --task infer data_path= output_path= --gpus 1

But I am getting the error:

    /usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:146: UserWarning:
    NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
    The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
    If you want to use the NVIDIA RTX A5000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
      warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
    ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
    ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
    0%| | 0/5 [00:06 app()
    File "/mlcube_project/mlcube.py", line 80, in infer
      run_test(data_path, output_path, parameters, checkpoint_dir)
    File "/mlcube_project/run_testing_phase.py", line 38, in run_test
      run_inference(
    File "/mlcube_project/testing_phase_get_predictions.py", line 117, in run_inference
      inputs = sample["image"].cuda() if cuda else sample["image"]
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
      _error_if_any_worker_fails()
    RuntimeError: DataLoader worker (pid 88) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.

I am using the docker image nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04 with torch==1.12.1, torchvision==0.13.1 and torchaudio==0.12.1, on CUDA Version 11.7, on an NVIDIA RTX A5000.

Any help would be appreciated. Thanks

Created by Anees Hashmi aneeshashmi
@aneeshashmi, thank you for providing the traceback. Looking through the errors, I see:

```text
RuntimeError: DataLoader worker (pid 88) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.
```

One possible solution would be to give the container more shared memory by adding `gpu_args` to your mlcube.yaml file:

```yaml
docker:
  image: docker.synapse.org/.../...
  ...
  gpu_args: --shm-size=2g
```

Hope this helps!

EDIT: add more details
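To confirm that the larger limit actually reaches the container, one option is to check the size of `/dev/shm` from inside it, for example at the start of the inference script. The snippet below is only a minimal sketch using the Python standard library; `shm_report` is a hypothetical helper name, not part of MLCube or PyTorch, and it assumes the container mounts shared memory at `/dev/shm` (the usual location on Linux).

```python
import shutil


def shm_report(path="/dev/shm"):
    """Return (total_mib, free_mib) for the mount at `path`, or None if it is missing.

    Hypothetical helper for illustration; `path` defaults to the usual
    Linux shared-memory mount point.
    """
    try:
        usage = shutil.disk_usage(path)
    except OSError:
        # The path does not exist or cannot be queried.
        return None
    mib = 1024 * 1024
    return usage.total / mib, usage.free / mib


if __name__ == "__main__":
    report = shm_report()
    if report is None:
        print("/dev/shm not found")
    else:
        total, free = report
        print(f"/dev/shm: total={total:.0f} MiB, free={free:.0f} MiB")
```

If the reported total is still around 64 MiB (Docker's default shared memory size), the `--shm-size` argument was probably not picked up by the run.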
