My inference code runs on the GPU on my local machine, but on the server it always runs on the CPU, which makes it really slow.
I have already called model = model.cuda(), and I move the input with x = x.cuda() before calling y = model(x). Can you please help me with this?
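In case it helps others diagnose the same symptom: a minimal sketch of the checks I would run on the server, assuming PyTorch is installed there. If torch.cuda.is_available() returns False, the server probably has a CPU-only PyTorch build or no visible GPU, and .cuda() / .to("cuda") cannot work. The Linear model below is just a stand-in for the real model.

```python
import torch

# Diagnostics: run these on the server to see why inference stays on CPU.
print(torch.__version__)               # a "+cpu" suffix means a CPU-only build
print(torch.cuda.is_available())       # False => no usable GPU from this process
print(torch.version.cuda)              # None on CPU-only builds

# Device-agnostic placement: put the model and the inputs on the SAME device,
# falling back to CPU only when no GPU is visible.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2)          # stand-in for the real model
model = model.to(device).eval()

x = torch.randn(1, 4).to(device)       # inputs must live on the same device
with torch.no_grad():
    y = model(x)
print(y.device)                        # reports cuda:0 when the GPU is used
```

If is_available() is False on the server, reinstalling a CUDA-enabled PyTorch build (and checking that CUDA_VISIBLE_DEVICES is not empty) is usually the fix.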