Good evening, I am having issues with changing the Spark environment in my code workbook so I can include more machine learning packages for developing my model. I can currently only use the default environment without the Spark session loader timing out. I submitted a ticket for review but was wondering if there is anything else I can try in the meantime since I cannot currently train my models and I haven't received a response yet from the triage team. Thank you.
Created by Sarah Pungitore uastudent
Hi @neelay and @uastudent,
Unfortunately, GPU resources are limited, which is why there have been significant delays in loading the GPU environment. Palantir is working on a solution, but that may take time. In the meantime, there are two new environments you can use that have been given additional CPU and memory resources.
For anything that uses significant memory outside of Spark (e.g. pandas, native R, or model training in driver memory), you can use `profile-high-driver-cores-and-memory` from the environment options.
For anything that can be distributed (e.g. initial filtering or data processing of large dataframes in Spark, whether in SQL, PySpark, or SparkR), you can use `profile-high-executor-cores-and-memory` from the environment options.
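As a rough illustration of the rule of thumb above, here is a minimal sketch. The two profile names come from this thread; the helper `suggest_profile` and its `runs_in_spark` flag are hypothetical and not part of any platform API.

```python
# Profile names are taken from this thread; the helper itself is hypothetical.
DRIVER_PROFILE = "profile-high-driver-cores-and-memory"
EXECUTOR_PROFILE = "profile-high-executor-cores-and-memory"


def suggest_profile(runs_in_spark: bool) -> str:
    """Return the profile suggested above for a given kind of workload."""
    # Distributed Spark work benefits from larger executors;
    # everything else runs in (and is limited by) the driver.
    return EXECUTOR_PROFILE if runs_in_spark else DRIVER_PROFILE


print(suggest_profile(runs_in_spark=False))  # pandas, native R, in-driver training
print(suggest_profile(runs_in_spark=True))   # Spark SQL, PySpark, SparkR
```

The only question that matters is whether the heavy step runs inside Spark or in the driver process.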
Let me know if you have any issues with these. Apologies for the issues with the GPU environment; this has been a platform-wide issue that is being actively investigated.
Thanks,
Tim
I have to second this: it is, unfortunately, impractical to use PyTorch within the Enclave. We cannot train our models quickly without access to the GPU profile. When we try to make a custom profile and run PyTorch on the CPU only, the Spark environment disconnects after some time (and CPU training would have been too slow anyway). This leaves us with a significant bottleneck, since we cannot train deep models, which ultimately prevents us from developing our final solution. One might wonder what negative impact this will have on the ultimate results of the competition. Any solution to this would be great. Thanks!
Hi Tim, I have been trying to load the PyTorch package. I have also tried loading the pre-built GPU environment that has this package, but I still have issues with loading the Spark environment. Thank you!
Hi @uastudent,
What packages are you adding to the default environment? You could try looking at other pre-built environments, like the `RP-2AE058-ML-Resources` environment, to see if one of them has the packages you need pre-loaded.
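One quick way to check what a given environment already provides is to run something like the sketch below in a Python node after switching to it. It uses only the standard library; the helper `missing_packages` and the package list are hypothetical placeholders, so substitute the import names your model actually needs.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    # find_spec locates a module without importing it, so this check is cheap.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical wish list; replace with the packages you actually need.
wanted = ["torch", "sklearn", "xgboost"]
print("Not available here:", missing_packages(wanted))
```

An empty list means the environment already has everything on the list, so no custom packages need to be added.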
Thanks!
Tim