I am trying to load the pickle files using the latest version of pandas (on Python 3.6.3), but the following code fails with the error shown at the end of this message:
import csv
import pandas
path_pickle="100000.pickle.gz"
data=pandas.read_pickle(path_pickle, compression='gzip')
When I load another file, "allSRS.pickle.gz", it seems to work fine. I checked the md5 checksum of the 100000.pickle.gz file and it matches. I also tried loading the file using Python 2.7 and a different version of pandas, but I get the same error. Lastly, I tried opening the file in Java using Jython (specifically: http://www.jython.org/javadoc/org/python/core/package-summary.html). Could you tell me what I am doing wrong, or how I can load this file?
Thank you in advance
ERROR:
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/site-packages/pandas-0.21.0-py3.6-linux-x86_64.egg/pandas/io/pickle.py", line 113, in read_pickle
return try_read(path, encoding='latin1')
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/site-packages/pandas-0.21.0-py3.6-linux-x86_64.egg/pandas/io/pickle.py", line 108, in try_read
lambda f: pc.load(f, encoding=encoding, compat=True))
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/site-packages/pandas-0.21.0-py3.6-linux-x86_64.egg/pandas/io/pickle.py", line 84, in read_wrapper
return func(f)
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/site-packages/pandas-0.21.0-py3.6-linux-x86_64.egg/pandas/io/pickle.py", line 108, in <lambda>
lambda f: pc.load(f, encoding=encoding, compat=True))
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/site-packages/pandas-0.21.0-py3.6-linux-x86_64.egg/pandas/compat/pickle_compat.py", line 194, in load
return up.load()
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/pickle.py", line 1050, in load
dispatch[key[0]](self)
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/pickle.py", line 1347, in load_stack_global
self.append(self.find_class(module, name))
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/site-packages/pandas-0.21.0-py3.6-linux-x86_64.egg/pandas/compat/pickle_compat.py", line 117, in find_class
return super(Unpickler, self).find_class(module, name)
File "/apps/software/Python/3.6.3-foss-2015b/lib/python3.6/pickle.py", line 1388, in find_class
__import__(module, level=0)
ModuleNotFoundError: No module named 'pandas.core.arrays'
Created by Sipko van dam

Sipko
Thank you, your first comment solved the problem. I was under the illusion that I had the latest version of pandas, but that was not the case. pip update pandas did not take it to the latest version; I had to specify pandas==0.23.4. I did not expect that. My bad, sorry. Still, it is nice that you added it to the documentation. It might save others some time :)

In the quick start, I have added the exact commands for setting up the environments; hope this helps:
https://github.com/brianyiktaktsui/Skymap/blob/master/README.md#quick-start-10-mins
Can you try installing pandas v0.23.4?
Sorry about this.
On a side note, we are now looking into a more extensible storage solution. pandas pickle is the fastest and smallest by far. I ran into various problems with the HDF5, Feather, and NumPy array solutions.
We will probably build a NoSQL database for querying.
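
For anyone else hitting the ModuleNotFoundError above: the traceback shows pandas 0.21.0 being used, and the fix in this thread was to install pandas 0.23.4. A minimal sketch of a check to run before calling read_pickle is given below. The function names are hypothetical, and comparing only the major and minor version is an assumption based on this thread (0.21 failed, 0.23.4 worked), not a rule documented by pandas.

```python
import importlib


def version_tuple(version):
    """Turn a version string such as '0.23.4' into (0, 23)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_new_enough(installed, required="0.23.4"):
    """Compare major.minor only (assumption: the patch level does not matter here)."""
    return version_tuple(installed) >= version_tuple(required)


def installed_pandas_version():
    """Return pandas.__version__, or None if pandas cannot be imported."""
    try:
        return importlib.import_module("pandas").__version__
    except ImportError:
        return None


if __name__ == "__main__":
    found = installed_pandas_version()
    if found is None:
        print("pandas is not installed in this interpreter")
    elif is_new_enough(found):
        print("pandas", found, "is at least 0.23")
    else:
        # The command below is the one reported to work in this thread.
        print("pandas", found, "is older than 0.23; try: pip install pandas==0.23.4")
```

Running this with the same interpreter that loads the pickle matters, since a different Python installation may have a different pandas version.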