I'm getting the error below in my submission and I'm not sure how to solve it.
This is the relevant part of the code:
```
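import numpy as np
import pandas as pd
import pysam
from multiprocessing import Pool

# my_snp_list, validation_data, vcf_dir, ncpus and sub_tasks are defined earlier in the script.
def fill_feature_mat(rows):
    # One row per sample, one column per SNP id; 1 marks a SNP found in that sample's VCF.
    snps_featurs = pd.DataFrame(np.zeros([len(rows), len(my_snp_list)]),
                                index=rows, columns=my_snp_list)
    for index in rows:
        vcf_file = validation_data.loc[index, 'WES_mutationFileMutect']
        if pd.isnull(vcf_file):
            continue  # no VCF for this sample, leave its row as zeros
        vcf_file = vcf_dir + vcf_file
        f = pysam.VariantFile(vcf_file)
        for rec in f.fetch():
            if rec.id is not None and rec.id in my_snp_list:
                snps_featurs.at[index, rec.id] = 1
    return snps_featurs

# Each worker processes one chunk of sample ids; the partial matrices are then stacked.
pool = Pool(ncpus)
snps_featurs = pd.concat(pool.map(fill_feature_mat, sub_tasks))
pool.close()
pool.join()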
```
The error is:
```
Traceback (most recent call last):
  File "./prepare_data_matrix.py", line 160, in <module>
    indel_variant_stat = pd.concat(pool.map(fill_caleb_features,sub_tasks))
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
NotImplementedError: seek not implemented in files compressed by method 1
```
I don't get any error when I run it on my machine, so I'm not sure why it can't be run in Docker.
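In case it helps with diagnosing this, here is a small check that could be run both locally and inside the container to see how a given VCF is compressed. It is only a sketch: `compression_kind` is a helper name I made up, and I am assuming (without having confirmed it) that "method 1" in the message means the file is being treated as plain gzip rather than BGZF.

```python
import struct

def compression_kind(path):
    """Classify a file as 'bgzf', 'gzip' or 'plain' from its leading bytes."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        # gzip members start with the magic bytes 1f 8b and method 08 (deflate).
        if len(header) < 3 or header[:3] != b"\x1f\x8b\x08":
            return "plain"
        # BGZF blocks always set the FEXTRA flag (0x04) in the flags byte.
        if len(header) < 12 or not (ord(header[3:4]) & 0x04):
            return "gzip"
        xlen = struct.unpack("<H", header[10:12])[0]
        extra = handle.read(xlen)
    # Walk the extra subfields looking for the 'BC' subfield that marks BGZF.
    pos = 0
    while pos + 4 <= len(extra):
        sub_id = extra[pos:pos + 2]
        sub_len = struct.unpack("<H", extra[pos + 2:pos + 4])[0]
        if sub_id == b"BC" and sub_len == 2:
            return "bgzf"
        pos += 4 + sub_len
    return "gzip"
```

Calling `compression_kind(vcf_file)` on the same file in both environments should show whether the two copies differ.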
I'd really appreciate some help here.
Thank you!
Yichao