Hi everyone,
I tried running the TensorFlow example locally to see how it works with the given dataset, but got some sort of KeyError. What did I do wrong?
```
Python 3.5.2 :: Continuum Analytics, Inc.
---
Metadata-Version: 2.0
Name: tensorflow
Version: 0.9.0
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
Installer: pip
License: Apache 2.0
Location: /home/lintangsutawika/anaconda3/envs/tensorflow/lib/python3.5/site-packages
Requires: six, protobuf, numpy, wheel
Classifiers:
Development Status :: 4 - Beta
Intended Audience :: Developers
Intended Audience :: Education
Intended Audience :: Science/Research
License :: OSI Approved :: Apache Software License
Programming Language :: Python :: 2.7
Topic :: Scientific/Engineering :: Mathematics
Topic :: Software Development :: Libraries :: Python Modules
Topic :: Software Development :: Libraries
Entry-points:
[console_scripts]
tensorboard = tensorflow.tensorboard.tensorboard:main
Parsing the csv's.
Traceback (most recent call last):
File "DREAM_DM_starter_tf.py", line 700, in
main(sys.argv)
File "DREAM_DM_starter_tf.py", line 689, in main
X_tr, X_te, Y_tr, Y_te = create_data_splits(path_csv_crosswalk, path_csv_metadata)
File "DREAM_DM_starter_tf.py", line 96, in create_data_splits
Y_tot.append(dict_tuple_to_cancer[dict_img_to_patside[img_name]])
KeyError: ('65725', '646644.dcm.gz')
```
Created by Lintang Adyuta Sutawika lintangsutawika

This is explained in the Dictionary.xls file.
If you want to read the label values from the metadata file in a safe way, you can use something like this:
```
# Labels should be 0 or 1
# In some (rare) cases they can be:
#   . - not imaged
#   * - value masked
def force_num_label(a):
    # Force the label to be a number
    try:
        return int(a)
    except (TypeError, ValueError):
        # Non-numeric markers ('.' or '*') fall back to the bogus label 2
        return 2
```
It forces values that are not numeric (. or *) to be mapped to a bogus numeric label value of 2.

Hi Joshua,
The dot '.' represents missing data.

Hey guys,
I finally got my training submission to start. I solved this problem by inserting a check on the values of row[3] and row[4], because I noticed that those values can sometimes be '.'. Also, may I ask what '.' means? Does it mean that the data set is incomplete, or does it have a special meaning?
Joshua

Hey guys,
The way the code generally works is that I need a dictionary that maps the name of each DICOM file to the condition (binary 0/1 for whether there was an abnormality on that breast). We can build this from our .tsv files. Later in the code, when I create my batches, I do so randomly inline, choosing a random batch of DICOM files to read in. These are stored in something I call dataXX. Then, to train, I need the corresponding labels for all the images stored in dataXX. This should be something like dataYY.
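As a rough illustration of that lookup chain (not the starter code itself), here is a minimal sketch. The dictionary names `dict_img_to_patside` and `dict_tuple_to_cancer` come from the traceback in the original post; the file names, subject ids, labels, and the helper `labels_for_batch` are made up for the example.

```python
# Hypothetical miniature versions of the two lookup tables.
# In the real script these are built from the .tsv files.
dict_img_to_patside = {
    "000001.dcm": ("1", "L"),
    "000002.dcm": ("1", "R"),
    "000003.dcm": ("2", "L"),
}
dict_tuple_to_cancer = {
    ("1", "L"): 0,
    ("1", "R"): 1,
    # ("2", "L") is deliberately absent to mimic a breast with no usable label
}


def labels_for_batch(img_names, img_to_patside, tuple_to_cancer):
    """Return (kept_names, labels), skipping images whose label cannot be found."""
    kept_names, labels = [], []
    for name in img_names:
        key = img_to_patside.get(name)
        if key is None or key not in tuple_to_cancer:
            # A plain tuple_to_cancer[key] would raise KeyError here
            continue
        kept_names.append(name)
        labels.append(tuple_to_cancer[key])
    return kept_names, labels


batch = ["000001.dcm", "000003.dcm", "000002.dcm"]
kept_names, dataYY = labels_for_batch(batch, dict_img_to_patside, dict_tuple_to_cancer)
```

In the KeyError from the original post, the missing key is ('65725', '646644.dcm.gz'), with a file name in the second position, whereas the ValueError traceback shows the label dictionary keyed as (row[0].strip(), 'L'). That may mean the crosswalk columns are not where this version of the script expects them, but that is only a guess from the two tracebacks.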
It seems like you guys aren't getting to the actual training part, but are failing while parsing the .tsv files and creating those dictionaries. In reference to the original question, Lintang, I believe the files are no longer compressed and are now plain DICOM files. Could you be using an older version of my code, from when the files were still compressed to .gz? Can you confirm that you are using the most recent version of the code? Other than that, I'd have to ask whether there are any differences between your local environment and the Synapse environments in which I tested the code.
- Darvin

Michael Kawczynski (MichaelK), I also had this problem. I removed the int, changing ```int(row[3])``` to ```row[3]```.

Also getting a similar ValueError:
```
STDOUT: Mon Oct 24 06:47:20 2016
STDOUT: +------------------------------------------------------+
STDOUT: | NVIDIA-SMI 352.99 Driver Version: 352.99 |
STDOUT: |-------------------------------+----------------------+----------------------+
STDOUT: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
STDOUT: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
STDOUT: |===============================+======================+======================|
STDOUT: | 0 Tesla K80 Off | 0000:87:00.0 Off | 0 |
STDOUT: | N/A 31C P8 26W / 149W | 55MiB / 11519MiB | 0% Default |
STDOUT: +-------------------------------+----------------------+----------------------+
STDOUT: | 1 Tesla K80 Off | 0000:88:00.0 Off | 0 |
STDOUT: | N/A 31C P8 28W / 149W | 55MiB / 11519MiB | 0% Default |
STDOUT: +-------------------------------+----------------------+----------------------+
STDOUT:
STDOUT: +-----------------------------------------------------------------------------+
STDOUT: | Processes: GPU Memory |
STDOUT: | GPU PID Type Process name Usage |
STDOUT: |=============================================================================|
STDOUT: | No running processes found |
STDOUT: +-----------------------------------------------------------------------------+
STDERR: Python 2.7.6
STDOUT: ---
STDOUT: Metadata-Version: 2.0
STDOUT: Name: tensorflow
STDOUT: Version: 0.9.0
STDOUT: Summary: TensorFlow helps the tensors flow
STDOUT: Home-page: http://tensorflow.org/
STDOUT: Author: Google Inc.
STDOUT: Author-email: opensource@google.com
STDOUT: Installer: pip
STDOUT: License: Apache 2.0
STDOUT: Location: /usr/local/lib/python2.7/dist-packages
STDOUT: Requires: numpy, six, wheel, protobuf
STDOUT: Classifiers:
STDOUT: Development Status :: 4 - Beta
STDOUT: Intended Audience :: Developers
STDOUT: Intended Audience :: Education
STDOUT: Intended Audience :: Science/Research
STDOUT: License :: OSI Approved :: Apache Software License
STDOUT: Programming Language :: Python :: 2.7
STDOUT: Topic :: Scientific/Engineering :: Mathematics
STDOUT: Topic :: Software Development :: Libraries :: Python Modules
STDOUT: Topic :: Software Development :: Libraries
STDOUT: Entry-points:
STDOUT: [console_scripts]
STDOUT: tensorboard = tensorflow.tensorboard.tensorboard:main
STDERR: /usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
STDERR: "This module will be removed in 0.20.", DeprecationWarning)
STDERR: I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
STDERR: I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
STDERR: I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
STDERR: I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
STDERR: I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
STDOUT: hdf5 not supported (please install/reinstall h5py)
STDOUT: Parsing the csv's.
STDERR: Traceback (most recent call last):
STDERR: File "DREAM_DM_starter_tf.py", line 700, in
STDERR: main(sys.argv)
STDERR: File "DREAM_DM_starter_tf.py", line 689, in main
STDERR: X_tr, X_te, Y_tr, Y_te = create_data_splits(path_csv_crosswalk, path_csv_metadata)
STDERR: File "DREAM_DM_starter_tf.py", line 89, in create_data_splits
STDERR: dict_tuple_to_cancer[(row[0].strip(), 'L')] = int(row[3])
STDERR: ValueError: invalid literal for int() with base 10: '.'
```
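For the ValueError above, one option is to skip entries that have no usable label instead of casting blindly. A minimal sketch, assuming the metadata file is tab-separated with a header row, the subject id in column 0, and the left and right labels in columns 3 and 4 (row[3] and row[4] are the columns mentioned in this thread). The header names and sample rows are made up, and `parse_label` and `build_label_dict` are hypothetical helpers, not part of the starter script.

```python
import csv
import io


def parse_label(raw):
    """Return the label as an int, or None for markers such as '.' or '*'."""
    try:
        return int(raw.strip())
    except ValueError:
        return None


def build_label_dict(tsv_file):
    """Map (subject_id, side) to a 0/1 label, leaving out unlabeled sides."""
    reader = csv.reader(tsv_file, delimiter="\t")
    next(reader)  # skip the header row (assumed to be present)
    labels = {}
    for row in reader:
        subject = row[0].strip()
        # Assumed layout: column 3 is the left breast, column 4 the right
        for side, col in (("L", 3), ("R", 4)):
            label = parse_label(row[col])
            if label is not None:
                labels[(subject, side)] = label
    return labels


# Made-up sample data standing in for the real metadata file
sample = io.StringIO(
    "id\tcol1\tcol2\tleft\tright\n"
    "65725\t1\t0\t0\t.\n"
    "70001\t1\t0\t*\t1\n"
)
dict_tuple_to_cancer = build_label_dict(sample)
```

With this approach, any image whose (subject, side) key was left out also has to be skipped when the label list is built, otherwise the later dictionary lookup raises a KeyError.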