The challenge data description suggests that the ChIP peaks are provided in narrowPeak format (https://genome.ucsc.edu/FAQ/FAQformat.html#format12). However, my `ChIPseq.A549.GATA3.relaxed.narrowPeak.gz` file has 20 columns, not 10:

```
chr5 17305890 17306561 . 1000 . 592.95022 -1.00000 4.50129 325 5.00 5.00 17305909 17306491 316.44739 293 17305920 17306471 275.30545 295
chr9 117670329 117670875 . 1000 . 584.32485 -1.00000 4.50129 310 5.00 5.00 117670336 117670862 282.17684 298 117670352 117670841 303.17644 291
chr3 124556509 124557047 . 1000 . 511.73383 -1.00000 4.50129 226 5.00 5.00 124556520 124557038 263.61195 213 124556533 124557027 250.43190 247
chr5 142974452 142975393 . 1000 . 514.05251 -1.00000 4.50129 595 5.00 5.00 142974486 142975356 273.46880 574 142974468 142975352 238.51025 578
chr12 31902134 31902614 . 1000 . 491.56946 -1.00000 4.50129 255 5.00 5.00 31902162 31902603 255.00719 256 31902161 31902590 242.13701 220
chr9 118706524 118707262 . 1000 . 488.54446 -1.00000 4.50129 326 5.00 5.00 118706543 118707243 261.88087 339 118706548 118707252 234.92551 269
chr20 14387068 14387548 . 1000 . 495.04979 -1.00000 4.50129 229 5.00 5.00 14387083 14387536 264.09469 199 14387088 14387536 231.80143 208
chrX 69440210 69440627 . 1000 . 490.50548 -1.00000 4.50129 201 5.00 5.00 69440213 69440621 245.04028 189 69440238 69440595 246.12970 193
chr4 75579312 75579895 . 1000 . 488.65999 -1.00000 4.50129 340 5.00 5.00 75579344 75579865 241.27670 306 75579339 75579849 249.34905 291
chr2 26234946 26235712 . 1000 . 493.05366 -1.00000 4.50129 291 5.00 5.00 26234981 26235676 237.23640 247 26234970 26235696 252.89558 272
```

Can someone explain what the columns are? Thanks in advance.

Created by John Reid Epimetheus
Hi Epimetheus,

The first 10 columns correspond to a standard narrowPeak file. The next 10 columns are:

- `localIDR` (float): -log10(Local IDR value)
- `globalIDR` (float): -log10(Global IDR value)
- `rep1_chromStart` (int): The starting position of the feature in the chromosome or scaffold for common replicate 1 peaks, shifted based on offset. The first base in a chromosome is numbered 0.
- `rep1_chromEnd` (int): The ending position of the feature in the chromosome or scaffold for common replicate 1 peaks. The chromEnd base is not included in the display of the feature.
- `rep1_signalValue` (float): Signal measure from replicate 1. Note that this is determined by the `--rank` option, e.g. if `--rank` is set to signal.value, this corresponds to the 7th column of the narrowPeak, whereas if it is set to p.value it corresponds to the 8th column.
- `rep1_summit` (int): The summit of this peak in replicate 1.
- `rep2_chromStart` (int): The starting position of the feature in the chromosome or scaffold for common replicate 2 peaks, shifted based on offset. The first base in a chromosome is numbered 0.
- `rep2_chromEnd` (int): The ending position of the feature in the chromosome or scaffold for common replicate 2 peaks. The chromEnd base is not included in the display of the feature.
- `rep2_signalValue` (float): Signal measure from replicate 2. As for replicate 1, this is determined by the `--rank` option (the 7th narrowPeak column for signal.value, the 8th for p.value).
- `rep2_summit` (int): The summit of this peak in replicate 2.

I describe this format in some detail on the IDR website (https://github.com/nboley/idr/).

Best,
Nathan
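For anyone who wants to load these files programmatically, here is a minimal Python sketch that attaches names to all 20 columns. The first 10 names come from the UCSC narrowPeak description and the rest from the reply above; the name `rep2_chromStart` for column 17 is an assumption made by symmetry with replicate 1, and the helper names (`parse_idr_line`, `read_idr_peaks`, `idr_from_neg_log10`) are hypothetical, not part of the IDR package.

```python
import gzip

# Columns 1-10: UCSC narrowPeak. Columns 11-20: IDR extras as described in
# the reply above (rep2_chromStart is assumed by symmetry with replicate 1).
COLUMNS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "signalValue", "pValue", "qValue", "peak",
    "localIDR", "globalIDR",
    "rep1_chromStart", "rep1_chromEnd", "rep1_signalValue", "rep1_summit",
    "rep2_chromStart", "rep2_chromEnd", "rep2_signalValue", "rep2_summit",
]
INT_COLUMNS = {
    "chromStart", "chromEnd", "score", "peak",
    "rep1_chromStart", "rep1_chromEnd", "rep1_summit",
    "rep2_chromStart", "rep2_chromEnd", "rep2_summit",
}
FLOAT_COLUMNS = {
    "signalValue", "pValue", "qValue", "localIDR", "globalIDR",
    "rep1_signalValue", "rep2_signalValue",
}


def parse_idr_line(line):
    """Turn one whitespace-separated 20-column line into a typed dict."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    record = dict(zip(COLUMNS, fields))
    for key in INT_COLUMNS:
        record[key] = int(record[key])
    for key in FLOAT_COLUMNS:
        record[key] = float(record[key])
    return record


def read_idr_peaks(path):
    """Yield one record per non-empty line; handles plain or gzipped files."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_idr_line(line)


def idr_from_neg_log10(value):
    """Convert a -log10(IDR) column value back to an IDR value."""
    return 10 ** (-value)


# First line of the file quoted in the question.
example = (
    "chr5 17305890 17306561 . 1000 . 592.95022 -1.00000 4.50129 325 "
    "5.00 5.00 17305909 17306491 316.44739 293 17305920 17306471 275.30545 295"
)
peak = parse_idr_line(example)
print(peak["chrom"], peak["rep1_summit"], peak["rep2_summit"])
print(idr_from_neg_log10(peak["globalIDR"]))
```

To read the whole file, iterate over `read_idr_peaks("ChIPseq.A549.GATA3.relaxed.narrowPeak.gz")`. Splitting on any whitespace rather than on tabs only is a deliberate choice so that the same parser works on text pasted into a post like this one.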
