Output specification for chip.wdl

All output filenames keep the prefix of the corresponding input filename. For example, if you start from REP1.fastq.gz and REP2.fastq.gz, the alignment logs for the two replicates are named REP1.flagstat.qc and REP2.flagstat.qc, respectively.
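The naming rule above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's own code: the helper names `strip_fastq_suffix` and `flagstat_qc_name` are hypothetical, and the list of recognized FASTQ suffixes is an assumption.

```python
import os

# Assumed set of FASTQ suffixes; the pipeline itself may recognize others.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def strip_fastq_suffix(filename: str) -> str:
    """Return the filename without its FASTQ suffix (hypothetical helper)."""
    for suffix in FASTQ_SUFFIXES:
        if filename.endswith(suffix):
            return filename[: -len(suffix)]
    return filename


def flagstat_qc_name(fastq_path: str) -> str:
    """Derive the expected alignment log name from an input FASTQ path."""
    prefix = strip_fastq_suffix(os.path.basename(fastq_path))
    return prefix + ".flagstat.qc"


print(flagstat_qc_name("REP1.fastq.gz"))
print(flagstat_qc_name("data/REP2.fastq.gz"))
```

A helper like this can be handy when matching output files back to the inputs they came from.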

Some summary QC files have no prefix: qc.json contains the final QC results and qc.html is the final HTML report.

  1. DNAnexus: If you use dxWDL and run the pipeline on the DNAnexus platform, all outputs are stored in the specified output directory, without any subdirectories.
  2. Cromwell: Otherwise, Cromwell stores the outputs of each task under cromwell-executions/[WORKFLOW_ID]/call-[TASK_NAME]/shard-[IDX]. For every task except idr and overlap, [IDX] is the zero-based index of a replicate. For idr and overlap, it is the zero-based index into the list of all possible pairs of replicates. For example, with 3 replicates the possible pairs are [(rep1,rep2), (rep1,rep3), (rep2,rep3)], so call-idr/shard-2 is the output directory for the pair of replicates 2 and 3.
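The shard numbering for idr and overlap can be reproduced with a short Python sketch. It assumes the pairs are enumerated in the same order as `itertools.combinations`, which matches the three-replicate example above. The function names `replicate_pairs` and `shard_dir` are hypothetical, and the path layout is copied from the description above.

```python
from itertools import combinations


def replicate_pairs(num_replicates: int) -> list:
    """List all replicate pairs (1-based replicate numbers) in shard order."""
    return list(combinations(range(1, num_replicates + 1), 2))


def shard_dir(workflow_id: str, task_name: str, idx: int) -> str:
    """Build the Cromwell output directory for one shard of a task."""
    return f"cromwell-executions/{workflow_id}/call-{task_name}/shard-{idx}"


# With 3 replicates, print the replicate pair behind each idr shard.
for idx, (rep_a, rep_b) in enumerate(replicate_pairs(3)):
    print(shard_dir("WORKFLOW_ID", "idr", idx), f"rep{rep_a} vs rep{rep_b}")
```

For tasks other than idr and overlap, the shard index maps directly to a replicate: shard-0 is replicate 1, shard-1 is replicate 2, and so on.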

For more details, refer to the file table section in an HTML report generated by the pipeline. Files marked as (E) are outputs to be uploaded during ENCODE accession.

| Task name | File | Description |
|---|---|---|
| merge_fastq | `merge_fastqs_R?_*.fastq.gz` | Merged FASTQ |
| trim_fastq | `*.trim_*bp.fastq.gz` | Trimmed FASTQ |
| bwa | `*.bam` | Raw BAM |
| bwa | `*.bai` | BAI for raw BAM |
| bwa | `*.flagstat.qc` | Samtools flagstat log for raw BAM |
| filter | `*.nodup.bam` | Filtered/deduped BAM |
| filter | `*.nodup.flagstat.qc` | Samtools flagstat log for filtered/deduped BAM |
| filter | `*.dup.qc` | Picard/sambamba markdup log |
| filter | `*.pbc.qc` | PBC QC log |
| bam2ta | `*.tagAlign.gz` | TAG-ALIGN generated from filtered BAM |
| bam2ta | `*.N.tagAlign.gz` | Subsampled (N reads) TAG-ALIGN generated from filtered BAM |
| bam2ta | `*.tn5.tagAlign.gz` | TN5-shifted TAG-ALIGN |
| spr | `*.pr1.tagAlign.gz` | 1st pseudo-replicated TAG-ALIGN |
| spr | `*.pr2.tagAlign.gz` | 2nd pseudo-replicated TAG-ALIGN |
| pool_ta | `*.tagAlign.gz` | Pooled TAG-ALIGN from all replicates |
| fingerprint | `*.jsd.qc` | DeepTools fingerprint log |
| fingerprint | `*.png` | DeepTools fingerprint plot |
| choose_ctl | `ctl_for_rep*.tagAlign.gz` | Chosen control for each IP replicate |
| xcor | `*.cc.plot.pdf` | Cross-correlation plot PDF |
| xcor | `*.cc.plot.png` | Cross-correlation plot PNG |
| xcor | `*.cc.qc` | Cross-correlation analysis score log |
| xcor | `*.cc.fraglen.txt` | Estimated fragment length |
| macs2 | `*.narrowPeak.gz` | MACS2 NARROWPEAK |
| macs2 | `*.bfilt.narrowPeak.gz` | Blacklist-filtered NARROWPEAK |
| macs2 | `*.pval.signal.bigwig` | p-value signal BIGWIG |
| macs2 | `*.fc.signal.bigwig` | Fold-enrichment signal BIGWIG |
| macs2 | `*.frip.qc` | Fraction of reads (TAG-ALIGN) in peaks (NARROWPEAK) |
| spp | `*.regionPeak.gz` | SPP NARROWPEAK (REGIONPEAK) |
| spp | `*.bfilt.narrowPeak.gz` | Blacklist-filtered NARROWPEAK |
| spp | `*.frip.qc` | Fraction of reads (TAG-ALIGN) in peaks (NARROWPEAK) |
| idr | `*.*Peak.gz` | IDR NARROWPEAK |
| idr | `*.bfilt.*Peak.gz` | Blacklist-filtered IDR NARROWPEAK |
| idr | `*.txt.png` | IDR plot PNG |
| idr | `*.txt.gz` | Unthresholded IDR output |
| idr | `*.log` | IDR STDOUT log |
| idr | `*.frip.qc` | Fraction of reads (TAG-ALIGN) in peaks (IDR NARROWPEAK) |
| overlap | `*.*Peak.gz` | Overlapping NARROWPEAK |
| overlap | `*.bfilt.*Peak.gz` | Blacklist-filtered overlapping NARROWPEAK |
| overlap | `*.frip.qc` | Fraction of reads (TAG-ALIGN) in peaks (overlapping NARROWPEAK) |
| reproducibility | `*.reproducibility.qc` | Reproducibility QC log |
| reproducibility | `optimal_peak.gz` | Optimal final peak file |
| reproducibility | `conservative_peak.gz` | Conservative final peak file |
| qc_report | `qc.html` | Final HTML QC report |
| qc_report | `qc.json` | Final QC JSON |
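
When the pipeline was run with Cromwell, the file patterns listed above can be used to locate the outputs of a given task. The following Python sketch is one possible way to do this. The function name `find_task_outputs` is hypothetical, and the directory layout is assumed to be the cromwell-executions/[WORKFLOW_ID]/call-[TASK_NAME] structure described earlier. The search is recursive, so it also finds files in any nested subdirectories of a shard.

```python
from pathlib import Path


def find_task_outputs(root, workflow_id, task_name, pattern):
    """Return sorted paths under a task's call directory that match a glob pattern."""
    task_dir = Path(root) / "cromwell-executions" / workflow_id / f"call-{task_name}"
    if not task_dir.is_dir():
        # The task did not run, or the workflow ID is wrong.
        return []
    # rglob searches every shard directory and anything nested below it.
    return sorted(task_dir.rglob(pattern))


# Example: list the filtered/deduped BAMs of all replicates.
for path in find_task_outputs(".", "WORKFLOW_ID", "filter", "*.nodup.bam"):
    print(path)
```

Some patterns overlap within a task (for example, `*.bam` also matches `*.nodup.bam` if both sit under the same directory), so it is safest to search one task directory at a time with the most specific pattern.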