We developed a model-based multi-peak algorithm, Grizzly Peak, to accurately
identify significant ZLD-bound loci across the genome. Grizzly Peak is an
iterative model-based peak fitting method, which we modified from Capaldi et
al. In brief, Grizzly Peak estimates the expected shape of a binding
event in the ChIP-seq measurements. The algorithm then iteratively scans the genome
and identifies enriched regions with high protein occupancy. These regions are
expanded and analyzed to find a minimal set of peaks (each with a
genomic position and an occupancy level) that optimizes the fit to the
measured data. To allow for overlapping peaks, we devised a simple heuristic
that considers candidate steps such as adding or removing a peak. Each candidate
step is assigned a score and is taken only if it significantly improves the
score. Once a genomic region has been analyzed and fitted, the optimized
set of peaks is recorded, and the region is excluded from further
fitting. This process is repeated until no significantly bound loci remain.
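
The per-region fitting loop described above can be sketched roughly as follows.
This is a minimal illustration under stated assumptions, not the Grizzly Peak
implementation: the Gaussian peak shape, the sum-of-squared-residuals score,
the fixed min_gain threshold, and all function names (peak_shape, fit_heights,
fit_region, and so on) are hypothetical choices made for the example. The
method itself estimates the peak shape from the data, and its scoring and
significance criteria are not specified here.

```python
import numpy as np

def peak_shape(x, center, width):
    # Assumed Gaussian profile for one binding event (hypothetical);
    # the method described above estimates this shape from the data.
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def fit_heights(x, signal, centers, width):
    # Least-squares occupancy levels for fixed peak positions, clipped at zero.
    if not centers:
        return np.zeros(0)
    design = np.column_stack([peak_shape(x, c, width) for c in centers])
    heights, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return np.clip(heights, 0.0, None)

def residual(x, signal, centers, heights, width):
    # Measured signal minus the sum of all fitted peaks.
    fitted = np.zeros(len(x))
    for c, h in zip(centers, heights):
        fitted += h * peak_shape(x, c, width)
    return signal - fitted

def evaluate(x, signal, centers, width):
    # Score = sum of squared residuals (lower is better); an assumed choice.
    heights = fit_heights(x, signal, centers, width)
    sse = float(np.sum(residual(x, signal, centers, heights, width) ** 2))
    return sse, heights

def fit_region(x, signal, width=50.0, min_gain=0.05, max_steps=50):
    # Greedy search for a small set of (position, occupancy) peaks.
    centers = []
    total = float(np.sum(signal ** 2))
    best, heights = evaluate(x, signal, centers, width)
    for _ in range(max_steps):
        # Candidate step 1: add a peak where the current fit is worst.
        res = residual(x, signal, centers, heights, width)
        cand = centers + [float(x[int(np.argmax(res))])]
        sse, cand_heights = evaluate(x, signal, cand, width)
        if best - sse <= min_gain * total:
            break  # no significant improvement, stop
        centers, heights, best = cand, cand_heights, sse
        # Candidate step 2: remove one peak if the fit barely worsens.
        for i in range(len(centers)):
            trial = centers[:i] + centers[i + 1:]
            sse, trial_heights = evaluate(x, signal, trial, width)
            if sse - best < min_gain * total:
                centers, heights, best = trial, trial_heights, sse
                break
    return sorted(zip(centers, (float(h) for h in heights)))

# Usage on a synthetic region with two well-separated binding events.
x = np.arange(0, 1000, 10)
signal = 10.0 * peak_shape(x, 300.0, 50.0) + 6.0 * peak_shape(x, 700.0, 50.0)
peaks = fit_region(x, signal)  # list of (position, occupancy) pairs
```

The sketch covers only the fitting of a single region. The genome-wide scan,
the expansion of enriched regions, and the exclusion of fitted regions from
later passes would wrap around fit_region. A greedy search of this kind may
also fail to separate strongly overlapping peaks, which is one reason a real
implementation would likely need a richer set of candidate steps.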