TiMAT ChIP-chip Usage (Versions >= 3.2)

Here is an outline of how to use TiMAT to process your ChIP-chip Affymetrix tiling microarray data. The steps parallel those for earlier TiMAT versions but are fewer in number. Perhaps the biggest change previous TiMAT users need to be aware of is the new file formats for *cela and *celp files. These file formats are not backward compatible, although it is possible, and not particularly hard, to write a Java-based convertor to update or backdate *cela and *celp files.

Recommended Resources:

  1. Java version 1.5 or greater
  2. 1GB of RAM
  3. R statistical programming environment (see installation guide)

TiMAT is developed and used primarily on Linux and has been largely tested on Mac OS X. In theory, TiMAT should run on any platform that supports Java and R, including MS Windows.

Experimental Recommendations:

The following suggested usage is based on an experiment such as that described below: For ChIP-chip experiments:


Processing Protocol:

Do once...
  1. Acquire the D. melanogaster tpmap file from the BDTNP website or optionally
  2. TPMap processing

As of TiMAT version 3.2, much of the windowing is handled within each specific application via the WindowCircularBuffer class; i.e., a tpmap file is necessary, but the files generated by TPMapProcessor are not. Still, these files are used in a few places in TiMAT, so for the moment running TPMapProcessor remains a required step. TPMapProcessor may go away in future TiMAT releases.

For every experiment...
  1. Convert your Affymetrix binary *CEL files to TiMAT's *cela file format using CelFileConvertor; *cela files maintain all probe intensity information found in the *CEL files as well as its random organization
  2. Inspect each *cela file for problem spots using CelMasker, mask out problem areas and save as a revised *cela file if necessary
  3. Perform quantile normalization and convert to TiMAT's *celp file format using CelProcessor; *celp files are organized by biological position and have had probes not found in the tpmap file removed
  4. Run ScanChip twice, once for the chromatin controls and antibody chips and once for the chromatin controls and mock-IP chips; ScanChip combines replicates, factors out background noise based on controls, builds and scores windows including a p-value for each window
  5. Take note of the correlation coefficients between replicates given by ScanChip; these values should be high, for instance >= 0.85
  6. Run FDRWindowConvertor to calculate a false discovery rate (FDR) score
  7. Merge high scoring, overlapping Windows into Intervals with the IntervalMaker program; the biggest difficulty is deciding where to set the threshold for merging windows; two FDR estimations are provided by TiMAT: an empirical FDR based on a mock IP and a statistical FDR based on Richard Bourgon's non-parametric symmetric p-test; set the threshold generously, for instance at 25%, and filter later with IntervalFilter or manually (for version 3.2 the argument for x=25% is given to IntervalMaker on the scale of -10*ln(x)/ln(10) = 6.02, with x expressed as the fraction 0.25, while for version 3.3 and later, simply give the percentage, i.e. 25); if you have multiple replicates, or have used different antibodies, you can merge the different Window arrays using the MultiWindowIntervalMaker
  8. Load the intervals with oligo information using LoadIntervalOligoInfo
  9. Find the best average intensity difference sub window within each Interval, as well as enrichment peaks using FindSubBindingRegions
  10. Filter Intervals with a variety of parameters using IntervalFilter, which sorts intervals into pass or fail; a suggested cutoff is 1% FDR (for version 3.2 and earlier, the argument for x=1% is given to IntervalFilter on the scale of -10*ln(x)/ln(10) = 20, with x expressed as the fraction 0.01, while for version 3.3 and later, simply give the percentage value, i.e. 1)
  11. Print the Intervals in GFF3 format with IntervalGFFPrinter
  12. Print a summary report with IntervalReportPrinter
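
The threshold conversion mentioned in steps 7 and 10 can be checked with a short calculation. The sketch below is not part of TiMAT; the function names fdr_to_score and score_to_fdr are hypothetical. It only evaluates -10*ln(x)/ln(10), which is the same as -10*log10(x), with x taken as a fraction (0.25 for 25%).

```python
import math


def fdr_to_score(fdr_percent):
    """Convert an FDR given in percent to the -10*ln(x)/ln(10) scale.

    Hypothetical helper for illustration; x is the FDR as a fraction.
    """
    if not 0 < fdr_percent <= 100:
        raise ValueError("fdr_percent must be in the range (0, 100]")
    fraction = fdr_percent / 100.0
    return -10.0 * math.log(fraction) / math.log(10.0)


def score_to_fdr(score):
    """Inverse transform: recover the FDR in percent from a score."""
    return 100.0 * 10.0 ** (-score / 10.0)


if __name__ == "__main__":
    # The two thresholds suggested in the protocol: 25% and 1%.
    for percent in (25, 1):
        print(f"{percent}% FDR -> score {fdr_to_score(percent):.2f}")
```

Run as a script, this prints 6.02 for 25% and 20.00 for 1%, matching the values quoted in the protocol. Only versions that expect the transformed scale need this conversion; later versions take the plain percentage.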

  1. Visually pick binding regions with a mouse using the IntervalPlotter; the FindSubBindingRegions program will automatically identify binding peaks with some reliability but cannot be trusted to correctly pick the flanks of the peak
  2. Use the AnnotateRegions application to fetch information about genes surrounding the binding region picks from the IntervalPlotter; given the gross inconsistencies between GFF file formats, this is currently only configured to work with the fly chips
  3. Compare lists of regions with the RankedSetAnalysis application; it creates a visual representation of region intersection overlap using box-line-boxes
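
To make the idea of comparing region lists concrete, here is a minimal sketch of counting how many regions in one list overlap a region in another. It is not the RankedSetAnalysis implementation and produces no box-line-box plot; the (chromosome, start, end) tuple layout, the inclusive coordinates, and the function names are all assumptions made for illustration.

```python
def regions_overlap(a, b):
    """Return True if two (chrom, start, end) regions share at least one base.

    Coordinates are assumed to be inclusive on both ends.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def count_intersections(first, second):
    """Count regions in `first` that overlap at least one region in `second`."""
    return sum(
        1 for a in first if any(regions_overlap(a, b) for b in second)
    )


if __name__ == "__main__":
    # Made-up example regions, not real binding data.
    set_a = [("2L", 100, 500), ("2L", 900, 1200), ("3R", 50, 80)]
    set_b = [("2L", 450, 700), ("3R", 200, 300)]
    hits = count_intersections(set_a, set_b)
    print(f"{hits} of {len(set_a)} regions in set A overlap set B")
```

This all-pairs comparison is quadratic in the number of regions, which is fine for short pick lists; sorting by chromosome and start position would be the usual next step for larger sets.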