Core Applications

On Unix-like systems running Bourne or bash shells, the applications may be launched using the shell scripts in TiMAT's bin directory. If TiMAT/bin has been added to the PATH environment variable, an application can be run as follows:

shell$ ScanChip

On Unix-like or other operating systems, the jar files can also be executed directly as follows:

shell$ java -jar ScanChip.jar

where ScanChip.jar is given as the full path to the jar file from the current working directory.

Important note on file formats and versions: Versions 3.2 and later use a new file format for TiMAT's *cela and *celp files. The change was made to allow for more memory-efficient algorithms. The change is not backward compatible, i.e. old *cela and *celp files will not work with most v. >=3.2 applications and vice versa. However, it is possible to convert the file formats; please see the CHANGES file for details.

Splits a text file into many files, each holding a given number of lines. Use to split the TPMap file for TPMapOligoBlastFiltering.
Blasts each oligo against a genome. Counts the number of matches (exact matches and 1 bp mismatches). Replaces the t/f orientation column in the TPMap file with this count. Reassigns the chromosomal coordinates. Writes two files: BlastScored, which saves TPMap lines that have at least one exact match, and BlastFiltered, which saves only those TPMap lines that have one exact match. Provides statistics on the filtering.
Uses MUMmer to map and filter a 1lq text file against a genome. 100x faster than the TPMap Oligo Blast Filter and recommended if you don't care about 1 bp mismatches. For processing very large chromosomes (e.g. human chr2 or chr3, but not any Drosophila chromosome), MUMmer needs a 64-bit machine with lots of RAM.
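As an illustration of line-count splitting, here is a minimal Python sketch. It is not TiMAT's code; the function name split_by_lines and the numbered output file names are assumptions made for the example.

```python
from itertools import islice
from pathlib import Path


def split_by_lines(source, lines_per_file, out_dir):
    """Write consecutive chunks of `source` to numbered files and return their paths.

    Hypothetical helper for illustration only; not part of TiMAT.
    """
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(source) as handle:
        while True:
            # Pull the next block of lines without loading the whole file.
            chunk = list(islice(handle, lines_per_file))
            if not chunk:
                break
            part = out_dir / f"{Path(source).name}.{len(written):04d}"
            part.write_text("".join(chunk))
            written.append(part)
    return written
```

Each output file keeps whole lines, so the pieces can later be concatenated back in order.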
Joins many text files together, paying attention to avoid fusing the last and first lines from two files. Use to combine the remapped TPMap file.
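The line-fusing problem mentioned above arises when a file does not end with a newline. A minimal Python sketch of a safe join follows; join_files is a hypothetical name and this is not TiMAT's implementation.

```python
def join_files(sources, destination):
    """Concatenate text files, keeping the last line of one file
    separate from the first line of the next.

    Hypothetical helper for illustration only; not part of TiMAT.
    """
    with open(destination, "w") as out:
        for source in sources:
            with open(source) as handle:
                text = handle.read()
            out.write(text)
            # Add the missing terminator so the next file starts on a new line.
            if text and not text.endswith("\n"):
                out.write("\n")
```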
Sorts a TPMap by chromosome and start position. Use to sort the remapped TPMap file.
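A sort by chromosome and start position can be sketched in Python as below. The column positions (chromosome in the third tab-delimited column, start in the fourth) and the treatment of '#' lines as headers are assumptions about the text TPMap layout, and sort_tpmap_lines is a hypothetical name, not a TiMAT function.

```python
def sort_tpmap_lines(lines, chrom_col=2, start_col=3):
    """Return TPMap text lines sorted by chromosome name, then numeric start.

    Assumed layout: tab-delimited, chromosome at index `chrom_col`,
    start position at index `start_col`; lines starting with '#' are
    treated as header lines and kept at the top in their original order.
    """
    header = [line for line in lines if line.startswith("#")]
    body = [line for line in lines if line.strip() and not line.startswith("#")]

    def key(line):
        fields = line.rstrip("\n").split("\t")
        # Chromosome names compare as plain strings, so "chr10" sorts before "chr2".
        return fields[chrom_col], int(fields[start_col])

    return header + sorted(body, key=key)
```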
Deprecated as of version 3.2: most TiMAT applications now use the full tpmap and compute windows on the fly. Converts a tpmap file into precomputed windows across each chromosome for use by other TiMAT applications.
Converts binary *.CEL files (v. >=3.2) or text *.cel files (v. <3.2) into a serialized float[][] object (*.cela) for use by other TiMAT applications.
Removed as of version 3.2, use CelMasker instead -- Creates image files for each text version '.cel' file provided. Useful for identifying chips in need of masking by the Cel Masker application.
Removed as of version 3.2 -- Calculates various statistics on groups of xxx.cela files, including hierarchical clustering. Renders and saves png virtual slides for outlier files.
Renders an image of the chip from *cela files and provides tools for masking out blemished areas. Image (masked or otherwise) may be saved in either *cela or, as of v. 3.2, *png format.
Walks through the TPMap object file fetching the PMMM intensity values, saves the float array to disk. Provides options for median scaling, quantile normalization and PM-MM transformation.
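To make the normalization options concrete, here is a minimal pure-Python sketch of median scaling and quantile normalization. It shows the general techniques only; the target median of 100, the handling of ties, and the function names are assumptions and may differ from what TiMAT does.

```python
from statistics import median


def median_scale(values, target=100.0):
    """Rescale intensities so their median equals `target` (assumed target)."""
    factor = target / median(values)
    return [v * factor for v in values]


def quantile_normalize(chips):
    """Give every chip the same intensity distribution.

    `chips` is a list of equal-length intensity lists. Each value is
    replaced by the mean, across chips, of the values holding the same
    rank. Ties are broken by position, a simplification.
    """
    n = len(chips[0])
    # For each chip, the indices of its values in ascending order.
    orders = [sorted(range(n), key=chip.__getitem__) for chip in chips]
    rank_means = [
        sum(chip[order[rank]] for chip, order in zip(chips, orders)) / len(chips)
        for rank in range(n)
    ]
    result = [[0.0] * n for _ in chips]
    for out, order in zip(result, orders):
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
    return result
```

After quantile normalization every chip contains the same set of values, just arranged according to its own ranking.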
Removed as of version 3.2, correlation coefficients given by ScanChip in v. >=3.2 -- Very simple scatter plot and correlation coefficient calculator for huge int or float arrays. Use to estimate data reproducibility. For example, the correlation coefficient for two independent ChIP Chip experiments using the same antibody should be above 0.8 and preferably above 0.9. If not, optimization is needed.
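For reference, the correlation coefficient used as a reproducibility check can be computed as in the Python sketch below. It assumes the Pearson coefficient is meant; the function name pearson is hypothetical and this is not the removed application's code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)
```

Applied to the intensity arrays of two replicate chips, a value below the 0.8 guideline mentioned above would suggest the experiment needs optimization.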
Performs hierarchical clustering on a directory of serialized float[] arrays (transformed cel files, *.celp). Very useful in flagging bad chips.
Use to score windows of oligo intensities with a Wilcoxon Rank Sum test (removed in v. >=3.2) and a trimmed mean log ratio test. Wraps Richard Bourgon's Symmetric P-value test and John Storey's q-value multiple testing applications to convert log ratios into corrected p-values. Saves a variety of window and point sgr files for direct visualization in Affymetrix's IGB.
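A trimmed mean log ratio for one window can be sketched in Python as follows. The base-2 logarithm, the 10% trim from each end, and the function names are assumptions for illustration, not TiMAT's actual parameters.

```python
from math import log2


def trimmed_mean(values, trim_fraction=0.1):
    """Mean after dropping `trim_fraction` of the values from each end."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    cut = int(len(ordered) * trim_fraction)
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def window_log_ratio(treatment, control, trim_fraction=0.1):
    """Trimmed mean of per-oligo log2(treatment / control) in one window."""
    ratios = [log2(t / c) for t, c in zip(treatment, control)]
    return trimmed_mean(ratios, trim_fraction)
```

Trimming makes the window score less sensitive to a single outlying oligo than a plain mean would be.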
Associates an empirical FDR estimate with each window provided mock IPs were performed.
Counts the number of windows that pass a range of score cutoffs. Use this info to calculate false discovery rates.
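One way to turn such counts into a false discovery rate is sketched below in Python. It assumes a simple empirical estimate, the number of mock IP windows passing a cutoff divided by the number of real windows passing it; the function names and this formula are assumptions, not TiMAT's documented method.

```python
def count_passing(scores, cutoffs):
    """Map each cutoff to the number of window scores at or above it."""
    return {cutoff: sum(1 for s in scores if s >= cutoff) for cutoff in cutoffs}


def empirical_fdr(real_scores, mock_scores, cutoffs):
    """Estimate FDR per cutoff as mock count / real count, capped at 1.0.

    Returns None for a cutoff that no real window passes.
    """
    real = count_passing(real_scores, cutoffs)
    mock = count_passing(mock_scores, cutoffs)
    return {
        cutoff: (min(1.0, mock[cutoff] / real[cutoff]) if real[cutoff] else None)
        for cutoff in cutoffs
    }
```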
Merges overlapping windows into intervals. Options are provided to set the minimum required overlap and minimum score needed for merging.
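The merging step can be illustrated with the Python sketch below. Windows are assumed to be (chromosome, start, stop, score) tuples, overlap is measured as the previous stop minus the next start, and a merged interval keeps the best score; all of these, along with the name merge_windows, are assumptions for the example rather than TiMAT's behavior.

```python
def merge_windows(windows, min_score=0.0, min_overlap=1):
    """Merge windows that pass `min_score` and overlap by at least
    `min_overlap` bases into intervals.

    `windows` holds (chromosome, start, stop, score) tuples.
    """
    kept = sorted(w for w in windows if w[3] >= min_score)
    intervals = []
    for chrom, start, stop, score in kept:
        if intervals:
            last = intervals[-1]
            if last[0] == chrom and last[2] - start >= min_overlap:
                # Extend the open interval and remember its best score.
                last[2] = max(last[2], stop)
                last[3] = max(last[3], score)
                continue
        intervals.append([chrom, start, stop, score])
    return [tuple(interval) for interval in intervals]
```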
Merges overlapping windows into intervals from multiple Window[] arrays (multiple experiments). Options are provided to set the minimum required overlap and minimum score needed for each Window[] array for merging, as well as the minimum number of passing windows, e.g. 2 out of 3. Useful for combining replicates or different antibodies into composite Intervals when one doesn't want to process them as a pool.
Fetches and saves oligo intensity information from processed cel files for each interval.
Scores Intervals for hits to a transcription factor binding matrix, LLPSPM.
Scores a genome for hits to a transcription factor binding matrix, LLPSPM.
Scores a multi-FASTA file of sequences for hits to a transcription factor binding matrix, LLPSPM.
Finds the sub-window (typically 350 bp) within an interval that has the highest average intensity difference or ratio. Also automatically picks binding peaks and binding regions.
Filters intervals by a variety of criteria into two files, pass or fail.
Sorts two sets of Intervals based on whether they overlap one another. Good for finding common sets of Intervals between, say, two different antibodies, or for excluding intervals associated with controls.
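The overlap test behind this kind of sorting can be sketched in Python as below. Intervals are assumed to be (chromosome, start, stop) tuples with an exclusive stop, and split_by_overlap is a hypothetical name; this is an illustration, not TiMAT's code.

```python
def overlaps(a, b):
    """True when two (chromosome, start, stop) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def split_by_overlap(first, second):
    """Split `first` into intervals that overlap something in `second`
    and intervals that do not."""
    shared, unique = [], []
    for interval in first:
        if any(overlaps(interval, other) for other in second):
            shared.append(interval)
        else:
            unique.append(interval)
    return shared, unique
```

With a control set passed as `second`, the `unique` list holds the intervals not found in the controls.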
Performs an intersection analysis between Intervals and a list of user specified regions.
Displays a detailed picture for each interval with oligo intensity information, transcription factor binding sites, repeat regions, best windows, etc. Use as the final filter in deciding whether the Interval is reasonable. Also use to pick binding regions and binding peaks by clicking and dragging.
Generates a spreadsheet or detailed reports for the Intervals with an assortment of calculations.
Prints a .sgr file to represent the Intervals in Affymetrix's IGB.
Prints a text GFF3 file to represent the Intervals.
Prints .sgr files for the intensity ratio and intensity difference for each oligo when given processed treatment and control cel files.
Prints .sgr files for any serialized int array (i.e. the processed cel files).
Converts a text file containing four tab delimited columns (chr, start, stop, score) to a heat map compatible file for import into IGB.
Performs an intersection analysis on lists of ranked regions creating a visual box-line-box representation as well as a rank based % intersection graph.
Use to characterize what genes surround the binding regions and whether there is a bias in their distribution. Can be combined with programs like GOMiner to identify possible biological functions associated with ChIP Chip data. At present it works with Drosophila 4.0 annotation. Parsers for other gff annotation can be readily adapted.
New to version 3.3. NeighboringSequences finds arbitrarily sized sequences adjacent to TiMAT features as given in the third column of the gff3 file produced by IntervalGFFPrinter.
New to version 3.4. Converts TiMAT's celp files to an sgr file; useful for debugging code and experiments.
New to version 3.4. Builds a tpmap file from Nimblegen .pos and .ndf files.