| File Splitter | Splits a text file into multiple files, each containing a given number of lines. Use to split the bpmap file for BPMapOligoBlastFiltering. |
| BPMap Oligo Blast Filter | Blasts each oligo against a genome. Counts the number of matches (exact matches and 1bp mismatches). Replaces the t/f orientation column in the bpmap file with this count. Reassigns the chromosomal coordinates. Writes two files: BlastScored, which saves bpmap lines that have at least one exact match, and BlastFiltered, which saves only those bpmap lines that have exactly one exact match. Provides statistics on the filtering. |
| File Joiner | Joins many text files together, taking care not to fuse the last line of one file with the first line of the next. Use to combine the remapped bpmap files. |
| Mummer Filter | Uses Mummer to map and filter a ProbeExporter bpmap-ish text file against a genome. 100x faster than the Oligo Blast Filter and recommended if you don't care about 1bp mismatches. Reassigns coordinates and removes any oligos that map more than once to the genome. For processing very large chromosomes (e.g., human chr2 or chr3, but not any Drosophila chromosome), Mummer needs a 64-bit machine with 6GB of RAM. |
| BPMap Sort | Sorts a bpmap by chromosome and start position. Use to sort the remapped bpmap file. |
| BPMap Processor | Three-part application. BPMapDuplicateFilter collapses oligos into a unique set of sequences, notes where repeats are spotted, and saves their coordinates for averaging oligo intensities. MapSplitter collects information about the bpmap file (number of lines, start and stop line indexes for each chromosome, etc.) and saves float arrays of the genomic start positions and the number of duplicates for each oligo, split by chromosome. WindowMaker makes overlapping windows for each chromosome, advancing one oligo at a time, with a maximized width and a minimum number of oligos. |
| Virtual Cel | Creates an image file for each text-version '.cel' file provided. Useful for identifying chips in need of masking by the Cel Masker application. |
| Cel Masker | Draws a virtual cel file based on a text-version '.cel' file. Problem areas can be circled and masked. |
| Cel Mapper | Standalone application that, given a .cel file and a bpmap file, prints a pseudo bpmap file in which the PM and MM coordinates have been replaced with their intensity values. |
| Cel Processor | Three-part application. CelMapper builds a virtual slide with cel file intensities, walks through the bpmap file fetching the PM and MM intensity values, and saves the float array to disk. QuantileNormalization provides options to quantile normalize and median scale multiple float arrays from CelMapper. It also creates an averaged chip to represent multiple treatments or controls. PMMMTransformer can be used to perform a simple max((PM-MM),0) transformation of the float arrays. |
| Scatter Plot | Very simple scatter plot and correlation coefficient calculator for huge int or float arrays. Use to estimate data reproducibility. For example, the correlation coefficient for two independent ChIP-chip experiments using the same antibody should be above 0.8 and preferably above 0.9. If not, the experiment needs optimization. |
| Sum Intensity Test | Use to score windows of oligo intensities with a Wilcoxon Rank Sum test, a trimmed mean relative difference test, and a trimmed mean ratio test. Also saves .sgr files for each of the tests for import into Affymetrix's IGB. |
| Skeptical Sum Intensity Test | Same as Sum Intensity Test, but assigns to each window the lowest sum, ratio, and difference scores from all possible pairwise comparisons between treatment and control chips. |
| Interval Maker | Merges overlapping windows into intervals. Options are provided to set the minimum required overlap and minimum score(s) needed for merging. |
| Window Scanner | Scores window arrays as pass or fail over a range of ratio values. Useful for establishing a false positive rate when comparing treatment and control. |
| Load Interval Oligo Info | Fetches and saves oligo intensity information from processed cel files for each interval. |
| Score Intervals | Scores Intervals for hits to a transcription factor binding matrix, LLPSPM. |
| Score Chromosomes | Scores a genome for hits to a transcription factor binding matrix, LLPSPM. |
| Score Sequences | Scores a multi-FASTA file of sequences for hits to a transcription factor binding matrix, LLPSPM. |
| Find Sub Binding Regions | Finds the sub window (typically 350bp) within an interval that has the highest average intensity difference or ratio. Also automatically picks binding peaks and binding regions. |
| Interval Filter | Filters intervals by a variety of criteria into two files, pass or fail. |
| Overlap Counter | Sorts two sets of Intervals based on whether they overlap one another. Good for finding Intervals common to, say, two different antibodies, or for excluding Intervals associated with controls. |
| Intersect Intervals With Regions | Performs an intersection analysis between Intervals and a list of user specified regions. |
| Interval Plotter | Displays a detailed picture for each interval with oligo intensity information, transcription factor binding sites, repeat regions, best windows, etc. Use as the final filter in deciding whether the Interval is reasonable. Also use to pick binding regions and binding peaks by clicking and dragging. |
| Interval Report Printer | Generates a spreadsheet or detailed reports for the Intervals with an assortment of calculations. |
| Interval Graph Printer | Prints a .sgr file to represent the Intervals in Affymetrix's IGB. |
| Interval GFF Printer | Prints a text GFF3 file to represent the Intervals. |
| Binding Region Graph Printer | Prints a .sgr file to represent, in Affymetrix's IGB, the binding regions picked using the IntervalPlotter application. |
| Oligo Intensity Printer | Prints .sgr files for the intensity ratio and intensity difference for each oligo when given processed treatment and control cel files. |
| Intensity Printer | Prints .sgr files for any serialized int array (i.e. the processed cel files). |
| Annotate Regions | Use to characterize which genes surround the binding regions and whether there is a bias in their distribution. Can be combined with programs like GOMiner to identify possible biological functions associated with ChIP-chip data. At present it works with Drosophila 4.0 annotation. Parsers for other GFF annotation can be readily adapted upon request. |
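
The Interval Maker row above describes merging overlapping windows into intervals, subject to a minimum overlap and a minimum score. The following is only a minimal sketch of that idea, not the suite's actual implementation: the `(start, stop, score)` window layout, the `merge_windows` function, and its `min_score` and `min_overlap` parameters are hypothetical names chosen for illustration, and a single score per window is assumed.

```python
from typing import List, Tuple

# Hypothetical window layout: (start bp, stop bp, score). Not the suite's own format.
Window = Tuple[int, int, float]


def merge_windows(windows: List[Window],
                  min_score: float = 0.0,
                  min_overlap: int = 1) -> List[Window]:
    """Merge windows on one chromosome into intervals.

    Windows scoring below min_score are dropped first. A window is folded
    into the current interval when it shares at least min_overlap base
    pairs with it; otherwise it starts a new interval. Each interval keeps
    the best score of the windows it absorbed.
    """
    kept = sorted(w for w in windows if w[2] >= min_score)
    intervals: List[Window] = []
    for start, stop, score in kept:
        if intervals:
            last_start, last_stop, last_best = intervals[-1]
            # Base pairs shared between this window and the current interval.
            overlap = min(last_stop, stop) - start
            if overlap >= min_overlap:
                intervals[-1] = (last_start,
                                 max(last_stop, stop),
                                 max(last_best, score))
                continue
        intervals.append((start, stop, score))
    return intervals


if __name__ == "__main__":
    example = [(100, 450, 2.0), (200, 550, 3.5),
               (1000, 1350, 1.0), (1200, 1500, 0.2)]
    for interval in merge_windows(example, min_score=0.5):
        print(interval)
```

Sorting by start position first means each window only has to be compared with the most recent interval, so the merge runs in a single pass after the sort.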