Erez Ben-Yaacov and Yonina C. Eldar
A central task in the analysis of aCGH and tiling microarray data is the segmentation into groups of probes sharing the same copy number. Some well-known segmentation methods suffer from very long running times, which prevents interactive data analysis.
We suggest a new 1-D piecewise constant segmentation method, based on wavelet decomposition and thresholding, which detects significant breakpoints in the data. Our algorithm is over 1,000 times faster than leading approaches, with similar performance. Another key advantage of the proposed method is its simplicity and flexibility. Due to its intuitive structure, it can be easily generalized to incorporate several types of side information. We consider two extensions: one incorporates side information indicating the reliability of each measurement, and the other compensates for changing variability in the measurement noise. The resulting algorithm outperforms existing methods, both in terms of speed and performance, when applied to real high-density aCGH data.
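To make the idea of wavelet-based breakpoint detection concrete, here is a minimal sketch in Python. It is not the HaarSeg implementation: it works at a single scale with a fixed threshold, and the function names (`haar_coefficients`, `find_breakpoints`, `piecewise_fit`) and the parameters `half_width` and `threshold` are hypothetical, chosen only for illustration. It computes an undecimated Haar detail coefficient at every probe position, keeps local maxima whose magnitude exceeds the threshold, and fits a constant to each resulting segment.

```python
from itertools import accumulate
from math import sqrt


def haar_coefficients(signal, half_width):
    """Undecimated Haar detail coefficient at each candidate breakpoint k.

    coeffs[k] compares the sum of the half_width samples starting at k
    with the sum of the half_width samples ending just before k.
    Positions too close to either end are left at 0.
    """
    n = len(signal)
    csum = [0.0] + list(accumulate(float(v) for v in signal))
    coeffs = [0.0] * n
    for k in range(half_width, n - half_width + 1):
        left = csum[k] - csum[k - half_width]
        right = csum[k + half_width] - csum[k]
        coeffs[k] = (right - left) / sqrt(2 * half_width)
    return coeffs


def find_breakpoints(signal, half_width=8, threshold=1.0):
    """Keep local maxima of |coefficient| that exceed a fixed threshold.

    The fixed threshold is an assumption of this sketch, not the
    thresholding rule used by the published method.
    """
    mags = [abs(c) for c in haar_coefficients(signal, half_width)]
    return [
        k
        for k in range(1, len(mags) - 1)
        if mags[k] > threshold and mags[k] >= mags[k - 1] and mags[k] > mags[k + 1]
    ]


def piecewise_fit(signal, breakpoints):
    """Replace each segment between breakpoints by its mean value."""
    edges = [0] + list(breakpoints) + [len(signal)]
    fitted = []
    for start, end in zip(edges, edges[1:]):
        mean = sum(signal[start:end]) / (end - start)
        fitted.extend([mean] * (end - start))
    return fitted


# Toy noiseless profile: a single gained region between probes 50 and 99.
profile = [0.0] * 50 + [1.0] * 50 + [0.0] * 50
print(find_breakpoints(profile))  # [50, 100]
```

On real, noisy data a single scale and a hand-picked threshold would not be adequate. The sketch only shows why the approach is fast: with cumulative sums, each coefficient costs constant time, so a full pass over the probes is linear in their number.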
Two examples of a genomic profile of a tumor (green) and HaarSeg segmentation result (blue). Data is taken from Lai et al. 2005.
Reference: E. Ben-Yaacov and Y. C. Eldar, "A Fast and Flexible Method for the Segmentation of aCGH Data", Bioinformatics, vol. 24, no. 16, pp. i139-i145, September 2008.
Download HaarSeg Matlab Implementation. Version 1.2. June, 2009.
Installation:
1. Unzip all files to a directory of your choice.
2. Compile the mex functions (compiled Windows 32-bit versions are provided):
(-) In your Matlab environment, set the current directory to where you unzipped the sources.
(-) Type the following in the Matlab environment:
>> mex mexConvAndPeak.c
>> mex mexThresAndUnify.c
>> mex mexAdjustBreaks.c
Usage:
1. HaarSeg.m is the main function, used to segment data. Type "help HaarSeg.m" for basic usage instructions.
2. thresBySig.m is the aberration-threshold function, which can be applied to the segmentation result. Type "help thresBySig.m" for basic usage instructions.
Download HaarSeg R Implementation (via R-Forge). Version 0.0.2. June, 2009.
Usage:
1. HaarSeg.R is the main function, used to
segment data. See comments inside HaarSeg.R for usage instructions.