----------------------------------------------------------- |INSTRUCTIONS FOR USING gBPCR (also for users not used to R)| ----------------------------------------------------------- ##Requirements: ##the package needs the library nlme, therefore you need to install it ##to use the functions of the package## -Open R -Change the working directory with the folder gBPCR. (the user can use the command "file"/"change directory" in the menu of the console, or the command line setwd("path of the folder")). -Type source("gBPCR.R"). -Type library(nlme) ##how to use the help## -All the functions are described in the help in folder html. -To use the help easily, open the file html\00Index.html with a browser. ##examples## -In the help, the description of each function ends with an example of how to use it. -In the folder R-ex, there are the corresponding R scripts. ##attention## Some functions of the package need to know the parameter maxProbeNumber=maximum number of probes that a chromosome (or arm of a chromosome) can have to be analyzed. In fact, the procedure of profile estimation needs the computation of an array of length (length(chromosome)+1)*(length(chromosome)+2)/2. To be sure to set this parameter correctly, try to create the array A <- array(1, dim=(maxProbeNumber+1)*(maxProbeNumber+2)/2), before starting with the estimation procedure ##how to apply gBPCR## Before using gBPCR, it is necessary to estimate the density of the estimated log2ratio values (by mBPCR or another segmentation method). A possible function that can be used for this purpose is normalmixEM (in library mixtools). To obtain a good estimation of the density using this function, we suggest to: - eliminate the outliers; - give to the function reasonable guess values for the mean and the variance of the three normal distributions corresponding to loss, normal and gain states. To verify the goodness of the estimation, you can plot the estimated density over the hystogram, by using the functions norMix and dnorMix in library nor1mix. The thresholds for the definition of the prior distribution of the copy number events can be computed using createThr (in the package of gBPCR). -Example 1: ----------- ###import the 250K nsp data of sample NA10851_LOH_20 path <- paste(getwd(), "/data/NA10851_LOH_20.dat",sep='') sample <- importGenomicData(path, NRowSkip=1) ###we select only the data belonging to the first part of chromosome 7 chr1 <- sample$chr chr1[chr1==7][-(1:500)] <- 8 pHetData <- xPrior(typeArray='Affy250Knsp', race='CEU') pHet1 <- array(dim=length(chr1)) pHet1[chr1==7] <- pHetData$pHet[pHetData$chrPHet == 7][1:500] ###we import the estimated parameters of the density of the estimeted log2ratio values load(paste(getwd(),'/data/paramHist20.RData',sep='')) thrHist <- createThr(paramHist) ###estimation of the profile of the part of interest of chromosome 7 results <- estProfileWithGBPCR(snpName=sample$snpName, chr=chr1, position=sample$position, call=sample$call, rawLogratio=sample$rawLogratio, estLogratio=sample$estLogratio, thrHist=thrHist, chrToBeAnalyzed=7, maxProbeNumber=1000, pHet=pHet1, kMax=10) ###plot the estimated profile UPD <- results$estUPD UPD[UPD == 0] <- NA UPD[UPD == 1] <- 0 plot(sample$position[chr1 == 7],sample$rawLogratio[chr1 == 7]) points(sample$position[chr1 == 7],sample$estLogratio[chr1 == 7],col=3,type='l',lwd=3) points(sample$position[chr1 == 7],cna2logCn(results$estCNA[chr1 == 7]),col=2,type='l',lwd=2) points(sample$position[chr1 == 7],UPD[chr1 == 7],col=5,type='l',lwd=2) legend(x="bottomleft",legend=c('mBPCR','CNAs','IBD/UPD'),lty=c(1,1,1),col=c(3,2,5)) -Example 2: ----------- ###for the estimation of the profile of the whole chromosome 7, it is sufficient to use chr=sample$chr, ### the default parameters for pHet and kMax and augment maxProbeNumber ###(notice that it could need long time to run) results <- estProfileWithGBPCR(snpName=sample$snpName, chr=sample$chr, position=sample$position, call=sample$call, rawLogratio=sample$rawLogratio, estLogratio=sample$estLogratio, thrHist=thrHist, chrToBeAnalyzed=7, maxProbeNumber=14000) UPD <- results$estUPD UPD[UPD == 0] <- NA UPD[UPD == 1] <- 0 plot(sample$position[sample$chr == 7],sample$rawLogratio[sample$chr == 7]) points(sample$position[sample$chr == 7],sample$estLogratio[sample$chr == 7],col=3,type='l',lwd=3) points(sample$position[sample$chr == 7],cna2logCn(results$estCNA[sample$chr == 7]),col=2,type='l',lwd=2) points(sample$position[sample$chr == 7],UPD[sample$chr == 7],col=5,type='l',lwd=2) legend(x="bottomleft",legend=c('mBPCR','CNAs','IBD/UPD'),lty=c(1,1,1),col=c(3,2,5)) -Example 3: ----------- #In sample NA10851_UPD_20, the portion of chromosome 7 analized in EXAMPLE 1 has a region of copy-neutral LOH #corresponding to the region of loss found in NA10851_LOH_20 ###import the 250K nsp data of sample NA10851_UPD_20 path <- paste(getwd(), "/data/NA10851_UPD_20.dat",sep='') sample <- importGenomicData(path, NRowSkip=1) ###set the parameters as in EXAMPLE 1 chr1 <- sample$chr chr1[chr1==7][-(1:500)] <- 8 pHetData <- xPrior(typeArray='Affy250Knsp', race='CEU') pHet1 <- array(dim=length(chr1)) pHet1[chr1==7] <- pHetData$pHet[pHetData$chrPHet == 7][1:500] load(paste(getwd(), "/data/paramHist20.RData",sep='')) thrHist <- createThr(paramHist) ###estimation of the profile of the part of interest of chromosome 7 results <- estProfileWithGBPCR(snpName=sample$snpName, chr=chr1, position=sample$position, call=sample$call, rawLogratio=sample$rawLogratio, estLogratio=sample$estLogratio, thrHist=thrHist, chrToBeAnalyzed=7, maxProbeNumber=1000, pHet=pHet1, kMax=10) ###plot the estimated profile UPD <- results$estUPD UPD[UPD == 0] <- NA UPD[UPD == 1] <- 0 plot(sample$position[chr1 == 7],sample$rawLogratio[chr1 == 7]) points(sample$position[chr1 == 7],sample$estLogratio[chr1 == 7],col=3,type='l',lwd=3) points(sample$position[chr1 == 7],cna2logCn(results$estCNA[chr1 == 7]),col=2,type='l',lwd=2) points(sample$position[chr1 == 7],UPD[chr1 == 7],col=5,type='l',lwd=2) legend(x="bottomleft",legend=c('mBPCR','CNAs','IBD/UPD'),lty=c(1,1,1),col=c(3,2,5)) ##suggestions## -If the goal is to estimate the profile of a patient, it is better to estimate the variance of the segment levels (rhoSquare) on a cell line or on a sample with many aberrations and use this value in the profile estimation of the patient. In fact, we need many aberrations to estimate well rhoSquare.