
Data Analysis

Fisher's Combined P Method

Fisher’s Combined P Method for detecting expressed genes with Affymetrix expression arrays

 

Implementation:

Fisher’s Combined P Method has been implemented (Schaffer, personal communication) in R (R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License). The method tests whether the average intensity of a probe set in a sample group is greater than the average intensity of background probes whose GC content is similar to that of the probe set's probes. The chi-square threshold used to make the expressed/unexpressed calls has been adjusted so that the calls are compatible with the present and absent calls of the Affymetrix Microarray Suite version 5. The Affymetrix algorithm makes its present and absent calls with a Wilcoxon signed-rank test that compares perfect-match probes with their mismatch probes. The threshold was adjusted so that, with Fisher’s Combined P Method, 97% of the probe sets called absent in every sample of a sample group are called unexpressed, and 95% of the probe sets called present in every sample of a sample group are called expressed.

 

Background:

Fisher (1932) proposed a method for combining p-values from independent tests of significance. Others have used this combined p method, among them Hess and Iyer (2007), who applied it to detect differentially expressed genes in Affymetrix expression array data by combining p-values from probe-level tests of significance. Using three spike-in datasets, they demonstrated that the method successfully selected the differentially expressed genes identified by other current methods. The same algorithm is used by Affymetrix software (Affymetrix Power Tools) to test whether genes are detected above background (DABG).

 

Fisher’s combined P Method

Fisher's method combines the p-values from each test (the probability, assuming the null hypothesis H0 that both groups are the same is true, of a result at least as extreme as the one observed) into one test statistic (χ2) that has a chi-square distribution under H0, using the formula

$$\chi^2 = -2 \sum_{i=1}^{m} \ln(p_i)$$

where, for a particular probe set, m is the number of probes in the probe set and p_i is the p-value of the test for probe i.
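As a brief derivation sketch of why the combined statistic is referred to a chi-square distribution with 2m degrees of freedom: assuming the m probe-level tests are independent and each p-value p_i is uniformly distributed on (0, 1) when H0 is true, each term contributes two degrees of freedom.

```latex
\begin{align*}
p_i \sim \mathrm{Uniform}(0,1)
  &\;\Rightarrow\; \Pr\!\left(-2\ln p_i > x\right)
     = \Pr\!\left(p_i < e^{-x/2}\right) = e^{-x/2}, \qquad x \ge 0, \\
  &\;\Rightarrow\; -2\ln p_i \sim \chi^2_{2}, \\
\chi^2 = -2\sum_{i=1}^{m} \ln p_i
  &\;\sim\; \chi^2_{2m} \quad \text{(sum of $m$ independent $\chi^2_2$ variables).}
\end{align*}
```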

The p-value for χ2 itself can then be obtained from a chi-square table using 2m degrees of freedom, where m is the number of tests being combined. As in any similar test, H0 is rejected at the α level of significance when this p-value is small, that is, if

$$\chi^2 \geq \chi^2_{1-\alpha}(2m)$$

where χ2_{1-α}(2m) is the (1 - α) quantile of the chi-square distribution with 2m degrees of freedom.
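The statistic and its p-value can be illustrated with a minimal Python sketch. This is not the R implementation referred to on this page; the function name `fisher_combined` is hypothetical. It uses only the standard library, relying on the fact that the upper tail of a chi-square distribution with an even number of degrees of freedom (2m) has a closed form.

```python
import math


def fisher_combined(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (chi_square, degrees_of_freedom, combined_p).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    m = len(p_values)
    chi_square = -2.0 * sum(math.log(p) for p in p_values)

    # Upper tail of a chi-square with 2m degrees of freedom:
    #   P(X >= x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    half = chi_square / 2.0
    term = 1.0   # k = 0 term
    total = 1.0
    for k in range(1, m):
        term *= half / k
        total += term
    combined_p = math.exp(-half) * total
    return chi_square, 2 * m, combined_p


if __name__ == "__main__":
    stat, df, p = fisher_combined([0.04, 0.10, 0.20, 0.03])
    print(f"chi-square = {stat:.3f}, df = {df}, combined p = {p:.5f}")
```

H0 would then be rejected at level α when `combined_p` is at most α, which is equivalent to comparing `chi_square` with the corresponding chi-square quantile.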

 

 

Code details

  • Read in the CEL files as an AffyBatch.
  • Quantile normalize the probe level intensities.
  • List all the sample names so that they will be factored as separate covariates.
  • Process all the GC probe bins and create a matrix of the average log2 normalized signals, probe indices, GC content, standard deviations, and number of probes.
  • Read in the previously calculated GC content of all the probes and their indices.
  • Bind together the probe information by probe set, including the log2 normalized intensities and GC content for both the array probes and their GC bin probes.
  • Calculate the Fisher’s combined p-value for each probe set.
  • Determine the Expression call using a calibrated chi-squared threshold value of 80.
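The last steps of the list above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the R code itself: the page does not say which probe-level test produces each p-value, so the sketch assumes a one-sided normal test of each probe against the mean and standard deviation of the background probes in its GC bin. It also assumes that intensities are already quantile normalized and log2 transformed, and that a probe set is called expressed when its chi-square statistic exceeds the threshold of 80. All function names and data layouts are hypothetical.

```python
import math
import sys
from statistics import NormalDist, mean, stdev

CHI_SQUARE_THRESHOLD = 80.0  # calibrated threshold quoted in the text


def gc_bin_stats(background):
    """Summarise background probes per GC bin.

    background: dict mapping GC count -> list of log2 normalized intensities.
    Returns dict mapping GC count -> (mean, standard deviation, probe count).
    """
    return {gc: (mean(v), stdev(v), len(v)) for gc, v in background.items()}


def probe_p_value(log2_intensity, gc_count, bins):
    """ASSUMPTION: one-sided normal test of a probe against its GC bin."""
    mu, sd, _count = bins[gc_count]
    # P(background > x), computed as the lower tail at the mirrored point,
    # which keeps precision for intensities far above the bin mean.
    p = NormalDist(mu, sd).cdf(2.0 * mu - log2_intensity)
    return max(p, sys.float_info.min)  # avoid log(0) on underflow


def call_probe_set(probes, bins, threshold=CHI_SQUARE_THRESHOLD):
    """Return (chi_square, call) for one probe set.

    probes: list of (log2_intensity, gc_count) pairs, one per probe.
    ASSUMPTION: 'expressed' means the statistic is strictly above threshold.
    """
    chi_square = -2.0 * sum(
        math.log(probe_p_value(x, gc, bins)) for x, gc in probes
    )
    call = "expressed" if chi_square > threshold else "unexpressed"
    return chi_square, call


if __name__ == "__main__":
    bins = gc_bin_stats({10: [6.0, 7.0, 8.0], 12: [7.5, 8.0, 8.5, 9.0]})
    probe_set = [(10.2, 10), (9.8, 10), (11.0, 12), (10.5, 12)]
    print(call_probe_set(probe_set, bins))
```

Note that the call compares the chi-square statistic itself with the fixed threshold, rather than converting it to a p-value first, which is how the text describes the calibrated cut-off.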

Calibration

The chi-square threshold was calibrated on the quantile-normalized probe intensities so that 97% of the probe sets called absent on a single array are called unexpressed, and 95% of the probe sets called present are called expressed.
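One way such a calibration could be carried out is sketched below in Python. This is an assumption about the procedure, not a description of the code that was actually used: it scans candidate thresholds and keeps those for which both agreement targets are met. The function names, the "P"/"A" call labels, and the input layout are hypothetical.

```python
def agreement(stats, reference_calls, threshold):
    """Fraction of absent sets called unexpressed, and present sets called expressed.

    stats: chi-square statistic per probe set.
    reference_calls: "P" (present) or "A" (absent) per probe set.
    """
    absent = [s for s, c in zip(stats, reference_calls) if c == "A"]
    present = [s for s, c in zip(stats, reference_calls) if c == "P"]
    if not absent or not present:
        raise ValueError("need at least one absent and one present probe set")
    frac_absent_unexpressed = sum(s <= threshold for s in absent) / len(absent)
    frac_present_expressed = sum(s > threshold for s in present) / len(present)
    return frac_absent_unexpressed, frac_present_expressed


def calibrate(stats, reference_calls, candidates,
              min_absent=0.97, min_present=0.95):
    """Return the candidate thresholds that satisfy both agreement targets."""
    accepted = []
    for t in candidates:
        frac_a, frac_p = agreement(stats, reference_calls, t)
        if frac_a >= min_absent and frac_p >= min_present:
            accepted.append(t)
    return accepted


if __name__ == "__main__":
    stats = [10.0, 20.0, 30.0, 100.0, 120.0, 150.0]
    calls = ["A", "A", "A", "P", "P", "P"]
    print(calibrate(stats, calls, candidates=[5, 50, 80, 130]))
```

With real data, the accepted range may be narrow or empty, in which case the two targets would have to be traded off against each other.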

 

[Figure: calibration graph]

References

Hess, A. and Iyer, H. (2007) Fisher’s combined p-value for detecting differentially expressed genes using Affymetrix expression arrays. BMC Genomics, 8:96.

Fisher, R. A. (1948) Combining independent tests of significance. American Statistician, 2(5):30.

 

Updated 1/7/2008