Fisher's Combined P Method
Fisher’s Combined P Method for detecting expressed genes with Affymetrix expression arrays
Implementation:
Fisher’s Combined P Method has been implemented (Schaffer, personal communication) in R (R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License). The method tests whether the average intensity of a probe set in a sample group is greater than the average intensity of background probes whose GC content is similar to that of the sample probes. The chi-square threshold used to make the expressed/unexpressed calls has been adjusted so that the calls are compatible with the present and absent calls of the Affymetrix Microarray Suite version 5. The Affymetrix algorithm computes the present and absent calls with a Wilcoxon signed-rank test that uses the mismatch probes. The threshold was adjusted so that 97% of the probe sets called absent in every sample of a group are called unexpressed, and 95% of the probe sets called present in every sample of a group are called expressed, by Fisher’s Combined P Method.
Background:
Fisher (1932) proposed a method for combining p-values from independent tests of significance. Others, among them Hess and Iyer (2007), have used this combined p method to detect differentially expressed genes in Affymetrix expression array data. Hess and Iyer used Fisher’s combined p method to combine p-values from probe-level tests of significance. Using three spike-in datasets, they demonstrated that the method successfully selected the differentially expressed genes identified by other current methods. The same algorithm is used by Affymetrix software (Affymetrix Power Tools) to test whether genes are detected above background (DABG).
Fisher’s Combined P Method
Fisher's method combines the p-values from each test (the probability of a result at least as extreme as the one observed, assuming the null hypothesis H0 that both groups are the same is true) into one test statistic (χ²) that has a chi-square distribution, using the formula
$$\chi^2 = -2 \sum_{i=1}^{m} \ln(p_i)$$
where, for a particular probe set, m is the number of probes in the probe set and p_i is the p-value of the test for the i-th probe.
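The p-value for χ² itself can then be interpolated from a chi-square table using 2m degrees of freedom, where m is the number of tests being combined. As in any similar test, H0 is rejected for small p-values; at the α level of significance, H0 is rejected if
$$\chi^2 \ge \chi^2_{1-\alpha,\,2m}$$
where χ²_{1-α, 2m} is the (1 - α) quantile of the chi-square distribution with 2m degrees of freedom.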
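As an illustration of the statistic and its p-value, here is a minimal Python sketch. It is not the R implementation described above. The function names and the example p-values are hypothetical. Because the degrees of freedom (2m) are always even, the chi-square tail probability has a closed form, so the sketch needs only the standard library and no table lookup.

```python
import math


def chi2_sf_even_df(x, m):
    """Upper-tail probability P(X >= x) for X ~ chi-square with 2*m degrees of freedom.

    For even degrees of freedom the tail equals
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, m):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Return (chi-square statistic, combined p-value) for independent p-values."""
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_sf_even_df(statistic, m)


# Hypothetical probe-level p-values for one probe set (illustration only).
probe_p_values = [0.01, 0.04, 0.20, 0.03]
alpha = 0.05

statistic, combined_p = fisher_combined(probe_p_values)
print(f"chi-square = {statistic:.2f} on {2 * len(probe_p_values)} df, p = {combined_p:.2g}")
print("reject H0" if combined_p <= alpha else "do not reject H0")
```

A quick sanity check on the formula: with a single p-value the statistic has 2 degrees of freedom, and the combined p-value equals the input p-value.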
Code details
- Read in the CEL files as an AffyBatch object.
- Quantile normalize the probe level intensities.
- List all the sample names so that they will be factored as separate covariates.
- Process all the GC probe bins and create a matrix of the average log2 normalized signals, probe indices, GC content, standard deviations, and number of probes.
- Read in the previously calculated GC content of all the probes and their indices.
- Bind together the probe information by probe set, including the log2 normalized intensities and GC content for both the array probes and their GC bin probes.
- Calculate the Fisher’s combined p-value for each probe set.
- Determine the expression call using a calibrated chi-square threshold value of 80.
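The steps above can be sketched in a few lines of Python. This is an illustrative outline only, not the R code the steps describe. Several details are assumptions that the text does not specify: the per-probe test (taken here to be a one-sided normal test of each probe's log2 signal against the mean and standard deviation of its GC bin), the use of a strict `>` comparison against the threshold, position-based tie handling in the quantile normalization, and all names and numbers in the example data.

```python
import math

CHI2_THRESHOLD = 80.0  # calibrated threshold quoted in the steps above


def quantile_normalize(arrays):
    """Give every array the same intensity distribution (mean of the sorted values).

    `arrays` is a list of equal-length intensity lists, one list per array.
    Ties are broken by position, which is a simplification.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda idx: a[idx])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def upper_tail_p(log2_signal, bin_mean, bin_sd):
    """ASSUMPTION: one-sided normal test of a probe against its GC-bin background."""
    z = (log2_signal - bin_mean) / bin_sd
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return max(p, 1e-300)  # keep log(p) finite for very bright probes


def call_probe_set(log2_signals, gc_contents, gc_bins, threshold=CHI2_THRESHOLD):
    """Return (chi-square statistic, call) for one probe set."""
    p_values = [upper_tail_p(s, *gc_bins[gc]) for s, gc in zip(log2_signals, gc_contents)]
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    call = "expressed" if statistic > threshold else "unexpressed"
    return statistic, call


# Hypothetical GC bins: GC count -> (mean, standard deviation) of log2 background signal.
gc_bins = {10: (6.0, 0.5), 12: (6.5, 0.6)}

# Toy raw intensities for two arrays, quantile normalized before taking log2.
normalized = quantile_normalize([[2, 4, 6], [3, 9, 5]])
print(normalized)

# Two hypothetical probe sets of 11 probes each, already on the log2 scale.
for name, signal in [("bright", 8.5), ("dim", 6.1)]:
    stat, call = call_probe_set([signal] * 11, [10] * 11, gc_bins)
    print(f"{name}: chi-square = {stat:.1f}, call = {call}")
```

Comparing the statistic directly with the threshold avoids computing a combined p-value for each probe set, which is why a single calibrated chi-square value is sufficient for the call.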
Calibration
The chi-square threshold was adjusted on the quantile-normalized probe intensities such that 97% of the probe sets called absent on a single array are called unexpressed, and 95% of the probe sets called present are called expressed.
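One way to picture this calibration is as a scan over candidate thresholds, keeping those that meet both agreement targets. The Python sketch below is a hypothetical illustration, not the procedure actually used. The function names, the "A"/"P" call labels, the treatment of the targets as minimum fractions, and the example statistics are all assumptions.

```python
def agreement(statistics, mas5_calls, threshold):
    """Return (fraction of absent called unexpressed, fraction of present called expressed).

    `statistics` holds one chi-square statistic per probe set and `mas5_calls`
    the matching reference call, "A" (absent) or "P" (present). Other labels
    are ignored.
    """
    absent = [s for s, c in zip(statistics, mas5_calls) if c == "A"]
    present = [s for s, c in zip(statistics, mas5_calls) if c == "P"]
    if not absent or not present:
        raise ValueError("need at least one absent and one present probe set")
    frac_absent_unexpressed = sum(s <= threshold for s in absent) / len(absent)
    frac_present_expressed = sum(s > threshold for s in present) / len(present)
    return frac_absent_unexpressed, frac_present_expressed


def calibrate(statistics, mas5_calls, candidates, target_absent=0.97, target_present=0.95):
    """Return the candidate thresholds that meet both agreement targets."""
    acceptable = []
    for t in candidates:
        frac_absent, frac_present = agreement(statistics, mas5_calls, t)
        if frac_absent >= target_absent and frac_present >= target_present:
            acceptable.append(t)
    return acceptable


# Hypothetical chi-square statistics with their reference calls (illustration only).
statistics = [10, 20, 30, 40, 100, 150, 200, 90]
mas5_calls = ["A", "A", "A", "A", "P", "P", "P", "P"]

print(calibrate(statistics, mas5_calls, candidates=range(50, 121, 10)))
```

On real data the two targets pull in opposite directions: raising the threshold increases agreement on absent probe sets and lowers agreement on present ones, so the acceptable range may be narrow or empty.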

References
Hess, A. and Iyer, H. (2007) Fisher’s combined p-value for detecting differentially expressed genes using Affymetrix expression arrays. BMC Genomics, 8:96.
Fisher, R. A. (1948) Combining independent tests of significance. American Statistician, 2(5):30.
