# False Discovery Benjamin-Hochberg

19 messages

## False Discovery Benjamin-Hochberg

Hi,

Is there any simple way to enter a clump of p-values and have SPSS give Benjamini-Hochberg results, i.e. p-values adjusted to take account of false discovery, and the number of "true" rejections of the null?

I am looking at situations where the number of tests may be in the 1000s. This can be genome data, or it might be simulations where p-value distributions are compared.

Best,
Diana

_______________

Professor Diana Kornbrot, University of Hertfordshire, College Lane, Hatfield, Hertfordshire AL10 9AB, UK
+44 (0) 170 728 4626, +44 (0) 208 444 2081, +44 (0) 7403 18 16 12
[hidden email]
http://dianakornbrot.wordpress.com/
skype: kornbrotme

## Re: False Discovery Benjamin-Hochberg

Hi Diana,

Assuming you want the Benjamini-Hochberg procedure (http://en.wikipedia.org/wiki/False_discovery_rate#Benjamini.E2.80.93Hochberg_procedure), you could just throw the p-values into the data editor, sort them in ascending order, and compute (k/m)*alpha for each one, where k is the rank of the p-value, m is the number of tests, and alpha is your chosen false discovery rate. Then find the largest p-value that satisfies the inequality in the procedure, p(k) <= (k/m)*alpha. That p-value and all smaller ones are then considered significant.

Alex
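The sort-and-compare steps described above can be checked outside SPSS with a few lines of Python. This is only a minimal sketch of the step-up rule, not SPSS syntax and not anyone's posted solution; the function name `bh_reject` and the example p-values are made up for illustration.

```python
def bh_reject(pvalues, alpha=0.05):
    """Return a list of booleans: True where the BH step-up rule rejects the null."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest 1-based rank k with p_(k) <= (k/m) * alpha; stays 0 if none qualifies.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    # Everything ranked at or below k_max is rejected, even if it missed
    # its own per-rank criterion.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Illustrative (made-up) p-values, deliberately unsorted.
pvals = [0.200, 0.031, 0.001, 0.032, 0.030]
flags = bh_reject(pvals, alpha=0.05)
print(sum(flags), "of", len(pvals), "tests rejected")
```

In this example the p-values 0.030 and 0.031 fail their own per-rank criteria but are still rejected, because the fourth-smallest p-value (0.032) meets its criterion. That is the "find the largest p-value that satisfies the inequality" part of the procedure.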

## Re: False Discovery Benjamin-Hochberg

In reply to this post by Kornbrot, Diana (sent Saturday, March 29, 2014 10:24 AM)

This seems like it would be a simple problem. One problem might be getting the p values into a data file. But let's say that is done. As I recall the BH procedure, the p's are sorted in ascending order. It would be a good idea to number the records, which can be done using $casenum. I'm probably not remembering the formula exactly, but I believe the criterion is (irec/m)*.05, where irec is the record number and m is the number of p's. Find the last record whose p is less than or equal to its criterion; that p and all p's before it pass, and all following p's fail. The per-record comparison is this.

    Compute pass=0.
    If (p le (irec/m)*.05) pass=1.

Gene Maguin

## Re: False Discovery Benjamin-Hochberg

In reply to this post by Kornbrot, Diana

http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/FDR

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.

## Re: False Discovery Benjamin-Hochberg

In reply to this post by Alex Reutter (posted 3/30/2014 9:52 PM)

Does this syntax do it? TENTATIVE: needs to be checked by other list members.

    * make up some data.
    new file.
    input program.
       string SomeTestName (a25).
       numeric ObtainedProb (f5.3).
       loop id = 1 to 10000.
          compute SomeTestName = "some descriptive words".
          compute ObtainedProb = (rnd(1000*rv.uniform(0,1))/1000).
          end case.
       end loop.
       end file.
    end input program.
    *.
    * the part below here would be applied to your data.
    *.
    frequencies variables=ObtainedProb
       /format = notable /histogram.
    * rank the p-values in ascending order.
    sort cases by ObtainedProb (A).
    compute RankP = $casenum.
    * add the number of tests (m) to every case.
    aggregate outfile=* mode=addvariables
       /TestsDone = N.
    * BH criterion for the k-th smallest p is (k/m)*alpha, with alpha = .05.
    compute Passes = (ObtainedProb le (RankP/TestsDone)*.05).
    compute PassRank = Passes*RankP.
    * the largest rank that meets its criterion sets the cutoff.
    aggregate outfile=* mode=addvariables
       /LargestPassRank = max(PassRank).
    compute NowSignificant = (RankP le LargestPassRank).
    temporary.
    select if NowSignificant.
    list variables = id SomeTestName ObtainedProb.

Art Kendall
Social Research Consultants

## Re: False Discovery Benjamin-Hochberg

In reply to this post by Bruce Weaver

I just noticed that the syntax on that MRC page uses the "GHB Horrible Hack" method to define a macro variable for the number of cases. Using AGGREGATE (as in the syntax Art K posted) is a far better way to go! I.e., this...

    * Calculate the number of p values.
    RANK PVALS /n into N.
    * N contains the number of cases in the file.
    * make a submacro to be invoked from the syntax.
    DO IF $CASENUM=1.
    WRITE OUTFILE 'C:\temp.sps' /"DEFINE !nbcases()"n"!ENDDEFINE.".
    END IF.
    EXE.
    INCLUDE FILE='C:\temp.sps'.

...could be replaced with a simple AGGREGATE that writes the number of cases to a new variable.

-- Bruce Weaver

## Re: False Discovery Benjamin-Hochberg

Bruce,

Not only that, but the whole thing could/should be done in MATRIX in the first place. It uses MATRIX to do the roundup, so why bother with all that early calculation, which is trivial to do in MATRIX?

OTOH: WTF?

"COMPUTE ccmpmx = pvals LE (CCOMP>0)*CMAX(((CMAX(ccomp) &* (diff LE 0)) EQ ccomp) &* pvals) ."

Given a few minutes of eye bleeding I could sort that, but I'm not going to bother. Could probably be cast in standard syntax, but let sleeping dogs lie, as it were. I write that sort of s#$% on occasion but usually comment the hell out of it if I intend it for public consumption (OPU).

Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me.

"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

## Re: False Discovery Benjamin-Hochberg

In reply to this post by David Marso

Agreed! I retract that link, and suggest that Diana use the one Ryan posted instead.

http://www-01.ibm.com/support/docview.wss?uid=swg21476447

-- Bruce Weaver

## Re: False Discovery Benjamin-Hochberg

In reply to this post by Alex Reutter

Thanks for the help from several SPSS experts.

There is no direct option for Benjamini-Hochberg in any SPSS procedure, nor is there any stand-alone procedure. Several people provided helpful scripts. Many thanks. However, this is inevitably a cumbersome approach.

Consequently, I have produced a simple Excel spreadsheet that takes a column of p-values and returns the number that 'truly' reject the null at any chosen false discovery rate, q.

http://wp.me/PYt7T-6M

See Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289-300. http://www.jstor.org/stable/2346101

Apparently SPSS is considering adding BH to the options available in some procedures [e.g. GLM]. I would also favour a stand-alone procedure. This is because the set of p-values may well come from running a single procedure with a 'by' variable, as in my case. SPSS was far from eager to run logistic with 2 factors, 1 of which had 3219 levels [after 3 hours it was still on iteration 2; I gave up].

Comments welcome.

Best,
Diana
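For anyone who wants to cross-check a count of rejections at a chosen q, here is a minimal Python sketch that computes BH-adjusted p-values and counts how many fall at or below q. It is an independent illustration, not a description of how the spreadsheet above works; the names `bh_adjust` and `count_discoveries` and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down. The adjusted value at rank k is
    # the minimum of p_(j) * m / j over all ranks j >= k, capped at 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_discoveries(pvalues, q):
    """Number of tests whose BH-adjusted p-value is at or below q."""
    return sum(1 for a in bh_adjust(pvalues) if a <= q)


# Illustrative (made-up) p-values.
pvals = [0.200, 0.031, 0.001, 0.032, 0.030]
for p, adj in zip(pvals, bh_adjust(pvals)):
    print(p, round(adj, 4))
print("discoveries at q = 0.05:", count_discoveries(pvals, 0.05))
```

Working with adjusted p-values has the convenience that the same column can be compared against several values of q without re-running the procedure.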

## Re: False Discovery Benjamin-Hochberg

Many of the regression models in SPSS have options to save the parameter estimates, standard errors, and p-values to a separate dataset (see the OUTFILE subcommand for whatever procedure). So, given the listed macros, all you need to do is save the outfile of the estimates, potentially eliminate superfluous rows in that parameter estimate dataset, and then subject the p-values of interest to the FDR procedure. If the regression procedure does not have an OUTFILE subcommand, you could use OMS to do the same thing with a little more work.

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
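The save, filter, then adjust workflow can be sketched in Python, assuming the parameter table has already been exported to CSV. The column names `Parameter` and `Sig`, the row labels, and the numbers are all invented for illustration; they are not the layout of any particular SPSS OUTFILE or OMS table.

```python
import csv
import io

# Hypothetical export of a parameter-estimate table (assumed layout).
EXPORT = """Parameter,Sig
Intercept,0.0001
group_1,0.004
group_2,0.300
group_3,0.020
group_4,0.045
"""


def bh_adjust(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices ordered from the largest p-value to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted, running_min = [0.0] * m, 1.0
    for step, idx in enumerate(order):
        rank = m - step  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Drop the superfluous row (here, the intercept) before adjusting, so that
# m counts only the tests of interest.
rows = [r for r in csv.DictReader(io.StringIO(EXPORT)) if r["Parameter"] != "Intercept"]
pvals = [float(r["Sig"]) for r in rows]
for row, adj in zip(rows, bh_adjust(pvals)):
    print(row["Parameter"], row["Sig"], round(adj, 4))
```

Filtering before adjusting matters because the adjustment depends on m, the number of p-values in the family; leaving in rows that are not of interest changes every adjusted value.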