repeating analyses, excluding one case at a time


nina
Hello

I'd like to check the sensitivity of my analyses with regard to single cases. For example, if I were to conduct a simple t-test with 100 cases, which syntax would you recommend to conduct the test for 99 cases with one case excluded, then again with 99 cases but now a different case excluded, and so on?

Best
Nina

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: repeating analyses, excluding one case at a time

Jon Peck
In the case of a t test, the easiest way to get a compact representation of all the leave-one-out results is to formulate the problem as a regression (the group variable is the independent variable) and save the DfBeta variable and perhaps other similar variables.  You can then look at a summary of that variable.

  • DfBeta(s): The difference in beta value is the change in the regression coefficient that results from the exclusion of a particular case. A value is computed for each term in the model, including the constant.

Note also that bootstrapping is already doing something similar in spirit, but it won't show you all the individual results.
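
To make the DfBeta idea concrete, here is a minimal sketch in plain Python (it does not call SPSS or the REGRESSION procedure). It assumes a 0/1 group dummy, in which case the regression slope equals the difference in group means, and it takes DfBeta for the slope as the full-sample slope minus the slope with one case removed. The data and the names `mean_difference` and `dfbeta_slope` are made up for illustration.

```python
from statistics import mean


def mean_difference(values, groups):
    """Slope from regressing values on a 0/1 dummy: mean(group 1) - mean(group 0)."""
    ones = [v for v, g in zip(values, groups) if g == 1]
    zeros = [v for v, g in zip(values, groups) if g == 0]
    return mean(ones) - mean(zeros)


def dfbeta_slope(values, groups):
    """Full-sample slope minus the slope with case i left out, for every case i."""
    full = mean_difference(values, groups)
    return [
        full - mean_difference(values[:i] + values[i + 1:],
                               groups[:i] + groups[i + 1:])
        for i in range(len(values))
    ]


# Hypothetical data: salary in thousands, minority coded 0/1.
salary = [30.0, 32.0, 31.0, 90.0, 28.0, 27.0, 29.0, 26.0]
minority = [0, 0, 0, 0, 1, 1, 1, 1]

changes = dfbeta_slope(salary, minority)
# Index (0-based) of the case whose removal moves the slope the most.
most_influential = max(range(len(changes)), key=lambda i: abs(changes[i]))
print(most_influential, changes[most_influential])
```

One summary of the saved variable, such as its largest absolute value, is then enough to see whether any single case drives the result.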

In the general case, doing all the leave-one-out computations is going to generate a lot of output, but you can do it by using a Python loop.  For example:

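begin program.
import spss

# Number of cases in the active dataset.
n = spss.GetCaseCount()

# Omit each case in turn and rerun the procedure on the remaining cases.
for i in range(1, n + 1):
    # allbut1 is 0 only for case i, so the filter drops exactly that case.
    spss.Submit("""compute allbut1 = $casenum ne %s.
filter by allbut1.""" % i)
    spss.Submit("""
T-TEST GROUPS=minority(0 1)
  /VARIABLES=salary.
""")

# Turn the filter off and remove the helper variable when done.
spss.Submit("""filter off.
delete variables allbut1.""")
end program.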
You would put whatever procedure you want where the T-TEST syntax is in the example.  The code loops, omitting each case in turn and running the specified syntax.  Note that the indentation shown in the example is significant to Python and must be kept.
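
If the goal is a compact summary rather than one block of output per omitted case, the same loop can be sketched in plain Python outside SPSS. This is only an illustration under stated assumptions: it uses the equal-variance (pooled) t statistic, and the data and the names `pooled_t` and `leave_one_out_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(values, groups):
    """Equal-variance two-sample t statistic for group 1 versus group 0."""
    a = [v for v, g in zip(values, groups) if g == 1]
    b = [v for v, g in zip(values, groups) if g == 0]
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
        len(a) + len(b) - 2
    )
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def leave_one_out_t(values, groups):
    """t statistic recomputed with each case omitted in turn."""
    return [
        pooled_t(values[:i] + values[i + 1:], groups[:i] + groups[i + 1:])
        for i in range(len(values))
    ]


# Hypothetical data: salary in thousands, minority coded 0/1.
salary = [30.0, 32.0, 31.0, 90.0, 28.0, 27.0, 29.0, 26.0]
minority = [0, 0, 0, 0, 1, 1, 1, 1]

t_full = pooled_t(salary, minority)
t_loo = leave_one_out_t(salary, minority)
print("full-sample t: %.3f" % t_full)
print("leave-one-out t range: %.3f to %.3f" % (min(t_loo), max(t_loo)))
```

A wide range relative to the full-sample value suggests that at least one case has a strong influence on the test.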

On Sun, May 21, 2017 at 4:25 PM, Nina Lasek <[hidden email]> wrote:
Hello

I'd like to check the sensitivity of my analyses with regard to single cases. For example, if I were to conduct a simple t-test with 100 cases, which syntax would you recommend to conduct the test for 99 cases with one case excluded, then again with 99 cases but now a different case excluded, and so on?

Best
Nina

--
Jon K Peck
[hidden email]

Re: repeating analyses, excluding one case at a time

Rich Ulrich
In reply to this post by nina

A better idea for checking the possible influence of one or more outliers on a t-test is to start with FREQ to get all the z-scores and see how big the largest (absolute) z's are.  Then, maybe, drop those cases. (See how much of the total variance they account for, if you want a guide to "over-influence".)


The biggest z's show which cases are going to have the biggest effect on both the mean and the variance.  And you can see if there is more than one.
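
As a rough sketch of this check in plain Python (not SPSS syntax), the following computes z-scores and each case's share of the total sum of squared deviations, which is one possible reading of "how much of the total variance they account for". The data and the function names `z_scores` and `variance_share` are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize each value using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def variance_share(values):
    """Fraction of the total sum of squared deviations contributed by each case."""
    m = mean(values)
    squared = [(v - m) ** 2 for v in values]
    total = sum(squared)
    return [sq / total for sq in squared]


# Hypothetical data with one suspicious value.
salary = [30.0, 32.0, 31.0, 90.0, 28.0, 27.0, 29.0, 26.0]

z = z_scores(salary)
share = variance_share(salary)
# Index (0-based) of the case with the largest absolute z-score.
largest = max(range(len(z)), key=lambda i: abs(z[i]))
print(largest, round(z[largest], 2), round(share[largest], 2))
```

Sorting cases by absolute z makes it easy to see whether there is one extreme case or several.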


Tukey's formal jack-knife procedure is a bit more complicated than a simple "leave-one-out"; it has an extra computation at each step to estimate a corrected-t.  I do not find even the jack-knife interesting for a t-test, but if I were going to do the tedious part of the work, I think I would generate the statistic that /some/ people might appreciate.
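
For orientation, here is a minimal sketch of the generic jackknife with pseudo-values in plain Python. It is an assumption that this matches the "extra computation" meant above; it is shown for a single-sample statistic rather than a two-group t-test, and the name `jackknife` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def jackknife(values, statistic):
    """Return the jackknife estimate, its standard error, and the pseudo-values."""
    n = len(values)
    full = statistic(values)
    pseudo = []
    for i in range(n):
        # Statistic with case i left out.
        left_out = statistic(values[:i] + values[i + 1:])
        # The extra step beyond plain leave-one-out: form a pseudo-value.
        pseudo.append(n * full - (n - 1) * left_out)
    estimate = mean(pseudo)
    std_error = stdev(pseudo) / sqrt(n)
    return estimate, std_error, pseudo


# Hypothetical data.
salary = [30.0, 32.0, 31.0, 90.0, 28.0, 27.0, 29.0, 26.0]

estimate, std_error, pseudo = jackknife(salary, mean)
print(estimate, std_error)
```

When the statistic is the mean, each pseudo-value equals the original observation, so the jackknife reproduces the usual mean and its standard error. The approach only adds information for less simple statistics.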


--
Rich Ulrich



Re: repeating analyses, excluding one case at a time

Art Kendall
In addition to z-scores, use EXPLORE and check the data entry for what it labels "outliers". YMMV, but think of outliers as values that are suspicious and so should be checked.

I suggest using the procedures for checking anomalous values (EXPLORE, z-scores, visualizations, etc.) as part of the data preparation and validation, before any actual tests are applied to the data.
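
As an illustration of this kind of screening, here is a sketch of the common boxplot rule in plain Python: values more than 1.5 interquartile ranges beyond the quartiles are flagged for checking. EXPLORE's own quartile definition may differ from the one used here, so treat the cutoffs as approximate. The name `boxplot_flags` and the data are hypothetical.

```python
from statistics import quantiles


def boxplot_flags(values, k=1.5):
    """Return the values lying more than k interquartile ranges beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical data with one suspicious value.
salary = [30.0, 32.0, 31.0, 90.0, 28.0, 27.0, 29.0, 26.0]

flagged = boxplot_flags(salary)
print(flagged)
```

Flagged values are candidates for checking against the original records, not automatic deletions.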

P.S. Did you double-enter or proofread your data?
Art Kendall
Social Research Consultants