You are welcome, but I'd rather keep the thread open to everyone, not
Since I was using Stata before, I am quite new to SPSS.
I hope that you would not mind helping me adapt your code to fit my purpose.
I want to draw 100 samples
I think that 100 bootstrap samples are too few.
from my data file named bootsstripdata.sav (it contains 21 variables and N=7475)
Run the following LOGISTIC REGRESSION on the 100 samples, then calculate the
average areas under the ROC curve and the standard deviation of ROC.
Then you are NOT bootstrapping the AUC, but the logistic regression
model itself. What do you really want to do?
A) Get 100 different fitted regression models (different coefficients,
pseudo R-square measures, goodness of fit... plus 100 different AUCs).
I see that you are using a stepwise method to get the model. Please,
search the archived messages, this topic has been discussed several
times and the general idea is that this is a VERY BAD way of building a
model (at the very least, you should use FSTEP(LR) rather than
FSTEP(WALD); it is not quite so awful). See Scott Millis' answer to a
message named "Multiple Regression with Continuous and Categorical
Variables" (July 24th) for a good collection of reasons for not using
stepwise methods. Besides, in
your case, you might end up with different variables, with different
coefficients, being included. What's the use of getting an average AUC
for different models?
B) Get one single model and bootstrap its AUC.
The second approach needs very little modification of my macro. The
first approach will make it useless; a very different approach would
then be needed: build a file with the 100 bootstrapped samples, run
logistic regression splitting the file by sample, use OMS to capture the
relevant info and save it to a new dataset, then open that dataset and
work with it.
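To make the logic of that first approach concrete, here is a minimal sketch in Python rather than SPSS syntax. It is not my macro and not an OMS script: it only illustrates stacking the k bootstrap samples into one long file with a sample identifier and then running the same analysis once per sample. The names stacked_bootstrap, per_sample and fit are hypothetical, and the "fit" used here is a stand-in (the event proportion) for whatever model fitting you would actually do.

```python
import random
from collections import defaultdict


def stacked_bootstrap(rows, k=100, seed=0):
    """Return k bootstrap samples stacked in one list of (sample_id, row).

    Each sample has len(rows) cases drawn with replacement, mirroring a
    long file that would later be split by sample_id.
    """
    rng = random.Random(seed)
    n = len(rows)
    stacked = []
    for sample_id in range(1, k + 1):
        for _ in range(n):
            stacked.append((sample_id, rows[rng.randrange(n)]))
    return stacked


def per_sample(stacked, fit):
    """Apply `fit` separately to each bootstrap sample (the 'split file' step)."""
    groups = defaultdict(list)
    for sample_id, row in stacked:
        groups[sample_id].append(row)
    return {sid: fit(cases) for sid, cases in sorted(groups.items())}


# Toy data: (event indicator, one predictor). Purely illustrative values.
rows = [(1, 0.2), (0, 0.7), (1, 0.9), (0, 0.1)]
stacked = stacked_bootstrap(rows, k=3, seed=1)

# Placeholder analysis: the event proportion in each bootstrap sample.
results = per_sample(stacked, fit=lambda cases: sum(y for y, _ in cases) / len(cases))
```

In a real analysis, fit would refit the whole model in each sample, which is exactly why the resulting AUCs belong to possibly different models.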
If you want to use the second approach, then this is what you have to do:
1) Run LOGISTIC REGRESSION and save the predicted probabilities (but
please, build the model in a more sensible way than the one you are
using, see my July 24th reply to the same message named "Multiple
Regression with Continuous and Categorical Variables" for some
guidelines in model building strategies, extracted from
Hosmer & Lemeshow's book on logistic regression).
A variable called PRE_1 will be added to your dataset.
2) Run the MACRO:
BOOTROC PRE_1 StatusatdisN(1) k=100. /* (Assuming that StatusatdisN=1 is the event).
You will get bootstrap estimates for the AUC of the unique model you
have built. On second thoughts, why do you want to use bootstrap to get
the AUC and its standard error? Given the sample size you mention (over
7000), asymptotic methods for the SE(AUC) will be OK. You can use the
SPSS ROC command:
ROC PRE_1 BY StatusatdisN(1)
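
For anyone who wants to see what "bootstrapping the AUC of one fixed model" amounts to, here is a minimal Python sketch (not SPSS syntax, and not the BOOTROC macro itself). It assumes you already have the saved predicted probabilities and a 0/1 outcome; it resamples cases with replacement, recomputes the AUC each time, and reports the mean and standard deviation of those AUCs. For comparison it also includes the Hanley-McNeil (1982) approximation as one example of an asymptotic SE; I am not claiming that this is the formula the SPSS ROC command uses. The function names are hypothetical.

```python
import math
import random


def auc(scores, labels):
    """AUC via the Mann-Whitney rank-sum; tied scores get midranks."""
    n = len(scores)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = midrank
        i = j + 1
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc(scores, labels, k=100, seed=0):
    """Mean and SD of the AUC over k case-resampled bootstrap samples."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < k:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # redraw if a resample lacks events or non-events
            estimates.append(auc([scores[i] for i in idx], ys))
    mean = sum(estimates) / k
    sd = math.sqrt(sum((a - mean) ** 2 for a in estimates) / (k - 1))
    return mean, sd


def hanley_mcneil_se(a, n_pos, n_neg):
    """Hanley-McNeil (1982) approximate standard error of an AUC `a`."""
    q1 = a / (2 - a)
    q2 = 2 * a * a / (1 + a)
    var = (a * (1 - a) + (n_pos - 1) * (q1 - a * a)
           + (n_neg - 1) * (q2 - a * a)) / (n_pos * n_neg)
    return math.sqrt(var)
```

With a sample as large as the one mentioned, the bootstrap SD and the asymptotic SE would normally be expected to be close, which is the point of the remark above.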
I am trying to use bootstrap resampling techniques to calculate the
average areas under the ROC curve and the standard deviation of ROC
curve predicted by logistic regression.
Has anyone written or does anyone know any algorithm to resample by
bootstrap or calculate the average and standard deviation of ROC
I'm developing a clinical scoring system and need to validate it internally. I'm curious how to perform a bootstrapped ROC analysis in SPSS. You have posted this code: BOOTROC PRE_1 StatusatdisN(1) k=100. /* (Assuming that StatusatdisN =1 is the event). But it needs the macro to be run first. Since I'm not good at SPSS, could you post the complete code showing how to produce a ROC curve by resampling my scoring system? I would appreciate that!