Hello :)
First time posting, so forgive me if my description of what I am trying to achieve is unclear.

I need to bootstrap my correlations because age is non-normally distributed (age is also correlated with most variables, so I have controlled for it by regressing each task onto age). I would like to take advantage of all the data available, but some data are missing (e.g., participants were excluded because they didn't understand the task, or they didn't complete it). This means I have a different N on each variable. Of course, when I bootstrap all the correlations in the matrix at once, it uses listwise deletion.

Is it OK to instead bootstrap the correlation for each pair of variables separately and build my own matrix? And then run a multiple regression on these bootstrapped correlations using pairwise deletion (or use them as input to Amos)?

Any insights and advice would be greatly appreciated! Many thanks, Laura
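To make the question concrete, here is a minimal sketch of what "bootstrap one pair with pairwise deletion" could look like outside SPSS. It is only an illustration under stated assumptions: the data are simulated, the names `task_a`, `task_b`, and `bootstrap_pairwise_r` are hypothetical, missing values are coded as `None`, and a simple percentile interval is used. It is not what SPSS or Amos does internally.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation; returns None if either variable has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5


def bootstrap_pairwise_r(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for r, using only cases observed on BOTH variables."""
    # Pairwise deletion: keep a case only if neither value is missing.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        # Resample whole cases (rows) with replacement, never x and y separately.
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        r = pearson_r(*zip(*sample))
        if r is not None:  # skip degenerate resamples with zero variance
            reps.append(r)
    reps.sort()
    lo = reps[int((alpha / 2) * len(reps))]
    hi = reps[int((1 - alpha / 2) * len(reps)) - 1]
    return pearson_r(*zip(*pairs)), (lo, hi), n


# Simulated (made-up) age-residualized task scores with different missingness.
rng = random.Random(1)
task_a = [rng.gauss(0, 1) for _ in range(60)]
task_b = [0.6 * a + rng.gauss(0, 1) for a in task_a]
for i in (3, 10, 25):
    task_a[i] = None
for i in (7, 10, 40, 41):
    task_b[i] = None

r, (lo, hi), n_used = bootstrap_pairwise_r(task_a, task_b)
print(f"r = {r:.3f}, 95% CI [{lo:.3f}, {hi:.3f}], N used = {n_used}")
```

One thing to keep in mind with this approach: a matrix assembled from correlations that are each based on a different subset of cases is not guaranteed to be positive definite, which can cause problems when it is used as input to a regression or SEM program.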

If I follow, the ultimate goal is to estimate a multiple regression model that includes age plus a bunch of other variables, but you are concerned because age is not normally distributed. Right? What does the age distribution look like?
Bear in mind the following points.

1. The normal distribution is just a model, and nothing in nature is truly normal (see mkweb.bcgsc.ca/pointsofsignificance/img/Boxonmaths.pdf). (Nothing in nature is truly linear either.)
2. The key assumptions of OLS linear regression are that the *errors* (not the variables) are independently and identically distributed as normal with a mean of 0 and some variance (sigma^2). And normality of the errors is less important than their independence and homoscedasticity.

With those points in mind, you might want to just fit your regression model and then examine the residuals (e.g., using residual plots). Googling <SPSS regression residual analysis> will likely turn up some good info on how to proceed. HTH.
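As a rough illustration of point 2 (not SPSS syntax), the sketch below simulates a strongly skewed age variable with well-behaved errors, fits a one-predictor OLS model, and runs two crude residual checks. Everything here is an assumption made for the example: the data are simulated, the names `age`, `score`, and `fit_simple_ols` are hypothetical, and the split-half comparison of residual spread is only a stand-in for a proper residual plot.

```python
import random
import statistics


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


rng = random.Random(42)
# Skewed (clearly non-normal) predictor, but normal, constant-variance errors.
age = [18 + rng.expovariate(1 / 10) for _ in range(200)]
score = [5 + 0.3 * a + rng.gauss(0, 2) for a in age]

b0, b1 = fit_simple_ols(age, score)
fitted = [b0 + b1 * a for a in age]
resid = [s - f for s, f in zip(score, fitted)]

# Check 1: residuals from a model with an intercept average to (numerically) zero.
mean_resid = sum(resid) / len(resid)

# Check 2 (crude homoscedasticity check): compare residual spread for the
# lower and upper halves of the fitted values; similar SDs are reassuring.
order = sorted(range(len(fitted)), key=fitted.__getitem__)
half = len(order) // 2
sd_low = statistics.pstdev([resid[i] for i in order[:half]])
sd_high = statistics.pstdev([resid[i] for i in order[half:]])

print(f"slope = {b1:.3f}, mean residual = {mean_resid:.2e}")
print(f"residual SD (low fitted) = {sd_low:.2f}, (high fitted) = {sd_high:.2f}")
```

In this simulation the predictor is far from normal, yet the residuals behave as the model assumes, which is the distinction between the variables and the errors drawn above.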

Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." NOTE: My Hotmail account is not monitored regularly. To send me an email, please use the address shown above. 
Besides Bruce's points, which are all valid, you may have endogenous selection of your complete data for the regression, which could introduce bias in the regression model. However, without more information about that model and the dependent variable, it is impossible to know.

On Wed, May 24, 2017 at 5:52 AM, Bruce Weaver <[hidden email]> wrote:
If I follow, the ultimate goal is to estimate a multiple regression model