Executing Discriminant on different set of questions

Executing Discriminant on different set of questions

Arora, Manoj (IMDLR)

Hi All,

 

I have around 60 variables and I want to run discriminant analysis on every 5-variable combination (60 C 5). That is, I want syntax that runs DISCRIMINANT with a different set of variables each time. For example:

 

I have variables v1 to v60. The first run should use v1, v2, v3, v4, v5, the next v1, v2, v3, v4, v6, and so on, so each run changes the variable set and produces its own output.

 

 

Thanks & Regards
 
Manoj Kumar



Kantar Disclaimer ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Re: Executing Discriminant on different set of questions

Art Kendall
This sounds like an unusual thing to do.  I have been doing statistical consulting since 1972.
Reading between the lines it may be that you are trying to do some sort of stepwise approach.

Please explain in more detail what your situation is, how many cases you have, what your variables are on both sides, etc.

Are your variables scale items?

How many groups of cases do you have?

Art Kendall
Social Research Consultants
Re: Executing Discriminant on different set of questions

David Marso
Administrator
In reply to this post by Arora, Manoj (IMDLR)
OK:  First off DISCRIM takes a grouping variable and a set of typically continuous discriminants.

What is the point of doing this all possible sub-subsets business?
There is nothing built into SPSS to generate these combinations but feel free to roll your own if you have sufficient grasp of algorithms to carry out the construction of the desired combinations.
If I were to do this I would opt to use the MATRIX language rather than a buttload of LOOP LOOP IF blah blah blah blah type crap.  IIRC some time ago we had a thread for dealing with APSS type simulations.
Meanwhile, best of luck.  You have some deep reading ahead of you.  Happy googling...

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Executing Discriminant on different set of questions

Mike
I don't completely understand what the OP is trying to do but
it sounds like he wants to do something like "all or best
subsets regression analysis" where I think the OP wants the
combination of 5 predictors out of the 60 that maximally
discriminate the groups or predicts the group membership.

The OP is unclear about how many groups are being
"discriminated" -- if only two, then a form of logistic regression
might suffice, if more, then a form of multinomial regression
might work.  The old BMDP series had the program "9R" or
"All Possible Subsets Regression" but even if this were available
the 1992 version was limited to 25 predictors in the equation.

Perhaps Jon Peck or someone else can provide more information
about whether a similar analysis could be done through SPSS'
SIMPLAN procedure -- a paper by Oshima & Dell-Ross provides
an example using "Automatic Linear Modeling" through SIMPLAN
which might be useful; it can be obtained here:
http://digitalcommons.georgiasouthern.edu/gera/2016/2016/1?utm_source=digitalcommons.georgiasouthern.edu%2Fgera%2F2016%2F2016%2F1&utm_medium=PDF&utm_campaign=PDFCoverPages

However, it is unclear to me whether this procedure would
produce all of the 5-predictor combination models using 60 predictors.

Raynald Levesque provided syntax for a version of all possible
regression using SPSS 2001 capabilities -- it may or may not
be able to do what the OP wants but it might be worth taking
a look at; see:
http://spsstools.net/en/syntax/syntax-index/regression-repeated-measures/do-all-subsets-regressions/

Again, I think the OP needs to be clearer on what the goal of
the analysis is:  is it equations for all 5 variable combinations
or the "best" combination of 5 variables or something else.

-Mike Palij
New York University
[hidden email]



Re: Executing Discriminant on different set of questions

David Marso
Administrator
In reply to this post by Arora, Manoj (IMDLR)
I think you are thinking about digging yourself a GIANT hole and will end up burning up your computer in the process.
Here is some rather crude logic for generating ALL possible selections of 4 items from 60.
We end up with a file containing 487,635 such combinations, which tallies with https://www.calculatorsoup.com/calculators/discretemathematics/combinations.php

I could go on and do the C(60,5) but NO.  
If you have half a brain then you can build on what I already coded by simple copy/paste extrapolate.
If you don't understand the code then there is the FM with full documentation of the MATRIX language.

If you take it to the next step you end up with C(60,5)=5,461,512.

C(60,2)= 1,770
C(60,3)=34,220
C(60,4)=487,635
C(60,5)=5,461,512
C(60,6)=50,063,860
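
The counts above can be cross-checked independently of SPSS. A minimal sketch in Python (not part of the original post), using the standard library's `math.comb`:

```python
from math import comb

# Number of ways to choose k variables out of 60, for k = 2..6.
counts = {k: comb(60, k) for k in range(2, 7)}

for k, n in counts.items():
    print(f"C(60,{k}) = {n:,}")
```

The jump from k=4 to k=5 is roughly elevenfold, which is why the poster stops at 4.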


* REALLY FUGLY CODE ALERT.
* Step 1: build all 1,770 pairs from 1 to 60, then extend each pair to every triple.
NEW FILE.
DATASET CLOSE ALL.
PRESERVE.
SET MXLOOPS=1000000.
MATRIX.
COMPUTE Base= {1:60}.
COMPUTE pairs=MAKE(60*59/2,2,0).
COMPUTE #=1.
LOOP i=1 TO 59.
LOOP j=i+1 TO 60.
COMPUTE pairs(#,:)={i,j}.
COMPUTE #=#+1.
END LOOP.
END LOOP.
LOOP row=1 TO NROW(Pairs).
LOOP j=Pairs(row,2)+1 TO 60.
SAVE {Pairs(row,:),j} /OUTFILE * /VARIABLES x1 x2 x3.
END LOOP.
END LOOP.
END MATRIX.

* Step 2: read the triples back and extend each one to every 4-combination.
MATRIX.
GET Triples/FILE * / VARIABLES x1 x2 x3.
LOOP row=1 TO NROW(Triples).
LOOP j=Triples(row,3)+1 TO 60.
SAVE {Triples(row,:),j} /OUTFILE * /VARIABLES x1 x2 x3 x4.
END LOOP.
END LOOP.
END MATRIX.
RESTORE.
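
For readers who find the MATRIX loops hard to follow, the same extension idea can be sketched in Python. This is only an illustration, not something the original poster supplied; the function name `extend` and the small `n` are assumptions made for the example. Each sorted k-combination is extended by every index larger than its last element, so each (k+1)-combination is produced exactly once.

```python
from math import comb

def extend(combos, n):
    """Append every index greater than the last element (up to n) to each combination."""
    return [c + (j,) for c in combos for j in range(c[-1] + 1, n + 1)]

n = 10  # small n so the example runs instantly; the thread uses 60
pairs = [(i, j) for i in range(1, n) for j in range(i + 1, n + 1)]
triples = extend(pairs, n)
quads = extend(triples, n)

# The list lengths should match the closed-form counts C(n,2), C(n,3), C(n,4).
print(len(pairs), len(triples), len(quads))
print(comb(n, 2), comb(n, 3), comb(n, 4))
```

With n = 60 the same chain would build the 487,635 four-item combinations held in memory, so a lazy generator is probably the better choice at that scale.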
Re: Executing Discriminant on different set of questions

Jon Peck
In reply to this post by Mike
I wasn't aware of the paper Mike cites, but on a quick look, I didn't find a reference to SIMPLAN.  Is this the right paper and I just missed the reference?

I should point out that DISCRIMINANT does provide stepwise variable selection, with a choice of several selection methods.

It would be easy to generate DISCRIMINANT commands with all the variations using a little Python code, but that would be a large mess of output to sort through.  There are over 5,000,000 ways to choose 5 variables out of 60.  I don't know how long that would take.  With a little more effort, the best n models could be selected, and Python external mode would speed that up, but it would still take a long time.
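
A minimal sketch of what such command generation might look like follows. It is not code from the thread: the grouping variable name `grp` and its code range `(1 2)` are placeholders, since the original poster never says what the grouping variable is or how many groups it has, and the function name is made up for illustration. The commands are produced lazily so that the 5,461,512 strings are never held in memory at once.

```python
from itertools import combinations, islice

def discriminant_commands(variables, k, group_var="grp", group_range=(1, 2)):
    """Lazily yield one DISCRIMINANT command per k-variable subset.

    group_var and group_range are placeholders: the thread never says
    what the grouping variable is or how many groups it has.
    """
    lo, hi = group_range
    for subset in combinations(variables, k):
        yield (
            f"DISCRIMINANT /GROUPS={group_var}({lo} {hi}) "
            f"/VARIABLES={' '.join(subset)} /ANALYSIS ALL."
        )

variables = [f"v{i}" for i in range(1, 61)]

# Only look at the first two commands; the full run would be enormous.
first_two = list(islice(discriminant_commands(variables, 5), 2))
for cmd in first_two:
    print(cmd)
```

The first two commands cover v1 to v5 and then v1, v2, v3, v4, v6, matching the order the original poster described. Running them all, and reducing the output to something readable, is the hard part and is not attempted here.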




--
Jon K Peck
[hidden email]

Re: Executing Discriminant on different set of questions

Mike

Hi Jon,
 
The paper I cite uses the menu system to assess "Automatic
Linear Modeling" within SIMPLAN.  If one examines the
syntax manual (I have ver 23 handy) on page 1754 in the
Simplan section is a list of models that can be estimated and
among them is "Automatic Linear Models" though the syntax
to implement this analysis is remarkably unclear -- which may
explain why the menu system was used in the paper. 
 
Since I posted my message earlier today, I have found
two different ways of getting all/best subset regression:
 
(1) Huberty who had done work in this area uses a Fortran
program by McCabe to obtain a best subset of variables in
discriminant analysis.  He describes McCabe's program in
the following book:
 
Huberty, C. J., & Olejnik, S. (2006). Applied MANOVA
and discriminant analysis (Vol. 498). John Wiley & Sons.
 
In Chapter 6, "Ordering and Deleting Variables", Huberty provides
examples of analyses using the McCabe program.  The book is
on books.google.com in preview mode (many pages are
available for viewing) and can be accessed here:
 
The problem with the McCabe program is that though there
is a Windows version it is somewhat difficult to locate.  The Wiley
website, where the book can be downloaded if your institution
has the right subscription deal with Wiley, does not have either the
McCabe program or the datasets that are used in the book (the
authors imply that both should be available).  McCabe has several
papers that provide the code for the procedure but it is kind of
difficult to locate the appropriate sources.  McCabe appears to be
at Purdue and perhaps one can contact him directly about this
program or an updated version.
 
(2) Alternatively, Darlington and Hayes have updated Darlington's
regression textbook and they have a section on all subsets regression
through the use of a macro called RLM which is available for both
SPSS and SAS.  The text is also available on books.google.com and
can be accessed here:
 
RLM is covered in Chapter 8 and Figure 8.2  provides output from
SPSS RLM for the best models based on the criterion one has chosen.
Appendix A goes into more detail about the RLM macro.  The ref for
the book is:
 
Darlington, R. B., & Hayes, A. F. (2016). Regression analysis
and linear models: Concepts, applications, and implementation.
New York, NY: Guilford Publications.
 
And the book's webpage on Guilford is here:
 
If I am not mistaken, Andrew Hayes (2nd author) sometimes chimes in
on this list.
 
However, as Dave Marso has mentioned in a previous post, the problem
as stated seems to be somewhat ridiculous and we still don't know how
many groups are to be discriminated.  So, there is an opportunity for some
to learn a "new"/old procedure.
 
R fans will find several ways of getting best subsets using different methods; see:
 
Again, until the OP's problem is more clearly stated, it is not clear what the most
appropriate (or even possible) analysis is.
 
-Mike Palij
New York University
 
Re: Executing Discriminant on different set of questions

David Marso
Administrator
FWIW:
 Here is somewhat cleaner, more general code (dangerous in the wrong hands).
Presumably someone with sufficient knowledge could build the necessary MATRIX materials and use the resulting file to control the SWEEP operations central to the DISCRIM algorithms.
So here goes.  If you shoot yourself in the foot don't come begging me for a bandaid.
I'm more than likely to shoot you in the other foot and watch you hop away.

NEW FILE.
DEFINE !Pop (!POS !CHAREND ("/") /!POS !CMDEND)
MATRIX.
GET data /FILE * / VARIABLES !1.
LOOP row=1 TO NROW(data).
LOOP j=data(row,NCOL(data))+1 TO 60.
SAVE {data(row,:),j} /OUTFILE * /VARIABLES !2.
END LOOP.
END LOOP.
END MATRIX.
!ENDDEFINE.

DEFINE !Head (!POS !CMDEND )
DATASET CLOSE ALL.
PRESERVE.
SET MXLOOPS=100000000.
MATRIX.
LOOP i=1 TO !1-1.
LOOP j=i+1 TO !1.
SAVE {i,j} /OUTFILE * /VARIABLES x1 x2 .
END LOOP.
END LOOP.
END MATRIX.
!ENDDEFINE .


!head 60 .
!pop x1 x2          /x1 x2 x3 .
!pop x1 x2 x3       /x1 x2 x3 x4.
!pop x1 x2 x3 x4    /x1 x2 x3 x4 x5.

RESTORE.

FREQUENCIES X1.
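
The closing FREQUENCIES X1 presumably serves as a sanity check on the generated file. Assuming the macros produce every sorted 5-combination of 1 to 60 (which is what the code appears to do), the expected count for each value of x1 can be worked out directly: a combination whose smallest index is i needs 4 more indices drawn from the 60 - i larger ones. A small Python sketch of that arithmetic, offered as an illustration rather than as part of the original post:

```python
from math import comb

n, k = 60, 5

# Rows with x1 == i: choose the remaining k-1 indices from the n-i values above i.
# x1 can be at most n-k+1 (here 56), otherwise too few larger indices remain.
expected = {i: comb(n - i, k - 1) for i in range(1, n - k + 2)}

print("rows with x1 = 1: ", expected[1])
print("rows with x1 = 56:", expected[n - k + 1])
print("total rows:       ", sum(expected.values()))
```

If the frequency table for x1 departs from these counts, or the total is not C(60,5), the generated file is incomplete or contains duplicates.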
Re: Executing Discriminant on different set of questions

Rich Ulrich
In reply to this post by Arora, Manoj (IMDLR)

I have not seen it mentioned explicitly, but "best subset" regression (or discrimination) has no supporters, as a strategy, among statisticians in the areas that I know anything about. It is considered, most likely, as a step worse than "stepwise."

For example, there is a large group of journals in psychology which will automatically challenge you very firmly if you try to use stepwise.  I would say that wise reviewers /might/ be able to justify the use of one of them, mainly as an exercise for achieving conciseness given a set of excellent-but-highly-redundant predictors. But I expect that the general approach by reviewers at those journals is probably a flat, unarguable No.  And I, too, would probably say that to someone starting with 60 variables.

 - "Stepwise" is widely disparaged for its misleading nominal tests and widespread misinterpretation of results by abusers; and "best subset" seems to take those faults a step further.  I expect you can still Google [Frank Harrell stepwise] to get a good summary of problems with stepwise.  (I have been referring people to those ever since he posted them in one of the Usenet .stats groups, umpteen years ago.)

On the other hand, I do remember, from maybe 25 years ago, reading about strategies for rationally trimming the sets that have to be tested.  Based on univariate correlations and intercorrelations, and assuming a few variables do well, it can be possible to eliminate (even) a large fraction of the remaining variables without testing them at all.


--

Rich Ulrich



Re: Executing Discriminant on different set of questions

Mike
Hi Rich,
 
I don't want to start a religious war but maybe you should read
Chapter 8 "Assessing the Importance of Regressors" in Darlington and
Hayes' "Regression Analysis and Linear Models".  After identifying the
types of problems associated with stepwise procedures in Chapter 7,
they review how Dominance theory as developed by Budescu (a search
on Google Scholar will turn up relevant publications) can be used to
identify the relative importance of regressors in an equation.  Dominance
analysis uses all subsets regression results in order to determine which
subset is better/dominates other subsets.  This is what Darlington and Hayes'
RLM macro for SPSS and SAS does.  This is somewhat different from
traditional all/best subsets regression but it does show how all subsets
regression can be useful.  People who dismiss all subsets regression without
understanding if it is being used in this fashion are allowing their biases to
blind them to new developments.  If you have questions about this you
might want to contact Andrew Hayes about how this works and is
implemented in SPSS.
 
HTH.
 
-Mike Palij
New York University
 
 
 
Re: Executing Discriminant on different set of questions

Bruce Weaver
Administrator
In reply to this post by Rich Ulrich
One of Stata's FAQs summarizes the old Usenet discussions Rich is talking about.  It even mentions your name, Rich.

http://www.stata.com/support/faqs/statistics/stepwise-regression-problems/


Rich Ulrich wrote
I have not seen it mentioned explicitly, but "best subset" regression (or discrimination) has no

supporters, as a strategy, among statisticians in the areas that I know anything about. It is

considered, most likely, as a step worse than "stepwise."


For example, there are a large group of journals in psychology which will automatically challenge you

very firmly if you try to use stepwise.  I would say that wise reviewers /might/ be able to justify the

use of one of them, mainly as an exercise for achieving conciseness given a set of excellent-but-highly-

redundant predictors. But I expect that the general approach by reviewers at those journals is probably a flat,

unarguable No.  And I, too, would probably say that to someone starting with 60 variables.


 - "Stepwise" is widely disparaged for its misleading nominal tests and widespread misinterpretation

of results by abusers; and "best subset" seems to take those faults a step further.  I expect you can

still Google [Frank Harrell stepwise] to get a good summary of problems with stepwise.  (I have been

referring people to those ever since he posted them in one of the Usenet .stats groups, umpteen

years ago.)


On the other hand, I do remember, from maybe 25 years ago, reading about strategies for rationally

trimming the sets that have to be tested.  Based on univariate correlations and intercorrelations and assuming

a few variables do well, then it can be possible to eliminate (even) a large fraction of the remaining variables

without testing them at all.


--

Rich Ulrich


________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Arora, Manoj (IMDLR) <[hidden email]>
Sent: Thursday, June 1, 2017 8:09:33 AM
To: [hidden email]
Subject: Executing Discriminant on different set of questions

Hi All,

I have around 60 variables and I want to run discriminant analysis based on 60 C 5 combination means I want syntax run discriminant with different set of variables eg

I have variables v1 to v60, first it should run on v1, v2, v3, v4, v5 then on v1, v2, v3, v4, v6 and so on. So each time it should change the variable and run the output.


Thanks & Regards

Manoj Kumar


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

Re: Executing Discriminant on different set of questions

Art Kendall
In reply to this post by Rich Ulrich
I tend to think of all possible regressions as carrying the riskiness of stepwise, itself very risky, raised to some power.  (I distinguish strongly between stepwise approaches and stepped approaches.)

Until we hear from the OP about the nature of the actual problem, the number of groups, the substantive meaning of the grouping variable, the substantive nature of the 60 discriminating variables, the number of cases, the goals of the effort, etc., we can only speculate about the many possible ways to shoot oneself in the foot.

For example, 60 choose 5 comes to 5,461,512 subsets, so it would take many, many multiples of 5 million cases to fit that many models.
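
To make the scale concrete, here is a small Python sketch (not SPSS syntax, and not something anyone in this thread posted) that counts the 5-variable subsets of 60 variables and lists the first two. The names v1 to v60 come from the original post; everything else is illustrative.

```python
import itertools
import math

# Variable names v1..v60, as described in the original post.
variables = [f"v{i}" for i in range(1, 61)]

# Number of distinct 5-variable subsets of 60 variables ("60 C 5").
n_subsets = math.comb(len(variables), 5)
print(n_subsets)  # 5461512

# itertools.combinations yields subsets lazily, in lexicographic order,
# so the millions of tuples never need to sit in memory at once.
first_two = list(itertools.islice(itertools.combinations(variables, 5), 2))
print(first_two)
```

The first two subsets come out in the order the OP asked for (v1 to v5, then v1 to v4 plus v6), but the count shows why fitting one model per subset is not a sensible plan.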
Art Kendall
Social Research Consultants

Re: Executing Discriminant on different set of questions

Mike
Art,

I think the validity of "all subsets regression,"
especially in the context of a dominance theory approach,
should probably be kept separate from the original
problem of selecting 5 predictors out of a group of 60,
which strikes me as very peculiar.
A lot more information is needed to understand why one
would want to do this and why the 60 variables are
not reduced to a smaller number of constructs, given
that they are probably correlated.

-Mike Palij
New York University
[hidden email]


On Thursday, June 08, 2017 8:39 AM Art Kendall wrote:

> I tend to think of all possible regressions as having a sort of powered
> measure of riskiness over the very risky stepwise.  (Strongly distinguishing
> stepwise approaches from stepped approaches.)
>
> Until we hear from the OP about the nature of the actual problem, # of
> groups, the substantive meaning of the grouping variable, the substantive
> nature of the 60 discriminating variables, the number of cases, the goals of
> the effort, etc., etc. we can only speculate about the many possible ways to
> shoot oneself in the foot.
>
> For example it would take many many multiples of 5 million cases to fit that
> many models.


Re: Executing Discriminant on different set of questions

Art Kendall
Agreed.  From the wording of the OP's query, I guessed that the OP was not in a position to use very esoteric or advanced techniques.
Art Kendall
Social Research Consultants

Re: Executing Discriminant on different set of questions

Bruce Weaver
Administrator
In reply to this post by Mike
Here's an article that I imagine covers pretty much the same ground as the chapter in Darlington & Hayes.

https://www.researchgate.net/profile/David_Budescu/publication/10608588_The_Dominance_Analysis_Approach_for_Comparing_Predictors_in_Multiple_Regression/links/0c960527085ed14e21000000.pdf

Here's another one that discusses applying dominance analysis to logistic regression.

https://www.researchgate.net/profile/Razia_Azen/publication/215446081_Using_Dominance_Analysis_to_Determine_Predictor_Importance_in_Logistic_Regression/links/5745cfe008ae9ace8424315e.pdf


Mike wrote
Hi Rich,

I don't want to start a religious war, but maybe you should read
Chapter 8, "Assessing the Importance of Regressors," in Darlington and
Hayes' "Regression Analysis and Linear Models."  After identifying the
types of problems associated with stepwise procedures in Chapter 7,
they review how dominance theory as developed by Budescu (a search
on Google Scholar will turn up relevant publications) can be used to
identify the relative importance of regressors in an equation.  Dominance
analysis uses all subsets regression results in order to determine which
subset is better than (dominates) other subsets.  This is what Darlington
and Hayes' RLM macro for SPSS and SAS does.  This is somewhat different from
traditional all/best subsets regression, but it does show how all subsets
regression can be useful.  People who dismiss all subsets regression without
understanding whether it is being used in this fashion are allowing their
biases to blind them to new developments.  If you have questions about this,
you might want to contact Andrew Hayes about how this works and is
implemented in SPSS.
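
For readers who want to see the idea in code, here is a rough Python sketch of one dominance measure, the "general dominance" weight: a predictor's incremental R-squared averaged within each subset size and then across sizes. It is not the RLM macro and makes no claim to match its output. The function names (r_squared, general_dominance) and the tiny data set are made up for illustration, and the least-squares fit is a plain normal-equations solve kept short on purpose.

```python
import itertools

def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    y_mean = sum(y) / n
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    if not columns:
        return 0.0  # the intercept-only model explains nothing
    cols = [[1.0] * n] + [list(c) for c in columns]
    k = len(cols)
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    M = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(cols[i], y))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(M[r][i]))
        M[i], M[pivot] = M[pivot], M[i]
        for r in range(i + 1, k):
            f = M[r][i] / M[i][i]
            for j in range(i, k + 1):
                M[r][j] -= f * M[i][j]
    # Back substitution for the coefficients.
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (M[i][k] - sum(M[i][j] * beta[j] for j in range(i + 1, k))) / M[i][i]
    fitted = [sum(beta[j] * cols[j][t] for j in range(k)) for t in range(n)]
    ss_res = sum((yv - fv) ** 2 for yv, fv in zip(y, fitted))
    return 1.0 - ss_res / ss_tot

def general_dominance(predictors, y):
    """Return (weights, full_r2); fits all 2**p subsets, so small p only."""
    names = list(predictors)
    p = len(names)
    r2 = {}
    for size in range(p + 1):
        for subset in itertools.combinations(names, size):
            r2[frozenset(subset)] = r_squared([predictors[v] for v in subset], y)
    weights = {}
    for name in names:
        others = [v for v in names if v != name]
        per_size = []
        for size in range(p):
            # Gain in R^2 from adding `name` to each subset of the others.
            gains = [r2[frozenset(s) | {name}] - r2[frozenset(s)]
                     for s in itertools.combinations(others, size)]
            per_size.append(sum(gains) / len(gains))
        weights[name] = sum(per_size) / p
    return weights, r2[frozenset(names)]

# Made-up illustrative data: 8 cases, 3 predictors.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7],
    "x3": [1, 0, 0, 1, 1, 0, 1, 0],
}
y = [2.1, 2.9, 5.2, 5.8, 8.1, 8.7, 11.2, 11.9]

weights, full_r2 = general_dominance(predictors, y)
print(weights)
print(full_r2)
```

The weights add up to the full-model R-squared, which is part of the appeal of the method. Note also that the sketch fits every one of the 2**p subsets, so it is only workable for a handful of predictors and would not be for 60.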

HTH.

-Mike Palij
New York University
[hidden email]



  ----- Original Message -----
  From: Rich Ulrich
  To: [hidden email] 
  Sent: Thursday, June 08, 2017 12:32 AM
  Subject: Re: Executing Discriminant on different set of questions


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.