Re: selecting lowest scores if missing data or ties
Using restructure, it is very easy to find the location of the min or max in a file.
Here is an example of finding the minimum of 5 quizzes and deleting the
lowest one.
If you need more control over the deleting process, you can add more
rules, e.g. if quiz3 and quiz5 are tied for lowest, select quiz5.
input program .
loop ii=1 to 10 .
compute quiz1= trunc(uniform(6)).
compute quiz2= trunc(uniform(6)).
compute quiz3= trunc(uniform(6)).
compute quiz4= trunc(uniform(6)).
compute quiz5= trunc(uniform(6)).
end case .
end loop .
end file .
end input program .
VARSTOCASES /MAKE quiz FROM quiz1 quiz2 quiz3 quiz4 quiz5
/INDEX = Index1(5)
/KEEP = ii
/NULL = KEEP.
sort cases by ii quiz .
add files file=* /by ii/ first=start .
if start eq 1 #seq=0 .
compute #seq=#seq+1 .
compute seq=#seq .
select if seq gt 1 .
sort cases by ii index1 .
CASESTOVARS
/ID = ii
/INDEX = Index1
/GROUPBY = VARIABLE .
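The extra rule mentioned above (when quiz3 and quiz5 tie for lowest, drop quiz5) might be sketched by changing only the sort that precedes the ADD FILES step. This is an untested sketch, assuming the long file still holds ii, Index1 and quiz as created by VARSTOCASES; the case that sorts first within each ii is the one that gets dropped.

```spss
* Sketch: on ties for the lowest score, the later quiz sorts first.
* It is then the case flagged by FIRST= and dropped.
sort cases by ii quiz (a) index1 (d) .
```

With the original sort (by ii and quiz only), which of several tied quizzes ends up first is not specified by the sort keys.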
Ben Gurion U
Dale Glaser wrote:
> Hi all. Based on Levesque's syntax for selecting the maximum score within a case, I tried the same for the case where there are 5 quizzes and the lowest score is to be isolated (and deleted). Assuming one has sorted by ID, I used the syntax appended below.
> I then used an incredibly inelegant way of flagging the cases with the lowest scores (which will be deleted in the total summed scoring of the quizzes): recreating the initial raw score, mapping the created ranked variable (where a ranked value of 5 is the lowest score) onto the raw score, and then using some implausible integer (e.g., -1) and coding it as missing. Though a little cumbersome, this works fine.
> However, if there is missing data, say a student takes only four of the quizzes, then, given that they get to drop one quiz, that student will just get a summed score for all four quizzes; the lowest score for that student will not be deleted. So, any suggestions for not coding for the lowest score if there is any missing data (akin to using a compute statement such as 'sum.4' when at least four scores must be answered to compute a score)?
> Also, what if there is the full complement of data for the five quizzes, but there are ties
> for the lowest scores:
> When I construct the vector for the ranked variable, as you would guess, it will show up as (for now, the value of 1 being the lowest score):
> What I would like to do is somehow have a unique number and delete only one of the lowest numbers. Any suggestions?
> Thank you very much for your time. Dale
> ***five quizzes****.
> vector quiz = q1 to q5 .
> loop quizvar = 1 to 5.
> compute quizrate = quiz(quizvar) .
> xsave outfile = 'C:\temp1.sav'
> / keep = id quizvar quizrate.
> end loop.
> execute.
> ***get the temp file****.
> get file = 'C:\temp1.sav'.
> rank variables = quizrate (d) by id / ties = low / rank into quizrank .
> numeric RANKq1 RANKq2 RANKq3 RANKq4 rankq5 (f4.1).
> vector quizr = rankq1 to rankq5 .
> compute quizr(quizvar) = quizrank.
> aggregate outfile = *
> /presorted / break = id /RANKq1 RANKq2 RANKq3 RANKq4 rankq5 = min(rankq1 to rankq5).
> MATCH FILES /FILE = 'C:\Documents\lowscore.sav'
> /FILE = * /BY id .
> **converts lowest score (with rank value of 5) to missing; can do this for each variable**.
> ****best to autorecode or rename so don't write over old variables....****.
> compute q1new=q1.
> compute q2new=q2.
> compute q3new=q3.
> compute q4new=q4.
> compute q5new=q5.
> if (rankq1 eq 5) q1new=-1.
> if (rankq2 eq 5) q2new=-1.
> if (rankq3 eq 5) q3new=-1.
> if (rankq4 eq 5) q4new=-1.
> if (rankq5 eq 5) q5new=-1.
> missing values q1new to q5new (-1).
> freq var=q1new to q5new.
> compute totquiz=sum(q1new to q5new).
> list var=q1new to q5new totquiz.
> Dale Glaser, Ph.D.
> Principal--Glaser Consulting
> 4003 Goldfinch St, Suite G
> San Diego, CA 92103
> phone: 619-220-0602
> fax: 619-220-0412
> email: [hidden email]
> website: www.glaserconsult.com
Re: selecting lowest scores if missing data or ties
At 07:22 PM 6/14/2006, Dale Glaser asked, but it's hard to quote. Let
me see if I understand:
* Students are given 5 quizzes, and a score on each. (In the test data,
the scores are from 1 to 9.) Quizzes may be missed, in which case the
corresponding score is missing.
* You want to know
- The lowest score each student received, counting 'missing' as the
lowest possible - if the student missed any quiz, the 'lowest' score is
missing
- The first quiz on which that student received that lowest score
- The student's mean score, after (one instance of) the lowest score
has been dropped. (This is simply the mean score, if any quiz has been
missed.)
Hillel Vardi posted a neat wide -> long -> wide solution. (That is, from
each student record, it creates a separate record for each quiz, drops
the one with the lowest score, and reassembles the student record.)
It replaces the lowest score with system-missing. That may be what you
want, but I'm not sure it'll handle missing quiz scores the way you
want to. And it loses information: you no longer know what was the
lowest score, only that (one instance of it) is no longer in the list.
Anyway, here's a 'wide' solution, processing within each student's
record. (It uses VECTOR/LOOP logic, which is a common alternative to
VARSTOCASES/CASESTOVARS logic.) It does not eliminate or change the
lowest score. However, it calculates,
- On which quiz the lowest (or missing) score first occurred
- What that lowest score was - missing, if that's what it was
- The mean of all quizzes the student took (variable MEAN.5)
- The mean of four quizzes, dropping one instance of the lowest score,
if the student took all five. (If the student missed any quiz, the two
scores are the same.)
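For readers unfamiliar with VECTOR/LOOP, a minimal untested sketch of this kind of within-record logic follows. It is not necessarily the code used in this post; it assumes the scores are in adjacent variables q1 to q5 (as in Dale's syntax), and the names low_quiz, low_score, mean5 and mean4 are made up for illustration.

```spss
* Sketch only: q1 to q5 must be adjacent in the file.
vector q = q1 to q5 .
* Start with quiz 1 as the provisional lowest.
compute low_quiz = 1 .
compute low_score = q1 .
loop #i = 2 to 5 .
* Once a missing score has been found, keep that first occurrence.
do if (not missing(low_score)) .
* A missing score counts as lowest; strict LT keeps the first of any ties.
do if (missing(q(#i)) or q(#i) lt low_score) .
compute low_quiz = #i .
compute low_score = q(#i) .
end if .
end if .
end loop .
* Mean of all quizzes taken.
compute mean5 = mean(q1 to q5) .
* Drop one instance of the lowest score only if all five were taken.
compute mean4 = mean5 .
if (nmiss(q1 to q5) eq 0) mean4 = (sum(q1 to q5) - low_score) / 4 .
list variables = q1 to q5 low_quiz low_score mean5 mean4 .
```

Because the original scores are never changed, the lowest score and its position remain available alongside both means.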