# Lines on scatterplot

## Lines on scatterplot

This is not for analysis, but for a revised version of my SPSS tutorial 4.5.1 Graphic teaching aid for regression and correlation [http://surveyresearch.weebly.com/uploads/2/9/9/8/2998485/4.5.1_graphic_teaching_aid_for_regression_and_correlation.pdf]. This was written on artificial data, but I now want to produce similar charts on real data, particularly the regression lines of X on Y and Y on X, plus the overlay of both regression lines to show that Pearson's r is the cosine of the angle between the two.

With these data from 21 countries in Europe, plus UK and USA (N=23), V1 = Homicide rate; V2 = Gini coefficient:

    1.0 34.94
    0.7 30.25
    1.7 28.53
    1.6 33.68
    1.0 26.63
    0.8 29.02
    2.2 27.74
    1.3 33.78
    1.0 31.14
    1.6 34.48
    1.1 32.30
    2.0 42.78
    0.9 34.41
    0.9 28.73
    0.6 25.86
    1.1 33.25
    1.2 35.84
    1.6 27.32
    0.9 35.79
    1.0 26.81
    0.7 32.72
    1.2 34.81
    4.8 40.46

and this syntax:

    DATASET ACTIVATE DataSet0.
    STATS REGRESS PLOT YVARS=homicide XVARS=Gini
      /OPTIONS CATEGORICAL=BARS GROUP=1 BOXPLOTS INDENT=15 YSCALE=75
      /FITLINES APPLYTO=TOTAL.

I can produce the following chart:

[Chart: http://spssx-discussion.1045642.n5.nabble.com/file/t27438/Scatterplot.jpg]

I can then copy the chart to Word, Right-click >> Wrap text >> Behind, and manually add:

1) a horizontal line through mean y, using repeated underscores (____)
2) a vertical line through mean x, using repeated rows of "¦"

Can SPSS produce a chart with

a) a vertical line through mean x
b) a horizontal line through mean y (preferably both together)?

After that, can I get a chart with:

a) horizontal lines from each data point to mean x and
b) vertical lines from each data point to mean y?

Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.

## Re: Lines on scatterplot

It can -- you have to supply the mean of each axis as a variable (at least I cannot calculate them inside GPL). Because the means are supplied as data, the mean lines extend only from the minimum to the maximum of the data. Otherwise they have to be hardcoded in GPL (with GUIDE: form.line(...)), or added manually (possibly by a template).

To modify the code you only have to change the variable names in the GGRAPH VARIABLES section. They are renamed to minimize changes in the GPL code if other variables are used. The only other changes needed in the GPL code are the axis titles, found in the GUIDE: commands in the Main Graph section.

HTH, PR

    DATASET CLOSE ALL.
    PRESERVE.
    SET DECIMAL=DOT.
    DATA LIST free / Homicide (F8.1) Gini (F8.2).
    BEGIN DATA
    1.0 34.94 0.7 30.25 1.7 28.53 1.6 33.68 1.0 26.63
    0.8 29.02 2.2 27.74 1.3 33.78 1.0 31.14 1.6 34.48
    1.1 32.30 2.0 42.78 0.9 34.41 0.9 28.73 0.6 25.86
    1.1 33.25 1.2 35.84 1.6 27.32 0.9 35.79 1.0 26.81
    0.7 32.72 1.2 34.81 4.8 40.46
    END DATA.
    RESTORE.
    DATASET NAME JFH.

    *Get mean values.
    AGGREGATE
      /OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES
      /BREAK=
      /homicide_mean=MEAN(homicide)
      /Gini_mean=MEAN(Gini).

    *Labels are set in Main Graph GUIDE: commands.
    GGRAPH
      /GRAPHDATASET
        NAME          = "graphdataset"
        VARIABLES     = Gini           [NAME = 'x'   ]
                        Gini_mean      [NAME = 'x_mn']
                        homicide       [NAME = 'y'   ]
                        homicide_mean  [NAME = 'y_mn']
        MISSING       = LISTWISE
        REPORTMISSING = NO
      /GRAPHSPEC
        SOURCE        = INLINE.
    BEGIN GPL
      SOURCE:  s=userSource(id("graphdataset"))
      DATA:    x   =col(source(s), name("x"))
      DATA:    x_mn=col(source(s), name("x_mn"))
      DATA:    y   =col(source(s), name("y"))
      DATA:    y_mn=col(source(s), name("y_mn"))

      COMMENT:  MAIN GRAPH
      GRAPH:     begin(origin(10%, 10%), scale(70%, 70%))
        GUIDE:   axis(dim(1), label("Gini"))
        GUIDE:   axis(dim(2), label("Homicide"))
        COMMENT: vertical x-mean line
        ELEMENT: line(position(x_mn*y)
                     ,size(size."0.5")
                     ,color(color.gray)
                     )
        COMMENT: horizontal y-mean line
        ELEMENT: line(position(x*y_mn)
                     ,size(size."0.5")
                     ,color(color.gray)
                     )
        COMMENT: vertical lines
        ELEMENT: edge(position(link.join(x*(y_mn+y)))
                     ,shape(shape.half_dash)
                     ,size(size."0.25")
                     )
        COMMENT: horizontal lines
        ELEMENT: edge(position(link.join((x_mn+x)*y))
                     ,shape(shape.half_dash)
                     ,size(size."0.25")
                     )
        ELEMENT: point(position(x*y))
      GRAPH: end()

      COMMENT: TOP LETTERBOX
      GRAPH:     begin(origin(10%, 0%), scale(70%, 10%))
        COORD:   rect(dim(1))
        GUIDE:   axis(dim(1), ticks(null()))
        ELEMENT: schema(position(bin.quantile.letter(x))
                       ,size(size."80%")
                       ,color.interior(color.lightgray)
                       )
      GRAPH: end()

      COMMENT: RIGHT LETTERBOX
      GRAPH:     begin(origin(80%, 10%), scale(10%, 70%))
        COORD:   transpose(rect(dim(1)))
        GUIDE:   axis(dim(1), ticks(null()))
        ELEMENT: schema(position(bin.quantile.letter(y))
                       ,size(size."80%")
                       ,color.interior(color.lightgray)
                       )
      GRAPH: end()
    END GPL.
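The reply mentions hardcoding the mean lines with GUIDE: form.line(...) as an alternative. A minimal, untested sketch of that route follows. It assumes the means are typed in by hand (about 32.23 for Gini and 1.34 for homicide, computed here from the 23 posted cases as 741.27/23 and 30.9/23), and it leaves out the letterbox plots and the point-to-mean edges. Unlike the data-driven lines above, reference lines drawn this way are not tied to the range of the data.

```
* Sketch only: mean lines hardcoded as reference lines.
* The two constants are assumptions, rounded from the posted data.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Gini homicide
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE:  s=userSource(id("graphdataset"))
  DATA:    x=col(source(s), name("Gini"))
  DATA:    y=col(source(s), name("homicide"))
  GUIDE:   axis(dim(1), label("Gini"))
  GUIDE:   axis(dim(2), label("Homicide"))
  COMMENT: vertical line at mean x
  GUIDE:   form.line(position(32.23, *))
  COMMENT: horizontal line at mean y
  GUIDE:   form.line(position(*, 1.34))
  ELEMENT: point(position(x*y))
END GPL.
```

The trade-off is that the constants have to be edited by hand whenever the data change, which is what the AGGREGATE step in the posted syntax avoids.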

## Re: Lines on scatterplot

After a couple of off-list exchanges (too complex for me to post via Nabble), Dr Peder Rogman "PRogman" supplied the following syntax, which has produced exactly what I need.

    * Encoding: UTF-8.
    DATASET CLOSE ALL.
    PRESERVE.
    SET DECIMAL=DOT.
    DATA LIST free / Homicide (F8.1) Gini (F8.2).
    BEGIN DATA
    1.0 34.94 0.7 30.25 1.7 28.53 1.6 33.68 1.0 26.63
    0.8 29.02 2.2 27.74 1.3 33.78 1.0 31.14 1.6 34.48
    1.1 32.30 2.0 42.78 0.9 34.41 0.9 28.73 0.6 25.86
    1.1 33.25 1.2 35.84 1.6 27.32 0.9 35.79 1.0 26.81
    0.7 32.72 1.2 34.81 4.8 40.46
    END DATA.
    RESTORE.
    DATASET NAME JFH.

    *Get mean values.
    AGGREGATE
      /OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES
      /BREAK=
      /homicide_mean=MEAN(homicide)
      /Gini_mean=MEAN(Gini).

    *Updated GGRAPH.
    GGRAPH
      /GRAPHDATASET
        NAME          = "graphdataset"
        VARIABLES     = Gini           [NAME = 'x'   ]
                        Gini_mean      [NAME = 'x_mn']
                        homicide       [NAME = 'y'   ]
                        homicide_mean  [NAME = 'y_mn']
        MISSING       = LISTWISE
        REPORTMISSING = NO
      /GRAPHSPEC
        DEFAULTTEMPLATE = yes
        SOURCE        = INLINE.
    BEGIN GPL
      SOURCE:  s=userSource(id("graphdataset"))
      DATA:    x   =col(source(s), name("x"))
      DATA:    x_mn=col(source(s), name("x_mn"))
      DATA:    y   =col(source(s), name("y"))
      DATA:    y_mn=col(source(s), name("y_mn"))

      COMMENT: regression lines with hardcoded coefficients
      TRANS:   yr  =eval(-1.591 + 0.091 * x)
      TRANS:   xr  =eval(29.075 + 2.348 * y)

      COMMENT:  MAIN GRAPH
      GRAPH:     begin(origin(10%, 10%), scale(70%, 70%))
        GUIDE:   axis(dim(1), label("Gini"))
        GUIDE:   axis(dim(2), label("Homicide"))

        COMMENT: vertical x-mean line
        ELEMENT: line(position(x_mn*y)
                     ,size(size."1")
                     ,color(color.red)
                     )
        COMMENT: horizontal y-mean line
        ELEMENT: line(position(x*y_mn)
                     ,size(size."1")
                     ,color(color.blue)
                     )
        COMMENT: vertical lines
        ELEMENT: edge(position(link.join(x*(y_mn+y)))
                     ,shape(shape.half_dash)
                     ,size(size."0.50")
                     ,color(color.blue)
                     )
        COMMENT: horizontal lines
        ELEMENT: edge(position(link.join((x_mn+x)*y))
                     ,shape(shape.half_dash)
                     ,size(size."0.50")
                     ,color(color.red)
                     )
        ELEMENT: point(position(x*y)
                      )
        COMMENT: regression of y on x
        ELEMENT: line(position(x*yr)
                     ,color(color.green)
                     ,size(size."2")
                     )
        COMMENT: regression of x on y
        ELEMENT: line(position(xr*y)
                     ,color(color.purple)
                     ,size(size."2")
                     )
      GRAPH: end()

      COMMENT: TOP LETTERBOX
      GRAPH:     begin(origin(10%, 0%), scale(70%, 10%))
        COORD:   rect(dim(1))
        GUIDE:   axis(dim(1), ticks(null()))
        ELEMENT: schema(position(bin.quantile.letter(x))
                       ,size(size."80%")
                       ,color.interior(color.lightblue)
                       )
      GRAPH: end()

      COMMENT: RIGHT LETTERBOX
      GRAPH:     begin(origin(80%, 10%), scale(10%, 70%))
        COORD:   transpose(rect(dim(1)))
        GUIDE:   axis(dim(1), ticks(null()))
        ELEMENT: schema(position(bin.quantile.letter(y))
                       ,size(size."80%")
                       ,color.interior(color.lightblue)
                       )
      GRAPH: end()
    END GPL.

[NB: The output needs a note to explain that the boxplots show median and IQR, not mean and sd.]

The exercise will form part of a new tutorial in which PRogman's contribution will be handsomely acknowledged. He's a star!

The data, extracted from World Bank World Development Indicators, are in Table 8.2: Income inequality and homicide rates in a selection of economically developed countries (page 175) of an impressive new textbook: Robert de Vries, Critical Statistics: Seeing Beyond the Headlines (Macmillan, Red Globe Press, 2018). The companion site (https://www.macmillanihe.com/companion/De-Vries-Critical-Statistics/) has links to all the URLs featured in the book, and my initial comments can be seen on https://surveyresearch.weebly.com/de-vries-2018.html

It's a long time since I did any programming (in Algol on a KDF9, 1964-68; on a PDP11, 1968-70: input and output on 8-hole paper tape), but I'll see if I can parse the syntax and play with the elements to see what I get:

- No boxplots outside
- No equations on the chart
- Labels indicating country (but might be too cluttered)
- Build up chart one step at a time (but can be done by reverse editing the spv chart)

The next step would be an arc showing the angle between the lines and its cosine (Pearson's r), and an animated applet to get the mean lines to rotate and stabilise when the (elastic) tension has levelled out. Nice little project for a graphics student?

John F Hall MA (Cantab) Dip Ed (Dunelm)
[Retired academic survey researcher]
Email: [hidden email]
Website: Journeys in Survey Research
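
The two TRANS statements in the syntax above use hardcoded constants, and the thread does not say how they were obtained. One plausible way to reproduce them, sketched here with standard SPSS commands and the variable names from the posted DATA LIST, is to run both regressions and read the constant and slope from each Coefficients table:

```
* Regression of homicide on Gini, giving the constants for yr.
REGRESSION
  /DEPENDENT homicide
  /METHOD=ENTER Gini.

* Regression of Gini on homicide, giving the constants for xr.
REGRESSION
  /DEPENDENT Gini
  /METHOD=ENTER homicide.

* Pearson's r, for comparison with the two slopes.
CORRELATIONS
  /VARIABLES=homicide Gini.
```

As a consistency check on the posted constants, the product of the two slopes equals r squared: 0.091 × 2.348 ≈ 0.214, so r ≈ 0.46. Both lines also pass through the point of means: at the mean Gini of about 32.23, yr ≈ 1.34, and at the mean homicide rate of about 1.34, xr ≈ 32.23.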