

SPSSXL Digest  10 May 2020 to 11 May 2020 (#202097)
I am calculating percentages in IBM Statistics 26 and finding a small number of cases where I am expecting to yield a whole number percent but am not getting
this result.
For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would get a value of 56 with no resulting decimal when performing the same computation in
Excel or SQL. In other cases, such as 2/4 I would get 50 with no resulting decimals which matches MS products. This difference seems to only occur in a handful of cases by I’m not sure why.
I know I have come across this issue before, but can’t remember why this occurs. I believe IBM also has some documentation on this issue but can’t seem to locate
it from a previous conversation with them. I also can’t seem to put in a support request with IBM at the moment.
Does anyone know the reason for this difference and if there is a workaround to yield the same result as MS products?
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Calculations in SPSS Statistics are done in double precision floating point hardware. That allows for about 1516 significant figures. The value of 14/25 is, in greater precision 0.560000000000000053290705182007513940334320068359375
which has to be rounded or truncated to fit in a double precision floating point number, so if rounded, it will end in 1 while if truncated it will end in 0.
Statistics uses the IEEE standard for floating point arithmetic. Excel might use truncation or it might just be doing the calculation in a different order
Note that a difference this tiny is utterly negligible except in the case where you are comparing two floating point numbers with fractional parts for exact equality. That is generally not a good thing to do as even a difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal difference. On Tue, May 12, 2020 at 11:25 AM Veena Nambiar < [hidden email]> wrote:
I am calculating percentages in IBM Statistics 26 and finding a small number of cases where I am expecting to yield a whole number percent but am not getting
this result.
For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would get a value of 56 with no resulting decimal when performing the same computation in
Excel or SQL. In other cases, such as 2/4 I would get 50 with no resulting decimals which matches MS products. This difference seems to only occur in a handful of cases by I’m not sure why.
I know I have come across this issue before, but can’t remember why this occurs. I believe IBM also has some documentation on this issue but can’t seem to locate
it from a previous conversation with them. I also can’t seem to put in a support request with IBM at the moment.
Does anyone know the reason for this difference and if there is a workaround to yield the same result as MS products?
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Reminder of what Jon is talking about: values of 1/2, 1/4, ...,
(powers of 2) are represented precisely in binary notation of
the usual computer arithmetic; whereas 1/10, 1/100, ..., are not.
You do run into less chance of a problem if you divide by 10 or 25, etc.,
/last/ in your computation. For integer compares of floating point
numbers to be secure (in some sense), you can test for a narrow range
or round or truncate.

Rich Ulrich
Calculations in SPSS Statistics are done in double precision floating point hardware. That allows for about 1516 significant figures. The value of 14/25 is, in greater precision
0.560000000000000053290705182007513940334320068359375
which has to be rounded or truncated to fit in a double precision floating point number, so if rounded, it will end in 1 while if truncated it will end in 0.
Statistics uses the IEEE standard for floating point arithmetic. Excel might use truncation or it might just be doing the calculation in a different order
Note that a difference this tiny is utterly negligible except in the case where you are comparing two floating point numbers with fractional parts for exact equality. That is generally not a good thing to
do as even a difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal difference.
On Tue, May 12, 2020 at 11:25 AM Veena Nambiar < [hidden email]> wrote:
I am calculating percentages in IBM Statistics 26 and finding a small number of cases where I am expecting to yield a whole number percent but am not getting
this result.
For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would get a value of 56 with no resulting decimal when performing the same computation
in Excel or SQL. In other cases, such as 2/4 I would get 50 with no resulting decimals which matches MS products. This difference seems to only occur in a handful of cases by I’m not sure why.
I know I have come across this issue before, but can’t remember why this occurs. I believe IBM also has some documentation on this issue but can’t seem
to locate it from a previous conversation with them. I also can’t seem to put in a support request with IBM at the moment.
Does anyone know the reason for this difference and if there is a workaround to yield the same result as MS products?
===================== To manage your subscription to SPSSXL, send a message to [hidden email] (not to SPSSXL), with no body text except the command. To leave the list, send the command SIGNOFF SPSSXL For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSXL, send a message to [hidden email] (not to SPSSXL), with no body text except the command. To leave the list, send the command SIGNOFF SPSSXL For a list of commands to manage subscriptions, send the command INFO REFCARD
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Administrator

Rich's comments prompted me to try this:
NEW FILE.
DATASET CLOSE ALL.
DATA LIST FREE / junk(F1).
BEGIN DATA
1 1 1
END DATA.
COMPUTE x1 = 14/25*100.
COMPUTE x2 = 100*14/25.
COMPUTE flag1 = x1 EQ 56.
COMPUTE flag2 = x2 EQ 56.
FORMATS flag1 flag2 (F1).
LIST.
Here is the output:
junk x1 x2 flag1 flag2
1 56.00 56.00 0 1
1 56.00 56.00 0 1
1 56.00 56.00 0 1
Rich Ulrich wrote
> Reminder of what Jon is talking about: values of 1/2, 1/4, ...,
> (powers of 2) are represented precisely in binary notation of
> the usual computer arithmetic; whereas 1/10, 1/100, ..., are not.
>
> You do run into less chance of a problem if you divide by 10 or 25, etc.,
> /last/ in your computation. For integer compares of floating point
> numbers to be secure (in some sense), you can test for a narrow range
> or round or truncate.
>
> 
> Rich Ulrich
>
> ________________________________
> From: SPSSX(r) Discussion <
> SPSSXL@.UGA
> > on behalf of Jon Peck <
> jkpeck@
> >
> Sent: Tuesday, May 12, 2020 1:55 PM
> To:
> SPSSXL@.UGA
> <
> SPSSXL@.UGA
> >
> Subject: Re: Issue with Computation Differences Between IBM Statistics 26
> and MS Products
>
> Calculations in SPSS Statistics are done in double precision floating
> point hardware. That allows for about 1516 significant figures. The
> value of 14/25 is, in greater precision
> 0.560000000000000053290705182007513940334320068359375
> which has to be rounded or truncated to fit in a double precision floating
> point number, so if rounded, it will end in 1 while if truncated it will
> end in 0.
>
> Statistics uses the IEEE standard for floating point arithmetic. Excel
> might use truncation or it might just be doing the calculation in a
> different order
>
> Note that a difference this tiny is utterly negligible except in the case
> where you are comparing two floating point numbers with fractional parts
> for exact equality. That is generally not a good thing to do as even a
> difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal
> difference.
>
> On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <
> VNambiar@.ca
> <mailto:
> VNambiar@.ca
> >> wrote:
>
>
>
> I am calculating percentages in IBM Statistics 26 and finding a small
> number of cases where I am expecting to yield a whole number percent but
> am not getting this result.
>
>
>
> For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would
> get a value of 56 with no resulting decimal when performing the same
> computation in Excel or SQL. In other cases, such as 2/4 I would get 50
> with no resulting decimals which matches MS products. This difference
> seems to only occur in a handful of cases by I’m not sure why.
>
>
>
> I know I have come across this issue before, but can’t remember why this
> occurs. I believe IBM also has some documentation on this issue but can’t
> seem to locate it from a previous conversation with them. I also can’t
> seem to put in a support request with IBM at the moment.
>
>
>
> Does anyone know the reason for this difference and if there is a
> workaround to yield the same result as MS products?
>
> ===================== To manage your subscription to SPSSXL, send a
> message to
> LISTSERV@.UGA
> <mailto:
> LISTSERV@.UGA
> > (not to SPSSXL), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSXL For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
>
> 
> Jon K Peck
> jkpeck@
> <mailto:
> jkpeck@
> >
>
> ===================== To manage your subscription to SPSSXL, send a
> message to
> LISTSERV@.UGA
> <mailto:
> LISTSERV@.UGA
> > (not to SPSSXL), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSXL For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
> =====================
> To manage your subscription to SPSSXL, send a message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSXL
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


If you do it with the syntax below, you will see, as I said in my earlier post, that x1 and x2 are not quite equal, and x1 is the rounded value of the higher precision value. But the difference between x1 and x2 is infinitesimal.
COMPUTE x1 = 14/25*100. COMPUTE x2 = 100*14/25. COMPUTE flag1 = x1 EQ 56. COMPUTE flag2 = x2 EQ 56. FORMATS flag1 flag2 (F1). format x1 x2 (f30.20). compute equal = x1 eq x2. LIST.
Rich's comments prompted me to try this:
NEW FILE.
DATASET CLOSE ALL.
DATA LIST FREE / junk(F1).
BEGIN DATA
1 1 1
END DATA.
COMPUTE x1 = 14/25*100.
COMPUTE x2 = 100*14/25.
COMPUTE flag1 = x1 EQ 56.
COMPUTE flag2 = x2 EQ 56.
FORMATS flag1 flag2 (F1).
LIST.
Here is the output:
junk x1 x2 flag1 flag2
1 56.00 56.00 0 1
1 56.00 56.00 0 1
1 56.00 56.00 0 1
Rich Ulrich wrote
> Reminder of what Jon is talking about: values of 1/2, 1/4, ...,
> (powers of 2) are represented precisely in binary notation of
> the usual computer arithmetic; whereas 1/10, 1/100, ..., are not.
>
> You do run into less chance of a problem if you divide by 10 or 25, etc.,
> /last/ in your computation. For integer compares of floating point
> numbers to be secure (in some sense), you can test for a narrow range
> or round or truncate.
>
> 
> Rich Ulrich
>
> ________________________________
> From: SPSSX(r) Discussion <
> SPSSXL@.UGA
> > on behalf of Jon Peck <
> jkpeck@
> >
> Sent: Tuesday, May 12, 2020 1:55 PM
> To:
> SPSSXL@.UGA
> <
> SPSSXL@.UGA
> >
> Subject: Re: Issue with Computation Differences Between IBM Statistics 26
> and MS Products
>
> Calculations in SPSS Statistics are done in double precision floating
> point hardware. That allows for about 1516 significant figures. The
> value of 14/25 is, in greater precision
> 0.560000000000000053290705182007513940334320068359375
> which has to be rounded or truncated to fit in a double precision floating
> point number, so if rounded, it will end in 1 while if truncated it will
> end in 0.
>
> Statistics uses the IEEE standard for floating point arithmetic. Excel
> might use truncation or it might just be doing the calculation in a
> different order
>
> Note that a difference this tiny is utterly negligible except in the case
> where you are comparing two floating point numbers with fractional parts
> for exact equality. That is generally not a good thing to do as even a
> difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal
> difference.
>
> On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <
> VNambiar@.ca
> <mailto:
> VNambiar@.ca
> >> wrote:
>
>
>
> I am calculating percentages in IBM Statistics 26 and finding a small
> number of cases where I am expecting to yield a whole number percent but
> am not getting this result.
>
>
>
> For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would
> get a value of 56 with no resulting decimal when performing the same
> computation in Excel or SQL. In other cases, such as 2/4 I would get 50
> with no resulting decimals which matches MS products. This difference
> seems to only occur in a handful of cases by I’m not sure why.
>
>
>
> I know I have come across this issue before, but can’t remember why this
> occurs. I believe IBM also has some documentation on this issue but can’t
> seem to locate it from a previous conversation with them. I also can’t
> seem to put in a support request with IBM at the moment.
>
>
>
> Does anyone know the reason for this difference and if there is a
> workaround to yield the same result as MS products?
>
> ===================== To manage your subscription to SPSSXL, send a
> message to
> LISTSERV@.UGA
> <mailto:
> LISTSERV@.UGA
> > (not to SPSSXL), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSXL For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
>
> 
> Jon K Peck
> jkpeck@
> <mailto:
> jkpeck@
> >
>
> ===================== To manage your subscription to SPSSXL, send a
> message to
> LISTSERV@.UGA
> <mailto:
> LISTSERV@.UGA
> > (not to SPSSXL), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSXL For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
> =====================
> To manage your subscription to SPSSXL, send a message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSXL
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Thanks, Jon. This is helpful. For my purposes, these small differences do end up mattering since we are dual processing data and both processors need to have
an exact match on our results.
I’ve applied the following code to the percentages I’m calculating to employ what they call whole number
rounding as required by our agency business rules:
Compute Rate2= TRUNC(Rate1) + (Rate1 > TRUNC(Rate1)).
In the case below, the percentage (Rate2) end ups being 57 versus 56 which is what the other person
I’m processing with yields in SQL.
It’s only typically a very few cases that our results are mismatched, but those differences are important
for our purposes.
I’m not quite sure how to resolve the issue because it’s difficult to say what each package is doing
differently – but we need to ensure a consistent, explainable outcomes for these percentages every time. I was asked several questions  will using a different data type such as float or decimal in SPSS would make a difference? Would setting the variable to
ratio make a difference? Would lowering the number of decimals change anything? Can FuzzBits be used in the Trunc command? I don’t think any of these suggestions would work and I’m ensure about the FuzzBits piece and its applicability in SPSS. One suggestion
was to simply truncate the rate to enforce a match which would work in that case in particular but wouldn’t for other cases – for instance 195/201*100 would yield 97.01492537313433 and truncated would yield 97 instead of 98 which is what I want and is what
I yield using the code above.
If anyone has any ideas for me please let me know! Thanks so much!
From: Jon Peck [mailto:[hidden email]]
Sent: Tuesday, May 12, 2020 10:56 AM
To: Veena Nambiar
Cc: SPSS List
Subject: [EXTERNAL] Re: [SPSSXL] Issue with Computation Differences Between IBM Statistics 26 and MS Products
Calculations in SPSS Statistics are done in double precision floating point hardware. That allows for about 1516 significant figures. The value of 14/25 is, in greater precision
0.560000000000000053290705182007513940334320068359375
which has to be rounded or truncated to fit in a double precision floating point number, so if rounded, it will end in 1 while if truncated it will end in 0.
Statistics uses the IEEE standard for floating point arithmetic. Excel might use truncation or it might just be doing the calculation in a different order
Note that a difference this tiny is utterly negligible except in the case where you are comparing two floating point numbers with fractional parts for exact equality. That is generally not a good thing to do as even a difference like 14
* 100/25 vs 14/25 * 100 can produce an infinitesimal difference.
On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <[hidden email]> wrote:
I am calculating percentages in IBM Statistics 26 and finding a small number of cases where I am
expecting to yield a whole number percent but am not getting this result.
For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would get a value of 56 with no
resulting decimal when performing the same computation in Excel or SQL. In other cases, such as 2/4 I would get 50 with no resulting decimals which matches MS products. This difference seems to only occur in a handful of cases by I’m not sure why.
I know I have come across this issue before, but can’t remember why this occurs. I believe IBM also
has some documentation on this issue but can’t seem to locate it from a previous conversation with them. I also can’t seem to put in a support request with IBM at the moment.
Does anyone know the reason for this difference and if there is a workaround to yield the same result
as MS products?
===================== To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the command. To leave the list, send the command SIGNOFF SPSSXL For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


First, note that the result Statistics produces is mathematically correct as far as is possible with double precision floating point arithmetic. My earlier post showed the value to more precision (calculated with special code outside of Statistics that is thousands of times slower). The Statistics result is more accurate than the Excel result, but the difference is way too tiny for anyone to argue that there is a difference that matters. At least that's my opinion.
Statistics has only one numeric data type: all numbers are double precision floating point values. The number of decimals in the format is just a display issue: the displayed values would be the same in Statistics and Excel to more decimals than you would ever want.
I'm not sure what you want to do, but if you want to discuss further, let's take this off line ( [hidden email]).
Thanks, Jon. This is helpful. For my purposes, these small differences do end up mattering since we are dual processing data and both processors need to have
an exact match on our results.
I’ve applied the following code to the percentages I’m calculating to employ what they call whole number
rounding as required by our agency business rules:
Compute Rate2= TRUNC(Rate1) + (Rate1 > TRUNC(Rate1)).
In the case below, the percentage (Rate2) end ups being 57 versus 56 which is what the other person
I’m processing with yields in SQL.
It’s only typically a very few cases that our results are mismatched, but those differences are important
for our purposes.
I’m not quite sure how to resolve the issue because it’s difficult to say what each package is doing
differently – but we need to ensure a consistent, explainable outcomes for these percentages every time. I was asked several questions  will using a different data type such as float or decimal in SPSS would make a difference? Would setting the variable to
ratio make a difference? Would lowering the number of decimals change anything? Can FuzzBits be used in the Trunc command? I don’t think any of these suggestions would work and I’m ensure about the FuzzBits piece and its applicability in SPSS. One suggestion
was to simply truncate the rate to enforce a match which would work in that case in particular but wouldn’t for other cases – for instance 195/201*100 would yield 97.01492537313433 and truncated would yield 97 instead of 98 which is what I want and is what
I yield using the code above.
If anyone has any ideas for me please let me know! Thanks so much!
From: Jon Peck [mailto:[hidden email]]
Sent: Tuesday, May 12, 2020 10:56 AM
To: Veena Nambiar
Cc: SPSS List
Subject: [EXTERNAL] Re: [SPSSXL] Issue with Computation Differences Between IBM Statistics 26 and MS Products
Calculations in SPSS Statistics are done in double precision floating point hardware. That allows for about 1516 significant figures. The value of 14/25 is, in greater precision
0.560000000000000053290705182007513940334320068359375
which has to be rounded or truncated to fit in a double precision floating point number, so if rounded, it will end in 1 while if truncated it will end in 0.
Statistics uses the IEEE standard for floating point arithmetic. Excel might use truncation or it might just be doing the calculation in a different order
Note that a difference this tiny is utterly negligible except in the case where you are comparing two floating point numbers with fractional parts for exact equality. That is generally not a good thing to do as even a difference like 14
* 100/25 vs 14/25 * 100 can produce an infinitesimal difference.
On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <[hidden email]> wrote:
I am calculating percentages in IBM Statistics 26 and finding a small number of cases where I am
expecting to yield a whole number percent but am not getting this result.
For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would get a value of 56 with no
resulting decimal when performing the same computation in Excel or SQL. In other cases, such as 2/4 I would get 50 with no resulting decimals which matches MS products. This difference seems to only occur in a handful of cases by I’m not sure why.
I know I have come across this issue before, but can’t remember why this occurs. I believe IBM also
has some documentation on this issue but can’t seem to locate it from a previous conversation with them. I also can’t seem to put in a support request with IBM at the moment.
Does anyone know the reason for this difference and if there is a workaround to yield the same result
as MS products?
===================== To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the command. To leave the list, send the command SIGNOFF SPSSXL For a list of commands to manage subscriptions, send the command
INFO REFCARD


=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Administrator

The comparison of SPSS and Excel made me curious about what would happen
using Stata (another package I have). First, here is my revised SPSS code
and output, with Jon's addition of the variable "equal". (I changed F30.20
to F24.30, but I don't think that will change anything important.)
COMPUTE x1 = 14/25*100.
COMPUTE x2 = 100*14/25.
COMPUTE flag1 = x1 EQ 56.
COMPUTE flag2 = x2 EQ 56.
COMPUTE equal = x1 EQ x2.
FORMATS flag1 flag2 equal (F1) / x1 x2 (F24.20).
LIST x1 to equal.
x1 x2 flag1 flag2 equal
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
Now using Stata. One difference is that Stata has two IEEE 7542008
floatingpoint types: float and double. "float" variables (the default for
new numeric variables) are stored in 4 bytes, whereas "double" variables are
stored in 8 bytes. (Source:
https://blog.stata.com/2012/04/02/thepenultimateguidetoprecision/)
* Example 1 using 'float' variables (the default)
clear
set obs 3
generate x1 = 14/25*100
generate x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
OUTPUT for Example 1:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
* Example 2 using 'double' variables
clear
set obs 3
generate double x1 = 14/25*100
generate double x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
Output for Example 2:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
Unlike SPSS, Stata is not affected by the order of operations: 14/25*100
returns the same value as 100*14/25, regardless of which numeric type I use
(float or double), so long as I use the same type for both x1 and x2.
The second thing that caught my eye is the differences between SPSS and
Stata for the values that do have some nonzero decimals showing.
Stata: 56.00000000000000710543
SPSS: 56.0000000000000100
I think this note explains the difference:
https://www.ibm.com/support/pages/spssdoesnotdisplaymore16digitsdataviewFor anyone who wants more info about precision in Stata, this looks good:
https://www.statajournal.com/sjpdf.html?articlenum=pr0025HTH.
Jon Peck wrote
> First, note that the result Statistics produces is mathematically correct
> as far as is possible with double precision floating point arithmetic. My
> earlier post showed the value to more precision (calculated with special
> code outside of Statistics that is thousands of times slower). The
> Statistics result is more accurate than the Excel result, but the
> difference is way too tiny for anyone to argue that there is a difference
> that matters. At least that's my opinion.
>
> Statistics has only one numeric data type: all numbers are double
> precision
> floating point values. The number of decimals in the format is just a
> display issue: the displayed values would be the same in Statistics and
> Excel to more decimals than you would ever want.
>
> I'm not sure what you want to do, but if you want to discuss further,
> let's take this off line (
> jkpeck@
> ).
>
>
>
>
>
> On Tue, May 12, 2020 at 6:41 PM Veena Nambiar <
> VNambiar@.ca
> > wrote:
>
>> Thanks, Jon. This is helpful. For my purposes, these small differences do
>> end up mattering since we are dual processing data and both processors
>> need
>> to have an exact match on our results.
>>
>>
>>
>> I’ve applied the following code to the percentages I’m calculating to
>> employ what they call whole number rounding as required by our agency
>> business rules:
>>
>>
>>
>> Compute Rate2= TRUNC(Rate1) + (Rate1 > TRUNC(Rate1)).
>>
>>
>>
>> In the case below, the percentage (Rate2) end ups being 57 versus 56
>> which
>> is what the other person I’m processing with yields in SQL.
>>
>>
>>
>> It’s only typically a very few cases that our results are mismatched, but
>> those differences are important for our purposes.
>>
>>
>>
>> I’m not quite sure how to resolve the issue because it’s difficult to say
>> what each package is doing differently – but we need to ensure a
>> consistent, explainable outcomes for these percentages every time. I was
>> asked several questions  will using a different data type such as float
>> or
>> decimal in SPSS would make a difference? Would setting the variable to
>> ratio make a difference? Would lowering the number of decimals change
>> anything? Can FuzzBits be used in the Trunc command? I don’t think any of
>> these suggestions would work and I’m ensure about the FuzzBits piece and
>> its applicability in SPSS. One suggestion was to simply truncate the rate
>> to enforce a match which would work in that case in particular but
>> wouldn’t
>> for other cases – for instance 195/201*100 would yield 97.01492537313433
>> and truncated would yield 97 instead of 98 which is what I want and is
>> what
>> I yield using the code above.
>>
>>
>>
>> If anyone has any ideas for me please let me know! Thanks so much!
>>
>>
>>
>>
>>
>>
>>
>> *From:* Jon Peck [mailto:
> jkpeck@
> ]
>> *Sent:* Tuesday, May 12, 2020 10:56 AM
>> *To:* Veena Nambiar
>> *Cc:* SPSS List
>> *Subject:* [EXTERNAL] Re: [SPSSXL] Issue with Computation Differences
>> Between IBM Statistics 26 and MS Products
>>
>>
>>
>> Calculations in SPSS Statistics are done in double precision floating
>> point hardware. That allows for about 1516 significant figures. The
>> value of 14/25 is, in greater precision
>>
>> 0.560000000000000053290705182007513940334320068359375
>>
>> which has to be rounded or truncated to fit in a double precision
>> floating
>> point number, so if rounded, it will end in 1 while if truncated it will
>> end in 0.
>>
>>
>>
>> Statistics uses the IEEE standard for floating point arithmetic. Excel
>> might use truncation or it might just be doing the calculation in a
>> different order
>>
>>
>>
>> Note that a difference this tiny is utterly negligible except in the case
>> where you are comparing two floating point numbers with fractional parts
>> for exact equality. That is generally not a good thing to do as even a
>> difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal
>> difference.
>>
>>
>>
>> On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <
> VNambiar@.ca
> >
>> wrote:
>>
>>
>>
>> I am calculating percentages in IBM Statistics 26 and finding a small
>> number of cases where I am expecting to yield a whole number percent but
>> am
>> not getting this result.
>>
>>
>>
>> For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would
>> get a value of 56 with no resulting decimal when performing the same
>> computation in Excel or SQL. In other cases, such as 2/4 I would get 50
>> with no resulting decimals which matches MS products. This difference
>> seems
>> to only occur in a handful of cases by I’m not sure why.
>>
>>
>>
>> I know I have come across this issue before, but can’t remember why this
>> occurs. I believe IBM also has some documentation on this issue but can’t
>> seem to locate it from a previous conversation with them. I also can’t
>> seem
>> to put in a support request with IBM at the moment.
>>
>>
>>
>> Does anyone know the reason for this difference and if there is a
>> workaround to yield the same result as MS products?
>>
>> ===================== To manage your subscription to SPSSXL, send a
>> message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text
>> except the command. To leave the list, send the command SIGNOFF SPSSXL
>> For
>> a list of commands to manage subscriptions, send the command INFO REFCARD
>>
>>
>>
>>
>> 
>>
>> Jon K Peck
>>
> jkpeck@
>>
>
>
> 
> Jon K Peck
> jkpeck@
>
> =====================
> To manage your subscription to SPSSXL, send a message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSXL
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


I haven't read the article Bruce cites, but strictly speaking, I think the Stata result is wrong. As I said in my previous post, the correct value to the number of digits displayed is 56.0000000000000053290705182007513940334320068359375
not 56.00000000000000710543
And, since a double has only 53 bits for the precision, numbers should not be shown with more digits than that can hold. But I'm not going to lose any sleep over this. The comparison of SPSS and Excel made me curious about what would happen
using Stata (another package I have). First, here is my revised SPSS code
and output, with Jon's addition of the variable "equal". (I changed F30.20
to F24.30, but I don't think that will change anything important.)
COMPUTE x1 = 14/25*100.
COMPUTE x2 = 100*14/25.
COMPUTE flag1 = x1 EQ 56.
COMPUTE flag2 = x2 EQ 56.
COMPUTE equal = x1 EQ x2.
FORMATS flag1 flag2 equal (F1) / x1 x2 (F24.20).
LIST x1 to equal.
x1 x2 flag1 flag2 equal
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
Now using Stata. One difference is that Stata has two IEEE 7542008
floatingpoint types: float and double. "float" variables (the default for
new numeric variables) are stored in 4 bytes, whereas "double" variables are
stored in 8 bytes. (Source:
https://blog.stata.com/2012/04/02/thepenultimateguidetoprecision/)
* Example 1 using 'float' variables (the default)
clear
set obs 3
generate x1 = 14/25*100
generate x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
OUTPUT for Example 1:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
* Example 2 using 'double' variables
clear
set obs 3
generate double x1 = 14/25*100
generate double x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
Output for Example 2:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
Unlike SPSS, Stata is not affected by the order of operations: 14/25*100
returns the same value as 100*14/25, regardless of which numeric type I use
(float or double), so long as I use the same type for both x1 and x2.
The second thing that caught my eye is the differences between SPSS and
Stata for the values that do have some nonzero decimals showing.
Stata: 56.00000000000000710543
SPSS: 56.0000000000000100
I think this note explains the difference:
https://www.ibm.com/support/pages/spssdoesnotdisplaymore16digitsdataview
For anyone who wants more info about precision in Stata, this looks good:
https://www.statajournal.com/sjpdf.html?articlenum=pr0025
HTH.
Jon Peck wrote
> First, note that the result Statistics produces is mathematically correct
> as far as is possible with double precision floating point arithmetic. My
> earlier post showed the value to more precision (calculated with special
> code outside of Statistics that is thousands of times slower). The
> Statistics result is more accurate than the Excel result, but the
> difference is way too tiny for anyone to argue that there is a difference
> that matters. At least that's my opinion.
>
> Statistics has only one numeric data type: all numbers are double
> precision
> floating point values. The number of decimals in the format is just a
> display issue: the displayed values would be the same in Statistics and
> Excel to more decimals than you would ever want.
>
> I'm not sure what you want to do, but if you want to discuss further,
> let's take this off line (
> jkpeck@
> ).
>
>
>
>
>
> On Tue, May 12, 2020 at 6:41 PM Veena Nambiar <
> VNambiar@.ca
> > wrote:
>
>> Thanks, Jon. This is helpful. For my purposes, these small differences do
>> end up mattering since we are dual processing data and both processors
>> need
>> to have an exact match on our results.
>>
>>
>>
>> I’ve applied the following code to the percentages I’m calculating to
>> employ what they call whole number rounding as required by our agency
>> business rules:
>>
>>
>>
>> Compute Rate2= TRUNC(Rate1) + (Rate1 > TRUNC(Rate1)).
>>
>>
>>
>> In the case below, the percentage (Rate2) end ups being 57 versus 56
>> which
>> is what the other person I’m processing with yields in SQL.
>>
>>
>>
>> It’s only typically a very few cases that our results are mismatched, but
>> those differences are important for our purposes.
>>
>>
>>
>> I’m not quite sure how to resolve the issue because it’s difficult to say
>> what each package is doing differently – but we need to ensure a
>> consistent, explainable outcomes for these percentages every time. I was
>> asked several questions  will using a different data type such as float
>> or
>> decimal in SPSS would make a difference? Would setting the variable to
>> ratio make a difference? Would lowering the number of decimals change
>> anything? Can FuzzBits be used in the Trunc command? I don’t think any of
>> these suggestions would work and I’m ensure about the FuzzBits piece and
>> its applicability in SPSS. One suggestion was to simply truncate the rate
>> to enforce a match which would work in that case in particular but
>> wouldn’t
>> for other cases – for instance 195/201*100 would yield 97.01492537313433
>> and truncated would yield 97 instead of 98 which is what I want and is
>> what
>> I yield using the code above.
>>
>>
>>
>> If anyone has any ideas for me please let me know! Thanks so much!
>>
>>
>>
>>
>>
>>
>>
>> *From:* Jon Peck [mailto:
> jkpeck@
> ]
>> *Sent:* Tuesday, May 12, 2020 10:56 AM
>> *To:* Veena Nambiar
>> *Cc:* SPSS List
>> *Subject:* [EXTERNAL] Re: [SPSSXL] Issue with Computation Differences
>> Between IBM Statistics 26 and MS Products
>>
>>
>>
>> Calculations in SPSS Statistics are done in double precision floating
>> point hardware. That allows for about 1516 significant figures. The
>> value of 14/25 is, in greater precision
>>
>> 0.560000000000000053290705182007513940334320068359375
>>
>> which has to be rounded or truncated to fit in a double precision
>> floating
>> point number, so if rounded, it will end in 1 while if truncated it will
>> end in 0.
>>
>>
>>
>> Statistics uses the IEEE standard for floating point arithmetic. Excel
>> might use truncation or it might just be doing the calculation in a
>> different order
>>
>>
>>
>> Note that a difference this tiny is utterly negligible except in the case
>> where you are comparing two floating point numbers with fractional parts
>> for exact equality. That is generally not a good thing to do as even a
>> difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal
>> difference.
>>
>>
>>
>> On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <
> VNambiar@.ca
> >
>> wrote:
>>
>>
>>
>> I am calculating percentages in IBM Statistics 26 and finding a small
>> number of cases where I am expecting to yield a whole number percent but
>> am
>> not getting this result.
>>
>>
>>
>> For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would
>> get a value of 56 with no resulting decimal when performing the same
>> computation in Excel or SQL. In other cases, such as 2/4 I would get 50
>> with no resulting decimals which matches MS products. This difference
>> seems
>> to only occur in a handful of cases by I’m not sure why.
>>
>>
>>
>> I know I have come across this issue before, but can’t remember why this
>> occurs. I believe IBM also has some documentation on this issue but can’t
>> seem to locate it from a previous conversation with them. I also can’t
>> seem
>> to put in a support request with IBM at the moment.
>>
>>
>>
>> Does anyone know the reason for this difference and if there is a
>> workaround to yield the same result as MS products?
>>
>> ===================== To manage your subscription to SPSSXL, send a
>> message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text
>> except the command. To leave the list, send the command SIGNOFF SPSSXL
>> For
>> a list of commands to manage subscriptions, send the command INFO REFCARD
>>
>>
>>
>>
>> 
>>
>> Jon K Peck
>>
> jkpeck@
>>
>
>
> 
> Jon K Peck
> jkpeck@
>
> =====================
> To manage your subscription to SPSSXL, send a message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSXL
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


I did read both the blog and the journal article. I'm unhappy.
I'm not sure why the "float" example shows equality, but I will
accept that "internal roundoff" seems to work advantageously.
My bigger problem is the doubleprecision example.
Writing (100*14)/25  to force the order of computation  should
give an exact, integer answer in double precision. Does it? You
show that 100*14/25 has trailing digits, which must be wrong
if multiplication were done first.
So, it looks to me like Stata is giving division a higher precedence
than multiplication, performing it first, instead of taking the formula
in order.

Rich Ulrich
The comparison of SPSS and Excel made me curious about what would happen
using Stata (another package I have). First, here is my revised SPSS code
and output, with Jon's addition of the variable "equal". (I changed F30.20
to F24.30, but I don't think that will change anything important.)
COMPUTE x1 = 14/25*100.
COMPUTE x2 = 100*14/25.
COMPUTE flag1 = x1 EQ 56.
COMPUTE flag2 = x2 EQ 56.
COMPUTE equal = x1 EQ x2.
FORMATS flag1 flag2 equal (F1) / x1 x2 (F24.20).
LIST x1 to equal.
x1 x2 flag1 flag2 equal
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
Now using Stata. One difference is that Stata has two IEEE 7542008
floatingpoint types: float and double. "float" variables (the default for
new numeric variables) are stored in 4 bytes, whereas "double" variables are
stored in 8 bytes. (Source:
https://blog.stata.com/2012/04/02/thepenultimateguidetoprecision/)
* Example 1 using 'float' variables (the default)
clear
set obs 3
generate x1 = 14/25*100
generate x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
OUTPUT for Example 1:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
* Example 2 using 'double' variables
clear
set obs 3
generate double x1 = 14/25*100
generate double x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
Output for Example 2:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
Unlike SPSS, Stata is not affected by the order of operations: 14/25*100
returns the same value as 100*14/25, regardless of which numeric type I use
(float or double), so long as I use the same type for both x1 and x2.
The second thing that caught my eye is the differences between SPSS and
Stata for the values that do have some nonzero decimals showing.
Stata: 56.00000000000000710543
SPSS: 56.0000000000000100
I think this note explains the difference:
https://www.ibm.com/support/pages/spssdoesnotdisplaymore16digitsdataview
For anyone who wants more info about precision in Stata, this looks good:
https://www.statajournal.com/sjpdf.html?articlenum=pr0025
HTH.
Jon Peck wrote
> First, note that the result Statistics produces is mathematically correct
> as far as is possible with double precision floating point arithmetic. My
> earlier post showed the value to more precision (calculated with special
> code outside of Statistics that is thousands of times slower). The
> Statistics result is more accurate than the Excel result, but the
> difference is way too tiny for anyone to argue that there is a difference
> that matters. At least that's my opinion.
>
> Statistics has only one numeric data type: all numbers are double
> precision
> floating point values. The number of decimals in the format is just a
> display issue: the displayed values would be the same in Statistics and
> Excel to more decimals than you would ever want.
>
> I'm not sure what you want to do, but if you want to discuss further,
> let's take this off line (
> jkpeck@
> ).
>
>
>
>
>
> On Tue, May 12, 2020 at 6:41 PM Veena Nambiar <
> VNambiar@.ca
> > wrote:
>
>> Thanks, Jon. This is helpful. For my purposes, these small differences do
>> end up mattering since we are dual processing data and both processors
>> need
>> to have an exact match on our results.
>>
>>
>>
>> I’ve applied the following code to the percentages I’m calculating to
>> employ what they call whole number rounding as required by our agency
>> business rules:
>>
>>
>>
>> Compute Rate2= TRUNC(Rate1) + (Rate1 > TRUNC(Rate1)).
>>
>>
>>
>> In the case below, the percentage (Rate2) end ups being 57 versus 56
>> which
>> is what the other person I’m processing with yields in SQL.
>>
>>
>>
>> It’s only typically a very few cases that our results are mismatched, but
>> those differences are important for our purposes.
>>
>>
>>
>> I’m not quite sure how to resolve the issue because it’s difficult to say
>> what each package is doing differently – but we need to ensure a
>> consistent, explainable outcomes for these percentages every time. I was
>> asked several questions  will using a different data type such as float
>> or
>> decimal in SPSS would make a difference? Would setting the variable to
>> ratio make a difference? Would lowering the number of decimals change
>> anything? Can FuzzBits be used in the Trunc command? I don’t think any of
>> these suggestions would work and I’m ensure about the FuzzBits piece and
>> its applicability in SPSS. One suggestion was to simply truncate the rate
>> to enforce a match which would work in that case in particular but
>> wouldn’t
>> for other cases – for instance 195/201*100 would yield 97.01492537313433
>> and truncated would yield 97 instead of 98 which is what I want and is
>> what
>> I yield using the code above.
>>
>>
>>
>> If anyone has any ideas for me please let me know! Thanks so much!
>>
>>
>>
>>
>>
>>
>>
>> *From:* Jon Peck [mailto:
> jkpeck@
> ]
>> *Sent:* Tuesday, May 12, 2020 10:56 AM
>> *To:* Veena Nambiar
>> *Cc:* SPSS List
>> *Subject:* [EXTERNAL] Re: [SPSSXL] Issue with Computation Differences
>> Between IBM Statistics 26 and MS Products
>>
>>
>>
>> Calculations in SPSS Statistics are done in double precision floating
>> point hardware. That allows for about 1516 significant figures. The
>> value of 14/25 is, in greater precision
>>
>> 0.560000000000000053290705182007513940334320068359375
>>
>> which has to be rounded or truncated to fit in a double precision
>> floating
>> point number, so if rounded, it will end in 1 while if truncated it will
>> end in 0.
>>
>>
>>
>> Statistics uses the IEEE standard for floating point arithmetic. Excel
>> might use truncation or it might just be doing the calculation in a
>> different order
>>
>>
>>
>> Note that a difference this tiny is utterly negligible except in the case
>> where you are comparing two floating point numbers with fractional parts
>> for exact equality. That is generally not a good thing to do as even a
>> difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal
>> difference.
>>
>>
>>
>> On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <
> VNambiar@.ca
> >
>> wrote:
>>
>>
>>
>> I am calculating percentages in IBM Statistics 26 and finding a small
>> number of cases where I am expecting to yield a whole number percent but
>> am
>> not getting this result.
>>
>>
>>
>> For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would
>> get a value of 56 with no resulting decimal when performing the same
>> computation in Excel or SQL. In other cases, such as 2/4 I would get 50
>> with no resulting decimals which matches MS products. This difference
>> seems
>> to only occur in a handful of cases by I’m not sure why.
>>
>>
>>
>> I know I have come across this issue before, but can’t remember why this
>> occurs. I believe IBM also has some documentation on this issue but can’t
>> seem to locate it from a previous conversation with them. I also can’t
>> seem
>> to put in a support request with IBM at the moment.
>>
>>
>>
>> Does anyone know the reason for this difference and if there is a
>> workaround to yield the same result as MS products?
>>
>> ===================== To manage your subscription to SPSSXL, send a
>> message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text
>> except the command. To leave the list, send the command SIGNOFF SPSSXL
>> For
>> a list of commands to manage subscriptions, send the command INFO REFCARD
>>
>>
>>
>>
>> 
>>
>> Jon K Peck
>>
> jkpeck@
>>
>
>
> 
> Jon K Peck
> jkpeck@
>
> =====================
> To manage your subscription to SPSSXL, send a message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSXL
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Forcing the operation order with parentheses does give an exact integervalued result as expected.
As far as the order of operations in the absence of parenthesizing overrides is concerned, the CSR says. The order of execution is as follows: functions; exponentiation; multiplication, division, and unary –; and addition and subtraction. • Operators at the same level are executed from left to right.
I did read both the blog and the journal article. I'm unhappy.
I'm not sure why the "float" example shows equality, but I will
accept that "internal roundoff" seems to work advantageously.
My bigger problem is the doubleprecision example.
Writing (100*14)/25  to force the order of computation  should
give an exact, integer answer in double precision. Does it? You
show that 100*14/25 has trailing digits, which must be wrong
if multiplication were done first.
So, it looks to me like Stata is giving division a higher precedence
than multiplication, performing it first, instead of taking the formula
in order.

Rich Ulrich
The comparison of SPSS and Excel made me curious about what would happen
using Stata (another package I have). First, here is my revised SPSS code
and output, with Jon's addition of the variable "equal". (I changed F30.20
to F24.30, but I don't think that will change anything important.)
COMPUTE x1 = 14/25*100.
COMPUTE x2 = 100*14/25.
COMPUTE flag1 = x1 EQ 56.
COMPUTE flag2 = x2 EQ 56.
COMPUTE equal = x1 EQ x2.
FORMATS flag1 flag2 equal (F1) / x1 x2 (F24.20).
LIST x1 to equal.
x1 x2 flag1 flag2 equal
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
56.0000000000000100 56.0000000000000000 0 1 0
Now using Stata. One difference is that Stata has two IEEE 7542008
floatingpoint types: float and double. "float" variables (the default for
new numeric variables) are stored in 4 bytes, whereas "double" variables are
stored in 8 bytes. (Source:
https://blog.stata.com/2012/04/02/thepenultimateguidetoprecision/)
* Example 1 using 'float' variables (the default)
clear
set obs 3
generate x1 = 14/25*100
generate x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
OUTPUT for Example 1:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
56.00000000000000000000 56.00000000000000000000 1 1
1
* Example 2 using 'double' variables
clear
set obs 3
generate double x1 = 14/25*100
generate double x2 = 100*14/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte equal = x1 == x2
format x1 x2 %24.20f
list, clean noobs
Output for Example 2:
. list, clean noobs
x1 x2 flag1 flag2
equal
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
56.00000000000000710543 56.00000000000000710543 0 0
1
Unlike SPSS, Stata is not affected by the order of operations: 14/25*100
returns the same value as 100*14/25, regardless of which numeric type I use
(float or double), so long as I use the same type for both x1 and x2.
The second thing that caught my eye is the differences between SPSS and
Stata for the values that do have some nonzero decimals showing.
Stata: 56.00000000000000710543
SPSS: 56.0000000000000100
I think this note explains the difference:
https://www.ibm.com/support/pages/spssdoesnotdisplaymore16digitsdataview
For anyone who wants more info about precision in Stata, this looks good:
https://www.statajournal.com/sjpdf.html?articlenum=pr0025
HTH.
Jon Peck wrote
> First, note that the result Statistics produces is mathematically correct
> as far as is possible with double precision floating point arithmetic. My
> earlier post showed the value to more precision (calculated with special
> code outside of Statistics that is thousands of times slower). The
> Statistics result is more accurate than the Excel result, but the
> difference is way too tiny for anyone to argue that there is a difference
> that matters. At least that's my opinion.
>
> Statistics has only one numeric data type: all numbers are double
> precision
> floating point values. The number of decimals in the format is just a
> display issue: the displayed values would be the same in Statistics and
> Excel to more decimals than you would ever want.
>
> I'm not sure what you want to do, but if you want to discuss further,
> let's take this off line (
> jkpeck@
> ).
>
>
>
>
>
> On Tue, May 12, 2020 at 6:41 PM Veena Nambiar <
> VNambiar@.ca
> > wrote:
>
>> Thanks, Jon. This is helpful. For my purposes, these small differences do
>> end up mattering since we are dual processing data and both processors
>> need
>> to have an exact match on our results.
>>
>>
>>
>> I’ve applied the following code to the percentages I’m calculating to
>> employ what they call whole number rounding as required by our agency
>> business rules:
>>
>>
>>
>> Compute Rate2= TRUNC(Rate1) + (Rate1 > TRUNC(Rate1)).
>>
>>
>>
>> In the case below, the percentage (Rate2) end ups being 57 versus 56
>> which
>> is what the other person I’m processing with yields in SQL.
>>
>>
>>
>> It’s only typically a very few cases that our results are mismatched, but
>> those differences are important for our purposes.
>>
>>
>>
>> I’m not quite sure how to resolve the issue because it’s difficult to say
>> what each package is doing differently – but we need to ensure a
>> consistent, explainable outcomes for these percentages every time. I was
>> asked several questions  will using a different data type such as float
>> or
>> decimal in SPSS would make a difference? Would setting the variable to
>> ratio make a difference? Would lowering the number of decimals change
>> anything? Can FuzzBits be used in the Trunc command? I don’t think any of
>> these suggestions would work and I’m ensure about the FuzzBits piece and
>> its applicability in SPSS. One suggestion was to simply truncate the rate
>> to enforce a match which would work in that case in particular but
>> wouldn’t
>> for other cases – for instance 195/201*100 would yield 97.01492537313433
>> and truncated would yield 97 instead of 98 which is what I want and is
>> what
>> I yield using the code above.
>>
>>
>>
>> If anyone has any ideas for me please let me know! Thanks so much!
>>
>>
>>
>>
>>
>>
>>
>> *From:* Jon Peck [mailto:
> jkpeck@
> ]
>> *Sent:* Tuesday, May 12, 2020 10:56 AM
>> *To:* Veena Nambiar
>> *Cc:* SPSS List
>> *Subject:* [EXTERNAL] Re: [SPSSXL] Issue with Computation Differences
>> Between IBM Statistics 26 and MS Products
>>
>>
>>
>> Calculations in SPSS Statistics are done in double precision floating
>> point hardware. That allows for about 1516 significant figures. The
>> value of 14/25 is, in greater precision
>>
>> 0.560000000000000053290705182007513940334320068359375
>>
>> which has to be rounded or truncated to fit in a double precision
>> floating
>> point number, so if rounded, it will end in 1 while if truncated it will
>> end in 0.
>>
>>
>>
>> Statistics uses the IEEE standard for floating point arithmetic. Excel
>> might use truncation or it might just be doing the calculation in a
>> different order
>>
>>
>>
>> Note that a difference this tiny is utterly negligible except in the case
>> where you are comparing two floating point numbers with fractional parts
>> for exact equality. That is generally not a good thing to do as even a
>> difference like 14 * 100/25 vs 14/25 * 100 can produce an infinitesimal
>> difference.
>>
>>
>>
>> On Tue, May 12, 2020 at 11:25 AM Veena Nambiar <
> VNambiar@.ca
> >
>> wrote:
>>
>>
>>
>> I am calculating percentages in IBM Statistics 26 and finding a small
>> number of cases where I am expecting to yield a whole number percent but
>> am
>> not getting this result.
>>
>>
>>
>> For instance, 14/25*100 yields 56.00000000000001 instead of 56. I would
>> get a value of 56 with no resulting decimal when performing the same
>> computation in Excel or SQL. In other cases, such as 2/4 I would get 50
>> with no resulting decimals which matches MS products. This difference
>> seems
>> to only occur in a handful of cases by I’m not sure why.
>>
>>
>>
>> I know I have come across this issue before, but can’t remember why this
>> occurs. I believe IBM also has some documentation on this issue but can’t
>> seem to locate it from a previous conversation with them. I also can’t
>> seem
>> to put in a support request with IBM at the moment.
>>
>>
>>
>> Does anyone know the reason for this difference and if there is a
>> workaround to yield the same result as MS products?
>>
>> ===================== To manage your subscription to SPSSXL, send a
>> message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text
>> except the command. To leave the list, send the command SIGNOFF SPSSXL
>> For
>> a list of commands to manage subscriptions, send the command INFO REFCARD
>>
>>
>>
>>
>> 
>>
>> Jon K Peck
>>
> jkpeck@
>>
>
>
> 
> Jon K Peck
> jkpeck@
>
> =====================
> To manage your subscription to SPSSXL, send a message to
> LISTSERV@.UGA
> (not to SPSSXL), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSXL
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Administrator

Right you are, Rich, division, then multiplication. Wellspotted. See
section 13.2.5 here:
https://www.stata.com/manuals/u13.pdfAnd you're also right about using parentheses to force the order of
computation.
* Example 3 using 'double' variables
clear
set obs 3
generate double x1 = 14/25*100
generate double x2 = 100*14/25
generate double x3 = (100*14)/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte flag3 = x3 == 56
generate byte eq12 = x1 == x2
generate byte eq13 = x1 == x3
generate byte eq23 = x2 == x3
format x1x3 %24.20f
list x1x3, clean noobs
list flag1flag3 eq12eq23, clean noobs
OUTPUT:
. list x1x3, clean noobs
x1 x2
x3
56.00000000000000710543 56.00000000000000710543
56.00000000000000000000
56.00000000000000710543 56.00000000000000710543
56.00000000000000000000
56.00000000000000710543 56.00000000000000710543
56.00000000000000000000
. list flag1flag3 eq12eq23, clean noobs
flag1 flag2 flag3 eq12 eq13 eq23
0 0 1 1 0 0
0 0 1 1 0 0
0 0 1 1 0 0
Rich Ulrich wrote
> I did read both the blog and the journal article. I'm unhappy.
> I'm not sure why the "float" example shows equality, but I will
> accept that "internal roundoff" seems to work advantageously.
>
> My bigger problem is the doubleprecision example.
>
> Writing (100*14)/25  to force the order of computation  should
> give an exact, integer answer in double precision. Does it? You
> show that 100*14/25 has trailing digits, which must be wrong
> if multiplication were done first.
>
> So, it looks to me like Stata is giving division a higher precedence
> than multiplication, performing it first, instead of taking the formula
> in order.
>
> 
> Rich Ulrich


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Bruce, thanks for the confirmation.
I'm a bit surprised to see that Stata gives multiply/divide and
add/subtract separate precedence, each, instead of following
the high school algebra rules that I assumed were universal.
If you don't want to worry about it, use extra parentheses. I've
sometimes been generous with parentheses in COMPUTE statements,
just for clarity. That now seems like a good idea, for this other reason.
Giving each symbol its own order  I wonder if they did that
intentionally (Why?), or if it was a side effect of "easier programming"?
The errors you might get from overflow for multiply/divide (done
in the wrong order) are more obvious than the add/subtract problems
illustrated in one of your references. But the latter (rarely) can matter, too.

Rich Ulrich
Right you are, Rich, division, then multiplication. Wellspotted. See
section 13.2.5 here:
https://www.stata.com/manuals/u13.pdf
And you're also right about using parentheses to force the order of
computation.
* Example 3 using 'double' variables
clear
set obs 3
generate double x1 = 14/25*100
generate double x2 = 100*14/25
generate double x3 = (100*14)/25
generate byte flag1 = x1 == 56
generate byte flag2 = x2 == 56
generate byte flag3 = x3 == 56
generate byte eq12 = x1 == x2
generate byte eq13 = x1 == x3
generate byte eq23 = x2 == x3
format x1x3 %24.20f
list x1x3, clean noobs
list flag1flag3 eq12eq23, clean noobs
OUTPUT:
. list x1x3, clean noobs
x1 x2
x3
56.00000000000000710543 56.00000000000000710543
56.00000000000000000000
56.00000000000000710543 56.00000000000000710543
56.00000000000000000000
56.00000000000000710543 56.00000000000000710543
56.00000000000000000000
. list flag1flag3 eq12eq23, clean noobs
flag1 flag2 flag3 eq12 eq13 eq23
0 0 1 1 0 0
0 0 1 1 0 0
0 0 1 1 0 0
Rich Ulrich wrote
> I did read both the blog and the journal article. I'm unhappy.
> I'm not sure why the "float" example shows equality, but I will
> accept that "internal roundoff" seems to work advantageously.
>
> My bigger problem is the doubleprecision example.
>
> Writing (100*14)/25  to force the order of computation  should
> give an exact, integer answer in double precision. Does it? You
> show that 100*14/25 has trailing digits, which must be wrong
> if multiplication were done first.
>
> So, it looks to me like Stata is giving division a higher precedence
> than multiplication, performing it first, instead of taking the formula
> in order.
>
> 
> Rich Ulrich


Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

Sent from: http://spssxdiscussion.1045642.n5.nabble.com/
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD
=====================
To manage your subscription to SPSSXL, send a message to
[hidden email] (not to SPSSXL), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD

