distance between zipcodes


distance between zipcodes

Maguin, Eugene

Is there a method in spss or on the web somewhere that reads a file of five digit zip codes and returns (writes back) a file of the distance between them? Somebody has pointed out that this website (http://www.melissadata.com/lookups/zipdistance.asp) will return a distance for a pair of typed/copied in zip codes (it may do more but in return for something). We have, potentially, several thousand to do.

Thanks, Gene Maguin

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: distance between zipcodes

Jon K Peck
I did something like this a few years ago.  If you have a zip codes table with lat/long values, you can use the SPSSINC TRANS extension command with the extendedTransforms.ellipseDist function (or use spherical distances) to compute the distances.  But do you really want all by all?  That's going to be millions of numbers.  Note also that zip code areas can have funny shapes, so, especially for close areas, the distances won't be super accurate.
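To put a number on "all by all": n zip codes give n*(n-1)/2 unordered pairs. A minimal sketch, where 3000 is only an illustrative stand-in for "several thousand" and `pair_count` is a hypothetical helper name:

```python
def pair_count(n):
    """Number of unordered pairs among n distinct zip codes."""
    return n * (n - 1) // 2

# 3000 is an assumed, illustrative count, not a figure from the thread.
print(pair_count(3000))  # 4498500
```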


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: distance between zipcodes

Maguin, Eugene

Thanks, Jon.  No, not all by all. The input file would be id, zip1, zip2 and the output file would be id, zip1, zip2, distance. I understand your point about accuracy but we don’t have street addresses. Ok, so a python routine can do the computation given lat/long numbers. Do there exist files of lat/long numbers for zip code centers (however those centers are defined)?

 

Gene Maguin

 


Re: distance between zipcodes

Mark Miller
Census has ZCTAs (ZIP Code Tabulation Areas) with shapefiles for mapping.
I also have some old files which have the lat/long centroid of each ZCTA.
I do not recall if these appear in more recent issues of this product.
ZCTAs do not include all zipcodes.

... Mark Miller



Re: distance between zipcodes

Mark Miller
After checking my own files, I have Zipcode files from 2004 thru 2012
which contain (supposedly) Lat/Long for centroids.
At least one of these is a SAS data file, which is easily converted to SPSS.
There are 33233 Zipcodes listed in the SAS file.

... Mark Miller



Re: distance between zipcodes

Jon K Peck
In reply to this post by Maguin, Eugene
Here's an example snippet using a zipcode file I found somewhere on the net and squirreled away.

get file="c:/data/zipcodes.sav".
dataset name zipcodes.
dataset activate zipcodes.
sort cases by zipcode.

dataset activate main.
MATCH FILES /FILE=*
  /TABLE='zipcodes'
  /BY zipcode.

spssinc trans result=distance
/formula "extendedTransforms.ellipseDist(latitude, longitude, lat2, long2)".


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: distance between zipcodes

Maguin, Eugene

Jon,

I have ruthlessly avoided having anything to do with Python or extension commands. Time to do something different because of this problem.

 

This is the key:

spssinc trans result=distance
/formula "extendedTransforms.ellipseDist(latitude, longitude, lat2, long2)".

So, I’m guessing that this is a Python extension command. I looked in the Python reference and see that Appendix F lists an spssinc trans function. Does that imply that this formula is somewhere on my SPSS install?

 

I see Run Script in Utilities. It wants a file name. What’s the file name? I assume my version, 23, will run this. True assumption?

 

Thanks, Gene Maguin

 

 

 

 

 

 

 


Re: distance between zipcodes

David Marso
Administrator
When I run it as is I receive the error:

Warnings
No module named extendedTransforms

There is also nothing I see in the available Extensions on the IBM site when I connect via the Utilities>Extension Bundles...
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: distance between zipcodes

Andy W
In reply to this post by Maguin, Eugene
If you have the lat and lon's already, all you need is the law of cosines to calculate the distance. This webpage, http://www.movable-type.co.uk/scripts/latlong.html, shows it in Excel, which is pretty easy to port to SPSS. SPSS does not have ACOS,  but this tech note shows how to compute it, https://www-304.ibm.com/support/docview.wss?uid=swg21476208.

Given the coarseness of zipcodes, you don't need to worry about problems with calculating small distances, http://gis.stackexchange.com/a/4911/751. I'm not sure about the error introduced by assuming the earth is a perfect sphere, but I imagine it isn't that big either.
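For anyone who wants to see the law of cosines spelled out outside Excel, here is a minimal Python sketch. The function name `cosine_distance` and the 6371000 meter mean Earth radius are assumptions for illustration; inputs are taken to be decimal degrees, and the clamp only protects against rounding pushing the cosine slightly outside [-1, 1].

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius in meters

def cosine_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Great-circle distance by the spherical law of cosines (degrees in)."""
    p1 = math.radians(lat1)
    p2 = math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    return math.acos(c) * radius

# One degree of latitude along a meridian is radius * pi / 180 meters.
print(round(cosine_distance(0.0, 0.0, 1.0, 0.0)))
```

The result is in the same unit as `radius`, so passing a radius in kilometers or miles returns kilometers or miles.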

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: distance between zipcodes

Jon K Peck
In reply to this post by David Marso
extendedTransforms.py is a utility module, not an extension command itself, so it is in the Utilities collection, which is accessible via the Downloads page here.
https://www.ibm.com/developerworks/community/wikis/home?lang=en#/wiki/We70df3195ec8_4f95_9773_42e448fa9029/page/Downloads%20for%20IBM%C2%AE%20SPSS%C2%AE%20Statistics

or directly here
https://www.ibm.com/developerworks/community/files/app?lang=en#/file/abea0af7-da27-4dd1-80f9-958b935eeb48

I think we install it starting with V23, but I'm not positive about that.  It would need to be saved to a location on the Python search path such as the python\lib\site-packages directory under the Statistics installation (as of V22).
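To check whether the module is already visible to Python, a small sketch like the following may help. `module_available` is a hypothetical helper, not part of any SPSS package. To test the Python that Statistics itself uses, the code would presumably be run between BEGIN PROGRAM PYTHON. and END PROGRAM. in a syntax window; run in any other interpreter, it only reports on that interpreter's search path.

```python
import sys

def module_available(name):
    """Return True if this Python interpreter can import `name`."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True

# List the folders Python searches; extendedTransforms.py has to sit in one of them.
for folder in sys.path:
    print(folder)

print(module_available("extendedTransforms"))
```

If the last line prints False, copying extendedTransforms.py into one of the listed folders should resolve the "No module named extendedTransforms" warning.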

For those interested, here is a list of the functions in that module
"""Functions designed to be used with the trans module to carry out one or more transformations on casewise data.
search:                       search a string for a match to a regular expression, case sensitive or not
subs:                         replace occurrences of a regular expression pattern with specified values
templatesub:                  substitute values in a template expression
levenshteindistance:          calculate similarity between two strings
soundex:                      calculate the soundex value of a string (a rough phonetic encoding)
nysiis:                       enhanced sound encoding (claimed superior to soundex for surnames)
soundexallwords:              calculate the soundex value for each word in a string and return a blank-separated string
median:                       median of a list of values
mode:                         mode of a list of values
multimode:                    up to n modes of a list of values
matchcount:                   compare value with list of values and count matches using
                                  standard or custom comparison function
strtodatetime:                convert a date/time string to an SPSS datetime value using a pattern
datetimetostr:                convert an SPSS date/time value to a string using a pattern
lookup:                       return a value from a table lookup
vlookup:                      return a value from a table lookup (more convenient than lookup w SPSSINC TRANS)
vlookupinterval:              return a value from a table lookup using intervals
sphDist:                      calculate distance between two points on earth using spherical approximation
ellipseDist:                  calculate distance between two points on earth using ellipsoidal approximation
jaroWinkler                   calculate Jaro-Winkler string similarity measure
extractDummies                extract a set of binary variables from a value coded in powers of 2
packDummies                   pack a sequence of numeric and/or string values into a single float
translatechar                 map characters according to a conversion table
countWkdays                   count number of days between two dates that are not excluded
vlookupgroupinterval          return a value associated with a group and a set of intervals for that group
countDaysWExclusions          count days in interval excluding specified weekdays and other dates
DiceStringSimilarity          compare strings using Dice bigram metric.
Dictdict                      find best match of strings using Dice metric
setRandomSeed                 initialize random number generator
invGaussian                   inverse Gaussian distribution random numbers
triangular                    triangular random numbers



Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: distance between zipcodes

Andy W
In reply to this post by Andy W
Here is a macro that uses the law of cosines I mentioned.

DEFINE !CosDist (Lat1 = !TOKENS(1)
                /Lon1 = !TOKENS(1)
                /Lat2 = !TOKENS(1)
                /Lon2 = !TOKENS(1)
                /Rad = !DEFAULT(6371000) !TOKENS(1)
                /Res = !TOKENS(1))
COMPUTE #ToRad = ( 4*ARTAN(1) )/180.
COMPUTE #L1R = !Lat1*#ToRad.
COMPUTE #L2R = !Lat2*#ToRad.
COMPUTE #Lo1R = !Lon1*#ToRad.
COMPUTE #Lo2R = !Lon2*#ToRad.
COMPUTE #S = SIN(#L1R)*SIN(#L2R).
COMPUTE #C = COS(#L1R)*COS(#L2R)*COS(#Lo2R-#Lo1R).
COMPUTE !Res = (2*ARTAN(1) - ARSIN(#S + #C))*!Rad.
!ENDDEFINE.
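
The last COMPUTE in the macro relies on the identity acos(x) = pi/2 - asin(x), with 2*ARTAN(1) supplying pi/2 because SPSS has no ACOS. A quick Python check of that identity, as a sketch (`acos_via_asin` is a hypothetical name):

```python
import math

def acos_via_asin(x):
    """acos(x) written as pi/2 - asin(x), mirroring 2*ARTAN(1) - ARSIN(x)."""
    half_pi = 2.0 * math.atan(1.0)  # atan(1) is pi/4, so this is pi/2
    return half_pi - math.asin(x)

# Compare against the built-in acos across the valid input range.
for x in (-1.0, -0.5, 0.0, 0.3, 0.999, 1.0):
    assert abs(acos_via_asin(x) - math.acos(x)) < 1e-12
```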

I compared this to a sample of zipcode distances for New York to one location, https://dl.dropboxusercontent.com/u/3385251/Cosine_Distances.sps, to see what the error is between this and the "extendedTransforms.ellipseDist" function. (See this blog post for background, https://andrewpwheeler.wordpress.com/2014/11/19/using-the-google-distance-api-in-spss-plus-some-eda-of-travel-time-versus-geographic-distance/.)

For that sample, the average error was around 500 meters, but it grew with the distance. The percent error was always less than 0.3% in that sample (I have seen that number somewhere else as well, so it might be a universal rule of thumb). So if the typical distances are around 20 kilometers, the error from using the law of cosines above is likely to be around 60 meters. For 500 kilometers, the error would be 1500 meters, etc. That is probably reasonable for zipcode distances, as I would guess their coarseness introduces around a 1 kilometer average error even in city areas where zips are smaller.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: distance between zipcodes

Kirill Orlov
I remember something related was published on Raynald's site long ago.
http://www.spsstools.net/Syntax/Compute/ComputeDistancesOnEarth.sps


Re: distance between zipcodes

Jon K Peck
This is the complete syntax for looking up the zipcode coordinates and calculating the distance using a database of zipcodes.

* zip code file with variables zipcode, latitude, longitude (and others).
get file="c:/data/zipcodes.zsav".
dataset name zipcodes.

* test data.
data list list/id zip1 zip2(3F5.0).
begin data
1 60093 60090
2 44074 60090
3 07090 60093
4 87506 60093
5 60093 87506
6 87506 87501
7 87506 87506
end data.
dataset name clients.
format zip1 zip2(N5).

* map zip1 and zip2 to latitude, longitude coordinates.
* Note that all references to variables must match the letter case exactly.
* The terms in square brackets list the looked up values to return.
spssinc trans result=lat1 long1
/initial "extendedTransforms.vlookup('zipcode', ['latitude', 'longitude'], 'zipcodes')"
/formula func(zip1).

spssinc trans result=lat2 long2
/initial "extendedTransforms.vlookup('zipcode', ['latitude', 'longitude'], 'zipcodes')"
/formula func(zip2).

* Calculate the distance between the zipcodes using the coordinates.
* Coordinates are in degrees, so inradians is set to false.
spssinc trans result=distance
/formula "extendedTransforms.ellipseDist(lat1, long1, lat2, long2, inradians=False)".

* Just for curiosity, calculate the distance using the spherical approximation.
spssinc trans result=sphdistance
/formula "extendedTransforms.sphDist(lat1, long1, lat2, long2, inradians=False)".
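
For readers without the extension installed, the same lookup-then-distance flow can be sketched in plain Python. Everything named here is an assumption for illustration: `sph_dist` is a haversine-based spherical distance standing in for sphDist (it is not the module's actual code), `add_distances` is a hypothetical helper, and the coordinates in the usage lines are made-up placeholders rather than real zip code centroids. Zip codes are kept as strings so that leading zeros survive.

```python
import math

def sph_dist(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Haversine distance on a sphere; inputs in degrees, result in meters."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_m * math.asin(min(1.0, math.sqrt(a)))

def add_distances(pairs, coords):
    """pairs: (id, zip1, zip2) tuples; coords: dict of zip -> (lat, lon)."""
    out = []
    for rec_id, zip1, zip2 in pairs:
        a, b = coords.get(zip1), coords.get(zip2)
        # A zip missing from the table yields None instead of a distance.
        dist = None if a is None or b is None else sph_dist(a[0], a[1], b[0], b[1])
        out.append((rec_id, zip1, zip2, dist))
    return out

# Placeholder table: invented codes and coordinates, for illustration only.
coords = {"00001": (0.0, 0.0), "00002": (1.0, 0.0)}
for row in add_distances([(1, "00001", "00002"), (2, "00001", "99999")], coords):
    print(row)
```

Returning None for an unmatched zip code keeps the output row count equal to the input row count, which matches the id, zip1, zip2, distance layout asked for earlier in the thread.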


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Kirill Orlov <[hidden email]>
To:        [hidden email]
Date:        09/03/2015 08:01 AM
Subject:        Re: [SPSSX-L] distance between zipcodes
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




I remember something related was published on Raynald's site long ago.
http://www.spsstools.net/Syntax/Compute/ComputeDistancesOnEarth.sps


Re: distance between zipcodes

Richard Ristow
In reply to this post by Maguin, Eugene
At 04:32 PM 9/2/2015, Maguin, Eugene wrote:

>I ruthlessly avoided having anything to do with python or extension
>commands. Time to do something different because of this problem.

As noted by others, if spherical approximation is good enough, the
problem is easily amenable to native SPSS code.  Here's a solution
that has been posted two or three times in the past.  It's not
wrapped in a macro, but since the core is a single COMPUTE statement,
I don't think it needs to be.  It's from a (probably over-elaborate)
post I wrote on the subject, back in 2009(*):

FAQ: Computing distance from latitude and longitude (DRAFT)

Sections below are
===== (1) Solution in native SPSS transformation code
===== (2) Using Python code from Developer Central
===== (3) Test data and test run of native SPSS code


===== (1) Native SPSS code =====================================
Earth-radius values are from
http://en.wikipedia.org/wiki/Earth_radius; see also
http://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html.

*  ...............   Initialize constants      ................. .


*  The following code >REQUIRES< that angles in the base system  .
*  (SPSS) be in radians, so that the trigonometric distance is   .
*  in radians, and can be multiplied directly by the Earth's     .
*  radius.                                                       .

DO IF $CASENUM EQ 1.
*  These initializations >MUST< be performed:                    .
*  #EarthRad is the Earth's radius in whatever units you please; .
*  the calculated distance will be in those units:               .
*              6,372.7976    km,                                 .
*              3,959.873     statute miles,                      .
*              3,441.035     nautical miles.                     .
.  COMPUTE
    #EarthRad =  3959.873  /* statute miles */.

*  #AngleCvt is the number of your angle units (degrees, here)   .
*  in one of SPSS's angle units (radians). It uses the fact that .
*  ARTAN(1) is PI/4 radians or (in any angle measure) 1/8 circle .
.
.  COMPUTE
    #AngleCvt =  360  /* Number of input units in a full circle */
               /(8*ARTAN(1)).

END IF.

*  ...............   Compute distance          ................  .
*  Compute distance between points with coordinates              .
*  (lat1,lon1) and (lat2,lon2)                                   .

compute distance = #EarthRad*
     (2*artan(1)-arsin(      sin(lat1/#AngleCvt)   /* (sin(lat1)  */
                            *sin(lat2/#AngleCvt)   /* .sin(lat2)  */
                          +  cos(lat1/#AngleCvt)   /* +cos(lat1)  */
                            *cos(lat2/#AngleCvt)   /* .cos(lat2)  */
                            *cos(lon2/#AngleCvt    /* .cos(long2  */
                                -lon1/#AngleCvt)   /*     -long1))*/
                       )).

FORMAT    Distance (F7.2).
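
A Python rendering of the same COMPUTE statement may help in checking the arithmetic. This is a sketch under stated assumptions, not part of the original solution: the function names dist_arsin and dist_acos are hypothetical, and the clamp on x is an added guard that the SPSS code does not have. Since 2*ARTAN(1) is PI/2, the expression PI/2 - ARSIN(x) is simply the arc cosine of x, so the second function gives the same distance.

```python
import math

EARTH_RAD = 3959.873  # statute miles, the value used in the SPSS code


def _cos_angle(lat1, lon1, lat2, lon2):
    """Cosine of the central angle between two points given in degrees."""
    cvt = 360 / (8 * math.atan(1))  # degrees per radian, like #AngleCvt
    x = (math.sin(lat1 / cvt) * math.sin(lat2 / cvt)
         + math.cos(lat1 / cvt) * math.cos(lat2 / cvt)
         * math.cos(lon2 / cvt - lon1 / cvt))
    # Added guard (an assumption, not in the SPSS version): rounding can push
    # x a hair past 1 for identical points, which asin/acos would reject.
    return max(-1.0, min(1.0, x))


def dist_arsin(lat1, lon1, lat2, lon2):
    """Mirror of the COMPUTE statement: R * (pi/2 - asin(x))."""
    x = _cos_angle(lat1, lon1, lat2, lon2)
    return EARTH_RAD * (2 * math.atan(1) - math.asin(x))


def dist_acos(lat1, lon1, lat2, lon2):
    """Equivalent form: R * acos(x), because pi/2 - asin(x) == acos(x)."""
    return EARTH_RAD * math.acos(_cos_angle(lat1, lon1, lat2, lon2))


# Coordinates from the Providence/Chicago example used in this thread.
print(dist_arsin(41.90, 87.65, 41.73, 71.43))
print(dist_acos(41.90, 87.65, 41.73, 71.43))
```

The guard matters mostly for pairs with identical coordinates, such as a zip code compared with itself, where the true distance is zero.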


===== (2) Using Python code ====================================
From Peck, Jon, "Re: Function for arc cosine", to SPSSX-L, Thu, 7 Jun
2007 09:53:06 -0500:

"In the extendedTransforms module on SPSS Developer Central
(www.spss.com/devcentral), there are two functions that implement
distance calculations on Earth latitude and longitude coordinates.

sphDist:     calculate distance between two points on earth using
spherical approximation
ellipseDist: calculate distance between two points on earth using
ellipsoidal approximation

"Here is a simple usage example for just a single distance pair.
.............

begin program.
import spss
import extendedTransforms

fromloc = (41.90, 87.65)
toloc = (41.73, 71.43)
dist1 = extendedTransforms.ellipseDist(fromloc[0], fromloc[1],
toloc[0], toloc[1], inradians=False)
dist2 = extendedTransforms.sphDist(fromloc[0], fromloc[1], toloc[0],
toloc[1], inradians=False)
print(dist1, dist2)

end program.

===== (3) Test data and test run ===============================
The 'given' values against which the calculation is compared are:

. Providence to Chicago distance, from Jon Peck's posting "Re:
Function for arc cosine", Thu, 7 Jun 2007 09:53:06 -0500

. Others, arbitrary test points with distance calculated at site
http://www.movable-type.co.uk/scripts/latlong.html. It's not clear
why tiny discrepancies remain.


DATA LIST LIST /
    City1    lat1    lon1 City2    lat2    lon2   GivenDist
    (A4,     F6.2,   F6.2,A4,      F6.2,   F6.2,  F7.2).
BEGIN DATA
    Chi     41.90   87.65 Pvd     41.73   71.43   836.27
    A1      42.00   80.00 A2      39.00   70.00   564.33
    B1      44.00   70.00 B2      49.00   85.00   791.00
END DATA.

.  /*--  LIST /*-*/.


*  ...............   Initialize constants      ................. .

*  The following code >REQUIRES< that angles in the base system  .
*  (SPSS) be in radians, so that the trigonometric distance is   .
*  in radians, and can be multiplied directly by the Earth's     .
*  radius.                                                       .

DO IF $CASENUM EQ 1.
*  These initializations >MUST< be performed:                    .
*  #EarthRad is the Earth's radius in whatever units you please; .
*  the calculated distance will be in those units:               .
*              6,372.7976    km,                                 .
*              3,959.873     statute miles,                      .
*              3,441.035     nautical miles.                     .
.  COMPUTE
    #EarthRad =  3959.873  /* statute miles */.

*  #AngleCvt is the number of your angle units (degrees, here)   .
*  in one of SPSS's angle units (radians). It uses the fact that .
*  ARTAN(1) is PI/4 radians or (in any angle measure) 1/8 circle .
.
.  COMPUTE
    #AngleCvt =  360  /* Number of input units in a full circle */
               /(8*ARTAN(1)).

END IF.

*  ...............   Compute distance          ................  .
*  Compute distance between points with coordinates              .
*  (lat1,lon1) and (lat2,lon2)                                   .

compute distance = #EarthRad*
     (2*artan(1)-arsin(      sin(lat1/#AngleCvt)   /* (sin(lat1)  */
                            *sin(lat2/#AngleCvt)   /* .sin(lat2)  */
                          +  cos(lat1/#AngleCvt)   /* +cos(lat1)  */
                            *cos(lat2/#AngleCvt)   /* .cos(lat2)  */
                            *cos(lon2/#AngleCvt    /* .cos(long2  */
                                -lon1/#AngleCvt)   /*     -long1))*/
                       )).

FORMAT    GivenDist(F7.2).

COMPUTE   DeltaPct = 100*(Distance/GivenDist-1).
FORMATS   DeltaPct (PCT7.2).

LIST.
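
As an independent cross-check of the test run, the same three rows can be pushed through the haversine formula in Python. This swaps in a different spherical formula from the ARSIN-based one above (haversine is generally better behaved for very short distances). The names haversine and delta_pct are hypothetical, and the sketch makes no claim about where the remaining small discrepancies come from.

```python
import math

EARTH_RAD = 3959.873  # statute miles, as in the SPSS test run

# (lat1, lon1, lat2, lon2, GivenDist) copied from the test data above.
ROWS = [
    (41.90, 87.65, 41.73, 71.43, 836.27),
    (42.00, 80.00, 39.00, 70.00, 564.33),
    (44.00, 70.00, 49.00, 85.00, 791.00),
]


def haversine(lat1, lon1, lat2, lon2, radius=EARTH_RAD):
    """Great-circle distance by the haversine formula; inputs in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2 * radius * math.asin(math.sqrt(a))


def delta_pct(distance, given):
    """Percent difference from the given value, like DeltaPct in the SPSS run."""
    return 100 * (distance / given - 1)


for lat1, lon1, lat2, lon2, given in ROWS:
    d = haversine(lat1, lon1, lat2, lon2)
    print("%8.2f %8.2f %7.2f%%" % (d, given, delta_pct(d, given)))
```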
