read file names into spss

Zhenzhen Wang
Hi,

My problem is a little bit complicated. I'll try to express it clearly.

1, I have a lot of text files. Actually each file represents a paper.

2, The rule of naming is "journal-year-issue-startpage-type". For example, a paper published in Journal of Communication might be named as "JOC-2000-1-20-a.txt".

3, I want to read these file names into spss, so that I can match or compare them with references exported from database.

4, I think the first step is to read each file name into a spss file, then add the files up. I wrote a syntax to do the first step. But I keep getting error messages.
   The red part is where I got error messages. I want to attach my syntax and test data but the mail system does not allow it, so I attach it as part of this letter.

Can you help me? Thank you very much!

Wang Zhenzhen


Each test file contains some words, but I don't really care what the words are. The names of the test files are:
AJC-2000-1-1-a
AJC-2000-1-7-b
AJC-2000-1-2-a
AJC-2000-1-2-b
(they are in a folder named AJC)
EJOC-2000-1-5-a
EJOC-2000-2-3-a
EJOC-2000-2-10-b
(they are in a folder named EJOC)

The syntax is as below.


define !path()'d:\test\'
!enddefine.

set mprint=off.
define !readname (journal=!charend('/')/type=!cmdend)
   !do !var1=1 !to 2.
       !do !var2=1 !to 10.
           get data
               /TYPE=TXT
               /FILE=!path+!QUOTE(!concat(!journal,'\',!journal,'-2000-',!var1,'-',!var2,'-', !type,'.txt'))
               /DELCASE=VARIABLES 1
               /DELIMITERS=' '
               /ARRANGEMENT=DELIMITED
               /FIRSTCASE=1
               /IMPORTCASE=first 1
               /VARIABLES=word A100.
           string filename (a50).
           compute filename=!concat(!journal, '-2000-', !var1,'-',!var2,'-', !type).
           save outfile=!path+!QUOTE(!concat(!journal,'\',!journal,'-2000-', !var1,'-',!var2,'-', !type,'.sav'))
       !doend.
   !doend.
!enddefine.

set mprint=on.
!readname journal=AJC/type=a.
!readname journal=AJC/type=b.
!readname journal=EJOC/type=a.
!readname journal=EJOC/type=b.

-- 
Department of Media and Communication
City University of Hong Kong


Re: read file names into spss

David Marso
Administrator
**ALWAYS** post your error messages!!!!!
When I run your MACRO I get:
>Error.  Command name: GET DATA
>(2269) Failure opening file: d:\test\AJC\AJC-2000-1-1-a.txt
>This command not executed.
Confirm these files exist under the subdirectory d:\test\AJC
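
One way to run that check outside SPSS is a short Python sketch like the one below. It assumes the layout described in the original post (a root folder d:\test with one subfolder per journal) and rebuilds the same 2 x 10 grid of names that the macro loops over. The function names expected_names and missing_files are made up for illustration.

```python
from pathlib import Path


def expected_names(journal, doc_type, year=2000, issues=range(1, 3), pages=range(1, 11)):
    """Rebuild the file names the macro tries to open for one journal/type."""
    return [f"{journal}-{year}-{issue}-{page}-{doc_type}.txt"
            for issue in issues for page in pages]


def missing_files(root, journal, doc_type):
    """Return the expected names that do not exist under root/journal."""
    folder = Path(root) / journal
    return [name for name in expected_names(journal, doc_type)
            if not (folder / name).is_file()]


if __name__ == "__main__":
    root = r"d:\test"  # assumption: same root as the !path macro
    for journal, doc_type in [("AJC", "a"), ("AJC", "b"), ("EJOC", "a"), ("EJOC", "b")]:
        gaps = missing_files(root, journal, doc_type)
        print(f"{journal} type {doc_type}: {len(gaps)} of 20 expected files are missing")
```

Every name reported as missing corresponds to one GET DATA call in the macro that will fail with a "Failure opening file" error.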

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: read file names into spss

Albert-Jan Roskam
The SAVE OUTFILE command does not end with a period. Also, perhaps a CACHE/EXECUTE *might* be needed after the GET DATA command. For database queries it's definitely needed; in this case however, I am not sure (but it won't hurt).

Maybe SPSSINC PROCESS FILES could be used instead of the macro (or of course, Python).
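
For the Python route, a minimal sketch might look like the following. It never opens the text files; it only walks the journal folders, splits each name according to the "journal-year-issue-startpage-type" rule from the original post, and writes one row per file to a CSV that SPSS can read afterwards. The function names, the column names and the CSV output are assumptions for illustration, not something taken from the original syntax.

```python
import csv
from pathlib import Path

FIELDS = ["filename", "journal", "year", "issue", "startpage", "type"]


def parse_name(stem):
    """Split a name such as 'AJC-2000-1-7-b' into its five parts.

    Returns None when the name does not fit the rule. Assumes the
    journal abbreviation itself contains no hyphen.
    """
    parts = stem.split("-")
    if len(parts) != 5:
        return None
    journal, year, issue, startpage, doc_type = parts
    if not (year.isdigit() and issue.isdigit() and startpage.isdigit()):
        return None
    return {"filename": stem, "journal": journal, "year": int(year),
            "issue": int(issue), "startpage": int(startpage), "type": doc_type}


def collect_names(root):
    """Return one record per conforming .txt file in the subfolders of root."""
    records = []
    for path in sorted(Path(root).glob("*/*.txt")):
        record = parse_name(path.stem)
        if record is not None:
            records.append(record)
    return records


def write_csv(records, out_path):
    """Write the records to a CSV file with a header row."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)
```

Something like write_csv(collect_names(r"d:\test"), "filenames.csv") would then produce a single file of names, which avoids guessing the issue and page ranges in a loop and saving one .sav file per paper.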
 
Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From: David Marso <[hidden email]>
To: [hidden email]
Sent: Wednesday, January 11, 2012 7:21 PM
Subject: Re: [SPSSX-L] read file names into spss

**ALWAYS** post your error messages!!!!!
When I run your MACRO I get:
>Error.  Command name: GET DATA
>(2269) Failure opening file: d:\test\AJC\AJC-2000-1-1-a.txt
>This command not executed.
Confirm these files exist under the subdirectory d:\test\AJC


--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/read-file-names-into-spss-tp5137141p5137617.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.


Re: read file names into spss

David Marso
Administrator
Do all of the files blahblahblah-1-1.txt ... blahblahblah-2-10.txt exist?  I'll bet not, hence all the errors...
Verify the conditions of your data arrangements.
------------------
The following will suffice rather than the **VERBOSE** GET DATA:
------------------
define !readname (journal=!charend('/')/type=!cmdend)
!do !var1=1 !to 2.
  !do !var2=1 !to 10.
    data list file=!path+!QUOTE(!concat(!journal,'\',!journal,'-2000-',!var1,'-',!var2,'-', !type,'.txt'))/word (A100).          
    string filename (a50).
    compute filename=!QUOTE(!concat(!journal, '-2000-', !var1,'-',!var2,'-', !type)).
    save outfile=!path+!QUOTE(!concat(!journal,'\',!journal,'-2000-', !var1,'-',!var2,'-', !type,'.sav')).
  !doend.
!doend.
!enddefine.