Detect words in string variables with keyword list

Wes_77
Hello everyone!

I am pretty new at this and am having difficulty figuring out the following
problem:

I have a dataset with various string variables in the form of social media
comments. I want to detect certain words within these comments based on
a keyword list, which contains over 11000 words. Basically, all I need
is a new variable that shows a 1 (keyword detected) or a
0 (keyword not detected).

Example:

Keyword list:
alright
bright
custom
....
....
zinc

Comment 1: "I think he was *alright*." (VAR=1)
Comment 2: "I like her." (VAR=0)
Comment 3: "I should really take *zinc* for my health." (VAR=1)



I read through some similar threads, but they contained instructions for
programming in Python. Is there an easier way to figure this out?

Thanks,
Wes



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Detect words in string variables with keyword list

Jon Peck
You really want to create 11000 variables?

On Sun, Jun 10, 2018 at 9:10 PM Wes_Mantooth <[hidden email]> wrote:
--
Jon K Peck
[hidden email]


Re: Detect words in string variables with keyword list

David Marso
Administrator
In reply to this post by Wes_77
11000 variables? Horrible [expletives deleted] idea!!
<http://spssx-discussion.1045642.n5.nabble.com/basic-string-question-td5735995.html>
There is a basic syntax solution in the middle of that mess.



-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
"Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Detect words in string variables with keyword list

Ives, Melissa L
In reply to this post by Jon Peck

Given the example, it seems to me that the OP wants a single variable that is coded 1 if any of the 11000 words are found and 0 if none are found.
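
To make that reading concrete, here is a minimal sketch in Python, offered only as an illustration. The thread itself contains no code, so everything here is an assumption: the function name `flag_comments` is hypothetical, the keyword list holds only the four words named in the example, and matching is done on whole words regardless of case (so "brighter" would not count as a hit for "bright").

```python
import re


def flag_comments(comments, keywords):
    """Return 1 for each comment that contains at least one keyword, else 0.

    Assumption: a "hit" means a whole-word, case-insensitive match.
    """
    # Lowercase the keyword list once and keep it in a set for fast lookup.
    keyword_set = {k.lower() for k in keywords}
    flags = []
    for comment in comments:
        # Break the comment into lowercase word tokens, dropping punctuation.
        tokens = re.findall(r"[a-z0-9']+", comment.lower())
        flags.append(1 if any(t in keyword_set for t in tokens) else 0)
    return flags


# Only the keywords actually named in the original post; the rest are elided there.
keywords = ["alright", "bright", "custom", "zinc"]
comments = [
    "I think he was alright.",
    "I like her.",
    "I should really take zinc for my health.",
]
print(flag_comments(comments, keywords))  # [1, 0, 1]
```

Because the keywords sit in a set, each comment costs one lookup per word rather than one scan per keyword, which matters with a list of over 11000 entries. Whether substring matches or inflected forms should also count is a decision the original post leaves open.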


--------------

Melissa Ives



From: SPSSX(r) Discussion <[hidden email]> on behalf of Jon Peck <[hidden email]>
Sent: Sunday, June 10, 2018 11:13 PM
To: [hidden email]
Subject: Re: [SPSSX-L] Detect words in string variables with keyword list
 



This correspondence contains proprietary information some or all of which may be legally privileged; it is for the intended recipient only. If you are not the intended recipient you must not use, disclose, distribute, copy, print, or rely on this correspondence and completely dispose of the correspondence immediately. Please notify the sender if you have received this email in error. NOTE: Messages to or from the State of Connecticut domain may be subject to the Freedom of Information statutes and regulations.


Re: Detect words in string variables with keyword list

John F Hall

No, I don't think so. You need to reduce the list or perhaps ask for help from Cambridge Analytica, Google, MI6, CIA or FBI?

Seriously, have a look at Atlas.ti [https://atlasti.com/product/v8-windows/]

Lesley Andres, Designing and Doing Survey Research, (Sage 2012) has a couple of short sections introducing SPSS (quantitative) and Atlas.ti (text) to analyse survey data, and describing recent extremely promising developments enabling communication between them.

 

John F Hall  MA (Cantab) Dip Ed (Dunelm)
[Retired academic survey researcher]

Email:     [hidden email]
Website:   Journeys in Survey Research
Course:    Survey Analysis Workshop (SPSS)
Research:  Subjective Social Indicators (Quality of Life)

 

From: SPSSX(r) Discussion <[hidden email]> On Behalf Of Ives, Melissa L
Sent: 11 June 2018 15:57
To: [hidden email]
Subject: Re: Detect words in string variables with keyword list

 
