# Opinion Mining / Feedback Mining - How to derive relative positiveness among Words using Statistical/Analytical Models?

One of my Data Science pet projects for this weekend was "Opinion / Feedback Mining".

The idea is to understand the polarity of words (positive, negative, neutral) in sentences and determine the overall polarity of each sentence.

After successfully finding the polarities, I am stuck on another question: which of the positive words are MORE positive relative to the others?

For example, "great" vs. "best": which one is more positive? Is there a proven way to assign weights to positive words, statistically or analytically, so as to measure degrees of positiveness?

Any thoughts on this would be greatly appreciated!

Tags: Bag, Data, Mining, NLP, Opinion, Polarity, Science, Word, Words, of

### Replies to This Discussion

I'm not sure there is a proven way; by that I assume you mean an existing algorithm.

But just brainstorming here: in survey analysis we typically use Likert-type responses. I prefer 5 levels (Strongly Positive, Positive, Neutral, Negative, Strongly Negative). A downside is that this assumes exactly one weighted point separates each category (e.g., scores of 5, 4, 3, 2, 1, respectively).

An upside is that, after quantifying, one can perform the usual statistical tests on the weighted values (scores).
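
To make the scoring idea concrete, here is a minimal Python sketch, assuming the equal one-point spacing described above. The name `LIKERT_SCORES`, the helper `score_responses`, and the sample responses are all made up for illustration.

```python
from statistics import mean, stdev

# Assumed mapping: one point separates each category (5, 4, 3, 2, 1).
LIKERT_SCORES = {
    "Strongly Positive": 5,
    "Positive": 4,
    "Neutral": 3,
    "Negative": 2,
    "Strongly Negative": 1,
}


def score_responses(responses):
    """Convert a list of Likert labels into their numeric scores."""
    return [LIKERT_SCORES[label] for label in responses]


# Hypothetical responses, for illustration only.
responses = ["Positive", "Strongly Positive", "Neutral", "Positive"]
scores = score_responses(responses)

# Once quantified, the usual summary statistics apply.
print(scores, mean(scores), stdev(scores))
```

The equal spacing is the weak point: nothing guarantees that the gap between Positive and Strongly Positive equals the gap between Neutral and Positive.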

As for the "great" versus "best" question, I would argue there is a difference, but it may be difficult to quantify. It almost sounds like you need to draw the line somewhere and mark some words as Strongly Positive within the existing positive word list or lookup you are using, which sounds like it could take an eternity. I'm not sure whether there is a list out there of very positive words, though I know there are some for positive words.
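
A rough sketch of that "draw the line" idea in Python, assuming a hand-picked subset of a positive word list gets promoted to Strongly Positive. Both word sets and the function `word_score` are hypothetical; a real project would start from whatever positive lexicon it already uses.

```python
# Hypothetical word lists, for illustration only.
POSITIVE_WORDS = {"good", "great", "best", "nice", "excellent"}
# Hand-picked subset promoted to "Strongly Positive" (an assumption, not a standard list).
STRONGLY_POSITIVE = {"best", "excellent"}


def word_score(word):
    """Return 5 for strongly positive words, 4 for other positive words, 3 otherwise."""
    w = word.lower()
    if w in STRONGLY_POSITIVE:
        return 5
    if w in POSITIVE_WORDS:
        return 4
    return 3  # words not in either list are treated as neutral


for w in ["Great", "Best", "table"]:
    print(w, word_score(w))
```

The manual curation of `STRONGLY_POSITIVE` is exactly the part that could take an eternity.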

Again, just brainstorming.

Thank you so much, Roque, for your response.

Yes, as you mentioned, in sentiment analysis we assign separate weighted points / levels to distinguish Positive vs. Neutral vs. Negative (or maybe use "Bag of Words"). I have not used a lookup / bag of words, as it makes the model static; instead I worked out a programmatic approach that determines the nature of the words dynamically.

Now imagine I am left with only positive words, and I am trying to distinguish positive from more positive so that I can create connections between words that roll up into greater attributes. Think of this data set as employees' internal feedback.

The final product of my idea/project may be a network graph of these positive and more-positive attributes that forms an individual's profile / identity. I hope I have expressed my thoughts clearly here.
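
To illustrate the kind of graph I have in mind, here is a minimal Python sketch. It assumes each feedback entry has already been reduced to its positive words, and it links words that appear in the same entry. The function `build_cooccurrence_graph` and the sample entries are hypothetical, and co-occurrence counting is only one possible way to define the connections.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(feedback_entries):
    """Return (node_counts, edge_counts) for words appearing together in an entry."""
    nodes = Counter()
    edges = Counter()
    for words in feedback_entries:
        unique = sorted(set(words))  # sort so each pair has one canonical order
        nodes.update(unique)
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return nodes, edges


# Hypothetical feedback for one employee, already filtered to positive words.
entries = [
    ["reliable", "helpful", "creative"],
    ["helpful", "reliable"],
    ["creative", "best"],
]
nodes, edges = build_cooccurrence_graph(entries)
print(nodes.most_common())
print(edges.most_common())
```

Edge counts here only say which words travel together; they do not say which word is more positive, so a separate per-word weight would still be needed.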

Has anybody worked on such problem statements or statistical methods? Any thoughts on this would be greatly helpful.
