Sentiment analysis of your Reddit account


#1

Originally published at: http://boingboing.net/2017/01/04/sentiment-analysis-of-your-red.html


#2

Bogus.

It claimed the following sentence of mine almost pegged the negativity meter:

-0.8625  Does anyone know how badly hurt this poor guy was?

The rest that are flagged as negative are at worst neutral.


#3

I’ve written some fiction (in /r/writingprompts and /r/feghoot mainly), and some of those sentences are weird out of context. I was like, “I wrote that?!” for the two “most negative” comments before I realized that they were each part of dark short stories.


#4

Less bogus and more just extremely rudimentary. Sentiment analysis is still in a pretty nascent form and does a relatively poor job with very short clips like yours. In its most basic form it’s essentially just looking at the major words of your sentence and applying a weight to each of them. So your sentence would look like:

Those are all pretty negative, and there are no real positive words to change the score, so it comes out as a very negative sentence. To figure out that this is a neutral question with some concern, you’d have to teach the computer that ‘poor guy’ in many contexts, and especially in questions, can indicate concern. That’s really tough to do, because you can’t afford to spend time programming all these particular cases; you’ll just be chasing edge cases forever.
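The word-weighting idea described in this post can be sketched roughly as follows. This is a minimal illustration, not the tool's actual method: the `LEXICON` table, its weights, and the `naive_score` helper are all invented here, and a real tool would use a much larger word list and its own scoring rule.

```python
import re

# Hypothetical word weights, invented for illustration only.
LEXICON = {
    "badly": -0.6,
    "hurt": -0.8,
    "poor": -0.5,
    "dead": -0.9,
    "best": 0.8,
    "love": 0.9,
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def naive_score(text):
    """Average the weights of the words found in the lexicon; 0.0 if none match."""
    weights = [LEXICON[w] for w in tokenize(text) if w in LEXICON]
    return sum(weights) / len(weights) if weights else 0.0

question = "Does anyone know how badly hurt this poor guy was?"
print(naive_score(question))                      # negative: only "badly", "hurt", "poor" match
print(naive_score("All the best, Maurice Moss"))  # positive: only "best" matches
print(naive_score("Dear Sir/Madam"))              # 0.0: nothing matches
```

Because the scorer only sees isolated words, the concerned question comes out negative even though a human reader would call it neutral or sympathetic.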


#5

So what you’re saying is that it’s so rudimentary, it borders on being virtually useless?


#6

I have a couple sentences describing a recipe that are rated negative because they talk about a violent boil and eating (cranberry) skin.


#7

Yeah, the theme here is that it’s not doing any semantic analysis at all. Every one of the phrases it flagged as highly negative was me talking about some positive way of thinking about something that could be thought about as negative.


#8

I love it …

One of my negative statements was:

“Fire! Fire! Help me! 123 Carrendon Road.”

And positive statements from the same comment:

“All the best, Maurice Moss”

I expect the first part of that posting was too neutral to show:

“Dear Sir/Madam”

Somehow I think the analysis is a bit faulty but it says I’m slightly positive overall so I will take it.


#9

This sentiment analysis stuff is a joke. A sentence is more than just the sum of the words in the sentence, and it also depends on context gleaned from sentences before and after as well. To do it for real, one would need significant human input, classifying sentences by sentiment, and then feeding those sentences into a feature extractor and then something like a Support Vector Machine for ranked classification. The propellerheads who like these sentiment analysis projects don’t seem to like the human input part, and they don’t give as much thought to feature extraction as I’d like either.
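The pipeline described in this post (hand-labelled sentences, then feature extraction, then a trained classifier) might look something like the sketch below. To keep it dependency-free it swaps in a plain perceptron as a much simpler linear classifier in place of the Support Vector Machine mentioned above. The four labelled phrases, the label scheme, and all function names are assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical hand-labelled examples: -1 = negative, +1 = not negative.
LABELLED = [
    ("poor service", -1),
    ("poor guy", +1),
    ("dead battery", -1),
    ("dead simple", +1),
]

def features(text):
    """Unigrams plus adjacent-word bigrams, so 'poor guy' differs from 'poor service'."""
    words = re.findall(r"[a-z']+", text.lower())
    bigrams = [a + "_" + b for a, b in zip(words, words[1:])]
    return words + bigrams + ["<bias>"]

def score(weights, text):
    """Sum of the learned weights for every feature present in the text."""
    return sum(weights[f] for f in features(text))

def train(examples, max_epochs=100):
    """Perceptron training: nudge weights toward the label on every mistake."""
    weights = defaultdict(float)
    for _ in range(max_epochs):
        mistakes = 0
        for text, label in examples:
            if label * score(weights, text) <= 0:
                for f in features(text):
                    weights[f] += label
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights

def predict(weights, text):
    return 1 if score(weights, text) > 0 else -1

weights = train(LABELLED)
print(predict(weights, "this poor guy"))  # not negative
print(predict(weights, "poor service"))   # negative
```

The bigram features are what let the classifier treat "poor guy" and "poor service" differently; with single words only, "poor" would have to carry one weight for both. A toy set of four phrases says nothing about real-world accuracy, which is where the human labelling effort comes in.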


#10

I guess the best thing this demonstrates is how useless scoring individual words is without considering context.

Though overall, maybe someone who says ‘asshole’ a lot is more negative than someone who says ‘love’ a lot, so the aggregate score could be more valid than the individual post score, as you’d expect.

Looking at the most positive user: apparently it’s a chatbot, or is indistinguishable from one. Which I guess you’d expect from a bot that was trying to chat up dumb guys.

The most negative user is being a total pottymouth, but he’s dealing with ‘steal and repost for upvotes’ people, who are a cancer.


#11

Correct. This is what it calculated to be one of my most positive sentences:

Well you’ve certainly convinced me.

I guarantee that once put into context, that is not even a slightly positive sentence.

(Its most negative rated one is me discussing a local mobile carrier. I’m actually being relatively positive, but the sentence includes the word “dead”, so…)


#12


#13

All my negative ones were about how to “fight”, “beat”, “defeat” or “attack” a video game boss, so not too much to worry about there.

Of course, “video game boss” is my secret code for women/Jews/blacks/gays/liberals, so it’s probably on the money /s


#14

On the other hand, this is among the most positive, and it’s a work of art:


#15

Like most things in the whole big data/machine learning space it’s useful in its own little corner. Sentiment analysis is best suited for limited situations like reviews. Most reviews are relatively focused on the items being reviewed so you don’t run into issues like the sentence you mentioned where it’s a question with sympathy.


#16

Which is why I said this implementation is bogus. If it’s not designed to read ambiguity, then why use it for that? And I’d like to think that the online reviews I leave are as nuanced as the comments I leave elsewhere.


#17

White Nationalist video games. Make America Game Again.

