Secret service developing a sarcasm detector. Oh great

I feel so much safer now. Sarcasm? Ask the machine.

1 Like

Although I was just a humble student assistant at the time, technically I did work on natural language classifiers. I am pretty sure that someone who knows about those things told someone who doesn't that false positives were going to be a problem, and this is a bureaucratic solution to a half-remembered problem.

3 Likes

Ah, got it, that explanation makes total sense.

They already have a system to suck up every email, text message and IM chat. That system already flags messages with "bomb," "president," "blow up congress," etc.

They know that the majority of these messages are silly ("sarcasm," although that's not exactly the right word). But the algorithm isn't smart enough to know that, so they get a lot of "false positives" – i.e., false matches that aren't real threats.

So they are requesting a system that would "detect" these false positives. It's poor wording, because what they actually want is to detect sarcasm in order to reduce false positives.
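The keyword-flagging stage described above can be sketched roughly like this; the watchlist phrases and messages here are made up for illustration, not drawn from any real system:

```python
# Hypothetical sketch of a keyword-based message flagger.
# Watchlist phrases and messages are invented for illustration.

WATCHLIST = {"bomb", "blow up", "president"}

def flag(message: str) -> bool:
    """Return True if any watchlist phrase appears in the message."""
    text = message.lower()
    return any(phrase in text for phrase in WATCHLIST)

# A joking message still matches -- this is the false positive problem.
print(flag("ugh, monday again, i'm going to blow up"))  # flagged, but harmless
print(flag("see you at lunch"))                          # not flagged
```

Substring matching like this is exactly why the false positive rate is high: the matcher has no idea whether "blow up" is a threat or a figure of speech.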

Honestly this isn't so dumb. They aren't requesting a system that will detect 100% of false positives; they just want to reduce them.

I don't like that they are collecting everything, of course, but since they are collecting everything they ought to be smarter about it. Whichever vendor sells them a system will almost certainly sell them a crappy system, but that doesn't mean that the aim isn't valid (in their world).

2 Likes

The Secret Service wants to detect sarcasm on the internet?

Oh no, what a personal disaster.

2 Likes

To be fair, I'm coming at this from the perspective of someone who has a lot of negative interactions with IT projects because of the insane way people write specifications. I'm pretty sure I know exactly what they mean and it makes sense. I'm also pretty sure that is not what they are going to get.

3 Likes

You are dam/damn/darn/durn right! Projects are mostly nightmares because nobody can read anybody else's code. I can barely even read my own code from earlier today, even with good comments and a plan. I was on the phone yesterday with a client and she was asking me simple questions about what I did, and I finally just said, "Well fuck, I'm going to have to get back to you because I can't describe it accurately right this second, even though it's a simple, straightforward question. Sorry."

1 Like

Here's how I'd approach this whole false positive thing with this project.

First, read more on them:

A set of false positives here could be benchmarked in the testing phase:

After extensive cataloging and getting your detection system in place, run tests on PEOPLE. Presumably a range of people. Run tests with a known outcome, so, say, you KNOW it's sarcasm. Then after many runs, you will know the error rate in your detection system. You can tune it to make it more sensitive, and therefore reduce the false positive rate. And to detect "false positives", you will be making a list of the stuff that causes you the most problems. When one comes up on the TwitTube, you have "detected a false positive." Because you determined most of them a priori.
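That benchmarking loop could be sketched like this; the scores and labels are invented stand-ins for what a real detector and a hand-labeled test set would produce:

```python
# Sketch of benchmarking a detector against hand-labeled test cases.
# Scores and labels are invented; a real system would get scores from
# an actual classifier run over messages with known outcomes.

labeled = [
    (0.9, True),   # (detector score, known-sarcasm label)
    (0.7, True),
    (0.6, False),
    (0.2, False),
    (0.8, False),
]

def false_positive_rate(threshold: float) -> float:
    """Fraction of known non-sarcasm cases the detector flags anyway."""
    negatives = [score for score, is_sarcasm in labeled if not is_sarcasm]
    flagged = [s for s in negatives if s >= threshold]
    return len(flagged) / len(negatives)

# Tuning the decision threshold changes the measured error rate.
print(false_positive_rate(0.5))   # 2 of 3 negatives flagged
print(false_positive_rate(0.85))  # 0 of 3 negatives flagged
```

The point is just that you can only measure (and tune) the error rate because the test cases have known outcomes.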

That's what I took it to mean, and how I'd deal with it.

But no, I wouldn't ever have said "detect false positives" in the first place. That's unintelligent n00b speak.

1 Like

What if my apparently earnest sarcasm is, in fact, sarcastic?

1 Like

Then you belong in Inception, because you have couched your sarcasm in an earnest remark within an irony within an idiom. You are doing well and have passed from Padawan stage to full on Jedi Language Knight status; here is your Strunk & Light Saber.

4 Likes

[quote="awjt, post:27, topic:33497"]
You can tune it to make it more sensitive, and therefore reduce the false positive rate.[/quote]Other way round, isn't it?

[quote="awjt, post:27, topic:33497"]And to detect "false positives", you will be making a list of the stuff that causes you the most problems. When one comes up on the TwitTube, you have "detected a false positive." Because you determined most of them a priori.
[/quote]If I understand you correctly, then I am not sure that makes much sense. If you are able to predict false positives, then you just return a "negative" answer and avoid them. Otherwise you would end up with the willfully obtuse system suggested by the phrasing in the requirements: a system that answers a binary question, ideally correctly, but sometimes incorrectly against its better knowledge.

What I said was correct: If it's less sensitive, then there are more type 1 errors, or more false positives. If it's more sensitive, then the true positive rate increases and the type 1 error rate decreases because there are fewer false+.

For the second one:

I'm suggesting white, black and gray. You determine gray (false positives) a priori and by lack of fitting into white or black. Whenever you come across one of those annoying ones that are on your list from the testing, it isn't positive or negative… it goes into the gray bin. You've detected it. OR, an alternate path is that something comes up that isn't on ANY list; that also goes into the gray bin.

Something like this: I want to 1010110101100010010101010011 the 1010101011010101010111

What the heck is that? False positive? I don't know, says the detector. Throw it in the gray bin.
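The white/black/gray triage could be sketched like this; the list contents are invented stand-ins for whatever the testing phase would actually produce:

```python
# Sketch of the white/black/gray triage described above.
# List contents are invented placeholders, not real data.

WHITE = {"see you at lunch"}                    # known harmless
BLACK = {"actual threat phrase"}                # known real positives
GRAY = {"i'm going to blow up congress lol"}    # known false-positive patterns

def triage(message: str) -> str:
    msg = message.lower()
    if msg in WHITE:
        return "negative"
    if msg in BLACK:
        return "positive"
    if msg in GRAY:
        return "gray"   # on the known-problem list from testing
    return "gray"       # matches no list at all: also gray

print(triage("see you at lunch"))           # negative
print(triage("I want to 10101 the 1110"))   # gray: on no list
```

Exact-match lookups stand in for whatever real matching the system would use; the structure is the point, not the matching.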

No. Just look at the trivial classifiers.

recall (sensitivity) = true positives / (true positives + false negatives)

If you just return "negative" all the time, then it's 0/(0+FN), i.e. zero sensitivity and not a single false positive anywhere.

If you return "positive" all the time, then it's TP/(TP+0), i.e. perfect sensitivity but also maximal FP.
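Plugging the two trivial classifiers into that recall formula, with a tiny invented labeled set:

```python
# Recall of the two trivial classifiers, per the formula above:
# recall = TP / (TP + FN). The ground-truth labels are invented.

labels = [True, True, False, False, False]  # is each message a real threat?

def recall(predictions, truth):
    tp = sum(1 for p, t in zip(predictions, truth) if p and t)
    fn = sum(1 for p, t in zip(predictions, truth) if not p and t)
    return tp / (tp + fn)

always_no = [False] * len(labels)   # never flag anything
always_yes = [True] * len(labels)   # flag everything

print(recall(always_no, labels))    # 0.0: zero sensitivity, zero false positives
print(recall(always_yes, labels))   # 1.0: perfect sensitivity, maximal false positives
```

Both extremes are useless, which is why tuning is always a trade between the two error types.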

Regarding the other one, I see now that you mean some kind of confidence measure. You can do that. I interpreted the whole thing as a binary classification task where withholding judgment is effectively a negative.

Sensitivity IS the true positive rate. Less sensitive = lower true positives and therefore higher false positive. Less sensitive does not mean a lower false positive rate. Less sensitive means a higher false positive rate!

If sensitivity is .8, then type 1 error is .2.
If sensitivity is .9, then type 1 error is .1.
If sensitivity is .95, then type 1 error is .05.

As sensitivity increases, the error rate (FP rate) decreases.

The names of the boxes:
True Positive | False Negative
False Positive | True Negative

No. Really no. I am not sure what is going wrong here and I would like to stop.

It's basic epidemiology. You're probably just coming at it from a different background. That's fine.

1 Like

Computational linguistics. It's just that the disagreement is so surprisingly fundamental. I do not even agree with your labels for your chart. I may be making some horrible mistake here, but I am really not seeing it.

I don't think you're making a mistake. I am talking probability, and you are talking counts. Use counts in your equations and your logic is sound, so it's the weird inverse space of putting things into a probability context that is screwing up this conversation. Sorry to confuse. We are both right.

…and runs in IE8.

Can't tell if serious, or …

Ohhh, a SARCASM detector, well that's a REAL useful invention!

Whoop! whoop! whoop! whoop!

1 Like

What if my apparently earnest sarcasm is, in fact, sarcastic?

I detected a tinge of sarcasm in that statement.