Originally published at: https://boingboing.net/2019/06/25/pictionary-considered-harmful.html
…
Is this intended to respond to volume and tone, or is it actually monitoring speech and looking for threatening words?
Because it’s quite easy to be aggressive towards someone while speaking in calm, measured tones.
Looking at the article, it looks like it’s a ‘black box’ ML-trained algorithm, and we all know how well those don’t work…
Like almost everything labeled as “AI”, these are just going to generate more noise for people to ignore. These devices are bullshit, fear-based cash grabs.
This. My guess is that a relative (or similar) of someone on the school system’s budgetary review committee is in the “aggression detection microphones” business.
If you’ve ever been in a school or worked with kids, you understand what a loud and chaotic environment that is on the best day. (Imagine an AI that could pick up “aggression” sounds in a crowded lunchroom…)
We found that higher-pitched, rough and strained vocalizations tended to trigger the algorithm. For example, it frequently triggered for sounds like laughing, coughing, cheering and loud discussions. While female high school students tended to trigger false positives when singing, laughing and speaking, their high-pitched shrieking often failed to do so.
Those two assertions don’t seem congruent.
Also, does anyone else find a technical document describing the voices of female HS students as “high-pitched shrieking” a little… gross?
Given that the shrieking was mentioned in contrast to “singing, laughing and speaking”, my read wasn’t ‘shrill shrieks, natural communication method of the late-larval female human’ but ‘the system is total trash at female students because it tends to pick up their normal behavior as a trigger, and ignore the sounds to be expected if a group of them saw a gunman or similarly worrisome thing’.
Wording could be better, but with the benefit of the immediate context it seems reasonable enough: screaming and shrieking are sounds you might want to be alerted to, which women and girls usually produce at higher pitches, while “singing, laughing and speaking” are things they do basically all the time (at least in the hall between classes) and so are really bad stimuli to generate false positives on.
On another matter: has anyone tested these aggression-detection systems on the school ‘resource officers’? If a cop asserting his authoritay doesn’t set them off they are truly hopeless…
I have to wonder if someone was cheaping out on mics and expecting magic AI to make up the difference, or if I just misjudge how distinctive firearm discharges are (especially if they’ve bounced off a few walls and such).
According to the source I checked (“Deciphering Gunshot Recordings”, Robert C. Maher and Steven R. Shaw; link when I’m not on a phone), gunshots (at least from weapons not designed for discretion) have several characteristics that slammed doors definitely don’t, but which normal, inexpensive recording systems aren’t necessarily going to be good at picking up. Peak volumes up to 150 dB will clip pretty decisively (and potentially exceed the level at which a given mic tops out; those sorts of volume levels aren’t the ones that people normally care most about), and initial shockwaves of ~300 microseconds, and 5 milliseconds for the muzzle blast, are also really fast for normal mics and sample rates.
I’d expect better (if by no means perfect) discrimination between things like doors and hardware refined over some centuries to burn all its propellant as rapidly and uniformly as possible if nice recording gear is being used carefully; but someone just throwing more or less stock audio recording equipment at the problem and expecting the AI to fix it? FFS guys…
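For a rough sense of scale, here is a back-of-envelope sketch in Python of the numbers above. The 150 dB, ~300 microsecond and 5 millisecond figures are the ones quoted from the source; the sample rates and the 120 dB SPL mic overload point are my own assumptions for illustration, not the specs of any actual product.

```python
# Figures quoted in the comment above (from the cited source).
PEAK_SPL_DB = 150.0        # peak gunshot level
SHOCKWAVE_S = 300e-6       # initial shockwave, ~300 microseconds
MUZZLE_BLAST_S = 5e-3      # muzzle blast, ~5 milliseconds

# ASSUMPTIONS for illustration only: not taken from the article or any product.
ASSUMED_MIC_OVERLOAD_DB = 120.0
ASSUMED_SAMPLE_RATES_HZ = (8_000, 16_000, 44_100, 96_000)


def samples_spanned(duration_s, sample_rate_hz):
    """Number of samples that fall inside an event of the given duration."""
    return duration_s * sample_rate_hz


def overload_ratio(peak_db, limit_db):
    """How many times the sound pressure exceeds the mic's limit.

    Sound pressure level is 20 * log10(p / p_ref), so a difference in dB
    maps to a pressure ratio of 10 ** (difference / 20).
    """
    return 10 ** ((peak_db - limit_db) / 20)


def summarize():
    """One row per assumed sample rate: (rate, shockwave samples, blast samples)."""
    return [
        (rate,
         samples_spanned(SHOCKWAVE_S, rate),
         samples_spanned(MUZZLE_BLAST_S, rate))
        for rate in ASSUMED_SAMPLE_RATES_HZ
    ]


if __name__ == "__main__":
    for rate, shock, blast in summarize():
        print(f"{rate:>6} Hz: shockwave ~{shock:.1f} samples, "
              f"muzzle blast ~{blast:.1f} samples")
    ratio = overload_ratio(PEAK_SPL_DB, ASSUMED_MIC_OVERLOAD_DB)
    print(f"Peak is ~{ratio:.1f}x the pressure the assumed mic can take")
```

At an assumed 8 kHz, the ~300 microsecond shockwave covers only about 2.4 samples, and a 150 dB peak is roughly 32 times the sound pressure that a mic topping out at an assumed 120 dB can handle, which is the clipping problem in a nutshell.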