Originally published at: https://boingboing.net/2018/07/29/ai-system-looks-at-faces-and-r.html
…
I dunno, she seems to me to have an average level of responsibility, but a pretty high degree of commonness. Who programmed this thing?
Technology is wasted on mankind. Whatever aliens gave it to us should take it back and go home. How is this not phrenology all over again?
Yeah, it’s rating you, but is it even accurate? It’s not like there is any verification of the results. It’s making guesses, and you might as well be flipping coins if you’re trying to figure out someone’s internal motivations based on how a computer read their smile. I don’t think the system is impossible to build; I’m just skeptical as to how they got their training data. Pulling tagged pictures from Google for image recognition is one thing, but how do you get a database of faces matched to honesty?
I do have to give them points for the retro interface design – boxes drawn with IBM-PC special characters and so forth – makes it look like something that would be on Max Headroom or something.
I think Cesare Lombroso would be very proud of this sort of thing.
It’s not all that difficult; this is perfectly routine in machine learning scenarios of this nature. First, get a database of a few hundred thousand people. Then, for each of the subjects, ask/hire someone who knows them well to rate them on each axis.
You now have exact, objective, perfectly accurate ratings for each characteristic of each subject. You now track each subject, obtaining a variety of photos from different devices - CCTVs, government employees, social media, nanny cams, baby monitors, Amazon cylinders, etc.
At this point you now have a robust set of photos and a perfectly accurate set of ratings, and can start the machine learning, training the program’s parameters so that its algorithms produce ratings of the subjects’ photographs identical to their actual real-life ratings.
At this point, you do need to undergo a trial of the trained system. Typically, this is done by finding secondary subjects, having the system derive their characteristics from photos of them, and if necessary, modify the secondary subject until their characteristics match the ratings the system provided.
Once done, the system is now ready for commercial release.
User capacity for negative score-boarding has leveled up!
Now those creepy mall directories can be much better judges of character!
Well, my skepticism was more at how one would obtain that data, because what you yourself described, I imagine, would not be cheap at all. How long would it take any team you can imagine to interview 100 thousand people (one for each subject in the data)? A team of 1000 could interview 100 people each, but then you’ve hired 1000 interviewers to conduct 100 interviews each, and how long would that take? Where would you do it? To be accurate toward your goal you’d need to conduct it in the real world and not some prearranged place where the subjects were aware of the data collection, because then they would change their behavior in response to being studied. But then if you did do it in the real world, how would you follow up to get information on people, many (if not most) of whom would want nothing to do with your study? There are so many ways in which obtaining the data would be difficult, time-consuming, or prone to bias that I’m not sure you could just quickly get a database that actually works (as in really works) without investing huge amounts of time and money.
The problem, as we are seeing, is that the tech doesn’t need to work; it just needs to give the buyer the answers that suit their needs.
Obviously experts skilled in converting analog characteristics into digital ratings produce higher quality data but are also rather more expensive, especially since, as you point out, it is imperative that the subjects not realize they are under surveillance.
Less accurate, but considerably cheaper data collection can be achieved by interviewing a wider collection of those who know the subject well (neighbors, coworkers, children, parents, spouses, etc.). The experts can then synthesize metrics for the characteristics that may be marginally less accurate, but considerably cheaper to derive.
But yes, systems like these aren’t cheap to produce, which is why they need to come from only the most reliable and well-funded organizations such as Amazon or Facebook, who also have adequate access to the data streams necessary for success.
Are there any studies that describe how accurate neighbors, etc., are at rating people?
Neighbours are indeed the “weakest link”. That’s why you need multiple reports, including from those who know the subject better. And yes, financial or (if government-sponsored) political incentives sufficient to motivate those close enough to the subject (preferably cohabitating) to covertly provide good quality data aren’t going to be very cheap.
Still, with enough secondary sources, you should be able to have multiple experts reading the same reports produce ratings that are within, say, 2 or 3 of the subject’s actual objective aggressiveness or kindness rating (on a 1-100 scale).
Finally, although it’s not really good for the learning model, if it really appears that there are questions about the accuracy of the initial ratings, you can take steps to modify the subjects in order to bring them into close conformance with the rating established for them, but that’s not cheap, and it means another level of feedback in the training.
As I said - you really need to be fairly well funded for any of these efforts to succeed.
Then - there are no studies showing the reliability of these people for rating these characteristics.
Increasing the amount of bad data doesn’t make it more accurate.
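To put a rough number on that point, here is a minimal Python sketch. It assumes every rater shares the same systematic error on top of their own random noise; the constants `TRUE_SCORE`, `BIAS`, and `NOISE_SD` and the helper `mean_rating` are all made up for illustration and come from nowhere in the article. Averaging more raters shrinks the random part, but the shared part stays put.

```python
import random
import statistics

def mean_rating(true_score, n_raters, bias, noise_sd, rng):
    """Average n_raters ratings, each = truth + shared bias + independent noise."""
    ratings = [true_score + bias + rng.gauss(0, noise_sd) for _ in range(n_raters)]
    return statistics.fmean(ratings)

rng = random.Random(0)  # fixed seed so the run is repeatable

# All three values are hypothetical, chosen only for illustration.
TRUE_SCORE = 50.0  # the subject's "actual" rating on a 1-100 scale
BIAS = 10.0        # systematic error shared by every rater
NOISE_SD = 15.0    # spread of each rater's individual random error

# Error of the averaged rating for progressively larger pools of raters.
errors = {
    n: mean_rating(TRUE_SCORE, n, BIAS, NOISE_SD, rng) - TRUE_SCORE
    for n in (5, 500, 50000)
}

for n, err in errors.items():
    print(f"{n:>6} raters: error of the average = {err:+.2f}")
```

Under these assumptions the error of the average settles near the shared bias instead of near zero, however many raters you add.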
My first thought was “how did they get ncurses to play with graphics?” but yeah, that looks more like a DOS/QBasic type interface.
What? Come now - quantification of human characteristics isn’t particularly hard. Let’s take the article: Kindness, Happiness, Commonness, Responsibility, Attractiveness, Sociability, Introversion, Aggressiveness, Weirdness, and Emotional Stability.
Do you really think that even a talented amateur like yourself couldn’t determine ratings for those who were close to you that were within a few points of their actual rating?
For example, I just tried it with my children on the most difficult one: weirdness. My oldest I was pretty certain was an 18, my youngest a 12. I then compared with their actual ratings - I was 1 point high for the oldest and 3 points high for the youngest (although I blame youth, which can result in changes in characteristics during the maturation process). And I’m not all that observant a person.
If you want to get more detail about just how amazingly accurate humans can be at rating characteristics in other humans, Google Poe’s law.
It sounds to me like they don’t even expect it to work. But why worry about basic functionality when you can talk about Minority Report and Blade Runner?
Have you ever met a real live human being? I’m not sure we’re capable of generating reports with that kind of accuracy, and we’re really not all that good at being objective.
The Stasi ran quite a large experiment in East Germany. It was eventually shut down for being unpopular.