UK police train machine-learning model using Experian data that stereotypes people based on garden size, first names, and postal codes



I wish they would use more reasonable data points, like shoe and hat size…




“If it was good enough for the Victorians” is increasingly the motto for the 21st century.


First names… that’s too clever by half. You get to racially profile without using race as a data point.


Dependent greys? Oh, it’s “Ageing social renters with high levels of need in centrally located developments of small units”

(Roger from American Dad)


Don’t worry, they’ll change all the category names to randomly assigned numbers now. Then everything will surely be OK?

For those who aren’t familiar: TRW Inc., the US aerospace & semiconductor company, used to have an “Information Systems and Services” subsidiary among its many holdings that was basically a credit reporting agency. TRW sold it to Mitt Romney and Thomas Lee under the name Experian, and they in turn sold it to the English conglomerate GUS, who were pioneers in collecting and collating consumer demographics without permission. Since then they’ve become a true multinational, with 50% of their subsidiaries based in tax havens and a headquarters in Ireland.


Funny how all the profiles excerpted in the post ranged from what some call ‘underclass’ through ‘poor-ish’ to ‘just about managing’. Not the broadest cross-section of society. Maybe a broadish cross-section of former offenders needing help? Either way, this is just another example of roboticisation: robots replacing people. In the old days (earlier this century) the UK had (just) enough police to justify claims of proactive community policing, and enough probation staff (in govt employ, before being privatised) to actually help stop or reduce reoffending, probably as successfully as this pile of algorithmic shit.

I’ve heard there’s a special type of knot with which to do your laces that will help identify you and the authority which is rightfully your due.


Yeah, they’re explicitly doing so to determine both class and race - I mean, one name data point is specifically “Family/personal names linked to ethnicity”. Many of the other data points are clearly designed to get at those as well, but via more round-about ways. What’s worse is they’re also explicitly tying those names to stereotypes; “Denise” = “low income worker” = “heavy TV viewer”.


Knowing Roger, is that finger going somewhere or pointing? I’m guessing going…

So what’s up with garden size? Right now I have 3 marijuana plants (legal of course) and a spider plant. I’m guessing that leaves me a lot of free time to commit crimes.


The logical outcome of this, intentionally or not, is even worse than that already very unsavory result. Not only are they seeking to sort people based on this data, but by training their networks on this, governments will be training their automation to track, police and allocate people in a way that reinforces the model. Even if they don’t want to (and you can be sure many do), they’re creating systems that punish divergence from and reinforce adherence to expectations, effectively criminalizing class mobility.

Oh but wait, it gets even worse. Once even technically illiterate power brokers come to understand this, they’ll realize they can stack the deck for or against any group they want purely by tweaking the data sets. Rather than relying on the fickle mistress of public opinion to hand them the power to oppress, they’ll be able to order it programmed into the very machinery of the state and its economic apparatus.



Mosaic has been in common use by financial institutions, MTOs and others for 20 years. I’m glad that police are using available information sources to protect the public. It isn’t stereotyping by any definition. Would you rather police protected victims of domestic violence by sending military armed cruisers to target random ethnic minority youths?

Interestingly, the police are using data that it would be illegal for banks to use in lending decisions. In the UK, you can’t use race or gender for any credit scoring activity (that’s why Experian sell this as a marketing database; they have other tools for credit decisions).

What’s really surprising is that Durham police are using this openly themselves. The various English police forces generally use ACPO to cover up any secret data gathering like this.

It could be the UK sense of “garden,” with the meaning that “yard” has in the US, rather than cultivating plants other than simple ground cover, like a lawn. In the US your backyard can have a garden, in the UK your back garden is your backyard - including a lawn, other cultivated plants, and maybe a patio.

Or maybe it IS actually a reference to flower or vegetable gardens, I dunno.


That’s it. The context that makes this useful is that in the UK, land is expensive, so larger gardens are a status symbol, and correlate to more expensive / older houses.

Garden size can also be picked up from publicly available maps, which makes it very useful for this sort of fine-grained geographical targeting. Want to tell the difference between rich and poor residences in the same postal code? Look at the size of the houses and gardens on the map.


I’m sure that careful use of the chav-squared test kept this exercise at the highest levels of statistical rigor and objective data science.


That is what I took it to mean, an economic class indicator. Though I think it would be great if they had a database of the nicest kitchen gardens in an area for growing fruits and vegetables, for aficionados. I guess that’s what county fairs are for though. :grinning:


Yeah, it’s already built into the system - leaping from indicators of race/class to stereotypes about race/class and then using that as a basis for how those people are treated (regardless of reality). If the “data” indicates someone is more likely to be a criminal, and you accordingly put more resources into policing them, then they’re more likely to be arrested, regardless of what they do, than anyone else, reinforcing the connection in the system, “validating” it. We’ve done that in the US without AI, but once you start talking about “data” it all seems so neutral and scientific.
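
For anyone who wants to see that loop in numbers, here is a toy sketch. Everything in it is made up for illustration: the two areas, the starting arrest counts, the `true_rate`, and the `focus` knob (how strongly patrols get concentrated on whichever area the records flag). It has nothing to do with how Durham's tool or Mosaic actually works. Both areas have the same underlying offence rate; only the records differ at the start.

```python
def simulate(records, true_rate=0.1, patrols=100, rounds=10, focus=2.0):
    """Toy feedback loop: patrols follow recorded arrests, arrests follow patrols.

    records   -- hypothetical starting arrest counts per area
    true_rate -- identical underlying offence rate in every area (assumption)
    focus     -- 1.0 = patrols proportional to records; >1 = concentrate on "hotspots"
    Returns each area's share of recorded arrests after every round.
    """
    records = list(records)
    history = []
    for _ in range(rounds):
        total = sum(records)
        # Patrol allocation is driven purely by past records, not by behaviour.
        weights = [(r / total) ** focus for r in records]
        weight_sum = sum(weights)
        for i, w in enumerate(weights):
            # Same true_rate everywhere: more patrols simply means more arrests.
            records[i] += true_rate * patrols * w / weight_sum
        total = sum(records)
        history.append([r / total for r in records])
    return history


# Area A starts with 60% of recorded arrests, area B with 40%.
history = simulate([60, 40])
print([round(shares[0], 3) for shares in history])
```

With `focus=1.0` the starting 60/40 split just locks in forever; with anything above 1 the flagged area's share keeps climbing, even though nobody in it is behaving any differently.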