A machine learning system trained on scholarly journals could correct Wikipedia's gendered under-representation problem

Originally published at: https://boingboing.net/2018/08/08/lacunae-r-us.html


I can’t wait for the alt-right to get wind of this, filtered through Limbaugh or maybe Alex Jones.

I wish folks like you with an influential voice would spend more creative energy on getting a progressive tax in place and maybe a decent, $20 minimum wage than worry about correcting gender representation imbalances in wiki…


If you look at the contents of current biology journals you will see a large portion of oriental names, but only a handful among the hundreds here:

Yeah, I’d love to see how they avoid false hits on that.

In most Western countries women are underrepresented in science. Which is a far greater problem in itself.

Ah, I posted in the wrong thread! Will repost.

I wish people would stop trying to get progressives to throw one group under the bus in favor of another. But we can’t both get our wish.


One problem is that many Asian names tend to be poor at actually identifying individuals. I know three people named Ming Li personally. With Western names, a last name plus a first and middle initial is typically enough to make someone unique in a citation index. Perhaps the solution is to do away with names as citation metrics. ORCID (https://orcid.org/) aims to provide a unique ID number to every researcher.
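To make the collision point concrete, here is a minimal sketch, assuming a citation index that keys authors on last name plus initials. The records, the `researcher_id` field, and the helper names are all made up for illustration; they are not taken from any real index or from ORCID's API.

```python
from collections import defaultdict

# Hypothetical author records. The researcher_id values are placeholders,
# not real identifiers.
authors = [
    {"given": "Ming", "middle": "", "family": "Li", "researcher_id": "id-0001"},
    {"given": "Ming", "middle": "", "family": "Li", "researcher_id": "id-0002"},
    {"given": "Ming", "middle": "", "family": "Li", "researcher_id": "id-0003"},
    {"given": "Jane", "middle": "Q", "family": "Smith", "researcher_id": "id-0004"},
    {"given": "John", "middle": "R", "family": "Smith", "researcher_id": "id-0005"},
]


def name_key(record):
    """Citation-style key: family name plus first and middle initials."""
    initials = record["given"][:1] + record["middle"][:1]
    return f'{record["family"]} {initials}'


def id_key(record):
    """Key on the unique researcher identifier instead of the name."""
    return record["researcher_id"]


def ambiguous_keys(records, key_fn):
    """Return the sorted keys that map to more than one record."""
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append(record)
    return sorted(key for key, members in groups.items() if len(members) > 1)


if __name__ == "__main__":
    print("Ambiguous by name:", ambiguous_keys(authors, name_key))
    print("Ambiguous by id:  ", ambiguous_keys(authors, id_key))
```

With these made-up records, the two Smiths are told apart by their initials, the three Ming Lis all collapse onto one name key, and keying on the identifier leaves nothing ambiguous.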

So most of the data it’s trained on is written by men and most of the data it’s processing is written by men. The algorithm could work perfectly, but it’s built on sand.

This is the Fallacy of Relative Privation, also called the Appeal to Worse Problems.

Rebuttals for this fallacy (not necessarily rebuttals for this instance)

  • Not all problems have the same solution; working on one solution does not preclude working on another at the same time.
  • Sometimes problems do have the same solution, but it’s better to prove the solution on a smaller problem before trying it on a larger one.
  • Some problems are intractable on their own, but solving other problems can open up new avenues to resolve the larger problems.
  • Throwing more effort into a single solution often sees diminishing returns; at some point it is better to make moderate progress on two problems than to make no progress on one and slightly better than moderate progress on the other.
  • Not everyone shares your passions; people tend to be more productive when they follow what they are passionate about.
    More specifically, on the last point: the cost for Cory to highlight that someone else did a lot of hard work they were likely passionate about is very low.
    The cost of not highlighting people doing good things is high. It discourages, or at least fails to encourage, others from doing good things.
    The cost of asking those who did the actual work to stop doing what they are passionate about is very high. It discourages people from seeking out and following their passions, it pushes people into tasks they may be less suited to, wasting some of their effort, and it assumes that more people doing this one thing will result in an equal amount of progress.

This topic was automatically closed after 5 days. New replies are no longer allowed.