# Using statistics to estimate the true scope of the secret killings at the end of the Sri Lankan civil war

Originally published at: https://boingboing.net/2018/12/12/using-statistics-to-estimate-t.html

a goddamned massacre took place

I’m sure I’m a Google search away, but I’m at work currently, and I’m woefully ignorant of Sri Lankan history and politics. Why would the government disappear 400-500 people? Were they known dissidents/opposition? Was it random, to scare others into cowering before the government?

Seems this method assumes that the lists are independent, and that the probability that a victim will end up on them is equal. One can easily imagine a situation where a certain region is so dangerous that no one making these lists goes there, and thus none of the victims end up on any of them, while safer regions with fewer victims are studied more carefully, ensuring that victims there end up on several lists. You may thus have several lists that look very similar yet miss the majority of the victims.

This is a fairly obvious point so I hope the authors have taken it into account, but I’m not entirely sure how it can be done.
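To make the worry concrete, here is a minimal sketch of that two-region scenario. Every number in it is invented for illustration, and it works with expected counts rather than real data. It applies the simple two-list estimate (people on list A times people on list B, divided by people on both) and shows how a region that no list-maker can reach drops out of the estimate entirely.

```python
# Hypothetical two-region scenario; all figures are made up for illustration.
regions = [
    # (name, true victims, chance of appearing on list A, chance on list B)
    ("dangerous", 8000, 0.0, 0.0),
    ("safer", 2000, 0.5, 0.5),
]

true_total = sum(victims for _, victims, _, _ in regions)

# Expected counts, assuming the two lists are independent within each region.
on_a = sum(victims * pa for _, victims, pa, _ in regions)
on_b = sum(victims * pb for _, victims, _, pb in regions)
on_both = sum(victims * pa * pb for _, victims, pa, pb in regions)

# Simple two-list estimate of the total number of victims.
estimate = on_a * on_b / on_both

print(f"true total: {true_total}, estimated total: {estimate:.0f}")
```

In this made-up case the two lists agree with each other well and still recover only the safer region's 2,000 victims out of 10,000.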

Great question! There are two key statistical assumptions in this method: list independence (the probability of being captured on one list should be independent of the probability of being captured on other lists); and capture homogeneity (that the probability of being observed should be constant across the population within each list). Both are generally false, so we can’t do the estimate with only two lists.
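For reference, the two-list estimate that relies on both assumptions is usually called the Lincoln-Petersen estimator. The sketch below is only an illustration: the function name and the counts are made up, and it is not the method used for the report.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-list estimate of the total population size.

    n1, n2: number of people on the first and second list.
    m: number of people who appear on both lists.
    Only valid if the lists are independent and everyone is equally
    likely to be captured on a given list.
    """
    if m == 0:
        raise ValueError("no overlap between the lists; the estimate is undefined")
    return n1 * n2 / m


# Made-up counts: 300 names on one list, 200 on the other, 60 on both.
estimated_total = lincoln_petersen(300, 200, 60)
print(estimated_total)  # 1000.0
```

The intuition: if 60 of the 200 people on the second list (30%) also appear on the first, the first list is assumed to cover about 30% of everyone, so its 300 names imply roughly 1,000 people in total.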

With additional lists, we can use Poisson log-linear modeling to estimate list interactions (see the Rcapture package in R, I can only post 2 links) for a frequentist approach; estimate all possible log-linear models and average across them; or partition the matched data into latent classes, within each of which the independence model holds. I’ve linked to the relevant R packages for each approach, and they all have the math stat citations if you’d like to dive deeper.
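As a rough illustration of what a third list buys you, here is a sketch of one specific log-linear model: three lists, every pairwise list interaction allowed, and no three-way interaction. That one model has a closed-form estimate for the number of people on no list at all. This is not what Rcapture or LCMCR do in general (they fit and compare models numerically), and the function name and counts below are invented.

```python
def estimate_unobserved(counts: dict) -> float:
    """Estimate how many people appear on none of three lists.

    `counts` maps a capture pattern such as "110" (on lists 1 and 2,
    not on list 3) to the number of matched records with that pattern.
    Assumes pairwise list dependence but no three-way interaction.
    """
    numerator = counts["111"] * counts["100"] * counts["010"] * counts["001"]
    denominator = counts["110"] * counts["101"] * counts["011"]
    if denominator == 0:
        raise ValueError("an empty two-list overlap makes the estimate undefined")
    return numerator / denominator


# Made-up data: 125 records in each of the seven observable patterns.
patterns = ("100", "010", "001", "110", "101", "011", "111")
counts = {pattern: 125 for pattern in patterns}

missing = estimate_unobserved(counts)
total = sum(counts.values()) + missing
print(missing, total)  # 125.0 1000.0
```

The example counts correspond to 1,000 people each having a 50% chance of landing on each list independently, so the estimate recovers the 125 people who would be on none of them.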

For this project, I used LCMCR because it is the best at handling >5 lists, and we had 7. There are other reasons to prefer LCMCR, but the computational tractability with this many lists won this time.

Final act of a long and brutal civil war.

Well yeah, but the reasoning as to why is what I’m curious about. Perhaps there’s no clear answer as to why?

An almost 26-year civil war, with a nasty portion of ethnic nationalist bloodletting in an awkwardly amalgamated postcolonial state. The record of the government forces during the war doesn’t suggest that they were unduly concerned about false positives; but they had pretty solid reasons to suspect that people grabbed from the area where the LTTE had been encircled and eventually ground down would include a fair few committed opponents of theirs.

The government was the Good Guys and the other side were Terrorists according to the West.

The suicide bomb vest WAS invented over there.

This was not the first government-sanctioned killing of civilians, by a long shot. When the civil war broke out, there was widespread rioting and arson directed against Tamils, with some evidence the government coordinated and sanctioned it. In the late 1980s, still during the civil war with the Tamils, there was a Communist uprising in the south. In putting it down, the government killed and disappeared people fairly indiscriminately. There’s a pattern that continued through multiple governments.

I still remember the internment camps that no one seemed to care about at the time. I’m very curious to know how many of the displaced ever got their homes back.

This seems like an interesting idea – using the discrepancies in what is known to detect the size of what is unknown – though I am not sure if I’ve correctly grasped what the study is doing.

Would it be a reasonable summary to say that you take a number of lists that should in principle contain the same names, and analyse the statistics of the disagreements to estimate what proportion of names the lists collectively fail to capture?

What appeals about it is that presumably, there are people actively trying to keep names from being reported, but unless their efforts are 100% successful, this kind of study can’t easily be fooled about the total numbers.

Though if there are desaparecidos whose names have been kept off every list, this approach still can’t detect the size of that cohort. You still periodically hear about thousands more people being found to have been disappeared by Pinochet; it seems mad that it would be so hard to get a number even decades after the regime is gone, but apparently you can disappear without anyone ever noticing.

Hmm, not exactly. Have a look at this very much non-technical article for an intuitive explanation of the method, along with the basic algebra for the two-list case. It also explains how list dependence works (this was @Bernel's original question), and how we can simulate the effect of list dependence if we have only two lists and cannot measure it directly.

You’re right that if there is literally a zero probability of documenting a group of people, this method cannot estimate them. In practice, small probabilities are common, but zero is pretty rare. In simulations, we can show that even a 2-3% probability of documentation is sufficient for a reasonable estimate, if we have enough lists of sufficient size.
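A toy version of that kind of simulation, heavily simplified: everyone gets the same 3% chance of appearing on each of just two lists, and the lists are independent by construction. The population size, seed, and function name are arbitrary choices for illustration, not the simulations referred to above.

```python
import random


def simulate_two_lists(population: int, p: float, seed: int) -> float:
    """Simulate two independent lists and return the two-list estimate.

    Each person lands on each list with probability `p`. The estimate is
    (size of list 1) * (size of list 2) / (number of people on both).
    """
    rng = random.Random(seed)
    n1 = n2 = both = 0
    for _ in range(population):
        on_first = rng.random() < p
        on_second = rng.random() < p
        n1 += on_first
        n2 += on_second
        both += on_first and on_second
    if both == 0:
        raise ValueError("no overlap between the simulated lists")
    return n1 * n2 / both


true_population = 200_000
estimate = simulate_two_lists(true_population, p=0.03, seed=1)
print(f"true: {true_population}, estimated: {estimate:.0f}")
```

With a 3% capture rate, each list holds only about 6,000 names and the overlap is around 180 people, yet that overlap is enough to put the estimate in the right neighborhood. With smaller lists the overlap shrinks and the estimate gets much noisier, which is why list size matters.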

I have not done estimates for Chile, but in Latin America, I have done reports on conflict-related homicides for Peru, Colombia, El Salvador, and Guatemala, and in those cases, I’ve re-estimated several times as we find new data. With new data, the findings change, but not very much, and not in ways that change the substantive interpretation.

Finally, it is great to see professional statisticians present a report on the number of people DISAPPEARED at the end of the war in Sri Lanka, in addition to the massive number of CIVILIANS killed intentionally under the ‘pretext’ of fighting the rebels.

Since there were questions about WHY, WHEN etc…I’d like to share the link to the Award-winning British Channel4 News documentary AND background.

No Fire Zone - Sri Lanka Killing fields
includes background, short relevant videos, and quotes from prominent people.

PS.
Regarding the NoFireZone site, if the documentary cannot be played in your region, you can watch the previous TV episode version here. It holds BOTH govt. and rebels responsible!

https://www.channel4.com/programmes/sri-lankas-killing-fields

Thank you.