Guesstimate: a spreadsheet for adding up uncertainties

doctorow · December 31, 2015, 9:55pm

William_Holz · December 31, 2015, 11:25pm

This showed up on top of my Medium feed earlier!

Nifty little approach, and very simple and easy to interact with. It’s really elegant!

anon62122146 · January 1, 2016, 4:36am

Cool little web implementation of probabilistic programming. There’s a bunch of research in this field right now, but, as far as I know, still no really fully-developed ecosystems for working with it:

William_Holz · January 1, 2016, 4:41am

I think it’s just a quick and clean Monte Carlo analysis, which is a really good strategy for something like this.

I mean, if you’re planning on getting super-fancy then something like Rapid Miner is the way to go, but for a casual person just looking to get introduced to the concepts and who isn’t doing anything particularly weird? This is an excellent start!
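
For anyone curious what a "quick and clean Monte Carlo analysis" looks like in code, here is a minimal sketch. It is not Guesstimate's actual implementation; the two inputs (customer count and spend per customer), their distributions, and the function name `simulate_total` are all made up for illustration. The idea is just to draw each uncertain input many times, combine the draws, and read a range off the results.

```python
import random
import statistics


def simulate_total(n_samples=10_000, seed=0):
    """Monte Carlo estimate of the product of two uncertain inputs."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_samples):
        # Hypothetical inputs: both distributions are assumptions for this sketch.
        customers = rng.gauss(200, 30)    # about 200 customers, standard deviation 30
        spend = rng.uniform(5.0, 15.0)    # each spends somewhere between 5 and 15
        totals.append(customers * spend)
    totals.sort()
    return {
        "mean": statistics.fmean(totals),
        "p05": totals[int(0.05 * n_samples)],   # rough 5th percentile
        "p95": totals[int(0.95 * n_samples)],   # rough 95th percentile
    }


result = simulate_total()
print(result)
```

The output is a spread rather than a single number, which is the whole point: the 5th to 95th percentile range shows how far the total could plausibly land from the mean.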

shaddack · January 1, 2016, 4:44am

“The Monte Carlo method does not involve gambling with grant money in a casino.”

gilgongo · January 1, 2016, 1:14pm

Just been playing about with some casual rubbish on it and it’s very nifty indeed. One thing they could introduce to take it to the next level is shocks, so I could allow for a random market crash or population collapse alongside the Monte Carlo. That would be dope.
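
A rough sketch of what shocks could look like if you rolled your own simulation. This is not a Guesstimate feature; the growth rate, the 5% yearly crash chance, the 40% loss, and the name `simulate_with_shocks` are all assumptions picked for illustration. Each simulated year gets ordinary noisy growth, plus a small chance of a crash layered on top.

```python
import random
import statistics


def simulate_with_shocks(n_samples=10_000, years=10, crash_prob=0.05, seed=1):
    """Simulate a value growing over several years with rare random crashes."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_samples):
        value = 100.0
        for _ in range(years):
            # Ordinary year: hypothetical 5% growth with 10% standard deviation.
            value *= 1 + rng.gauss(0.05, 0.10)
            # Rare shock: with probability crash_prob, lose 40% of the value.
            if rng.random() < crash_prob:
                value *= 0.6
        finals.append(value)
    return finals


baseline = simulate_with_shocks(crash_prob=0.0)
shocked = simulate_with_shocks(crash_prob=0.05)
print(statistics.fmean(baseline), statistics.fmean(shocked))
```

Comparing the run with shocks against the run without gives a feel for how much a rare event drags down the average and widens the range of outcomes.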

anon62122146 · January 1, 2016, 3:44pm

I just looked at that website, and I have to say that making that kind of stuff with a drag-and-drop interface strikes me as a pretty terrible idea.

For one thing, when you get beyond some very simple stuff, anyone who is capable of doing it is capable of using a proper programming language, and for another, developing or maintaining anything beyond a certain (very small) size in that kind of environment is pure hell. I’ve done that in LabView, because they have interface modules with basically every commercial instrument, and it sucked. You wind up just looking at a giant mess of many-layered spaghetti that’s impossible to make any sense of, and is far less comprehensible than even badly written code in a text-based language.

William_Holz · January 1, 2016, 4:24pm

I think you’re missing the point of this sort of tool.

It’s not for people who ‘are capable of using a proper programming language’, those are easy to find but more often than not don’t have sufficient industry understanding or comprehension of the eccentricities of the data they’re working with. Those that do are rare and pricey.

Instead, it’s for business intelligence people, data analytic folks, people with industry knowledge and moderate data skills, and the like. They come from backgrounds that don’t generate coding skills and many of them aren’t likely to ever gain them.

Plus you can see and modify the code too, which is actually helpful for learning. Heck, I learned most of my basic SQL skills by tearing the queries out of Business Objects universes and that provided me the foundation to take things to the next level. And I still happily do that sort of thing with complicated schemas with stupid table and field names.

It’s also more focused than LabView, so I’m not sure that’s a sensible comparison except from an outward appearance standpoint. This is for data analytic work and predictive modeling only.

GilbertWham · January 1, 2016, 8:50pm

Not even if your software is really good at guesstimating?

dave_b · January 1, 2016, 10:35pm

Based on its development in the Manhattan Project, I assumed radioactive roulette wheels.

atl · January 2, 2016, 12:16am

I had a friend who did something like this when he was trying to estimate certain things ahead of his wedding but before he had a hard count. He took guests invited, assigned a probability of them showing up, and plotted the numbers from there. My wife and I were assigned a 1.00 confidence, but he had numbers as low as .25 and some higher than 1, since he knew those folks liked to bring along strays. Ended up getting it within 5 people on a wedding of 200 people.
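
The wedding estimate can be sketched in a few lines. The guest names and weights below are invented, and the way weights above 1 are handled (the whole part always shows up, the fractional part is a coin flip) is one possible assumption, not necessarily what the friend did. The expected headcount is simply the sum of the weights; the simulation adds a sense of the spread around it.

```python
import math
import random
import statistics

# Hypothetical guest list: (name, expected attendance weight).
# A weight above 1 means the invitee tends to bring extra people.
guests = [("couple_a", 1.0), ("cousin_b", 0.25), ("friend_c", 1.5), ("coworker_d", 0.6)]


def expected_headcount(guests):
    """Expected attendance is the sum of the per-guest weights."""
    return sum(weight for _, weight in guests)


def simulate_headcounts(guests, n_samples=10_000, seed=2):
    """Simulate attendance many times to see the spread around the expectation."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_samples):
        total = 0
        for _, weight in guests:
            whole = math.floor(weight)           # people who always show up
            extra = rng.random() < (weight - whole)  # fractional part as a coin flip
            total += whole + extra
        counts.append(total)
    return counts


counts = simulate_headcounts(guests)
print(expected_headcount(guests), statistics.fmean(counts), min(counts), max(counts))
```

With a real list of 200 invitees, the same approach would give both a point estimate and a plausible range, which is roughly how you end up within a handful of people.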

anon62122146 · January 2, 2016, 3:10pm

At my work, almost everyone is someone who started out with relevant specialized physics and/or engineering knowledge and learned the coding part of things on the job, for exactly that reason. What I was also suggesting is that anyone who knows enough math to use statistics productively can learn enough coding to do stuff like this in a couple of weeks.

You can also learn this by examining source code. It’s possible that people only use this for really simple stuff, but what I said about anything over a certain level of complexity being horrifying and incomprehensible in a graphical programming environment still stands. I would say that most problems I work on, even if a lot of parts are simple enough that you could reasonably do them in a graphical environment, also include parts that aren’t, and I would expect that that applies across industries when you’re talking about deriving useful and non-trivial insights from statistics.

William_Holz · January 2, 2016, 3:20pm

Not so much. You’re probably right in industries where the people involved have physics or engineering backgrounds, but that’s not the entire scope of the field.

For example, when I was working in healthcare, pretty much none of the clinicians or those with strong medical backgrounds had any real programming skills. Heck, they’re often not even ‘math’ people. For them the GUI as an entry point is a boon.

So, different part of the Venn diagram. I’m a bit of a code elitist myself, but I’m fully aware of the huge number of scenarios where there are people who can bring insights to the table but have neither the time nor the inclination to learn to code. It’s silly to expect them to become coders when with a couple of hours of mentoring and a few hours of playing they can make useful discoveries.

Besides, look at the code Rapid Miner makes before making any assumptions. As I said in the earlier post, it’s specifically focused on data manipulation and model generation… the code it generates is not that bad.

anon62122146 · January 2, 2016, 3:31pm

I would be leery of trusting statistics done by people without the relevant background to understand what the tests they’re doing actually mean.

Fair enough, I haven’t really seen a lot of that code, but I also know I regularly write code for statistical analysis that is complex enough that it would look awful in any graphical representation. Also, when you’re doing real modelling (I’m talking about empirical models here, not physical models) there’s a lot of math that goes into that model and the bounds on its reliability, and I think it’s a pretty bad idea to be basing decisions on models made and interpreted by people who don’t understand that math. I’m not saying you need to work the math out every time, but, I mean, say (back to physical models for this example) you’ve got an engine that does numerical approximations of Maxwell’s equations. You don’t have to set up the equations every time, or even remember how to do that, but you really should know how those equations behave, at least to the point of being able to picture which direction a given change should push your system, if you’re going to be using that software and interpreting its results, if only to catch the inevitable mistakes.

William_Holz · January 2, 2016, 4:16pm

It’s pretty easy to see what they did and train people up with examples that they understand… FAR more effective than having clinical people go through tutorials about finance that have context that’s meaningless to them.

Also, since the first step of analytic work is often finding the useful insight, it’s amazing how much use even the clueless can be. My first stumbling decision tree actually revealed a very useful insight into the strongest cause of high cost pregnancies for us, and that particular factor wasn’t even on the radar (I’d accidentally left my ‘months in system’ field in there from my analytic universe…and once we’d done our digging it turned out that yes, the primary factor WAS lack of good care during the early months of pregnancy and before)

And of course when you find stupid things it’s pretty easy to figure out what’s stupid when you start digging. That’s the same sort of thing we run into with ANY analytic work, after all.

Plus it often gets them interested enough to take things to the next level. I think that for a lot of people that exposed GUI-to-Code option is very helpful for learning while working with data that’s relevant to you.

Going back to my own Business Objects to SQL transition, that’s exactly how I learned to write good SQL. Having that GUI-to-code view not only helped me understand basic dimensional concepts but also ‘teased’ me enough that when I wanted to get to more advanced concepts (rolling windows and the like), I already had a good practical understanding of the basics.

I probably learned more in a month than I could have in a year of classroom learning, and Sardonic DBA loved my code. (that boy was a prodigy)

I agree there. In fact, going a step further, I think it’s important to understand the concepts on as fundamental a level as possible rather than just knowing what a particular algorithm does. Sometimes knowing how something was invented and why it was developed provides some crucial insights into why it works the way it does.

But that can be an ongoing learning process rather than a sequential one. Most of the people I’ve worked with haven’t had the option to go back to school and I for one (and many others) simply don’t benefit from classroom learning environments the way we do from incremental contextual discovery.

Nothing gets the juices pumping like having to learn something that’s one level beyond where you are in order to solve a problem that you’re really excited about. Intrinsic motivation rocks.

doctorow · January 5, 2016, 9:55pm

This topic was automatically closed after 5 days. New replies are no longer allowed.
