Many years ago I was a consultant at a financial institution in Chicago that was making the move from mainframe technology to an object-oriented approach using C++. During a code review, I asked one of the developers why he chose a particular type of data structure and he replied that he looked at a colleague’s code and just copied her approach.

Here is the lesson I shared with him: every line of code you write involves tradeoffs. These can be tradeoffs of space and time (speed vs. memory), of read performance vs. write performance, or of any number of other factors. The art of good software engineering is understanding what those tradeoffs are and choosing deliberately.
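As a minimal sketch of one such tradeoff (in Python rather than the C++ of the anecdote, and with made-up data), compare a list and a set holding the same values. The names `ids_list`, `ids_set`, and `contains` are hypothetical, and the byte counts are specific to CPython:

```python
import sys

n = 100_000
ids_list = list(range(n))  # compact array of references; "in" scans item by item
ids_set = set(ids_list)    # hash table; "in" is a hash lookup, at the cost of memory

def contains(container, value):
    """Membership test; its cost depends on which structure was chosen."""
    return value in container

# getsizeof reports the container's own footprint, not the integers it refers to.
list_bytes = sys.getsizeof(ids_list)
set_bytes = sys.getsizeof(ids_set)
print(f"list: {list_bytes:,} bytes, set: {set_bytes:,} bytes")
print(contains(ids_list, n - 1), contains(ids_set, n - 1))
```

Neither structure is "right" in general. The set spends memory to buy lookup speed, which is only worth it if lookups dominate the workload.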

What Is Architecture?

There was a time when it was in vogue for software developers to refer to themselves as “software architects,” but few could really articulate what the role of an architect was, as opposed to that of a developer or “coder.” Even today, many years later, I still run into this lack of understanding and frequently use the following thought experiment to illustrate…


The Casino Method For Estimating Uncertainty

“Be approximately right rather than exactly wrong.”

John W. Tukey

How To Improve This Important Skill (That Everyone Is Bad At)

A key skill in the realm of “Data Literacy” is the ability to estimate uncertainty.  Statements like, “we expect gross profits in 2021 to be $1M” are wrong out of the gate because they are point estimates, exactingly precise and impossibly unlikely.  Far more useful are estimates that capture the inherent uncertainty and state that in the form of a confidence interval, such as “we estimate with 90% confidence that profits in 2021 will be between $0.8M and $1.2M,” meaning we believe there is only a 10% chance that the real value will fall outside this range.  The relative width of these confidence intervals carries important information about our uncertainty in our estimate.
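A minimal sketch of two quantities this paragraph implies: the relative width of an interval, and how often a set of 90% intervals actually contained the true value. The function names and the track-record numbers are hypothetical, chosen only for illustration:

```python
def relative_width(low, high):
    """Interval width as a fraction of the interval's midpoint."""
    midpoint = (low + high) / 2
    return (high - low) / midpoint

def hit_rate(intervals, actuals):
    """Fraction of actual values that fell inside their stated interval."""
    hits = sum(low <= actual <= high
               for (low, high), actual in zip(intervals, actuals))
    return hits / len(actuals)

# The profit example from the text: $0.8M to $1.2M.
profit_width = relative_width(0.8, 1.2)
print(f"relative width: {profit_width:.0%}")

# Hypothetical track record of four past 90% intervals and what actually happened.
past_intervals = [(0.8, 1.2), (10, 20), (5, 6), (100, 150)]
past_actuals = [1.1, 25, 5.5, 120]
print(f"hit rate: {hit_rate(past_intervals, past_actuals):.0%}")
```

If intervals stated at 90% confidence capture the truth far less than 90% of the time over many estimates, they are too narrow.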

The problem is that most people are really, really bad at forming those sorts of uncertainty estimates.  This article will teach you how to do it with much greater accuracy.


How to Forecast Like a Poker Pro

[Image: Dogs Playing Poker]

The Process of Probabilistic Thinking

In “The Signal and The Noise,” data scientist Nate Silver uses the tales of a professional gambler and a poker champion to illustrate a skill deemed critical to their respective successes: the ability to simultaneously hold in mind multiple hypotheses about the possible outcomes of sporting events and poker hands and update their beliefs accordingly as new data are revealed.  Silver writes, “Successful gamblers—and successful forecasters of any kind—do not think of the future in terms of no-lose bets, unimpeachable theories, and infinitely precise measurements. Successful gamblers, instead, think of the future as speckles of probability, flickering upward and downward like a stock market ticker to every new jolt of information.”

Similarly, in “Superforecasting: The Art and Science of Prediction,” Philip Tetlock writes about the strategies of an unlikely but elite group of forecasters whose accuracy greatly exceeded that of supposed experts and states, “[this book is] about how to be accurate, and the superforecasters show that probabilistic thinking is essential for that.”

This skill of probabilistic thinking is crucial in virtually any context in which we need to reason about an uncertain future.  Humans are notoriously prone to cognitive biases that, once we have a pet theory in mind, lead us to discount information that conflicts with that theory and more heavily weight information that supports it (see Nobel Prize winner Daniel Kahneman’s “Thinking, Fast and Slow” for more on this fascinating topic).

This article describes a simple process for making and updating your forecasts in a principled, consistent manner that makes maximum use of the available evidence and helps avoid harmful cognitive biases.
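One standard way to hold several hypotheses at once and revise them as evidence arrives is Bayes' rule. The sketch below is an illustration under that assumption, not necessarily the process the full article describes. The hypotheses, priors, and likelihoods are invented numbers:

```python
def update(priors, likelihoods):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

# Hypothetical beliefs about an opponent's hand before any betting.
priors = {"strong": 0.3, "weak": 0.7}

# Assumed chance of seeing a raise under each hypothesis.
raise_likelihoods = {"strong": 0.8, "weak": 0.2}

posterior = update(priors, raise_likelihoods)
for hypothesis, probability in posterior.items():
    print(f"{hypothesis}: {probability:.2f}")
```

Each new piece of evidence is handled the same way: the posterior from one step becomes the prior for the next, so no hypothesis is discarded merely because it is not the favorite.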


When Your Data Speak, Can You Understand?

“Data are becoming the new raw material of business.”

— Craig Mundie

If your company is like most, these days you have more data than you know what to do with and are collecting it faster than you can imagine.  IBM states that every day we create 2.5 quintillion bytes’ worth, and several studies claim that 99+% of the world’s data was created in just the last two years.

And YOU are being asked to do more with it, to make “data-driven decisions,” whether you’re an individual contributor or the CEO, by more formally incorporating data into your decision-making processes.  As McAfee and Brynjolfsson report, “companies in the top third of their industry in the use of data-driven decision making were, on average, 5% more productive and 6% more profitable than their competitors.”  Businesses have recognized that “data is the new oil” (a pronouncement credited to Sheffield mathematician Clive Humby, who helped establish Tesco’s Clubcard in 1994), and those that aren’t wringing the most value from their data are going to be left in the dust.
