On Being a Fool

Nassim Taleb’s latest interview with The Sunday Times is very well done and talks about the underemphasized issue of behavior in the absence of Black Swans. I had been thinking about the same topic over the last few days. Here are my thoughts:

The part of TBS that discusses being a fool in the right places is as important as the part that discusses being conservative in Extremistan domains. In that respect, the chapter titled “Half and Half, or How to Get Even With the Black Swan” is my favorite and contains a succinct and personal kernel of the whole book. I think even the folks who understand the Extremistan ideas did not take away all that the book has to say on the other aspect: behavior in Mediocristan.

Let us consider a system of thought where “there is no logical reason to prefer one action over another.” In Extremistan domains, where the unknown plays a large role, it is beneficial to lean towards this system of thought. For example, if you are asked to invest in N securities, it is best to either invest in all of them, or none (if that’s a choice), instead of tunneling and showing a preference for a few over others. In this domain it is useful to value all choices equally beforehand, so we should not prefer one action over another.

However, it is possible to take this too far and apply this system of thought to Mediocristan domains. This will clearly lead to inaction. In a transaction involving exchange of goods, money, or time, a person who treats all choices as equal gains nothing from entering the transaction: if he exchanges Object X for Object Y but values X and Y equally, he is no better off. In fact, he will incur a transaction fee and come out at a loss. This implies he is better off doing nothing at all if he truly values everything equally (including his time). For example, if he knows that an MBA is valued fairly (it probably is, given the competition) and he values money and the MBA education equally, he has no reason to apply at all.

So Taleb rails against applying Mediocristan tools in Extremistan domains, but he is equally careful to point out that Extremistan thinking should not be brought into Mediocristan domains. Many miss this point, and he emphasizes:

Few understand the beauty in the story of Apelles; in fact, most people exercise their error avoidance by repressing the Apelles in them.

If one decides to be hyperconservative in Extremistan domains, it becomes necessary to become hyperaggressive in Mediocristan domains. Otherwise, a part of us, the part designed to take risks, will be amputated.

So it becomes important to cultivate tastes, likes, and dislikes in harmless things. There are perils to being too open-minded and valuing everything equally. The process of living in Mediocristan is trading off what one doesn’t mind losing in order to acquire what one desires. For someone who has no particular likes or dislikes, this becomes very difficult and can lead to inaction. The stronger the likes and dislikes, the easier it will be to make decisions and trade-offs. These trade-offs are acceptable because the worst-case scenario is clipped: these decisions cannot hurt us more than a certain amount. A little diversification will wash the negative consequences out.

Creating an emotional differential between harmless choices in Mediocristan is what drives us - it generates action. This can be cultivated. It is difficult to put this in terms of “Gaussian risks,” so something may be lost in translation when the difference between Mediocristan and Extremistan is introduced.

Both takeaways from the book are important. Applying the one without the other leaves you lacking something important.

Efficiency and Robustness

I took a few business courses at Stanford. They were pretty entertaining and my favorite was a course on Supply-Chain Management. It was a series of case-studies that all started or ended with: “Rob looked out of his office window overlooking the hills in Palo Alto and wondered how he was going to …”

Invariably, the case-studies had consultants who would come in and save the day. More often than not they would either move the supply-chain from an existing centralized system to a more distributed process or vice versa. In both cases there would be very rational-sounding reasons for doing so, and indeed the case study would cite much increased growth and productivity at the host company after the consultants had done their thing.

I never quite understood how one would decide beforehand whether to go with a centralized or distributed process. In any case, it was always presented as a binary decision - it was never framed as a tradeoff.

Because that’s what it really is. Anytime somebody moves the slider on the efficiency scale, the system will become more or less robust. The higher the efficiency, the less robust the system. A highly efficient centralized system is much more vulnerable to single-point-of-failure situations. A more distributed approach, while being more resistant to single-point failures, is less efficient. An efficient system will also fail very efficiently.
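To put rough numbers on the slider, here is a toy sketch. It is not taken from any of the case studies: it assumes that sites fail independently with the same probability and that every extra site adds a fixed fraction of overhead cost, and the function names `outage_probability` and `relative_cost` are made up for illustration.

```python
def outage_probability(sites, p_fail):
    """Chance that every site is down at once.

    Assumption: sites fail independently, each with probability p_fail.
    """
    return p_fail ** sites


def relative_cost(sites, overhead_per_site):
    """Running cost relative to a single centralized site.

    Assumption: each additional site adds a fixed fraction of overhead.
    """
    return 1.0 + overhead_per_site * (sites - 1)


if __name__ == "__main__":
    # Hypothetical numbers: 5% failure chance per site, 10% overhead per extra site.
    for sites in (1, 2, 3, 5):
        print(sites, outage_probability(sites, 0.05), relative_cost(sites, 0.10))
```

Under these assumptions, each added site multiplies the chance of a total outage by the per-site failure probability while the cost creeps up linearly. Real failures are rarely independent, so the robustness gain shown here is an upper bound at best.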

Unfortunately, competition pushes systems to move towards higher and higher efficiency, all the while ignoring the massive drop in robustness for each player. Robustness is a difficult quality to measure compared to efficiency, and is thus rarely included in cost/benefit analyses. Efficiency gains can be quoted in percentage terms, but robustness measures rely only on “messy” scenario analyses that are difficult to enumerate and that ask people to imagine the unknown.

While efficiency has been reified, robustness remains an elusive measure that few take into account until a failure actually occurs.

Entropy, Negentropy, Information, and Uncertainty

I am making some more progress with the entropy angle. But there is enormous confusion in terminology and fundamental concepts when it comes to entropy.

In various journal papers, “entropy” has been taken to mean information, randomness, disorder, uncertainty, or increased order. In others, it is “negentropy” that takes these meanings.

A note on the NIH website talks about “Information is not Entropy, Information is not Uncertainty,” and has a nice quote:

The story goes that Shannon didn’t know what to call his measure so he asked von Neumann, who said ‘You should call it entropy … [since] … no one knows what entropy really is, so in a debate you will always have the advantage’

More as I clarify my thinking and stick to one set of terminology and concepts.

Unrepeatable

One of the pillars of the scientific method is repeatability. It should be possible for an experiment performed in one lab under certain parameters to be repeated under exactly the same conditions in any other lab.

Part of this requires at least some parameters that are controlled and easily manipulated by the scientist. From Wikipedia on Dependent and Independent variables:

Dependent variables and independent variables refer to values that change in relationship to each other. The dependent variables are those that are observed to change in response to the independent variables. The independent variables are those that are deliberately manipulated to invoke a change in the dependent variables.

So at least a few parameters need to be easily controlled and manipulated. This allows nice graphs where one axis represents the dependent variable and the other the independent variable.

The result is a plethora of studies where men and mice are put on treadmills and made to walk or jog at a constant pace. The resulting hormone response, weight change, etc. are studied against varying speed, distance, time and so on. These are easily repeatable and result in nice graphs.

The problem is that most natural systems behave as far-from-equilibrium systems, with a lot of “novelty” and enormous variation. Predators follow Lévy distributions in their energy expenditures and predatory ranges. These distributions have highly unstable means and variances (if they exist at all). A Lévy-distributed range looks like this:

[Figure: levyflight.png, an example of a Lévy-distributed range]

There are large jumps that are unique. The mean is finite but unstable and converges only with extremely large data sets. The variance doesn’t even exist. So consider a hypothesis along the lines of:

Lévy flights with infinite variance result in a specific response in other variables.

This can be tested by experimentation, but said experiments will be very difficult to replicate. The sample mean jumps all over the place. There really isn’t any meaningful parameter to be used as an independent variable. Nice graphs are out of the question. Asking people to vary their energy expenditure on a treadmill according to a Lévy distribution will result in a lot of data, but no two experiments will be alike - that’s the whole point of the Lévy distribution. Even two successive experiments at the same lab will be different (else they aren’t really following a Lévy distribution).
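A quick simulation gives a feel for how badly such experiments replicate. This is only a sketch under an assumption of mine: step sizes are drawn from a Pareto distribution with tail exponent 1.5 (finite mean, infinite variance) as a stand-in for Lévy-flight jumps. The helper names `pareto_steps` and `sample_mean_and_variance` are illustrative.

```python
import random


def pareto_steps(n, alpha, seed):
    """Draw n heavy-tailed step sizes (Pareto, minimum 1, tail exponent alpha).

    Assumption: a Pareto tail stands in for the jump sizes of a Levy flight.
    """
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]


def sample_mean_and_variance(xs):
    """Ordinary sample mean and unbiased sample variance."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    alpha = 1.5  # between 1 and 2: the mean exists, the variance does not
    print("theoretical mean:", alpha / (alpha - 1))
    # Each seed plays the role of one lab repeating the "same" experiment.
    for seed in (1, 2, 3):
        steps = pareto_steps(100_000, alpha, seed)
        for n in (100, 10_000, 100_000):
            mean, variance = sample_mean_and_variance(steps[:n])
            print(f"lab {seed}, n={n}: mean={mean:.2f}, variance={variance:.1f}")
```

Each seed is a different "lab". The sample means should wander from run to run and settle only slowly, while the sample variance is dominated by whichever single jump happened to be largest and does not settle at all.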

I’m not saying the types of studies being done now are useless, I’m just pointing out that there’s a fundamental problem with certain types of hypotheses. This wouldn’t be a problem except that so many systems in nature exhibit this Lévy-distributed behavior. And there’s a huge hole in experimentation because the graphs don’t come out nice and the experiments aren’t repeatable.

Fractal Productivity?

From Zen and the Art of Motorcycle Maintenance:

When the paper came due she didn’t have it and was quite upset. She had tried and tried but she just couldn’t think of anything to say.

He had already discussed her with her previous instructors and they’d confirmed his impressions of her. She was very serious, disciplined and hardworking, but extremely dull. Not a spark of creativity in her anywhere. Her eyes, behind the thick-lensed glasses, were the eyes of a drudge. She wasn’t bluffing him, she really couldn’t think of anything to say, and was upset by her inability to do as she was told.

It just stumped him. Now he couldn’t think of anything to say. A silence occurred, and then a peculiar answer: “Narrow it down to the main street of Bozeman.” It was a stroke of insight.

She nodded dutifully and went out. But just before her next class she came back in real distress, tears this time, distress that had obviously been there for a long time. She still couldn’t think of anything to say, and couldn’t understand why, if she couldn’t think of anything about all of Bozeman, she should be able to think of something about just one street.

He was furious. “You’re not looking!” he said. A memory came back of his own dismissal from the University for having too much to say. For every fact there is an infinity of hypotheses. The more you look the more you see. She really wasn’t looking and yet somehow didn’t understand this.

He told her angrily, “Narrow it down to the front of one building on the main street of Bozeman. The Opera House. Start with the upper left-hand brick.”

Her eyes, behind the thick-lensed glasses, opened wide. She came in the next class with a puzzled look and handed him a five-thousand-word essay on the front of the Opera House on the main street of Bozeman, Montana. “I sat in the hamburger stand across the street,” she said, “and started writing about the first brick, and the second brick, and then by the third brick it all started to come and I couldn’t stop. They thought I was crazy, and they kept kidding me, but here it all is. I don’t understand it.”

Haven’t really seen this approach emphasized anywhere else though. I mostly see tips about “drawing up an outline first.”

Underpriced Options, Competition, and Fear

I have written about the role of competition and salience before. I argued that the presence of stiff competition can mean that the option is at least priced correctly, if not overpriced. It’s a matter of looking at how skewed the demand-supply distribution is. If demand is much higher than supply (stiff competition), it’s highly likely that the option is priced appropriately or overpriced. I’m talking about real-world options here.

On the other hand, if supply exceeds demand, the situation is ripe for underpriced or “cheap” options. This means that you would know beforehand that there’s a good chance that the ROI is positive. In the case of overpriced options, there is no clear indication beforehand that it’s rational to enter the transaction at all (but salient options with high payoffs will tend to draw hordes of competitors).

I was going through Ben Casnocha’s archives when I read this:

The main thing I’ve learned in my seven years studying and doing presentations is that the standard for presentations / public speaking in the professional world is low, and as such it’s easy to be seen as great. When most people suck at something, all you have to do is suck less. And when I discovered I had a natural knack for communication, I identified speaking / communicating / presenting as one of my natural strengths that, if built upon, could become an unstoppable strength thanks not only to my own capabilities but because of how I would be perceived relative to the masses.

Since public speaking is reputedly one of the most widely feared activities, there is little competition to speak of (relative to less feared things). And since this fear keeps the average quality low, a small edge can easily be leveraged.

In competitive environments, a small loss of edge will mean leverage in the opposite direction - being tossed out of the pool. Here, a small edge means an increase in positive leverage.

For me this is exciting because I’m always on the lookout for investing in “cheap” options. I now have a criterion that I can use to ferret out such opportunities: keep an eye out for things that people fear.

Of course, it’s very likely that I will initially fear the same things, but if the worst-case scenario is clipped, I will invest in training to overcome the fear, because I know that I will only have to improve by a little to gain a positively leveraged edge that costs very little to acquire.

Distributional Entropy, Information, and Fat-Tails

My latest essay goes deeper into the connection between entropy, information, and kurtosis.

I discuss the behavior of kurtosis and entropy in unconstrained probability distributions and investigate the behavior of sample kurtosis in time-series with non-decreasing kurtosis.

The essay: Distributional Entropy, Information, and Fat-Tails.

I will discuss how infinite kurtosis is manifested in the real world in future posts.

Entropy and Kurtosis

I varied the tail exponent of a Student’s t distribution, taking it from a near-Gaussian distribution to a near-Cauchy, fat-tailed distribution, and noted the relationship between kurtosis and information entropy. Here’s the plot:

[Figure: Kurtosis vs. Entropy]

The implications are interesting. More on that later.
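For readers who want to reproduce a sweep of this kind, here is a minimal sketch. It is not the code behind the plot. It assumes the tail exponent is the degrees-of-freedom parameter `nu` of the standard Student’s t, computes the differential entropy (in nats) by simple numerical integration, and uses the textbook excess-kurtosis formula 6 / (nu - 4). All function names are my own.

```python
import math


def t_log_pdf(x, nu):
    """Log density of the standard Student's t with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return log_norm - (nu + 1) / 2 * math.log1p(x * x / nu)


def t_entropy(nu, n=20_000):
    """Differential entropy in nats, via the midpoint rule.

    The substitution x = tan(theta) maps the real line onto (-pi/2, pi/2),
    so the fat tails are covered without truncating the integral.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * h
        x = math.tan(theta)
        log_f = t_log_pdf(x, nu)
        # -f(x) * log f(x) * dx, with dx = (1 + x^2) * dtheta
        total -= math.exp(log_f) * log_f * (1.0 + x * x) * h
    return total


def t_excess_kurtosis(nu):
    """Excess kurtosis: 6/(nu-4) for nu > 4, infinite for 2 < nu <= 4, else undefined."""
    if nu > 4:
        return 6.0 / (nu - 4.0)
    if nu > 2:
        return math.inf
    return math.nan


if __name__ == "__main__":
    # From near-Cauchy (nu = 1) to near-Gaussian (large nu).
    for nu in (1, 2, 3, 5, 10, 30, 100):
        print(nu, round(t_entropy(nu), 4), t_excess_kurtosis(nu))
```

At nu = 1 the entropy should match the known Cauchy value ln(4π), and for large nu it approaches the Gaussian value ½ ln(2πe). Note that the population kurtosis is already infinite for nu ≤ 4, so only the upper part of the sweep produces finite pairs to plot.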

Earthquakes as Rare Events

The Mercury News is reporting a new study by the US Geological Survey that predicts “California faces a 99.7 percent chance of big quake by 2037.”

How the notion of “probability” is interpreted in these studies is a mystery. The power-law tails on these processes have a very low exponent and the ability to calibrate the probability in the tails shrinks very fast as one moves down the tail.
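A small calculation shows how quickly calibration degrades. It is a sketch under the assumption of a pure Pareto tail, P(X > x) = (x / x_min)^(-alpha). The exponents 1.0 and 1.2 are hypothetical, chosen only to stand for two fits that the data cannot tell apart, and the function names are my own.

```python
def tail_probability(x, alpha, x_min=1.0):
    """P(X > x) under an assumed pure Pareto tail with exponent alpha."""
    if x < x_min:
        return 1.0
    return (x / x_min) ** (-alpha)


def disagreement_ratio(x, alpha_low, alpha_high, x_min=1.0):
    """How many times larger the tail probability is under alpha_low than alpha_high."""
    return tail_probability(x, alpha_low, x_min) / tail_probability(x, alpha_high, x_min)


if __name__ == "__main__":
    # Two hypothetical fits whose exponents differ by only 0.2.
    for x in (10.0, 1_000.0, 1_000_000.0):
        print(x, disagreement_ratio(x, 1.0, 1.2))
```

The ratio equals (x / x_min) raised to the difference in exponents, so with a difference of 0.2 the two fits disagree by a factor of about 1.6 at x = 10, about 4 at x = 1,000, and about 16 at x = 1,000,000. The further down the tail, the less a small uncertainty in the exponent can be tolerated.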

D. A. Freedman and P. B. Stark at Berkeley looked at this problem in a paper titled “What is the chance of an earthquake?” This 2001 paper studied an earlier similar forecast.

From the paper, on the probability forecast:

There is no straightforward interpretation of the USGS probability forecast. Many steps involve models that are largely untestable; modeling choices often seem arbitrary. Frequencies are equated with probabilities, fiducial distributions are used, outcomes are assumed to be equally likely, and subjective probabilities are used in ways that violate Bayes rule.

…and:

Philosophical difficulties aside, the numerical probability values seem rather arbitrary.

They conclude with this:

Probabilities are a distraction. Instead of making forecasts, the USGS could help to improve building codes and to plan the government’s response to the next large earthquake. Bay Area residents should take reasonable precautions, including bracing and bolting their homes as well as securing water heaters, bookcases, and other heavy objects. They should keep first aid supplies, water, and food on hand. They should largely ignore the USGS probability forecast.

Invest in preparation instead of building models and forecasting rare events. This false precision is dangerous if it informs decision making.

Learning in Complex Systems

We cannot learn from history in complex systems, except at the meta level (we can learn that we cannot learn, for example). This follows once one accepts that causality is impossible to pin down given an observation of a phenomenon.

Ericsson et al. show that learning, and becoming an expert, requires:

  1. A tight relationship between action and consequence - i.e., causality needs to be discoverable
  2. Quick and clear feedback on actions
  3. Repeatability

Let us consider whether a brain surgeon passes these criteria. Causality is indeed easily discoverable: nicking an artery in the theatre will make it clear that a mistake has been made, so there is a tight relationship between action and consequence. The feedback is often immediate, at least in terms of the surgical procedure itself. Since there is relatively minor variability between human brains, and since the operating theatre is a controlled environment, the entire process of “brain surgery” is highly repeatable.

This makes brain surgery a classic domain for learning from the past and allows the creation of experts.

Now let us look at managing the environment (enough people are picking on economists now, so I’ll skip them), and see if it is possible to learn from the past and become an expert in this domain. Are the three criteria satisfied?

First, causality is very difficult to identify. Small changes in any number of parameters, ranging from climate to prey availability, can have a large and sudden impact on predator populations. Given a phenomenon, it is hard to say what caused it. If a human action is involved, outcomes are unpredictable and can spiral out of control when coupled with external variations (see the many instances of predator introduction for hilarious examples).

Second, the results can take years to manifest themselves. The chain of events precipitated by an action can cascade into a series of relatively small changes until it shows up as a big impact a long way down the daisy-chain. Feedback on actions is far from immediate. Often, the people who initiated the action are no longer even in charge. Attempts to control wolf populations in Yellowstone National Park had impacts decades down the line.

Third, the environment is not controlled. Conditions that held once need not hold after a period of time has passed. Rivers dry up, predator-prey populations change all the time, and climate changes make any assumptions of stationary processes unjustified. Actions that worked earlier need not work now, because the underlying mechanisms may have changed.

All three conditions for learning and expertise are violated. Under these conditions, it is not possible to learn from the past and become an expert.

An “expert” at managing the environment can only be someone who takes a very fallible approach and shows humility and respect in the face of the unknown.

Other domains where, by the nature of the system itself, experts cannot exist: the movie industry, publishing, economics, stock markets, climate, politics, wars, … the list goes on.