The gap between typical and average (my ergodicity primer)

How far can typical and average diverge from each other? In many settings the answer is not much, but in many more it can be a surprisingly large amount. These settings are often not distinguished, though, which leads to misleading statistics and beliefs. What differentiates the two types of system? The answer is ergodicity.

Ergodicity is an idea that seems to be getting more attention recently, and one I came across while watching a lecture by Ole Peters[1]. The basic premise is that there is an important distinction between systems where the typical individual and the average of the group end up very far apart and systems where they end up close together. This essay will not be very mathematical; in fact I am going to deliberately draw much of my discussion away from the technical foundations of the theory and towards a more intuitive understanding of the core concepts. I also want to bring the ideas of ergodicity into a few areas where they are mentioned less often. Of course the usual settings, like gambles, are some of the best ways to illustrate the concepts, and so they feature too.

Why should you care about ergodicity?

Ergodicity is closely interlinked with other concepts like fragility and asymmetry, so many systems that have those properties can also be viewed through the lens of ergodicity. What’s more, ergodicity explicitly makes one consider a type of risk that is often implicitly left out of discussions: the risk of ruin. One of the main reasons for a system being non ergodic is that there exists some sort of threshold from which one cannot return, and it is important to distinguish this risk of ruin from the regular risk of loss.

The concept is also one that, once you gain some intuitive understanding of it, will make you more prone to question statements about averages and probabilities. There are many situations where these are useful statements to make, but many others where they can be misleading or even damaging. Having some grasp of ergodicity better prepares you to make the distinction.

What is ergodicity?

An ergodic game/system is one where

• The game can’t end prematurely. There are no absorbing barriers like death or bankruptcy
• The ensemble (group) average and the time average are the same
• …therefore a typical individual is likely to experience something close to the average of the group as a whole
• The past can be a useful predictor of the future

What is a non ergodic system?

• The time average and the group average diverge: it is unlikely that an individual taking part in the system will actually experience the average for the group as a whole
• The system may well have small and expected gains but over time it becomes very unlikely that these gains will persist
• Probabilistic statements can be misleading. These often talk about the group as a whole but if you only care about what you are likely to experience then they can be unhelpful

The classic example

If you’ve seen/read any other articles on ergodicity you will almost certainly have come across the game where your wealth grows by 50% if a fair coin comes up heads and shrinks by 40% if it comes up tails. If not, never fear, for we shall have a quick look at it below.

You may very well say that this sounds like an appealing gamble, after all, the expected value of any toss is +5% so you expect to get on average 5% richer each turn. Consider for a moment though where you expect to be after 2 coin flips. Here you have four equally likely options with your wealth changing accordingly:

• HH +125%
• HT -10%
• TH -10%
• TT -64%

You may already be starting to see that the gains are getting quite concentrated and that most of the time you will lose money but that if you make money you are likely to have made a lot.
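The same bookkeeping can be pushed beyond two flips. The sketch below is mine rather than anything from the sources cited here: it groups the equally likely paths by their number of heads and reports the expected multiplier, the (lower) median multiplier and the chance of ending below where you started. The names `wealth_distribution` and `summarise` are made up for illustration.

```python
from math import comb

UP, DOWN = 1.5, 0.6  # heads: wealth +50%, tails: wealth -40%


def wealth_distribution(n_flips):
    """(multiplier, probability) pairs, one per possible number of heads.

    The list is ordered from fewest heads to most, so multipliers ascend.
    """
    total_paths = 2 ** n_flips
    return [
        (UP ** heads * DOWN ** (n_flips - heads),
         comb(n_flips, heads) / total_paths)
        for heads in range(n_flips + 1)
    ]


def summarise(n_flips):
    """Return (expected multiplier, lower median multiplier, P(loss))."""
    dist = wealth_distribution(n_flips)
    expected = sum(w * p for w, p in dist)
    prob_loss = sum(p for w, p in dist if w < 1)
    # Walk up from the poorest outcome until half the probability is covered.
    cumulative = 0.0
    median = dist[-1][0]
    for w, p in dist:
        cumulative += p
        if cumulative >= 0.5:
            median = w
            break
    return expected, median, prob_loss


for n in (2, 10, 100):
    expected, median, prob_loss = summarise(n)
    print(f"{n:>3} flips: expected x{expected:.4g}, "
          f"median x{median:.4g}, P(loss) {prob_loss:.0%}")
```

For two flips this reproduces the list above (expected x1.1025, median x0.9, 75% of paths losing). By 100 flips the expected multiplier is around x131 while the median is around x0.005.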

The graph below, made by Ole Peters[2], shows the average wealth for an infinitely large group (blue) and the expected growth rate of wealth for the typical individual (red).

As we can see, the two values diverge pretty dramatically, with the group average getting very large but the typical person in the group becoming worth next to nothing. This happens because the gains become extremely concentrated: most of the wealth that makes up the average is owned by a very small group of people who are extraordinarily wealthy, whilst almost everyone else is worth nothing. This is an example of a non ergodic system.

It is non ergodic because any individual playing this game repeatedly is almost certain to see their wealth approach zero (the time average), while the average wealth for the group as a whole (the group average) keeps growing. So here we have a system where the typical individual will never experience the average wealth of the group as a whole. If you were presented with this gamble and made your decision based on the +5% expected value alone, you would be implicitly assuming that you could gain access to those few runs where people did exceptionally well and share those gains with the multitude of runs where you ended up with nothing. This is not a very intuitive concept at first, especially because in the example above you expect to get richer on every coin flip and yet you are almost certain to lose everything.
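To put numbers on the two averages, here is a minimal sketch (my own, not taken from Peters). It assumes wealth is multiplied by 1.5 or 0.6 on each flip, uses the arithmetic mean of the two multipliers for the group average and their geometric mean for the time average, then follows one simulated player to see which of the two they actually experience. The function name `realised_growth` and the seed are arbitrary choices.

```python
import math
import random

UP, DOWN = 1.5, 0.6  # heads: wealth +50%, tails: wealth -40%

# Group (ensemble) average growth per flip: the familiar expected value.
ensemble_growth = 0.5 * UP + 0.5 * DOWN   # 1.05, i.e. +5% per flip

# Time-average growth per flip: the geometric mean of the two multipliers.
time_growth = math.sqrt(UP * DOWN)        # about 0.9487, i.e. roughly -5% per flip


def realised_growth(n_flips, rng):
    """Per-flip growth factor that one player experiences over n_flips."""
    log_wealth = 0.0  # work in logs so wealth never underflows to zero
    for _ in range(n_flips):
        log_wealth += math.log(UP if rng.random() < 0.5 else DOWN)
    return math.exp(log_wealth / n_flips)


rng = random.Random(0)  # fixed seed so a run can be repeated
print(f"group average per flip: {ensemble_growth:.4f}")
print(f"time average per flip:  {time_growth:.4f}")
for n in (10, 1_000, 100_000):
    print(f"one player over {n:>7} flips: {realised_growth(n, rng):.4f}")
```

Over a short run the player's realised growth can land almost anywhere, but as the number of flips grows it settles near the time average rather than the +5% group average.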

What makes ergodicity difficult to grasp

The concept of ergodicity places heavy emphasis on the weakness of expected value, and expected utility, in many situations. Expected value is probably one of the best known statistical ideas, and in some situations it does a perfectly good job. The problem is that in other situations it fails miserably to give an accurate representation of the dynamics of the system, and unfortunately these situations are rarely differentiated. Expected value is a framework many people operate with and it is understandably difficult to shake off. I’m not at all saying that one should abandon it, just that it can be misapplied in some settings.

There can be situations where locally optimal actions lead to almost certain ruin, e.g. flipping the coin in the first example. This can be a major issue as there is no hard and fast rule as to when you should stop playing. In the first example I would have to put much more consideration into playing if I were told I had to play 100 flips rather than 1 flip. This also leads to situations where you need to decide to stop doing something that not just looks likely but is very likely to benefit you in the short run.

The idea of accessing the group’s average is a bit of a funny one at first. I try to view it as something like this: if you and your friends all took some particular action, would you expect to see a large disparity in results within the group? It is worth noting that the word ‘group’ or ‘ensemble’ can have a few meanings.

The group could be 10 people who all play Russian roulette together, or it could mean 30 versions of yourself, one for each of the next 30 years, with each version investing £100 for the year. In the first example you cannot share resources between the group because one of you will be dead and so can’t receive them; in the second you can’t share resources because we haven’t come up with a way of sending money back in time.

Related to the sending money back in time issue, ergodicity requires one to take very seriously the fact that different routes can have very different effects even if they end up in the same place. I’ll look more at this in my retirement example but it can be quite effortful to consider the effect of lots of different datasets that all have the same overall characteristics.[3]

Frames

One of the best ways I found for getting a better intuitive grasp on ergodicity was by reading/coming up with situations that obeyed the central ideas of the concept but looked very different.

Retirement savings[4]

Setup: You started saving for retirement when you were 30 with £120,000 in the bank, retired when you were 60 and you plan to spend £30,000 a year in retirement. Between when you were 30 and 60 the market returned an average of 5% a year and it is expected to return the same average for the next 30 years. However there are two scenarios for these returns.

1. The market returns exactly 5% a year, every year
2. The market has uneven returns, with much larger returns towards the end of the 30 year cycle than at the start. It still averages 5% a year over the whole 30 years though.

How much will this affect your spending habits? If you were told that the market averages 5% a year and so assume (1), then you are assuming an ergodic system where you can effectively access future gains in order to smooth year to year returns.

The top two lines show your wealth if you didn’t make any withdrawals; as we can see, at the end of the 30 years you will have the same amount either way. The bottom two lines show what happens if you make your planned withdrawals. As we can see, by not being able to get access to the group average (in this case the average return over all 30 years), which is the light blue line, you end up running out of money in just over 20 years.
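The sketch below is a rough way to reproduce the shape of this result, not the data behind the chart. It rests on several assumptions of mine: the pot at 60 is £120,000 compounded at 5% for 30 years (about £518,600), each year's return is applied before that year's £30,000 withdrawal, and the uneven path is a made-up one with 0% returns for 15 years followed by 10.25% a year, chosen because it compounds to the same 30 year total as a steady 5%.

```python
import math


def simulate(pot, returns, withdrawal):
    """Apply each year's return, then the withdrawal; return yearly balances."""
    balances = []
    for r in returns:
        pot = max(pot * (1 + r) - withdrawal, 0.0)  # a pot cannot go negative
        balances.append(pot)
    return balances


def years_until_broke(balances):
    """First year (1-based) in which the pot is empty, or None if it never is."""
    for year, balance in enumerate(balances, start=1):
        if balance == 0.0:
            return year
    return None


YEARS = 30
start_pot = 120_000 * 1.05 ** 30         # assumed pot at retirement
steady = [0.05] * YEARS                  # scenario 1: exactly 5% every year
uneven = [0.0] * 15 + [0.1025] * 15      # scenario 2 (hypothetical): late gains
# 1.1025 is 1.05 squared, so both paths compound to the same 30 year total.

for name, path in (("steady", steady), ("uneven", uneven)):
    untouched = simulate(start_pot, path, 0)[-1]
    spending = simulate(start_pot, path, 30_000)
    print(f"{name}: no withdrawals ends at {untouched:,.0f}, "
          f"with withdrawals broke in year {years_until_broke(spending)}")
```

With no withdrawals the two paths finish level. With withdrawals the steady path is still solvent after 30 years, while the uneven path runs dry well before then. The exact year depends entirely on the return path assumed, so it will not match the chart.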

In a more general sense this kind of thing happened to LTCM where their system would have returned healthy profits if they had remained solvent long enough but they were unable to profit from the average returns they could have got over the whole 1997-8 crisis as they went bankrupt first.

Health

Lots of decisions when it comes to your health or medicine are non ergodic. 100 people taking a small dose of a toxin (aka a vaccine) each benefit from it, but if one person were to take one large dose the results would likely not be so great.

Similarly, Nassim Taleb gives an example[5] where each individual cigarette a smoker smokes would likely win out on a cost benefit analysis, but the repeated exposure to the risk leads to serious issues.

Traffic

The two features I think are important to look for to see whether a system is non ergodic are whether a typical user is likely to experience the group average and how the results for a user change as they repeat an interaction. With traffic, assuming we are on a road that will sometimes suffer from congestion, the typical travel time is unlikely to be the same as the average travel time. Sometimes you will move swiftly through an area; other times you may be in near standstill traffic.

Careers

A career would be non ergodic if you would expect the typical practitioner to have a very different experience from the average of all practitioners. For example, being a hedge fund manager or a writer would likely be a very non ergodic career choice. The typical writer is unlikely to sell many books. The average for all writers, however, will likely be a few hundred or thousand books, made up of most writers selling next to nothing and a few selling millions of copies. One of the more obvious examples of a non ergodic career is that of being an ‘entrepreneur’. The average success of the whole group is likely to be high, but most people are either very successful or not very successful; few are in the middle.

Being a taxi driver is likely to be a very ergodic career path. If you drive a taxi 100 times or get 100 people to drive a taxi once, the average amount earned per ride is likely to be fairly similar.

Social networks

If you were to start one social network, before there were any, the chances are it would have failed. If, however, you were to get 100 people to each start one, one of them may well be Facebook. If this were the case, the average success/number of users would be very high but it would be made up almost entirely of one company. Social networks work particularly well as an example because they tend to have rather binary outcomes, with a very strong pull towards the absorbing barrier of not having enough users to survive.

Taming non ergodicity

So can one tame a situation that looks non ergodic? There are ways to do this in investing, such as the Kelly Criterion or a barbell strategy, where you optimise the average growth rate of your wealth and diversify, but I’ll leave that to others to explain[6]. Why? Well, I want to focus on a part of taming non ergodicity that is less specific to investing and that I haven’t seen talked about much.

The problem with non ergodic systems is that the typical individual can’t gain access to the group average, but does that mean no one can? There are a whole host of companies that I see as acting like a, to use my allowance of buzzwords, ergodicity meta-layer. They have structured themselves in such a way that they can gain access to the group average without having to endure anywhere near as much risk as the typical individual.

The most obvious example I see is a VC firm. Here they are knowingly trying to get access to a group of companies they know operate in a very non ergodic system. If you see a fund return 5x, it is much more likely that this is the result of one or two investments returning, say, 40x, with the rest effectively being a rounding error. To be sure, not all firms do this as well as each other, and in fact, without going down the rabbit hole, the system of VC funds is itself a non ergodic one: few funds return the average for the sector, and most of the returns are concentrated in a few big winners.

Another, maybe less obvious, example is a company like Substack. Here they get access to the very non ergodic system of newsletters, where the typical individual is likely to have few followers, especially paid followers, but a few, like The Browser, will have a great many paying subscribers. Substack’s position allows them to effectively get a cut of the group’s average.

Conclusion

So what do I want you to take away from this?

1. That ergodicity is a concept that classifies systems depending on how the typical individual/experience will differ from the average of the group as a whole.
2. That there are situations that seem to present real gains at almost every opportunity but can still leave you almost certain to lose
3. That when you hear a statement about an average outcome you get a little voice in the back of your head that makes you question whether the average is really that useful a metric in the given situation
4. That there are ways to gain access to the group average; doing so is difficult, but potentially very lucrative

I also hope you find some level of enjoyment in the concept itself. When I first saw the example at the beginning I immediately made a simulation to play around with flipping coins, and I was amazed by the difference between what I expected and the result. The experience was similar to questions on compounding: even if you have some idea that things get bigger or less equal, the magnitude at which it happens can still be shocking.

Ergodicity is a powerful way of looking at systems that allows you to assess how useful probabilistic statements are and whether you might want to reconsider the dynamics at play. I hope I have helped provide some understanding that you may not have had before.

Footnotes

[3] Not necessarily fully related but to give some idea of how many different paths can have the same global descriptors I give you Anscombe’s quartet

[4] heavily inspired by https://taylorpearson.me/ergodicity/