# The Evolution of Risk Preferences

In this post, I explore which risk preferences are evolutionarily adaptive under various conditions.

## Fair wager model

To understand the long-term effects of risk aversion versus risk neutrality, consider an evolutionary player who starts with 10 apples and repeatedly makes a wager that wins one apple half of the time and loses one apple the other half. The wager is “fair” because its expected payoff is 0. Every player in a population is offered the same wagers, but the outcomes are independent (idiosyncratic) across players. A player who loses all his apples dies and stops playing. The wager is repeated 10000 times within a short time span, and we record the number of apples owned at the end, which is proportional to population size or fitness. Three thousand simulations are run and the distribution of the number of apples is plotted in figure 1.1.
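The setup above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind figure 1.1: the function names and the seed are hypothetical, and the demo uses fewer runs and wagers than the 3000 × 10000 described above so that it finishes quickly in pure Python.

```python
import random


def simulate_fair_wager(start=10, n_wagers=10_000, rng=None):
    """Play up to n_wagers fair +1/-1 wagers; return final apples (0 means dead)."""
    if rng is None:
        rng = random.Random()
    apples = start
    for _ in range(n_wagers):
        apples += 1 if rng.random() < 0.5 else -1
        if apples == 0:  # the player has lost every apple and dies
            return 0
    return apples


def fair_wager_summary(n_runs=500, start=10, n_wagers=2_000, seed=0):
    """Return (death rate, mean final apples) over n_runs independent players."""
    rng = random.Random(seed)
    finals = [simulate_fair_wager(start, n_wagers, rng) for _ in range(n_runs)]
    death_rate = sum(f == 0 for f in finals) / n_runs
    mean_final = sum(finals) / n_runs
    return death_rate, mean_final


if __name__ == "__main__":
    death_rate, mean_final = fair_wager_summary()
    print(f"death rate: {death_rate:.2f}, mean final apples: {mean_final:.2f}")
```

With more wagers the death rate should creep toward 1 while the mean stays near the starting 10 apples, up to sampling noise.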

As plotted in figure 1.1, in the long run the player almost surely dies, and the distribution is severely right-skewed. The single-period expected payoff and the long-run expected payoff are both 0 apples: starvation does not change the expected future payoff, because it is 0 whether the player keeps wagering or is dead. Can we say evolution disfavors this kind of wager because the extinction rate is high? Without additional assumptions, the answer is no. If individuals in a population either always accept or always decline this wager, then after many generations more than 99% of the surviving lineages will descend from the risk-averse individuals who decline it, which makes risk aversion look like the dominant strategy. However, if we count surviving individuals instead, the two risk preferences are equally successful: their total population sizes are equal because the expected payoff of both is 0. We will use this method to define fitness in the rest of the article, so we can assume that what evolution maximizes is the long-run expected number of offspring produced. Assuming every offspring is the same, an organism selected to maximize long-run fitness should maximize the number of his children (and kin) in his lifetime.
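The claim that the expected number of apples never moves, even though almost every player eventually dies, follows from a one-line calculation. Writing $W_t$ for the apples held after $t$ wagers (notation introduced here only for illustration):

$$
\mathbb{E}[W_{t+1} \mid W_t = w] = \tfrac{1}{2}(w+1) + \tfrac{1}{2}(w-1) = w \quad (w > 0), \qquad \mathbb{E}[W_{t+1} \mid W_t = 0] = 0 .
$$

So $\mathbb{E}[W_t] = W_0 = 10$ for every $t$. As the survival probability shrinks, the same mean is carried by fewer survivors who each hold more apples, which is the right-skewed distribution described above.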

Then what makes most organisms risk averse in normal situations? (Harsh conditions can turn individuals into risk seekers.) Over a period much shorter than a lifetime, limited reproductive capacity means a female’s number of offspring grows sublinearly with the amount of resources obtained, which makes the wager unfavorable when she is already wealthy (figure 1.2). This diminishing marginal fecundity makes reproductive success depend on the number of apples in stock. Where reproduction schedules are flexible, as in insects, or where kin selection takes place, diminishing marginal fecundity should be less pronounced. Because of diminishing marginal fecundity, when the player already has hundreds of apples the wager becomes something like +0.4 vs -0.5 in fitness terms, and the expected payoff becomes negative. In the long run, every wager faced is biased in this way, and we call the long-run result diminishing marginal fitness. Since the expected fitness change is negative, declining the wager is now the dominant strategy: the relative frequency of those who accept it will decrease in the long run. Those who decline the wager show risk-averse behavior, and the cognitive proxy for this could be either a distaste for risk or loss aversion; the latter further lowers the utility of losing the wager. Note that an environmental bottleneck is not an additional driver of risk aversion unless it affects larger species or groups disproportionately more than less numerous ones.
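To make the "fair in apples, unfair in fitness" argument concrete, here is a small sketch. The fecundity curve is an assumption: it borrows the saturating form $K(1-e^{-r/K})$ that this post uses later for newborns, with a hypothetical capacity of 100, and is not fitted to figure 1.2. The function names are illustrative.

```python
import math


def fecundity(apples, capacity=100.0):
    """Concave, saturating map from apples to offspring (an assumed functional form)."""
    return capacity * (1.0 - math.exp(-apples / capacity))


def fitness_outcomes(apples, stake=1.0, capacity=100.0):
    """Return (fitness change if the wager is won, fitness change if it is lost)."""
    base = fecundity(apples, capacity)
    win = fecundity(apples + stake, capacity) - base
    lose = fecundity(apples - stake, capacity) - base
    return win, lose


def expected_fitness_change(apples, stake=1.0, capacity=100.0):
    """Expected fitness change from a 50-50 wager of +stake vs -stake apples."""
    win, lose = fitness_outcomes(apples, stake, capacity)
    return 0.5 * win + 0.5 * lose


if __name__ == "__main__":
    win, lose = fitness_outcomes(300.0)
    print(f"win: {win:+.4f}, lose: {lose:+.4f}")
    print(f"expected change: {expected_fitness_change(300.0):+.6f}")
```

Under any strictly concave curve the gain from winning is smaller than the loss from losing, so the expected fitness change of a wager that is fair in apples is negative. The exact numbers depend entirely on the assumed curve.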

Related to that is the carrying capacity of the environment due to limited food. Let’s say that no more than 100 apples will be awarded (in net amount) regardless of the number of wagers made. Then when the player’s number of apples approaches 100, the wager will simply stop being offered and the player’s wealth stagnates. Again the expected payoff of each wager is not affected and risk aversion is not selected for by this.

Because of the concavity of the fitness curve (figure 1.2), risk aversion should be stronger when the stakes are higher. So fewer people should be willing to make the wager if the stakes increase to winning 1000 apples vs losing 1000 apples. Moreover, if the wager is broken into a series of ten smaller wagers of winning 100 apples vs losing 100 apples, more people should accept it because the risk is lower (e.g. the chance of gaining or losing the full 1000 apples is lower). Similarly, if a group of players makes the wager and splits the payoff, they should be less risk averse.
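The effect of splitting a wager can be computed exactly for a given fitness curve. The sketch below assumes a hypothetical saturating curve with capacity 1000, a starting stock of 2000 apples, and fitness evaluated only once after all wagers are settled. None of these numbers come from the post.

```python
import math


def fitness(apples, capacity=1000.0):
    """Assumed concave fitness curve that saturates at `capacity`."""
    return capacity * (1.0 - math.exp(-apples / capacity))


def expected_fitness_after(start, stake, n_wagers, capacity=1000.0):
    """Exact expected fitness after n_wagers independent 50-50 wagers of +/-stake."""
    total = 0.0
    for wins in range(n_wagers + 1):
        prob = math.comb(n_wagers, wins) * 0.5 ** n_wagers
        apples = start + stake * (2 * wins - n_wagers)
        total += prob * fitness(apples, capacity)
    return total


if __name__ == "__main__":
    start = 2000.0
    print("decline:        ", fitness(start))
    print("one +/-1000:    ", expected_fitness_after(start, 1000.0, 1))
    print("ten +/-100 each:", expected_fitness_after(start, 100.0, 10))
```

The same calculation covers ten players who each take the ±1000 wager and split the pooled result equally, since each member's share is then the sum of ten independent ±100 outcomes.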

Since males have a higher reproductive capacity than females in a polygamous species, males should be less risk averse than females. For male bachelors in these species, the reproductive success can grow superlinearly with resources obtained, making the bachelors risk seeking. Another reason the fitness function can be convex is the fixed cost needed before any mating can occur. This includes resources needed to feed oneself and attract mates. Thus the fitness function can be a sigmoid (S shaped) curve that is convex when fitness is low and concave when fitness is high especially in polygamous species (figure 1.3).
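A sigmoid fitness curve flips the sign of a fair wager's value depending on where the player sits on it. The sketch below uses a logistic curve as a stand-in for figure 1.3. The midpoint, scale, and maximum are hypothetical parameters chosen only for illustration.

```python
import math


def sigmoid_fitness(apples, midpoint=50.0, scale=10.0, max_offspring=20.0):
    """Assumed logistic fitness curve: convex below `midpoint`, concave above it."""
    return max_offspring / (1.0 + math.exp(-(apples - midpoint) / scale))


def wager_value(apples, stake=5.0):
    """Expected fitness change from a fair 50-50 wager of +/-stake apples."""
    stay = sigmoid_fitness(apples)
    gamble = 0.5 * sigmoid_fitness(apples + stake) + 0.5 * sigmoid_fitness(apples - stake)
    return gamble - stay


if __name__ == "__main__":
    for apples in (20.0, 50.0, 80.0):
        print(f"{apples:5.1f} apples -> wager value {wager_value(apples):+.4f}")
```

Below the midpoint the wager has positive expected fitness (risk seeking pays), above it the value is negative (risk aversion pays), and at the midpoint of this symmetric curve the two effects cancel.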

In some cases, current rewards can change the future flow of rewards, such as when over-extraction in a short period leads to the disappearance of the food source. With diminishing marginal fitness, the player starts to care about preserving the apple tree: if the tree lives and yields small amounts of apples over a long time, his apple stock stays lower, so he remains in the linear region of the fitness curve, where taking fair wagers makes him as successful as risk avoiders. Generally, if wagers are spread out over a long time, the player is more likely to accept them.

Similarly, in group selection, over-extraction makes life harder for other group members. Members with a large apple stock face a lower marginal fitness than members with few apples. So richer members will voluntarily give up the wagers in order for the poorer members to receive more wagers to maximize group fitness. This is another way that diminishing marginal fitness results in (conditional) risk aversion.

## Favorable wager model

So far, the player does not show any fear of death because either way the expected future payoff is 0 (or less for making wagers, once diminishing marginal fitness is considered). However, if we change the 10000 wagers to a variety of favorable wagers, such as getting 1.1 apples vs losing 0.9 apples, the expected payoff in the long run becomes very large even after accounting for diminishing marginal fitness. Now death means losing the chance to receive many apples in the future, and every apple lost is a step closer to death. In terms of fitness, even if he never gained another apple, the player could gradually convert the apples he already owns into fecundity to increase fitness. Therefore, for anyone with positive wealth, death should always be avoided since the expected future fitness gain remains positive. In this section, we will show other ways risk aversion can arise even without diminishing marginal fitness.

To illustrate how risk in favorable wagers can change the long run expected payoff, consider three favorable wagers with 50-50 odds: A – getting 0.6 apples vs losing 0.4 apples, B – getting 1.1 apples vs losing 0.9 apples, and C – getting 2.1 apples vs losing 1.9 apples. All wagers have a single period expected payoff of 0.1 apples but wager A is least risky and wager C is most risky. Figure 2.1 shows the final distribution of the number of apples for each wager. We see that riskier wagers result in a higher death rate which leads to a lower long run expected payoff and risk averse behavior. The expected payoff maximizing player needs to minimize the death rate but once there is no threat of death (e.g. when he owns many apples), the three wagers look the same by having the same expected payoff (around 1000) conditional on survival.
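The comparison of wagers A, B, and C can be reproduced in miniature. This is a sketch with hypothetical function names and a fixed seed. It runs far fewer and shorter simulations than figure 2.1, so the numbers will differ from the figure, but the ordering of the death rates should be visible.

```python
import random


def play(start, gain, loss, n_wagers, rng):
    """Return final apples after up to n_wagers 50-50 wagers; 0.0 means death."""
    apples = start
    for _ in range(n_wagers):
        apples += gain if rng.random() < 0.5 else -loss
        if apples <= 0:  # out of apples: the player dies and stops playing
            return 0.0
    return apples


def death_rate_and_mean(gain, loss, start=10.0, n_wagers=2_000, n_runs=300, seed=0):
    """Return (death rate, mean final apples) over n_runs independent players."""
    rng = random.Random(seed)
    finals = [play(start, gain, loss, n_wagers, rng) for _ in range(n_runs)]
    deaths = sum(f == 0.0 for f in finals)
    return deaths / n_runs, sum(finals) / n_runs


# (gain, loss) pairs taken from the text; each has expected payoff 0.1 apples.
WAGERS = {"A": (0.6, 0.4), "B": (1.1, 0.9), "C": (2.1, 1.9)}

if __name__ == "__main__":
    for name, (gain, loss) in WAGERS.items():
        rate, mean = death_rate_and_mean(gain, loss)
        print(f"wager {name}: death rate {rate:.2f}, mean final apples {mean:.1f}")
```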

More subtly, the long-run death rate does not increase linearly with the magnitude of risk, as measured for example by the difference between the wager outcomes. The expected payoff as a function of risk level is graphed in figure 2.2. The curve turns at around risk = 1.2, but this is an inflection point rather than a tipping point. The shape of the curve could favor a kind of risk preference that ignores risk when it is small (i.e. below 1.2) and is most sensitive to intermediate risk levels.
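A rough closed-form approximation gives some intuition for why small risks barely matter. The sketch below is not how figure 2.2 was produced: it swaps the discrete game for a continuous-time (Brownian) approximation, in which the probability of ever hitting zero from wealth $x$ with per-wager mean $\mu$ and variance $\sigma^2$ is $e^{-2\mu x/\sigma^2}$. It also assumes that "risk" means the gap between the winning and losing outcome, and that survivors end with about `start + mean_gain * n_wagers` apples.

```python
import math


def approx_survival(start, mean_gain, risk):
    """Approximate long-run survival probability for a favorable 50-50 wager.

    `risk` is the gap between the two outcomes, so one wager has standard
    deviation risk / 2. Requires mean_gain > 0 and risk > 0.
    """
    variance = (risk / 2.0) ** 2
    return 1.0 - math.exp(-2.0 * mean_gain * start / variance)


def approx_expected_payoff(start, mean_gain, risk, n_wagers=10_000):
    """Approximate survival probability times the approximate wealth of a survivor."""
    return approx_survival(start, mean_gain, risk) * (start + mean_gain * n_wagers)


if __name__ == "__main__":
    for risk in (0.5, 1.0, 1.2, 2.0, 3.0, 4.0):
        payoff = approx_expected_payoff(10.0, 0.1, risk)
        print(f"risk {risk:3.1f}: approx expected payoff {payoff:7.1f}")
```

In this approximation the payoff is nearly flat for small risk and drops fastest at intermediate risk, which is qualitatively the shape described above. The approximation tends to overstate the death rate of the discrete game, so the simulated values remain the reference.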

Given the risk of the wager, the threat of death depends only on the current wealth level. For wager B, we try three different numbers of starting apples (5, 7.5, and 10) and compare the resulting distributions in figure 2.3. As expected, the lower the starting wealth, the higher the death rate. The expected payoffs are 669, 814, and 884 apples respectively. If the player has 7.5 apples and is offered a 50-50 wager of either getting 2.5 apples or losing 2.5 apples, he is effectively choosing among the three starting conditions. His one-period expected payoff is 0 apples, but his long-run expected payoff, compared to the current condition, is (884 + 669) / 2 - 814 ≈ -38 apples! On the surface, the player acts risk averse by rejecting a fair wager. Cognitively, the player can experience loss aversion and consider losing 2.5 apples worse than not winning 2.5 apples.
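The arithmetic behind this comparison is short enough to write out. The sketch below uses the three expected payoffs listed above (669, 814, and 884 apples). The function name and the lookup table are illustrative.

```python
# Long-run expected payoffs (in apples) reported above for wager B,
# keyed by the number of starting apples.
long_run_payoff = {5.0: 669.0, 7.5: 814.0, 10.0: 884.0}


def value_of_fair_wager(current, stake, payoff_table):
    """Long-run value of a 50-50 wager of +/-stake apples, relative to declining it."""
    win = payoff_table[current + stake]
    lose = payoff_table[current - stake]
    return 0.5 * (win + lose) - payoff_table[current]


print(value_of_fair_wager(7.5, 2.5, long_run_payoff))  # prints -37.5
```

The result, -37.5, rounds to the roughly -38 apples quoted above: a wager that is fair over one period is costly in the long run because the payoff curve is concave at this wealth level.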

Figure 2.4 plots the relationship between starting wealth and expected payoff. The expected payoff starts concave but gradually becomes linear as the starting wealth increases and the death rate drops to nearly 0. This means risk aversion and loss aversion are pronounced when the wealth level is low enough. In the example wager, initial wealth levels of 5 to 10 are low enough to cause deaths, which places them on the concave part of the curve and results in risk-averse behavior.

In conclusion, the magnitude of the risk and the current wealth both affect the degree of death aversion. Death aversion is strongest when paired with a large expected future fitness gain and/or a low level of current wealth. Death aversion and diminishing marginal fitness are the two things leading to risk-averse behavior. Interestingly, at least one of the two will usually apply, making risk aversion universally adaptive. During good times, more nutrition probably does not yield even more offspring, so diminishing marginal fitness takes place. During bad times, the future is more likely to be brighter than the present, so death aversion takes place. Even during average periods, the expected future fitness change is probably positive since having more offspring during the rest of one’s life remains a possibility. Yes, postmenopausal disabled grandmothers compete for resources with their kin and have zero or negative expected future fitness gain, but such cases occur only occasionally.

## Unfavorable wager model

In history, there are harsh periods when people lose wealth on average. These episodes must be temporary, otherwise nobody would ever survive. The variables that can affect the expected payoff at the end of the harsh period include the magnitude of the risk (the difference between winning and losing a wager), the number of wagers offered, the starting wealth, and the probability of winning the wagers. Fixing the probability of winning at 50% and the expected single-wager payoff at -0.1 apples, we run 3000 simulations for each combination of a few common values of the remaining three variables. The ranges of the tested values for these variables are 0.25 to 16 apples, 3 to 243 wagers, and 2 to 32 initial apples, respectively. To save some space, we present the results in a linear regression of the expected payoff on the three variables:

A higher expected payoff at the end of the hard period results from a higher risk, a longer duration in terms of the number of wagers, and a lower starting wealth. Taking the logarithm of the response variable gives a worse fit. Adding second-order terms or two-way interaction terms for the covariates gives some statistically significant results and improves the fit (adjusted R-squared 0.992 and 0.975 respectively), but the original regression already fits well and provides a simpler interpretation.

The only less straightforward result is the positive relationship between risk and expected end payoff, which is the opposite of that in the favorable wager model. This relationship is shown in the left graph below, holding the starting wealth at 10 apples and running for 100 periods. The generally positive slope means expected-payoff-maximizing players are now risk loving. The expected payoff flattens at around risk = 20 and would eventually approach a slope of 1 for very large risk levels. The irregularity at risk = 20 might arise because above this value the player can suddenly die after 1 wager instead of 2. In the right graph below, we can see that the death rate actually increases with the risk level and approaches about 90%. The simultaneous increase in death rate and expected payoff is explained by the right-skewed distribution of the payoff in riskier scenarios. Both graphs actually have an inflection point at x < 0.2, which is not visible.
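The joint rise of death rate and expected end wealth can be checked with a small simulation. This is a sketch under the stated setup (50-50 wagers, mean payoff -0.1 apples, 10 starting apples, 100 wagers). The function names, the seed, and the three risk levels tried in the demo are illustrative choices, and dead players are assumed to end the period with exactly 0 apples.

```python
import random


def harsh_period(start, risk, n_wagers, rng, mean_payoff=-0.1):
    """Return end-of-period apples; `risk` is the gap between the two outcomes."""
    win = mean_payoff + risk / 2.0
    lose = mean_payoff - risk / 2.0
    apples = start
    for _ in range(n_wagers):
        apples += win if rng.random() < 0.5 else lose
        if apples <= 0:
            return 0.0  # assumption: a dead player ends with nothing, not a debt
    return apples


def harsh_period_summary(risk, start=10.0, n_wagers=100, n_runs=3_000, seed=0):
    """Return (death rate, mean end-of-period apples) over n_runs players."""
    rng = random.Random(seed)
    finals = [harsh_period(start, risk, n_wagers, rng) for _ in range(n_runs)]
    death_rate = sum(f == 0.0 for f in finals) / n_runs
    return death_rate, sum(finals) / n_runs


if __name__ == "__main__":
    for risk in (1.0, 4.0, 16.0):
        death_rate, mean_end = harsh_period_summary(risk)
        print(f"risk {risk:4.1f}: death rate {death_rate:.2f}, mean end apples {mean_end:.2f}")
```

Because death truncates losses at zero, high-risk players die more often but the rare survivors end with large stocks, which pulls the mean up.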

The expected payoff function with respect to the magnitude of risk or starting wealth now reverses to a convex function, which becomes linear as starting wealth increases and the death rate drops to nearly 0 (figure 3.3). This convexity makes expected-payoff-maximizing players risk seeking when their wealth is at small to moderate values. Regardless of whether wagers are favorable, once there is no threat of death or diminishing marginal fitness, players should be risk neutral.

The expected payoff at the end of the harsh episodes may not be proportional to the long run fitness and may thus affect the reliability of the results. For example, the marginal benefit of having an extra apple at the end of the episode is greater when the episode is followed by a huge upturn than when followed by a worse than average period.

## A hunter-gatherer model

I will use another, more thematic model to show how diminishing marginal fitness results in risk aversion. A group of 12 hunter-gatherers forages on an open savanna. Every period, each person chooses to either gather or cooperatively hunt. The number of people that choose hunting is denoted by n. The payoff for gathering is 1/3 per person without risk. The payoff for hunting is 4 with success probability equal to $1-(\frac{5}{6})^{n}$. Every period, the group consumes food from foraging during the current period that amounts to 0.35 times the group size. If there is not enough food, part of the group is cannibalized to feed the rest (this rarely happened in history but models the dynamics well). Each unit of excess food is converted to a newborn who will forage in the next period. We simulate this process 5000 times and describe the distribution of the group size at the end of the simulations. As before, we use expected group size to measure group fitness.

The probability of success in hunting is equal to P(X ≥ 1) when X ~ Binomial(n, 1/6). This simulates n hunters each throwing a spear at a big game animal (inspired by the board game Neanderthal). The animal is captured if and only if at least one spear hits its head, which happens in 1/6 of the throws. Any binomial distribution and threshold would be suitable for simulating the hunting risk, but this one is chosen for its gradual PMF. To avoid rounding errors in the group size, I actually model X with Poisson(λ = n * 1/6) to make the group size continuous.
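The two success probabilities mentioned above are easy to compare side by side. This sketch only evaluates P(X ≥ 1) under each distribution. How the Poisson variant is wired into the full simulation is not spelled out here, so nothing beyond the two formulas is assumed, and the function names are illustrative.

```python
import math


def p_success_binomial(n_hunters, p_hit=1 / 6):
    """P(X >= 1) for X ~ Binomial(n_hunters, p_hit): at least one spear hits."""
    return 1.0 - (1.0 - p_hit) ** n_hunters


def p_success_poisson(n_hunters, p_hit=1 / 6):
    """P(X >= 1) for X ~ Poisson(n_hunters * p_hit); n_hunters may be fractional."""
    return 1.0 - math.exp(-n_hunters * p_hit)


if __name__ == "__main__":
    for n in (1, 2, 4, 4.5, 5, 8, 12):
        print(f"n = {n:>4}: binomial {p_success_binomial(n):.4f}, "
              f"poisson {p_success_poisson(n):.4f}")
```

The Poisson value is always slightly below the binomial one for the same n, and it changes smoothly with fractional n, which is what makes a continuous group size workable.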

### The results

The expected marginal gain in hunting is less than that in gathering starting from the fifth hunter (figure 4.1). If the hunter-gatherers are risk neutral, they should always send 4 hunters and have the rest gather, because the total expected payoff is maximized at n = 4.
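The single-period calculation behind this claim can be written out directly. The sketch assumes that the hunting payoff of 4 is the total food from a successful hunt (not a per-hunter amount) and uses the binomial success probability $1-(\frac{5}{6})^{n}$. The names are illustrative.

```python
GROUP_SIZE = 12
GATHER_PAYOFF = 1 / 3  # food per gatherer, no risk
HUNT_PAYOFF = 4.0      # food from a successful hunt (assumed to be the total)


def p_success(n_hunters):
    """Probability that at least one of n_hunters spears hits."""
    return 1.0 - (5 / 6) ** n_hunters


def expected_food(n_hunters):
    """Expected food for the whole group in one period with n_hunters hunting."""
    gatherers = GROUP_SIZE - n_hunters
    return HUNT_PAYOFF * p_success(n_hunters) + GATHER_PAYOFF * gatherers


def marginal_hunting_gain(n_hunters):
    """Expected extra hunting food when the n-th hunter joins the hunt."""
    return HUNT_PAYOFF * (p_success(n_hunters) - p_success(n_hunters - 1))


# Number of hunters that maximizes expected food for a risk-neutral group.
best_n = max(range(GROUP_SIZE + 1), key=expected_food)

if __name__ == "__main__":
    for n in range(1, 8):
        print(f"n = {n}: marginal gain {marginal_hunting_gain(n):.4f}, "
              f"expected food {expected_food(n):.4f}")
    print("best n:", best_n)
```

The fourth hunter adds about 0.386 expected food, more than the 1/3 he would gather, while the fifth adds about 0.322, which is less.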

Figure 4.2 shows the distribution of group size after 100 periods for 5000 simulations. The color represents group size. We can see that the group size is generally the highest around n = 4, which matches the previous analysis about single round expected payoff.

Figure 4.3 confirms this by showing how the mean group size changes with n, peaking at n = 4. This result is robust with respect to consumption rate, initial group size, extinction rate, and number of periods.

Now we limit the maximum number of newborns per period to 25% of the group size to make the fitness function (the food-to-newborn conversion function) concave as in figure 1.2:

Number of newborns = $K\,(1-e^{-r/K})$

where r is the amount of excess food and K is the maximum number of newborns per period. Figure 4.4 shows the mean and median group size with this effect.
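The conversion function is simple to evaluate, and a few values show why it penalizes variable food supplies. In this sketch K = 3 corresponds to 25% of a 12-person group. The function name and the sample values of excess food are illustrative.

```python
import math


def newborns(excess_food, max_newborns):
    """Newborns = K * (1 - exp(-r / K)), with r = excess_food and K = max_newborns."""
    return max_newborns * (1.0 - math.exp(-excess_food / max_newborns))


if __name__ == "__main__":
    K = 0.25 * 12  # cap on newborns per period for a group of 12
    for r in (0.1, 1.0, 2.0, 4.0, 10.0):
        print(f"excess food {r:5.1f} -> newborns {newborns(r, K):.3f}")
    # A steady surplus of 2 beats a 50-50 split between 0 and 4.
    steady = newborns(2.0, K)
    risky = 0.5 * newborns(0.0, K) + 0.5 * newborns(4.0, K)
    print(f"steady: {steady:.3f}, risky: {risky:.3f}")
```

For small surpluses the function is close to one newborn per unit of food, as in the original setting. For large surpluses it flattens out below K, so a steady surplus produces more newborns than a risky one with the same mean.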

The number of hunters that gives the highest expected group size or fitness increases from 4 to 5. Even though more people are hunting, this actually shows more risk-averse behavior because the probability of a successful hunt increases from 0.518 to 0.598. For random events with binary outcomes, the risk is highest when the probability of success is closest to 0.5, so this increase lowers the risk. Compared to the original setting without diminishing marginal fitness, the fitness-maximizing group of n = 5 here settles for a lower expected payoff (in terms of food) to achieve a lower risk, which confirms risk aversion. This result holds even after 1000 periods. But for other K values and food consumption rates, the effect of risk aversion could be weaker, so that n = 4 remains the fitness-maximizing number of hunters. Lastly, more investigation is needed as to why risk aversion due to death aversion is not observed.
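The trade-off between expected food and risk at n = 4 and n = 5 can be quantified for a single period. The sketch assumes the hunt's payoff of 4 is all-or-nothing for the whole group and measures risk as the standard deviation of the period's food. Both the risk measure and the function name are illustrative choices.

```python
import math


def hunt_profile(n_hunters, group_size=12, hunt_payoff=4.0, gather_payoff=1 / 3):
    """Return (success probability, expected food, standard deviation of food)."""
    p = 1.0 - (5 / 6) ** n_hunters
    expected_food = hunt_payoff * p + gather_payoff * (group_size - n_hunters)
    # Gathering is riskless, so all variance comes from the all-or-nothing hunt.
    sd_food = hunt_payoff * math.sqrt(p * (1.0 - p))
    return p, expected_food, sd_food


if __name__ == "__main__":
    for n in (4, 5):
        p, mean_food, sd_food = hunt_profile(n)
        print(f"n = {n}: p = {p:.3f}, expected food = {mean_food:.3f}, sd = {sd_food:.3f}")
```

Moving from 4 to 5 hunters gives up a little expected food in exchange for a slightly smaller spread, which is the pattern described above.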

## Future questions

There are a few issues not addressed in this post, which I may address in a future post.

1. Since the goal is to maximize the long run fitness, the age difference between the parent and the offspring matters. The greater the difference, the further into the future each generation carries the genes.
2. Wagers don’t have to have 50-50 chances. It is important to know how players weigh different probabilities. Another way to make an unfavorable wager is to make the probability of the bad outcome large and we can test if the expected payoff function stays the same.
3. Environmental fluctuation is not neutral when the expected payoffs are nonzero. A survivor during bad times is worth more fitness than one during good times since the carrying capacity is expected to increase during bad times. This may help explain why empirically people are risk averse even without any foreseeable possibility of starvation or losing wealth.
4. People may be able to forecast future environmental trends using the current wagers offered. This may be linked to why people are most sensitive to changes instead of absolute long run wins and losses.
5. Diversity in risk preference in a population may be explained by systematic risks that are positively correlated among individuals.