This week we continue our discussion of probabilistic reserves estimates by looking at some of the most important properties of the distribution most commonly used in them: the lognormal distribution.

The following lognormal distribution (Figure 1) is meant to represent the full range of possible outcomes (recovered reserve quantities) and the corresponding probability for each outcome. The X-axis shows the magnitude of recovery (barrels recovered) at each point. The Y-axis shows the relative likelihood (the probability density) of the outcome at each point. The P10 (the "conservative" estimate) is at the 100-million-barrel mark, meaning 10% of the possible outcomes are to the left of this point. The P90 value (the "optimistic" estimate) is at the 1-billion-barrel mark, meaning 90% of the possible outcomes are to the left of this point. The black line in the middle is the P50 value (the "best" estimate), which is also, by definition, the median. The red line is the mean, and the blue line touching the peak is the mode.


The first thing to notice is how small the numbers are on the Y-axis. Don’t let this bother you. The important thing is that--no matter what--the area under the curve will always (and *must* always) add up to one (i.e. 100%). The curve is created with the idea in mind that it will represent all possible outcomes, which is to say 100% of the outcomes. As a result, the numbers on the Y-axis scale to whatever they need to be to make sure the area under the curve adds up to exactly one. If, for example, we decided to break down this same distribution into a histogram with 50 bins, we would get the following:

The Y-axis scale on the left remains unchanged because here it is being used to represent the continuous distribution; however, the Y-axis scale on the right, which is being used for the histogram representation, has increased dramatically in scale. These higher values make up for the fact that we are effectively acting as though there are fewer "outcomes" by lumping all the outcomes from a single range into one of the bins. Because there are now fewer "outcomes" (or columns, units, etc.--however you want to think about it) to add together, each one must count for more so that the area represented by the columns still adds up to precisely 1 (i.e. 100%). In other words, don't worry too much about the absolute values on the Y-axis scale. What matters are the areas between outcomes.
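The point about the Y-axis can be checked numerically. The sketch below is only an illustration: it assumes an untruncated two-parameter lognormal fitted to the P10 and P90 quoted above, and the plot limit `upper` is a hypothetical value chosen for the example, not one taken from the article's figures.

```python
import math
from statistics import NormalDist

# Assumed inputs: the P10 and P90 quoted for Figure 1, in barrels.
P10, P90 = 100e6, 1_000e6
z90 = NormalDist().inv_cdf(0.90)  # standard normal 90th-percentile z-score
mu = (math.log(P10) + math.log(P90)) / 2
sigma = (math.log(P90) - math.log(P10)) / (2 * z90)
log_space = NormalDist(mu, sigma)  # distribution of ln(recovery)


def cdf(x):
    """Probability that recovery is at or below x barrels."""
    return log_space.cdf(math.log(x)) if x > 0 else 0.0


def pdf(x):
    """Height of the continuous curve at x barrels."""
    return log_space.pdf(math.log(x)) / x


n_bins = 50
upper = 3_000e6  # hypothetical right-hand plot limit, in barrels
width = upper / n_bins
edges = [i * width for i in range(n_bins + 1)]

# Probability captured by each bin is the area under the curve over that bin.
bin_probs = [cdf(hi) - cdf(lo) for lo, hi in zip(edges, edges[1:])]
beyond = 1 - cdf(upper)  # the small tail to the right of the plot limit

mode = math.exp(mu - sigma**2)
print(f"peak of continuous curve: {pdf(mode):.2e} per barrel")
print(f"tallest histogram bin:    {max(bin_probs):.3f}")
print(f"total probability:        {sum(bin_probs) + beyond:.6f}")
```

The height of the continuous curve is tiny because it is a probability per barrel, while each bin collects the probability for a range tens of millions of barrels wide. Either way, the total comes to one.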

Back to Figure 1. Notice how, unlike a normal distribution, this lognormal distribution has different values for the mean, median, and mode. Further, the mean is always larger than the median, and the median is always larger than the mode.
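As a rough check on that ordering, the sketch below fits an untruncated two-parameter lognormal to the P10 and P90 quoted for Figure 1 and computes the three measures from the standard closed-form expressions. The fitting approach and the variable names are assumptions made for illustration; the article does not say how its curves were generated.

```python
import math
from statistics import NormalDist

# Assumed inputs: the P10 and P90 quoted for Figure 1, in barrels.
P10 = 100e6
P90 = 1_000e6

# z-score of the 90th percentile of a standard normal (about 1.28).
z90 = NormalDist().inv_cdf(0.90)

# The logarithm of a lognormal variable is normal, so ln(P10) and ln(P90)
# sit symmetrically at mu - z90*sigma and mu + z90*sigma.
mu = (math.log(P10) + math.log(P90)) / 2
sigma = (math.log(P90) - math.log(P10)) / (2 * z90)

mode = math.exp(mu - sigma**2)
median = math.exp(mu)
mean = math.exp(mu + sigma**2 / 2)

for name, value in (("mode", mode), ("median", median), ("mean", mean)):
    print(f"{name:>6}: {value / 1e6:,.0f} million barrels")
```

Under these assumptions the mode, median, and mean come out at roughly 141, 316, and 473 million barrels, in that order. Values read off the article's figures need not match these, since the figures may have been drawn from a differently parameterized or truncated curve.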

If this doesn’t strike you as important, it should. Below we have a similar lognormal distribution where we have the same P10, P50 (median), and P90 values, but where we have adjusted the other parameters to increase the mean from 400 million barrels to 600 million barrels (a 50% increase), making it nearly twice as large as the median. If the expected value for an investment increased 50%--even with everything else remaining the same--you would want to know, wouldn't you?


The median, remember, is the same as the P50 estimate (the so-called "best" estimate). This is the "proved plus probable" case that is "as likely as not to be exceeded." And yet the *mean* is the "expected value," meaning the average value per outcome that we should expect to achieve from repeat trials. How can these two numbers be so wildly different?

To answer this question, we need to understand the difference between “the average *outcome*” and “the average *value* per outcome.” While this sounds like semantics, the consequences are substantial.

The Average Outcome

Here we have converted Figure 1 again into a histogram representing the outcomes from 1000 different “trials” drawn from the distribution.

Each column represents the number of outcomes that fell within the range represented by the width of the column. The tallest column, for example, “captured” 104 of the outcomes from the 1000 trials, and you can see that the column’s height corresponds to the 104-mark on the right-hand-side axis. The blue columns represent the outcomes to the left of the median, while the red columns represent the outcomes to the right of the median.

By adding up the outcomes from each of the seven blue columns to the left of the median, we can see that the number of outcomes tallies to exactly 500.


Similarly, the outcomes represented by the red columns to the right of the median also sum up to 500 outcomes, together covering all of the 1000 trials.
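The 500/500 split is easy to reproduce. The following sketch draws 1000 trials with Python's `random.lognormvariate`, using parameters fitted to the P10 and P90 from Figure 1. The seed and the fitting approach are assumptions for illustration, so the individual draws will not match the columns in the article's histogram.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(0)  # arbitrary seed so the run is repeatable

# Assumed inputs: the P10 and P90 quoted for Figure 1, in barrels.
P10, P90 = 100e6, 1_000e6
z90 = NormalDist().inv_cdf(0.90)
mu = (math.log(P10) + math.log(P90)) / 2
sigma = (math.log(P90) - math.log(P10)) / (2 * z90)

# 1000 simulated recovery outcomes, in barrels.
trials = [random.lognormvariate(mu, sigma) for _ in range(1000)]

# With an even number of trials, the sample median lies midway between
# the 500th and 501st outcomes when they are sorted by size.
sample_median = statistics.median(trials)
below = sum(x < sample_median for x in trials)
above = sum(x > sample_median for x in trials)

print(f"sample median: {sample_median / 1e6:,.0f} million barrels")
print(f"outcomes below the median: {below}")
print(f"outcomes above the median: {above}")
```

Each trial counts once regardless of its size, which is why the median splits the count evenly no matter how far the largest outcomes sit from it.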

From this perspective, we might think of the outcome represented by the line between the blue and red columns as “the average outcome.” For instance, consider the very last column to the right. On the one hand, this column represents outcomes having nearly 1.8 billion barrels in reserves, but on the other hand, it represents just *one* outcome. To take a specific example, from the perspective of the P50 (or median value) the 1.8-billion-barrel outcome is no more special or significant than one of the 24-million-barrel outcomes in the first blue column on the far left. If we say each outcome is equal with all other outcomes by virtue of being just *one* outcome, then it is fair to say that the median is “the average outcome.”

The Average Of The Outcomes

Contrast this with a case where the magnitude of the outcome does matter. In such cases we often want to know “the average *value* per outcome.” Here, since the value (or magnitude) of an outcome is represented by the distance along the X-axis, we can use a more intuitive parallel from basic physics to understand what's going on.

When you balance a shape on your finger, what matters is not just how much mass is on either side of your finger, but also how far away from the balancing point the mass is located.

Likewise, the mean for a distribution takes into account the magnitude of the outcomes (or the distance along the X-axis away from the mean). The mean, therefore, becomes the visual "balancing point" for the distribution. Whereas the median is the point where you would cut the distribution in half to get equal areas on both sides, the mean is the point where you would balance the distribution on your finger to keep it from toppling over to the left or to the right. We created the figure below to illustrate this point. The stacked squares that make up this quasi-lognormal distribution are blue to the left of the median and red to the right of the median, yet if you weight each square by its distance from the balancing point (the mean), the balance of forces is equal on both sides.
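The balancing-point idea can be sketched with simulated trials. The example below again assumes a lognormal fitted to the Figure 1 P10 and P90 and an arbitrary seed. It compares the number of outcomes on each side of the sample mean with the total "pull" on each side, where pull is each outcome's distance from the mean.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(0)  # arbitrary seed so the run is repeatable

# Assumed inputs: the P10 and P90 quoted for Figure 1, in barrels.
P10, P90 = 100e6, 1_000e6
z90 = NormalDist().inv_cdf(0.90)
mu = (math.log(P10) + math.log(P90)) / 2
sigma = (math.log(P90) - math.log(P10)) / (2 * z90)

trials = [random.lognormvariate(mu, sigma) for _ in range(1000)]
sample_mean = statistics.fmean(trials)

# Counting outcomes ignores magnitude: more of them sit left of the mean.
left_count = sum(x < sample_mean for x in trials)
right_count = sum(x > sample_mean for x in trials)

# Weighting each outcome by its distance from the mean gives the "pull"
# on each side of the balancing point.
pull_left = sum(sample_mean - x for x in trials if x < sample_mean)
pull_right = sum(x - sample_mean for x in trials if x > sample_mean)

print(f"outcomes left / right of the mean: {left_count} / {right_count}")
print(f"pull left:  {pull_left:.4e} barrels")
print(f"pull right: {pull_right:.4e} barrels")
```

The counts are lopsided, but the two pulls agree up to floating-point rounding: the fewer outcomes on the right sit much farther from the mean, and that extra distance offsets the greater number on the left.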

With a true lognormal distribution, the tail to the right extends out to infinity, making it possible to have even more extreme differences between the median and the mean values.
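For an untruncated lognormal, the ratio of mean to median is exp(sigma^2 / 2), where sigma is the standard deviation of the logarithm of the outcomes. The short sketch below tabulates that ratio for a few values of sigma. The sigma values are hypothetical and chosen only to show the trend.

```python
import math

# Hypothetical log-space standard deviations, from narrow to very wide.
sigmas = (0.5, 1.0, 1.5, 2.0)

# For a lognormal, mean / median = exp(sigma^2 / 2).
ratios = [math.exp(s**2 / 2) for s in sigmas]

for s, r in zip(sigmas, ratios):
    print(f"sigma = {s:.1f}: mean is {r:.2f} times the median")
```

The wider the spread in log space, the heavier the right-hand tail and the further the mean is pulled above the median.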

**Stay Tuned**

Next week we will tie this together with some specific reserves examples and show how these estimates can change shape and converge to a single point over time and as more data is collected.