Sunday, August 17, 2014

“Mind and Spirit”–> Conditional Probability Part I

[Image: the purported portrait of Thomas Bayes]

As much as I might wish it, this is not a picture of Thomas Bayes, although it is purported to be one.  The picture comes from Wikipedia, which reports that the best guess is that it was created for a 1936 book called “History of Life Insurance,” simply to give readers a face to attach to Thomas Bayes.

As far as we know, no pictures of this famous clergyman survive, but his work has inspired a great body of learning, one that extends the subjects we have discussed before when covering Type 1 and Type 2 thinking and Nassim Nicholas Taleb’s writings.

Born in 1701, Bayes was the son of a clergyman in a family of nonconformists.  In his day, to be a nonconformist meant that you did not buy into the Church of England.  In today’s vernacular, many of the churches that came out of this movement are known as the Free Churches, which is to say that they believe the Church should be separate from the control of the state.  The history of Protestantism is a history of fracture.  Rather than being a hindrance, this movement can result in new discovery and stronger thought.  I would like to think that Bayes’s belief in our Lord Jesus Christ and his familiarity with nonconformist thought generated what I consider one of the greatest branches of mathematics.  Bayes gave the world an understanding of how to deal with randomness and statistics.

While Bayes published only one mathematical work in his lifetime, what the world now calls “Bayesian statistics” was left in a document willed to a friend, Richard Price, along with 100 pounds, at Bayes’s death.  Price, being a man of good intent, made sure the paper was first read to the Royal Society in London and then published.  In this paper, “An Essay towards solving a Problem in the Doctrine of Chances,” Bayes attacks how to understand conditional probability.

If you have taken any probability course, one of the first things you are taught is that a random event does not care about what happened before it.  Almost any probability text will point out that if a coin has come up heads 5 times in a row, the next flip still has a 50/50 chance of coming up heads.  This fact disturbs most of us, and it should to some extent.  The reason is that in the long run we know the coin should come up heads around 50 percent of the time and tails around 50 percent of the time.  Therefore, we feel that the coin should somehow be “building up” a tails after it has come up heads 5 times.  Our intuition says, “There must be some type of mechanism that now starts pushing the coin back to the tails side.”

Now in the long term of flipping, we do expect the results to be very close to 50 percent tails and 50 percent heads, and this long-term trend is called regression to the mean, which is often not understood correctly.  In the simplest case, it means that any random or even partially random process may produce results that deviate from the underlying mean, but eventually the results will center around that mean.  However, regression to the mean is accomplished only by a random 50 percent chance at every flip of our coin.  The fact that each flip is independent of the last flip means that we will often have streaks where the coin comes up heads many times in a row.

If you don’t understand this, the deviation from the mean will seem to be a pattern, even though none exists.   Taleb has argued that most stock pickers confuse these streaks (which we might call “good luck”) with real skill in picking stocks.  I completely agree with him.  Many things in our lives are governed by this streaky nature.

[Chart: my 2013 golf scores by round, with a trend line]

Now let us look at the chart above.  It shows my progress in trying to learn how to golf left handed in 2013.  You can see that I started off very poorly, with scores around 145 strokes per round.  However, in April of that year I had a round at a course where everything went right, and I shot a 102.  Once I had shot this 102, I could have said, “Oh look, I now know how to shoot a 102, so I’ll shoot this from now on.”  However, because my game had a large element of random chance in it, it took me 36 more rounds, or approximately 200 hours of playing golf, before I could equal that score.  If you look at the general trend, the solid red line, you can see that my mean results were slowly getting better.  Rounds above or below that line simply meant that I would eventually return to the underlying mean, from one direction or the other.

[Graph: the proportion of heads over 1 to 500 tosses of a fair coin; as the number of tosses increases, the proportion stabilizes around 0.5]

On a coin flip, we can see the exact same idea.  The graph to the right is from a Penn State course on probability.  As you can see, they started flipping a coin.  During the first 100 flips the coin appeared to be broken, and during the first 50 flips it was very broken: somehow, the coin was coming up heads only 40 percent of the time.  If you had been betting on heads, you would have lost a lot of money.  However, the coin later reverses itself, and the line trends to the high side until it gets very close to 50 percent heads.  So, while there is a trend under the data, there is no mechanism under the coin flip to push it one way or the other.  In truth, many random events are extremely lumpy.  In other words, these events seem to push one way for quite a while.  It is impossible to predict when a truly random event will regress to the mean, and certain trends will stay present for a very long time.
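If you would like to see this for yourself, here is a minimal Python sketch (my own illustration, not from the Penn State course) that flips a fair coin 500 times and prints the running proportion of heads:

import random

# Simulate 500 fair coin flips and track the running proportion of heads.
# Early on, the proportion can wander far from 0.5; with more flips it
# settles toward 0.5 -- without any mechanism "pushing" it back.
random.seed(42)  # fixed seed so the run is reproducible

heads = 0
for flip in range(1, 501):
    if random.random() < 0.5:  # each flip is independent, 50/50
        heads += 1
    if flip in (10, 50, 100, 250, 500):
        print(f"after {flip:3d} flips: {heads / flip:.3f} heads")

Change the seed and the early numbers wander somewhere else entirely, but the late ones hug 0.5 all the same.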

From a theological standpoint, how should we as Christians think about probability? 

There is remarkably little on this in the Bible other than Proverbs 16:33:

The lot is cast into the lap,
    but its every decision is from the Lord.

What we do know from the Bible is that God has clearly taken charge of random events to make decisions at particular moments.  There were several times in the Bible when lots (or dice) were used to determine a path that God’s people should take, and God did not criticize them for using this method.  However, I am going to submit that God also allows truly random events to happen.  If you read the scripture above from Proverbs, you can see that the language is specific and unique.  The lot is cast.  The lot’s decision is from the Lord.  It does not say, “The Lord causes the lot to fall in a particular way.”  In other words, the Lord may choose a particular number to come up, or the number may be truly random.  I believe there are two cases:

1. Random events that are not random, because God is actually controlling the exact turn of the die or lot.  These can be thought of as miracles, where God intervenes in the machinery that he has set up and breaks his own physical laws.

2. Random events that are random because God is allowing the random events to happen.

Although it is beyond the scope of this blog, you cannot have free will without having randomness.  The two are linked and tied together.  Much of our Christian life is our ability to understand that the world we live in is random and filled with probabilities.  It is our mission to bring our lives to a place where God breaks down the rule of randomness and plays with his own machinery to get the results that glorify him most.  Probability exists as something we are called on to embrace and attack through prayer and supplication.

However, let us turn from this sidebar on chance, free will, and probability and get back on subject.

So after taking multiple courses in stats and probability, you’ll start to understand this idea that “random is random.”  If you are very lucky, somebody will explain that these random events regress to a mean, and you will feel that you really understand randomness.

The truth is that we have only gone halfway in our journey of understanding probability.

What we have discussed so far are random independent events.  As we discussed, it is easy to be misled when thinking about these events.

The reason these are called independent events is that the first coin flip has nothing to do with the second coin flip.  There is nothing tying them together.  When two events are truly independent, we say that there are no conditions on the events.

We have talked about coin flips, but let’s make it just a little more complicated.  Let’s talk about dice, and more specifically, rolling two dice.  The graph to the left, which I took from the internet, shows the results of rolling a black die and a white die together.  Because each face has exactly a 1 out of 6 chance of coming up, we can simply list all of the combinations that might appear when we roll the dice together.  There are 36 different combinations, made up of the values shown in the graphic.  This graphic is called our “solution space” or “solution set.”  In other words, each die may come up 1 of 6 ways, and when we put all of these results together, we see that there are 36 possible ways of combining the dice.  However, not all totals are equally likely.

Why?  To get a 2 on the dice, you need each die to come up as a 1.  This means you must have the lucky chance of the white die coming up a 1 (which happens only 1 out of 6 times), and the black die must also come up as a 1.  On average, you must roll the dice 36 times, because the white die shows a 1 on 1 out of 6 rolls, and on only 1 out of 6 of those occasions will the black die also show a 1.  Therefore, we expect to roll the dice 36 times to get double 1s.  (And remember, I am simplifying, because these results assume a regression to the mean; in any instantaneous case, it may take many times more or fewer rolls to get our result of 2.)
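To convince yourself of that “on average,” here is a small Python simulation (your run will vary, which is exactly the point) that counts how many rolls it takes to see double 1s and averages over many attempts:

import random

# Estimate how many rolls of two dice it takes, on average, to see
# double 1s. Since P(double 1s) = 1/36, the average wait is 36 rolls,
# but any single attempt can take far more or far fewer.
random.seed(1)

def rolls_until_snake_eyes():
    rolls = 0
    while True:
        rolls += 1
        if random.randint(1, 6) == 1 and random.randint(1, 6) == 1:
            return rolls

trials = [rolls_until_snake_eyes() for _ in range(10_000)]
print("average rolls needed:", sum(trials) / len(trials))  # close to 36
print("shortest / longest wait:", min(trials), "/", max(trials))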

However, getting a 7 is much easier.  If we look at our solution set, we can see that out of the 36 equally likely combinations, 6 of them total 7.  The chance of getting a 7 is therefore six times greater than the chance of getting a 2.  In other words, over 36 rolls of the dice, we might expect the 2 to come up 1 time but the 7 to come up 6 times.
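Because the solution space is so small, we can simply write it out in code.  This little Python sketch enumerates all 36 combinations and counts the ways to make a 2, a 7, and a 10 (the 10 will matter in a moment):

from collections import Counter
from fractions import Fraction

# Write out the full solution space: all 36 (white, black) combinations,
# then count how many ways each total can occur.
space = [(w, b) for w in range(1, 7) for b in range(1, 7)]
ways = Counter(w + b for w, b in space)

for total in (2, 7, 10):
    print(f"P(total = {total}) = {ways[total]}/36 = {Fraction(ways[total], 36)}")
# P(total = 2) = 1/36, P(total = 7) = 6/36 = 1/6, P(total = 10) = 3/36 = 1/12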

Once you go through some basic stats and probability courses, you might heuristically start to recognize shortcuts to the desired results.  In our dice story, a graduate of a Stats 101 course might say, “The chance of getting a 2 is 1/6 × 1/6, or 1/36.”  This is correct, but you should stop them before they quickly calculate the chance of getting a 10 the same way, because you cannot simply take two simple outlier conditions and multiply them together.  A 10 can be made three different ways (4+6, 5+5, 6+4), so its chance is 3/36, not 1/36.  Instead, you must write out the solution space, think through all the ways each result could happen, and work out the chance of each one coming up.

Now imagine that the dice are loaded, or have a slight weighting toward one of the numbers.  This weighting will now influence the randomness.  If both dice are weighted to show a 1 maybe 10 percent more often, the chance of double 1s becomes (1.1/6) × (1.1/6), roughly 21 percent higher than 1/36, because the weighting applies to each die.  You would then need to go back to your solution set and adjust all of the solutions up (if they contain a 1) or down (if they contain no 1) so that everything still sums to 100 percent.
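Here is a sketch of that adjustment in Python, assuming for illustration that each die shows a 1 ten percent more often than a fair die, with the other five faces shrunk to compensate:

from itertools import product

# Loaded dice: each die shows 1 ten percent more often than a fair die.
# The other five faces shrink slightly so the probabilities still sum to 1.
p_one = 1.1 / 6
p_other = (1 - p_one) / 5
weights = {1: p_one, 2: p_other, 3: p_other, 4: p_other, 5: p_other, 6: p_other}

# Recompute every entry in the solution space with the new weights.
p_total = {}
for w, b in product(range(1, 7), repeat=2):
    p_total[w + b] = p_total.get(w + b, 0) + weights[w] * weights[b]

print(f"fair   P(total=2) = {1/36:.4f}")
print(f"loaded P(total=2) = {p_total[2]:.4f}")  # (1.1/6)^2, about 21% higher
print(f"loaded P(total=7) = {p_total[7]:.4f}")  # compare to 6/36 = 0.1667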

Here is the kicker.  Life is loaded with loaded dice.  As a matter of fact, random numbers are extremely useful for things like encoding information.  If you are into encryption, you will find that one of the most powerful ways to encrypt something is to have a list of random numbers that only you and a companion have.  You then encode your message by shifting each letter by the corresponding random number.  If the numbers are truly random, nobody can guess what the underlying letters are.  However, if you use a pair of dice to generate the numbers (where 7 is much more popular than 2), then somebody simply needs to start trying a lot of 7s to figure out much of your secret message.
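Here is a toy Python version of that scheme, a one-time-pad-style shift cipher; the message and key are made up purely for illustration:

import secrets
import string

# A one-time-pad-style shift cipher sketch: shift each letter by a truly
# random amount known only to sender and receiver. "key" stands in for
# the shared list of random numbers described above.
ALPHABET = string.ascii_uppercase

def encode(message, key):
    return "".join(ALPHABET[(ALPHABET.index(ch) + k) % 26]
                   for ch, k in zip(message, key))

def decode(ciphertext, key):
    return "".join(ALPHABET[(ALPHABET.index(ch) - k) % 26]
                   for ch, k in zip(ciphertext, key))

message = "ATTACKATDAWN"
key = [secrets.randbelow(26) for _ in message]  # uniform, unguessable shifts
ciphertext = encode(message, key)
print(ciphertext, "->", decode(ciphertext, key))

If you generated that key with dice sums instead, the shift of 7 would dominate, and a patient eavesdropper could exploit exactly that lump in the distribution.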

It turns out that creating truly random numbers is so challenging that many programming languages and/or silicon chips create “pseudo random” numbers instead.  These numbers look a lot like random numbers, but if we look at them long enough, we can find some type of pattern in them.
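For a taste of how pseudo randomness works, here is a tiny linear congruential generator in Python; the constants are illustrative, not those of any particular language:

# A tiny linear congruential generator, one classic way languages produce
# "pseudo random" numbers: a fixed formula that merely looks random.
def lcg(seed, a=1103515245, c=12345, m=2**31):
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m  # scale into [0, 1)

gen = lcg(seed=2014)
print([round(next(gen), 3) for _ in range(5)])
# The same seed always reproduces the same "random" sequence -- exactly
# the kind of pattern a patient observer could eventually find.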

So we come to the end of our look at simple random numbers.  We have found that predicting even simple random events is very difficult.  Our brains are simply not wired to deal with them.  Many gamblers have lost a lot of money because they kept thinking, “My luck has to change.”  All of us gamble in one form or another every day.  Having some idea of randomness can help us understand the world we are in, and keep us from taking too much (or too little) responsibility for what happens in our lives.

What we have looked at is the chance of two dice rolling a number.  Now we will conclude with what Bayes tried to describe: what happens when the events are no longer independent?

Let’s say that we don’t throw the dice at the same time.  Let’s say that we throw one die, then the other.  The chance of throwing both dice and having each come up as a 1 is 1/36.  However, let’s say that we throw one die and it comes up as a 1.  What is the chance that the other die will also come up as a 1?  It should be obvious that the odds have gone from 1 out of 36 to 1 out of 6.  Once the first die came up as a 1, our chance of getting a 2 from the dice increased six times.

This is conditional probability.  The chance of getting a 2 has changed dramatically because of the conditions.  Once you have established that the white die came up a 1, getting a 2 has become much, much more probable.  This change of probability is understandable in our context, but it turns out that changes of conditions that impact the underlying probability are something our brains simply don’t process well.  As a matter of fact, conditional probability is prone to errors, and it is an area in which I am sure you have made mistakes in the past.  It is these changes in conditions that we will attack in the next section, where we’ll discuss how to think about them.
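We can check this with the same solution-space enumeration as before; this short Python sketch compares the plain probability of a 2 with the probability of a 2 given that the white die already shows a 1:

from itertools import product

# Conditional probability by enumeration: restrict the solution space to
# the outcomes where the first (white) die already shows a 1, then ask
# how often the total is 2 within that smaller space.
space = list(product(range(1, 7), repeat=2))

p_two = sum(1 for w, b in space if w + b == 2) / len(space)
given_one = [(w, b) for w, b in space if w == 1]
p_two_given_one = sum(1 for w, b in given_one if w + b == 2) / len(given_one)

print(f"P(total = 2)                  = {p_two:.4f}")          # 1/36
print(f"P(total = 2 | white die is 1) = {p_two_given_one:.4f}")  # 1/6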
