[P&N] Chapter 11: Probability and Possibility

Citation: Edward A. Lee, 2017: Plato and the Nerd - the creative partnership of humans and technology. MIT Press, Cambridge, MA

The Bayesians and the Frequentists

As we discussed in the last post, Laplace’s agenda is a philosophical question rather than a scientific one, because it cannot be falsified.

The fundamental limits of our understanding of the world lead to another kind of model: probabilistic models, which we use to embrace uncertainty.

We change the goal. Instead of seeking certainty, we seek confidence.

Probability is a formal model for quantifying what we don’t know. In other words, probabilities quantify known unknowns, while possibilities handle unknown unknowns.

But what does probability really mean? Is it subjective or objective? Is it a platonic form or an empirical experience?

Although experts in probability theory pretty much agree on the mathematical machinery they use, they disagree on the basic meaning of the numbers (probabilities). These experts fall into two camps: the frequentists and the Bayesians.

Frequentists and Bayesians, from JOSEPH BUCHDAHL

The major difference I found between the two camps is that Bayesians embrace subjectivity while frequentists do not. Moreover, the Bayesian approach systematizes learning, the process of reducing uncertainty: it updates our belief (the prior probability) into a posterior probability using Bayes’ rule.
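As a concrete sketch of such an update (the hypothesis and all the numbers here are invented purely for illustration), suppose we suspect a coin is biased and then observe a single head:

```python
def bayes_update(prior, likelihood, evidence):
    """Bayes' rule: posterior = likelihood * prior / evidence."""
    return likelihood * prior / evidence

# Hypothesis H: the coin is biased with P(heads) = 0.9; otherwise it is fair.
p_h = 0.5               # prior belief that the coin is biased
p_heads_if_h = 0.9      # likelihood of heads if the coin is biased
p_heads_if_not_h = 0.5  # likelihood of heads if the coin is fair

# Total probability of the evidence (seeing heads), by total probability
p_heads = p_heads_if_h * p_h + p_heads_if_not_h * (1 - p_h)

posterior = bayes_update(p_h, p_heads_if_h, p_heads)
print(round(posterior, 3))  # 0.643: seeing heads raises our belief in "biased"
```

Each new observation can be fed back in the same way, with the posterior becoming the next prior; that feedback loop is the sense in which the Bayesian approach systematizes learning.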

Both perspectives have merit, but I take the same stand as Lee. He finds the Bayesian perspective more compelling for three reasons:

  1. The Bayesian approach embraces the notion of information and learning. It is central to most machine learning methods.

  2. It makes more sense when talking about rare events. “The probability of a major earthquake in San Francisco in the next 30 years is 63%,” for example. “Such a statement, if it is backed up by rigorous research, reflects the aggregate opinion of many experts and the use of computer simulation models that are informed by prior experience with real earthquakes.”

  3. It covers situations that are simply not well handled by the frequentist interpretation.

For the third reason, consider the following situation:

A coin is flipped, but the outcome of the flip is obscured by a cup before you observe it. What now is the probability that you will see heads when the cup is removed? No matter how many cup-removal experiments you perform, the outcome will always be the same, so it seems that the frequentist would have to say that the probability of heads is either zero or one, but we don’t know which it is. The Bayesian, however, has no difficulty with this situation. The probability of observing heads is the same as it was before the coin was flipped.
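A tiny simulation (entirely hypothetical, just to make the asymmetry vivid) of the cup experiment: repeating the cup removal can never produce a relative frequency of 0.5, while the Bayesian simply keeps the pre-flip belief:

```python
import random

random.seed(42)
flip = random.choice(["heads", "tails"])  # the coin is flipped exactly once

# Every "repetition" removes the cup from the same, already-settled flip:
observations = [flip for _ in range(1000)]
freq_heads = observations.count("heads") / len(observations)
print(freq_heads)  # always exactly 0.0 or 1.0, never anything in between

# The Bayesian keeps the pre-flip probability until the cup comes off:
bayesian_p_heads = 0.5
```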

Another subtlety is that if the world is really deterministic, then frequentists are on a rather “slippery slope”: what does it even mean to “repeat an experiment”?

For continuums, probabilities are replaced by probability density functions, which I will skip here.

Wrap Up

Using Bayes’ rule, we can actually explain dogma. If we are completely sure that something is true or false, meaning that the prior probability is either 0 or 1, then there is no room to learn: the posterior probability will always be 0 or 1, the same as the prior, no matter what evidence we observe. This is what people call dogma. So if we have any doubt, we should never set the prior probability to exactly zero or one.
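A minimal sketch (with an invented likelihood ratio of 99:1 in favor of the hypothesis) makes the point: the same evidence that sways an open mind leaves a dogmatic prior of 0 or 1 exactly where it started:

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence

print(posterior(0.5, 0.99, 0.01))  # 0.99: an open mind is moved by evidence
print(posterior(0.0, 0.99, 0.01))  # 0.0: a prior of zero cannot move
print(posterior(1.0, 0.99, 0.01))  # 1.0: a prior of one cannot move either
```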

Let’s consider another interesting topic: human-like intelligence. Weizenbaum claims that the appearance of human-like intelligence does not in fact imply the existence of human-like intelligence. He states that once the program generating such human-like behavior is fully explained, it will lose its magic; the observer thinks, “I could have written that.” He seems to be claiming that if a program is comprehensible, then it must not be intelligent, which is rather disturbing and subtle.

The reason his statement gives us a headache is that the notion of understanding human intelligence is self-referential: the process doing the understanding must itself be intelligent. The notion is therefore likely vulnerable to the sort of incompleteness that Gödel found in formal languages. And if that incompleteness is real, we will never fully understand intelligence.

So suppose we again try to convince ourselves that cognition is in fact a form of computation: what evidence would we need? We can hardly come up with any, because even the Turing test tells us nothing about consciousness. A big relief, huh? Finally, we no longer have to worry about this annoying question. But, really?

· Reading