Chapter 2 Bayes' Theory
Bayes’ Theorem was first proposed by Thomas Bayes, an English statistician, philosopher, and minister. He famously never published the theorem that bears his name; it was found and published later by his friend Richard Price. His theory was motivated by his desire to understand creation and to prove, or calculate, the probability of the existence of God. His theorem has been interpreted in many ways, but it can be summarized as an interest in the relative strength of beliefs and hypotheses rather than in the frequency with which they are true. His work was popularized by the French scholar Pierre-Simon Laplace, who brought the Bayesian interpretation of probability to statistics. Because Bayesians measure the relative strength of a hypothesis rather than seeking to strictly accept or reject it, as is practiced by frequentists in Null Hypothesis Testing (NHT), a common and somewhat humorous critique of the philosophy, summarized by one statistician, is that “to be a Bayesian is to never have to admit you are wrong.”
The Bayesian framework operates under the assumption that what we observe is the truth. It is the only branch of statistics to treat all unobserved quantities as random variables, and it uses probability distributions to govern their behavior. Models are used to describe the system from which our data arises. A common critique of the Bayesian framework is that the results are clouded by the use of prescribed priors. Indeed, it is possible to skew the results if poor, or biased, priors are used in constructing or fitting the model. However, this is true of all statistical methods: every analysis rests on modeling assumptions.
Process models describe the system generating our data and produce functionally different data types. These may be real numbers, counts, proportions, or ordinal data types. The models may or may not include the stochasticity of the system, i.e. they may be either deterministic or probabilistic in nature.
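As a minimal sketch of this distinction (the log-linear form and the parameter values below are illustrative assumptions, not drawn from the text), the following contrasts a deterministic process model for count data with a probabilistic version that adds Poisson noise around the same expected value.

```python
import numpy as np

rng = np.random.default_rng()

def expected_counts(x, intercept=0.5, slope=1.2):
    """Deterministic process model: expected count, always the same for a given x."""
    return np.exp(intercept + slope * x)

def observed_counts(x, intercept=0.5, slope=1.2):
    """Probabilistic process model: Poisson draws scattered around the expected count."""
    return rng.poisson(expected_counts(x, intercept, slope))

x = np.linspace(0, 2, 5)
print(expected_counts(x))  # identical every call: no stochasticity
print(observed_counts(x))  # integer counts that differ between calls
```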
2.1 Bayes' Theorem
Bayes’ Theorem is a mathematical formula used to update probabilities based on new evidence. It describes the probability of an event given prior knowledge and new data. The theorem is expressed as:
P(A|B) = [P(B|A) * P(A)] / P(B)
Where:
P(A|B): Posterior probability of event A given evidence B.
P(B|A): Likelihood of evidence B given event A.
P(A): Prior probability of event A.
P(B): Probability of evidence B (normalizing constant).
In essence, it allows you to revise the probability of a hypothesis (A) given new evidence (B) by combining prior beliefs with the likelihood of observing the evidence. It’s widely used in statistics, machine learning, and decision-making for tasks like spam filtering, medical diagnosis, and more.
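As a worked sketch of the formula above, the following hypothetical medical-screening example computes the posterior probability of disease given a positive test. The 1% prevalence, 95% sensitivity, and 10% false-positive rate are illustrative numbers, not taken from any real test.

```python
# Hypothetical screening example: illustrative numbers only.
p_disease = 0.01               # P(A): prior probability of disease
p_pos_given_disease = 0.95     # P(B|A): likelihood of a positive test if diseased
p_pos_given_healthy = 0.10     # false-positive rate if healthy

# P(B): total probability of a positive test (normalizing constant)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # ~0.088: still unlikely despite a positive test
```

Even with a fairly accurate test, the posterior stays below 9% because the prior probability of disease is so low; the new evidence updates, rather than replaces, the prior belief.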
2.2 Random variables
Inverse probability theory divides the world into things that are observed and things that are unobserved.
- All unobserved quantities are random variables
- Values of random variables are governed by chance, following some probabilistic process model
- Probability distributions quantify “governed by chance” by defining the space in which chance operates
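As a minimal illustrative sketch of these points (the distribution choices and parameter values are assumptions for demonstration only), an unobserved success probability can be treated as a random variable with a Beta prior, while the observed counts follow a Binomial process model; with a Beta prior and Binomial likelihood, the updated distribution is again a Beta (conjugacy).

```python
from scipy import stats

# Unobserved quantity: the success probability theta, treated as a random variable.
# Its behavior is governed by a probability distribution (here an assumed Beta(2, 2) prior).
prior = stats.beta(2, 2)

# Observed quantity: 7 successes in 10 trials from the process model Binomial(10, theta).
successes, trials = 7, 10

# Beta-Binomial conjugacy: posterior is Beta(2 + successes, 2 + trials - successes).
posterior = stats.beta(2 + successes, 2 + trials - successes)

print(prior.mean())      # prior belief about theta: 0.5
print(posterior.mean())  # updated belief after observing the data: ~0.643
```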