The \(p\) value is often used in applications to determine if some improvement
is *significant* or if it was just random chance. Loosely speaking, it
quantifies how likely it is, under the assumption of a hypothesis \(H_0\), to
get a result at least as extreme as the observed one.

This is interesting for studies in medicine and similar scenarios: you have a drug and a placebo, and you want to figure out whether the drug is better than the placebo.

## Hypothesis testing

In statistics, hypothesis testing works as follows:

1. **Define a statistical model**: You have a sample space \(\mathfrak{X}\) and an assumption about the distribution of the data. For example, \(\mathfrak{X} = \mathbb{R}^n\), where \(n \in \mathbb{N}\) is your sample size and \(X_1, \dots, X_n \stackrel{iid}{\sim} \mathcal{N}(\mu, \sigma^2)\).
2. **Define a hypothesis and an alternative**: For example, \(H_0: \mu = 100\) and \(H_1: \mu > 100\) might be hypotheses if you want to check whether a drug increases the IQ.
3. **Control errors**: You can make two errors: either \(H_0\) is true and you reject it, or \(H_1\) is true but you don't reject \(H_0\). In statistics, the first error is usually the one that is controlled: the test is constructed so that the probability of the first error is at most some \(\alpha \in (0, 1)\). Usually \(\alpha = 0.05\), \(\alpha = 0.01\) or even lower.
4. **Choose a test statistic**: You have a value which indicates something about the parameters in the hypotheses, for example \(T(X_1, \dots, X_n) = \frac{1}{n}\sum_{i=1}^{n} X_i\).
5. **Derive the distribution of the test statistic**: \(T\) itself is a random variable and you can calculate its distribution under \(H_0\) (e.g. \(T \stackrel{H_0}{\sim} \mathcal{N}(\mu_0, \frac{\sigma^2}{n})\) with \(\mu_0 = 100\)).
6. **Calculate the test decision**: Hence you can calculate a \(c \in \mathbb{R}\) such that \(P_{H_0}(H_0 \text{ is rejected}) = P_{H_0}(T \geq c) \leq \alpha\). For the alternative \(H_1: \mu > 100\) you reject for large values of \(T\); for other alternatives the rejection region looks different.
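The steps above can be sketched in a few lines of Python. The concrete numbers (\(\sigma = 15\), a common convention for IQ scales, and \(n = 25\)) are illustrative assumptions, not taken from the text:

```python
# Sketch of the testing recipe for H0: mu = 100 vs. H1: mu > 100,
# assuming a known sigma = 15 and sample size n = 25 (both illustrative).
from statistics import NormalDist

mu0 = 100     # mean under H0
sigma = 15    # known standard deviation (assumption)
n = 25        # sample size
alpha = 0.05  # bound on the type I error

# Under H0, T = mean(X_1, ..., X_n) ~ N(mu0, sigma^2 / n).
t_dist = NormalDist(mu=mu0, sigma=sigma / n ** 0.5)

# Reject H0 when T >= c, where c is chosen so that P_H0(T >= c) = alpha.
c = t_dist.inv_cdf(1 - alpha)
print(f"reject H0 if the sample mean is >= {c:.2f}")
```

With these numbers, \(c \approx 104.93\): the test rejects \(H_0\) whenever the sample mean exceeds roughly 105.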

## The p value

Now note that you could also go the other way round. Instead of fixing \(\alpha\) and computing \(c\), you calculate, under \(H_0\), the probability of getting a result at least as extreme as the one actually observed:

\[p^* = P_{H_0}(T \geq t),\]

where \(t\) is the observed value of the test statistic (here for the one-sided alternative \(H_1: \mu > 100\)).

If \(p^* \leq \alpha\), then \(H_0\) can be rejected at level \(\alpha\).

## Interesting statements

I just came across the following statements which I think are interesting enough
to share. **All of them are wrong.**

The following was taken from a German statistics exam by Dr. Klar (KIT, WS 2013/2014).

Assume for the following that an experiment resulted in a \(p\) value of \(0.01\).

1. \(H_0\) is certainly false.
2. \(H_0\) is false with probability \(0.01\).
3. \(H_1\) is certainly correct.
4. You can calculate the probability that \(H_1\) is correct from the \(p\) value.
5. If one decides to reject \(H_0\), then the \(p\) value is the probability of making the wrong decision.
6. The experimental result is reliable, meaning that if the experiment were repeated often, one would get a significant result in 99% of the cases.

I'm not too sure if (5) is really wrong.
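To see why the last statement fails, one can simulate repeated experiments. All numbers below (true mean, \(\sigma\), \(n\)) are illustrative assumptions; the fraction of significant repetitions is the *power* of the test at the true mean, which is unknown in practice and need not be anywhere near 99%:

```python
# Simulation: how often does a repeated experiment give a significant result?
# All numbers (true mean, sigma, n) are illustrative assumptions.
import random
from statistics import NormalDist, mean

random.seed(0)
mu0, sigma, n, alpha = 100, 15, 25, 0.05
true_mu = 106  # the (unknown) true mean in this hypothetical scenario
null = NormalDist(mu=mu0, sigma=sigma / n ** 0.5)

significant = 0
runs = 10_000
for _ in range(runs):
    sample_mean = mean(random.gauss(true_mu, sigma) for _ in range(n))
    p = 1 - null.cdf(sample_mean)  # one-sided p value under H0
    if p <= alpha:
        significant += 1

# The fraction approximates the power of the test at true_mu, not 99%.
print(f"significant in {significant / runs:.1%} of repetitions")
```

With these numbers the test is significant in only about two thirds of the repetitions, even though the effect is real.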