Group Theory

A group is a set with a binary operation (something like + or ×) satisfying four rules: it's closed (the operation always gives you something still in the set), associative (brackets don't matter), there's an identity element that leaves everything unchanged, and every element has an inverse that cancels it out. The integers under addition work. Rotations of a triangle work. Subtraction doesn't, because it isn't associative. The payoff of the definition is that it's abstract enough to cover a huge range of things at once.
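The four axioms can be checked mechanically for a small finite example. A minimal sketch, using the integers {0, …, 4} under addition modulo 5:

```python
# Brute-force check of the four group axioms for {0,...,4} under
# addition modulo 5.
elements = range(5)

def op(a, b):
    return (a + b) % 5

# Closure: the result always lands back in the set.
closed = all(op(a, b) in elements for a in elements for b in elements)
# Associativity: brackets don't matter.
associative = all(op(op(a, b), c) == op(a, op(b, c))
                  for a in elements for b in elements for c in elements)
# Identity: an element that leaves everything unchanged.
identity = next(e for e in elements
                if all(op(e, a) == a and op(a, e) == a for a in elements))
# Inverses: every element can be cancelled back to the identity.
has_inverses = all(any(op(a, b) == identity for b in elements)
                   for a in elements)

print(closed, associative, identity, has_inverses)  # True True 0 True
```

The same four checks work for any finite set and operation; swapping in subtraction makes the associativity check fail.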


A group you might already know is the set of moves of a Rubik's Cube. The identity is the move that does nothing, moves combine by performing one after another, and every move has a well-defined inverse: turn the same face back the other way. The most powerful feature of groups is that apparently different systems can share the same group structure, so solving a problem once for the abstract group solves it for every system that realises it.

Analysis

Calculus teaches you to differentiate and integrate, and when first introduced these operations are usually presented as tools for solving problems in physics. A proper understanding, however, requires precise definitions and proofs, and this is where analysis comes in. It provides the tools to decide, for instance, whether an infinite sum converges to some fixed value. For well-behaved functions this may seem simple: follow the curve and read off the limiting value. But there are functions that are continuous everywhere yet have no defined slope at any point, and functions that are continuous nowhere at all. Analysis constructs the precise rules, methods, and techniques that establish when applying calculus is valid.
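The difference between a sum that converges and one that does not can be seen numerically. A quick illustration: the geometric series 1 + 1/2 + 1/4 + … settles towards 2, while the harmonic series 1 + 1/2 + 1/3 + … grows without bound, just ever more slowly.

```python
# Partial sums: the geometric series converges to 2, while the
# harmonic series keeps growing no matter how far you sum it.
geometric = sum(1 / 2**n for n in range(50))
harmonic_1k = sum(1 / n for n in range(1, 1_001))
harmonic_1m = sum(1 / n for n in range(1, 1_000_001))

print(geometric)                  # already ~2.0 after 50 terms
print(harmonic_1k, harmonic_1m)   # still climbing: ~7.49 vs ~14.39
```

Analysis supplies the proofs behind what the numbers suggest: no amount of computation alone can distinguish very slow growth from genuine convergence.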


The true magic comes when you extend calculus to the complex numbers. You might expect an increase in complexity, but in many cases the strict rules of analysis make functions and sequences behave more neatly in this expanded space. The Residue Theorem, for example, lets you compute certain integrals by ignoring most of the function and looking only at a few special points where it blows up (poles); the contributions from those points sum to give the whole integral. Analytic continuation is stranger still: you start with a function defined on some small region of the complex plane (such as the real axis), and, provided it is complex-differentiable, there is a unique way to extend its definition to a much larger part of the plane.
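The Residue Theorem can be sanity-checked numerically. A standard example: the integral of 1/(1+x²) over the whole real line equals 2πi times the residue 1/(2i) at the pole x = i, which is exactly π. A crude Riemann sum agrees:

```python
import math

# Midpoint Riemann sum of 1/(1+x^2) over [-1000, 1000]; the Residue
# Theorem predicts the integral over the full real line is pi.
step = 0.01
total = sum(step / (1 + x * x)
            for x in (-1000 + (i + 0.5) * step for i in range(200_000)))

print(total, math.pi)  # agree to within the truncated tails (~0.002)
```

The numerical sum needs two hundred thousand terms and still misses the tails; the residue calculation needs only the value of the function at a single point.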

Combinatorics

Combinatorics involves counting quantities that are too large to count by hand. How many ways can you arrange a deck of 52 cards? How many paths are there along a grid? In problems like these, brute-force enumeration is simply impossible (52! is larger than the number of atoms in the Earth). The goal of combinatorics is to find structures and patterns that let you quantify enormous numbers quickly and abstractly.
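The deck-of-cards number is small to write down and hopeless to enumerate:

```python
import math

# 52! is about 8 * 10**67: a 68-digit number, far beyond the roughly
# 10**50 atoms that make up the Earth.
arrangements = math.factorial(52)

print(arrangements)
print(len(str(arrangements)))  # 68 digits
```

Computing the count takes microseconds; listing the arrangements one per nanosecond would outlast the universe many times over, which is exactly why combinatorics works with the count rather than the list.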


The pigeonhole principle is a simple example of this idea: if you have more pigeons than holes, some hole must contain at least two pigeons. This may seem obvious, but the key is that it also applies in less obvious situations. In a room of 1000 people, for example, there must be at least one day of the year shared as a birthday by at least three people.
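The birthday claim is one line of arithmetic: spread 1000 people as evenly as possible over the 366 possible birthdays and some day still receives ⌈1000/366⌉ people.

```python
import math

# Pigeonhole: 1000 people into at most 366 birthday "holes" means some
# day is shared by at least ceil(1000/366) people.
people, days = 1000, 366
guaranteed = math.ceil(people / days)

print(guaranteed)  # 3
```

Note the principle says nothing about *which* day is crowded, only that one must exist; that non-constructive flavour is typical of pigeonhole arguments.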

Another example is the entropy of a room. Entropy is related to the number of microscopic arrangements consistent with a given macroscopic state, and combinatorics quantifies this precisely: the number of ways to arrange N distinct objects is N!. An average room contains about 10²⁷ molecules of air, so the number of ways to arrange them is roughly (10²⁷)! (an inexpressibly large number), which under this simple model gives an entropy of around 800 kJ/K.
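Even though (10²⁷)! cannot be written out, its logarithm (which is what entropy depends on) is easy to handle via Stirling's approximation, ln(N!) ≈ N ln N − N. A sketch of the arithmetic under the toy model above:

```python
import math

# Toy model from the text: S = k_B * ln(N!), with Stirling's
# approximation ln(N!) ~ N*ln(N) - N to keep the number tractable.
k_B = 1.380649e-23   # Boltzmann constant, J/K
N = 1e27             # rough number of air molecules in a room

ln_factorial = N * math.log(N) - N
S = k_B * ln_factorial

print(S)  # on the order of 8e5 J/K, i.e. hundreds of kJ/K
```

This is the typical pattern in combinatorics: the raw count is astronomically large, but a logarithm or a ratio of counts is a perfectly manageable number.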

Hypothesis Testing

The p-value of a test is the probability, assuming the null hypothesis (the description of the situation if nothing interesting is happening), of seeing a result at least as extreme as the one you observed. If you measure the effectiveness of a drug and it cures 50% of patients, that number on its own tells you nothing about how much better the drug is than no treatment at all. To judge this we compute the p-value. If, under the assumption that the drug does nothing, we expect to measure a 48% cure rate with some random variation, we might compute that our test has a p-value of 0.03. The threshold of 0.05 is a common standard for calling a result significant (although it is somewhat arbitrary), so in this example we would have significant evidence that the drug did have some effect.
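A sketch of how such a p-value might be computed, with assumed numbers (a trial of 2000 patients, a null cure rate of 48%, and 1000 observed cures, i.e. 50%), using a normal approximation to the binomial:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical trial: one-sided p-value for observing at least `cures`
# successes when the null hypothesis says the cure rate is 48%.
n, p_null, cures = 2000, 0.48, 1000

mean = n * p_null
sd = sqrt(n * p_null * (1 - p_null))
z = (cures - 0.5 - mean) / sd          # 0.5 is a continuity correction
p_value = 1 - NormalDist().cdf(z)

print(round(p_value, 3))  # below the 0.05 threshold
```

With these assumed numbers the p-value lands just under 0.05, so the result would be called significant; halve the sample size and the very same 50% cure rate would not be.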


It is common to misread the p-value as the probability of the null hypothesis given the data (how likely it is that the drug has no effect, given the measured cure rate). That is a different quantity: computing it requires Bayesian reasoning, and in particular a prior probability for the null hypothesis chosen before taking any measurements.
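The gap between the two quantities can be made concrete with entirely hypothetical numbers. Suppose the observed data would occur with probability 0.05 if the drug does nothing and 0.30 if it works, and each hypothesis gets a 50% prior; Bayes' rule then gives the posterior probability of the null:

```python
# Toy Bayesian calculation with made-up likelihoods and priors, to show
# the posterior P(null | data) is not the p-value.
prior_null, prior_effect = 0.5, 0.5      # chosen before the experiment
like_null, like_effect = 0.05, 0.30      # P(data | each hypothesis)

posterior_null = (like_null * prior_null) / (
    like_null * prior_null + like_effect * prior_effect)

print(round(posterior_null, 3))  # about 0.143, not 0.05
```

Even with data this unlikely under the null, the null retains a roughly one-in-seven posterior probability here; the answer moves with the prior, which is exactly what the p-value never sees.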


The p-value also fails if you aren't careful about when and how you look at a set of data. If you run twenty tests at the 0.05 level, you should expect one false positive on average. There are methods that maintain the overall false-positive rate across multiple tests, but they are often ignored, and they are impractical when the experiments are completely separate. This contributes to the replication crisis affecting the reliability of scientific research: we should expect roughly 1 in 20 published results with a p-value near the threshold to be rejecting a true null hypothesis, claiming an effect where there is none.
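The twenty-test arithmetic, together with the Bonferroni correction (one standard method for holding the family-wise error rate down), looks like this:

```python
# Twenty independent tests at the 0.05 level: expected false positives,
# chance of at least one, and the Bonferroni-corrected per-test threshold.
alpha, tests = 0.05, 20

expected_false_positives = alpha * tests
prob_at_least_one = 1 - (1 - alpha) ** tests
bonferroni_threshold = alpha / tests

print(expected_false_positives)      # one false positive expected
print(round(prob_at_least_one, 3))   # ~0.642: better than a coin flip
print(bonferroni_threshold)          # 0.0025 per test
```

The Bonferroni threshold illustrates the practical problem: demanding p < 0.0025 on every individual test keeps the family-wise rate at 0.05, but at a steep cost in statistical power, which is one reason such corrections are so often skipped.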