64. My PhD Thesis: Part 2: More Algebraic Topology and Model Theory
(Epistemic
status: A creased, stained map to what were once my favorite hunting
grounds. Accessible to anyone who can support substantial abstraction;
prior math knowledge is not necessary. In particular, ignorance of
calculus is not an obstruction here, but total ignorance of geometry or
like, arithmetic or logic, will be. Extremely dense and probably won’t
get you there, but at least you’ll ask better questions. Partially
dedicated to DG, JM, and PR.)
To start with, answers to the two challenge problems.
For the algebra one, the thing to notice is that \(\mathbb{Z}_n = \langle x | x^n \rangle\) is cyclic, so for any \(g \in \mathbb{Z}_n\), in fact \(g = x^k\) for some \(0 \leq k < n\). Since \(\gcd(k, n)\) divides \(k\), we have \(g^{n/\gcd(k, n)} = x^{k \cdot n/\gcd(k, n)} = (x^n)^{k/\gcd(k, n)} = 1\). Moreover, \(n/\gcd(k, n)\) is the minimal exponent with this property, so the order of \(g\) - the number of elements that are some power of \(g\) inside of \(\mathbb{Z}_n\) - is in fact \(n/\gcd(k, n)\). In particular, if \(\gcd(k, n) > 1\), then not all of \(\mathbb{Z}_n\) will be in \(\langle g \rangle\). This also gives us the second half of the algebra challenge problem: if \(n\) is prime, then \(\gcd(k, n) = 1\) for every \(0 < k < n\), and this is a necessary and sufficient condition for any such \(k\) working.
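If you'd like to see this numerically before believing it, here's a minimal brute-force sketch in Python (the helper name is mine, purely illustrative) that computes the order of \(x^k\) in \(\mathbb{Z}_n\) directly and checks it against \(n/\gcd(k, n)\):

```python
from math import gcd

def order_in_Zn(k: int, n: int) -> int:
    """Order of the element k (i.e. x^k) in the cyclic group Z_n,
    computed by brute force: keep adding k until we return to 0."""
    g, steps = k % n, 1
    while g != 0:
        g = (g + k) % n
        steps += 1
    return steps

for n in [6, 7, 12, 17]:
    for k in range(1, n):
        # The order of x^k should always equal n / gcd(k, n).
        assert order_in_Zn(k, n) == n // gcd(k, n)
    # x^k generates all of Z_n exactly when gcd(k, n) == 1;
    # for prime n, that's every 1 <= k < n.
    print(n, [k for k in range(1, n) if gcd(k, n) == 1])
```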
For the topology one, the key is to start by simplifying the setup: think about the fundamental group of the two-pin setup, where loops get completed from the rock back to the base of the crane, alongside the setups where either pin is removed. The trick is to use what happens to the fundamental group when you remove each pin: in particular, we get a quotient group corresponding to adding an additional relation. The fundamental group is given by \(\langle a, b \rangle\), and removing one pin adds a relation completely eliminating one of the generators: \(\langle a, b | a \rangle = \langle b \rangle; \langle a, b | b \rangle = \langle a \rangle\). This gives us one possible solution to the problem: \(aba^{-1}b^{-1}\), which reduces to \(bb^{-1} = 1\) or \(aa^{-1} = 1\), depending on which pin is removed. Below is a sketch of one possible solution implementing that element of the fundamental group.
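As a sanity check on that answer, here's a small Python sketch (the word encoding is mine: lowercase letters are generators, uppercase their inverses) that freely reduces words and shows the commutator \(aba^{-1}b^{-1}\) trivializes once either generator is killed:

```python
def reduce_word(word: str) -> str:
    """Freely reduce a word: adjacent inverse pairs cancel, via a stack."""
    stack = []
    for c in word:
        if stack and stack[-1] != c and stack[-1].lower() == c.lower():
            stack.pop()  # c cancels against its inverse on top of the stack
        else:
            stack.append(c)
    return "".join(stack)

def kill_generator(word: str, gen: str) -> str:
    """Impose the relation gen = 1 by deleting it, then freely reduce."""
    return reduce_word("".join(c for c in word if c.lower() != gen))

commutator = "abAB"  # a b a^-1 b^-1
print(reduce_word(commutator))            # 'abAB': nontrivial with both pins in
print(kill_generator(commutator, "a"))    # '': trivial once pin a is pulled
print(kill_generator(commutator, "b"))    # '': trivial once pin b is pulled
```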
Anyhow: when last we left off, we'd walked the first half of the red
path - a first pass at abstract algebra - and then walked the green path - some basic algebraic topology - to see why we might care about abstract algebra at all, if we're thinking about geometry. However, surely the method I sketched of understanding a geometric object through explicit calculation of its fundamental group isn't an efficient way to understand those objects. We'd need some way of making our lives easier - some set of clear reductions, and ideally some way of using fundamental groups of simple objects to build up more complex ones. But why should we believe that we can do anything like that? To see why, we'll need to walk the second half of the green path and think a little more deeply about algebraic topology.
Let's jump right in and start with building up more complicated spaces out of simpler ones. Remember connect-sums from last time?
Let's call our two objects \(X, Y\). I claim that the key insight for figuring out what operation you need to do in order to find \(\pi_1(X \# Y)\) in terms of \(\pi_1(X), \pi_1(Y)\) is to think carefully about what loops need to look like inside of \(X \# Y\). Let's start by thinking about a special case of connect-sums called "wedge sums". The wedge sum \(X \vee Y\) is what happens when instead of removing a whole patch from each space, we join the two spaces as minimally as possible: at a single point.
Loops in \(X \vee Y\) look like sequences of simpler loops, each of which is wholly contained in either \(X\) or \(Y\), and each of which is joined to the next and previous one at the unique point of connection; we must thus identify the trivial loops in \(X, Y\) with each other. We call this a "word", where the individual loops are "letters". In fact, we also use the same term for what happens when you take two groups and combine them in a similar way: if we take two groups \(G = \langle \{g_i\} | \{r_i\} \rangle, H = \langle \{h_j\} | \{s_j\} \rangle\), then the free product \(G \ast H\) is given by \(G \ast H = \langle \{g_i\}, \{h_j\} | \{r_i\}, \{s_j\} \rangle\), where we similarly have \(1_G = 1_H\)... though we might have been able to guess that given what we already know about how many identities a group can have.
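For instance (an example of my own, following that recipe exactly): gluing together the presentations of \(\mathbb{Z}_2\) and \(\mathbb{Z}_3\) gives

\[
\mathbb{Z}_2 \ast \mathbb{Z}_3 = \langle a \mid a^2 \rangle \ast \langle b \mid b^3 \rangle = \langle a, b \mid a^2, b^3 \rangle,
\]

which, despite being built from two finite groups, is infinite: a word like \(abab\cdots\) never reduces, because no relation lets the \(a\)'s and \(b\)'s cancel against each other.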
Let's think about how this plays out in one of the simplest cases: two circles wedge-summed together, which we call \(S^1 \vee S^1\).
What about the more general case? Maybe you can see the pattern: if we have a space \(Z\) which we can break down as a union of two spaces \(Z = X \cup Y\) with path-connected intersection \(W = X \cap Y\), then the loops of \(Z\) break down as alternating loops from \(X\) and \(Y\), though now we need to think about avoiding double-counting any nontrivial loops that might lie entirely in the intersection. This gives us a form for the relationship between the fundamental groups. Let \(f: \pi_1(W) \to \pi_1(X), g: \pi_1(W) \to \pi_1(Y)\) be the natural inclusions of the fundamental group of the overlap into the fundamental groups of the two overlapping spaces. Then we have \(\pi_1(Z) = (\pi_1(X) \ast \pi_1(Y))/\langle f(w)g(w)^{-1} : w \in \pi_1(W) \rangle\), or with some abuse of notation, \(\pi_1(Z) = \langle \pi_1(X), \pi_1(Y) \mid f(w) = g(w) \;\forall w \in \pi_1(W) \rangle \), that is, the free product of the two component fundamental groups with added relations coming from the fact that the two homomorphic images of the same element of \(\pi_1(W)\) need to actually be the same. We call this the amalgamated free product, and its application to algebraic topology is called the Seifert-van Kampen theorem. (Try showing that if we decompose a sphere into two discs overlapping on an equatorial band, this calculation still gives us that \(\pi_1(S^2) = 1\).)
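To see the machine run on something familiar (a worked sketch of my own): decompose the figure-eight \(S^1 \vee S^1\) as \(X \cup Y\), where \(X\) and \(Y\) are the two circles, each thickened slightly past the wedge point, so that \(W = X \cap Y\) is a small cross-shaped patch around the wedge point. Then \(W\) is contractible, so \(\pi_1(W) = 1\) contributes no relations at all, and

\[
\pi_1(S^1 \vee S^1) = \pi_1(X) \ast \pi_1(Y) = \langle a \rangle \ast \langle b \rangle = \langle a, b \rangle,
\]

the free group on two generators, exactly as the word picture above suggested.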
One last topic to cover before moving on: quotient spaces, covering theory, and Cayley graphs. A quotient space is what happens when you start with one space - some geometric object - and then glue parts of it to other parts of it according to a specified rule. A classic example is that of the circle, \(S^1\). If we take the real line \(\mathbb{R}\) and declare that any pair of points that differ by exactly some integer are actually the same point, then what we get is the quotient of the real line by the action of the integers, which we write \(\mathbb{R}/\mathbb{Z}\). But consider that this is exactly the circle: all we've done is wrap the real line around itself once for every integer, so that moving from any integer to the next integer corresponds to going around the circle once.
Another classic example can be found in classic video games like Pac-Man or Asteroids. The story here is actually very similar to that of the quotient construction of the circle: take the plane \(\mathbb{R}^2\) and quotient it out by the action by \(\mathbb{Z}^2\) corresponding to moving an integer number of screens up or right, so that the top of the screen is the same as the bottom, and the left of the screen is the same as the right. The result, \(\mathbb{R}^2/\mathbb{Z}^2\), is better known as the torus \(T^2 = S^1 \times S^1 = \mathbb{R}/\mathbb{Z} \times \mathbb{R}/\mathbb{Z}\).
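In game terms, the quotient is just coordinate wrap-around. A minimal sketch (screen size and names are mine, purely illustrative):

```python
SCREEN_W, SCREEN_H = 1.0, 1.0  # one fundamental domain of R^2 / Z^2

def wrap(x: float, y: float) -> tuple[float, float]:
    """Send a point of the plane to its representative on the torus:
    positions differing by whole screens are the same point."""
    return (x % SCREEN_W, y % SCREEN_H)

# Walking off the right edge brings you back on the left:
print(wrap(1.25, 0.5))   # (0.25, 0.5)
# Walking off the bottom brings you back at the top:
print(wrap(0.5, -0.1))   # (0.5, 0.9), up to float rounding
```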
There's something unusual about quotient spaces as contrasted against quotient groups, which we discussed earlier. In both of the examples above, we started with a contractible space - one with fundamental group \(1\) - and ended up with a space whose fundamental group is precisely the group whose action we quotiented out by. We can even see why this should be the case: quotienting a space by an action means that we might create new loops in the resulting space, and we should get at most one such loop for any element of the acting group. This is no accident - every "nice" topological space \(X\) has what's called a universal cover \(\tilde{X}\), distinguished by being the space which covers \(X\) and has trivial fundamental group, that is, \(\pi_1(\tilde{X}) = 1\) no matter what \(X\) is.
But what does it mean to cover a space? There's a deep rabbit-hole to go down there, so I'll keep it light: roughly, one space \(X\) covers another space \(Y\) via a covering map \(c: X \to Y\) if every point \(y \in Y\) has some open neighborhood \(U_y \subseteq Y\) whose preimage \(c^{-1}(U_y)\) breaks up into disjoint pieces, each homeomorphic to - topologically the same as - \(U_y\), forming a family indexed by some discrete set of points. We call the "copies" of \(U_y\) inside \(c^{-1}(U_y)\) "sheets". Coverings are, for our purposes, always surjective - they assign at least one point of the covering space to each point of the base space - and in fact the number of such points is the same no matter which point of the base space we pick; we call the set of such points the fiber over the point in the base space, and the number of those points the degree.
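For a concrete example, take the covering \(c: \mathbb{R} \to S^1 = \mathbb{R}/\mathbb{Z}\) given by \(c(t) = t \bmod 1\). A small sketch (illustrative names mine) showing a window into the fiber over a point - one preimage per integer, a discrete set, just as the definition demands:

```python
def cover(t: float) -> float:
    """The covering map R -> S^1 = R/Z, sending t to its fractional part."""
    return t % 1.0

def fiber(y: float, window: int = 3) -> list[float]:
    """A finite window into the (infinite, discrete) fiber over y:
    every preimage of y is y + k for some integer k."""
    return [y + k for k in range(-window, window + 1)]

y = 0.25
print(fiber(y))                                   # [-2.75, -1.75, ..., 3.25]
print(all(cover(t) == y for t in fiber(y)))       # True: each lift maps back to y
```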
Every space covers itself under the identity map. Similarly, another cover comes from having multiple disjoint copies of a space all sent to a single copy of the space. The two quotient space constructions above are also covers; not all quotient spaces need be so well-behaved, as the rule that a quotient space follows need not be as simple as an action by a group or even a specific envisionable welding of some points to others.
For a challenge problem to wrap this all up, try to find the degree-\(2\) covering map from the torus to the Klein bottle, as well as the degree-\(2\) covering map from the sphere to the real projective plane.
Let's take a break from algebra and topology for a moment and walk the blue path, thinking about some model theory. Model theory is the area of math that deals in how we connect up strings of symbols with what those symbols are supposed to mean. In other words, model theory asks the question "what restricted models of logic allow us to prove a given sentence, or even to assign it a desired meaning?".
We need an alphabet, to start with. We'll use all the familiar tools of propositional logic: the connectives NOT \((\neg)\), AND \((\wedge)\), OR \((\vee)\), IF \((\rightarrow)\), and IFF \((\leftrightarrow)\); the quantifiers FORALL \((\forall)\) and EXISTS \((\exists)\); the constants TRUE \((\top)\) and FALSE \((\bot)\); and as many variables \(x_i\) as we'd like, along with parentheses and brackets for disambiguation. We also need some "extra" symbols for functions, constants (which are just nullary functions), and relations, or semantic predicates, to work with; these last are just functions that take in zero or more variable values and return TRUE or FALSE - for example, "equals" is a 2-ary (or binary) predicate that returns TRUE exactly when its two arguments are actually equal - rather than something like a number. Each predicate and each function can accept whichever (fixed) number of arguments we like. (We actually only need NOT, a single connective (OR, say), and just one of the quantifiers (like FORALL), alongside all of the variables, though we can't guarantee that we can get rid of any of the constants, functions, or predicates, as that depends on the semantics. Pick a connective and a quantifier. Can you see why we can build all of the other connectives from NOT and your choice of connective? How about turning one quantifier into the other, again through use of NOT?)
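As a hint toward that challenge, here's a tiny brute-force truth-table check in Python (an illustration of my own) that NOT and OR suffice to rebuild AND - the other connectives, and the quantifier swap, work along similar lines:

```python
from itertools import product

# Candidate definition: A AND B := NOT(NOT A OR NOT B)
def and_from_not_or(a: bool, b: bool) -> bool:
    return not ((not a) or (not b))

# Verify against the genuine AND on every truth assignment.
print(all(and_from_not_or(a, b) == (a and b)
          for a, b in product([False, True], repeat=2)))  # True
```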
The basic objects of discourse are well-formed formulas: "complete sentences" of mathematics, with correlates of things like correct spelling and grammar. More formally, a well-formed formula is either an atomic predicate - some simple statement like \(x = 17\) - the result of applying a modifier (a quantifier or else simply NOT) to another formula, or two formulas connected with one of our connectives. Notably, formulas must always be of finite length. Model theory - and much of math as a whole - is built on first-order logic, which means that our quantifiers only ever range over individual variables, not sets of variables or anything more complicated. For example, we can express things like "all x such that x is even", but not "all subsets of x where the difference between lower and upper bound is no more than 5". For a well-formed formula to be a sentence in first-order logic, all its variables must be bound. This means that they must be quantified over, so that the formula makes claims about whether a satisfying assignment of values to the variables exists, or whether all assignments of value to a variable satisfy the formula. This allows us to assign truth values to sentences.
We can finally define what theories and models are. A first-order theory is a set of axioms, each of which is a sentence whose symbols come from the alphabet of first-order logic as given above, along with symbols from the theory's signature. The signature of a theory is the set of "extra" symbols - the important constants, functions, operations, and relations of our structure of interest. If we're talking about arithmetic on the integers, for example, the signature might include symbols for \(0, 1, +, \times, -,\) and \(=\), among others. A model of a theory \(T\) is a structure - a set of values for the signature's symbols to refer to - in which all the axioms of \(T\) hold. A sentence \(S\) is said to hold in \(T\), or \(T\) models \(S\) - written \(T \models S\) - if \(S\) is true in every model of \(T\); by the completeness theorem we'll meet in a moment, this is the same as \(S\) being provable from the axioms of \(T\), where the proof must be finitely long.
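For example (an illustration of my own): the theory of groups has signature \((e, \cdot, {}^{-1})\) and just three axioms, each a first-order sentence over that signature:

\[
\begin{aligned}
&\forall x\, \forall y\, \forall z\; \bigl((x \cdot y) \cdot z = x \cdot (y \cdot z)\bigr)\\
&\forall x\; \bigl(x \cdot e = x \,\wedge\, e \cdot x = x\bigr)\\
&\forall x\; \bigl(x \cdot x^{-1} = e \,\wedge\, x^{-1} \cdot x = e\bigr)
\end{aligned}
\]

Every group we met back on the red path is a model of this theory.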
The major thing I'll ask you to take on faith is the "compactness theorem". (If you want to dig more deeply into this part and see why it works, the search terms you want - other than "compactness theorem" - are "Robinson's principle" and "Gödel's completeness theorem". In particular, we really only care about Robinson's principle here.) The compactness theorem says that a set of first-order sentences has a model if and only if every finite subset of those sentences also has a model. Additionally, Gödel's completeness theorem says that because we're working in first-order logic, if a theory models a sentence, that sentence can also be proven in the theory.
So why does this matter for us? Consider the theory of algebraically closed fields, called \(ACF\), where by "algebraically closed" we mean that every polynomial has some value for its variable(s) that makes it equal to zero, and we can find those values somewhere in the numbers that our theory can express. (Note that this is not always true! If we write down a theory of integer arithmetic, we wouldn't be able to express numbers like \(\sqrt{2}\).) Additionally, a field is an object like a group but stronger: now we have two operations - \(+\) and \(\times\), say. Our chosen set of numbers must form a commutative group under both operations, each with their own identity element - though our "multiplicative" group leaves out the "additive" identity - and one of the operations must distribute over the other. (Can you see why the additive identity can't have a multiplicative inverse? For a hint, use distributivity. Suppose the multiplicative inverse of \(0\) is some \(r\). What's true about \(0 \times r = (0 + 0) \times r\)?) Just as with groups, you've seen fields before: the rationals \(\mathbb{Q}\), the reals \(\mathbb{R}\), and the complex numbers \(\mathbb{C}\) are all fields, and the fundamental theorem of algebra says that in particular \(\mathbb{C}\) is algebraically closed. Finite fields also exist: for any prime \(p\), \(\mathbb{Z}_p\) is an example of a field. (You may want to write down all of the axioms for a field explicitly before moving on, to make sure that you understand. In a proper axiomatization of the theory of fields, each such axiom would need to be written down explicitly.)
Now, \(ACF\) has axioms expressing all the group axioms for addition and multiplication, the distributive law, and one more set of axioms to the effect that every polynomial has a root, with a signature of the form \((0, 1, +, \times, =)\); but on its own, \(ACF\) is actually incomplete. Fields have a value called a characteristic, which is the smallest number of times you need to add the multiplicative identity to itself to get the additive identity - or \(0\), if no number of times works. We could prove that every finite field has order the power of some prime, but that's a little out of scope. The important thing is that a nonzero characteristic is always prime. \(ACF\) has a hole in its specification that we can fill as we like: we get to set the characteristic; if we don't, we have a theory that expresses only things that are true of every algebraically closed field. To fill the hole, we have a choice. We can add one more axiom of the form \(1 + 1 + \cdots + 1 = 0\), where the number of such \(1\)'s is some prime \(p\), in which case we have \(ACF_p\), the theory of algebraically closed fields of characteristic \(p\), whose models include the algebraic closure of \(\mathbb{Z}_p\). On the other hand, we can instead add, for every prime, one axiom of the form \(1 + 1 + \cdots + 1 \neq 0\), in which case we get \(ACF_0\), the theory of algebraically closed fields of characteristic \(0\), like \(\mathbb{C}\) or \(\overline{\mathbb{Q}}\). (\(\overline{\mathbb{Q}}\) is how we write the algebraic closure of the rationals, which we can think of as a nicer version of \(\mathbb{C}\). We construct it by adding every root of every polynomial with rational coefficients to our field. The resulting field is also called the algebraic numbers.)
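To make the "every polynomial has a root" axioms concrete, here's one standard way to write them (a sketch; monic polynomials suffice, since we can always divide through by a nonzero leading coefficient): for each degree \(n \geq 1\), we add the axiom

\[
\forall a_0\, \forall a_1 \cdots \forall a_{n-1}\; \exists x\; \bigl(x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0\bigr).
\]

Note that this is an infinite family of axioms, one per degree: each individual sentence is finite, as required, even though the theory as a whole needs infinitely many of them.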
For reasons we'll cover in a later post, we care a lot about being able to pass relatively freely between making statements about fields like \(\mathbb{C}\) or \(\overline{\mathbb{Q}}\) and fields like \(\mathbb{Z}_{17}\) or \(\mathbb{F}_{3^8}\), the field with \(3^8\) elements. We can do that fairly straightforwardly with model theory, assuming that we're alright with compactness, and this transfer is an important lemma.
Let's show that a sentence \(S\) holds in \(ACF_0\) (and thus in \(\mathbb{C}\)) if and only if \(S\) holds in \(ACF_p\) for infinitely many primes \(p\) - that is, we'll never run out of such primes. For the forward direction, consider that every sentence and every proof must be of finite length. Now, take some sentence \(S\): its proof in \(ACF_0\) is finitely long, and so it only uses finitely many axioms of the form "the characteristic of this field is not \(p\)". For all the rest, we can violate that axiom and see no difference. Thus for all sufficiently large primes \(p\), \(S\) also holds in \(ACF_p\). For the reverse direction, suppose \(ACF_0\) did not model \(S\); since \(ACF_0\) is complete - a fact I'll also ask you to take on faith - it would then model \(\neg S\), and by the previous argument \(ACF_p\) would also model \(\neg S\) for all big enough primes, leaving only finitely many primes for which \(ACF_p\) could model \(S\). By contrapositive, \(ACF_p\) modelling \(S\) for infinitely many primes means that \(ACF_0\) also models \(S\). This is known as a transfer principle - this particular one is often called the Lefschetz principle - and it is one of the cornerstones of my thesis work.
For one final challenge problem, let's spend a little time thinking about the nature of polynomials with coefficients and variables in \(\mathbb{Z}_p\). First, prove that any polynomial in \(\mathbb{Z}_p[x]\) - that's the notation for the specified polynomial ring - which has degree \(p\) or greater is in fact identical, as a function on \(\mathbb{Z}_p\), to some polynomial of lesser degree. Then use that result to show that \(\mathbb{Z}_p\) is in fact not algebraically closed - that is, to find a polynomial over it with no root - and so to see why \(ACF_p\)'s models are algebraic closures like \(\overline{\mathbb{Z}_p}\) rather than \(\mathbb{Z}_p\) itself.
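If you'd like to experiment before proving anything, here's a small Python sketch (all names mine) for evaluating polynomials as functions on \(\mathbb{Z}_p\), so you can compare degrees and hunt for root-free polynomials yourself:

```python
def as_function(coeffs: list[int], p: int) -> tuple[int, ...]:
    """The function Z_p -> Z_p computed by the polynomial
    coeffs[0] + coeffs[1]*x + coeffs[2]*x^2 + ..., as its table of values."""
    return tuple(sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
                 for x in range(p))

def has_root(coeffs: list[int], p: int) -> bool:
    """Does the polynomial have a root in Z_p?"""
    return 0 in as_function(coeffs, p)

p = 5
print(as_function([0, 0, 0, 0, 0, 1], p))  # x^5 as a function on Z_5
print(as_function([0, 1], p))              # x: compare the two tables
print(has_root([p - 1, 0, 1], p))          # x^2 - 1: True, roots at +-1
```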
Happy hunting!