80-110: The Nature of Mathematical Reasoning

Spring 1998

4/16

Set Theory: Part I

1. The Theory of Infinite Sets - Part 1

The properties of infinite sets have perplexed many of the finest minds throughout history. Recall Zeno’s "proof" that Achilles could not beat a tortoise with a head start, puzzles about infinitesimals and indivisibles, and Galileo’s puzzles about the natural numbers and their squares. In the late 19th century Georg Cantor finally provided a theory of sets that involved extremely simple but plausible axioms and definitions. He proved a number of stunning theorems from these axioms, and it is to these we will now turn.

Set theory and logic are now considered the foundation of most of modern mathematics. Most of calculus, geometry, and algebra can be reduced to set theory and logic. It is also worth noting that the axiomatic approach was crucial in Cantor’s set theory. Soon after Cantor’s theory was published, Bertrand Russell showed that Cantor’s axioms were inconsistent as stated, and this discovery plunged the mathematics community into real worry about the foundation of their discipline. Luckily, since the axioms and definitions contained the entire "content" of the theory, it was possible to isolate and replace those parts of the theory that led to trouble. Soon after Russell’s paradox was accepted as a refutation of Cantor’s theory, mathematicians like Zermelo, Fraenk and Hilbert provided althernative axioms that are plausible, consistent and that lead to the results that are crucial to found almost all the rest of mathematics. Chapter 8 in FOL provides a very nice discussion of Cantor’s axioms, how they are inconsistent, and the alternatives that replaced them. In these lectures I cover Cantor’s theory of cardinality, which is still the cornerstone of the current theory of cardinality among infinite sets.

A set is a collection of objects. For example, the set S₁ containing the numbers 1 and 2 is written as S₁ = {1,2}. The only truly primitive concept one must accept in set theory is the idea of an object being "an element of" a set. For example, the number 1 is an element of the set S₁, which we write as: 1 Î S₁. Sets can contain sets, e.g., S₂ = {1,2,{1,2}}. So in this case S₁ Î S₂.

The cardinality of a set is how many elements there are in the set. The cardinality of a finite set is easy, and can be obtained by simply counting its members. What is difficult is coming up with a reasonable way to count the number of elements in infinite sets. We are most interested in comparing the cardinality of different infinite sets. For example, we want to compare the cardinality of the natural numbers N, the rationals Q, and the reals R. Cantor solved the problem by carefully considering what it is we are doing when we are counting the number of elements in a finite set, and thinking about the concept of a finite natural number, e.g., "two."

Suppose you were to count the number of elements in the set B = {apple, banana}. My 5 year old daughter does this very reliably by pointing her finger at the word apple as she says "one" and then pointing her finger at the word banana and saying "two," or sometimes by pointing at banana first and then at apple. Cantor conceived of counting the set B as the process of creating a 1-1 function between the set A of natural numbers {1,2} and the set B={X, banana}. So we need to understand functions and, in particular, 1-1 functions.

A function from a set A (the domain) to a set B (the range) is a map that tells you, for each member of A, which member of B to go to. We write a function F that maps elements of A to elements of B as: F: A ® B. For example, if the set A = {1,2} and the set B = {X, Y}, then there are exactly four different functions F_i: A ® B, i = 1 to 4:

A function can also be thought of as a set of ordered pairs, where the first element of the pair is an element of the domain and second is the element of the range that it maps to. So, for example, the function F₂ can also be written as follows:

F₂ = {<1,Y> , <2,Y>}

A function must map every element in the domain to some element in the range, so in the picture below F₅ is not a function, because it doesn’t map the element 2 to anything in B. So when my daughter was 3, she used to make mistakes in counting, one of which was to leave out certain numbers as she pointed at the people around a table. She might say "one, two, four, six. There are six people, daddy!"

A function cannot give more then one option for where to map any element in the domain, so F₆ is not a function because it maps the number 1 to two things in B.

Sometimes my daughter would make this mistake by pointing at my wife, and saying "ooooooooone" which took so long to say she would point at my mother-in-law while still saying it, and then finally pointing at my father-in-law while she begins to say "twooooooo." "No, no," we would say, "you used the number ‘one’ for both mommy and grandma, only use it for mommy." We can partially formalize this notion with quantifier logic as follows:

(F is a function from A to B) ®

F = {<a,b> | a Î A & b Î B, &

" aÎ A $ <a,b>Î F

" <a₁,b₁>,<a₂,b₂>Î F[ a₁ = a₂ ® b₁ = b₂]

This is read: F is a function from A to B, if and only if F is a set of ordered pairs such that the first member of each ordered pair is in A and the second in B, for every element of A there exists an ordered pair in the function with a as the first member of the pair, and for any two pairs in the function that map some member of A to a member of B, if the members of A are identical, then what they are mapped to is also identical. So we are blocked from ever having two pairs in the function with identical members of the domain mapped into distinct members of the range. As you can see, we rely only on FOL and the predicates "Î " and "=".

A function from A to B needn’t cover every element in B, however. So, for example, functions F₂ and F₃ only use 1 element of B. In this example, however, they use only one element of B because they map more than one element in the domain A to the same element in B. This is OK for functions, but not OK for counting with functions from numbers to objects, because these sorts of functions "double count." My 3-year old daughter would usually make this mistake by starting with my wife, saying "one," and then she would go around the table and get back around to my wife, and forgot that she had already counted her, and count her again, in effect assigning two different numbers to my wife. A function that assigns two different elements in the domain to the same element in the range is not 1-1. So in the picture above, functions F₁ and F₄ are 1-1, and F₂ and F₃ are not. We can formalize the notion of a 1-1 function with quantifier logic as follows:

(F is a 1-1 function from A to B) « F is a function from A to B and

" <a₁,b₁>,<a₂,b₂>Î F[ a₁ = a₂ « b₁ = b₂]

The main difference between this and the formula like it above is that the conditional in

a₁ = a₂ ® b₁ = b₂ has been replaced by a biconditional: a₁ = a₂ « b₁ = b₂. So whereas in the general notion of function, we are allowed to have a₁ ≠ a₂ & b₁ = b₂ , that is, two different members of the domain can map to the same member of the range, we are not allowed to do this in a 1-1 function.

Cantor founded his entire theory of cardinality on the notion of a 1-1 function. We denote the cardinality of a set A by writing |A|. So if A = {2,5,7}, then |A| = 3. Cantor defined what it is for one set to be equal or less than another in cardinality as follows:

Def. 1: If A and B are sets, then there is a 1-1 function from A to B (F_1-1: A ® B) if and only if |A| ≤ |B|.

Consider the following pictures and you will see the intuitive appeal of this definition.

In the first, we have created a 1-1 map from A to B. By pairing off everyone in A with someone in B, that is, by finding a unique member of B for every member of A, and possibly having some elements left in B, there must be at least as many members of B as there are in A. In the second picture we ran out of unique members of B after mapping the number 37, so we were then forced to map 38 to someone who was already assigned a number, thus we were forced to "double-count" and violate 1-1ness.

So how would we prove that |A| ≤ |B|? By providing an example of a 1-1 function that maps everyone in A to a unique member of B. How would we prove that |A| > |B|? By proving that there cannot exist a 1-1 function that maps everyone in A to a unique member of B. How might we do that? By taking a proof strategy just like we did in proving that the square root of 2 is irrational: by assuming there is such a function and proving a contradiction from that assumption.