Elementary discrete mathematics

Table of contents

0. What is discrete mathematics?

0.1 It’s math you do quietly, right?

The phrase “discrete mathematics” is often used nebulously to refer to a certain branch of mathematics that, from the college point of view, is “not calculus” or perhaps is applicable to computer science. In this section, we will try to understand what it means for a topic to fall under the umbrella of discrete mathematics.

0.2 But really, what is it?

Here’s a question: What is the next number after 1? Your answer depends on “where you live,” so to speak. If you live in the rational numbers (numbers that can be expressed as fractions, like 3/7) or the real numbers (including numbers that can’t be written as fractions, like \pi), you can’t answer the question. If you give me a real or a rational number that’s a little more than 1, say 1.1, I can find something even smaller; say, 1.01.

If you live in the set of natural numbers, which is the set of numbers that are useful for counting things in the world, the question does have an answer: 2. The natural numbers do not suffer from the crowding issue that the rationals and the reals do; all of their elements are sufficiently “far apart” in this way. There is 1, and then 2, and nothing in between. In this sense they are discrete. In this book, we will (usually) live in the natural numbers.

0.3 Some examples

Here are some questions we might try to answer using discrete mathematics. In fact, you will be able to answer all these questions by the end of the book.

Suppose we have a program that will activate condition 4 if conditions 1 and 2 are met but not condition 3. How do we know when condition 4 is active?

We can use logic to represent this program with the statement (p \wedge q \wedge \neg r) \to s, and then use truth tables to determine exactly when condition 4 will trigger.

Which of these two programs is more efficient?

A sequence is a function of the natural numbers. An example is the number (a_n) of steps it takes to run a computer program with n inputs. If (a_n) grows much faster than (b_n) does, that can tell us that one program scales worse for large quantities of data.

Some members of a community are sick. How do we know who is at risk of being infected?

Relations encode relationships between sets of objects. We may compose relations to iterate those relationships over and over again. If person x is sick and has come into contact with person y, and y has come into contact with person z, and z has since seen w, then w is at risk.

0.4 How to use this book

Chapters 1-8 must be done in order. Each chapter has a section of exercises located on the problems page. These exercises should be attempted after each chapter. If you are in MATH 174 at Coastal Carolina University, these exercises are your homework.

Then, chapters 9-13 (on combinatorics) and chapters 14-18 (on relations) may be attempted in that order, or in the order 14-18; 9-13. Be aware that some of the exercises in 14-18 assume that the reader has learned the combinatorial material. So, if you follow the alternate order, leave these exercises for later.

This book is paired with four playlists on my YouTube channel. Playlist 1 (sets and logic) maps to chapters 1-3, playlist 2 (proof and recursion) maps to chapters 4-8, playlist 3 (combinatorics) maps to chapters 9-13, and playlist 4 (relations) maps to chapters 14-18.

These videos are very short demonstrations and examples of the material pursued in depth by the book. I am not sure which ordering is better: book then videos, or videos then book. Choose your own adventure.

I try to use the invitational “we” throughout the book and maintain some level of formality, but sometimes my annoying authorial voice pokes through. I apologize if that bothers you.

Each chapter ends with several “takeaways” that summarize the chapter. Starting with the fourth chapter, if necessary, each chapter begins with a hint to go back and look at some things you may not have thought about in a while.

0.5 The point of this book

– is to teach you discrete mathematics, of course.

But much more broadly, this book is intended to teach you to think mathematically. You may be a student who has never seen any math beyond calculus or algebra, or perhaps you dread doing math. The point of this book is to get you to learn to think deeply, communicate carefully, and not shy away from difficult material.

As you proceed, have something to write with on hand. Work each example along with the book. Then, at the end of each chapter, work the exercises. Some are more difficult than others, but all are well worth doing to improve your understanding. Work along with someone at roughly your skill level; share solutions and ask each other questions.

Everyone can be a “math person!” So, let’s get started.

1. Sets


1.1 Membership

For nearly all interested parties, sets form the foundation of all mathematics. Every mathematical object you have ever encountered can be described as a set, from numbers themselves to functions and matrices.

We will say that a set is an unordered collection of objects, called its elements. When x is an element of the set X, we write x \in X.

Examples. Our starting point will be the set of all natural numbers,

\mathbf{N} = \{0, 1, 2, 3, \ldots\}.

Certain sets are special enough to get their own names, with bold or “blackboard bold” face. Sometimes (especially in ink) you will see \mathbb{N} instead. Note that the elements of the set are enclosed in curly braces. Proper notation is not optional; using the wrong notation is like saying “milk went store” to mean “I went to the store to get milk.” Finally, not everyone agrees that 0 is a natural number, but computer scientists do.

If we take \mathbf{N} and include its negatives, we get the integers

\mathbf{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}.

Notice that in the case of an infinite set, or even a large finite set, we can use ellipses when a pattern is established. One infinite set that defies a pattern is the set of prime numbers. You may recall that a prime number is a positive integer p with exactly two factors: itself and 1. We may list a few: 2, 3, 5, 7, 11, 13, 17, 19, 23, but there is no pattern that allows us to employ ellipses.

Fortunately there is another way to write down a set. All previous examples have used roster notation, where the elements are simply listed. It may be more convenient to use set-builder notation, where the set is instead described with a rule that exactly characterizes its elements. In that case, the set of primes can be written as

\{ \, p \, | \, p \text{ is a prime number} \, \}.

To provide one last example, consider the set of all U.S. states. In roster notation this set is a nightmare: {AL, AK, AZ, AR, CA, CO, CT, DE, FL, GA, HI, ID, IL, IN, IA, KS, KY, LA, ME, MD, MA, MI, MN, MS, MO, MT, NE, NV, NH, NJ, NM, NY, NC, ND, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VT, VA, WA, WV, WI, WY}. But in set-builder notation we simply write

\{ \, S \, | \, S \text{ is a U.S. state} \, \}.
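Set-builder notation has a direct analogue in programming. As an illustration (not part of the text's formal development), here is a sketch in Python, where a set comprehension plays the role of the rule; the helper is_prime is our own:

```python
# Roster notation: list the elements directly.
Y = {2, 4, 7, 9}

# Set-builder notation: a rule that characterizes the elements.
# {p | p is a prime number and p < 30} becomes a set comprehension.
def is_prime(n):
    """True when n is a positive integer with exactly two factors."""
    return n > 1 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

primes_below_30 = {p for p in range(30) if is_prime(p)}
print(sorted(primes_below_30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The comprehension reads almost exactly like the mathematical notation: “the set of p, for p in range(30), such that p is prime.”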

Definition 1.1.1. Two sets X and Y are equal if they contain exactly the same elements, in which case we write X = Y.

Remember that sets are unordered, so \{a, b\} = \{b, a\}.

Definition 1.1.2. Let X be a finite set. Its cardinality, denoted |X|, is the number of elements in X.

Example. If X = \{a, b, c\}, then |X| = 3.

We can talk about the cardinality of infinite sets, but that is far beyond the scope of this course.

1.2 Containment

One of the most important ideas when dealing with sets is that of a subset.

Definition 1.2.1. Let X and Y be sets. We say X is a subset of Y, written X \subseteq Y, if every element of X is also an element of Y.

Examples. Put X = \{a, b, c\}, Y = \{a, c\}, and Z = \{b\}. Then Y \subseteq X and Z \subseteq X, but Y \not\subseteq Z and Z \not\subseteq Y.

In mathematics, true statements are either definitions (those taken for granted) or theorems (those proven from the definitions). We will learn to prove more difficult theorems later, but for now we can prove a simple one.

Theorem 1.2.2. Every set is a subset of itself.

A proof is simply an argument that convinces its reader of the truth of a mathematical fact. Valid proofs must only use definitions and the hypotheses of the theorem; it is not correct to assume more than you are given. So the ingredients in the proof of this theorem are simply going to be a set and the definition of the word “subset.”

Proof. Let X be a set. Suppose x is an arbitrary element of X. Then, x is also an element of X. Because every element of X was shown to be an element of X, we conclude X \subseteq X. \square

Notice that the actual make-up of the set X is irrelevant. If X has an element, then X also has that element, and that is all it takes to conclude X \subseteq X. It would be incorrect to try to prove X \subseteq X from an example: just because \{0\} is a subset of itself doesn’t necessarily mean that \{0, 1\} is!

We will discuss logic and proof-writing at length as we continue. For now, let’s continue learning about sets and proving the “simple” theorems as we go.

Definition 1.2.3. The set with no elements is called the empty set, denoted \{\} or \varnothing.

With a little logical sleight-of-hand, we can prove another (much more surprising) theorem.

Theorem 1.2.4. The empty set is a subset of every set.

Proof. Let X be a set. We may conclude \varnothing \subseteq X if every element of \varnothing is an element of X. Observe that this statement is equivalent to saying that no elements of \varnothing are not elements of X. Because there are no elements of \varnothing at all, this statement is true; and we are done. \square

Perhaps you already agree with the central claim of the proof, that there is no difference between the statements “everything in \varnothing is in X” and “nothing in \varnothing is not in X.” If you do not, take it on faith for now. When we study predicate logic later you will have the tools to convince yourself of the claim.

Such a statement that relies on the emptiness of \varnothing is said to be vacuously true.

When X \subseteq Y, we say Y contains X or that Y is a superset of X. With the idea of containment in hand we can construct a new type of set.

Definition 1.2.5. Let X be a set. Its power set is the set \mathcal{P}(X) whose elements are the subsets of X. That is,
\mathcal{P}(X) = \{ \, E \, | \, E \subseteq X \, \}.

Wait - our elements are sets? Yes, and this is perfectly fine! Elements can be all sorts of things, other sets included. When we have a “set of sets” we typically refer to it as a family or a collection to avoid using the same word over and over.

Example. Let X={a,b,c}X = \{a,b,c\}. Then

\mathcal{P}(X) = \{ \varnothing, \{a\}, \{b\}, \{c\}, \{a,b\}, \{a,c\}, \{b,c\}, X \}.

Notice that the number of sets in \mathcal{P}(X) can be derived from the number of elements in X.

Theorem 1.2.6. If X is finite, then |\mathcal{P}(X)| = 2^{|X|}.

We will come back to prove this theorem much later, once we have learned how to count!
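Even without a proof in hand, we can check Theorem 1.2.6 experimentally. Here is a sketch in Python that builds the power set using the standard library's itertools.combinations; the function name power_set is our own:

```python
from itertools import combinations

def power_set(X):
    """The family of all subsets of X, each represented as a frozenset."""
    elems = list(X)
    return {frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)}

P = power_set({"a", "b", "c"})
print(len(P))            # 8 = 2 ** 3, agreeing with Theorem 1.2.6
print(frozenset() in P)  # True: Theorem 1.2.4 in action
```

We use frozenset because elements of a Python set must be immutable, which is exactly the “set of sets” situation the text just described.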

1.3 The algebra of sets

You are used to the word “algebra” as it refers to a class where you must solve some equations for x. If you think about those classes, they are really about understanding operations and functions of real numbers. Think about it: a successful algebra student will know how to manipulate sums, products, polynomials, roots, and exponential functions – ways of combining and mapping real numbers.

So, “algebra” in general refers to the ways objects may be combined and mapped. In this section, we will see many ways that sets can be combined to produce new sets.

Definition 1.3.1. Let X and Y be sets. Their union is the set X \cup Y of all elements that are in at least one of X or Y; in set-builder notation this is
X \cup Y = \{ \, x \, | \, x \in X \text{ or } x \in Y \, \}.

Definition 1.3.2. Let X and Y be sets. Their intersection is the set X \cap Y of all elements that are in both X and Y; in set-builder notation this is
X \cap Y = \{ \, x \, | \, x \in X \text{ and } x \in Y \, \}.

A useful way to visualize combinations of sets is with a Venn diagram. The Venn diagrams for union and intersection are shown below.

On the left, two sets X and Y are pictured as overlapping circles. The region within both circles together - the union of the sets - is shaded in blue. On the right, only the region that is part of both circles - their intersection - is shaded in pink.

Figure. On the left, the union of two sets; on the right, the intersection.

Example. Let X={1,2,3,4}X = \{1,2,3,4\} and Y={2,4,7,9}Y=\{2,4,7,9\}. Their union is
XY={1,2,3,4,7,9} X \cup Y = \{1,2,3,4,7,9\} and their intersection is XY={2,4}.X \cap Y = \{2,4\}. Notice that we do not repeat elements in a set. An element is either in the set or it isn’t. (A generalization, the multiset, will be developed in Chapter 11.) Therefore, XY|X \cup Y| (recall, the number of elements in the union) won’t just be X+Y|X|+|Y|.

Theorem 1.3.3. If X and Y are finite sets, then |X \cup Y| = |X| + |Y| - |X \cap Y|.

Proof. If an element is only in X, it will be counted by |X|. If an element is only in Y, it will be counted by |Y|. Therefore any elements in X \cap Y will be counted twice. There are exactly |X \cap Y| of these elements, so subtracting that amount gives |X \cup Y|. \square
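Python's built-in set type implements union (|) and intersection (&) directly, so Theorem 1.3.3 can be verified on the example from the text:

```python
X = {1, 2, 3, 4}
Y = {2, 4, 7, 9}

union = X | Y         # {1, 2, 3, 4, 7, 9}
intersection = X & Y  # {2, 4}

# Theorem 1.3.3: |X ∪ Y| = |X| + |Y| - |X ∩ Y|
print(len(union))                           # 6
print(len(X) + len(Y) - len(intersection))  # 6
```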

Definition 1.3.4. Let X and Y be sets. The difference X - Y is the set of all elements in X but not Y; i.e. X - Y = \{ \, x \, | \, x \in X \text{ and } x \not\in Y \, \}.

Example. Like before, let X = \{1, 2, 3, 4\} and Y = \{2, 4, 7, 9\}. Then X - Y = \{1, 3\}. Notice that Y - X = \{7, 9\}; just like with numbers, taking a difference is not commutative. (That is, the order matters.)

Often when working with sets there is a universal or ambient set, sometimes denoted \Omega, that all of our sets are assumed to be subsets of. This set is typically inferred from context.

Definition 1.3.5. Let X be a set contained in some universal set \Omega. The complement of X (relative to \Omega), denoted \overline{X}, is the set of all the elements not in X. In other words, \overline{X} = \Omega - X.

On the left, two sets X and Y are pictured as overlapping circles. The region within X, excluding the portion that is also part of Y, is shaded in pink. This is a picture of the difference X minus Y. On the right, a set X, pictured as a circle, is shown contained in its universal set, pictured as a rectangle. The region between the rectangle and circle is shaded in blue, representing the complement of X.

Figure. On the left, the difference of two sets shaded in pink. On the right, the complement of a set shaded in blue.

Example. Suppose X = \{a, b, c\} is contained in the universal set \Omega = \{a, b, c, d, e\}. Then \overline{X} = \{d, e\}.

Definition 1.3.6. Let X and Y be sets. Their Cartesian product (in this book, just “product”) is the set X \times Y = \{ \, (x, y) \, | \, x \in X \text{ and } y \in Y \, \}. The elements of the product are called ordered pairs.

As the name implies, the ordering of the entries in an ordered pair matters. Therefore, we will adopt the convention that unordered structures are enclosed in \{ curly braces \} and ordered structures are enclosed in ( parentheses ). The immediate consequence of this observation is that X \times Y is different from Y \times X.

A picture of the product of two sets is shown. One element of the first set and one element of the second combine to make an ordered pair in the product.

Figure. A visualization of the product of two sets.

Examples. You are likely familiar with \mathbf{R}^2 = \mathbf{R} \times \mathbf{R}, the Cartesian plane from algebra and calculus. The elements of this (infinite) set are ordered pairs (x, y) where both x and y are real numbers.

To make a discrete example, put X = \{a, b, c\} and Y = \{1, 2\}. Then
X \times Y = \{(a, 1), (a, 2), (b, 1), (b, 2), (c, 1), (c, 2)\}. As an exercise, calculate Y \times X yourself. The pair (1, a) is an element.

Theorem 1.3.7. If X and Y are finite sets, then |X \times Y| = |X| |Y|.

Proof. The elements of X \times Y may be tabulated into a rectangle, where each row corresponds to an element of X and each column corresponds to an element of Y. There are |X| rows and |Y| columns and therefore |X| |Y| elements in the table. \square
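The standard library's itertools.product enumerates exactly these ordered pairs, so both the example and Theorem 1.3.7 can be checked directly:

```python
from itertools import product

X = {"a", "b", "c"}
Y = {1, 2}

X_times_Y = set(product(X, Y))  # all ordered pairs (x, y)
print(len(X_times_Y))           # 6 = |X| * |Y|, as in Theorem 1.3.7

# Order matters: Y × X is a different set.
Y_times_X = set(product(Y, X))
print(("a", 1) in X_times_Y)    # True
print(("a", 1) in Y_times_X)    # False; (1, "a") is there instead
```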

1.4 Operations involving more than two sets

Note: While this section is cool, we will not make use of its content very often in this course. You may choose to skip this section for now and return to it after Chapter 8, when you are more comfortable with the material.

There is one last thing to say about sets (for now, anyway). There is no reason to limit ourselves to two sets in union, intersection, and product. If we have a family of n sets X_1, X_2, \ldots, X_n, we may take n-ary unions, intersections, and products.

The n-ary union \bigcup_{i=1}^n X_i = X_1 \cup X_2 \cup \cdots \cup X_n is the set of elements that are in at least one of the sets X_i.

The n-ary intersection \bigcap_{i=1}^n X_i = X_1 \cap X_2 \cap \cdots \cap X_n is the set of elements that are in all of the sets X_i.

The n-ary product \prod_{i=1}^n X_i = X_1 \times X_2 \times \cdots \times X_n is the set of ordered n-tuples (x_1, x_2, \ldots, x_n), where the element x_i is a member of the set X_i.

Suppose instead the family X_1, X_2, \ldots is infinite. We will, exactly once in this book, need to take an infinite union. The infinite union \bigcup_{i=1}^\infty X_i = X_1 \cup X_2 \cup \cdots is the set of all elements that are in at least one of the X_i. (Notice that the definition hasn’t changed – just the number of sets!) Likewise the infinite intersection \bigcap_{i=1}^\infty X_i = X_1 \cap X_2 \cap \cdots is the set of elements in all the X_i.
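For a finite family, the n-ary operations are just the binary operations folded across the family. A sketch in Python, using a small family of our own invention:

```python
from functools import reduce
from itertools import product

family = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]

# n-ary union and intersection: fold the binary operations over the family.
big_union = set().union(*family)
big_intersection = reduce(set.intersection, family)
print(sorted(big_union))  # [1, 2, 3, 4, 5]
print(big_intersection)   # {3}

# n-ary product: ordered triples (x1, x2, x3) with xi drawn from the ith set.
triples = set(product(*family))
print(len(triples))       # 27 = 3 * 3 * 3
```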


Chapter 1 problems

2. Propositional logic


In mathematics, we spend a lot of time worried about whether a sentence is true or false. This is fine in the case of a simple sentence such as “The number 2 is prime.” (That sentence is true.) But there are much more complicated sentences lurking around, and whole constellations of sentences whose truth depends on one another. (An unsolved problem in mathematics is deciding whether the Riemann hypothesis, a statement about the prime numbers, is true. Most people believe it is, and many results in the literature are true given the hypothesis. The Riemann hypothesis is going to make or break a lot of research.)

So, we are interested in tools that will allow us to judge whether complicated statements are true. Such models of truth are called logic.

2.1 Statements and connectives

Definition 2.1.1. A statement is a sentence that is either true or false.

Examples. “The number 2 is prime” and “It is perfectly legal to jaywalk” are statements. Non-statements include questions (“Is it raining?”), commands (“Do your homework”), and opinions (“Mint chocolate chip ice cream is the superior flavor”). It is important to note, not necessarily for mathematical purposes, that opinions are statements of pure taste. Claims whose truth values exist but are unknown are beliefs.

Atomic propositional statements of the type discussed in this chapter are flatly always true or false. You may think of them in the same way you think of constant functions in algebra. They are denoted with lowercase letters p, q, r, etc. If the only statements were atomic, logic would be boring. Fortunately, we can combine statements with connectives. In the next chapter, we will look at predicate statements whose truth values depend on an input. Likewise, these may be thought of as the non-constant functions.

Compound statements arise from combining atomic statements with connectives. They are denoted using Greek letters like \varphi (“phi”), \psi (“psi”), and \gamma (“gamma”). The truth of a compound statement depends on the connectives involved and the truth values of the constituent statements. A tool for determining the truth of a compound statement is called a truth table.

In a truth table for the compound statement \varphi, the left side of the table is the atomic statements involved in \varphi and the right side of the table is \varphi itself. Each row gives a possible combination of truth values for the atomic statements, and the corresponding truth value for \varphi. At this point, it is easier to start introducing connectives and seeing some truth tables.

Definition 2.1.2. Let p be a statement. Its negation is the statement \neg p, the statement that is true exactly when p is false and vice-versa.

The statement \neg p is typically read “not p” or “it is not the case that p is true.” If we have an English rendering of p, we try to write \neg p in a natural-sounding way that also conveys its logical structure.

Example. Suppose p represents the statement “It is raining.” Then, \neg p stands for “It is not raining.”

Below is the truth table for \neg p.

p | \neg p
T | F
F | T

As you can see, on the left are columns corresponding to the atomic statements involved in \neg p (just p), and each row is a possible combination of truth values for these statements. Here is a quick, useful theorem without proof:

Theorem 2.1.3. If the compound statement \varphi involves n different atomic statements, then its truth table has 2^n rows.

Look familiar? We will prove this theorem when we prove the related theorem about power sets later on.

Negation is called a unary connective because it only involves one statement. Our remaining connectives are all binary connectives, as they combine two statements.

Definition 2.1.4. Let p and q be statements. Their conjunction is the statement p \wedge q, which is true only when p and q are both true.

The symbol in the conjunction is called a “wedge,” and p \wedge q is read “p and q.” This definition is meant to be analogous to our English understanding of the word “and.” The statements p and q are called conjuncts.

Example. Letting p be “It is raining” and q be “I brought an umbrella,” the statement p \wedge q is “It is raining and I brought an umbrella.” The statement is true only if it is currently raining and the speaker brought an umbrella. If it is not raining, or if the speaker did not bring their umbrella, the conjunction is false.

Here is the truth table for p \wedge q:

p | q | p \wedge q
T | T | T
T | F | F
F | T | F
F | F | F

Notice that this table is organized intentionally, not haphazardly. The rows are divided into halves, where p is true in one half and then false in the other half. Each of these halves is further subdivided into half where q is true and half where q is false. If there were a third letter, we would subdivide the rows again, and so on. This method of organizing the truth table allows us to ensure we did not forget a row. Our leftmost atomic letter column will always be half true followed by half false, and the rightmost atomic letter column will always alternate between true and false.
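The ordering just described is exactly what systematic enumeration produces. A sketch in Python, where itertools.product lists the rows in textbook order (the helper rows is our own):

```python
from itertools import product

def rows(n):
    """All 2**n combinations of truth values for n letters, in textbook
    order: the first letter is true for the top half of the rows, and
    the last letter alternates between true and false."""
    return list(product([True, False], repeat=n))

for p, q in rows(2):
    print(p, q, p and q)  # reproduces the truth table for p ∧ q
```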

Definition 2.1.5. Let p and q be statements. Their disjunction is the statement p \vee q, which is true if at least one of p or q is true.

The symbol in the disjunction is called a “vee” and p \vee q is read “p or q.” This definition is meant to be analogous to our English understanding of the word “or.” The statements p and q are called disjuncts. (Did you notice that I was able to copy and paste from an above paragraph? Conjunction and disjunction share a very similar structure.)

Example. Letting p be “It is raining” and q be “I brought an umbrella,” the statement p \vee q is “It is raining or I brought an umbrella.” This time, the speaker only needs to meet one of their conditions. They are telling the truth as long as it is raining or they brought an umbrella, or even if both statements are true.

The truth table for p \vee q follows.

p | q | p \vee q
T | T | T
T | F | T
F | T | T
F | F | F

Of great importance to computer science and electrical engineering, but not so much in mathematics (so it will not appear again in this book), is the exclusive disjunction (or “exclusive-or”). It is a modified disjunction where exactly one of the disjuncts must be true.

Definition 2.1.6. If p and q are statements, their exclusive disjunction is the statement p \oplus q, which is true if and only if exactly one of p and q is true.

The statement p \oplus q is read “p exclusive-or q.”

Example. The statement “You can have the soup or the salad” is likely meant as an exclusive disjunction.

In mathematics, disjunctions are generally considered inclusive unless stated otherwise. (An example for us all to follow!)

Our next connective is probably the most important, as it conveys the idea of one statement implying another. As we discussed earlier in the chapter, mathematics is nothing but such statements. It is also the only connective that is not relatively easy to understand given its English interpretation. So we will begin with the truth table, and think carefully about each row.

Definition 2.1.7. If p and q are statements, the conditional statement p \to q is true if p can never be true while q is false.

There are many ways to read p \to q, including “if p, then q,” “p implies q,” and “p is sufficient for q.” The statement before the arrow is the antecedent and the following statement is the consequent. Here is the conditional’s truth table.

p | q | p \to q
T | T | T
T | F | F
F | T | T
F | F | T

That p \to q is true in the bottom two rows may surprise you. So, let’s consider each row separately.

Example. Let p be “You clean your room” and q be “I will pay you $10,” so that p \to q is “If you clean your room, then I will pay you $10.”

In the case that both p and q are true, the speaker has told the truth. You cleaned, and you were paid. All is well.

In the case that p is true but q is false, the speaker has lied. You cleaned the room, but you did not earn your money.

What if p is false? Well, then the speaker did not lie. The speaker said that if you clean your room you will be paid. If the room isn’t cleaned, then p \to q cannot be falsified, no matter whether q is true or false.

If that explanation is not sufficient for you, reread the definition of the conditional: p \to q being true means that p cannot be true without q. If you are unsatisfied by the English connection, then consider p \to q to be a purely formal object whose definition is the above truth table.
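Viewed as a purely formal object, the conditional can be computed as “not p, or q,” which reproduces the table above. A sketch in Python (the helper implies is our own):

```python
def implies(p, q):
    """The conditional p → q: false only when p is True and q is False."""
    return (not p) or q

for p in (True, False):
    for q in (True, False):
        print(p, q, implies(p, q))
# Only the row p=True, q=False prints False; the two rows with a
# false antecedent are (vacuously) True.
```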

Definition 2.1.8. If p and q are statements, the biconditional statement p \leftrightarrow q is true if p and q are both true, or both false.

The statement p \leftrightarrow q may be read as “p if and only if q,” “p is necessary and sufficient for q,” or “p is equivalent to q.” Here is the truth table.

p | q | p \leftrightarrow q
T | T | T
T | F | F
F | T | F
F | F | T

Example. Let p be “You clean your room” and q be “I will pay you $10,” so that p \leftrightarrow q is “I will give you $10 if and only if you clean your room.” (The statements were reordered for readability. We will see later in the chapter that this is the same statement, but this wouldn’t work for p \to q.) This time, the clean room and the $10 paycheck are in total correspondence; one doesn’t happen without the other.

2.2 Connecting connectives

We are not required to use one connective at a time, of course. Most statements involve two or more connectives being combined at once. If \varphi and \psi are statements, then so are: \neg \varphi, \varphi \wedge \psi, \varphi \vee \psi, \varphi \to \psi, and \varphi \leftrightarrow \psi. This fact allows us to combine as many connectives and statements as we like.

How should we read the statement p \wedge q \to r? Should it be read “p; also, if q then r”? Or, “If p and q, then r”? Just like with arithmetic, there are conventions defining which connective to apply first: negation, then conjunction/disjunction, then implication, then bi-implication. So, “If p and q, then r” is correct. However, to eliminate confusion in this book we will always use parentheses to make our statements clear. Therefore, we would write this statement as (p \wedge q) \to r.

Remember that a statement involving n letters will have 2^n rows in its truth table. The column for the left-most statement letter should be true for half the rows then false for half the rows, and you should continue dividing the rows in half so that the right-most statement letter’s column is alternating between true and false.

Finally, it is a good idea to break a statement into small “chunks” and make a column for each “chunk” separately. Observe the following example.

Example. Write down the truth table for the statement (p \wedge q) \to (r \to \neg s). When is this statement false?

Since the statement has 4 letters, there will be 2^4 = 16 rows to the truth table. Let’s go through the first row very slowly, where all four atomic statements are true. Then p \wedge q is true and \neg s is false. Since r is true and \neg s is false, that means r \to \neg s is false. Finally, because its antecedent is true and its consequent is false, the compound statement (p \wedge q) \to (r \to \neg s) would be false.

Here is the full truth table. Be sure you understand how each row is calculated.

p | q | r | s | p \wedge q | \neg s | r \to \neg s | (p \wedge q) \to (r \to \neg s)
T | T | T | T | T | F | F | F
T | T | T | F | T | T | T | T
T | T | F | T | T | F | T | T
T | T | F | F | T | T | T | T
T | F | T | T | F | F | F | T
T | F | T | F | F | T | T | T
T | F | F | T | F | F | T | T
T | F | F | F | F | T | T | T
F | T | T | T | F | F | F | T
F | T | T | F | F | T | T | T
F | T | F | T | F | F | T | T
F | T | F | F | F | T | T | T
F | F | T | T | F | F | F | T
F | F | T | F | F | T | T | T
F | F | F | T | F | F | T | T
F | F | F | F | F | T | T | T

There are tricks to calculate truth tables more efficiently. However, it wouldn’t do you any good to read them; they must come with practice.
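As a check on the table above, here is a short Python sketch that searches all 16 rows for those where the statement is false; implies is our own helper for the conditional:

```python
from itertools import product

def implies(p, q):
    """The conditional: false only when p is True and q is False."""
    return (not p) or q

# All rows of the truth table for (p ∧ q) → (r → ¬s) where it is false.
false_rows = [
    (p, q, r, s)
    for p, q, r, s in product([True, False], repeat=4)
    if not implies(p and q, implies(r, not s))
]
print(false_rows)  # [(True, True, True, True)]: false only in the top row
```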

2.3 Inference and equivalence

Mathematics is all about relationships. If the statement \varphi \to \psi is always true, then it tells us that whenever \varphi is true then \psi must be as well. If the statement \varphi \leftrightarrow \psi is always true, then it tells us that from the point of view of logic \varphi and \psi are actually the same statement. That is, it is never possible for them to have different truth values.

Definition 2.3.1. A tautology is a statement that is always true, denoted T.

Definition 2.3.2. A contradiction is a statement that is always false, denoted F.

Definition 2.3.3. A rule of inference is a tautological conditional statement. If \varphi \to \psi is a rule of inference, then we say \varphi implies \psi and write \varphi \Rightarrow \psi.

Remember that a conditional statement does not say that its antecedent and consequent are both true. It says that if its antecedent is true, its consequent must also be true.

There are many rules of inference, but we will explore two.

Definition 2.3.4. Modus ponens is the statement that says we may infer q from p \to q and p. Symbolically: [(p \to q) \wedge p] \to q.
Example. Consider the conditional statement “If you clean your room, then I will give you $10.” If the speaker is trustworthy, and we clean our room, then we may conclude we will be receiving $10.

Another example: “If a function is differentiable, then it is continuous.” (You may know the meanings of these words, but it is not important.) If this statement is true, and the function f is differentiable, then without any further work we know that f is continuous.

Theorem 2.3.5. Modus ponens is a rule of inference.

Notice that in the definition no claim is made that modus ponens is always true. That is up to us.

Since we know how to use truth tables, we may construct one to show that [(p \to q) \wedge p] \to q is always true. Before reading the table below, try to make your own!

pp qq (pq)(p \to q) (pq)p(p \to q) \wedge p [(pq)p]q[(p \to q) \wedge p] \to q
T T T T T
T F F F T
F T T F T
F F T F T

Proof. The above table shows that [(pq)p]q[(p \to q) \wedge p] \to q is true no matter the truth of pp and qq. \square
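Truth-table checks like this one are entirely mechanical, so they are easy to hand to a computer. Here is a short Python sketch (the helper names `implies` and `is_tautology` are ours, invented for illustration) that brute-forces the table for modus ponens:

```python
from itertools import product

def implies(a, b):
    # The conditional a -> b is false only when a is true and b is false.
    return (not a) or b

def is_tautology(statement, n_vars):
    # A tautology is true on every row of its truth table.
    return all(statement(*row) for row in product([True, False], repeat=n_vars))

# Modus ponens: [(p -> q) and p] -> q
modus_ponens = lambda p, q: implies(implies(p, q) and p, q)
print(is_tautology(modus_ponens, 2))  # True
```

Note that a bare conditional like `implies(p, q)` is not a tautology, and the checker will say so.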

Definition 2.3.6. In the context of conditional statements, transitivity is the rule that if pp implies qq and qq implies rr, then pp implies rr. Or, [(pq)(qr)](pr).[(p \to q) \wedge (q \to r)] \to (p \to r).

The wording of the definition above should make you suspicious that this is not the only time we will see transitivity. In fact, you may have seen it before…

Example. If you give a mouse a cookie, then he will ask for a glass of milk. If he asks for a glass of milk, then he will also ask for a straw. Therefore, if you give a mouse a cookie then he will ask for a straw.

Theorem 2.3.7. Transitivity is a rule of inference.

We could simply write down another truth table and call it a day. In fact, go ahead and make sure you can prove transitivity with a truth table. Once you’re done, we’ll try something slicker to prove transitivity.

Proof. Suppose that pqp \to q and qrq \to r are true statements, and that moreover pp is true. By modus ponens, qq must also be true. Then, rr is true, by a second application of modus ponens. Therefore, if pp is true, rr must be also; this is the definition of prp \to r. \square

Pretty cool, right? We can cite proven theorems to prove a new one.

Does it work the other way? In other words: if I know that [(pq)(qr)](pr)[(p \to q) \wedge (q \to r)] \Rightarrow (p \to r), does that mean (pr)[(pq)(qr)](p \to r) \Rightarrow [(p \to q) \wedge (q \to r)]? Regrettably, no: consider the case where pp and rr are true but qq is false.
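A computer can hunt for such counterexamples for us. This Python sketch (names ours) searches all eight rows for a case where the would-be converse fails, i.e. where pr p \to r is true but the conjunction of the two conditionals is not:

```python
from itertools import product

def implies(a, b):
    # Truth table of the conditional.
    return (not a) or b

# Rows where (p -> r) is true but (p -> q) and (q -> r) is false:
# each one witnesses that the converse is not a rule of inference.
counterexamples = [
    (p, q, r)
    for p, q, r in product([True, False], repeat=3)
    if implies(p, r) and not (implies(p, q) and implies(q, r))
]
print(counterexamples)  # [(True, False, True), (False, True, False)]
```

The first counterexample found is exactly the case mentioned above.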

Some rules of inference do work both ways. These are called equivalences.

Definition 2.3.8. If φψ\varphi \leftrightarrow \psi is a tautology, then it is an equivalence and φ\varphi and ψ\psi are equivalent statements. We write φψ\varphi \equiv \psi.

Equivalent statements are not the same statements, but they are close enough from the point of view of logic. Equivalent statements are interchangeable, and may be freely substituted with one another.

We will quickly run through many equivalences before slowing down to prove a few.

Algebraic properties of conjunction, disjunction, and negation

Our first batch of statements should look familiar to you; these are the ways that statements “act like” numbers. As you read each one, try to justify to yourself why it might be true. If you aren’t sure of one, make a truth table.

Equivalence Name
p¬(¬p)p \equiv \neg (\neg p) Double negation
pqqpp \wedge q \equiv q \wedge p Commutativity of conjunction
pqqpp \vee q \equiv q \vee p Commutativity of disjunction
(pq)rp(qr)pqr(p \wedge q) \wedge r \equiv p \wedge (q \wedge r) \equiv p \wedge q \wedge r Associativity of conjunction
(pq)rp(qr)pqr(p \vee q) \vee r \equiv p \vee (q \vee r) \equiv p \vee q \vee r Associativity of disjunction
pTpp \wedge T \equiv p Conjunctive identity
pFpp \vee F \equiv p Disjunctive identity
pFFp \wedge F \equiv F Conjunctive absorption
pTTp \vee T \equiv T Disjunctive absorption
p¬pFp \wedge \neg p \equiv F Conjunctive cancellation
p¬pTp \vee \neg p \equiv T Disjunctive cancellation
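Every row of this table can be verified by comparing truth tables exhaustively. A Python sketch (`and`, `or`, `not` stand for conjunction, disjunction, and negation; the helper `equivalent` is our own name):

```python
from itertools import product

def equivalent(f, g, n_vars):
    # Two statements are equivalent when their truth tables agree on every row.
    return all(f(*row) == g(*row) for row in product([True, False], repeat=n_vars))

assert equivalent(lambda p: p, lambda p: not (not p), 1)          # double negation
assert equivalent(lambda p, q: p and q, lambda p, q: q and p, 2)  # commutativity
assert equivalent(lambda p: p and False, lambda p: False, 1)      # conjunctive absorption
assert equivalent(lambda p: p or (not p), lambda p: True, 1)      # disjunctive cancellation
print("equivalences verified")
```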

Distributive properties

With numbers, multiplication distributes over addition. This fact is likely implemented twice in your head, once for a number: a(b+c)=ab+ac,a(b+c) = ab + ac, and once for “the negative” (truly the number 1-1): (b+c)=bc.-(b+c) = -b - c. However, these rules are one and the same: the distributive law. Likewise, we will have distributive laws for negation, conjunction, and disjunction.

Equivalence Name
¬(pq)¬p¬q\neg (p \wedge q) \equiv \neg p \vee \neg q DeMorgan’s law
¬(pq)¬p¬q\neg (p \vee q) \equiv \neg p \wedge \neg q DeMorgan’s law
p(qr)(pq)(pr)p \wedge (q \vee r) \equiv (p \wedge q) \vee (p \wedge r) Conjunction over disjunction
p(qr)(pq)(pr)p \vee (q \wedge r) \equiv (p \vee q) \wedge (p \vee r) Disjunction over conjunction

DeMorgan’s law is one of the most important facts in basic logic. In fact, we will see it three times before we are through. In English, DeMorgan’s law says that the negation of a conjunction is a disjunction and vice-versa.

To show that ¬(pq)\neg (p \wedge q) and ¬p¬q\neg p \vee \neg q are equivalent, we can show that the biconditional statement ¬(pq)¬p¬q\neg (p \wedge q) \leftrightarrow \neg p \vee \neg q is a tautology. Thinking about the definition of the biconditional gives us another way: we can show that the truth tables of ¬(pq)\neg (p \wedge q) and ¬p¬q\neg p \vee \neg q are the same; in other words, that both statements are always true and false for the same values of pp and qq.

Proof (of DeMorgan’s law for conjunction). Let pp and qq be statements. The statements ¬(pq)\neg (p \wedge q) and ¬p¬q\neg p \vee \neg q are equivalent, as shown by the below truth table.

pp qq ¬(pq)\neg (p \wedge q) ¬p¬q\neg p \vee \neg q
T T F F
T F T T
F T T T
F F T T

Therefore, ¬(pq)¬p¬q\neg (p \wedge q) \equiv \neg p \vee \neg q. \square

You’ll notice a few columns are skipped in the above truth table. After a few tables, it is more convenient to skip “easy” columns. For example, we know that pqp \wedge q is true only when pp and qq are both true, and false everywhere else. Therefore, we can “flip” that column to quickly get the column for ¬(pq)\neg (p \wedge q).

As an exercise, prove DeMorgan’s other law.
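If you would like to check your table afterward, both of DeMorgan’s laws can be confirmed by brute force; here is one way in Python:

```python
from itertools import product

rows = list(product([True, False], repeat=2))

# The negation of a conjunction is a disjunction, and vice versa.
law_and = all((not (p and q)) == ((not p) or (not q)) for p, q in rows)
law_or = all((not (p or q)) == ((not p) and (not q)) for p, q in rows)
print(law_and, law_or)  # True True
```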

Properties of the conditional

Finally, we come to the properties of conditional (and biconditional) statements. These do not have “obvious” numerical counterparts, but are useful all the same.

Equivalence Name
pq¬pqp \to q \equiv \neg p \vee q Material implication
¬(pq)p¬q\neg (p \to q) \equiv p \wedge \neg q False conditional
pq¬q¬pp \to q \equiv \neg q \to \neg p Contraposition
pq(pq)(qp)p \leftrightarrow q \equiv (p \to q) \wedge (q \to p) Mutual implication

Material implication is the surprising result that the conditional is just a jumped-up disjunction! In fact, DeMorgan’s law, material implication, and mutual implication together prove that all of the connectives we’ve studied can be implemented with just two: negation and one other. However, this would make things difficult to read, so we allow the redundancy of five connectives.

Let’s prove material implication with a truth table, and then we will prove contraposition with another method.

Proof (of material implication). Let pp and qq be statements and observe that the truth tables for pqp \to q and ¬pq\neg p \vee q are the same:

pp qq pqp \to q ¬pq\neg p \vee q
T T T T
T F F F
F T T T
F F T T

Therefore, pq¬pqp \to q \equiv \neg p \vee q. \square

There is another way to think about material implication. The statement pqp \to q means that pp never happens without qq. Another way to say this is that qq happens, or else pp did not; so, ¬pq\neg p \vee q.
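To check material implication in code without circularity, we can write the conditional’s truth table out literally as a lookup table (the dictionary `COND` is our encoding of the table, not standard notation) and compare it to the disjunction:

```python
# The conditional's truth table, written out row by row.
COND = {(True, True): True, (True, False): False,
        (False, True): True, (False, False): True}

# Material implication: p -> q matches (not p) or q on every row.
material = all(COND[p, q] == ((not p) or q) for (p, q) in COND)

# False conditional: not (p -> q) matches p and (not q).
false_cond = all((not COND[p, q]) == (p and (not q)) for (p, q) in COND)

print(material, false_cond)  # True True
```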

Before we prove contraposition by the other method, we must justify said method with a theorem.

Theorem 2.3.9. Statements φ\varphi and ψ\psi are equivalent if and only if φψ\varphi \Rightarrow \psi and ψφ\psi \Rightarrow \varphi are both rules of inference.

The phrase “if and only if” in a mathematical statement means that the statement “works both ways.” In this case, φψ\varphi \equiv \psi means φψ\varphi \Rightarrow \psi and ψφ\psi \Rightarrow \varphi are both rules of inference; and conversely, φψ\varphi \Rightarrow \psi and ψφ\psi \Rightarrow \varphi both being rules of inference means φψ\varphi \equiv \psi. That is, “φψ\varphi \equiv \psi” and “φψ\varphi \Rightarrow \psi and ψφ\psi \Rightarrow \varphi are both rules of inference” are equivalent statements. Fortunately, we already have a rule that justifies this equivalence.

Proof (of Theorem 2.3.9). Mutual implication says that pq(pq)(qp)p \leftrightarrow q \equiv (p \to q) \wedge (q \to p). \square

It’s that easy. Whenever we have a conditional statement and its converse (the order reversed), we have the biconditional; likewise, we may trade in a biconditional for two converse conditional statements.

Now we will prove contraposition by showing that pqp\to q implies ¬q¬p\neg q \to \neg p, and vice-versa. Before you read the below proof, write down these rules: material implication, double negation, and commutativity of disjunction.

Proof (of contraposition). Suppose pp and qq are statements and that pqp \to q is true. By material implication, ¬pq\neg p \vee q is true. Commutativity says this is the same as q¬pq \vee \neg p. So, we may apply material implication backwards, negating qq this time, to get ¬q¬p\neg q \to \neg p; therefore, pqp \to q implies ¬q¬p\neg q \to \neg p.

Conversely, suppose ¬q¬p\neg q \to \neg p is true. Then material implication (and double negation) gives q¬pq \vee \neg p. We may commute the disjunction and apply material implication again, negating ¬p\neg p, to get pqp \to q.

By mutual implication, pq¬q¬pp \to q \equiv \neg q \to \neg p. \square
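Contraposition and mutual implication can also be double-checked by brute force; in the sketch below, `==` on booleans plays the role of the biconditional:

```python
def implies(a, b):
    # Truth table of the conditional.
    return (not a) or b

rows = [(p, q) for p in (True, False) for q in (True, False)]

contraposition = all(implies(p, q) == implies(not q, not p) for p, q in rows)
# The biconditional p <-> q is just equality of truth values.
mutual = all((implies(p, q) and implies(q, p)) == (p == q) for p, q in rows)
print(contraposition, mutual)  # True True
```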


Chapter 2 problems

3. Predicate logic


3.1 Predicates

Basic propositional logic as you learned it in the preceding chapter is powerful enough to give us the tools we will need to prove many theorems. However, it is deficient in one key area. Consider the following example.

Example. If Hypatia is a woman and all women are mortal, then Hypatia is mortal.

We can all agree that this statement is a tautology; that is, if it is true that Hypatia is a woman (she was; she was a Greek mathematician who wrote commentary on Diophantus) and if it is true that all women are mortal (as far as we know this is the case) then it must follow that Hypatia is mortal (she was murdered in 415).

If the statement is a tautology, a truth table should confirm it. Let (pq)r(p \wedge q) \to r symbolize the statement, where pp stands for “Hypatia is a woman,” qq for “all women are mortal,” and rr for “Hypatia is mortal.”

pp qq rr pqp\wedge q (pq)r(p \wedge q)\to r
T T T T T
T T F T F
T F T F T
T F F F T
F T T F T
F T F F T
F F T F T
F F F F T

Oh no! The second row shows that the statement is false when rr is false but pp and qq are true. What gives?

Well, consider the statement “If pigs fly and the Pope turns into a bear, then Bob will earn an A on the exam.” Now it is not so clear that this is a tautology, because pigs, the Pope, and Bob’s grade presumably have nothing to do with one another.

But Hypatia and the mortality of women do. What we will need is the ability to signify that two statements are related, whether in subject (e.g. Hypatia) or predicate (e.g. mortality). This is where predicate logic comes in to help.

Definition 3.1.1. A predicate is a function P:Ω{T,F}P:\Omega \to \{T,F\} that takes one or more subjects (members of some domain Ω\Omega) and assigns to them all either true TT or false FF.

Predicates are typically denoted with capital letters like PP, QQ, and RR. Subjects that are known are given lowercase letters like aa, bb, and cc, unless there is something more reasonable to pick like hh for Hypatia. Subjects whose values are unknown or arbitrary are denoted with lowercase xx, yy, zz, etc.

The definition of predicate above is much more complicated than it needs to be; this is so that we can take the central idea (a function) and continue using it throughout the book. You know a function: an assignment f(x)f(x) that gives every input xx exactly one output f(x)f(x). We write f:XYf:X \to Y to signify that ff is a function whose inputs are elements of the set XX and whose outputs are elements of the set YY.

So, in the above definition, the predicate PP is a function that takes a subject, a member of the set Ω\Omega, and assigns it exactly one truth value.

(Take note that propositions, which you studied in the last chapter, may be regarded as constant predicates – functions whose value is the same regardless of the input, like f(x)=3f(x)=3.)

Examples. Letting W(x)W(x) be the predicate "xx is a woman" on the domain of humans and hh stand for Hypatia, W(h)W(h) is the statement “Hypatia is a woman.” Since this statement is true, the function WW assigns TT to hh.

Let P(x)P(x) be the predicate “x+3=5x + 3 = 5” on the domain Z\mathbf{Z} and let Q(x,y)Q(x,y) be the predicate "xx orbits yy" on the domain of celestial objects in the sky. The statement P(2)P(2) is true, but P(3)P(3) is false. If ee stands for the earth and ss for the sun, then Q(s,e)Q(s,e) is false but Q(e,s)Q(e,s) is true.

A predicate, once evaluated to a subject, is a statement. Therefore, these statements may be combined into compound statements using the logical connectives we have just studied.

Examples. As before let P(x)P(x) be the predicate “x+3=5x + 3 = 5” on the domain Z\mathbf{Z} and let Q(x,y)Q(x,y) be the predicate "xx orbits yy" on the domain of celestial objects in the sky, where ee stands for the earth and ss for the sun.

The statements P(2)Q(s,e)P(2) \vee Q(s,e), ¬P(3)\neg P(3), and Q(s,e)P(2)Q(s,e) \to P(2) are true.

The statements P(2)Q(s,e)P(2) \wedge Q(s,e) and P(2)Q(s,e)P(2) \to Q(s,e) are false.

3.2 Quantifiers

This is all well and good, you think, but what about the statement “All women are mortal”? How can we make “all women” the subject of a statement?

Quantifiers allow us to say that a predicate is true for all, or for at least one undetermined, object in its domain.

Definition 3.2.1. Let P(x)P(x) be a predicate. The universal statement xP(x)\forall x P(x) says that P(x)P(x) is true for all xx in its domain. The symbol \forall (“for all”) is called the universal quantifier.

Example. The domain is very important. If P(x)P(x) is the predicate "xx takes discrete math" then xP(x)\forall x P(x) – “Everyone takes discrete math” – is true on the domain of computer science students at Coastal Carolina, but not, say, on the domain of all people in South Carolina.

Definition 3.2.2. Let P(x)P(x) be a predicate. The existential statement xP(x)\exists x P(x) says that P(x)P(x) is true for at least one xx in its domain. The symbol \exists (“there exists”) is called the existential quantifier.

Example. Again let P(x)P(x) be the predicate "xx takes discrete math". The existential statement xP(x)\exists x P(x) says “Someone takes discrete math.” This is now true for people in South Carolina – pick a student at Coastal, or Clemson, or USC – but not true on the domain of dogs. Wouldn’t that be cute, though?

The statement operated on by a quantifier is called its scope. Without parentheses, a quantifier is considered to apply only to the statement it is next to and no others.

Examples. Let P(x)P(x) be the predicate "xx takes discrete math" and Q(x)Q(x) be the predicate "xx is a math major".

There are two important quantified statements: “All PP's are QQ's” and "Some PP is a QQ". These statements are all over the place in mathematics. We like to characterize mathematical objects by saying all objects of one type also belong to another; or, by saying we can find an object of one type that is another type.

In “All PP's are QQ's”, the word “all” suggests that the universal quantifier is appropriate. What connective should be used to combine the predicates? Observe that the statement x(P(x)Q(x))\forall x (P(x) \wedge Q(x)) says that every xx is both a PP and a QQ. This isn’t what we mean to say; we mean to say if something is a PP then it will also be a QQ. So, the correct rendering of “All PP's are QQ's” is x(P(x)Q(x))\forall x (P(x) \to Q(x)).

For the statement “Some PP's are QQ's”, we mean to say that there is an object that is at once a PP and a QQ. Therefore, this time the conjunction is appropriate: we write x[P(x)Q(x)]\exists x [P(x) \wedge Q(x)].

Statement Implementation
“All PP's are QQ's” x(P(x)Q(x))\forall x (P(x) \to Q(x))
“Some PP's are QQ's” x(P(x)Q(x))\exists x (P(x) \wedge Q(x))
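On a finite domain, Python’s built-ins `all` and `any` behave exactly like the universal and existential quantifiers, so both renderings can be written directly. The domain and predicates below are invented for illustration:

```python
domain = [1, 2, 3, 4, 5, 6]
P = lambda x: x % 2 == 0   # "x is even"
Q = lambda x: x > 1        # "x is greater than 1"

# "All P's are Q's": for all x, P(x) -> Q(x)
all_Ps_are_Qs = all((not P(x)) or Q(x) for x in domain)

# "Some P is a Q": there exists x with P(x) and Q(x)
some_P_is_Q = any(P(x) and Q(x) for x in domain)

print(all_Ps_are_Qs, some_P_is_Q)  # True True
```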

Finally, what is to be done with statements with two or more subjects? The ordering of the quantifiers matters. Here are the four possible combinations of universal and existential quantifiers for two subjects, as well as a diagram corresponding to each.

Examples. Let R(x,y)R(x,y) be the predicate "xx and yy are classmates" over the domain of students at some school. Note that, in this case, the order of the subjects does not affect the truth of the statement.

The statement xyR(x,y)\exists xy R(x,y) means “There are two students who are classmates.” In other words, there is at least one student who is paired with at least one other student. Some writers say xyR(x,y)\exists x \exists y R(x,y) instead, but when the quantifiers are the same we will “collapse” the notation.


Figure. On the left, one object is paired with one object (existential-existential). On the right, one object is paired with many objects (existential-universal).

The statement xyR(x,y)\exists x \forall y R(x,y) means “There is a student who is classmates with every student.” We have one student who (somehow!) is classmates with the entire student body.

The statement xyR(x,y)\forall x \exists y R(x,y) means “Every student has a classmate.” Notice that their classmate need not be the same person. If we quantify universally first, then each existential object may be different. In the preceding example, we quantified existentially first, so that one person was classmates with everybody.


Figure. On the left, many objects are each paired with an object (universal-existential). On the right, many objects are paired with many objects (universal-universal).

The statement xyR(x,y)\forall xy R(x,y) means “Every student is classmates with every other student.” Every object in our domain is paired up with every other object in the domain, including themselves! (Don’t think too hard about that for this example.)
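The four combinations can be explored concretely by nesting `all` and `any`. The tiny “classmates” relation below is made up so that one student happens to be classmates with everybody:

```python
students = ["ann", "bob", "cat"]
# Made-up relation: ann is classmates with everyone (including herself).
pairs = {("ann", s) for s in students} | {(s, "ann") for s in students}
R = lambda x, y: (x, y) in pairs

exists_forall = any(all(R(x, y) for y in students) for x in students)  # exists x, for all y
forall_exists = all(any(R(x, y) for y in students) for x in students)  # for all x, exists y
forall_forall = all(all(R(x, y) for y in students) for x in students)  # for all x, for all y

# exists-forall holds (ann), so forall-exists holds too,
# but forall-forall fails (bob and cat are not classmates).
print(exists_forall, forall_exists, forall_forall)  # True True False
```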

3.3 Inferences and equivalences for quantifiers

Quantifiers evaluated on a predicate are statements, since they are functions to {T,F}\{T,F\}. Therefore, some quantified statements imply others, and some pairs imply each other. In this section we will learn a few important rules for quantifiers, before finally proving that our opening statement about Hypatia is indeed a tautology.

Theorem 3.3.1 (DeMorgan’s laws) Let P(x)P(x) be a predicate. Then ¬xP(x)x¬P(x)\neg \forall x P(x) \equiv \exists x \neg P(x) and ¬xP(x)x¬P(x)\neg \exists x P(x) \equiv \forall x \neg P(x).

Does that name look familiar? Here is our second application of DeMorgan’s laws. (The third will appear in the exercises.) You will see how they are used in the proof.

Proof. Consider the universal statement xP(x)\forall x P(x) on the domain Ω={x1,x2,xn}\Omega = \{x_1, x_2, \ldots x_n\}. We may rewrite xP(x)=P(x1)P(x2)P(xn)\forall x P(x) = P(x_1) \wedge P(x_2) \wedge \cdots \wedge P(x_n)
and xP(x)=P(x1)P(x2)P(xn).\exists x P(x) = P(x_1) \vee P(x_2) \vee \cdots \vee P(x_n).
Therefore, ¬xP(x)=¬(P(x1)P(x2)P(xn)).\neg \forall x P(x) = \neg (P(x_1) \wedge P(x_2) \wedge \cdots \wedge P(x_n)).
Using DeMorgan’s law, ¬xP(x)=¬P(x1)¬P(x2)¬P(xn),\neg \forall x P(x) = \neg P(x_1) \vee \neg P(x_2) \vee \cdots \vee \neg P(x_n),
which is the same as the statement x¬P(x)\exists x \neg P(x).
That ¬xP(x)x¬P(x)\neg \exists x P(x) \equiv \forall x \neg P(x) may be proven in the same way. \square

The idea is that a universal statement is a “big conjunction” and an existential statement is a “big disjunction.” The proof follows from the DeMorgan’s laws you learned in the preceding chapter. (Strictly speaking, the argument above only covers finite domains, but the laws hold in general.)
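On a finite domain these laws are statements about `all` and `any`, and we can spot-check them directly (the predicate here is arbitrary):

```python
domain = range(1, 11)
P = lambda x: x < 8  # an arbitrary predicate for illustration

# not-forall is exists-not, and not-exists is forall-not.
law1 = (not all(P(x) for x in domain)) == any(not P(x) for x in domain)
law2 = (not any(P(x) for x in domain)) == all(not P(x) for x in domain)
print(law1, law2)  # True True
```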

The next two theorems deal with how universal and existential statements may imply one another.

Theorem 3.3.2 (Universal instantiation) If aa is a member of the domain of the predicate PP, then xP(x)P(a)\forall x P(x) \Rightarrow P(a).

Proof. If PP is true for every member of its domain and aa is a member of said domain, PP must be true for aa. \square

Theorem 3.3.3 (Existential generalization) If aa is a member of the domain of the predicate PP, then P(a)xP(x)P(a) \Rightarrow \exists x P(x).

Proof. Let aa be in the domain of PP. If P(a)P(a) is true, then PP is true for some member of its domain. \square

There are two comments to make about these two theorems. First, the reversed rules, universal generalization and existential instantiation, are not valid:

Examples. “I was able to easily pay off my college loans, therefore everyone can” is an example of invalid universal generalization.

“Someone in this room killed the wealthy magnate, so it must have been you!” is an example of invalid existential instantiation.

Second, transitivity gives us that a universal statement implies an existential one: xP(x)P(a)xP(x)\forall x P(x) \Rightarrow P(a) \Rightarrow \exists x P(x).

We are ready to prove our tautology is, in fact, a tautology. Remember that the statement was “If Hypatia is a woman and all women are mortal, then Hypatia is mortal.”

We have seen that “Hypatia is a woman” may be rendered as W(h)W(h) for the predicate W(x)W(x) being "xx is a woman" and hh being Hypatia. Likewise, “Hypatia is mortal” could be rendered M(h)M(h) where M(x)M(x) is "xx is mortal". Finally, the statement “All women are mortal” would be x(W(x)M(x))\forall x (W(x) \to M(x)).

Theorem 3.3.4 The statement [P(a)x(P(x)Q(x))]Q(a)[P(a) \wedge \forall x (P(x) \to Q(x))] \to Q(a) is a tautology.

We have changed our labeling to the more standard PP, QQ, and aa to emphasize that this statement is always true no matter what we are talking about.

Proof. Suppose that for some predicate PP and some aa in the domain of PP, the statement P(a)P(a) is true, and that for some predicate QQ the statement x(P(x)Q(x))\forall x (P(x) \to Q(x)) is true.

By universal instantiation, x(P(x)Q(x))\forall x (P(x) \to Q(x)) means that P(a)Q(a)P(a) \to Q(a) is true.

By modus ponens on P(a)P(a) and P(a)Q(a)P(a) \to Q(a), we know Q(a)Q(a) must be true.

Therefore, [P(a)x(P(x)Q(x))]Q(a)[P(a) \wedge \forall x (P(x) \to Q(x))] \to Q(a) is always true. \square
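On a small finite domain we can even check the theorem over every possible predicate, since a predicate on a two-element domain is just a pair of truth values (this exhaustive encoding is our own construction):

```python
from itertools import product

domain = [0, 1]
# Every predicate on a two-element domain is a pair of truth values.
predicates = list(product([True, False], repeat=len(domain)))

def implies(a, b):
    # Truth table of the conditional.
    return (not a) or b

# [P(a) and forall x (P(x) -> Q(x))] -> Q(a), over every P, Q, and a.
ok = all(
    implies(P[a] and all(implies(P[x], Q[x]) for x in domain), Q[a])
    for P in predicates for Q in predicates for a in domain
)
print(ok)  # True
```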


Chapter 3 problems




4. Proofs

Now we have things to talk about – sets and their elements – and ways of talking about those things – propositional and predicate logic. This opens the door to the true game of mathematics: proof.

Mathematics stands apart from the natural sciences because its facts must be proven. Whereas theories in the natural sciences can only be supported by evidence, mathematical facts can be shown to be indisputably true based on the agreed definitions and axioms.

For example, once you have agreed upon the definitions of “integer,” “even,” and the rules of arithmetic, the statement “An even integer plus an even integer is an even integer” is without doubt. In fact, you will prove this statement yourself in the exercises to this chapter.

Now, this introduction is not meant to imply that mathematical epistemology (the study of knowledge) is without its issues. For example: What definitions and axioms are good ones? (“Is there such a thing as a real number?” can entertain you for a while.) What sorts of people decide which mathematical research is worth doing, and who decides which proofs are convincing? Are all published mathematical proofs correct? (They aren’t.) What is the role of computers in mathematical proof?

Fortunately for us, we don’t have to worry about these questions. (Unless we want to. They’re quite interesting.) In this chapter we will simply learn what a mathematical proof is, learn some basic facts about the integers, and practice writing direct and indirect proofs about mathematical facts that are not particularly in dispute.

4.1 Direct proofs

What is a proof? In this book we will adopt Cathy O’Neil’s position that a proof is a social action between speaker/writer and listener/reader that convinces the second party that a statement is true.

Several proofs have been written in this book so far. Many begin with an italicized Proof. and end in a little \square, as a way to set them off from definitions, theorems, comments, and examples. Hopefully they have all been convincing to you. (DW: If not, my e-mail address is somewhere on this webpage.)

As you practice writing your own proofs, take for an audience someone else in your position: a new student of discrete mathematics. Imagine the reader knows slightly less than you do. A good proof gently guides the reader along, making them feel smart when they anticipate a move.

It is also this book’s position that talking about proofs is nearly useless. They must be read and practiced. With that in mind, let’s get to writing some proofs about integers.

We will take for granted that sums, differences, products, and powers of integers are still integers. (Depending on how the set of integers is defined, these qualities may be “baked in” to the definition.) Every other fact will either be stated as a definition or a theorem, and proven in the latter case.

Definition 4.1.1. An integer nn is even if there exists an integer kk such that n=2kn=2k.

Definition 4.1.2. An integer nn is odd if there exists an integer kk such that n=2k+1n=2k+1.

For example, 28=2(14)28=2(14), and 39=2(19)+139=2(19)+1.
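The two definitions translate directly into code: an integer is even or odd according to whether the witness kk exists, and floor division produces the candidate witness. (The function names below are ours, for illustration.)

```python
def is_even(n):
    # n is even if n == 2*k for some integer k; k = n // 2 is the candidate witness.
    return n == 2 * (n // 2)

def is_odd(n):
    # n is odd if n == 2*k + 1 for some integer k.
    return n == 2 * (n // 2) + 1

print(is_even(28), is_odd(39))  # True True
```

Because Python’s `//` is floor division, these checks also behave correctly for negative integers.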

Theorem 4.1.3. If an integer nn is odd, so is n2n^2.

This statement may be obvious to you. If so, great; but we are trying to learn about the writing, so it is best to keep the mathematics simple. (It will only get a little more challenging by the end of the chapter.) To prove this statement, we must start with an arbitrary odd integer and prove that its square is odd. The integer in question must be arbitrary: if we proved the statement only for 3, we would still need to justify 5, 7, 9, and so on.

Proof. Let nn be an odd integer.

The first step in the proof is to assume any of the theorem’s hypotheses. The theorem says that if we have an odd integer, its square will be odd; so, we are entitled to assume an odd integer. However, there is nothing more we may assume. Fortunately, we know what it means for an integer to be odd.

That means there exists an integer kk such that n=2k+1n=2k+1.

A great second step is to “unpack” any definitions involved in the assumptions made. We assumed nn was odd. Well, what does that mean? However, now we are “out of road.” Let’s turn our attention to the object we mean to say something about: n2n^2.

Therefore, n2=(2k+1)2=4k2+4k+1=2(2k2+2k)+1, n^2 = (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2+2k) + 1, which is an odd number by definition.

Remember: sums, powers and products of integers are integers – this is the only thing we take for granted, other than, y’know, arithmetic – and so 2k2+2k2k^2 + 2k is an integer, call it mm. Then 2m+12m+1 is odd. The final step – obviously rendering the result in the explicit form of an odd number – is a good one to take because we assume our reader to be as new at this as we are. They likely recognize 4k2+4k+14k^2+4k+1 is odd, but we take the additional step to ensure maximum clarity.

Thus, if nn is an odd integer so is its square. \square
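The proof is constructive: from n = 2k + 1 it produces the explicit witness m = 2k² + 2k with n² = 2m + 1. A quick numerical spot-check of that witness:

```python
# For each odd n = 2k + 1, the proof exhibits m = 2k^2 + 2k with n^2 = 2m + 1.
for k in range(-50, 50):
    n = 2 * k + 1
    m = 2 * k * k + 2 * k
    assert n * n == 2 * m + 1
print("witness verified for 100 odd integers")
```

Of course, no amount of checking replaces the proof; the code only confirms the algebra on a sample.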

Definition 4.1.4. Let aa be a nonzero integer and let nn be an integer. We say aa divides nn or that nn is divisible by aa if there exists an integer kk such that n=akn=ak. We write ana|n when this is the case.

Theorem 4.1.5. Divisibility has the following three properties:

1. (Reflexivity) a|a for every nonzero integer a.
2. (Antisymmetry) If m and n are positive integers where m|n and n|m, then m=n.
3. (Transitivity) If a|b and b|c, then a|c.

(These three properties together define something called a partial order, which is studied in the last chapter. Foreshadowing!)

You will prove two of these properties for yourself in the exercises. Here is a demonstration of the proof of the second property.

Proof. Suppose that mm and nn are positive integers where mnm|n and nmn|m.

As before, we are assuming only what is assumed in the statement of the theorem: that mm and nn are positive and divide one another. By the way: why did we have to say they are positive?

That means that there exist integers k1k_1 and k2k_2 such that n=k1mn=k_1m and m=k2nm=k_2n.

The second step is usually to apply relevant definitions.

Combining these two statements, we have m=(k1k2)mm=(k_1k_2)m. Since mm is not zero, we see 1=k1k21=k_1k_2. Because k1k_1 and k2k_2 must be integers by definition (and positive, since mm and nn are), the only way that k1k2=1k_1k_2=1 is for k1=k2=1k_1=k_2=1.

If we didn’t have the restriction that k1k_1 and k2k_2 were integers, then k1=1/k2k_1=1/k_2. However, this is only possible for integers when both numbers are one.

Therefore, if mm and nn are positive integers that divide one another, then m=nm=n. \square
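Definition 4.1.4 and the antisymmetry property can both be spot-checked numerically (the function name `divides` is ours):

```python
def divides(a, n):
    # a | n when n == a*k for some integer k (a nonzero).
    return n % a == 0

# Antisymmetry on positive integers: if m | n and n | m, then m == n.
for m in range(1, 50):
    for n in range(1, 50):
        if divides(m, n) and divides(n, m):
            assert m == n
print("antisymmetry verified up to 49")
```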

In the next section we will explore indirect proofs and, from there, bidirectional proofs. You are highly recommended to try some exercises before moving on.

4.2 Indirect and bidirectional proofs

Consider the following theorem.

Theorem 4.2.1 If nn is an integer such that n2n^2 is odd, then nn is odd.

This theorem may seem the same as Theorem 4.1.3, but note carefully that it is not: it is the converse Q(n)P(n)Q(n) \to P(n) to the previous theorem’s P(n)Q(n)P(n) \to Q(n).

Let’s try to imagine a direct proof of this theorem. Since n2n^2 is odd, there exists an integer kk such that n2=2k+1n^2 = 2k+1. And then, seeking to prove a fact about nn, we take the positive square root of each side: n=2k+1n = \sqrt{2k+1}. And…ew. Have we even proven anything about square roots? (We haven’t, and we won’t.)

We shall, instead, employ some sleight of hand. Recall from Chapter 2 that a conditional statement pqp \to q is equivalent to its contrapositive ¬q¬p\neg q \to \neg p.

Proof (of Theorem 4.2.1). Suppose that nn is even. Then there exists an integer kk such that n=2kn=2k. In that case, n2=(2k)2=4k2=2(2k2),n^2 = (2k)^2 = 4k^2 = 2(2k^2), which is even. Therefore for n2n^2 to be odd, nn must also be odd. \square

Another equivalence from Chapter 2 was that pqp \to q and qpq \to p combine to create pqp \leftrightarrow q. That is, we can prove an “if and only if” statement by proving both the “forward” and “backward” directions (which is which is a matter of perception). Therefore, in this chapter we have proven the following statement:

Theorem 4.2.2 An integer nn is odd if and only if n2n^2 is.

Remember that an “if and only if” statement is true if the component statements are both true or both false. So, this theorem also tells us that an integer is even if and only if its square is.

Theorem 4.2.3 Let nn be an integer. The integer nn is odd if and only if n2+4n+1n^2+4n+1 is even.

Proof. Let nn be an integer. First, suppose nn is odd. That means that there exists an integer kk where n=2k+1n=2k+1. So, n2+4n+1=(2k+1)2+4(2k+1)+1=4k2+4k+1+8k+4+1, n^2 + 4n +1 = (2k+1)^2 + 4(2k+1) + 1 = 4k^2 + 4k + 1 + 8k + 4 + 1, which may be written as 2(2k2+6k+3),2(2k^2 + 6k + 3), which is even. Therefore if nn is odd then n2+4n+1n^2+4n+1 is even.

Conversely, suppose nn is even. That means that n=2kn=2k for some integer kk. In that case, n2+4n+1=(2k)2+4(2k)+1,n^2 + 4n + 1 = (2k)^2 + 4(2k) + 1, which is 2(2k2+4k)+1,2(2k^2+4k)+1, which is odd. Therefore for n2+4n+1n^2+4n+1 to be even, nn must be odd.

So, nn is odd if and only if n2+4n+1n^2+4n+1 is even. \square
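A biconditional can be spot-checked in code by comparing the two truth values directly; here `==` between the two parity tests plays the role of “if and only if”:

```python
# n is odd if and only if n^2 + 4n + 1 is even, checked over a range of integers.
# (Python's % returns a nonnegative remainder for modulus 2, so the odd test
# works for negative n as well.)
for n in range(-100, 101):
    assert (n % 2 == 1) == ((n * n + 4 * n + 1) % 2 == 0)
print("biconditional verified for 201 integers")
```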

4.3 Counterexamples

When proving a universal statement on a domain, it never suffices to just do examples (unless you do every example, which is usually not feasible or possible). However, remember from Chapter 3 that the negation of a universal statement is an existential one. You can prove that "every PP is a QQ" is false by finding just one PP that isn’t a QQ.

Examples. The statement “all prime numbers are odd” is false because 22 is prime and even.

The statement “all odd numbers are prime” is false because 99 is odd and not prime.


Chapter 4 problems

5. Recursion and integer representation


One of the most important ideas in discrete mathematics is that of recursion, which we will explore in one way or another for many of the remaining chapters in the course. In this chapter, we will explore it through the idea of integer representation; representing integers in bases other than the common base 10. Bases you are likely to encounter in computer science contexts are base 2 (binary), base 8 (octal), and base 16 (hexadecimal). We will learn how to represent an integer in binary using a recursive algorithm, and then we will cheat our way into octal and hexadecimal.

5.1 Recursion

Recursion is when an algorithm, object, or calculation is constructed recursively.

Just kidding. Reread this joke after the chapter and it’ll make more sense.

Recursion means that an algorithm, object, or calculation is written, defined, or performed by making reference to itself. It is something more easily understood by example, and there will be plenty as we proceed. For now, consider the following example.

Definition 5.1.1 Let nn be a positive integer. Then n!n! ("nn factorial") is the product of all positive integers at most nn: n!=1×2××n.n! = 1 \times 2 \times \cdots \times n. Furthermore, we define 0!=10!=1.

Anyone questioning the usefulness of n!n! will have their questions answered in excruciating detail over the course of Chapters 9-13. If the reader has had calculus, then they may recognize n!n! as the denominator of the terms in the Taylor expansion of the function exe^x; it is good for “absorbing” constants created by taking derivatives. However, we are not interested in any of its uses right now.

Example. Here are the values of a few factorials: 0!=10!=1, 1!=11!=1, 2!=2×1=22!=2\times 1 = 2, 3!=3×2×1=63!=3\times 2 \times 1 = 6, 4!=4×3×2×1=244!=4 \times 3 \times 2 \times 1 = 24, and 5!=5×4×3×2×1=1205!=5\times 4\times 3 \times 2 \times 1=120.

You may have already noticed the recursion at work. Take for instance 5!=5×4×3×2×15!=5 \times 4 \times 3 \times 2 \times 1. We see the value of 4!4! lurking here: 5!=5×4!5!=5 \times 4!.

In fact, for all positive nn, we can make the alternative (usually more useful) definition n!=n(n1)!n!=n(n-1)!. This is the recursive definition of the factorial, because we are defining the factorial in terms of itself.

Why is this okay? Well, we’re really collapsing a lot of work into two lines: it may make more sense to break it down one step at a time. First, define 0!=10!=1. This definition will make sense in a later chapter, but remember that mathematical definitions don’t necessarily need to “make sense.”

Next, define 1!=1×0!1!=1 \times 0!. This definition is valid, because the number 11, the operation ×\times, and the number 0!0! have all been defined. And when we go to define 2!=2×1!2!=2 \times 1!, we have already defined all of these objects as well. And so on. Therefore, the following definition is a valid one.

Definition 5.1.2 Let nn be a natural number. Then n!n! is equal to 1 when n=0n=0, and n(n1)!n(n-1)! otherwise.
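Definition 5.1.2 translates almost verbatim into code. Here is a minimal Python sketch (the function name factorial is our own):

```python
def factorial(n):
    # Base case: 0! = 1 by definition.
    if n == 0:
        return 1
    # Recursive case: n! = n * (n-1)!.
    return n * factorial(n - 1)

print([factorial(n) for n in range(6)])  # [1, 1, 2, 6, 24, 120]
```

Notice that the function is valid for the same reason the definition is: every call eventually bottoms out at the already-defined base case.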

5.2 Binary numbers

Now that we know how recursion works (or at least have more of an inkling than we did upon starting the chapter), we will see another example of it in the calculation of the binary representation of a number.

When we speak of the number one thousand five hundred eight or 1,5081,508, we are implicitly talking about the composition of the number in terms of powers of 10: 1,508=1×103+5×102+0×101+8×100,1,508 = 1\times 10^3 + 5 \times 10^2 + 0 \times 10^1 + 8 \times 10^0, remembering that b0=1b^0=1 for all nonzero bb.
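We can confirm this digit-by-digit reading in a couple of lines of Python (the names digits and value are our own):

```python
# Read off 1,508 from its decimal digits and powers of 10.
digits = [1, 5, 0, 8]
value = sum(d * 10**k for k, d in enumerate(reversed(digits)))
print(value)  # 1508
```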

African and Asian mathematicians were writing in base-10, or decimal notation, as early as 3,000 BCE. Presumably the prevalence of decimal notation throughout human history is due to the fact that most of us have ten fingers and ten toes, making the base convenient. However, computers only have two fingers and two toes.

That last line is a joke, but here’s the idea: Computers read information by way of electrical signals, which are either on or off. So the simplest way for a computer to process information is in terms of two states. Here is a higher-level example: suppose you are writing some code. Is it easier to have a single if-then statement (two possible states), or nine (ten possible states)? This is why base-2, or binary numbers, are so prevalent in computer science.

The first step to becoming adept with binary numbers is to know many powers of 22. Fortunately, powers of 22 are easy to calculate: simply multiply the last value by 22 to get the next one. (This is a recursive definition for 2n2^n!)

nn 00 11 22 33 44 55 66 77 88 99
2n2^n 11 22 44 88 1616 3232 6464 128128 256256 512512
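This recursive description of powers of 22 can be sketched directly in Python (the function name power_of_two is our own):

```python
def power_of_two(n):
    # 2^0 = 1; otherwise 2^n = 2 * 2^(n-1).
    if n == 0:
        return 1
    return 2 * power_of_two(n - 1)

print([power_of_two(n) for n in range(10)])
# [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
```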

Algorithm 5.2.1 Let nn be a natural number. To write the dd-digit binary expansion (ad1ad2a1a0)2(a_{d-1}a_{d-2}\ldots a_1a_0)_2 of nn:

1. If nn is 00 or 11, its binary expansion is simply nn.

2. Otherwise, find the largest power 2d12^{d-1} that is no more than nn, and set ad1=1a_{d-1}=1.

3. Find the binary expansion of the remainder n2d1n-2^{d-1}; its digits give the values of the corresponding aia_i.

4. Any digit aia_i that is never assigned a value by a remainder is 00.

This is perhaps a little intimidating at first glance, so let’s press on with an example.

Example. Let’s calculate the binary expansion of n=1,508n=1,508. Our target nn is neither 00 nor 11, so we must find the highest power of 22 that is no more than nn. This number is 210=1,0242^{10}=1,024. Therefore, our binary number will have eleven digits, and right now it looks like (1a9a8a1a0)2.(1a_9a_8\ldots a_1a_0)_2.

To find the values of the remaining digits a9a_9, a8a_8, etc., let’s find the binary expansion of the remainder 1,5081,024=4841,508 - 1,024 = 484.

Here’s the recursion, by the way: to find the binary expansion of nn, we must find the binary expansion of a smaller number.

We repeat the process. The highest power of 22 that is no more than 484484 is 28=2562^8=256. So, the binary expansion of 484484 is the nine-digit number (1a7a6a1a0)2(1a_7a_6\ldots a_1a_0)_2. Be aware that it is okay to reuse the names aia_i for the unknown digits, because the algorithm says they are the same for both numbers.

Next, we find the second remainder 484256=228.484-256=228. The highest power of 22 that is no more than 228228 is 27=128.2^7=128. Continue the example and see for yourself that the binary expansion of 228228 is (11100100)2(11100100)_2. (Your remainders after 228228 will be 100100, 3636, 44, and 00.)

Now, the binary expansion of 484484 is (111100100)2(111100100)_2 and the binary expansion of 1,5081,508 is (1a9111100100)2.(1a_9111100100)_2.

The algorithm tells us what to do with our missing a9a_9: since it does not appear in the expansions of any of our remainders, it is 0. Therefore, 1,508=(10111100100)2.1,508 = (10111100100)_2.

This example was made much more tedious than it needed to be so that the recursive steps were laid bare. In practice, you will simply continue subtracting off powers of 22 until none of the target number remains, writing a 11 if you subtracted a particular power of 22 and a 00 if you didn’t. Since we had to subtract 1,0241,024, 256256, 128128, 6464, 3232, and 44, we write a 11 for the eleventh, ninth, eighth, seventh, sixth, and third digits, and a 00 everywhere else.
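The whole procedure can be sketched recursively in Python. The function name to_binary and the zero-padding step are our own choices; the logic follows the example above: peel off the highest power of 22, expand the remainder, and fill the unused digits with zeros.

```python
def to_binary(n):
    # Base cases: 0 and 1 are their own binary expansions.
    if n <= 1:
        return str(n)
    # Find the highest power 2^d that is no more than n.
    d = 0
    while 2 ** (d + 1) <= n:
        d += 1
    # Leading 1, then the expansion of the remainder, zero-padded to d digits.
    return "1" + to_binary(n - 2 ** d).rjust(d, "0")

print(to_binary(1508))  # 10111100100
```

Tracing to_binary(1508) reproduces exactly the chain of remainders 484, 228, 100, 36, and 4 from the worked example.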

You have no doubt noticed that our binary numbers are all enclosed in the notation ()2()_2. This is to signify that (101)2(101)_2, for example, refers to the number five and not the number one hundred one.

5.3 Octal and hexadecimal numbers

The word octal refers to the base-8 system and hexadecimal refers to the base-16 system. These two bases are useful abbreviations of binary numbers: octal numbers are used to modify permissions on Unix systems, and hexadecimal numbers are used to encode colors in graphic design.

We say “abbreviation” because of the following algorithm.

Algorithm 5.3.1 Suppose nn is an integer with a base bb representation and we desire to know its base bkb^k representation, for some integer k>1k > 1. Then we may group the digits of nn in base bb into groups of kk, starting from the rightmost digit, and rewrite each group as a single digit in base bkb^k. The resulting string is the expansion of nn in base bkb^k.

The idea is that it is easy to go from some base to a base that is a power of the original base. Consider the following example.

Example. Write the number 236236 in octal and in hexadecimal.

The starting point will be to write 236236 in binary. Since 236=128+64+32+8+4,236=128+64+32+8+4, its binary expansion is (11101100)2(11101100)_2.

To write 236236 in octal, we will regroup the binary digits of 236236 by threes because 8=238=2^3. However, there are eight digits. No worries; we will simply add a meaningless 00 as the leftmost digit of the binary expansion. Those groups are: 011011, 101101, and 100100 from left to right. In order, those numbers are 33, 55, and 44; so, 236236 is (354)8(354)_8.

Observe that in a base bb number system, the digits are 0,1,2,,b10, 1, 2, \ldots, b-1. Therefore the highest digit we will see in octal is 77.

However – what about hexadecimal? Its set of digits is larger than our usual base-10 set. It is standard usage to use AA through FF to represent 1010 through 1515.

Decimal digit 1010 1111 1212 1313 1414 1515
Hexadecimal digit AA BB CC DD EE FF

Since 16=2416=2^4, to write 236236 in hexadecimal we will group the digits in its binary expansion by fours this time. Those groups are 11101110 and 11001100, which represent fourteen and twelve respectively. So, 236=(EC)16236=(EC)_{16}.
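The regrouping trick of Algorithm 5.3.1 can be sketched in Python for the base-2 case (the function name regroup and the left-padding step are our own):

```python
def regroup(bits, k):
    # Pad on the left with zeros so the length is a multiple of k
    # (-(-len // k) is ceiling division), then read k bits at a time.
    bits = bits.rjust(-(-len(bits) // k) * k, "0")
    digits = "0123456789ABCDEF"
    return "".join(digits[int(bits[i:i+k], 2)] for i in range(0, len(bits), k))

print(regroup("11101100", 3))  # 354  (octal)
print(regroup("11101100", 4))  # EC   (hexadecimal)
```

Both outputs match the hand computations for 236 above.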

To summarize our example, here are four different representations of the number 236236.

System Decimal Binary Octal Hexadecimal
Representation 236236 (11101100)2(11101100)_2 (354)8(354)_8 (EC)16(EC)_{16}


Chapter 5 problems

6. Sequences and sums


6.1 Sequences

Remember that a set has no order. For example, {1,2}\{1,2\} and {2,1}\{2,1\} are the same set. Often, we care about the order objects are in. Enter the sequence.

Definition 6.1.1 Let INI \subseteq \mathbf{N} and XX be some set. A sequence is a function a:IXa:I \to X. We write ana_n instead of a(n)a(n). Here, nn is the index and ana_n is the term. We denote the entire sequence as (an)(a_n) (when the start and endpoints are stated elsewhere or are unimportant), (an)n=k(a_n)_{n=k}^\infty (for an infinite sequence starting with aka_k), or (an)n=km(a_n)_{n=k}^m (for a finite sequence starting at aka_k and ending at ama_m). A finite sequence is called a word from the alphabet XX.

Once again we have chosen a more technical, functional definition where we instead could have said “a sequence is a totally ordered list of objects.” It is worth considering the way in which a function can “port structure” from one set to another. If we wanted a totally ordered set it is hard to imagine a better candidate than N\mathbf{N}. The sequence transfers the order structure from N\mathbf{N} into whatever set XX we like.

Example. Consider the sequence (an)n=1=(1,4,9,16,25,).(a_n)_{n=1}^\infty = (1, 4, 9, 16, 25, \ldots). Observing that each term is the square of its index, this sequence may be represented in closed form by writing an=n2a_n=n^2.

Not every sequence may be so easily identified in closed form.

Example. Consider the sequence (Fn)n=0=(0,1,1,2,3,5,8,13,21,34,).(F_n)_{n=0}^\infty = (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, \ldots). You may recognize this sequence as the famous Fibonacci sequence. There is a closed-form expression for FnF_n, but you are unlikely to see it by visual inspection. However, notice that each term in the sequence is equal to the sum of the preceding two terms: for instance, 13=8+513=8+5.

Therefore instead we can represent the Fibonacci sequence recursively with a recurrence relation: Fn={Fn1+Fn2,n>11,n=10,n=0 F_n = \begin{cases} F_{n-1} + F_{n-2}, & n > 1 \\ 1, & n = 1\\ 0, & n = 0 \end{cases}
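The recurrence relation translates directly into Python (the function name fib is our own). Note that this naive version recomputes small cases many times, so it is for illustration rather than speed:

```python
def fib(n):
    # The three cases of the recurrence relation, verbatim.
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)

print([fib(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```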

Once a recurrence relation is established, the terms of a sequence may be calculated iteratively or recursively. Each method has its strengths and weaknesses depending on the context; typically, recursive calculations are more difficult but require storing less information.

Example. Consider the sequence (xn)(x_n) where xn={xn12xn2,n>21n=2,1n=1 x_n = \begin{cases} x_{n-1} - 2x_{n-2}, & n > 2 \\ 1 & n = 2, \\ -1 & n = 1 \end{cases}

To calculate x4x_4 iteratively, we simply calculate each term of (xn)(x_n) until we arrive at x4x_4. So: x1=1x_1=-1, and x2=1x_2 = 1. Then x3=x22x1=12(1)=3,x_3 = x_2 - 2x_1 = 1 - 2(-1) = 3, and x4=x32x2=32(1)=1.x_4 = x_3 - 2x_2 = 3 - 2(1) = 1. Thus, x4=1x_4 = 1.
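The iterative method is easy to sketch in Python (the function name x_iterative and the two-variable bookkeeping are our own). Only the two most recent terms ever need to be stored:

```python
def x_iterative(n):
    # Base cases x_1 = -1 and x_2 = 1, then build up term by term.
    prev2, prev1 = -1, 1
    if n == 1:
        return prev2
    if n == 2:
        return prev1
    for _ in range(3, n + 1):
        # Apply the recurrence x_n = x_{n-1} - 2 x_{n-2}.
        prev2, prev1 = prev1, prev1 - 2 * prev2
    return prev1

print(x_iterative(4))  # 1
```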

In the recursive calculation, we will begin with x4x_4 and repeatedly apply the recurrence relation until we have only known values of the sequence: so, x2x_2 and x1x_1. First, x4=x32x2.x_4 = x_3 - 2x_2.