For each “proof” below, critique it and identify the mistake.

Prove or disprove: Any \(n\) points on the plane are collinear (they are all in the same line).

**Proposed answer:** We prove the claim via induction on \(n\).

Base case (\(n \in \{1,2\}\)): trivial.

Now let \(k > 2\) be some arbitrary natural number. Assume the statement holds for all sets of \(k-1\) points on the plane. We want to show that it also holds for any \(k\) points. So consider any \(k\) points on the plane. By the induction hypothesis, the first \(k-1\) points are on one line, and the last \(k-1\) points are also on one line. Therefore, all \(k\) points are on a line.

How many ways are there to arrange \(c \geq 0 \ \clubsuit\)s and \(d\geq 0 \ \diamondsuit\)s so that all \(\clubsuit\)s are consecutive?

**Proposed answer:** You can have any number between \(0\) and \(d\) of the \(\diamondsuit\)s, then the string of \(\clubsuit\)s, then the remainder of the \(\diamondsuit\)s. Hence, there are \(d+1\) possibilities.

There is a circle of \(15251\) chips, green on one side, red on the other. Initially, all show the green side. In one move, you may take any four consecutive chips and flip them. Is it possible to get all of the chips showing red?

**Proposed answer:** No, it is not possible. Let’s assume for contradiction that we converted all \(15251\) chips to red. But this means in the very last move there must be \(4\) consecutive green chips and the remaining \(15247\) must be red. Repeating this \(k\) times for \(1 \leq k \leq 3812\), we get three consecutive red chips, with the rest green. But we started from all green, contradiction.

It is well known that \(\ln 2\) is an irrational number that is equal to the infinite sum \[\sum_{i = 1}^\infty \frac{(-1)^{i + 1}}{i} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \dots\] However, Leonhard claims to have a proof that shows otherwise:

He claims that \(\ln 2\) is rational and will prove this by showing \(\sum_{i = 1}^n \frac{(-1)^{i+1}}{i}\) is rational for all \(n > 0\) via induction.

Base Case: \(n = 1\): \(\sum_{i = 1}^1 \frac{(-1)^{i + 1}}{i} = 1\) is indeed rational.

Induction Hypothesis: Suppose that \(\sum_{i = 1}^n \frac{(-1)^{i + 1}}{i}\) is rational for \(0 < n < k + 1\) for some \(k \in \mathbb N\).

Induction Step: It now suffices to show that \(\sum_{i = 1}^{k + 1} \frac{(-1)^{i + 1}}{i}\) is rational. We have that \[\sum_{i = 1}^{k + 1} \frac{(-1)^{i + 1}}{i} = \sum_{i = 1}^k \frac{(-1)^{i + 1}}{i} + \frac{(-1)^{k+2}}{k+1}\] and by induction hypothesis, the first term is rational, and clearly the second term is also rational, and since the sum of two rationals is rational, we are done!

The proof breaks when we try to establish the claim for \(k=3\). To prove that the union of two sets of collinear points is collinear, we need the two sets to intersect in at least two points. However, we do not have that when we try to prove the \(k=3\) case.

When induction breaks, it usually breaks early on. So to sanity-check your induction proof, it is a good idea to manually work through it for small cases like \(n=1\), \(n=2\), and \(n=3\). Also, lots of errors in proofs, even by professional mathematicians, can arise from trying to take an element of a set, without first proving the set is non-empty. Watch out for this.

This proof ignores the edge case of \(c=0\), where there is only 1 possible arrangement. Mistakes in edge cases are usually considered to be minor. However for some problems, the edge cases are the only interesting or hard cases. Missing these cases would be considered a crucial mistake.
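A quick brute-force count makes the edge case concrete. This is a sketch in Python; the function name and the letters `C`/`D` standing in for \(\clubsuit\)/\(\diamondsuit\) are ours.

```python
from itertools import permutations

def consecutive_count(c, d):
    """Count distinct arrangements of c clubs ("C") and d diamonds ("D")
    in which all clubs form one consecutive block."""
    count = 0
    for p in set(permutations("C" * c + "D" * d)):
        s = "".join(p)
        # all clubs consecutive <=> the c clubs span an interval of length c
        if c == 0 or s.rfind("C") - s.find("C") + 1 == c:
            count += 1
    return count

# The proposed answer d + 1 is correct for c >= 1 ...
assert consecutive_count(3, 4) == 4 + 1
# ... but the edge case c = 0 admits exactly one arrangement, not d + 1.
assert consecutive_count(0, 4) == 1
```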

This solution tries to disprove the statement by assuming a particular process and then showing that the particular method won’t work. It does not suffice to show that one particular method does not work.

Here is a correct proof of the statement:

We wish to show that it is not possible to reach a position of \(15251\) red chips in a circle, beginning from the position of \(15251\) green chips in a circle, in any number of moves, where a move consists of flipping the colors of four adjacent chips. To show this, we will first prove that if we make a move from any position with an even number of red chips (and \(15251\) total chips) we will leave a position with an even number of red chips.

Note that whenever we flip a chip, we either increase or decrease the number of red chips by one, and therefore the parity of the total number of red chips changes (if it was odd, it becomes even, if it was even, it becomes odd). Since we can view each of the moves from the problem description as flipping four chips in succession, each move will cause the parity to change four times, meaning that the parity of the total number of red chips will end up remaining the same after each move. Thus, we have shown that if we make a move from a position with an even number of red chips, we will leave a position with an even number of red chips.

Note that the initial position has an even number of red chips, namely \(0\). By the property we just proved, no matter how many moves we make, the invariant of having an even number of red chips is preserved. Since \(15251\) is an odd number, we can conclude that we can never reach a position with \(15251\) red chips.
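The invariant can also be checked empirically. The sketch below (helper names are ours) applies random moves to a circle of \(15251\) chips and confirms that the number of red chips stays even after every move.

```python
import random

def flip_move(chips, start):
    """Flip the colors of four consecutive chips, wrapping around the circle."""
    n = len(chips)
    for j in range(4):
        chips[(start + j) % n] ^= 1   # 0 = green, 1 = red
    return chips

random.seed(0)
chips = [0] * 15251                   # all green: zero red chips, an even number
for _ in range(1000):
    flip_move(chips, random.randrange(len(chips)))
    assert sum(chips) % 2 == 0        # the parity invariant holds after each move
```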

While each step of his proof is correct, it doesn’t actually prove his claim! He only proved that for all \(n \in \mathbb N\), the sum of the first \(n\) terms of the summation is rational, but not that the entire summation is rational. In other words, he didn’t prove anything about the *infinite* sum.

Let \(\Sigma\) be the alphabet consisting of an opening parenthesis and a closing parenthesis, that is, \(\Sigma = \{{\texttt{(}},{\texttt{)}}\}\). Let language \(L \subseteq \{{\texttt{(}},{\texttt{)}}\}^*\) be inductively/recursively defined as follows.

\({\texttt{()}} \in L\),

If \(x \in L\), then \({\texttt{(}}x{\texttt{)}} \in L\). (WRAP)

If \(x, y \in L\), then \(xy \in L\). (CONCAT)

We claim that these rules create exactly the set of “balanced strings of parentheses”. But what does that even mean?

Give a non-recursive definition of “balanced string of parentheses”.

Prove that a string \(x \in \{{\texttt{(}},{\texttt{)}}\}^*\) is in \(L\) if and only if it satisfies your above definition of a “balanced string of parentheses”.

For part 1, a balanced string of parentheses is a non-empty string \(w\) in \(\{{\texttt{(}},{\texttt{)}}\}^*\) satisfying the following properties.

**Prop 1**: The total number of open parentheses and closed parentheses in \(w\) are equal.

**Prop 2**: When scanning \(w\) from left to right, at all points in time, the number of open parentheses is at least the number of closed parentheses.

We will denote by \(K\) the set of all strings satisfying **Prop 1** and **Prop 2**.

We now begin the proof for part 2. Our goal is to show \(L = K\), and we will do so by a double containment argument. But before we begin, we introduce some notation that will allow us to express our proof more clearly and succinctly.

Given a string \(w\) in \(\{{\texttt{(}},{\texttt{)}}\}^*\), let \(O_w(i)\) be the number of open parentheses in the substring \(w[1:i]\), which denotes the first \(i\) symbols of \(w\). Similarly, let \(C_w(i)\) be the number of closed parentheses in the substring \(w[1:i]\). Let \(D_w(i)\) be \(O_w(i) - C_w(i)\). And finally, let \(n\) be the length of \(w\). Note that the \(O\) is for “open”, the \(C\) is for “closed”, and the \(D\) is for “difference”. With this notation, we can express the two properties above as follows.

**Prop 1**: \(D_w(n) = 0\).

**Prop 2**: For all \(i \in \{1,2,\ldots,n-1\}\), \(D_w(i) \geq 0\).

One nice way to visualize these properties is with a “mountain range” picture:

As we scan the string from left to right, every time we encounter an opening parenthesis, we can think of it as an arrow going up, and every time we encounter a closing parenthesis, we can think of it as an arrow going down. The horizontal line represents height \(0\). We can think of \(D_w(i)\) as the height after reading \(i\) symbols of the string. The first property says that at the very end, we should be back at height \(0\). The second property says that we never go below height \(0\). When these two properties are satisfied, we’ll say that we have a *valid mountain range*. (Successfully solving problems is often about finding the right visualization!)
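The left-to-right scan described above translates directly into code. The sketch below (the function name is ours) computes the running height \(D_w(i)\) and checks **Prop 1** and **Prop 2**.

```python
def is_balanced(w):
    """Return True iff w is a nonempty balanced string of parentheses."""
    height = 0                          # D_w(i): height after reading i symbols
    for ch in w:
        height += 1 if ch == "(" else -1
        if height < 0:                  # Prop 2 violated: we dipped below 0
            return False
    return height == 0 and len(w) > 0   # Prop 1, plus nonemptiness

assert is_balanced("(())()")
assert not is_balanced(")(")
assert not is_balanced("(()")
```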

**Part 1 : Showing \(L \subseteq K\)**

We will prove by structural induction that for any \(x\in L\), \(x\in K\) as well.

*Base Case*. The base case corresponds to \(w={\texttt{()}}\) and it is easy to check that this \(w\) satisfies both **Prop 1** and **Prop 2**. Therefore \(w \in K\).

*Induction Step*. Consider an arbitrary string \(w\in L\) of length \(n\) such that \(w \neq {\texttt{()}}\) (i.e. \(w\) does not correspond to the base case). This means that \(w\) was created using some \(k \geq 1\) applications of the WRAP and CONCAT rules. Depending on what the last applied rule was, we know that either \(w = {\texttt{(}}x{\texttt{)}}\) for some \(x\in L\), or \(w = xy\) for some \(x, y\in L\). We consider both cases below. Note that the structural induction hypothesis allows us to assume that any string \(x \in L\) that can be created using fewer than \(k\) applications of the WRAP and CONCAT rules is in \(K\).

**Case 1**: \(w = {\texttt{(}}x{\texttt{)}}\) where \(x\in L\). By induction hypothesis, we know \(x \in K\) and therefore it satisfies **Prop 1** and **Prop 2**, i.e. \(x\) corresponds to a valid mountain range. Then \(w\) corresponds to taking the mountain range \(x\), lifting it by 1 unit of height, and attaching an up arrow at the beginning and a down arrow at the end. This operation produces a valid mountain range. Equivalently, observe that the following are true:

\(O_w(n) = 1 + O_x(|x|)\), \(C_w(n) = 1 + C_x(|x|)\), and therefore \(D_w(n) = D_x(|x|) = 0\).

\(D_w(1) = 1\) and for every \(i \in \{2,3,\ldots,n-1\}\), \(D_w(i) = 1 + D_x(i-1) > 0\).

These two observations show that \(w\) satisfies **Prop 1** and **Prop 2** respectively (i.e. that it is a valid mountain range). Therefore \(w\) is in \(K\).

**Case 2**: \(w = xy\) where \(x, y\in L\). By induction hypothesis, we know \(x,y \in K\) and therefore both satisfy **Prop 1** and **Prop 2**, i.e. both are valid mountain ranges. The string \(w\) is obtained by concatenating two valid mountain ranges, so visually it is clear that \(w\) also corresponds to a valid mountain range. Equivalently, let \(t\) be the length of \(x\). Then observe that the following are true:

\(O_w(n) = O_x(|x|) + O_y(|y|)\), \(C_w(n) = C_x(|x|) + C_y(|y|)\), and therefore \(D_w(n) = D_x(|x|) + D_y(|y|) = 0 + 0 = 0\).

For every \(i \in \{1,2,\ldots,t\}\), \(D_w(i) = D_x(i) \geq 0\).

\(D_w(t) = D_x(t) = 0\), and therefore for \(i \in \{t+1,\ldots,n\}\), \(D_w(i) = D_y(i-t) \geq 0\).

The first item above shows that \(w\) satisfies **Prop 1**. The next two items show that \(w\) satisfies **Prop 2** (in fact the third item subsumes the first one, so we only really need the second and the third items). We conclude \(w\) is in \(K\).

**Part 2 : Showing \(K \subseteq L\)**

We prove that if \(w \in K\), then \(w \in L\), by (strong) induction on the length of \(w\).

*Base Case*. Note that \(K\) does not contain the empty string, so the shortest length \(w\) in \(K\) is the string \({\texttt{()}}\), which by definition is in \(L\). There are no other strings of length 2 that are in \(K\).

*Induction Step*. Let \(w\) be an arbitrary string in \(K\) of length \(n > 2\). The (implicit) induction hypothesis is that any string \(w' \in K\) of length less than \(n\) is in \(L\). By the definition of \(K\), we know \(w\) satisfies **Prop 1** and **Prop 2**, i.e., it is a valid mountain range. We will consider two cases.

**Case 1**: There exists \(t \in \{2,\ldots,n-1\}\) such that \(D_w(t) = 0\). Define \(x = w[1:t]\) and \(y = w[t+1:n]\) so that \(w = xy\). Then we claim that both \(x\) and \(y\) satisfy **Prop 1** and **Prop 2**. In other words, both \(x\) and \(y\) are valid mountain ranges. Visually this is not hard to see. Or equivalently:

For \(i \in \{1,2,\ldots,t\}\), \(D_x(i) = D_w(i)\). Therefore \(D_x(t) = D_w(t) = 0\) and \(D_x(i) = D_w(i) \geq 0\) for \(i \in \{1,2,\ldots,t-1\}\).

For \(i \in \{1,2,\dots,n-t\}\), \(D_y(i) = D_w(t+i)\). Therefore \(D_y(n-t) = D_w(n) = 0\) and for \(i \in \{1,2,\dots,n-t-1\}\), \(D_y(i) = D_w(t+i) \geq 0\).

Since \(x\) and \(y\) satisfy **Prop 1** and **Prop 2**, they both belong to \(K\). And by induction hypothesis, they both belong to \(L\). This implies that \(w = xy\) is also in \(L\) as it can be constructed from two strings in \(L\) using the CONCAT rule.

**Case 2**: There does not exist \(t \in \{2,\ldots,n-1\}\) such that \(D_w(t) = 0\). In other words, for all \(t \in \{2,\ldots,n-1\}\), \(D_w(t) \geq 1\). Let’s mark this statement with [\(*\)]. We know that \(w\) must start with an opening parenthesis. And since \(D_w(n) = 0\), we also know that \(w\) must end with a closing parenthesis (otherwise we would have \(D_w(n-1) = -1\)). Therefore we can write \(w\) as \({\texttt{(}}x{\texttt{)}}\) for some string \(x\) of length \(n-2\). We claim that \(x\) satisfies

**Prop 1** and **Prop 2**, i.e. \(x\) is a valid mountain range. This follows from [\(*\)]:

\(D_x(n-2) = O_x(n-2) - C_x(n-2) = (O_w(n) - 1) - (C_w(n) - 1) = D_w(n) = 0\).

For \(i \in \{1,2,\ldots,n-2\}\), \(D_x(i) = D_w(i+1) - 1 \geq 1 - 1 = 0\), where in the last inequality, we used [\(*\)] above.

Since \(x\) satisfies **Prop 1** and **Prop 2**, it belongs to \(K\). And by induction hypothesis, it belongs to \(L\). This implies that \(w = {\texttt{(}}x{\texttt{)}}\) is also in \(L\) as it can be constructed from a string in \(L\) using the WRAP rule.

In both cases, we were able to show that \(w \in L\), as desired.
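As a finite sanity check of \(L = K\) (not a proof), one can generate all strings of \(L\) up to a length bound using the WRAP and CONCAT rules, and compare against the set of strings satisfying **Prop 1** and **Prop 2**. All names in this sketch are ours.

```python
from itertools import product

def satisfies_props(w):
    """Check Prop 1 and Prop 2 via the running height D_w(i)."""
    h = 0
    for ch in w:
        h += 1 if ch == "(" else -1
        if h < 0:                   # Prop 2 violated
            return False
    return h == 0 and w != ""       # Prop 1, and K excludes the empty string

MAX = 8
L = {"()"}                          # base case
changed = True
while changed:                      # close under WRAP and CONCAT, up to MAX
    changed = False
    for x in list(L):
        for cand in ["(" + x + ")"] + [x + y for y in list(L)]:
            if len(cand) <= MAX and cand not in L:
                L.add(cand)
                changed = True

K = {"".join(t) for n in range(1, MAX + 1)
     for t in product("()", repeat=n) if satisfies_props("".join(t))}
assert L == K                       # the two definitions agree up to length MAX
```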

Let \(L \subseteq \{{\texttt{0}}, {\texttt{1}}\}^*\) be the set of all strings \(s\) with the property that \(s = s^R\).

Give an inductive/recursive definition for \(L\).

Prove that your definition is correct. That is, if \(K\) is the set of all words that can be constructed inductively by applying your recursive rules from part (1), then prove that \(L = K\).

We call a string \(s\) a *palindrome* if \(s = s^R\).

A recursive definition consists of some number of base cases and some number of recursive rules. It may be easier to think of the recursive rules first, and then see what base cases are necessary. Here, we want to make new palindromes out of old ones. If you have a palindrome \(w\), then appending the same character \(a\) to the beginning and end will keep it a palindrome. Since this keeps the parity (even/odd) of the length of the palindrome the same, we will need base cases for both even and odd lengths. With this in mind, we have the following definition:

\(\epsilon \in K\);

For any \(a \in \Sigma\), \(a \in K\);

For any \(a \in \Sigma\) and \(w \in K\), \(awa \in K\).

To show \(L = K\), we’ll prove \(L \subseteq K\) and \(K \subseteq L\).

\((L \subseteq K)\): We’ll prove \(w \in L \implies w \in K\) by induction on the length of \(w\).

Note that if \(|w| = 0\), i.e. \(w = \epsilon\), or \(|w| = 1\), then \(w\) corresponds to one of the base cases in the definition of \(K\) and therefore \(w \in K\).

For the inductive step, let \(n \geq 1\) and assume that for all \(x \in L\) with \(|x| \leq n\), we have \(x \in K\). Now consider an arbitrary \(w \in L\) of length \(n + 1\). Since \(w = w^R\) and \(|w| \geq 2\), the first and last characters of \(w\) are identical, so we can write \(w = axa\) for some \(a \in \Sigma\) and string \(x\). Since \(w = w^R\), \[(axa)^R = axa \Longrightarrow ax^Ra = axa \Longrightarrow x^R = x.\] Since \(x = x^R\), i.e. \(x \in L\), and \(|x| < |w|\), applying the inductive hypothesis gives us that \(x \in K\). Then using the definition of \(K\), \(w = axa\) is in \(K\).

\((K \subseteq L)\): We’ll prove \(w \in K \implies w \in L\) by structural induction. There are three cases to consider, corresponding to the three parts of the definition of \(K\).

Suppose \(w = \epsilon\). Then \(w^R = \epsilon\) and so \(w \in L\).

Suppose \(w = a\) for some \(a \in \Sigma\). Since it’s a single character, again \(w^R = a\) and so \(w \in L\).

Suppose \(w = axa\) for some \(a \in \Sigma\) and \(x \in K\). By inductive hypothesis, \(x \in L\), so \(x = x^R\). Then \(w^R = (axa)^R = ax^Ra = axa\). Therefore \(w = w^R\) and \(w \in L\).
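The double containment can be sanity-checked on small strings. The sketch below (all names are ours) closes the base cases under the rule \(w \mapsto awa\) up to a length bound and compares with the direct test \(s = s^R\).

```python
from itertools import product

SIGMA = "01"
MAX = 7

K = {""} | set(SIGMA)               # base cases: epsilon and single symbols
changed = True
while changed:                      # close under w -> a w a, up to length MAX
    changed = False
    for w in list(K):
        for a in SIGMA:
            cand = a + w + a
            if len(cand) <= MAX and cand not in K:
                K.add(cand)
                changed = True

palindromes = {"".join(t) for n in range(MAX + 1)
               for t in product(SIGMA, repeat=n)
               if "".join(t) == "".join(t)[::-1]}
assert K == palindromes             # the two definitions agree up to length MAX
```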

Let \(\mathbb Q\) denote the set of all rational numbers. Show that \(\mathbb Q\) is encodable.

By the definition of encodability, it suffices to find an alphabet \(\Sigma\) such that there is an injective function \(\phi : \mathbb Q\to \Sigma^*\). That is, we want to map each rational number to a string so that no two distinct rational numbers map to the same string.

Any alphabet would do, but let’s pick \(\Sigma = \{{\texttt{0}}, {\texttt{1}}, {\texttt{-}}, {\texttt{/}}\}\). We know every integer can be represented uniquely by a finite-length string over the alphabet \(\{{\texttt{0}}, {\texttt{1}}, {\texttt{-}}\}\). So we have an encoding \(\text{Enc}: \mathbb Z\to \{{\texttt{0}}, {\texttt{1}}, {\texttt{-}}\}^*\). We also know every rational number can be written as \(a/b\) where \(a\) and \(b\) are integers. So we may try to define \(\phi : \mathbb Q\to \Sigma^*\) such that \(\phi(a/b)\) is the string \(\text{Enc}(a){\texttt{/}}\text{Enc}(b)\). However, this is not a well-defined function because there are infinitely many ways we can express a rational number as \(a/b\) (e.g. \(2a/2b\) is one equivalent representation, and \(-a/-b\) is another). Given some rational number, we need to map it to a single string. We can do this as follows. Each rational has a unique reduced form \(a/b\) where \(\gcd(a,b) = 1\) and \(b \in \mathbb N^+\). So we let \(\phi\) map a rational number to \(\text{Enc}(a){\texttt{/}}\text{Enc}(b)\) where \(a/b\) is the reduced form of the rational number. The function is injective because if two rational numbers have the same reduced form, then they are the same number.
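The map \(\phi\) can be made concrete with Python’s `Fraction`, which stores exactly the reduced form \(a/b\) with \(b > 0\) and \(\gcd(a,b) = 1\). The helper names in this sketch are ours.

```python
from fractions import Fraction

def enc_int(z):
    """Encode an integer as a string over {0, 1, -} using binary digits."""
    if z == 0:
        return "0"
    return ("-" if z < 0 else "") + bin(abs(z))[2:]

def phi(q):
    """Map a rational to Enc(a) "/" Enc(b), where a/b is its reduced form."""
    q = Fraction(q)                 # Fraction normalizes: gcd(a, b) = 1, b > 0
    return enc_int(q.numerator) + "/" + enc_int(q.denominator)

# Equivalent representations of the same rational map to the same string:
assert phi(Fraction(2, 4)) == phi(Fraction(1, 2))
assert phi(Fraction(-1, 2)) == phi(Fraction(1, -2))
```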

Fix an alphabet \(\Sigma\). We define the notion of a *very simple* language recursively as follows.

\(\varnothing\) is very simple;

\(\{\sigma\}\) for each \(\sigma \in \Sigma\) is very simple;

if \(L_1\) and \(L_2\) are very simple, then \(L_1 \cup L_2\) is very simple;

if \(L_1\) and \(L_2\) are very simple, then \(L_1 L_2\) is very simple.

This means that a very simple language is any language that can be constructed by starting from the base cases (the first two bullet points) and applying the recursive rules (the last two bullet points) a finite number of times. Note that here we are not defining a single language, but rather a set of languages called *very simple languages*.

We define the notion of a *simple* language recursively as follows.

\(\varnothing\) is simple;

\(\{\sigma\}\) for each \(\sigma \in \Sigma\) is simple;

if \(L_1\) and \(L_2\) are simple, then \(L_1 \cup L_2\) is simple;

if \(L_1\) and \(L_2\) are simple, then \(L_1 L_2\) is simple;

if \(L\) is simple, then \(L^*\) is simple.

Give a non-recursive characterization for the set of all very simple languages. Briefly justify your answer. A detailed argument is not needed.

Briefly explain why the set of all very simple languages is not equal to the set of all simple languages.

Prove that if \(L\) is simple, then so is \(L^R\).

Let \(\Sigma = \{{\texttt{a}}, {\texttt{b}}\}\) and define \(h: \Sigma \to \Sigma^*\) such that \(h({\texttt{a}}) = {\texttt{b}}\) and \(h({\texttt{b}}) = {\texttt{ab}}\). Then \(h^* : \Sigma^* \to \Sigma^*\) is defined as follows: for \(w = \sigma_1\sigma_2 \ldots \sigma_k \in \Sigma^*\), \(h^*(w) = h(\sigma_1)h(\sigma_2) \ldots h(\sigma_k)\). We now define recursively a language \(L \subseteq \Sigma^*\) as follows:

\({\texttt{a}} \in L\);

if \(w \in L\), then \(h^*(w) \in L\).

For \(n \in \mathbb N\), let \(w_n \in \Sigma^*\) be the word generated after \(n\) applications of the recursive rule above. So \(w_0 = {\texttt{a}}\), \(w_1 = {\texttt{b}}\), \(w_2 = {\texttt{ab}}\), \(w_3 = {\texttt{bab}}\), \(w_4 = {\texttt{abbab}}\) and so on.

Prove that \(|w_n| = F_{n}\), where \(F_n\) is the \(n\)’th Fibonacci number. The Fibonacci numbers are defined as follows: \(F_0 = 1\), \(F_1 = 1\), and \(F_n = F_{n-1} + F_{n-2}\) for \(n \geq 2.\)
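Before proving the claim, it is easy to check it for small \(n\). The sketch below (names are ours) iterates the morphism and compares lengths against the Fibonacci numbers as defined above.

```python
def h_star(w):
    """Apply the morphism h(a) = b, h(b) = ab to each symbol of w."""
    return "".join("b" if ch == "a" else "ab" for ch in w)

w, lengths = "a", []
for n in range(15):
    lengths.append(len(w))          # record |w_n|
    w = h_star(w)

fib = [1, 1]                        # F_0 = F_1 = 1, F_n = F_{n-1} + F_{n-2}
while len(fib) < 15:
    fib.append(fib[-1] + fib[-2])
assert lengths == fib               # |w_n| = F_n for n = 0, ..., 14
```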

Fix the alphabet \(\Sigma = \{0,1\}\). Either prove or give a counter-example for the following statements.

For languages \(L_1\) and \(L_2\), \((L_1 L_2)^* = L_1^* L_2^*\).

For languages \(L_1\) and \(L_2\), \((L_1 \cup L_2)^* = L_1^* \cup L_2^*\).

For languages \(L_1\) and \(L_2\), \((L_1 \cup L_2)^* = (L_1^* L_2^*)^*\).
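When hunting for counterexamples to identities like these, it helps to compare both sides restricted to strings up to some length \(m\). The sketch below (all names are ours) builds such a bounded comparison and runs it on one small instance of the second statement.

```python
def concat(A, B, m):
    """All concatenations ab with a in A, b in B, of length at most m."""
    return {a + b for a in A for b in B if len(a + b) <= m}

def star_upto(A, m):
    """All strings of A* of length at most m."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = concat(frontier, A, m) - result
        result |= frontier
    return result

# One instance of the second statement, with L1 = {"0"} and L2 = {"1"}:
m = 4
L1, L2 = {"0"}, {"1"}
lhs = star_upto(L1 | L2, m)                     # (L1 ∪ L2)* up to length m
rhs = star_upto(L1, m) | star_upto(L2, m)       # L1* ∪ L2* up to length m
assert lhs != rhs                               # e.g. "01" appears only in lhs
```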

Fix some alphabet \(\Sigma\). Recall that for strings \(x\) and \(y\), we say that \(y\) is a prefix of \(x\) if there exists a string \(z\) such that \(x = yz\). Let \(A\) be a set of objects that we are interested in encoding. We say that an encoding \(\text{Enc}: A \to \Sigma^*\) is *useful* if it has the property that the set \(\{\text{Enc}(a) : a \in A\}\) does not contain two strings such that one is a prefix of the other.

In this question, we will see how we can take a useful encoding of a set \(A\) and turn it into an encoding (not necessarily useful) of the set \(A^*\). The set \(A^*\) is the set of all finite-length tuples of elements from \(A\). In mathematical notation \[A^* = \{(a_1, a_2, \ldots, a_k) : k \in \mathbb N\text{ and } a_i \in A \text{ for all } i\}.\] Note that here \(A\) is an arbitrary set (so it is not necessarily an alphabet or a language).

Here is what you need to prove. Let \(\text{Enc}: A \to \Sigma^* \setminus \{\epsilon\}\) be a useful encoding of the set \(A\). Prove that the function \(\text{Enc}^* : A^* \to \Sigma^*\), which is such that for \((a_1,a_2,\ldots, a_k) \in A^*\), \[\text{Enc}^* (a_1, a_2, \ldots, a_k) = \text{Enc}(a_1) \text{Enc}(a_2) \ldots \text{Enc}(a_k),\] is an encoding of \(A^*\).

Fix the alphabet \(\Sigma = \{0,1\}\), let \(A\) be any (countable) set, and recall the definition of a *useful* encoding of \(A\) from the previous problem.

Given some encoding \(\text{Enc}: A \to \{0,1\}^*\), we will see how we can construct another encoding \(\text{Enc}': A \to \{0,1\}^*\) that is *useful*. And we’ll try not to be wasteful in the process.

Suppose \(\text{UEnc} : \mathbb N\to \{0,1\}^*\) is some *useful* encoding of the naturals. Then define \(\text{Enc}': A \to \{0,1\}^*\) such that for \(a \in A\), \(\text{Enc}'(a) = \text{UEnc}(|\text{Enc}(a)|) \cdot \text{Enc}(a)\). Take a moment to really parse what this is saying.
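To parse the construction, it may help to instantiate it. The sketch below uses the unary encoding \(\text{UEnc}(n) = {\texttt{1}}^n{\texttt{0}}\) of the naturals, which is useful (prefix-free); the choice of this particular \(\text{UEnc}\) and all names here are ours.

```python
def uenc(n):
    """UEnc(n) = 1^n 0 -- a prefix-free (useful) encoding of the naturals."""
    return "1" * n + "0"

def enc_prime(enc, a):
    """Enc'(a) = UEnc(|Enc(a)|) . Enc(a), as in the construction above."""
    body = enc(a)
    return uenc(len(body)) + body

# A toy Enc that is NOT useful on its own ("0" is a prefix of "01"):
toy = {"x": "0", "y": "01", "z": "011"}
codes = [enc_prime(toy.get, s) for s in toy]

# After the transformation, no codeword is a prefix of another:
for u in codes:
    for v in codes:
        assert u == v or not v.startswith(u)
```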

Prove that \(\text{Enc}'\) is a *useful* encoding of \(A\).

Prove that any encoding \(\text{Enc}: A \to \{0,1\}^*\) can be transformed into a *useful* encoding \(\text{Enc}': A \to \{0,1\}^*\) such that for all \(a \in A\), if \(n = |\text{Enc}(a)|\), then \(|\text{Enc}'(a)| \leq n + O(\log n)\) (i.e. \(|\text{Enc}'(a)| - n\) is \(O(\log n)\)).

Let \(\Sigma\) be an alphabet. Let \(x\) and \(y\) be nonempty strings in \(\Sigma^*\). Consider the following three statements about \(x\) and \(y\):

\(xy = yx\);

there is a nonempty string \(z\) and numbers \(m,n \in \mathbb N^+\) such that \(x = z^m\) and \(y = z^n\);

there are numbers \(k, \ell \in \mathbb N^+\) such that \(x^k = y^\ell\).

Show that these statements are equivalent; if one of them holds, so do the other two.