This seminar by mark-t took place on 5th October 2008 20:00 UTC, in #mathematics channel of irc.freenode.net.

The timestamps displayed in the IRC text are UTC+1.

Any explanation or errors regarding this seminar should be recorded in the discussion page.

## Seminar Edit

21:00:19 ChanServ changed the topic of #mathematics to: SEMINAR IN PROGRESS. If you want to ask a question, say ! and wait to be called 21:00:26 _llll_: ready when you are mark-t 21:00:36 mark-t: all righty then 21:02:31 mark-t: so, here's a brief overview of what I expect to cover today: I'll introduce non-deterministic finite automata (NFAs) and discuss their relationship with DFAs, and I'll introduce turing machines in just enough detail to actually state the P vs. NP problem 21:03:04 mark-t: I don't expect to have enough time to really do turing machines justice, so I'd like to hold that until next week if possible 21:03:31 mark-t: and, in that case, I'll also talk more about NP-Completeness and the like 21:03:42 mark-t: so, moving on 21:03:56 mark-t: first, I need to introduce big O notation, for those of you who don't know it 21:05:32 mark-t: given two functions, in our case they'll both map the natural numbers to themselves, f and g, we'll say that f(n) is O(g(n)) if there exists a constant c such that f(n) <= c*g(n) for sufficiently large n 21:06:45 mark-t: for example, if f(n) = n^2 + 3n + 5, then f(n) is O(n^2) because for all sufficiently large n, n^2 + 3n + 5 <= 2n^2 21:07:40 mark-t: in complexity theory, which is the field in which the P vs. NP problem lies, this is the typical use for big O notation -- we use it to ignore smaller order terms and constant multipliers 21:08:20 mark-t: so, if there are no questions on that, we'll move on to NFAs 21:09:43 mark-t: but first, a note to simplify things for us: it was stated that a DFA has the requirement for every state to have one edge leaving it for each letter 21:10:45 mark-t: that's technically true, but it's pretty common just to say that a machine rejects the string if it reaches a state with no edge that it can use 21:11:31 mark-t: this is, of course, equivalent to adding a dead state to the machine that loops to itself on every input; we'll just pretend it's there, so we don't have to worry about it 21:11:55 mark-t: now, an NFA differs from a DFA in one major way: any state is allowed to have more than one edge leaving it for any given letter 21:12:36 mark-t: now, we obviously need a rule for how to deal with this 21:13:06 mark-t: we say that an NFA accepts a string if any possible path it could follow ends in an accepting state 21:13:53 mark-t: how this is interpreted is a matter of preference; I prefer to think that the machine does all the computations in parallel, but you could also think of it as being that the machine always knows the right choice to make 21:14:35 mark-t: before I do an example, there's one more little feature that we're going to add: epsilon transitions 21:14:57 mark-t: an epsilon transition is an edge in the machine whose label is epsilon, the empty string 21:15:31 mark-t: when an NFA reaches a state with an epsilon transition, it may choose (or not) to follow that edge without reading a letter from the string 21:15:36 josip_: ! 21:15:40 mark-t: yes? 21:16:05 josip_: you said that an NFA accepts a string if any possible path .. do you mean if at least one path? 21:16:17 mark-t: yes 21:16:53 mark-t: so, let's do an example 21:17:26 mark-t: consider the language over {a,b} (which will be my language all day) for the following regular expression: (ab|ba|aba)* 21:18:27 mark-t: we need to build a machine that accepts any string that is some combination of ab, ba, and aba, and rejects anything else 21:19:18 mark-t: here is the NFA that accomplishes this:

21:20:05 mark-t: so, the point of interest here is the state q_a; it has two "b" edges leaving it 21:22:43 mark-t: this is because when you read "ab" in a string, you don't know if that's the end of an "ab" or the middle of an "aba" 21:22:43 mark-t: but since the NFA has an edge for both cases, it will check both possibilities 21:22:50 mark-t: do I need to explain this machine a little more? 21:23:35 vixey: no 21:23:42 mark-t: ok then 21:24:08 mark-t: once you understand NFAs, it becomes clear that they can recognize any language described by a regular expression 21:24:54 mark-t: each possible way of checking the regular expression just becomes one path (computation) in the machine 21:25:30 mark-t: a bigger question is whether or not NFAs can accept some languages that DFAs cannot 21:26:06 mark-t: it turns out the answer is no; I'll attempt to convince you of this by showing how any NFA can be converted to a DFA, using this example 21:26:41 mark-t: the basic idea is that we want to build a DFA that emulates the behavior of the NFA 21:28:03 mark-t: this is where I prefer to think of the NFA as doing the computations in parallel; after any given number of steps, the NFA may have computations in any given subset of its states 21:28:53 mark-t: for example, after reading "ab", it is in the set {q_e,q_ab} 21:30:02 mark-t: so, our DFA will contain one state for each subset of the states of the NFA; and, for marginal simplicity, I'll omit the {} on singleton sets 21:30:29 mark-t: so, let's figure out how the transitions of the machine must look 21:30:50 mark-t: we have to start in state q_e, so that will also be the starting state for the DFA 21:31:34 mark-t: from there, if we read an "a", the NFA must end up in q_a, so that will be the target in the DFA as well; if we read a "b", the NFA must end up in q_b 21:32:17 mark-t: from q_a, if we read a "b", the NFA may end up in q_e or q_ab, so the DFA will lead to state {q_e,q_ab} with a "b" 21:32:38 mark-t: getting an "a" from q_a leads to rejecting the string, so we'll just ignore that edge 21:33:02 mark-t: now, from {q_e,q_ab}, let's consider what happens when we read an "a" 21:33:36 mark-t: if the NFA was in state q_e and read an "a", it would end up in state q_a, so q_a must be an element of the set for the next state 21:33:59 mark-t: if it was in state q_ab and read an "a", it would end up in state q_e, so q_e must also be an element of the set for the next state 21:34:18 mark-t: therefore, {q_e,q_ab} leads to {q_e,q_a} when reading an "a" 21:34:58 mark-t: we just keep repeating this logic for all states and letters until there aren't any new states, and then we're done 21:35:09 mark-t: here's the result:

21:35:19 vixey: ! 21:35:22 mark-t: yes? 21:35:45 vixey: if you get into a state {} does that mean failure? and if you are in a state {x,y,z} with no text left, what does that mean? 21:36:25 mark-t: the state {} is the dead state 21:37:18 mark-t: and thanks for bringing up the accepting criterion; it's made me realize there's a mistake in my machine 8) 21:38:06 mark-t: if the NFA runs out of text in an accepting state, then it accepts 21:38:25 ~PRIMESinP: ! 21:38:30 mark-t: so, if any element of the state {x,y,z} is an accepting state in the NFA, then the DFA should accept that string 21:38:41 mark-t: since there would've been some computation that ended in that state 21:38:42 mark-t: yes? 21:39:07 ~PRIMESinP: So, it means that {q_e,q_a} and {q_e,q_ab} states on img are also accepting? 21:39:19 mark-t: yes, they should've been 21:40:37 mark-t: by the way, a similar process can be used to show that the set of regular languages is closed under unions and intersections 21:42:20 mark-t: now, when you look at this process, you'll notice that, for an NFA with n states, the corresponding DFA could have as many as 2^n states 21:43:10 mark-t: but it's not clear that you couldn't construct a smaller DFA that accepts the same language 21:43:37 mark-t: I'd like to show an example of a set of languages where the minimal DFA has O(2^n) states, compared to the NFA 21:44:14 mark-t: consider the set of languages L_n given by L_n = {u: nth to last letter of u is 'a'} 21:44:39 mark-t: here's the NFA for L_3:

21:45:29 mark-t: the way this works is that the machine just reads through letters for as long as it likes, guesses when it's hit the third to last letter, and then goes along the path to the right 21:45:52 mark-t: note that the machine would reject if it read another letter after the accepting state 21:46:36 mark-t: so, it's pretty clear that we can always construct an NFA for L_n with n+1 states; you just keep adding one state at a time to the sequence on the right 21:47:29 mark-t: for DFAs, I intend to prove that the minimal machine for L_n has 2^n states 21:48:40 mark-t: to this end, we'll introduce the equivalence relation ~ as follows: u ~ v iff for every possible string w, either both uw and vw are in the language, or neither are 21:49:33 mark-t: for example, in L_3, "aaa" and "abbaaa" are equivalent 21:50:14 mark-t: that's because, no matter what you add to the end of each, they'll always have the same third to last letter 21:51:25 mark-t: now, for DFAs, if the machine is reading two strings that are not equivalent, then they can't end up in the same state 21:52:23 mark-t: that's because, for a DFA, the only memory it has is what state it's currently in; anything that happens after that is indistinguishable between them 21:52:47 mark-t: (for two things that end up in the same state) 21:53:14 mark-t: now, I claim that the following strings are all inequivalent: aaa, aab, aba, abb, baa, bab, bba, bbb 21:54:36 mark-t: clearly, this is true for the ones that begin with different letters -- adding the empty string makes the machine accept one and reject the other 21:55:00 mark-t: for the ones that are different in the second letter, adding any string of length one to the end produces the same result 21:55:16 mark-t: for the ones that are different in the third letter, adding any string of length two produces the same result 21:55:35 mark-t: and I think this makes it clear that L_n will have at least 2^n strings that are pairwise inequivalent 21:55:52 mark-t: and therefore, any DFA that accepts L_n must have at least 2^n states 21:56:39 mark-t: it turns out that every regular language has a minimal DFA, and the number of states is equal to the number of equivalence classes, and this machine is unique 21:57:03 mark-t: from here, this isn't hard to prove, but I'll omit that 21:57:50 mark-t: but one consequence of this fact is that it gives another way, in addition to the pumping lemma, to show that a language isn't regular: if it has infinitely many equivalence classes, it can't be recognized by a finite automaton 21:58:41 mark-t: by the way, L_n does have exactly 2^n equivalence classes; here's the minimal DFA for L_3:

File:Mark-t-min-dfa-for-L3.png

21:58:52 mark-t: yes, it's a mess, but that's partly graphviz's fault 21:59:19 mark-t: so, if there aren't any questions on this, I think it's time to move on to turing machines 22:01:06 mark-t: there are two major differences between DFAs and turing machines: a turing machine, after it reads a letter, can decide to look at either the letter after it or the letter before it, and a turing machine may edit the string in the middle of a computation 22:02:15 mark-t: a little more technically, each transition in a turing machine has three parts: the symbol it reads, the symbol it writes, and the direction it moves 22:03:21 mark-t: the turing machine reads its input from a semi-infinite tape; this tape always begins with a blank space, followed by the input string, and then followed by infinitely many blank spaces 22:04:53 mark-t: the turing machine has an extra alphabet, called the tape alphabet; it contains the alphabet of the language, but it also contains a B, for a blank space, and it may contain many other symbols as well, which the machine uses to mark spaces that it cares about 22:05:11 mark-t: these will normally be something like X or Y 22:06:11 mark-t: so, a turing machine halts when it gets to a state and position on the tape where the symbol on the tape has no corresponding transition in the machine 22:06:49 mark-t: when that happens, it decides whether or not to reject the string in the same way as a DFA: it checks whether or not it's in an accepting state 22:08:04 mark-t: one caveat is that a turing machine might never halt at all; for example, it might decide that it wants to read any alphabet symbol, including B, replace it on the tape, and then always move to the right 22:08:17 mark-t: this machine would never halt, because after the input, it will keep reading blanks forever 22:09:00 mark-t: also, if the machine moves off the left side of the tape, this is considered a bad thing, and the machine bursts into flames 22:09:12 mark-t: or it rejects 22:09:25 ~TRWBW2: ! 22:09:28 mark-t: yes? 22:09:39 ~TRWBW2: you including any notion at this level of end of the input? 22:10:06 mark-t: what end of the input? 22:10:23 ~TRWBW2: mark-t: you are using blank spaces for the end of the input, am i interpreting that correctly? 22:10:29 mark-t: yes 22:10:31 ~TRWBW2: k 22:11:18 mark-t: ok, so let's do a quick example: we'll construct a turing machine that recognizes the language L given by L = {u: u has the same number of 'a's and 'b's} 22:12:39 mark-t: intuitively, one way you might solve this problem, is to look at your string and look for an "a" or "b"; when you find an "a", you cross it off, find a "b" and cross it off; if you found a "b", you'd cross it off, look for an "a", and cross it off; then, you'd repeat 22:12:52 mark-t: that's more or less what the machine will do 22:13:28 mark-t: but it needs to do it in a systematic way 22:14:02 mark-t: so what we'll actually do is the following: find the first letter that isn't crossed off, cross it off, then find the first letter of the opposite kind, cross it off, then go back to the beginning 22:14:23 mark-t: the machine will terminate when it can't find any letter to cross off 22:14:50 ~^C: ! 22:14:50 mark-t: so, let's build this thing 22:14:52 mark-t: yes? 22:15:16 ~^C: How it could determine if there's no letter to cross off? 22:15:34 mark-t: it will run into a blank before it runs into an "a" or "b", whatever it's looking for 22:16:57 ~^C: So there is blank on the left side of tape ("before" input)? 22:17:03 mark-t: yes 22:17:40 mark-t: ok, so, we'll start off in state q_0; we read the initial blank, leave it alone, and go to the right; we'll call this new state q_f, since at this point, if we encountered another blank, the string would have 0 "a"s and 0 "b"s, so we want to accept it here -- f stands for final 22:18:51 mark-t: now, from here, if we read an "a", we want to replace it with an X, then move to the right and start looking for a "b"; we'll call that direction q_a; we'll do the same thing for "b" and call that one q_b 22:19:36 mark-t: now, from q_a, we want to skip past any "a"s we see, so we'll have a loop on this state that keeps going right 22:20:09 mark-t: if we read a "b", then we want to replace it with an X and go back to the beginning of the string 22:20:53 mark-t: it turns out that we can reuse state q_0 for this; just add loops on it to skip past any "a" and skip past any "X" 22:21:31 mark-t: we'll do something similar for state q_b, so we'll also add a loop on q_0 to skip past any "b" 22:22:06 mark-t: once we get back to the blank at the beginning of the string, q_0 already tells us we should leave it there and go to the right, back to state q_f 22:23:07 mark-t: at this point, we want to find the next "a" or "b"; in particular, that means we want to skip the X we know we're currently looking at 22:23:24 mark-t: we'll add a loop to q_f that skips any X it finds 22:24:31 mark-t: now, if it runs into a blank, that means it's done, and there were an equal number of "a"s and "b"s, so we'd just stay in q_f and accept 22:24:55 mark-t: if we didn't want to accept there, we could read one blank and go to a dead state that just rejects on the next blank 22:26:00 mark-t: so anyway, this now all repeats; if we found an "a" or "b", we'd replace it with an X and look for the next one; this time, though, we might have to ignore some X on the way, so we add loops to q_a and q_b to ignore any X it finds 22:26:23 mark-t: and then we're done; here's the result:

22:27:11 mark-t: the edge transitions look like "letter being read" / "letter being written" "L(eft) or R(ight)" 22:28:34 |Steve|: ! 22:28:37 mark-t: yes? 22:28:49 |Steve|: That will move left immediately and then reject. 22:29:00 mark-t: nope, the first letter on the tape is a blank 22:29:14 |Steve|: Oh, that's nonstandard. 22:29:20 |Steve|: Is B blank? 22:29:23 mark-t: yes 22:29:38 |Steve|: Okay. 22:29:49 mark-t: it's the way I was taught, which was from Sudkamp's "Languages and Machines" 22:30:48 mark-t: I find it a useful convention, as this would require much more care if the blank wasn't there 22:35:55 |Steve|: All of the minor variations are easily shown to be equivalent. 22:36:34 mark-t: ok, so non-deterministic turing machines are very analogous to NFAs; if there's more than one edge from any state that reads the same symbol, the machine is deterministic; the non-deterministic turing machine accepts if there's any possible computation that accepts the string 22:37:00 mark-t: er, "the machine is non-deterministic" 22:37:59 mark-t: I won't go into it this week, but again deterministic and non-deterministic turing machines accept the same set of languages; basically, it's because the deterministic one is powerful enough to emulate the non-deterministic one 22:38:34 mark-t: so, we need one more notion, and that's the time complexity of a turing machine 22:39:17 mark-t: if M is a turing machine, then tc(M) is a function of the natural numbers to itself, computed as follows: 22:40:26 mark-t: given a string u, let t_M(u) be the number of transitions it takes for the machine M to accept or reject u 22:40:51 mark-t: then, tc(M) is the function that takes n to the maximum of t_M(u) over all strings u of length n 22:41:27 mark-t: and, for this to be defined, we're only considering machines that halt on all inputs 22:41:44 mark-t: now, that's for a deterministic turing machine 22:42:10 mark-t: for a non-deterministic, we let t_M(u) be the maximum number of transitions in all possible computations of u 22:43:32 mark-t: with this, we can define a class of languages P by P = {L: there exists a deterministic turing machine M accepting L such that tc(M) is O(f) for some polynomial f} 22:44:14 mark-t: similarly, we define NP = {L: there exists a non-deterministic turing machine M accepting L such that tc(M) is O(f) for some polynomial f} 22:44:27 mark-t: then, the P vs. NP problem asks whether or not P and NP are the same 22:45:16 mark-t: so, I think I'll end there, and leave the rest for next week 22:45:32 vixey: thanks mark-t! very interesting 22:45:48 Cale: :) 22:46:18 ~^C: P is subset of NP for sure. ;) 22:30:46 ChanServ changed the topic of #mathematics to: Next Seminar: Properties of Turing Machines and NP-Completeness by mark-t, 12 October 20:00 UTC