This seminar by mark-t took place on 5th October 2008, 20:00 UTC, in the #mathematics channel of

The timestamps displayed in the IRC text are UTC+1.
Any explanation or errors regarding this seminar should be recorded in the discussion page.

Seminar

21:00:19 ChanServ changed the topic of #mathematics to: SEMINAR IN PROGRESS.
                  If you want to ask a question, say ! and wait to be called
21:00:26 _llll_: ready when you are mark-t 
21:00:36 mark-t: all righty then
21:02:31 mark-t: so, here's a brief overview of what I expect to cover today: I'll introduce
                 non-deterministic finite automata (NFAs) and discuss their relationship with
                 DFAs, and I'll introduce turing machines in just enough detail to actually
                 state the P vs. NP problem
21:03:04 mark-t: I don't expect to have enough time to really do turing machines justice, so
                 I'd like to hold that until next week if possible
21:03:31 mark-t: and, in that case, I'll also talk more about NP-Completeness and the like
21:03:42 mark-t: so, moving on
21:03:56 mark-t: first, I need to introduce big O notation, for those of you who don't know it
21:05:32 mark-t: given two functions f and g, which in our case will both map the natural
                 numbers to themselves, we'll say that f(n) is O(g(n)) if there exists a
                 constant c such that f(n) <= c*g(n) for sufficiently large n
21:06:45 mark-t: for example, if f(n) = n^2 + 3n + 5, then f(n) is O(n^2) because for all
                 sufficiently large n, n^2 + 3n + 5 <= 2n^2
21:07:40 mark-t: in complexity theory, which is the field in which the P vs. NP problem lies,
                 this is the typical use for big O notation -- we use it to ignore smaller
                 order terms and constant multipliers
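This example can be checked numerically; the sketch below is an illustration added to this transcript (not part of the session) verifying that the constant c = 2 works once n >= 5.

```python
# Sanity check for the example above (an illustration added to the
# transcript, not part of the session): f(n) = n^2 + 3n + 5 is O(n^2),
# witnessed by the constant c = 2 for all n >= 5.

def f(n):
    return n**2 + 3*n + 5

# 3n + 5 <= n^2 first holds at n = 5, so f(n) <= 2*n^2 from there on.
assert f(4) > 2 * 4**2                                  # 33 > 32: n = 4 is too small
assert all(f(n) <= 2 * n**2 for n in range(5, 10_000))  # holds for all larger n checked
```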
21:08:20 mark-t: so, if there are no questions on that, we'll move on to NFAs
21:09:43 mark-t: but first, a note to simplify things for us: it was stated that a DFA requires
                 every state to have one edge leaving it for each letter
21:10:45 mark-t: that's technically true, but it's pretty common just to say that a machine
                 rejects the string if it reaches a state with no edge that it can use
21:11:31 mark-t: this is, of course, equivalent to adding a dead state to the machine that loops
                 to itself on every input; we'll just pretend it's there, so we don't have to
                 worry about it
21:11:55 mark-t: now, an NFA differs from a DFA in one major way: any state is allowed to have
                 more than one edge leaving it for any given letter
21:12:36 mark-t: now, we obviously need a rule for how to deal with this
21:13:06 mark-t: we say that an NFA accepts a string if any possible path it could follow ends
                 in an accepting state
21:13:53 mark-t: how this is interpreted is a matter of preference; I prefer to think that the
                 machine does all the computations in parallel, but you could also think of it
                 as being that the machine always knows the right choice to make
21:14:35 mark-t: before I do an example, there's one more little feature that we're going to
                 add: epsilon transitions
21:14:57 mark-t: an epsilon transition is an edge in the machine whose label is epsilon, the
                 empty string
21:15:31 mark-t: when an NFA reaches a state with an epsilon transition, it may choose (or not)
                 to follow that edge without reading a letter from the string
21:15:36 josip_: !
21:15:40 mark-t: yes?
21:16:05 josip_: you said that an NFA accepts a string if any possible path .. do you mean if
                 at least one path?
21:16:17 mark-t: yes
21:16:53 mark-t: so, let's do an example
21:17:26 mark-t: consider the language over {a,b} (which will be my alphabet all day) for the
                 following regular expression: (ab|ba|aba)*
21:18:27 mark-t: we need to build a machine that accepts any string that is some combination of
                 ab, ba, and aba, and rejects anything else
21:19:18 mark-t: here is the NFA that accomplishes this:

[image: NFA for (ab|ba|aba)*]

21:20:05 mark-t: so, the point of interest here is the state q_a; it has two "b" edges leaving
                 it
21:22:43 mark-t: this is because when you read "ab" in a string, you don't know if that's the
                 end of an "ab" or the middle of an "aba"
21:22:43 mark-t: but since the NFA has an edge for both cases, it will check both possibilities
21:22:50 mark-t: do I need to explain this machine a little more?
21:23:35 vixey: no
21:23:42 mark-t: ok then
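The "parallel computations" reading of the acceptance rule can be made concrete with a small simulator (a sketch added to this transcript, not from the session; the transition table is read off the machine just described, and since this machine has no epsilon transitions, no closure step is needed).

```python
# Simulating the NFA for (ab|ba|aba)* by tracking the set of states the
# machine could currently be in (an illustration added to the transcript).
# delta maps (state, letter) -> set of possible next states.

delta = {
    ("q_e", "a"): {"q_a"},          # start of "ab" or "aba"
    ("q_e", "b"): {"q_b"},          # start of "ba"
    ("q_a", "b"): {"q_e", "q_ab"},  # end of "ab", or middle of "aba"
    ("q_ab", "a"): {"q_e"},         # end of "aba"
    ("q_b", "a"): {"q_e"},          # end of "ba"
}

def nfa_accepts(u, start="q_e", accepting=frozenset({"q_e"})):
    current = {start}               # every state some computation could be in
    for letter in u:
        current = set().union(*(delta.get((q, letter), set()) for q in current))
    return bool(current & accepting)
```

After reading "ab" the simulator is in {q_e, q_ab}, exactly the set that comes up in the subset discussion below.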
21:24:08 mark-t: once you understand NFAs, it becomes clear that they can recognize any language
                 described by a regular expression
21:24:54 mark-t: each possible way of checking the regular expression just becomes one path
                 (computation) in the machine
21:25:30 mark-t: a bigger question is whether or not NFAs can accept some languages that DFAs
                 can't
21:26:06 mark-t: it turns out the answer is no; I'll attempt to convince you of this by showing
                 how any NFA can be converted to a DFA, using this example
21:26:41 mark-t: the basic idea is that we want to build a DFA that emulates the behavior of
                 the NFA
21:28:03 mark-t: this is where I prefer to think of the NFA as doing the computations in
                 parallel; after any given number of steps, the NFA may have computations in
                 any given subset of its states
21:28:53 mark-t: for example, after reading "ab", it is in the set {q_e,q_ab}
21:30:02 mark-t: so, our DFA will contain one state for each subset of the states of the NFA;
                 and, for marginal simplicity, I'll omit the {} on singleton sets
21:30:29 mark-t: so, let's figure out how the transitions of the machine must look
21:30:50 mark-t: we have to start in state q_e, so that will also be the starting state for the
                 DFA
21:31:34 mark-t: from there, if we read an "a", the NFA must end up in q_a, so that will be the
                 target in the DFA as well; if we read a "b", the NFA must end up in q_b
21:32:17 mark-t: from q_a, if we read a "b", the NFA may end up in q_e or q_ab, so the DFA will
                 lead to state {q_e,q_ab} with a "b"
21:32:38 mark-t: getting an "a" from q_a leads to rejecting the string, so we'll just ignore
                 that edge
21:33:02 mark-t: now, from {q_e,q_ab}, let's consider what happens when we read an "a"
21:33:36 mark-t: if the NFA was in state q_e and read an "a", it would end up in state q_a, so
                 q_a must be an element of the set for the next state
21:33:59 mark-t: if it was in state q_ab and read an "a", it would end up in state q_e, so q_e
                 must also be an element of the set for the next state
21:34:18 mark-t: therefore, {q_e,q_ab} leads to {q_e,q_a} when reading an "a"
21:34:58 mark-t: we just keep repeating this logic for all states and letters until there
                 aren't any new states, and then we're done
21:35:09 mark-t: here's the result:

[image: DFA produced by the subset construction]

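The procedure just carried out by hand can be written down directly (a sketch added to this transcript, not from the session; the NFA transition table is the one from the example above).

```python
# The subset construction from the discussion above, as code (an
# illustration added to the transcript). DFA states are frozensets of
# NFA states; the empty frozenset plays the role of the dead state {}.

from collections import deque

def nfa_to_dfa(delta, alphabet, start, accepting):
    dfa_start = frozenset({start})
    dfa_delta, dfa_accepting = {}, set()
    todo, seen = deque([dfa_start]), {dfa_start}
    while todo:
        S = todo.popleft()
        if S & accepting:   # some computation ending here accepts
            dfa_accepting.add(S)
        for letter in alphabet:
            T = frozenset().union(*(delta.get((q, letter), set()) for q in S))
            dfa_delta[(S, letter)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return dfa_delta, dfa_start, dfa_accepting

# The NFA for (ab|ba|aba)* from the example:
nfa = {("q_e", "a"): {"q_a"}, ("q_e", "b"): {"q_b"},
       ("q_a", "b"): {"q_e", "q_ab"}, ("q_ab", "a"): {"q_e"},
       ("q_b", "a"): {"q_e"}}
dfa_delta, dfa_start, dfa_accepting = nfa_to_dfa(nfa, "ab", "q_e", {"q_e"})

def dfa_accepts(u):
    S = dfa_start
    for letter in u:
        S = dfa_delta[(S, letter)]
    return S in dfa_accepting
```

Only 7 of the 2^4 possible subsets turn out to be reachable here, counting the dead state.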
21:35:19 vixey: !
21:35:22 mark-t: yes?
21:35:45 vixey: if you get into a state {} does that mean failure? and if you are in a state
                {x,y,z} with no text left, what does that mean?
21:36:25 mark-t: the state {} is the dead state
21:37:18 mark-t: and thanks for bringing up the accepting criterion; it's made me realize
                 there's a mistake in my machine  8)
21:38:06 mark-t: if the NFA runs out of text in an accepting state, then it accepts
21:38:25 ~PRIMESinP: !
21:38:30 mark-t: so, if any element of the state {x,y,z} is an accepting state in the NFA, then
                 the DFA should accept that string
21:38:41 mark-t: since there would've been some computation that ended in that state
21:38:42 mark-t: yes?
21:39:07 ~PRIMESinP: So, it means that {q_e,q_a} and {q_e,q_ab} states on img are also
                 accepting?
21:39:19 mark-t: yes, they should've been
21:40:37 mark-t: by the way, a similar process can be used to show that the set of regular
                 languages is closed under unions and intersections
21:42:20 mark-t: now, when you look at this process, you'll notice that, for an NFA with
                 n states, the corresponding DFA could have as many as 2^n states
21:43:10 mark-t: but it's not clear that you couldn't construct a smaller DFA that accepts the
                 same language
21:43:37 mark-t: I'd like to show an example of a family of languages where the minimal DFA
                 has 2^n states, compared to an NFA with n+1 states
21:44:14 mark-t: consider the set of languages L_n given by
                  L_n = {u: nth to last letter of u is 'a'}
21:44:39 mark-t: here's the NFA for L_3:

[image: NFA for L_3]

21:45:29 mark-t: the way this works is that the machine just reads through letters for as long
                 as it likes, guesses when it's hit the third to last letter, and then goes
                 along the path to the right
21:45:52 mark-t: note that the machine would reject if it read another letter after the
                 accepting state
21:46:36 mark-t: so, it's pretty clear that we can always construct an NFA for L_n with n+1
                 states; you just keep adding one state at a time to the sequence on the right
21:47:29 mark-t: for DFAs, I intend to prove that the minimal machine for L_n has 2^n states
21:48:40 mark-t: to this end, we'll introduce the equivalence relation ~ as follows:
                 u ~ v iff for every possible string w, either both uw and vw are in the 
                 language, or neither are
21:49:33 mark-t: for example, in L_3, "aaa" and "abbaaa" are equivalent
21:50:14 mark-t: that's because, no matter what you add to the end of each, they'll always
                 have the same third to last letter
21:51:25 mark-t: now, for DFAs, if the machine is reading two strings that are not equivalent,
                 then they can't end up in the same state
21:52:23 mark-t: that's because, for a DFA, the only memory it has is what state it's currently
                 in; anything that happens after that is indistinguishable between them
21:52:47 mark-t: (for two things that end up in the same state)
21:53:14 mark-t: now, I claim that the following strings are all inequivalent:
                 aaa, aab, aba, abb, baa, bab, bba, bbb
21:54:36 mark-t: clearly, this is true for the ones that begin with different letters
                 -- adding the empty string makes the machine accept one and reject the other
21:55:00 mark-t: for the ones that differ in the second letter, adding any string of length
                 one to the end moves that letter into third-to-last position, so the machine
                 accepts one and rejects the other
21:55:16 mark-t: for the ones that differ in the third letter, adding any string of length
                 two to the end does the same
21:55:35 mark-t: and I think this makes it clear that L_n will have at least 2^n strings that
                 are pairwise inequivalent
21:55:52 mark-t: and therefore, any DFA that accepts L_n must have at least 2^n states
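This counting argument can be brute-forced for L_3 (a check added to this transcript, not part of the session). Since membership in L_3 only depends on the last three letters, suffixes of length at most 3 are enough to separate inequivalent strings.

```python
# Counting the equivalence classes of ~ for L_3 by brute force (an
# illustration added to the transcript). Membership in L_3 depends only
# on the last three letters, so suffixes of length <= 3 suffice to
# distinguish any two inequivalent strings.

from itertools import product

def in_L3(u):
    return len(u) >= 3 and u[-3] == "a"   # third-to-last letter is 'a'

def strings(max_len, alphabet="ab"):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

suffixes = list(strings(3))

def signature(u):
    # u ~ v exactly when their signatures agree
    return tuple(in_L3(u + w) for w in suffixes)

classes = {signature(u) for u in strings(6)}
print(len(classes))   # 8 = 2^3, matching the claim above
```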
21:56:39 mark-t: it turns out that every regular language has a minimal DFA, and the number of
                 states is equal to the number of equivalence classes, and this machine is
                 unique up to renaming the states
21:57:03 mark-t: from here, this isn't hard to prove, but I'll omit that
21:57:50 mark-t: but one consequence of this fact is that it gives another way, in addition
                 to the pumping lemma, to show that a language isn't regular: if it has
                 infinitely many equivalence classes, it can't be recognized by a finite
                 automaton
21:58:41 mark-t: by the way, L_n does have exactly 2^n equivalence classes; here's the minimal
                 DFA for L_3:

[image: minimal DFA for L_3]

21:58:52 mark-t: yes, it's a mess, but that's partly graphviz's fault
21:59:19 mark-t: so, if there aren't any questions on this, I think it's time to move on to
                 turing machines
22:01:06 mark-t: there are two major differences between DFAs and turing machines: a turing
                 machine, after it reads a letter, can decide to look at either the letter after
                 it or the letter before it, and a turing machine may edit the string in the
                 middle of a computation
22:02:15 mark-t: a little more technically, each transition in a turing machine has three parts:
                 the symbol it reads, the symbol it writes, and the direction it moves
22:03:21 mark-t: the turing machine reads its input from a semi-infinite tape; this tape always
                 begins with a blank space, followed by the input string, and then followed by
                 infinitely many blank spaces
22:04:53 mark-t: the turing machine has an extra alphabet, called the tape alphabet; it contains
                 the alphabet of the language, but it also contains a B, for a blank space, and
                 it may contain many other symbols as well, which the machine uses to mark
                 spaces that it cares about
22:05:11 mark-t: these will normally be something like X or Y
22:06:11 mark-t: so, a turing machine halts when it gets to a state and position on the tape
                 where the symbol on the tape has no corresponding transition in the machine
22:06:49 mark-t: when that happens, it decides whether or not to reject the string in the same
                 way as a DFA: it checks whether or not it's in an accepting state
22:08:04 mark-t: one caveat is that a turing machine might never halt at all; for example, it
                 might decide that it wants to read any alphabet symbol, including B, replace
                 it on the tape, and then always move to the right
22:08:17 mark-t: this machine would never halt, because after the input, it will keep reading
                 blanks forever
22:09:00 mark-t: also, if the machine moves off the left side of the tape, this is considered
                 a bad thing, and the machine bursts into flames
22:09:12 mark-t: or it rejects
22:09:25 ~TRWBW2: !
22:09:28 mark-t: yes?
22:09:39 ~TRWBW2: you including any notion at this level of end of the input?
22:10:06 mark-t: what end of the input?
22:10:23 ~TRWBW2: mark-t: you are using blank spaces for the end of the input, am i interpreting
                 that correctly?
22:10:29 mark-t: yes
22:10:31 ~TRWBW2: k
22:11:18 mark-t: ok, so let's do a quick example: we'll construct a turing machine that
                 recognizes the language L given by
                 L = {u: u has the same number of 'a's and 'b's}
22:12:39 mark-t: intuitively, one way you might solve this problem is to scan your string
                 looking for an "a" or "b"; when you find an "a", you cross it off, then find a
                 "b" and cross it off; if you found a "b" first, you'd cross it off, then find
                 an "a" and cross it off; then you'd repeat
22:12:52 mark-t: that's more or less what the machine will do
22:13:28 mark-t: but it needs to do it in a systematic way
22:14:02 mark-t: so what we'll actually do is the following: find the first letter that isn't
                 crossed off, cross it off, then find the first letter of the opposite kind,
                 cross it off, then go back to the beginning
22:14:23 mark-t: the machine will terminate when it can't find any letter to cross off
22:14:50 ~^C: !
22:14:50 mark-t: so, let's build this thing
22:14:52 mark-t: yes?
22:15:16 ~^C: How it could determine if there's no letter to cross off?
22:15:34 mark-t: it will run into a blank before it runs into an "a" or "b", whatever it's
                 looking for
22:16:57 ~^C: So there is blank on the left side of tape ("before" input)?
22:17:03 mark-t: yes
22:17:40 mark-t: ok, so, we'll start off in state q_0; we read the initial blank, leave it
                 alone, and go to the right; we'll call this new state q_f, since at this point,
                 if we encountered another blank, the string would have 0 "a"s and 0 "b"s, so
                 we want to accept it here -- f stands for final
22:18:51 mark-t: now, from here, if we read an "a", we want to replace it with an X, then move
                 to the right and start looking for a "b"; we'll call that direction q_a; we'll
                 do the same thing for "b" and call that one q_b
22:19:36 mark-t: now, from q_a, we want to skip past any "a"s we see, so we'll have a loop on
                 this state that keeps going right
22:20:09 mark-t: if we read a "b", then we want to replace it with an X and go back to the
                 beginning of the string
22:20:53 mark-t: it turns out that we can reuse state q_0 for this; just add loops on it to
                 skip past any "a" and skip past any "X"
22:21:31 mark-t: we'll do something similar for state q_b, so we'll also add a loop on q_0 to
                 skip past any "b"
22:22:06 mark-t: once we get back to the blank at the beginning of the string, q_0 already tells
                 us we should leave it there and go to the right, back to state q_f
22:23:07 mark-t: at this point, we want to find the next "a" or "b"; in particular, that means
                 we want to skip the X we know we're currently looking at
22:23:24 mark-t: we'll add a loop to q_f that skips any X it finds
22:24:31 mark-t: now, if it runs into a blank, that means it's done, and there were an equal
                 number of "a"s and "b"s, so we'd just stay in q_f and accept
22:24:55 mark-t: if we didn't want to accept there, we could read one blank and go to a dead
                 state that just rejects on the next blank
22:26:00 mark-t: so anyway, this now all repeats; if we found an "a" or "b", we'd replace it
                 with an X and look for the next one; this time, though, we might have to ignore
                 some X on the way, so we add loops to q_a and q_b to ignore any X it finds
22:26:23 mark-t: and then we're done; here's the result:

[image: turing machine for L]

22:27:11 mark-t: the edge transitions look like "letter being read" / "letter being written"
                 "L(eft) or R(ight)"
22:28:34 |Steve|: !
22:28:37 mark-t: yes?
22:28:49 |Steve|: That will move left immediately and then reject.
22:29:00 mark-t: nope, the first letter on the tape is a blank
22:29:14 |Steve|: Oh, that's nonstandard.
22:29:20 |Steve|: Is B blank?
22:29:23 mark-t: yes
22:29:38 |Steve|: Okay.
22:29:49 mark-t: it's the way I was taught, which was from Sudkamp's "Languages and Machines"
22:30:48 mark-t: I find it a useful convention, as this would require much more care if the
                 blank wasn't there
22:35:55 |Steve|: All of the minor variations are easily shown to be equivalent.
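The machine just built can be cross-checked with a short simulation (a sketch added to this transcript, not from the session; the transition table below is reconstructed from the description above, so treat it as an approximation of the diagram, with q_f as the only accepting state).

```python
# A simulator for the machine described above (my reconstruction from the
# text of the seminar; q_f is the only accepting state, B is the blank).
# delta maps (state, symbol) -> (symbol to write, move, next state),
# where move is +1 for R and -1 for L.

B = "B"
tm_delta = {
    ("q_0", B):   (B, +1, "q_f"),    # skip the leading blank
    ("q_0", "a"): ("a", -1, "q_0"),  # walking back to the start...
    ("q_0", "b"): ("b", -1, "q_0"),
    ("q_0", "X"): ("X", -1, "q_0"),
    ("q_f", "X"): ("X", +1, "q_f"),  # skip crossed-off letters
    ("q_f", "a"): ("X", +1, "q_a"),  # cross off an 'a', go hunt a 'b'
    ("q_f", "b"): ("X", +1, "q_b"),  # cross off a 'b', go hunt an 'a'
    ("q_a", "a"): ("a", +1, "q_a"),
    ("q_a", "X"): ("X", +1, "q_a"),
    ("q_a", "b"): ("X", -1, "q_0"),  # found the matching 'b'; rewind
    ("q_b", "b"): ("b", +1, "q_b"),
    ("q_b", "X"): ("X", +1, "q_b"),
    ("q_b", "a"): ("X", -1, "q_0"),  # found the matching 'a'; rewind
}

def tm_accepts(u, max_steps=100_000):
    tape = [B] + list(u) + [B]       # leading blank, then the input
    state, pos = "q_0", 0
    for _ in range(max_steps):       # guard; this machine always halts anyway
        if pos == len(tape):
            tape.append(B)           # extend the semi-infinite tape on demand
        key = (state, tape[pos])
        if key not in tm_delta:      # no transition: halt and check the state
            return state == "q_f"
        symbol, move, state = tm_delta[key]
        tape[pos] = symbol
        pos += move
    raise RuntimeError("step limit exceeded")
```

For instance, tm_accepts("abba") is True, while tm_accepts("aab") halts in q_a and rejects.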

22:36:34 mark-t: ok, so non-deterministic turing machines are very analogous to NFAs; if
                 there's more than one edge from any state that reads the same symbol, the
                 machine is non-deterministic; the non-deterministic turing machine accepts if
                 there's any possible computation that accepts the string
22:37:59 mark-t: I won't go into it this week, but again deterministic and non-deterministic
                 turing machines accept the same set of languages; basically, it's because the
                 deterministic one is powerful enough to emulate the non-deterministic one
22:38:34 mark-t: so, we need one more notion, and that's the time complexity of a turing
                 machine
22:39:17 mark-t: if M is a turing machine, then tc(M) is a function of the natural numbers to
                 itself, computed as follows:
22:40:26 mark-t: given a string u, let t_M(u) be the number of transitions it takes for the
                 machine M to accept or reject u
22:40:51 mark-t: then, tc(M) is the function that takes n to the maximum of t_M(u) over all
                 strings u of length n
22:41:27 mark-t: and, for this to be defined, we're only considering machines that halt on all
                 inputs
22:41:44 mark-t: now, that's for a deterministic turing machine
22:42:10 mark-t: for a non-deterministic machine, we let t_M(u) be the maximum number of
                 transitions over all possible computations on u
22:43:32 mark-t: with this, we can define a class of languages P by
                  P =  {L: there exists a deterministic turing machine M accepting L
                          such that tc(M) is O(f) for some polynomial f}
22:44:14 mark-t: similarly, we define
                  NP = {L: there exists a non-deterministic turing machine M accepting L
                          such that tc(M) is O(f) for some polynomial f}
22:44:27 mark-t: then, the P vs. NP problem asks whether or not P and NP are the same
22:45:16 mark-t: so, I think I'll end there, and leave the rest for next week
22:45:32 vixey: thanks mark-t! very interesting
22:45:48 Cale: :)
22:46:18 ~^C: P is subset of NP for sure. ;)

22:30:46 ChanServ changed the topic of #mathematics to: Next Seminar: Properties of Turing
                  Machines and NP-Completeness by mark-t, 12 October 20:00 UTC