This seminar by TRWBW took place on 28 September 2008 20:00 UTC, in #mathematics channel of

The timestamps displayed in the IRC text are UTC-5.
Any explanation or errors regarding this seminar should be recorded in the discussion page.

Topic Edit

A discussion of deterministic finite automata and basic theory of computation.

Seminar Edit

15:01 < TRWBW> maybe give it a minute for stragglers, but the the seminar is
               going to be an introduction to deterministic finite state
               automata, from scratch
15:02 < TRWBW> so i'm going to start with an informal description of
15:03 < TRWBW> then modify it to get into a form that can be modelled with a 
               DFA (deterministic yada ...)
15:03 < TRWBW> then give a formal definition and show some examples
15:03 < TRWBW> um, anyone here who *hasn't* seen DFA's before?
15:04 < Kasadkad> I haven't seen enough to really know what they are
15:06 < TRWBW> okay, well i guess i'll start then. i'm starting very basic, i 
               can speed up slow down if asked. kinda the advantage of a live 
               talk is that you can ask me to do stuff, so might as well use 
15:06 < TRWBW> so an informal description of a particular computation: adding 
               up a list of numbers
15:06 < TRWBW> i'll describe it as, another person has a list of numbers
15:06 < TRWBW> you ask them for numbers one at a time, integers, they give you 
               the next on the list
15:07 < TRWBW> you have a pad of scratch paper, so you start with a total of 0, 
               and add each number as you get it to get a new total
15:07 < TRWBW> when you ask for the next number and the guy says there are no 
               more, you tell him the total
15:08 < TRWBW> the DFA model has some similarities to that, but some 
               distinctions, even informally
15:08 < TRWBW> the similarities are that a DFA processes a sequence of inputs,
               one at a time, in steps, and at the end produces the output, and 
               what it does on each step is completely determined mathematically
15:09 < TRWBW> the main difference is in the concept of state
15:09 < TRWBW> in that informal example, your state is your pad of paper. and 
               as i described it, it could store anything you wanted. in 
               particular, there was no problem with the partial total getting 
               too big to write down on your scratch pad
15:10 < TRWBW> its state in the sense that if the process was interrupted, you 
               took a break or something
15:11 < TRWBW> you could come back and pick up where you left off. the sub 
               total on your paper is all the information you need from the 
               preceeding part of the computation to continue it to the output
15:11 < TRWBW> a DFA is by definition only allowed state that takes on one 
               value from a finite set. that's the "finite state" in 
               deterministic finite state automata
15:12 < TRWBW> another distinction is that while adding up takes inputs from an 
               infinite set, the set of integers, DFA's take inputs from again 
               a finite set (a different one)
15:13 < TRWBW> a third distinction is that while adding produces a number as an
               output, DFA's are very limited in the output they produce. they 
               produce a single "yes" or "no" value, or using the usual 
               terminology, "accept" or "reject"
15:13 < TRWBW> so they are a very simplified model of computation, but useful 
               because that makes them easier to define and prove theorems 
               about. they also serve as a building block for more general 
15:14 < TRWBW> to make my informal example more DFA like, i'll change it to 
               adding up one digit numbers, in fact more restricted, 0's and 
               1's, so that the input is now a sequence of values from the
               finite set {0,1}
15:15 < TRWBW> that set {0,1} is called the alphabet of the DFA. in fact all 
               the DFA terminology has a bit of a bias towards text, but it's 
               just a finite set.
15:16 < TRWBW> to make the output "accept" or "reject" i'll change it to the 
               DFA determining whether the list sums to a number that is odd. 
               if it's odd, "accept". if it's even, "reject". i'll include the 
               case of an empty list of numbers as "even"
15:17 < TRWBW> those changes allow the DFA to get by with only a finite number 
               of values of state, or "states". it only has to remember whether
               the previous subtotal was even or odd to get the final answer.
15:17 < TRWBW> so a DFA that does this can be represented graphically as:


15:18 < TRWBW> kilimanjaro, mopped: make sense so far?
15:18 < mopped> yes
15:20 < TRWBW> i'll introduce some terminology. the input to a DFA is a finite, 
               ordered sequence of values from its alphabet set, here {0,1}. in 
               the context of DFA's they are called strings, and usually 
               written without commas. so 010011 would be the sequence 
15:20 < mopped> or rather, does the double circle around 'odd' mean anything 
15:20 < TRWBW> the symbol e will denote the "empty string", a sequence of no 
               values. empty strings are perfectly valid, they show up all the 
15:20 < TRWBW> k, yes
15:21 < TRWBW> those circles are the states of the DFA. two of them have been 
               specially marked, one has an arrow from nowhere to it. that's 
               the "start state". another has a double circle, that means its 
               an "accepting state".
15:22 < TRWBW> you always have one and exactly one start state. any state, 
               including the start, can be marked as accepting or not. here i 
               just happened to have one, and it's different from the start, 
               but that's accidental. you can have every state accepting, no 
               state accepting, or anything in between.
15:23 < TRWBW> a diagram like that fully defines the DFA. given one, you can 
               determine whether the DFA accepts or doesn't accept a particular 
               string by simulating it based on the diagram
15:24 < TRWBW> for example, for the string i gave 010011, you would start in 
               the state with the arrow from nowhere, the start state, and
               since the first symbol is 0, you could follow the arrow labeled 
               0 back to the start state "even"
15:24 < TRWBW> then next the 1 takes you from "even" to
15:24 < TRWBW> then next the 1 takes you from "even" to "odd" along the arrow 
               labeled 1
15:25 < TRWBW> then 0 takes you from "odd" to itself, so you stay in odd, then 
               again for the next 0, then a 1 takes you back to even, and the 
               last 1 takes you to odd
15:25 < TRWBW> now odd has a double circle, which means its an accepting state,
               so the DFA accepts the string 010011. which it should, since 
               0+1+0+0+1+1=3 is odd
15:26 < TRWBW> i've given the requirement that it have a start state, and that 
               every state is either marked as accepting or not. the last 
               requirment is on the arrows
15:27 < TRWBW> every state must have an arrow leading from it for every symbol 
               in your alphabet. as you can see in the diagram, those arrows 
               are allowed to lead back to the state they come from, but to be 
               a DFA you are forbidden from either having a state which for 
               some symbol doesn't have an arrow out (labelled with that 
               symbol), and forbidden from having a state where for some symbol 
               it has more than one arrow out (labelled with that symbol)
15:28 < TRWBW> i'll give another example, before i move on to formal 
               definitions, this one with an alphabet of {a,b} instead of {0,1}.


15:29 < TRWBW> mopped: give it a try, would that DFA accept the string abba ?
15:31 < mopped> i'm going with no, as it lands on qa
15:31 < kilimanjaro> (Gotcha so far, I was wrangling some tacos earlier.)
15:31 < TRWBW> yes
15:32 < TRWBW> okay, for any DFA, and any particular string, you can say 
               unambiguously whether it accepts or doesn't accept that string.
15:33 < TRWBW> that lets you talk about the *set* of strings that the DFA 
               accepts. again in keeping with the bias towards text 
               terminology, that's called the "language" of the DFA
15:33 < TRWBW> perhaps a bad choice of terminology, but that's what everyone 
               uses. but all it is mathematically is some set of strings over
               the alphabet
15:34 < TRWBW> here's another


15:35 < TRWBW> mopped: maybe take a look at that and see if you can figure out 
               an informal description of what its language is
15:37 < TRWBW> if anyone new to DFA is 
               following, suggest you take a moment to try and figure out what 
               the language of that DFA is, the last one
15:38 < TRWBW> you could start with some examples, does it accept the empty 
               string? what about: a,b,ab,ba,aa,bb
15:38 < mopped> Uh, it begins/ends with the same symbol?
15:38 < TRWBW> yup
15:39 < TRWBW> i'll give a formal definition of the DFA to make it more mathy, 
               but basically the diagrams define them already, so nothing new 
15:41 < TRWBW> so to summarize, i've defined a structure for which you can say 
               whether it accepts or rejects a particular string. this means
               the structure defines a language (set of strings) that it 
               accepts. the rest of the talk will be about some theorems about 
               what kinds of languages can be defined by a DFA (accepted by the 
               DFA), and what kind can't
15:42 < TRWBW> so the standard 5-tuple definition of a DFA, a DFA 
               M=(Q,E,t,q0,A) where Q is a finite set (states), E is another 
               finite set (alphabet), t is a transition function t:QxE->Q (the 
               arrows), q0 is a member of Q (the start state), and A is a 
               subset of Q (the accepting states, marked with double circles on 
               the diagrams)
15:44 < TRWBW> a string is a finite sequence of symbols from the alphabet E. E* 
               denotes the set of all possible strings over E, and L(M), a 
               subset of E*, is the language (again, just a set of strings) 
               that M accepts
15:44 < TRWBW> mopped: does this kind of terminology make sense to you?
15:45 < mopped> not at all, but go on
15:46 < mopped> I understand what you're talking about, I'm just a bit wiery of 
                the M = () part
15:47 < mopped> wiery/wary :p
15:47 < TRWBW> mopped: it just says those are the parts of it
15:47 < mopped> ok
15:47 < TRWBW> mopped: if you have those 5 things, those define your DFA.
15:47 < TRWBW> mopped: since i think you already get the idea, the formalism 
               isn't a big idea, but it's traditional, so i'll finish it for 
               forms sake
15:48 < mopped> sure
15:48 < TRWBW> the function t is the key part of the DFA, it corresponds to the 
               arrows in the diagrams. so t(q1,s)=q2 means that there is an 
               arrow labeled s from q1 to q2
15:49 < TRWBW> the reqirement i gave before, one and only one arrow for each 
               state and symbol, corresponds to t being a valid function
15:49 < TRWBW> you can then extend t inductively from s being a single symbol, 
               0 or 1 or a or b or such, to w a string of symbols
15:50 < TRWBW> you start with e, the empty string, "", and define t(q,e)=q. 
               then for a string, you define 
15:51 < TRWBW> which is a bit of abuse of notation, but it just means that 
               t(q,w) is the state you would end up if you started in state q, 
               and followed the arrows corresponding to the symbols of w
15:52 < TRWBW> so then L(M)={w in E*: t(q0,w) in A} becomes the formal 
               definition of the language of M. it's the set of strings, from 
               the universe of all possible strings over the alphabet, such 
               that if you start in the start state, and follow the arrows, you 
               end up in a state that's in the set of accepting states.
15:54 < TRWBW> mopped: so i guess defining DFA's is done, so after that comes 
               theorems. there are a couple that would make sense at this 
               point. one is proving that if you have machines M1 and M2, there 
               exists a machine M3 such that L(M3)=L(M1) u L(M2), where that u 
               is set union. another is something called the pumping lemma, 
               that lets you prove there are some languages for which no DFA 
15:55 < TRWBW> mopped: of could do something ambitious, like define regular 
               expressions -- you might have seen if you've programmed in perl 
               or such -- and prove that any DFA corresponds to a regular 
               expression with the same language.
15:55 < TRWBW> mopped: any of those sound worthwhile?
15:55 < mopped> I've read about the latter (pumping lemmas)
15:55 < subconscious> !
15:56 < mopped> Something about palindromes having no DFA? The proof the book 
                provided was a bit over my head, but then agian I wasn't too 
                sure on my finite automata then :P
15:56 < TRWBW> mopped: yes, could do over that again.
15:57 < subconscious> great I am on /ignore
15:57 < TRWBW> subconscious: did you say something i missed?
15:59 < kilimanjaro> Although we aren't really taking a poll, I'd like to hear 
                     more about a) argument that there are languages not 
                     accepted by dfa, and b) link between dfa and regexp.
16:02 < TRWBW> kilimanjaro: the previous is just pumping lemma, and then you 
               can demonstrate plenty of languages, palindromes, a^n.b^n for n 
               natural, etc.
16:03 < kilimanjaro> lets go for it
16:03 < TRWBW> kilimanjaro: the latter there is a short proof using generalized 
               non-deterministic finite state automata, but i prepared a proof
               non-deterministic finite state automata, but i prepared a proof 
               that for any M, there is a regular expression for L(M) that 
               doesn't require any of the later machinery
16:04 < TRWBW> it's not short in the sense there are a bunch of cases, but they 
               are all similar, so after the first case the rest become easy to 
               figure out
16:04 < TRWBW> anyways, your call, you seem to be the audience ;)
16:04 < mopped> I don't mind either way, both seem interesting
16:05 < TRWBW> mopped: well have you seen regular expressions before? if you 
               program they are popular for text processing, so you might have
16:05 < kilimanjaro> well I'm most curious about the link between regular 
                     expressions and dfa, it gets a little bit less mathy but I 
                     have heard people say things like "programming language x 
                     has regular expressions but not in the sense of formal 
                     languages", etc
16:05 < kilimanjaro> so it would be interesting
16:05 < mopped> yeah I have
16:06 < TRWBW> okay well to summarize formally, you are talking about building 
               up languages starting with some basic languages and then 
               combining them with operations
16:07 < TRWBW> the basic languages are the empty language, the language
               containing the empty string (those are different), and languages 
               containing a single string of a single symbol
16:07 < TRWBW> so {},{e}, and the languages {s} for all symbols s in your 
               alphabet E
16:07 < TRWBW> mopped: does that terminology work for you?
16:08 < mopped> yeah
16:08 < TRWBW> the operations are union, i'll use | for that, concatenation, 
               i'll just . for that or just write them next to each other, and 
               the star operator *
16:09 < TRWBW> union is just set union, L=L1|L2 means L={w: (w in L1) or (w in 
               L2)}. so one or the other (or both)
16:10 < TRWBW> concatenation for strings means one starts at the end of the 
               other, so abcd.efg=abcdef, just join them. for languages you 
               join each string from the first with every one from the second
16:10 < TRWBW> so {e,a,abcd}.{cc,b}={cc,acc,abcdcc,b,ab,abcdb}
16:11 < TRWBW> formally L=L1.L2 means L={w: exists strings w1 in L1 and w2 in 
               L2 such that w=w1.w2}
16:12 < TRWBW> L* is a bit trickier if you haven't seen it. you can define it 
               as L*={}|L|(L.L)|(L.L.L)|.....
16:12 < TRWBW> ugh typo
16:12 < TRWBW> L* is a bit trickier if you haven't seen it. you can define it 
               as L*={e}|L|(L.L)|(L.L.L)|.....
16:13 < TRWBW> so for L={a,bb}, L*={e,a,bb,aa,abb,bba,bbbb,...} any string you 
               can get by concatenating strings from L onto each other, 
               including the empty string
16:14 < TRWBW> mopped: so if i wrote: {a}((b,c}*|{aa}) would you be able to 
               describe that language?
16:14 < TRWBW> typo again
16:14 < TRWBW> mopped: so if i wrote: {a}({b,c}*|{aa}) would you be able to 
               describe that language?
16:15 < mopped> hmm
16:15 < TRWBW> kilimanjaro: should check with you too, but assumed you had seen 
               that before
16:16 < kilimanjaro> TRWBW, yea I am familiar with it, just refraining from 
                     answer so as to not spoil the fun for others
16:17 < mopped> {a, b, c, bc, cb, bbc, bcb, cbc, ccb, .., aa}?
16:17 < mopped> describing it eh, not so much
16:18 < kilimanjaro> mopped, maybe try parsing it into sort of a prefix form, 
                     you got (concat {a} (or (* {b,c}) {aa})), so from the 
                     outermost operation you know at the very least that every
                     string starts with an a
16:19 < mopped> ahhh yeah
16:20 < TRWBW> mopped: it ends up being {a,ab,ac,abb,abc,acb,acc,...,aaa}
16:20 < mopped> so every string starts with a, ends with aa and then has b and 
                c inbetween
16:20 < mopped> oh :|
16:20 < TRWBW> mopped: no, the |, the or, means it either ends with b's and c's 
               mixed however, or it ends with aa
16:20 < mopped> aha
16:20 < mopped> i understand
16:21 < TRWBW> i'm gonna simplify the notation a bit, like most people do, and 
               start using a to mean, when it's in the context of a language 
               {a}. same with e, if it comes up.
16:21 < TRWBW> so that would be a({b,c}*|aa)
16:22 < TRWBW> mopped: here's another one, just cause you want to get a sense 
               before we move on, how about (a{a,b)*a)|(b{a,b}*b)
16:23 < mopped> everything starts/ends with the same letter? aa, bb, aba, bab, 
16:23 < TRWBW> mopped: yup, which is the same as


16:24 < TRWBW> so regular expressions and DFA's both are ways of defining a 
               language. kinda flip sides, one gives you a way to construct 
               strings, one gives you a way to check them.
16:24 < TRWBW> given a DFA it may not be immediately obvious how to generate 
               all the strings it accepts, but with a regular expression it is
16:25 < TRWBW> and vice versa, given a regular expression it may not be obvious 
               whether it generates a given string, but with a DFA easy to check
16:26 < TRWBW> it turns out, perhaps surprisingly, that as methods of defining 
               languages they are equivalent. for any regular expression there 
               is a DFA, and for any DFA there is a regular expression, so that
               the DFA accepts precisely the same language as the regular 
               expression generates
16:27 < TRWBW> so i'll give a proof of one way, that for any DFA, there is a 
               regular expression that generates the languages it accepts. it 
               will actually be a proof in the sense of a method (although not 
               a very effecient one) that you could apply to generate the 
               regular expression from the DFA
16:28 < TRWBW> mopped: still with me? at least in terms of understanding what 
               the issue is and what i'm going to be attempting to prove?
16:28 < mopped> yeah
16:29 < TRWBW> okay, are you better with me talking about in terms of diagrams, 
               or in terms of the transition function t(q,w) i defined formally 
16:32 < TRWBW> kilimanjaro: k, gonna start the actual proof now
16:33 < kilimanjaro> rgr
16:34 < TRWBW> so given a machine M, and its language L(M), i'm going to 
               generate a regular expression for L(M) in terms of other 
               languages, actually a whole bunch of them, in such a way that
               they all correspond to L(M') where M' is some minor modification 
               of M
16:34 < TRWBW> more importantly i'm going to do it so that M' always has one 
               fewer states than M
16:34 < TRWBW> so it's an inductive proof, or if you implemented it, a 
               recursive algorithm
16:35 < kilimanjaro> ok
16:35 < mopped> sure
16:36 < TRWBW> to simplify life later on, i'm going to assume that all the 
               DFA's have (at least one) dead-end state. i'll define a dead end 
               state as one with the following properties: 1) it's not the 
               start state, 2) its not an accepting state, 3) all transitions 
               from it lead back to it
16:36 < TRWBW> so when you get to a dead end state, no matter what happens 
               later the DFA will never accept the string being processed
16:37 < TRWBW> given any DFA you can always add a dead end state without 
               affecting anything, just draw a circle disconnected from the 
               rest of the DFA and make all its transitiosn loop back to it
16:37 < TRWBW> i'll give a DFA so i can talk about it, sec


16:39 < TRWBW> that's actually a DFA that accepts any string that is either e, 
               the empty string, or that starts with a different {a,b} then it 
               ends. if you want to think about the language. doesn't matter, 
               just there to be able to describe stuff.
16:39 < TRWBW> okay, now i'll describe the kinda "surgery" i meant
16:41 < TRWBW> call that M, by definition L(M)={w : t(q0,w) in {q0,qa_b,qb_a}}. 
               say you wanted L={w: w in L(M) and w does not visit node qa_a}
16:41 < TRWBW> call that M, by definition L(M)={w : t(q0,w) in {q0,qa_b,qb_a}}. 
               say you wanted L={w: w in L(M) and w does not visit node qa_a
16:42 < TRWBW> ugh i'll just say it words. when you start from a state q, and 
               apply a string w, the DFA goes through a sequence of states to 
               get to some final state p
16:42 < TRWBW> you can talk about whether that sequence of states visits some 
               other state r or not
16:43 < TRWBW> so that gives you another language you can define based on a DFA 
               M, the ones that are accepted (from start state) without 
               visiting some forbidden state q
16:43 < TRWBW> that make more sense?
16:43 < kilimanjaro> yes
16:44 < TRWBW> anyways, you can modify M to get an M' in the following way. 
               every transition that used to go to q, instead send it to the 
               dead state
16:44 < kilimanjaro> so this accepts the new restricted language?
16:44 < TRWBW> then L(M') is precisely that of M with state q forbidden
16:44 < TRWBW> yup
16:45 < TRWBW> so another modification, i defined L(M) in terms of {w: t(q0,w) 
               in A}. instead say you want all the strings, given two states 
               q,p such that t(q,w)=p
16:46 < TRWBW> you can do that by generating an M' from M where you change the 
               start state to q, and the accepting set to just be {p}
16:47 < mopped> when you say in A and in {q0,q_ab} etc, does that just mean 
                t(q0, w) = A, ie a string that takes it from start state to an 
                acceptance state?
16:47 < TRWBW> now neither of these modifications change the set of states, so 
               the size of M' (measured as number of states) stays the same
16:47 < TRWBW> yeah except its not really =, it's membership. you can have more 
               than one accepting, like the one i gave last does


16:47 < mopped> ok, so t(q0, w) = a member of A, got you
16:48 < TRWBW> okay, gonna try to go a bit faster, stop me if i stop making 
16:49 < TRWBW> the overall goal is to convert a DFA into a regular expression. 
               i'm going to just prove exisistance by induction on the size of 
               the machine, defined as number of states. but it will be 
               constructive, so you could make that an algorithm if you wanted.
16:49 < TRWBW> i'm assuming all machines have a dead state. that's not a 
               limitation, because you can add one to M to get M', then 
               L(M)=L(M') and if L(M') has a regular expression, so does L(M)
16:50 < TRWBW> first the base cases. they will actually be all DFA's with two 
               nodes, since under this restriction the smallest a DFA can get 
               is a start state and a dead state.
16:50 < TRWBW> so for such a machine, first see if the start state is 
               accepting. if it isn't your machine has no accepting states and 
               its language is {},  the empty language
16:51 < TRWBW> if the start state is accepting, then look at the symbols in 
               your alphabet. for each s, its transition either goes to the 
               dead state or back to the start state (which is accepting)
16:51 < TRWBW> so in that case, L(M)=(s1|s2|..)* where s1,s2,.. are the symbols
               whose transitions loop back to the start state.
16:52 < TRWBW> so with me on the base cases being covered?
16:53 < kilimanjaro> yea
16:54 < mopped> yes
16:55 < TRWBW> okay, not for any machine M, i'm going to start by splitting 
               L(M)=(L_loop)*.L_finish, where i'll define L_loop as the 
               language of strings that, when you start at the start state, 
               return to the start state (once, multiple loops are taken care 
               of by the * after), and L_finish is the language of strings 
               that, again starting from the start state, are accepted 
               *without* ever returning to the start state
16:55 < TRWBW> that's the key idea, so i'll pause again to give you time to 
               read it
16:55 < TRWBW> is it clear that those two languages are well defined, and that 
               L(M)=(L_loop)*.L_finish ?
16:56 < kilimanjaro> yes
16:57 < mopped> yes
16:58 < TRWBW> you can get a machine M' for L_finish by the surgery of
               forbidding the start state. that is, take all transitions that 
               go back to the start state and instead reroute them to the dead 
               state. that doesn't help immediately though, because you can't 
               induct yet, since M' has the same number of states as M had
16:59 < TRWBW> the next idea is that you break it down based on what the first 
               symbol of a string accepted by M' is.
17:00 < TRWBW> first you have to special case the empty string. so if the start 
               state is accepting, then e is in L_finish, and the regular 
               expression for that case is {e}
17:01 < TRWBW> non-empty strings must start with some symbol s. for each s, you 
               do the following, let q be the state that you get to from start 
               by s. q isn't the start state, since we've surgically removed 
               transitions to start
17:01 < TRWBW> now make M'' by make q the new start state, and removing the old 
               start state from the state set.
17:02 < TRWBW> i'm claiming s.L(M'') is the language {w in L_finish: w starts 
               with s}
17:02 < TRWBW> ..pauses..
17:03 < kilimanjaro> sounds right to me
17:04 < TRWBW> in that last DFA say, 
               you have a transition on a from q0 to qa_a, so you would make 
               qa_a the start state, remove q0, and then the machine accepts 
               exactly those strings that reach an accepting state from qa_a 
               without going through the start state
17:04 < TRWBW> (that was just an example of what i was saying, not something 
17:05 < TRWBW> okay, so by induction L(M'') has a regular expression, call that 
               R_s since it depended on s. that is because by removing a state, 
               it's now smaller, and we are inducting on that.
17:06 < TRWBW> now L_final=R_s1|R_s2|...  where s1,s2,.. is the whole alphabet 
               (but still finite), with |{e} stuck in too if machine accepts 
               the empty string (i.e. original machine had an accepting start 
17:08 < TRWBW> the construction for L_loop is very similar. remember L_loop={w 
               | t(q0,w)=q0, and t(q0,p)!=q0 for any prefix p, 0<|p|<|w|}
17:09 < kilimanjaro> yup
17:09 < TRWBW> so cases of strings of length 1 are self loops on the start 
               state, for any of them you have the regular expression {s}, 
               where t(q0,s)=q0
17:11 < TRWBW> for strings of greater length, you have some a first symbol of 
               the string, s1, and last symbol, s2, and states q1 and q2, with 
               t(q0,s1)=q1, and t(q2,s2)=q0, and the middle part of the string
               transitions from q1 to q2
17:13 < TRWBW> for any such pairs q1 and q2, where q0 has a transition s1 to 
               q1, and q2 has a transition s2 to q0, you get s1.L(M').s2, where 
               M' is the machine you get by the surgery of making q1 the start 
               state, q2 the accepting state, and removing the old start state
17:14 < TRWBW> again, that's smaller and you can induct, again, you | together 
               all possibilities
17:14 < TRWBW> qed i guess
17:15 < kilimanjaro> cool beans
17:15 < kilimanjaro> and the map from regexp -> dfa seems like it would be an 
                     easier argument
17:16 < |Steve|> kilimanjaro: It's actually not quite so easy to do it directly.
17:17 < kilimanjaro> what is the general argument?
17:17 < TRWBW> kilimanjaro: really you want to do NFA->DFA first, and then it 
               becomes very easy to do regexp->NFA->DFA
17:17 < |Steve|> R1R2 is easy, R1|R2 is easy, the base cases are easy, the only 
                 one that isn't so easy is R1*.
17:17 < kilimanjaro> what is NFA?
17:18 < TRWBW> kilimanjaro: non-deterministic. i think mark-t is going to do 
               that one.
17:18 < kilimanjaro> ahh just looked it up, instead of the transition giving 
                     you a state it gives you a set of states
17:19 < TRWBW> yeah
17:19 < TRWBW> anyways, i think that's enough DFA for one seminar. gonna go get 
               some dinner ;)
17:20 < kilimanjaro> thanks for the talk
17:20 < mopped> yeah, thanks
17:20 < TRWBW> np ;)
17:20 < |Steve|> Good talk.
17:20 < TRWBW> thx