Here we explain how two-level rules work, how
they can be implemented as finite-state machines, and
how all the types of rule constraints can be translated
into finite-state tables.  We then summarize the rule
semantics.  This is followed by a detailed discussion
of rule conflicts, specificity, and conflicts amongst
SUBSETS; and finally, an explanation of the rule
file format and the rules
in the pc-kimmo file english.rul.

It's a lot to read through, but I hope it is complete
and will guide you through Spanish.

1. How two-level rules work.

Consider Rule 2  (R2) below.

R2   t:c ==> ____i

The operator ==> means that lexical t is realized as a surface c only
(but not always) in the environment preceding i:i.

The correspondence t:c declared in R2 is a special correspondence.
All two-level descriptions must also contain a set of *default*
correspondences, such as t:t, i:i, etc. (This is the so-called
"BOGUS RULE" - it isn't really bogus, it is a default.)
Together, the special and default correspondences make up the total set
of valid correspondences, or feasible pairs, that can be used in the
description.

If a two-level description containing R2 (and all default
correspondences) is applied to the lexical (underlying) form "tati"
(without the quote marks), PCKIMMO proceeds as follows to produce the
corresponding surface form(s).  (NOTE this is why you can use
GENERATE without a dictionary and JUST the .rul file.)

Beginning with the first character of the input form, it looks to see
if there is a correspondence declared for it. Due to R2, it will find
that lexical t can correspond to surface c, so it will begin by
positing that correspondence.

Lexical:   t   a     t   i
           |   |     |   |
Rule:     R2
           |
Surface:   c

At this point the generator has entered R2. For the posited t:c
correspondence to succeed, the generator MUST find an i:i
correspondence next - that is what R2 says. When the generator moves
on to the second character of the input word, it finds that it is a
lexical a, and thus R2 FAILS, so the generator must back up, undo
what it has done so far, and try to find a different path. Backing up
to the first character t, it now tries the DEFAULT correspondence t:t
(which is guaranteed to succeed, since it has NO conditions):

Lexical:   t   a     t   i
           |   |     |   |
Rule:     R2
           |
Surface:   t

The generator now moves on to the second character. No correspondence
for lexical a has been declared other than the default, so the
generator posits a surface a:

Lexical:   t   a     t   i
           |   |     |   |
Rule:     R2   |
           |   |
Surface:   t   a

Moving on to the third character, the generator again finds a lexical
t, so it posits a surface c and enters R2 again:


Lexical:   t   a     t   i
           |   |     |   |
Rule:     R2   |    R2
           |   |     | 
Surface:   t   a     c

Now the generator looks at the fourth character, a lexical i. This
SATISFIES the environment of R2, so it keeps the i (NOTE that the
constraint refers only to a surface i, and says nothing about the
lexical, underlying character):

R2   t:c ==> ____i

Since the context of R2 requires an i, the generator must also posit
a surface i, so it does, and exits R2.  NOTE that by the time R2 is
finished, TWO characters will have been posited.

Lexical:   t   a     t   i
           |   |     |   |
Rule:     R2   |    R2   |
           |   |     |   |
Surface:   t   a     c   i

Since there are no more characters in the lexical form, the generator
outputs the surface form "taci".  However, the generator is not yet
done. It will continue backtracking, trying to find alternative
realizations of the lexical form.  First, it will undo the i:i
correspondence of the last character of the input word, then it will
consider the third character, lexical t.  Having already tried the
correspondence t:c, it will try the default correspondence t:t:

Lexical:   t   a     t   i
           |   |     |   
Rule:     R2   |     |   
           |   |     |   
Surface:   t   a     t   i

Now the generator will try the final correspondence and succeed,
since R2 does NOT prohibit t:t before an i (rather, it prohibits t:c
in any environment EXCEPT BEFORE i). It will then output "tati".
The reader may confirm that no other outputs will be generated.
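
The walk-through above can be checked with a few lines of Python. This is only a sketch: it tries every combination of feasible pairs by brute force rather than backtracking the way PCKIMMO does, and the names FEASIBLE, obeys_r2 and generate are made up for this illustration. The alphabet (t, a, i) is assumed from the example.

```python
from itertools import product

# Feasible pairs: the special pair t:c plus the default correspondences.
# (Assumed alphabet for this example: t, a, i.)
FEASIBLE = {"t": ["c", "t"], "a": ["a"], "i": ["i"]}

def obeys_r2(lexical, surface):
    """R2 (t:c ==> ___i): every t:c must be immediately followed by i:i."""
    pairs = list(zip(lexical, surface))
    for k, pair in enumerate(pairs):
        if pair == ("t", "c"):
            if k + 1 >= len(pairs) or pairs[k + 1] != ("i", "i"):
                return False
    return True

def generate(lexical):
    """Try every combination of feasible pairs and keep those R2 allows."""
    choices = [FEASIBLE[ch] for ch in lexical]
    candidates = ("".join(s) for s in product(*choices))
    return [s for s in candidates if obeys_r2(lexical, s)]

print(generate("tati"))   # prints ['taci', 'tati']
```

The candidates "caci" and "cati" are thrown out because their first t:c is followed by a:a, which is the same failure that made the generator back up in the trace.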

2. The ==> rule as a finite-state machine.

A key insight of PCKIMMO is that if phonological rules are written as
two-level rules, they can be implemented as FSTs running in
parallel. In the sections that follow we briefly show how each of the
four rule types (==>, <==, <==>, and /<==) translates to an FST.
We then go on to describe conflicts in SUBSETS and RULES.

2.1 A ==> rule.  
Consider rule R2 again.

A possible paraphrase is, If ever the correspondence t:c occurs, it
must be followed by i:i. In other words, if anything OTHER THAN t:c
occurs, this rule ignores it.  This must be incorporated into our
two-level FST, call this T2 (for table 2)
  
    t   i  @
    c   i  @

1:  2   1  1
2.  0   1  0

The @:@ column matches ANY feasible pair other than t:c and i:i.
State 1 is a kind of 'default' state that ignores everything except the
substring crucial to the rule. It is also the only final, accepting
state (marked with a colon; state 2 is marked nonfinal with a period).

Importantly, the state table is constructed such that the entire set
of feasible pairs in the rule description is partitioned among the
column headers WITH NO OVERLAP (this is the source of MANY bugs in
Kimmo rule systems). T2 specifies the special correspondence t:c and
the environment in which it is allowed. (On t:c the machine goes to
state 2 to anticipate that an i:i comes next. If it does, the rule
succeeds and the machine returns to state 1; if not, it goes to
state 0, the rejecting state.)

The column header @:@ in T2 matches ALL the feasible pairs that are
defined by ALL THE OTHER FSTs of the system - thus saying that R2
'takes a pass' and doesn't care about any other feasible pairs. So,
with respect to T2, @:@ does not stand for all feasible pairs, but
rather for all feasible pairs except t:c and i:i.

The default correspondences of the system must be declared in a
trivial FST like T3:  (also see below where we cover the .rul file
format).   If we assume p, t, k, a, i, u in our alphabet, then we need:

    p  t  k  a  i  u  @
    p  t  k  a  i  u  @

1:  1  1  1  1  1  1  1  
(Table T3)


Even this table of correspondences must include @:@ as a column.
Otherwise, it would fail when it encountered a special correspondence
such as t:c, because all the rules in a two-level description apply in
parallel, and for each character in an input string ALL the rules
must succeed, even if vacuously.  Now, given the lexical form tatik,
T2 and T3 together will generate the surface forms tatik and tacik.
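
To see T2 and T3 running in parallel, here is a small Python sketch of a table-driven generator. The dictionary layout, the names step and generate, and the FEASIBLE map are inventions for this illustration and are not PCKIMMO's own data structures. State 0 means failure, and "@:@" catches any pair that has no column of its own.

```python
# Tables map a state to {column header: next state}; 0 means failure and
# "@:@" stands for any pair that has no column of its own.
T2 = {"final": {1},
      "rows": {1: {"t:c": 2, "i:i": 1, "@:@": 1},
               2: {"t:c": 0, "i:i": 1, "@:@": 0}}}
T3 = {"final": {1},
      "rows": {1: {**{f"{c}:{c}": 1 for c in "ptkaiu"}, "@:@": 1}}}

# Surface choices per lexical character; the special pair t:c is tried first.
FEASIBLE = {c: [c] for c in "ptkaiu"}
FEASIBLE["t"] = ["c", "t"]

def step(table, state, pair):
    """Look up the next state, falling back to the @:@ column."""
    row = table["rows"][state]
    return row.get(pair, row["@:@"])

def generate(lexical, tables=(T2, T3)):
    results = []

    def search(pos, states, surface):
        if pos == len(lexical):
            # Accept only if every table ends in one of its final states.
            if all(s in t["final"] for s, t in zip(states, tables)):
                results.append(surface)
            return
        for surf in FEASIBLE[lexical[pos]]:
            pair = f"{lexical[pos]}:{surf}"
            nxt = [step(t, s, pair) for t, s in zip(tables, states)]
            if 0 not in nxt:          # every table must succeed on this pair
                search(pos + 1, nxt, surface + surf)

    search(0, [1] * len(tables), "")
    return results

print(generate("tatik"))   # prints ['tacik', 'tatik']
```

Note how T3 never blocks anything: on t:c it falls through to its @:@ column, which is exactly why that column has to be there.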

IMPORTANT. To understand how to represent two-level rules as state
tables, we must understand what the rules really mean.  It is a
common tendency to think of them positively, that is, as statements of
where the correspondence succeeds.  IN FACT STATE TABLES ARE FAILURE
DRIVEN, THEY SPECIFY WHERE THE CORRESPONDENCES MUST FAIL.

This point is perhaps THE biggest source of difficulty in building
the FSTs.

In our case above, it is natural to think of R2 as saying that t:c
succeeds when it occurs preceding i:i.  But T2 actually works because
it FAILS when ANYTHING BUT i:i follows t:c.  

2.2   A <== rule.

Now consider R4.

R4  t:c <== ____i

This rule says that lexical t is always realized as surface c when it
occurs before i:i, but NOT ONLY BEFORE i:i.  Thus, the lexical form
tati will successfully match the surface form taci, but not tati.
Note, however, it would also match "caci" since it does not disallow
t:c in any environment.  Rather, its function is to disallow t:t in
the environment preceding i:i.
Remember that state tables are failure-driven, so the strategy of
writing the state table for R4 is to force it to fail if it
recognizes the sequence t:t i:i.  So the state table for R4, viz., T4,
looks like this:

T4
    t t  i  @
    c t  i  @

1:  1 2  1  1
2:  1 2  0  1

In state 1, any occurrences of the pairs t:c, i:i, or any other
feasible pairs are allowed without leaving state 1.  It is only the
correspondence t:t that forces a transition to state 2, where all
feasible pairs succeed except i:i.  Note that state 2 must be a final
state - this allows all the correspondences to succeed and return to
state 1. Also note that in state 2 the cell under the t:t column
contains a 2.  This is necessary to allow for the possibility of a tt
sequence in the input.  For example, tatti will surface as the form
tatci. This phenomenon is called "backlooping" - more on this below.
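
Here is a quick Python check of T4 used as a recognizer. It is a sketch with made-up names (T4, FINAL, accepts), and it assumes that in state 2 the pair t:c succeeds and returns to state 1, as the paragraph above describes ("all feasible pairs succeed except i:i"); without that, tatti could not surface as tatci.

```python
# T4 as a dictionary: state -> {column header: next state}; 0 means failure.
T4 = {1: {"t:c": 1, "t:t": 2, "i:i": 1, "@:@": 1},
      2: {"t:c": 1, "t:t": 2, "i:i": 0, "@:@": 1}}
FINAL = {1, 2}

def accepts(lexical, surface):
    """Return True if T4 allows this lexical/surface pairing."""
    state = 1
    for lex, surf in zip(lexical, surface):
        header = f"{lex}:{surf}"
        state = T4[state].get(header, T4[state]["@:@"])
        if state == 0:
            return False
    return state in FINAL

for surface in ("taci", "tati", "caci"):
    print(surface, accepts("tati", surface))
print("tatci", accepts("tatti", "tatci"))
```

The only pairing rejected here is tati/tati, where t:t is followed by i:i. The backloop (the 2 under t:t in state 2) is what lets the second t of tatti be checked against the following i.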

Actually T4 is potentially over-specified. It is not really the pair
t:t that is disallowed before i, but rather the pair t:not-c
(lexical t and surface anything but c)   Given that the more specific
correspondence t:c is already in the table, the more general
correspondence t:@ will take care of all the rest of the characters,
including t:t. (I'll leave the details of this to you..)

In summary, the rule type L:S <== E positively says that L is ALWAYS
realized as S in the environment E.  Thus, it is a kind of OBLIGATORY
rule. Negatively, it says that realizing L as any character but S
is not allowed in E. The state table must be written so that it
forces all correspondences of L with anything BUT S to fail in E.

2.3 A <==> rule.

R5  t:c <==> ____i

The state table for a <==> rule is simply the combination of the
tables for ==>  and <==.  You build it by ANDing (intersecting) the
two FSTs. So here, lexical t MUST be realized as c before i, and t:c
may occur NOWHERE ELSE.
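
One way to see the "ANDing" without building a combined table is to run the ==> table (T2) and the <== table (T4) side by side and accept only when both accept. The sketch below does that; the names run and accepts_r5 are made up for this illustration, and T4 is assumed to send t:c in state 2 back to state 1.

```python
T2 = {"final": {1},
      "rows": {1: {"t:c": 2, "i:i": 1, "@:@": 1},
               2: {"t:c": 0, "i:i": 1, "@:@": 0}}}
T4 = {"final": {1, 2},
      "rows": {1: {"t:c": 1, "t:t": 2, "i:i": 1, "@:@": 1},
               2: {"t:c": 1, "t:t": 2, "i:i": 0, "@:@": 1}}}

def run(table, lexical, surface):
    """Run one table over a lexical/surface pairing."""
    state = 1
    for lex, surf in zip(lexical, surface):
        row = table["rows"][state]
        state = row.get(f"{lex}:{surf}", row["@:@"])
        if state == 0:
            return False
    return state in table["final"]

def accepts_r5(lexical, surface):
    """t:c <==> ___i holds when both the ==> and the <== table accept."""
    return run(T2, lexical, surface) and run(T4, lexical, surface)

for surface in ("taci", "tati", "caci"):
    print(surface, accepts_r5("tati", surface))
```

Of the three candidates only taci survives: tati is rejected by the <== half, and caci by the ==> half.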

We next turn to the problem of what can happen when you have
more than one rule - rule conflicts, the use of SUBSETS, and
overlapping character descriptions.

3.0 Writing rules: conflicts, SUBSETS, and character descriptions.

Writing Rule Automata - Part 2

In this part we cover common issues that arise from using subsets, rule conflicts,
and character descriptions in subsets, as well as use of the word boundary (#), affix
marker (+) and 0 symbols.  We follow this with a detailed look at the rules for English.

Don't worry - after the abstract bit that follows below, we
immediately do an English example to illustrate it.

As a summary for writing rules based on ==>, <==, etc.:
given some Lexical (L) and Surface (S) correspondence, and the
environment (E) in which it occurs, you can ask yourself:

(a) is E the ONLY environment in which L:S is allowed?

(b) Must L ALWAYS be realized as S in the environment E?  (forgetting
for the moment whether E means left or right context, or both)

There are 4 possible outcomes.  Depending on the outcome:

(1) If (a) is Yes and (b) is No, the rule is L:S ==> E
(2) If (a) is No and (b) is Yes, the rule is L:S <== E
(3) If both (a) and (b) are Yes, the rule is L:S <==> E.  (This means
    "if and only if": this is the ONLY place you see this
    correspondence, and it MUST be like this.)
(4) If neither is Yes, find the other environments in which L:S is
    allowed, combine these into a single disjunctive environment, and go
    through the exercise again.
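
The four outcomes amount to a tiny decision table. As a sketch (the function name choose_operator and its arguments are made up here):

```python
def choose_operator(only_in_e, always_in_e):
    """Map the answers to questions (a) and (b) onto a rule operator.

    only_in_e   -- answer to (a): is E the ONLY environment for L:S?
    always_in_e -- answer to (b): must L ALWAYS be realized as S in E?
    """
    if only_in_e and always_in_e:
        return "<==>"
    if only_in_e:
        return "==>"
    if always_in_e:
        return "<=="
    return None   # outcome (4): widen the environment and ask again

print(choose_operator(True, False))   # prints ==>
```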

IF YOU LOOK AT THE .rul file for English, you will note that there is
a series of ==> and <== rules.  For instance, there are 2 rules for
Epenthesis: one that is written ==>, and one that is written <== or /<==.
These are rules 3 and 4 in the .rul file.  Indeed, all the English
rules are written in this paired form.

The reason for this is simple: the basic lexical:surface
correspondences may best be stated as <==> ("if and only if" form).
But to write <==> rules as state tables, as we explain for epenthesis and
in general below, we SPLIT them into two parts, one is ==> and the
other is either <== or /<==. 

Let's see how this works for y-i spelling in English (spy+ed/spied), to make the rule
tables, and state the general principles.


Here are paraphrases for the ==>, <==, <==> and /<== forms. Remember
that 'rc' denotes 'right context' and 'lc' denotes 'left context' -
each can be a string of symbols.

==>
ONLY BUT NOT ALWAYS: (1) The rule L:S ==> lc___rc means "the expression lc L:S rc is
    allowed, but L:S in any other context is NOT allowed."

<==
ALWAYS BUT NOT ONLY: (2) The rule L:S <== lc___rc can be
   paraphrased as, "the expression lc L:notS rc is not allowed".

<==>
ALWAYS AND ONLY:  (3) The rule L:S <==> lc___rc can be paraphrased as
     "the expression lc L:S rc is allowed, L:S is allowed in this
     lc___rc environment and in no other, and no other realization
     of L (that is, L:notS) is allowed in this lc___rc environment".

/<==
NEVER: (4) The rule L:S /<== lc___rc can be paraphrased as
    "the expression lc L:S rc is not allowed".


OK, let's try this for y-i spelling.  We can observe the following for
English, that a lexical y is realized as a surface i after a consonant
and before a suffix:

Lexical forms:  boy+s    spy+0s    spy+ed   happy+ly    spot0+y+ness
Surface forms:  boy0s    spi0es    spi0ed   happi0ly    spott0i0ness

Please note how we use the zeroes to make sure each lexical/surface
form is the same length, a mathematical requirement for composition as
you may recall.  As soon as we have padded out boys into boy0s, that
forces the pairing of +:0.  Once we have done that, it must appear as
such in all the other forms.
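
Once the forms are padded, reading off the correspondences is mechanical. A minimal Python sketch (the function names align and special_pairs are made up for this illustration):

```python
def align(lexical, surface):
    """Pair up two zero-padded forms character by character."""
    if len(lexical) != len(surface):
        raise ValueError("pad both forms with 0 until they are the same length")
    return list(zip(lexical, surface))

def special_pairs(lexical, surface):
    """Return the non-default pairs (lexical != surface), in order, no repeats."""
    pairs = [p for p in align(lexical, surface) if p[0] != p[1]]
    return list(dict.fromkeys(pairs))

print(special_pairs("spy+0s", "spi0es"))
# prints [('y', 'i'), ('+', '0'), ('0', 'e')]
```

Running this over all five example forms is a quick way to collect the special correspondences that your rules and tables will have to declare.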

Note that in the word 'spottiness' y-i spelling occurs after a
consonant that is inserted by another rule, gemination (consonant
doubling, big-bigger).  This means that y-i spelling applies after a
surface consonant, expressed as @:C.  

OK, on to automaton building.  The rule for y-i spelling therefore
looks like this.  

    y-i Spelling

R5   y:i <==> :C___+:0

NOW, to build the automata, we will take each <==> rule and break it
down into TWO automata, one to handle the ==> part, and one to handle
the <== part.  In general, this is true of ALL the rules in the
english.rul file - they are rules of the form <==> (if and only if),
and are then broken down into two.

Here is the ==> form.


   RULE "6 y:ispelling  y:i ==> :C___+:0"  3  4 

And here is the <== form.  (This is not exactly what's in the
english.rul file, as we shall see. This rule will need to be revised
because it interacts with another rule, Epenthesis, the rule that
inserts an e, as in fox+s/foxes. Consider also try+ing/trying: here
the y does not get spelled as i.)

   RULE "6 y:ispelling  y:i <== :C___+:0"  3 5

We shall start with the table for the ==> portion.

Here is how to map each kind of correspondence into an automaton
table.

First, ==> rules

The strategy for writing a state table for this is to construct a
    table that recognizes the sequence lc L:S rc, FORBIDS any other
    occurrence of L:S, and permits ANY OTHER correspondences to occur anywhere.

The steps are as follows.  Note that they are close to an algorithmic
spec, so one *could* write a compiler from rules to tables.... (a project
for one or more of you?)

1. To convert each ==> rule:

   a. [Make column headers] Make a list of
   column headers for the fsa by writing down all the correspondences
   used in the expression lc L:S rc (including correspondences with @
   and subset names).  Add @:@ to the end of the list.

   b. [Recognize context and L:S string] Beginning with state 1, add
   states (rows) and fill in the state
   transitions in the appropriate cells in the fsa table to recognize
   precisely the expression  lc L:S rc.  The final symbol in the
   expression normally should result in a transition back to state 1
   (except see step e below: there is one exception called 'backlooping').

   c. [Mark final, nonfinal states] Mark state 1 with a colon to
   indicate that it is a final state (e.g., 1:).  Mark every state
   that is traversed before L:S is reached as a final state.  If L:S
   is more than 1 character long, also mark as final all states
   traversed while L:S is being recognized.  Use a period to mark all states
   traversed *after* that point as nonfinal states.  That is, once L:S
   is encountered, it is not in the correct environment unless the
   full right context is found, thus these states cannot be final.

   d. [Forbid L:S in any other context] Since L:S in any other
   environment is not allowed, fill in the rest of the COLUMN for L:S
   with zeroes.  Further, in any state traversed during the recognition
   of right context, any correspondences other than those provided for
   in rc means that L:S is in the WRONG context. Thus, the rest of the
   cells for the states traversed in rc should be filled with zeroes.

   e. [Permit any other correspondence] All remaining cells in the
   transition table denote successful transitions as far as THIS rule
   is concerned. In most cases, these cells are filled with
   transitions back to the initial state (state 1) except in the case
   where backlooping applies - that is, a state that represents the
   characters matching the first character, or sequence of characters,
   in the expression lc L:S rc.  Transitions must be specified that
   represent the successful recognition of that character or sequence
   of characters, rather than state 1.   (Why do we have to do this?
   Because if there are TWO left-context starting symbols in a row --
   we want to make sure that we don't fail to recognize the second
   one.  We want to 'loop' until we get to the very last one, which
   would be the real start of the left context.)
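
Since steps a-e are nearly algorithmic, here is a rough Python sketch of such a compiler for the simplest cases. It is NOT a general rule compiler: it assumes the left context is at most one column header, that no header is repeated in the expression, and that backlooping only concerns the first symbol of the expression. The function name compile_only_rule and the dictionary layout are made up for this illustration.

```python
def compile_only_rule(lc, pair, rc):
    """Build a ==> table for `pair ==> lc ___ rc`.

    lc, rc -- lists of column headers such as "@:C" (lc: at most one header)
    pair   -- the L:S correspondence, e.g. "y:i"
    Returns (columns, table, final); table[state][header] is the next
    state, and 0 means failure.
    """
    expr = lc + [pair] + rc
    columns = list(dict.fromkeys(expr + ["@:@"]))             # step a
    n = len(expr)
    table = {s: {} for s in range(1, n + 1)}
    for k, header in enumerate(expr, start=1):                # step b
        table[k][header] = k + 1 if k < n else 1
    focus = len(lc) + 1              # the state that L:S leaves from
    final = set(range(1, focus + 1))                          # step c
    for s in table:                                           # step d
        table[s].setdefault(pair, 0)
        if s > focus:                # states inside the right context
            for col in columns:
                table[s].setdefault(col, 0)
    for s in table:                                           # step e
        for col in columns:
            backloop = 2 if col == expr[0] else 1
            table[s].setdefault(col, backloop)
    return columns, table, final

columns, table, final = compile_only_rule(["@:C"], "y:i", ["+:0"])
for state, row in table.items():
    mark = ":" if state in final else "."
    print(f"{state}{mark}", [row[c] for c in columns])
```

For the y-i rule this prints rows 2 0 1 1, 2 3 1 1 and 0 0 1 0, with state 3 nonfinal. Called with an empty left context, compile_only_rule([], "t:c", ["i:i"]) reproduces table T2.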

OK, let's apply this to our y-i case.

 RULE "6 y:ispelling  y:i ==> :C___+:0"  3  4


 Step a. Form character columns.  If we collect all the symbols we
 need that are mentioned in the context and the L:S itself, we get:
 y:i, @:C, and +:0.  To this we add as usual @:@.

So here is our table so far:

	@  y  +  @
        C  i  0  @
        ----------

  Step b. Recognize the context and L:S string.  We want to write a
  'straight line' fsa that will recognize exactly the string 
  @:C y:i +:0    So this takes three states:
  
  1-->@:C-->2-->y:i-->3-->+:0-->1


	@  y  +  @
        C  i  0  @
        ----------
      1 2
      2    3
      3       1

  
   Step c. Mark final and nonfinal states.  State 1 is marked as final
   (as usual).  State 2 is traversed before y:i is reached, while the
   left context is being recognized, so it is marked final.  What about
   state 3?  It is reached only after y:i has been seen, and y:i is in
   the correct environment only if the right context +:0 follows.  So
   state 3 is marked as nonfinal, with a period.


	 @  y  +  @
         C  i  0  @
         ----------
      1: 2
      2:    3
      3.       1


   Step d.  Forbid L:S in any other context.  Put 0's
   in all of the other places in the column for the L:S pair, i.e.,
   y:i:


	 @  y  +  @
         C  i  0  @
         ----------
      1: 2  0
      2:    3
      3.    0  1


   Put 0's for every other correspondence pair in any state after this
   recognition point, that is, for state 3 and beyond (that means we
   fill in the remaining part of the row for state 3 with 0's):

	 @  y  +  @
         C  i  0  @
         ----------
      1: 2  0
      2:    3
      3. 0  0  1  0


   Step e. Permit any other correspondence. Fill in transitions to
   state 1 for all other cells - EXCEPT for those in the column that
   represents the beginning of the left context, @:C -- these are the
   'backlooping' cells, where we must 'idle' on a string of @:C's until
   we hit the last one:


         @  y  +  @
         C  i  0  @
         ----------
      1: 2  0  1  1
      2:    3  1  1
      3. 0  0  1  0

   Now, finally, add in backloops from any other states back to the
   state that represents the (string of) symbols beginning the left
   context.  All transitions in the @:C column not otherwise
   filled in must go to state 2.  Recall that the effect of this is to
   'loop' on the left context starting symbol @:C. That gives us:

         @  y  +  @
         C  i  0  @
         ----------
      1: 2  0  1  1
      2: 2  3  1  1
      3. 0  0  1  0


The steps in building a L:S /<== lc___rc table are as follows.  Please
NOTE the difference brought in by the negation - in Step b below, we
arrive at failure (state 0) rather than success (state 1).

    Step a. [Form column headers].  Write down all the correspondences used
    in the expression lc L:S rc (including those with @ and subset
    names). Add @:@ to the end of the list.

    Step b. [Recognize expression as failure] Beginning with state 1,
    add states and fill in the transitions to recognize lc L:S rc.
    The final symbol in the expression should result in failure (i.e.,
    the cell representing recognition should contain 0).

    Step c. [Mark states]. Use a colon to mark EVERY state as a final
    state.

    Step d. All remaining cells denote successful transitions as far
    as this rule is concerned. In most cases, the transitions are back
    to state 1, except if backlooping occurs, as before.


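For comparison with these all-final tables, here once more is the
finished ==> table for y-i spelling, with the backloop on the first
symbol of the left context, i.e., @:C, into state 2.  Note that
state 3 is nonfinal.

         @  y  +  @
         C  i  0  @
         ----------
      1: 2  0  1  1
      2: 2  3  1  1
      3. 0  0  1  0


And, whew, we are done with the ==> direction.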

Now, what about writing out <== rules?   We need a method for these too.
The method for writing a <== rule as a fsa table is to construct a
table that recognizes the sequence lc L:notS rc and forbids it,
while permitting any other correspondence to occur
anywhere.

The steps in building the table are:

    Step a. [Form column headers.] Put down L:S. Then, put down L:@,
    which now represents L: NOT S.  Next, write down all
    correspondences used in lc and rc (including the correspondences
    with @ and subset names).  Add @:@ to the end of the list.

    Step b. [Form recognition sequence].  Beginning with state 1, add
    states (rows) and fill in state transitions to recognize the
    expression lc L:@ rc.  (This is a straightline automaton as we did
    above - BUT NOTE the difference from ==>.) The final symbol in the
    expression should result in FAILURE (the cell should be a 0) -
    rather than, as before, success (state 1).

    Step c. Use a colon to mark EVERY state as a final state.

    Step d. All remaining cells in the transition table denote
    successful transitions as far as this rule is concerned. In most
    cases, these are filled with transitions back to the initial state
    (state 1), except in the case of backlooping, as before.
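
These four steps can also be sketched as a small Python function. The same caveats apply as for any toy compiler: it assumes single-header contexts with no repeated headers, and it handles backlooping only for the first symbol of the expression. The name compile_always_rule and the dictionary layout are made up for this illustration.

```python
def compile_always_rule(lc, lex, surf, rc):
    """Build a <== table for `lex:surf <== lc ___ rc`; 0 means failure.

    lc, rc -- lists of column headers such as "@:C"
    Returns (columns, table, final) with table[state][header] = next state.
    """
    right = f"{lex}:{surf}"
    wrong = f"{lex}:@"                       # stands for lex:not-surf
    expr = lc + [wrong] + rc
    columns = list(dict.fromkeys([right, wrong] + lc + rc + ["@:@"]))  # step a
    n = len(expr)
    table = {s: {} for s in range(1, n + 1)}
    for k, header in enumerate(expr, start=1):                # step b
        table[k][header] = k + 1 if k < n else 0
    final = set(table)                                        # step c
    for s in table:                                           # step d
        for col in columns:
            # Backloop on the first symbol of the expression, else state 1.
            table[s].setdefault(col, 2 if col == expr[0] else 1)
    return columns, table, final

columns, table, final = compile_always_rule(["@:C"], "y", "i", ["+:0"])
order = ["@:C", "y:i", "y:@", "+:0", "@:@"]
for state, row in table.items():
    print(f"{state}:", [row[c] for c in order])
```

For y-i spelling this prints rows 2 1 1 1 1, 2 1 3 1 1 and 2 1 1 0 1. With an empty left context, compile_always_rule([], "t", "c", ["i:i"]) gives the two-state table for R4, with t:@ playing the role of t:t.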

OK, now let's apply this to our <== y:i spelling rule.

 RULE "6 y:ispelling  y:i <== :C___+:0"  3 5


 Step a. Form column headers.  We first add y:i.  Then, we add y:@.
 Now we add the context pairs, @:C and +:0.  Finally, we add @:@.

	 @  y  y  +  @
         C  i  @  0  @
         -------------

 Step b. Add the straight-line fsa to recognize the string @:C y:@  +:0:

          1-->@:C-->2-->y:@-->3-->+:0-->0

	 @  y  y  +  @
         C  i  @  0  @
         -------------
      1  2
      2        3
      3           0


 Step c. Mark final states --- all states are final.

	 @  y  y  +  @
         C  i  @  0  @
         -------------
      1: 2
      2:       3
      3:          0


 Step d.  Add success transitions for all other states, except for
 the column headed by @:C  which is the beginning of the left context.


	 @  y  y  +  @
         C  i  @  0  @
         -------------
      1: 2  1  1  1  1
      2:    1  3  1  1
      3:    1  1  0  1

  Now add the backloop states to state 2, on character @:C:


	 @  y  y  +  @
         C  i  @  0  @
         -------------
      1: 2  1  1  1  1
      2: 2  1  3  1  1
      3: 2  1  1  0  1


HANDLING RULE INTERACTIONS.

That's enough for one rule.  What happens when we have another,
though?

Consider the EPENTHESIS RULE - the one in English that inserts an 'e'
into fox+s --> foxes.

Note that epenthesis INTERACTS with the y:i rule --- because it applies
AFTER y:i.  (WHY? Because we have the forms spy+s/spi0es --- the 'e'
has been inserted by epenthesis.)

We must therefore take this into account in the Epenthesis rule.  In
fact, if you look at the Epenthesis rule in the english.rul file, you
will see that there is a column headed by y:i.  NOTE that if you just
wrote down the epenthesis rule all by itself, without considering y:i,
you need not include this column.  (Try it, for practice --- the
resulting automaton table is something like this, in one direction:

RULE "3 Epenthesis  0:e ==> Csib +:0___s#" 5 6

        s  Csib  +  0  #  @
        s  Csib  0  e  #  @  
        -------------------
    1:  2  2     1  0  1  1
    2:  2  2     3  0  1  1
    3:  2  2     1  4  1  1
    4.  5  0     0  0  0  0
    5.  0  0     0  0  1  0
 
   )

So THIS is where you have to think about rule interactions - and you
can do this by thinking about possible lexical, surface pairs - as in
the spies example.  

3. Writing rules: SUBSETS, rule conflicts; conflicting character descriptions in subsets.


3.1 Using SUBSETS in Automata tables

Assume that your description contains these subsets.

SUBSET D   t d s
SUBSET P   c j ^
SUBSET Vhf i e

Here is a rule using these subsets.  It states that 'alveolar'
consonants in subset D may be realized as 'palatalized' consonants
(i.e. those made with the tongue at the roof of the mouth), when they
occur preceding the high, front vowels in the subset Vhf.

R5   Palatalization
    
     D:P ==> ____Vhf

Specifically, we want the subset correspondence D:P to stand for the
feasible pairs t:c, d:j, and s:^.  Translating this into a state table
is straightforward:

T5  Palatalization

       D   Vhf  @
       P   Vhf  @
      ------------
   1:  2   1    1
   2.  0   1    0

However, this automaton will produce NO correct results unless the
feasible pairs t:c, d:j, and s:^ are declared explicitly.  The pairs
must appear as column headers in a table somewhere in the
description. This is typically done by constructing a table
specifically for the purpose of declaring special correspondences. For
example, the following table declares the feasible pairs we want for
the column header D:P.

          Palatalization correspondences
            t   d  s  @
            c   j  ^  @
           ---------------
      1:    1   1  1  1

Similarly, the feasible pairs that Vhf:Vhf stands for (i:i and e:e)
must be declared.   Since in this case the feasible pairs
are also just default correspondences, they would typically be included
in the table with all the other default correspondences.


3.2  Overlapping Column Headers and Specificity

Using subsets in rules often leads to a situation where a state table
has column headers that potentially overlap.  THIS IS A COMMON SOURCE
OF BUGS in KIMMO rule tables.

In such a case unexpected results may occur.  For example, consider
this rule, which states that t:c occurs between any vowel and i:

    t:c ==> V__i

A first attempt at writing a state table for this rule might look
like this:

         V  t  i  @
         V  c  i  @
       -------------
      1: 2  0  1  1
      2: 2  3  1  1
      3. 0  0  1  0

Given the lexical form "mati" this table will correctly produce the
surface form "maci".  BUT given the form "miti" it will fail to
produce the expected result "mici".  This is because of the
interaction between the column headers V:V and i:i. 
Because i:i is also an instance of V:V, we might expect that the first
i in the input would match V:V and cause a successful transition to
state 2.  BUT this is not the case.  For each table in a Kimmo
description, the ENTIRE set of feasible pairs must be partitioned
among the column headers WITH NO OVERLAP.  When PC-KIMMO interprets
the column headers of a table, it scans the list of all the feasible
pairs and assigns each one to a column header.  IF a feasible pair
matches more than one column header, it assigns it to the MOST
SPECIFIC one, where the specificity of a header is determined by the
number of feasible pairs that match it: the fewer pairs a header
matches, the more specific it is. In order to see exactly how the
feasible pairs are assigned to the column headers of a rule, use the
SHOW RULE command.
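
The assignment of feasible pairs to column headers can be imitated in a few lines of Python. This is a sketch of the behavior described above, not PC-KIMMO's actual code; the names SUBSETS, matches and assign_columns, the vowel set, and the short list of feasible pairs are all assumptions made for the illustration.

```python
SUBSETS = {"V": set("aeiou")}   # assumed membership of the vowel subset

def matches(header, pair):
    """True if a column header such as ('V', 'V') covers a feasible pair."""
    def side(h, ch):
        return h == "@" or h == ch or ch in SUBSETS.get(h, ())
    return side(header[0], pair[0]) and side(header[1], pair[1])

def assign_columns(headers, feasible):
    """Give each feasible pair to the header that matches the fewest pairs."""
    size = {h: sum(matches(h, p) for p in feasible) for h in headers}
    result = {}
    for p in feasible:
        candidates = [h for h in headers if matches(h, p)]
        result[p] = min(candidates, key=size.get)
    return result

HEADERS = [("V", "V"), ("t", "c"), ("i", "i"), ("@", "@")]
FEASIBLE = [(c, c) for c in "aeioumt"] + [("t", "c")]

for pair, header in assign_columns(HEADERS, FEASIBLE).items():
    print(pair, "->", header)
```

In this run the pair i:i lands in the i:i column and not in V:V, which is the behavior that trips up the first attempt at the table.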

Thus in the table above, the feasible pair i:i matches both the column
headers i:i and V:V, but because i:i is more specific than V:V, the
pair i:i is assigned to the column header i:i. This means that the
column header V:V stands for all the feasible pairs of vowels EXCEPT
i:i. And i:i matches ONLY i:i. To work correctly, the automaton must
allow i:i to be an instance of V:V in the left context, by placing a 2
in states 1 and 2 under the i:i header.  Note also that the order
of the columns has NO effect on which column header an input pair is
matched to (the automata are all applied in parallel).  The revised
table is:


        V  t  i  @
        V  c  i  @
       -------------
      1: 2  0  2  1
      2: 2  3  2  1
      3. 0  0  1  0
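
To compare the two tables, here is a small Python recognizer that picks the column for each pair the way the text describes (exact headers first, then V:V, then @:@). The names VOWELS, column, FIRST, REVISED and accepts are made up for this sketch, and the membership of V is an assumption.

```python
VOWELS = set("aeiou")   # assumed membership of subset V

def column(lex, surf):
    """Pick the most specific header among t:c, i:i, V:V and @:@."""
    if (lex, surf) in {("t", "c"), ("i", "i")}:
        return f"{lex}:{surf}"
    if lex == surf and lex in VOWELS:
        return "V:V"
    return "@:@"

FIRST   = {1: {"V:V": 2, "t:c": 0, "i:i": 1, "@:@": 1},
           2: {"V:V": 2, "t:c": 3, "i:i": 1, "@:@": 1},
           3: {"V:V": 0, "t:c": 0, "i:i": 1, "@:@": 0}}
REVISED = {1: {"V:V": 2, "t:c": 0, "i:i": 2, "@:@": 1},
           2: {"V:V": 2, "t:c": 3, "i:i": 2, "@:@": 1},
           3: {"V:V": 0, "t:c": 0, "i:i": 1, "@:@": 0}}

def accepts(table, lexical, surface, final=(1, 2)):
    state = 1
    for lex, surf in zip(lexical, surface):
        state = table[state][column(lex, surf)]
        if state == 0:
            return False
    return state in final

print(accepts(FIRST, "mati", "maci"))     # prints True
print(accepts(FIRST, "miti", "mici"))     # prints False
print(accepts(REVISED, "miti", "mici"))   # prints True
```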
 

Now consider a description that contains a subset Vrd for rounded
vowels and a subset Vhi for high vowels:

SUBSET Vrd   o u
SUBSET Vhi   i e o u

Note that Vhi properly includes Vrd.  Assume that we have a rule:


    t:c ==> Vrd___Vhi

Our first attempt at a state table for this rule might look like this: 


        Vrd  t  Vhi  @
        Vrd  c  Vhi  @
       ---------------
      1: 2   0  1   1
      2: 2   3  1   1
      3. 0   0  1   0
 
But the feasible pairs o:o and u:u, which match both the Vrd:Vrd and
Vhi:Vhi column headers, must belong to the Vrd:Vrd column since it is
more specific.  Thus, the Vhi column represents only the pairs i:i and
e:e. This means that a lexical form such as "utu" will NOT produce the
expected surface form "ucu" because the second u will always match
Vrd, not Vhi.  This problem is fixed by including u:u and o:o as column
headers:

        Vrd  t  Vhi u  o   @
        Vrd  c  Vhi u  o   @
       ---------------------
      1: 2   0  1   2  2   1
      2: 2   3  1   2  2   1
      3. 0   0  1   1  1   0

The solution, then, in cases of overlapping column headers is to
explicitly include as headers in the table the feasible pairs that
belong to both headers.

3.3 Expressing Word Boundary environments.

Consider a rule that states that stop consonants (like b, d, p that
'stop' the air flow) are devoiced (vocal cords stop vibrating) when
they occur in word final position:

    example:  lexical form  mabab
              surface form  mabap

Here, the voiced b changes to an unvoiced p at the end of a word.

Assume the subsets for voiced stops (B) and voiceless stops (P):

SUBSET B  b d g
SUBSET P  p t k

We might write the rule as follows. Remember that <==> is used when
a correspondence occurs in a given environment and in NO OTHER
environment -- ie, only at the end of words.  It means the
correspondence is allowed if and only if (iff) it is found in the
specified context.

      Devoicing

    B:P  <==> ____#

The corresponding state table is written with #:# as the column header
representing the word boundary.  Note that a boundary symbol used in a
column header can ONLY correspond to another boundary symbol; that is,
correspondences such as #:0 are ILLEGAL.


        B  B  #  @
        P  @  #  @
       ---------------
      1: 3 2  1  1
      2: 3 2  0  1
      3. 0 0  1  0
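
Here is a short Python check of the devoicing table. It is a sketch under stated assumptions: the first column is taken to be B:P (the correspondence the rule is about), B:P is assumed to stand for the position-by-position pairs b:p, d:t and g:k, and the boundary pair #:# is appended to the word by the checker itself. The names column, TABLE and accepts are made up.

```python
B, P = "bdg", "ptk"   # SUBSET B and SUBSET P, paired position by position

def column(lex, surf):
    """Pick the column header for one lexical:surface pair."""
    if lex in B and surf in P and B.index(lex) == P.index(surf):
        return "B:P"
    if lex in B:
        return "B:@"        # B realized as anything other than its P partner
    if lex == surf == "#":
        return "#:#"
    return "@:@"

TABLE = {1: {"B:P": 3, "B:@": 2, "#:#": 1, "@:@": 1},
         2: {"B:P": 3, "B:@": 2, "#:#": 0, "@:@": 1},
         3: {"B:P": 0, "B:@": 0, "#:#": 1, "@:@": 0}}

def accepts(lexical, surface):
    state = 1
    for lex, surf in zip(lexical + "#", surface + "#"):   # append the boundary
        state = TABLE[state][column(lex, surf)]
        if state == 0:
            return False
    return state in (1, 2)

print(accepts("mabab", "mabap"))   # prints True
print(accepts("mabab", "mabab"))   # prints False
print(accepts("mabab", "mapap"))   # prints False
```

State 2 carries the <== half (an undevoiced stop must not be followed by #), and state 3 carries the ==> half (a devoiced stop must be followed by #).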

Rules that refer to an initial word boundary are written in a similar
way.


3.4 Rule Conflicts

This is all fine when one is writing just one rule, but of course you
will need more than one.  Then the rules might conflict - let us see
how.

The two main types of rule conflicts are the ==> (or environment)
conflict, and the <== (or realization) conflict.  

3.4.1 The ==> conflict arises when two conditions are met:

(1) Two ==> rules have the same correspondence on the left side of a
    rule, but

(2) They have DIFFERENT environments on the right side.  For example:


 p:b ==>  V____V   (here, V is  vowel - this is 'intervocalic voicing')

 p:b ==>  m___     (voicing after a nasal sound)

Since the rule operator ==> means that the correspondence can occur
only in the specified environment, these two rules contradict each
other.  The simplest resolution of the conflict is to combine the two
rules into one, with a disjunctive environment:

    Voicing

   p:b ==>  [V__V | m__]     where | is the usual BNF 'or'

The state table for this rule looks like this:

        V  m  p  @
        V  m  b  @
        ----------
     1: 2  4  0  1
     2: 2  4  3  1
     3. 2  0  0  0
     4: 2  4  1  1

where states 1, 2, and 3 correspond to the V___V part of the
first rule, and states 1 and 4 correspond to the m___  part.
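
The disjunctive table can be tried out the same way. In this Python sketch the third column is taken to be p:b (the correspondence being restricted), the vowel set is an assumption, and the names column, VOICING, FINAL and accepts are made up for the illustration.

```python
VOWELS = set("aeiou")   # assumed membership of subset V

def column(lex, surf):
    """Pick the column header for one lexical:surface pair."""
    if (lex, surf) == ("p", "b"):
        return "p:b"
    if lex == surf == "m":
        return "m:m"
    if lex == surf and lex in VOWELS:
        return "V:V"
    return "@:@"

VOICING = {1: {"V:V": 2, "m:m": 4, "p:b": 0, "@:@": 1},
           2: {"V:V": 2, "m:m": 4, "p:b": 3, "@:@": 1},
           3: {"V:V": 2, "m:m": 0, "p:b": 0, "@:@": 0},
           4: {"V:V": 2, "m:m": 4, "p:b": 1, "@:@": 1}}
FINAL = {1, 2, 4}   # state 3 is nonfinal: a vowel must still follow

def accepts(lexical, surface):
    state = 1
    for lex, surf in zip(lexical, surface):
        state = VOICING[state][column(lex, surf)]
        if state == 0:
            return False
    return state in FINAL

print(accepts("apa", "aba"))     # prints True  (V___V)
print(accepts("ampa", "amba"))   # prints True  (m___)
print(accepts("pa", "ba"))       # prints False (no licensing context)
```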

3.4.2  A <==> conflict.

Suppose the rules above had been written as <==> (i.e., if and only
if) instead of ==>

 p:b <==>  V____V   (here, V is  vowel - this is 'intervocalic voicing')

 p:b <==>  m___     (voicing after a nasal sound)

Their two state tables will again have problems - they won't
work because each <==> rule contains a ==> part, and the two
==> parts conflict exactly as before.  The solution is to write a
disjunctive automaton table to represent

   p:b ==>  [V__V | m__]     where | is the usual BNF 'or'

exactly as before, and then write two automaton tables to represent

 p:b <==  V____V   

 p:b <==   m___     

We leave these last two as exercises. They are easy to do.