This module provides predicates to read and write finite automata in a variety of formats. The defaul format for reading is determined by the global variable *read*. The default format for writing is determined by the global variable *write*.
The following formats are available both for reading and writing:
fast. Binary format of the normal format. Uses library(fastrw). Much faster reading and writing of automata. Drawback: binary files.
normal. Internal representation (single Prolog term).
old. Prolog program defining clauses start/1, final/2, trans/3, jump/2. A variant of this was used by FSA2 and FSA3, but it is still useful, for instance, if you want to input automata directly, rather than by means of regular expressions.
fsm. Format of automata as used in AT&T's fsm library.
compact. text format, fairly compact. Slow (especially for output).
For writing, the following additional formats are available:
ps (PostScript)
vcg (input to the Xvcg graph visualization tool)
davinci (input to the DaVinci graph visualization tool)
tk (starts a interactive tcl/tk widget)
dot (input for the GraphViz visualization tools dot and dotty)
pstricks (LaTeX code to be included in a document; requires pstricks macro's)
latex (LaTeX document; requires pstricks macro's).
prolog (Prolog program; interface to fsa_compiler module).
c, t_c, w_c (C program; interface to fsa_compiler_to_c module), resp. for recognizers, string-string-transducers and string-weight transducers.
java, t_java, w_java (JAVA program; interface to fsa_java module), resp. for recognizers, string-string-transducers and string-weight transducers.
fsm. Format of automata as used in AT&T's fsm library. Experimental.
The normal format is the internal format used in FSA6. The module fsa_data provides a consistent interface to this format.
Here is a table indicating the relative speed of the standard input and output formats:
format compact fast normal
writing 20 1 4
reading 5 1 4
Here is a table indicating the relative size of the standard input and output formats (measured in bytes):
compact fast normal
1 1.8 1.6
This section describes the various I/O formats.
In the normal format, a finite automaton is a single Prolog term
fa(Symbols,States,Starts,Finals,Transs,Jumps).
Symbols is a term r(Sig) (for recognizers) or t(SigD,SigR) for transducers. Here *Sig*, *SigD*, SigR are the predicate modules.
States is an integer indicating the number of states in the automaton.
Starts is an ordered list of integers indicating the start states of the automaton.
Finals is an ordered list of integers indicating the final states of the automaton.
Transs is an ordered list of triples trans(A,B,C) where A and C are integers indicating source and target state, and B is a symbol (recognizers) or a symbol pair InSym/OutSym (transducers). InSym is a symbol or the empty list. OutSym is a symbol or a (possibly empty) list of symbols.
Jumps is an ordered list of pairs jump(A,B) where A and B are integers indicating source and target state. This implies there is an epsilon transition from A to *B*.
In the old format a finite automaton is given as a Prolog program. The automaton is defined by clauses for the predicates:
start(State)
final(State)
trans(State0,Sym,State)
jump(State0,State)
Note that in this format states can be are arbitrary ground Prolog terms (these will be converted to integers). In the case of transducers, Sym is a pair Left/Right. The empty list [] is used to indicate the empty string. In the case of sequential transducers, Right must be a list of symbols.
The compact format fairly closely follows the normal format. See the documentation on the normal format in the fsa_data module for more information. In this format a file is an ordinary text file. The format is intended to be used for machines only, and is not very helpful for human consumption.
The first line of the file is the string "fsa6". This is used to differentiate the file from the pre-fsa6 compact formats (which can still be read-in).
The second line is the letter r for recognizers or t for transducers.
For recognizers, the third line is the name of the predicate module.
For transducers, there are two such lines. The first line defines the domain predicate module, the second line the range predicate module.
The next line is an integer indicating the number of states
The next line is an ordered sequence of integers, separated by tabs, indicating the start states
The next line is an ordered sequence of integers, separated by tabs, indicating the final states
The next lines each represent a transition, until an empty line is encountered. The transitions are ordered. Each transition is a triple State Symbol State. Seperator is the tab again. States are integers. Symbols are readable as Prolog terms (recognizers) or pairs of In/Out, where In is a term and Out is either a single term or a list of terms. If the source state is identical to the source state of the previous line, it can be left out. If the symbol is identical as well, then it can be left out as well.
The next lines are jumps. Jumps are ordered. Each line consists of two states separated by a tab. If the source state is identical to the source state of the previous line, it can be left out. If the symbol is identical as well, then it can be left out as well.
Example:
fsa write=compact -r '[class(a..f),{g,h}]'
fsa6
r
fsa_preds
3
0
1
0 in([a,b,c,d,e,f]) 2
2 g 1
h 1
The fast format uses the same Prolog term representation as the normal format, except that library(fastrw) is used to read and write the Prolog term. This implies that reading and writing of automata in this format is very fast; the disadvantage is that fast is a binary format and therefore cannot be (easily) treated by other programs.
This section lists the predicates defined by this module.
The automaton in File0 is copied to File1. Useful to convert between different formats, by setting the read and write global variables.
Fa is read from File. If Format is unspecified the value of the global variable read is taken as the input format.
Fa is written to File. If Format is unspecified the value of the global variable write is taken as the input format.