contents index next

10. fsa_compiler: Prolog Code Generation

This module provides predicates to create Prolog code on the basis of a finite automaton. Various tricks are employed to make the resulting code efficient (rather than readable), but functionality has an ever higher priority. The functionality is the same as that provided by the fsa_interpreter module, only faster. For pure speed, you should consider using the fsa_compiler_to_c module.

10.1. List of Predicates

This section lists the predicates defined by this module.

10.1.1. fsa_compile_to_prolog(+Fa) fsa_compile_to_prolog(+FileIn,+FileOut)

In the first variant, a Prolog program is written to standard output for the automaton Fa. The Prolog program defines the predicate accepts(?List) which succeeds if Fa accepts the list of symbols List. In the second variant, the Prolog program is written to FileOut on the basis of the automaton read in from FileIn.

accepts(?String) can be used both to recognize a given string or to produce a string according to Fa. This is why dif/2 is used in the implementation. We prefer functionality over efficiency here. If you didn't know: dif/2 is the SICStus built-in for inequality which delays until arguments are sufficiently instantiated.

Since input can be non-deterministic we check for epsilon-cycles by keeping track of a list of states visited after last consumption of input). There are cases where it would make more sense to pre-compute efree automata first. We provide it anyway.

10.1.2. fsa_compile_to_prolog_t(+Fa) fsa_compile_to_prolog_t(+FileIn,+FileOut)

In the first variant, a Prolog program is written to standard output for the transducer Fa. The Prolog program defines the predicate t_accepts(+In,?List) which succeeds if In x List is a transduction defined by Fa. In the second variant, the Prolog program is written to FileOut on the basis of the automaton read in from FileIn.

t_accepts(+In,?Out) can be used to transduce a given string or to produce pairs of strings, if the length of the input list is known. This is why we (possibly) use dif/2 in the implementation. We prefer functionality over efficiency here. If you didn't know: dif/2 is the SICStus built-in for inequality which delays until arguments are sufficiently instantiated.

Since input can be non-deterministic we check for epsilon-cycles by keeping track of a list of states visited after last consumption of input).

Uses special meta-notation |N+| for `output' loops, to indicate that last N characters can be repeated any number of times.

Supports unknown symbols, including occurrences of delayed identity constraints (using a queue; trick was explained to me by Lauri Karttunen, Xerox Grenoble. I have never seen it described in the literature). E.g. try regex tminimize({[a:b,?,c],[a,?,d]}).

10.1.3. fsa_compile_to_prolog_w(+Fa) fsa_compile_to_prolog_w(+FileIn,+FileOut)

In the first variant, a Prolog program is written to standard output for the string-to-weight-transducer Fa. The Prolog program defines the predicate w_accepts(+In,?List) which succeeds if In x List is a transduction defined by Fa. In the second variant, the Prolog program is written to FileOut on the basis of the automaton read in from FileIn.

w_accepts(+String,?Weight) can be used to transduce a given String to the corresponding Weight. In case of output loops only the minimum weight is produced.

Since input can be non-deterministic we check for epsilon-cycles by keeping track of a list of states visited after last consumption of input.

contents index next