- Build a complete set of KIMMO rules to handle all (or almost all) of
Spanish verbal morphology. See the Spanish verb conjugator (compjugador)
by Daniel M. German at http://compjugador.sourceforge.net/.
The data files in the archive text-compjugador-0.1.tar.gz can conjugate
all the verbs in ‘official’ Spanish (as in the Diccionario de la Real
Academia), close to 10,000 verbs. See also the Ispell site at: http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html#Spanish-dicts
- Portuguese. For some references and data on this language, see
http://www.linguateca.pt/, a collection of links to various resources
on the Portuguese language. The pages of this site contain sections
such as "Ajuda a redaccao" (which includes references to the ISPELL
dictionaries for the Portuguese spoken in Portugal and for Brazilian
Portuguese), "Componentes basicos de um sistema de Processamento de
Linguagem Natural: analisadores ou geradores da lingua", and
"Conjugadores verbais", as well as links to numerous
"Dicionarios gerais" with more complete accounts of the Portuguese
inflectional system.
- Italian. For inflectional forms, see http://members.xoom.virgilio.it/trasforma/ispell/
I can provide further links on Italian morphology.
- Japanese: ftp://crl.nmsu.edu/CLR/lexica/jmorphdict/
- Greek: http://www.csd.auth.gr/~setn02/poster_papers/053.pdf
This paper explores the limits of Kimmo, and you might want to do the
same.
- Many other language examples are possible: Pig Latin; Portuguese;
Esperanto at http://www.cis.upenn.edu/~cis639/home.html
(under ‘assignments’).
- Turkish: For a set of examples for Turkish, along with data, we can
furnish a previous year's laboratory with complete instructions.
- [Harder, but more fun] Rules to handle a non-concatenative language,
like Arabic, Hebrew, etc. (Reference for Arabic: J. McCarthy's PhD
thesis at MIT, in the MIT Humanities Library). Note: in the past,
people have implemented their own system (not Kimmo) to do this, in
Scheme. See the Arabic demo at http://www.cis.upenn.edu/~cis639/home.html
for another approach. I can provide you with many additional
references on Semitic languages.
- Implement finite-state rule compilation (i.e., combining the FSTs
from the individual rules into one large FST, via the methods of
composition and/or intersection).
- Implement a method to use rewrite ‘arrow rules’ as input to Kimmo
(partly done in the current implementation, but incomplete), so that
one can write rules without any reference to finite-state tables. E.g.,
a:0 -> V a:s X, where V, X are left and right contexts.
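
As a starting point for the rule-compilation project, here is a minimal sketch of intersecting two rule automata by the standard product construction. It assumes each rule has already been written as a deterministic automaton over lexical:surface pairs, where a missing transition means the rule rejects. The dict layout, the names `intersect` and `accepts`, and the two toy rules are illustrative assumptions, not part of the course's Kimmo implementation.

```python
# Sketch only: product-construction intersection of two deterministic
# automata over the same alphabet of (lexical, surface) pairs.
# The data layout below is an assumption made for this example.

def intersect(a, b):
    """Return an automaton that accepts exactly what both a and b accept."""
    start = (a["start"], b["start"])
    # Only pairs known to both automata can ever be accepted by both.
    alphabet = {p for (_, p) in a["delta"]} & {p for (_, p) in b["delta"]}
    delta, finals = {}, set()
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a["finals"] and qb in b["finals"]:
            finals.add(state)
        for pair in alphabet:
            na = a["delta"].get((qa, pair))
            nb = b["delta"].get((qb, pair))
            if na is None or nb is None:
                continue  # one of the rules blocks this pair here
            nxt = (na, nb)
            delta[(state, pair)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return {"start": start, "finals": finals, "delta": delta}


def accepts(m, pairs):
    """Run automaton m over a sequence of (lexical, surface) pairs."""
    q = m["start"]
    for p in pairs:
        q = m["delta"].get((q, p))
        if q is None:
            return False
    return q in m["finals"]


# Toy alphabet: x:x, y:y, and y:0 (lexical y deleted on the surface).
XX, YY, Y0 = ("x", "x"), ("y", "y"), ("y", "0")

# Toy rule A: y:0 is allowed only immediately after x:x.
rule_a = {
    "start": 0, "finals": {0, 1},
    "delta": {(0, XX): 1, (0, YY): 0,
              (1, XX): 1, (1, YY): 0, (1, Y0): 0},
}

# Toy rule B: a word may not end in y:0.
rule_b = {
    "start": 0, "finals": {0},
    "delta": {(0, XX): 0, (0, YY): 0, (0, Y0): 1,
              (1, XX): 0, (1, YY): 0, (1, Y0): 1},
}

both = intersect(rule_a, rule_b)
print(accepts(both, [XX, Y0, XX]))  # allowed by both rules
print(accepts(both, [XX, Y0]))      # blocked by rule B
```

Only state pairs reachable from the joint start state are built, so the combined machine is usually much smaller than the full cross product of the two state sets. Composition of transducers, the other method mentioned above, is a different construction and is not shown here.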