ISA is a PHP-based web interface for interactive sentence alignment of
parallel XML documents. Its backend is the length-based Gale & Church
approach to sentence alignment, but it can also be used for purely manual
alignment. The basic idea is to use the interface for
- adding hard boundaries to improve quality and performance of the automatic
alignment
- correcting existing alignments by adding or removing segment boundaries
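The effect of hard boundaries can be illustrated with a minimal sketch. It is not ISA's code: the cost function is a simplified relative length difference with ad-hoc penalties per alignment type, not the probabilistic cost of Gale & Church, and the names `align_block` and `align_with_hard_boundaries` are hypothetical. A hard boundary `(i, j)` is assumed to mean that source sentence `i` and target sentence `j` each start a new block, so the aligner runs on each block separately and no alignment unit can cross the boundary.

```python
from typing import List, Tuple

Bead = Tuple[List[int], List[int]]  # (source sentence indices, target sentence indices)

# Allowed alignment types (source count, target count) with an ad-hoc penalty.
# These penalties are illustrative and are NOT the Gale & Church priors.
BEAD_TYPES = {(1, 1): 0.0, (1, 0): 4.0, (0, 1): 4.0, (2, 1): 1.0, (1, 2): 1.0}


def align_block(src_len: List[int], tgt_len: List[int],
                src_off: int = 0, tgt_off: int = 0) -> List[Bead]:
    """Align one block by dynamic programming over sentence lengths in characters."""
    n, m = len(src_len), len(tgt_len)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEAD_TYPES.items():
                if i + di > n or j + dj > m:
                    continue
                ls = sum(src_len[i:i + di])
                lt = sum(tgt_len[j:j + dj])
                # Simplified cost: relative length difference plus type penalty.
                c = cost[i][j] + abs(ls - lt) / max(ls + lt, 1) + penalty
                if c < cost[i + di][j + dj]:
                    cost[i + di][j + dj] = c
                    back[i + di][j + dj] = (di, dj)
    # Trace back from the end of the block to recover the cheapest path.
    beads: List[Bead] = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        beads.append((list(range(src_off + i - di, src_off + i)),
                      list(range(tgt_off + j - dj, tgt_off + j))))
        i, j = i - di, j - dj
    return beads[::-1]


def align_with_hard_boundaries(src_len: List[int], tgt_len: List[int],
                               boundaries: List[Tuple[int, int]]) -> List[Bead]:
    """Split both documents at the hard boundaries and align each block separately."""
    cuts = [(0, 0)] + sorted(boundaries) + [(len(src_len), len(tgt_len))]
    beads: List[Bead] = []
    for (i0, j0), (i1, j1) in zip(cuts, cuts[1:]):
        beads += align_block(src_len[i0:i1], tgt_len[j0:j1], i0, j0)
    return beads


src = [40, 40, 80]  # sentence lengths of the source document
tgt = [80, 40, 40]  # sentence lengths of the target document
print(align_with_hard_boundaries(src, tgt, []))
print(align_with_hard_boundaries(src, tgt, [(2, 1)]))
```

Without boundaries the sketch pairs the sentences one to one. With the hard boundary `(2, 1)` it aligns the first two source sentences with the first target sentence and the third source sentence with the last two target sentences. Each block is also smaller than the whole document, so the dynamic programming table shrinks, which is where the performance gain would come from.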
The interface lets you work on small portions of the document or on the
entire document. Unless these features are disabled, alignment results can be
saved or sent via e-mail in various formats (XCES align with pointers to
external sentence IDs, plain text, or simple TMX).
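To give an idea of what "XCES align with pointers to external sentence IDs" can look like, here is a minimal sketch that serializes an alignment as stand-off links. It assumes the layout commonly used for XCES alignment files (a `cesAlign` root, a `linkGrp` naming both documents, and `link` elements whose `xtargets` attribute lists source IDs and target IDs separated by a semicolon). The exact attributes written by ISA may differ, and the function name `to_xces`, the sentence IDs and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_xces(beads, src_ids, tgt_ids, from_doc, to_doc):
    """Serialize alignment units as XCES-style links pointing to sentence IDs.

    beads: list of (source sentence indices, target sentence indices)
    src_ids / tgt_ids: sentence IDs as they appear in the two XML documents
    """
    root = ET.Element("cesAlign", version="1.0")
    group = ET.SubElement(root, "linkGrp", targType="s",
                          fromDoc=from_doc, toDoc=to_doc)
    for number, (src, tgt) in enumerate(beads, start=1):
        # Source IDs and target IDs are space-separated, the two sides by ';'.
        xtargets = (" ".join(src_ids[i] for i in src) + ";" +
                    " ".join(tgt_ids[j] for j in tgt))
        ET.SubElement(group, "link", id=f"SL{number}", xtargets=xtargets)
    return ET.tostring(root, encoding="unicode")


# Hypothetical example: two source sentences aligned to one target sentence,
# followed by a one-to-one alignment.
print(to_xces([([0, 1], [0]), ([2], [1])],
              ["s1", "s2", "s3"], ["t1", "t2"],
              "en.xml", "sv.xml"))
```

Because the links only point to sentence IDs, the sentence text stays in the two original XML documents and the alignment file remains small.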

ICA is a PHP-based web interface for interactive word alignment. It uses the
Clue Aligner as its backend but can be used for manual alignment as well. You
can
- select clues and clue weights
- inspect alignment strategies and matching clues
- correct the alignment by adding and removing links
- display the contents of clue score databases
ICA works on one sentence pair at a time, taken from a pre-defined parallel
corpus whose location is currently hard-coded in the script. Because PHP is a
server-side scripting language, the corpus has to be located on the server
running the script. An upload function could easily be integrated, but it
would require some form of authentication for protection. The script also
needs access to the appropriate clues, which are stored in local (server-side)
database files, one for each clue type. These files can be produced offline by
the Clue Aligner.
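How selected clues and their weights might influence a link score can be sketched as follows. This is an illustration under stated assumptions, not the Clue Aligner's implementation: it assumes every clue score lies in [0, 1], that a weight simply scales the score, and that the weighted clues are combined as independent evidence (the probability that at least one clue supports the link). The clue names `dice` and `giza` and the function `combine_clues` are hypothetical.

```python
def combine_clues(clue_scores, weights):
    """Combine weighted clue scores for one candidate word link.

    clue_scores: {clue type: score in [0, 1]} looked up for the word pair
    weights:     {clue type: weight}; clue types without a weight are ignored
    """
    p_no_support = 1.0
    for clue_type, score in clue_scores.items():
        weight = weights.get(clue_type, 0.0)
        # Clamp so that a large weight cannot push the value outside [0, 1].
        weighted = min(max(weight * score, 0.0), 1.0)
        p_no_support *= 1.0 - weighted
    # Probability that at least one of the weighted clues supports the link.
    return 1.0 - p_no_support


# Hypothetical scores for one word pair from two clue types.
scores = {"dice": 0.5, "giza": 0.8}
print(combine_clues(scores, {"dice": 1.0, "giza": 0.5}))  # both clues selected
print(combine_clues(scores, {"dice": 1.0}))               # only one clue selected
```

Deselecting a clue or lowering its weight reduces its contribution to the combined score, which is the kind of effect that can be explored interactively by changing clues and clue weights.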