| Locale::Po4a::TransTractor(3pm) | Po4a Tools | Locale::Po4a::TransTractor(3pm) |
Locale::Po4a::TransTractor - generic trans(lator ex)tractor.
The goal of the po4a (PO for anything) project is to ease translation (and, more interestingly, the maintenance of translations) by using gettext tools in areas where they were not expected, such as documentation.
This class is the ancestor of every po4a parser. A parser is used to parse a document, search for translatable strings, extract them to a PO file, and replace them with their translations in the output document.
More formally, it takes the following arguments as input:
As output, it produces:
Here is a graphical representation of this:
Input document --\                             /---> Output document
                  \                           /       (translated)
                   +-> parse() function -----+
                  /                           \
Input PO --------/                             \---> Output PO
                                                      (extracted)
This function is called by the process() function below, but if you choose to use the new() function, and to add content manually to your document, you will have to call this function yourself.
The following example parses a list of paragraphs beginning with "<p>". For the sake of simplicity, we assume that the document is well formatted, i.e. that '<p>' tags are the only tags present, and that this tag is at the very beginning of each paragraph.
sub parse {
    my $self = shift;

    PARAGRAPH: while (1) {
        my ($paragraph, $pararef) = ("", "");
        my $first = 1;
        my ($line, $lref) = $self->shiftline();
        while (defined($line)) {
            if ($line =~ m/<p>/ && !$first--) {
                # Not the first time we see <p>.
                # Put the current line back in input,
                # and put the built paragraph to output
                $self->unshiftline($line, $lref);

                # Now that the paragraph is formed, translate it:
                #   - Remove the leading tag
                $paragraph =~ s/^<p>//s;

                #   - push to output the leading tag (untranslated) and the
                #     rest of the paragraph (translated)
                $self->pushline( "<p>"
                               . $self->translate($paragraph, $pararef)
                               );

                next PARAGRAPH;
            } else {
                # Append to the paragraph
                $paragraph .= $line;
                $pararef = $lref unless (length($pararef));
            }

            # Reinit the loop
            ($line, $lref) = $self->shiftline();
        }
        # Did not get a defined line? End of input file.
        # Flush the paragraph still being built, if any, so that the
        # last paragraph of the document is not lost.
        if (length($paragraph)) {
            $paragraph =~ s/^<p>//s;
            $self->pushline("<p>" . $self->translate($paragraph, $pararef));
        }
        return;
    }
}
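To see the paragraph-collecting loop run without po4a installed, here is a minimal sketch built on a stand-in class. Everything named MockTractor, the file name demo.html, and the translations hash are assumptions made for illustration; the stand-in only mimics the shiftline(), unshiftline(), pushline() and translate() calls used by the example and is not how TransTractor implements them. Its parse() uses the same loop and flushes the final paragraph at end of input.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a TransTractor subclass (NOT the po4a code).
package MockTractor;

sub new {
    my ($class, %args) = @_;
    my $self = bless {
        doc_in  => [],                          # alternating (line, reference)
        doc_out => [],                          # translated output lines
        po      => $args{translations} || {},   # msgid => msgstr
    }, $class;
    my $n = 0;
    for my $line (@{ $args{lines} || [] }) {
        $n++;
        push @{ $self->{doc_in} }, $line, "demo.html:$n";
    }
    return $self;
}

# Take the next (line, reference) pair from the input.
sub shiftline {
    my $self = shift;
    my $line = shift @{ $self->{doc_in} };
    my $ref  = shift @{ $self->{doc_in} };
    return ($line, $ref);
}

# Put a (line, reference) pair back at the front of the input.
sub unshiftline {
    my ($self, $line, $ref) = @_;
    unshift @{ $self->{doc_in} }, $line, $ref;
}

# Append a line to the output.
sub pushline {
    my ($self, $line) = @_;
    push @{ $self->{doc_out} }, $line;
}

# Look the string up in the fake catalog; fall back to the original.
sub translate {
    my ($self, $string, $ref) = @_;
    return exists $self->{po}{$string} ? $self->{po}{$string} : $string;
}

sub parse {
    my $self = shift;
    PARAGRAPH: while (1) {
        my ($paragraph, $pararef) = ("", "");
        my $first = 1;
        my ($line, $lref) = $self->shiftline();
        while (defined($line)) {
            if ($line =~ m/<p>/ && !$first--) {
                # A new paragraph starts: emit the one built so far.
                $self->unshiftline($line, $lref);
                $paragraph =~ s/^<p>//s;
                $self->pushline("<p>" . $self->translate($paragraph, $pararef));
                next PARAGRAPH;
            }
            $paragraph .= $line;
            $pararef = $lref unless length($pararef);
            ($line, $lref) = $self->shiftline();
        }
        # End of input: flush the paragraph still being built, if any.
        if (length($paragraph)) {
            $paragraph =~ s/^<p>//s;
            $self->pushline("<p>" . $self->translate($paragraph, $pararef));
        }
        return;
    }
}

package main;

my $doc = MockTractor->new(
    lines        => [ "<p>Hello\n", "world\n", "<p>Bye\n" ],
    translations => {
        "Hello\nworld\n" => "Bonjour\nle monde\n",
        "Bye\n"          => "Au revoir\n",
    },
);
$doc->parse();
print @{ $doc->{doc_out} };
```

Note that the whole paragraph, not each line, is the unit passed to translate(), which is why the keys of the fake catalog span several lines.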
Once you've implemented the parse function, you can use your document class, using the public interface presented in the next section.
ARGUMENTS, besides the ones accepted by new() (with expected type):
The array "@{$self->{TT}{doc_in}}" holds the input document data as an array of strings with alternating meanings:
* The string $textline holds each line of the input text data.
* The string "$filename:$linenum" holds its location and is called its "reference" (line numbers start at 1).
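As a rough illustration of that alternating layout, the following sketch walks a plain array two entries at a time. The array @doc_in and the file name sample.txt are made-up sample data standing in for "@{$self->{TT}{doc_in}}"; no po4a code is involved.

```perl
use strict;
use warnings;

# Hypothetical sample data laid out like @{$self->{TT}{doc_in}}:
# even indexes hold text lines, odd indexes hold "$filename:$linenum".
my @doc_in = (
    "First line\n",  "sample.txt:1",
    "Second line\n", "sample.txt:2",
);

my @pairs;
for (my $i = 0; $i < @doc_in; $i += 2) {
    my ($textline, $reference) = @doc_in[$i, $i + 1];
    push @pairs, [$textline, $reference];

    # Show the pair without the trailing newline of the text line.
    chomp(my $shown = $textline);
    print "$reference => $shown\n";
}
```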
Please note that it does not parse anything. You should use the parse() function when you're done with packing input files into the document.
The translated document data is provided by:
* "$self->docheader()", holding the header text for the plugin, and
* "@{$self->{TT}{doc_out}}", holding each line of the main translated text in an array.
[normal use of the po4a document...]
($percent,$hit,$queries) = $document->stats();
print "We found translations for $percent\% ($hit from $queries) of strings.\n";
This function returns a non-null integer on error.
Four functions are provided to get input and return output. They are very similar to shift/unshift and push/pop of Perl.
* Perl shift returns the first array item and drops it from the array.
* Perl unshift prepends an item to the array as the first array item.
* Perl pop returns the last array item and drops it from the array.
* Perl push appends an item to the array as the last array item.
The first pair is about input, while the second is about output. Mnemonic: in input, you are interested in the first line, which is what shift gives; in output, you want to add your result at the end, as push does.
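The analogy can be tried on plain Perl arrays. This is only a sketch of the built-in operators the four functions are modelled on: the arrays @input and @output are illustrative stand-ins, not the internal buffers of TransTractor.

```perl
use strict;
use warnings;

my @input  = ("line 1\n", "line 2\n");
my @output;

my $line = shift @input;     # take the first input line
unshift @input, $line;       # changed our mind: put it back at the front
$line = shift @input;        # take it again

push @output, uc $line;      # append a processed line to the output
push @output, "oops\n";      # append a line we do not want after all
my $last = pop @output;      # remove the most recently appended line

print "input lines left: ", scalar(@input), "\n";
print "output: @output";
```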
One function is provided to handle the text which should be translated.
This function can also take some extra arguments. They must be organized as a hash. For example:
$self->translate("string","ref","type",
'wrap' => 1);
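For readers less familiar with this Perl calling convention, here is a small sketch of how a function can accept three positional arguments followed by key/value options. The function demo_translate and its return format are hypothetical and only illustrate the argument layout; they do not reproduce what translate() does.

```perl
use strict;
use warnings;

# Hypothetical function, not the po4a implementation: three positional
# arguments, then optional key/value pairs collected into a hash.
sub demo_translate {
    my ($string, $ref, $type, %options) = @_;
    my $wrap = $options{wrap} ? 1 : 0;    # absent options default to off
    return "[$type\@$ref wrap=$wrap] $string";
}

my $with_option    = demo_translate("string", "ref", "type", 'wrap' => 1);
my $without_option = demo_translate("string", "ref", "type");
print "$with_option\n$without_option\n";
```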
Actions:
It will use the output charset specified in the command line. If it wasn't specified, it will use the input PO's charset, and if the input PO has the default "CHARSET", it will return the input document's charset, so that no encoding is performed.
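The fallback order described above can be sketched as a standalone function. The name pick_output_charset and its cmdline, po and doc arguments are assumptions chosen for this illustration; the real method reads these values from the object rather than from arguments.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the described fallback order:
# command line first, then the input PO, then the input document.
sub pick_output_charset {
    my (%arg) = @_;
    return $arg{cmdline}
        if defined $arg{cmdline} && length $arg{cmdline};
    return $arg{po}
        if defined $arg{po} && $arg{po} ne 'CHARSET';
    return $arg{doc};
}

my $from_cmdline = pick_output_charset(
    cmdline => 'UTF-8', po => 'ISO-8859-1', doc => 'ISO-8859-15');
my $from_po = pick_output_charset(
    po => 'ISO-8859-1', doc => 'ISO-8859-15');
my $from_doc = pick_output_charset(
    po => 'CHARSET', doc => 'ISO-8859-15');

print "$from_cmdline $from_po $from_doc\n";
```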
One shortcoming of the current TransTractor is that it can't handle translated documents that contain all languages at once, such as debconf templates or .desktop files.
To address this problem, the only interface changes needed are:
$self->pushline_all({ "Description[".$langcode."]=".
$self->translate($line,$ref,$langcode)
});
Will see if it's enough ;)
Denis Barbier <barbier@linuxfr.org>
Martin Quinson (mquinson#debian.org)
Jordi Vilalta <jvprat@gmail.com>
| 2023-01-03 | Po4a Tools |