| DETOXRC(5) | File Formats Manual | DETOXRC(5) |
detoxrc - configuration file for detox(1)
detox allows for configuration of its
sequences through config files. This document describes how these files
work.
When setting up a new set of rules, the safe and wipeup filters should always run after a translating filter (or series thereof), such as the utf_8 or uncgi filters. Otherwise, the translating filters may introduce difficult characters into the filename.
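
As a minimal sketch of this ordering, the sequence below runs a translating filter first and the two clean-up filters last. It uses the sequence syntax described later in this document; the sequence name "uncgi-clean" is hypothetical and chosen only for illustration.

```
# translating filter first, then the clean-up filters
sequence "uncgi-clean" {
    uncgi;
    safe;
    wipeup;
};
```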
The format of this configuration file is C-like. It is loosely based
on the configuration files used by named.
Each statement is semicolon terminated, and modifiers on a particular
statement are generally contained within braces.
sequence "name" {sequence; ...};

There is a special sequence, named default, which is the default sequence used by detox. This can be overridden through the command line option -s or the environment variable DETOX_SEQUENCE.
Sequence names are case sensitive and unique throughout all sequences; that is, if a system-wide file defines normal_seq and a user has a sequence with the same name in their .detoxrc, the user's normal_seq will replace the system-wide version.
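
A sketch of such an override, as it might appear in a user's .detoxrc, follows. The choice of filters is only illustrative; the point is that this definition would take the place of any system-wide normal_seq. Because names are case sensitive, a sequence spelled Normal_Seq would be a separate sequence rather than a replacement.

```
# user's .detoxrc: replaces the system-wide sequence of the same name
sequence "normal_seq" {
    utf_8;
    safe;
    wipeup;
};
```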
ignore {filename "filename"; ...};
Files listed here are ignored by detox during recursion.

# comments
Anything after a # on a line is treated as a comment.

All of the following statements occur within a sequence block.
iso8859_1;
iso8859_1 {builtin "name";};
iso8859_1 {filename "/path/to/filename";};

If builtin is specified, a builtin table with the name specified will be used.
Under normal circumstances, the filename syntax is not needed.
detox looks in several locations for a file
called iso8859_1.tbl, which is a set of rules
defining how an ISO 8859-1 character should be translated. If
detox can't find the translation table, it will
fall back on the builtin table iso8859_1.
You can also download or create your own, and tell
detox the location of it using the filename
syntax shown above.
You can chain together multiple iso8859_1 filters, as long as the default value of all but the last one is empty. This is explained in detox.tbl(5).
This filter is mutually exclusive with the utf_8 filter.
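
A minimal sketch of such a chain is shown below. The sequence name and the table path are hypothetical, and the custom table is assumed to have an empty default value so that characters it does not handle fall through to the builtin table that follows it.

```
# custom table first (empty default assumed), builtin table last
sequence "latin1-custom" {
    iso8859_1 {
        filename "/path/to/custom.tbl";
    };
    iso8859_1 {
        builtin "iso8859_1";
    };
    safe;
    wipeup;
};
```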
utf_8;
utf_8 {builtin "name";};
utf_8 {filename "/path/to/filename";};

This operates in a manner similar to iso8859_1, except it looks for a translation table called unicode.tbl.
Similar to the iso8859_1 filter, an internal table exists, based on the stock translation table, called unicode.
uncgi;

safe;
safe {builtin "name";};
safe {filename "/path/to/filename";};

Similar to the iso8859_1 and utf_8 filters, this can be controlled using a translation table. This filter also has an internal version of the translation table, which can be accessed via the builtin table safe.
wipeup;
wipeup {remove_trailing;};

If remove_trailing is set, then periods are added to the set of characters to work on. The period then takes precedence, followed by the dash.
If a hash character, underscore, or dash is present at the start of the filename, it will be removed.
max_length {length value;};

For instance, given a max length of 12, and a filename of this_is_my_file.txt, the filter would output this_is_.txt.
lower;

# transliterate UTF-8 to ASCII (using chained tables), clean up
sequence utf8 {
    utf_8 {
        filename "/usr/local/share/detox/custom.tbl";
    };
    utf_8 {
        builtin "unicode";
    };
    safe {
        builtin "safe";
    };
    wipeup {
        remove_trailing;
    };
    max_length {
        length 128;
    };
};

# decode CGI, transliterate CP-1252 to ASCII, clean up
sequence "cgi-cp1252" {
    uncgi;
    iso8859_1 {
        builtin "cp1252";
    };
    safe {
        builtin "safe";
    };
};
detox(1), inline-detox(1), detox.tbl(5), ascii(7), iso_8859-1(7), unicode(7), utf-8(7)
detox was written by Doug Harple.
| February 24, 2021 | Debian |