| Sort::Naturally(3pm) | User Contributed Perl Documentation | Sort::Naturally(3pm) |
Sort::Naturally -- sort lexically, but sort numeral parts numerically
@them = nsort(qw(
foo12a foo12z foo13a foo 14 9x foo12 fooa foolio Foolio Foo12a
));
print join(' ', @them), "\n";
Prints:
9x 14 foo fooa foolio Foolio foo12 foo12a Foo12a foo12z foo13a
(Or "foo12a" and "Foo12a", and likewise "foolio" and "Foolio", might be switched, depending on your locale.)
This module exports two functions, "nsort" and "ncmp"; they are used in implementing my idea of a "natural sorting" algorithm. Under natural sorting, numeric substrings are compared numerically, and other word-characters are compared lexically.
This is the way I define natural sorting:
foo => "foo", -1
foobar => "foo", -1, "bar"
foo13 => "foo", 13,
foo13xyz => "foo", 13, "xyz"
That's so that "foo" will come before "foo13", which will come before "foobar".
The "nsort" function takes a list of strings and returns a sorted copy of the list.
This is what most people will want to use:
@stuff = nsort(...list...);
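As a small illustration of the call above, the following sketch compares Perl's built-in "sort" with "nsort" on the same list. The chapter labels are hypothetical sample data, not taken from the module.

```perl
use strict;
use warnings;
use Sort::Naturally qw(nsort);

# Hypothetical input: labels whose numeric parts differ in width.
my @chapters = qw(chapter10 chapter2 chapter1);

# Plain sort compares character by character, so "chapter10"
# lands before "chapter2".
my @plain = sort @chapters;

# nsort compares the digit runs as numbers instead.
my @natural = nsort(@chapters);

print "plain:   @plain\n";     # chapter1 chapter10 chapter2
print "natural: @natural\n";   # chapter1 chapter2 chapter10
```

Note that nsort returns a new list; @chapters itself is left in its original order.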
When nsort needs to compare non-numeric substrings, it uses Perl's "cmp" operator in the scope of a "use locale". And when nsort needs to lowercase things, it uses Perl's "lc" function in the scope of a "use locale". If you want nsort to use other functions instead, you can specify them in an arrayref as the first argument to nsort:
@stuff = nsort( [
\&string_comparator, # optional
\&lowercaser_function # optional
],
...list...
);
If you want to specify a string comparator but no lowercaser, then the options list is "[\&comparator, '']" or "[\&comparator]". If you want to specify no string comparator but a lowercaser, then the options list is "['', \&lowercaser]".
Any comparator you specify is called as "$comparator->($left, $right)", and, like a normal Perl "cmp" replacement, must return -1, 0, or 1 depending on whether the left argument is stringwise less than, equal to, or greater than the right argument.
Any lowercaser function you specify is called as "$lowercased = $lowercaser->($original)". The routine must not modify its $_[0].
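The two calling conventions described above can be sketched with plain Perl closures. This is only an illustration of the contracts (return -1, 0, or 1; do not modify $_[0]); the names $comparator and $lowercaser are hypothetical, and the sketch calls them directly rather than through the module.

```perl
use strict;
use warnings;

# Hypothetical comparator: called as $comparator->($left, $right),
# returns -1, 0, or 1, just as cmp does.
my $comparator = sub {
    my ($left, $right) = @_;
    return $left cmp $right;
};

# Hypothetical lowercaser: copies its argument before lowercasing,
# so the caller's $_[0] is never modified.
my $lowercaser = sub {
    my ($original) = @_;
    return lc $original;
};

my $input      = 'FooBar';
my $lowercased = $lowercaser->($input);

print $comparator->('apple', 'banana'), "\n";   # -1
print "$lowercased (input still $input)\n";     # foobar (input still FooBar)
```

Copying @_ into lexicals first is the simplest way to guarantee the "must not modify its $_[0]" rule, since the elements of @_ are aliases to the caller's variables.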
Often, when sorting non-string values like this:
@objects_sorted = sort { $a->tag cmp $b->tag } @objects;
...or even in a Schwartzian transform, like this:
@strings =
  map $_->[0],
  sort { $a->[1] cmp $b->[1] }
  map { [$_, make_a_sort_key_from($_)] }
  @_
;
...you might want something that replaces not "sort", but "cmp". That's what Sort::Naturally's "ncmp" function is for. Call it with the syntax "ncmp($left,$right)" instead of "$left cmp $right", but otherwise it's a fine replacement:
@objects_sorted = sort { ncmp($a->tag,$b->tag) } @objects;
@strings =
  map $_->[0],
  sort { ncmp($a->[1], $b->[1]) }
  map { [$_, make_a_sort_key_from($_)] }
  @_
;
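For instance, here is a minimal sketch of "ncmp" inside a sort block, ordering hash-based records by one field. The records and the "name" and "size" keys are hypothetical sample data.

```perl
use strict;
use warnings;
use Sort::Naturally qw(ncmp);

# Hypothetical records; the "name" field serves as the sort key.
my @records = (
    { name => 'img12', size => 300 },
    { name => 'img2',  size => 100 },
    { name => 'img1',  size => 200 },
);

# ncmp stands in for cmp, so the digit runs compare as numbers.
my @sorted = sort { ncmp($a->{name}, $b->{name}) } @records;

print join(' ', map { $_->{name} } @sorted), "\n";   # img1 img2 img12
```

With a plain "cmp" in the sort block, "img12" would come before "img2".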
Just as "nsort" can take a different string comparator and/or lowercaser, you can do the same with "ncmp", by passing an arrayref as the first argument:
ncmp( [
\&string_comparator, # optional
\&lowercaser_function # optional
],
$left, $right
)
You might get string comparators from Sort::ArbBiLex.
if(@set >= SOME_VERY_BIG_NUMBER) {
  no locale; # vroom vroom
  @sorted = sort(@set); # feh, good enough
} elsif(@set >= SOME_BIG_NUMBER) {
  use locale;
  @sorted = sort(@set); # feh, good enough
} else {
  # but keep it pretty for normal cases
  @sorted = nsort(@set);
}
Copyright 2001, Sean M. Burke "sburke@cpan.org", all rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
Sean M. Burke "sburke@cpan.org"
| 2022-11-19 | perl v5.36.0 |