| Catmandu::Importer(3pm) | User Contributed Perl Documentation | Catmandu::Importer(3pm) |
Catmandu::Importer - Namespace for packages that can import
# From the command line
# JSON is an importer and YAML an exporter
$ catmandu convert JSON to YAML < data.json
# OAI is an importer and JSON an exporter
$ catmandu convert OAI --url http://biblio.ugent.be/oai to JSON
# Fetch remote content
$ catmandu convert JSON --file http://example.com/data.json to YAML
# From Perl
use Catmandu;
use Data::Dumper;
my $importer = Catmandu->importer('JSON', file => 'data.json');
$importer->each(sub {
    my $item = shift;
    print Dumper($item);
});
my $num = $importer->count;
my $first_item = $importer->first;
# Convert OAI to JSON in Perl
my $importer = Catmandu->importer('OAI', url => 'http://biblio.ugent.be/oai');
my $exporter = Catmandu->exporter('JSON');
$exporter->add_many($importer);
A Catmandu::Importer is a Perl package that can generate structured data from formats such as JSON, YAML, XML and RDF, from network protocols such as Atom, OAI-PMH and SRU, and even from DBI databases. Given a Catmandu::Importer, a programmer can read data from it using one of the many Catmandu::Iterable methods:
$importer->to_array;
$importer->count;
$importer->each(\&callback);
$importer->first;
$importer->rest;
...etc...
Every Catmandu::Importer is also Catmandu::Fixable and thus inherits a 'fix' parameter that can be set in the constructor. When a 'fix' parameter is given, each item returned by the generator is automatically fixed using one or more Catmandu::Fix-es. For example:
my $importer = Catmandu->importer('JSON', fix => ['upcase(title)']);
$importer->each(sub {
    my $item = shift; # Every $item->{title} is now upcased...
});
# or via a Fix file
my $importer = Catmandu->importer('JSON', fix => ['/my/fixes.txt']);
$importer->each(sub {
    my $item = shift; # Every $item has been processed by the fixes in /my/fixes.txt...
});
# given this imported item:
{abc => [{a=>1},{b=>2},{c=>3}]}
# with data_path 'abc', this item gets imported instead:
[{a=>1},{b=>2},{c=>3}]
# with data_path 'abc.*', 3 items get imported:
{a=>1}
{b=>2}
{c=>3}
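The effect of data_path can be illustrated with a small pure-Perl sketch. The helper expand_path below is hypothetical and not part of Catmandu; it only handles plain hash keys and a '*' wildcard over arrays, and Catmandu's own path handling may differ in details.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Catmandu: walk a dotted path through
# nested hashes; a '*' step fans out over the elements of an array.
sub expand_path {
    my ($item, $path) = @_;
    my @current = ($item);
    for my $step (split /\./, $path) {
        my @next;
        for my $node (@current) {
            if ($step eq '*') {
                push @next, @$node if ref($node) eq 'ARRAY';
            }
            elsif (ref($node) eq 'HASH' && exists $node->{$step}) {
                push @next, $node->{$step};
            }
        }
        @current = @next;
    }
    return @current;
}

my $record = { abc => [ { a => 1 }, { b => 2 }, { c => 3 } ] };

my @whole = expand_path($record, 'abc');      # one item: the array itself
my @each  = expand_path($record, 'abc.*');    # three items: one per element

printf "abc yields %d item(s), abc.* yields %d item(s)\n",
    scalar @whole, scalar @each;
```

The sketch mirrors the example above: the path 'abc' selects the array as a single item, while 'abc.*' turns each array element into an item of its own.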
# named arguments
my $importer = Catmandu->importer('JSON',
    file => 'http://{server}/{path}',
    variables => {server => 'biblio.ugent.be', path => 'file.json'},
);
# positional arguments
my $importer = Catmandu->importer('JSON',
    file => 'http://{server}/{path}',
    variables => 'biblio.ugent.be,file.json',
);
# or
my $importer = Catmandu->importer('JSON',
    file => 'http://{server}/{path}',
    variables => ['biblio.ugent.be','file.json'],
);
# or via the command line
$ catmandu convert JSON --file 'http://{server}/{path}' --variables 'biblio.ugent.be,file.json'
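To make the three forms of the variables option concrete, here is a minimal pure-Perl sketch of the substitution they imply. The interpolate function is hypothetical and is not how Catmandu implements it; it only assumes that named variables are matched by placeholder name and positional ones by order of appearance.

```perl
use strict;
use warnings;

# Hypothetical helper, not Catmandu's implementation: fill {name}
# placeholders from a hash ref (named), or from an array ref or a
# comma-separated string (both positional, in order of appearance).
sub interpolate {
    my ($template, $vars) = @_;
    if (ref($vars) eq 'HASH') {
        $template =~ s!\{(\w+)\}!$vars->{$1} // ''!ge;
        return $template;
    }
    my @values = ref($vars) eq 'ARRAY' ? @$vars : split /,/, $vars;
    $template =~ s!\{\w+\}!@values ? shift @values : ''!ge;
    return $template;
}

my $template = 'http://{server}/{path}';

my $named = interpolate($template,
    { server => 'biblio.ugent.be', path => 'file.json' });
my $positional = interpolate($template, 'biblio.ugent.be,file.json');
my $from_list  = interpolate($template, [ 'biblio.ugent.be', 'file.json' ]);

# Each line prints http://biblio.ugent.be/file.json
print "$_\n" for $named, $positional, $from_list;
```

All three calls produce the same URL, which is why the named, string and array forms above are interchangeable for this template.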
These options are only relevant if "file" is a URL. See LWP::UserAgent for details about these options.
See Catmandu::Iterable for all inherited methods.
Create your own importer by creating a Perl package in the Catmandu::Importer namespace that consumes the "Catmandu::Importer" role. Basically, you need to implement a method 'generator' that returns a callback, which produces one Perl hash for each call and undef at the end of the stream:
my $importer = Catmandu::Importer::Hello->new;
$importer->generate(); # record
$importer->generate(); # next record
$importer->generate(); # undef = end of stream
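The callback contract described above can be sketched in plain Perl, without Catmandu. The make_generator function and the in-memory file handle are assumptions made for illustration only; a real importer would obtain its handle from the importer object.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an importer's generator: returns a closure
# that yields one hash ref per call and undef at the end of the input.
sub make_generator {
    my ($fh) = @_;
    return sub {
        my $line = <$fh>;
        return unless defined $line;    # end of stream
        chomp $line;
        return { hello => $line };
    };
}

# Read from an in-memory string so the sketch is self-contained.
my $input = "Alice\nBob\n";
open my $fh, '<', \$input or die "cannot open in-memory handle: $!";

my $next = make_generator($fh);
my @records;
while (defined(my $record = $next->())) {
    push @records, $record;
}

print scalar(@records), " records, first is $records[0]{hello}\n";
```

Because the closure keeps its own state, the caller only has to invoke it repeatedly until it returns undef.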
Here is an example of a simple "Hello" importer:
package Catmandu::Importer::Hello;
use Catmandu::Sane;
use Moo;
with 'Catmandu::Importer';
sub generator {
    my ($self) = @_;
    # Input handle provided by the Catmandu::Importer role
    my $fh = $self->fh;
    my $n = 0;
    return sub {
        $self->log->debug("generating record " . ++$n);
        # One line of input per record; undef signals end of stream
        my $name = $fh->readline;
        return defined $name ? { "hello" => $name } : undef;
    };
}
1;
This importer can be called via the command line as:
$ catmandu convert Hello to JSON < /tmp/names.txt
$ catmandu convert Hello to YAML < /tmp/names.txt
$ catmandu import Hello to MongoDB --database_name test < /tmp/names.txt
Or, via Perl:
use Catmandu;
my $importer = Catmandu->importer('Hello', file => '/tmp/names.txt');
$importer->each(sub {
    my $item = shift;
});
Catmandu::Iterable, Catmandu::Fix, Catmandu::Importer::CSV, Catmandu::Importer::JSON, Catmandu::Importer::YAML
| 2023-03-03 | perl v5.36.0 |