Re: Perl/DBI vs Native

From: Craig James <craig_james(at)emolecules(dot)com>
To: valiouk(at)yahoo(dot)co(dot)uk
Cc: psql performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Perl/DBI vs Native
Date: 2008-07-21 17:24:03
Message-ID: 4884C633.5050506@emolecules.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Valentin Bogdanov wrote:
> I have ran quite a few tests comparing how long a query takes to
> execute from Perl/DBI as compared to psql/pqlib. No matter how many
> times I run the test the results were always the same.
>
> I run a SELECT all on a fairly big table and enabled the
> log_min_duration_statement option. With psql postgres consistently
> logs half a second while the exact same query executed with Perl/DBI
> takes again consistently 2 seconds.

The problem may be that your two tests are not equivalent. When Perl executes a statement, it copies the *entire* result set back to the client before it returns the first row. The following program might appear to just be fetching the first row:

$sth = $dbh->prepare("select item from mytable");
$sth->execute();
$item = $sth->fetchrow_array();

But in fact, before Perl returns from the $sth->execute() statement, it has already run the query and copied all of the rows into a hidden, client-side cache. Each $sth->fetchrow_array() merely copies the data from the hidden cache into your local variable.

By contrast, psql executes the query, and starts returning the data a page at a time. So it may appear to be much faster.

This also means that Perl has trouble with very large tables. If the "mytable" in the above example is very large, say a hundred billion rows, you simply can't execute this statement in Perl. It will try to copy 100 billion rows into memory before returning the first answer.

The reason for Perl's behind-the-scenes caching is because it allows multiple connections to a single database, and multiple statements on each database handle. By executing each statement completely, it gives the appearance that multiple concurrent queries are supported. The downside is that it can be a huge memory hog.

Craig

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Leví Teodoro da Silva 2008-07-21 18:20:27 [BACKUPS]Little backups
Previous Message Mario Weilguni 2008-07-21 15:29:14 Re: Less rows -> better performance?