Skip site navigation (1) Skip section navigation (2)

Re: Perl/DBI vs Native

From: Craig James <craig_james(at)emolecules(dot)com>
To: valiouk(at)yahoo(dot)co(dot)uk
Cc: psql performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Perl/DBI vs Native
Date: 2008-07-21 17:24:03
Message-ID: 4884C633.5050506@emolecules.com (view raw or flat)
Thread:
Lists: pgsql-performance
Valentin Bogdanov wrote: 
> I have ran quite a few tests comparing how long a query takes to
> execute from Perl/DBI as compared to psql/pqlib. No matter how many
> times I run the test the results were always the same.
> 
> I run a SELECT all on a fairly big table and enabled the
> log_min_duration_statement option. With psql postgres consistently
> logs half a second while the exact same query executed with Perl/DBI
> takes again consistently 2 seconds.

The problem may be that your two tests are not equivalent.  When Perl executes a statement, it copies the *entire* result set back to the client before it returns the first row.  The following program might appear to just be fetching the first row:

  $sth = $dbh->prepare("select item from mytable");
  $sth->execute();
  $item = $sth->fetchrow_array();

But in fact, before Perl returns from the $sth->execute() statement, it has already run the query and copied all of the rows into a hidden, client-side cache.  Each $sth->fetchrow_array() merely copies the data from the hidden cache into your local variable.

By contrast, psql executes the query, and starts returning the data a page at a time.  So it may appear to be much faster.

This also means that Perl has trouble with very large tables.  If the "mytable" in the above example is very large, say a hundred billion rows, you simply can't execute this statement in Perl.  It will try to copy 100 billion rows into memory before returning the first answer.

The reason for Perl's behind-the-scenes caching is because it allows multiple connections to a single database, and multiple statements on each database handle.  By executing each statement completely, it gives the appearance that multiple concurrent queries are supported.  The downside is that it can be a huge memory hog.

Craig

In response to

Responses

pgsql-performance by date

Next:From: LevĂ­ Teodoro da SilvaDate: 2008-07-21 18:20:27
Subject: [BACKUPS]Little backups
Previous:From: Mario WeilguniDate: 2008-07-21 15:29:14
Subject: Re: Less rows -> better performance?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group