Re: CopyReadLineText optimization

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-patches(at)postgresql(dot)org
Subject: Re: CopyReadLineText optimization
Date: 2008-03-06 20:29:18
Message-ID: Pine.GSO.4.64.0803061510370.380@westnet.com
Lists: pgsql-hackers pgsql-patches

On Thu, 6 Mar 2008, Heikki Linnakangas wrote:

> At the most conservative end, we could fall back to the current method
> on the first escape, quote or backslash character.

I would just count the number of escape/quote characters on each line,
and then at the end of the line switch modes between the current code and
the new version based on what the previous line looked like. That way the
only additional overhead is a small bit when escapes show up often,
plus a touch more just once per line. Barely noticeable in the case where
nothing is escaped, very small regression for escape-heavy stuff but
certainly better than the drop you reported in the last rev of this patch.

Rev two of that design would keep a weighted moving average of the total
number of escaped characters per line (say wma=(7*wma+current)/8) and
switch modes based on that instead of the previous line alone. There's
enough play around the point where one approach starts to beat the other
that it should be easy enough to pick a decent transition point.
Based on your data I would put the transition at wma>4, which should keep
the old code in play even if only half the lines have the bad regression
that shows up with >8 escapes per line.
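
To make the rev two idea concrete, here is a rough sketch of the bookkeeping,
not a patch against copy.c. Every name in it (EscapeTracker, count_escapes,
use_old_code_for_next_line, ESCAPE_WMA_THRESHOLD) is made up for
illustration, and it assumes the caller counts escapes once per line and asks
which code path to use for the next one. The average is kept as a double,
which is my assumption: with plain integer division the formula truncates and
can stall well below the threshold (at 8 escapes per line it goes 0, 1, 1, ...).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; nothing here exists in copy.c. */
#define ESCAPE_WMA_THRESHOLD 4.0	/* the "wma>4" transition point */

typedef struct EscapeTracker
{
	double		wma;			/* weighted moving average of escapes per line */
} EscapeTracker;

/*
 * Count the backslash, quote and escape characters in one line.
 * Stops at the first newline or at the terminating NUL.
 */
static int
count_escapes(const char *line, char quote, char escape)
{
	int			n = 0;

	assert(line != NULL);
	for (const char *p = line; *p != '\0' && *p != '\n'; p++)
	{
		if (*p == '\\' || *p == quote || *p == escape)
			n++;
	}
	return n;
}

/*
 * Fold the count for the line just read into the moving average,
 * wma = (7*wma + current) / 8, and report whether the next line
 * should be read with the old (current) code path.
 */
static bool
use_old_code_for_next_line(EscapeTracker *t, int escapes_in_line)
{
	assert(t != NULL);
	t->wma = (7.0 * t->wma + escapes_in_line) / 8.0;
	return t->wma > ESCAPE_WMA_THRESHOLD;
}
```

Starting from wma=0, a steady run of lines with 8 escapes each crosses the
threshold on the sixth line, so a single odd line in otherwise clean data
would not flip the mode.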

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
