Re: CopyReadLineText optimization

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: CopyReadLineText optimization
Date: 2008-03-06 20:44:47
Message-ID: 47D057BF.9030302@dunslane.net
Lists: pgsql-hackers pgsql-patches

Greg Smith wrote:
> On Thu, 6 Mar 2008, Heikki Linnakangas wrote:
>
>> At the most conservative end, we could fall back to the current
>> method on the first escape, quote or backslash character.
>
> I would just count the number of escaped/quote characters on each
> line, and then at the end of the line switch modes between the current
> code and the new version based on what the previous line looked like.
> That way the only additional overhead is a small bit only when escapes
> show up often, plus a touch more just once per line. Barely noticeable
> in the case where nothing is escaped, very small regression for
> escape-heavy stuff but certainly better than the drop you reported in
> the last rev of this patch.
>
> Rev two of that design would keep a weighted moving average of the
> total number of escaped characters per line (say
> wma=(7*wma+current)/8) and switch modes based on that instead of the
> previous one. There's enough play in the range where the two
> approaches perform about equally that it should be easy enough to
> pick a decent transition point. Based on your data I would put the
> transition at wma>4, which should keep the old code in play even if
> only half the lines have the bad regression that shows up with >8
> escapes per line.
>
>
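
For reference, the rev-two heuristic quoted above could be sketched roughly as follows. This is only an illustration, not code from the patch: EscapeTracker, count_specials and update_and_use_old_path are made-up names, and the average is kept as a double because the quoted formula, done in plain integer arithmetic, would truncate and stall (with 8 escapes per line it gets stuck at 1).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff, taken from the "wma>4" suggestion in the quoted mail. */
#define WMA_THRESHOLD 4.0

/* Hypothetical per-COPY state; not a structure from copy.c. */
typedef struct EscapeTracker
{
    double wma;     /* weighted moving average of special chars per line */
} EscapeTracker;

/* Count the backslash and quote characters in one line. */
static int
count_specials(const char *line, size_t len, char quote)
{
    int n = 0;

    for (size_t i = 0; i < len; i++)
    {
        if (line[i] == '\\' || line[i] == quote)
            n++;
    }
    return n;
}

/*
 * Fold the count for the line just read into the average:
 *     wma = (7 * wma + current) / 8
 * Returns true if the next line should be read with the old code.
 */
static bool
update_and_use_old_path(EscapeTracker *t, int specials_this_line)
{
    t->wma = (7.0 * t->wma + specials_this_line) / 8.0;
    return t->wma > WMA_THRESHOLD;
}
```

With these assumptions, a run of lines with 8 specials each flips to the old code after the sixth line, and input with no specials never leaves the new code.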

I'd be inclined just to look at the first buffer of data we read in, and
make a one-off decision there, if we can get away with it. Then the cost
of testing is fixed rather than per line.
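
Something along these lines is what I have in mind, as a rough sketch only. The names (CopyScanMode, choose_scan_mode) and the cutoff of 4 special characters per line are assumptions for illustration, the latter borrowed from Greg's suggested transition point; none of it is taken from copy.c.

```c
#include <stddef.h>

/* Hypothetical cutoff: average special characters per line. */
#define SPECIALS_PER_LINE_THRESHOLD 4

/* Hypothetical mode flag, chosen once per COPY. */
typedef enum CopyScanMode
{
    COPY_SCAN_NEW,      /* the optimized scan from the patch */
    COPY_SCAN_OLD       /* the current character-at-a-time scan */
} CopyScanMode;

/*
 * Inspect the first buffer of input and make a one-off choice.
 * Counts backslash, quote and escape characters and compares the
 * average per line against the threshold.
 */
static CopyScanMode
choose_scan_mode(const char *buf, size_t len, char quote, char escape)
{
    size_t specials = 0;
    size_t lines = 0;

    for (size_t i = 0; i < len; i++)
    {
        char c = buf[i];

        if (c == '\n')
            lines++;
        else if (c == '\\' || c == quote || c == escape)
            specials++;
    }

    /* A buffer with no newline is treated as one partial line. */
    if (lines == 0)
        lines = 1;

    /* Same as specials / lines > threshold, without the division. */
    if (specials > (size_t) SPECIALS_PER_LINE_THRESHOLD * lines)
        return COPY_SCAN_OLD;
    return COPY_SCAN_NEW;
}
```

The obvious weakness is that the first buffer may not be representative of the rest of the file, which is the "if we can get away with it" part.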

cheers

andrew
