Re: [HACKERS] Inefficiencies in COPY command

From: Wayne Piekarski <wayne(at)senet(dot)com(dot)au>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane)
Cc: wayne(at)senet(dot)com(dot)au, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Inefficiencies in COPY command
Date: 1999-08-21 06:33:08
Message-ID: 199908210633.QAA10242@helpdesk.senet.com.au
Lists: pgsql-hackers

Tom Lane wrote -
> Wayne Piekarski <wayne(at)senet(dot)com(dot)au> writes:
> > So I made some changes to CopyAttributeOut so that it escapes the string
> > initially into a temporary buffer (allocated onto the stack) and then
> > feeds the whole string to the CopySendData which is a lot more efficient
> > because it can blast the whole string in one go, saving about 1/3 to 1/4
> > the number of memcpy and so on.
>
> copy.c is pretty much of a hack job to start with, IMHO. If you can
> speed it up without making it even uglier, have at it! However, it
> also has to be portable, and what you've done here:

OK, I will rework my CopyAttributeOut changes so they are less of a hack
(using all those #defines wasn't very "elegant") and then submit a proper
patch. This was pretty straightforward to fix up.
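
To make the idea concrete, here is a minimal sketch of the "escape into a
stack buffer, then send it in one go" approach. It is not the actual copy.c
code: escape_attribute, send_attribute, the fixed 1024-byte buffer, and the
use of fwrite in place of CopySendData are all assumptions made for
illustration, as is the choice of which characters get a backslash.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from copy.c): copy src into dst, putting a
 * backslash in front of the delimiter, newline, and backslash characters.
 * Returns the number of bytes written, or (size_t) -1 if dst is too small.
 * The result is not NUL-terminated; the caller uses the returned length.
 */
size_t
escape_attribute(const char *src, char delim, char *dst, size_t dstlen)
{
    size_t      n = 0;

    for (; *src != '\0'; src++)
    {
        char        c = *src;
        int         needs_escape = (c == delim || c == '\n' || c == '\\');

        /* An escaped character needs two bytes, a plain one needs one. */
        if (n + (needs_escape ? 2 : 1) > dstlen)
            return (size_t) -1;
        if (needs_escape)
            dst[n++] = '\\';
        dst[n++] = c;
    }
    return n;
}

/*
 * Hypothetical stand-in for the output path: escape the value into a
 * fixed-size stack buffer and hand it to the stream with a single call.
 * Returns 0 on success, -1 if the value did not fit or the write failed
 * (a real caller would fall back to the per-character path when it
 * does not fit).
 */
int
send_attribute(FILE *fp, const char *value, char delim)
{
    char        buf[1024];      /* assumed size, chosen for illustration */
    size_t      len = escape_attribute(value, delim, buf, sizeof(buf));

    if (len == (size_t) -1)
        return -1;
    return (fwrite(buf, 1, len, fp) == len) ? 0 : -1;
}
```

For example, send_attribute(stdout, "a\tb", '\t') would write the four
bytes a, backslash, tab, b with one fwrite call instead of four
per-character calls.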

> While formatting an int is pretty simple, formatting a float is not so
> simple. I'd be leery of replacing sprintf with quick-hack float
> conversion code. OTOH, if someone wanted to go to the trouble of doing
> it *right*, using our own code would tend to yield more consistent
> results across different OSes, which would be a Good Thing. I'm not
> sure it'd be any faster than the typical sprintf, but it might be worth
> doing anyway.

I understand that GPL code can't be used in Postgres because the GPL is not
compatible with its BSD license, but would it be acceptable to take code
from BSD-licensed sources? If so, the libc on my FreeBSD machine here
includes the internals used by sprintf for rendering integers (and floats),
so we could include that code, and it should hopefully be portable as well.

This would be a lot faster than going via sprintf and lots of other
functions, and would make not just COPY but, I think, any SELECT query run
faster as well (because the results get converted to strings by the output
functions, don't they?). Another advantage might be better results in the
regression tests for problem types like int8, which has had trouble under
some BSDs in the past.
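
As a rough illustration of the integer case only, a hand-written conversion
could look something like the sketch below. This is not the FreeBSD libc
code, and format_int64 is a hypothetical name; it just shows that rendering
an int8 value needs no format-string parsing. Floats are deliberately left
out, since doing those correctly is a much bigger job.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write the decimal form of value into out, which
 * must have room for at least 21 bytes (sign, up to 19 digits, and the
 * terminating NUL). Returns the number of characters written, not
 * counting the NUL.
 */
size_t
format_int64(int64_t value, char *out)
{
    char        tmp[20];        /* digits, least significant first */
    size_t      ndigits = 0;
    size_t      len = 0;

    /* Take the magnitude in unsigned arithmetic so INT64_MIN is safe. */
    uint64_t    u = (value < 0) ? (uint64_t) 0 - (uint64_t) value
                                : (uint64_t) value;

    /* Peel off digits from the low end; zero still yields one digit. */
    do
    {
        tmp[ndigits++] = (char) ('0' + (u % 10));
        u /= 10;
    } while (u != 0);

    if (value < 0)
        out[len++] = '-';

    /* Copy the digits back out in the right order. */
    while (ndigits > 0)
        out[len++] = tmp[--ndigits];
    out[len] = '\0';

    return len;
}
```

Because the function returns the length, a caller could pass the result
straight to a bulk send routine without a separate strlen call.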

Does what I've written above sound ok? If so I'll go and work up something
and come back with a patch.

bye,
Wayne

------------------------------------------------------------------------------
Wayne Piekarski                               Tel:     (08) 8221 5221
Research & Development Manager                Fax:     (08) 8221 5220
SE Network Access Pty Ltd                     Mob:     0407 395 889
222 Grote Street                              Email:   wayne(at)senet(dot)com(dot)au
Adelaide SA 5000                              WWW:     http://www.senet.com.au
