Fwd: PQgetlength vs. octet_length()

From: Michael Clark <codingninja(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Fwd: PQgetlength vs. octet_length()
Date: 2009-08-18 18:42:22
Message-ID: bf5d83510908181142i6e423f7fj1ed12afd3851cb2e@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Tue, Aug 18, 2009 at 1:48 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:

> On Tue, Aug 18, 2009 at 6:39 PM, Michael Clark <codingninja(at)gmail(dot)com> wrote:
> > But it seems pretty crazy that a 140meg bit of data goes to 1.3 gigs.
> > Does that seem a bit excessive?
>
> From what you posted earlier it looked like it was turning into about
> 500M which sounds about right. Presumably either libpq or your code is
> holding two copies of it in ram at some point in the process.
>

From what I saw while running my code under gdb, when stopped at this line:
const char *valC = PQgetvalue(result, rowIndex, i);
my memory usage was 300 megs. After stepping over this line it went to 1.3 gigs.
Unless there is some way to misconfigure something, I can't think how my
code could do that.
I will profile it and see if I can tell what is holding on to that memory.

> 8.5 will have an option to use a denser hex encoding but it will still
> be 2x as large as the raw data.
>

Sweet!
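
To put rough numbers on the text encodings discussed above, here is a minimal sketch in C that computes how long the text form of a bytea value would be. It assumes the documented rules: the escape output format writes a backslash as two characters, writes any byte outside printable ASCII as a four-character \ooo octal escape, and leaves other bytes as-is, while the hex format is a "\x" prefix plus two hex digits per byte. The function names escape_text_len and hex_text_len are made up for illustration and are not part of libpq.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the text form of `len` raw bytes in the
 * bytea "escape" output format. A backslash becomes \\ (2 chars), bytes
 * outside printable ASCII (32..126) become \ooo (4 chars), and every
 * other byte is emitted unchanged (1 char). */
size_t escape_text_len(const unsigned char *data, size_t len)
{
    size_t out = 0;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char b = data[i];

        if (b == '\\')
            out += 2;
        else if (b < 32 || b > 126)
            out += 4;
        else
            out += 1;
    }
    return out;
}

/* Hypothetical helper: length of the hex text form, which is the "\x"
 * prefix followed by two hex digits for each raw byte. */
size_t hex_text_len(size_t len)
{
    return 2 + 2 * len;
}
```

Under these rules the escape form can be up to four times the raw size when most bytes are non-printable, so 140 megs of raw data could plausibly occupy several hundred megs as text before any extra copies are made. The hex form stays at about twice the raw size regardless of content.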

>
> > I avoided the binary mode because that seemed to be rather confusing when
> > having to deal with non-bytea data types. The docs make it sound like
> > binary mode should be avoided because what you get back for a datetime
> > varies per platform.
>
> There are definitely disadvantages. Generally it requires you to know
> what the binary representation of your data types is and they're not
> all well documented or guaranteed not to change in the future. I
> wouldn't recommend someone try to decode a numeric or a postgres array
> for example. And floating point numbers are platform dependent. But
> bytea is a case where it seems more natural to use binary than text
> representation.
>
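
As a small illustration of what "knowing the binary representation" means in practice, the sketch below decodes one of the simpler cases: an int4 value, which arrives in binary format as four bytes in network (big-endian) byte order. The helper name decode_int4_be is hypothetical and not part of libpq; in real code the buffer would be the pointer returned by PQgetvalue() for a result fetched in binary format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a 4-byte big-endian (network order) integer,
 * which is the layout of an int4 column value in binary format.
 * The bytes are assembled as unsigned and then converted to int32_t,
 * which yields the expected two's-complement value on common platforms. */
int32_t decode_int4_be(const unsigned char *buf)
{
    uint32_t u;

    assert(buf != NULL);
    u = ((uint32_t)buf[0] << 24) |
        ((uint32_t)buf[1] << 16) |
        ((uint32_t)buf[2] << 8)  |
         (uint32_t)buf[3];
    return (int32_t)u;
}
```

Types such as numeric or arrays have far more involved layouts, which is why the advice above is to stay away from decoding them by hand. A bytea value in binary format needs no decoding at all: the bytes are the data.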

To do something like this, I guess it would be best for my wrapper to begin
detecting when I have a bytea column in a table and do 2 fetches: one in text
for all other columns, and one in binary for the bytea column. Do you think
this is the best way to handle that?
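
The detection step could look roughly like the sketch below, which splits a result's columns into a "fetch as binary" group (bytea) and a "fetch as text" group (everything else). It is only a sketch under assumptions: split_columns is a hypothetical helper, the type OIDs are passed in as a plain array so the example stands alone, and in a real wrapper they would come from PQftype() on a result. The value 17 is the OID of the built-in bytea type.

```c
#include <assert.h>
#include <stddef.h>

/* OID of the built-in bytea type in PostgreSQL's pg_type catalog. */
#define BYTEA_OID 17u

/* Hypothetical helper: given the type OID of each column (as PQftype()
 * would report it), store the indexes of bytea columns in binary_cols and
 * the indexes of all other columns in text_cols. Both output arrays must
 * have room for ncols entries. Returns the number of bytea columns; the
 * number of text columns is ncols minus that value. */
size_t split_columns(const unsigned int *type_oids, size_t ncols,
                     size_t *binary_cols, size_t *text_cols)
{
    size_t nbin = 0;
    size_t ntxt = 0;

    assert(type_oids != NULL || ncols == 0);
    for (size_t i = 0; i < ncols; i++) {
        if (type_oids[i] == BYTEA_OID)
            binary_cols[nbin++] = i;
        else
            text_cols[ntxt++] = i;
    }
    return nbin;
}
```

With the two groups in hand, the wrapper could issue one query for the text columns and a second one through PQexecParams() with its resultFormat argument set to 1 for the bytea columns. libpq applies a single result format to every column of a query, which is why two fetches would be needed here.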

Thanks,
Michael.
