From: Mike Lewis <mikelikespie(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Farina <drfarina(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance Enhancement/Fix for Array Utility Functions
Date: 2010-07-28 05:20:38
Message-ID: AANLkTintJyQCON5-MNJEKia7hWoY2NxKxVHHw9up150+@mail.gmail.com
Lists: pgsql-hackers

> > 1. As-is, it's a significant *pessimization* for small arrays, because
> > the heap_tuple_untoast_attr_slice code does a palloc/copy even when one
> > is not needed because the data is already not toasted. I think there
> > needs to be a code path that avoids that.
>
> This seems like it shouldn't be too hard to fix, and I think it should be
> fixed.
>

Do you have any suggestions on where to start? I agree that this should be
fixed as well. I don't have much time to dedicate to this project, but I can
try to put in some time this weekend if it isn't looking too bad.

> > 2. Arrays that are large enough to be pushed out to toast storage are
> > almost certainly going to get compressed first. The potential win in
> > this case is very limited because heap_tuple_untoast_attr_slice will
> > fetch and decompress the whole thing. Admittedly this is a limitation
> > of the existing code and not a fault of the patch proper, but still, if
> > you want to make something that's generically useful, you need to do
> > something about that. Perhaps pglz_decompress() could be extended with
> > an argument to say "decompress no more than this much" --- although that
> > would mean adding another test to its inner loop, so we'd need to check
> > for performance degradation. I'm also unsure how to predict how much
> > compressed data needs to be read in to get at least N bytes of
> > decompressed data.
>
> But this part seems to me to be something that can, and probably
> should, be left for a separate patch. I don't see that we're losing
> anything this way; we're just not winning as much as we otherwise
> might. To really fix this, I suspect we'd need a version of
> pglz_decompress that can operate in "streaming mode"; you tell it how
> many bytes you want, and it returns a code indicating that either (a)
> it decompressed that many bytes or (b) it decompressed less than that
> many bytes and you can call it again with another chunk of data if you
> want to keep going. That'd probably be a good thing to have, but I
> don't think it needs to be a prerequisite for this patch.
>

Hopefully that's the case and the decompression work can wait for a separate
patch. I can try tackling the first part, however.
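
For my own understanding, here is roughly what I picture that streaming
interface looking like. This is purely hypothetical -- nothing like it exists
in pg_lzcompress.c today, and every name below is made up:

#include "postgres.h"

/*
 * Hypothetical sketch: the caller asks for dstlen bytes of decompressed
 * output, and the return code says whether it got them all or ran out of
 * compressed input first, per the contract Robert describes above.
 */
typedef enum PglzStreamResult
{
    PGLZ_STREAM_DONE,       /* all dstlen requested bytes were written */
    PGLZ_STREAM_NEED_INPUT  /* input exhausted first; call again with more */
} PglzStreamResult;

extern PglzStreamResult pglz_decompress_stream(const char *src, int32 srclen,
                                               char *dst, int32 dstlen,
                                               int32 *src_consumed,
                                               int32 *dst_written);

The caller would loop, feeding in further compressed chunks whenever it gets
PGLZ_STREAM_NEED_INPUT back, until *dst_written adds up to the slice it
needs.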

Thanks,
Mike
--
Michael Lewis
lolrus.org
mikelikespie(at)gmail(dot)com
