Re: copy vs. C function

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: copy vs. C function
Date: 2011-12-15 03:18:38
Message-ID: CAKuK5J0waL-+BhHNn2EYe-tUu=iPFnnMU5--TNwBAZrvL-g=Xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Dec 14, 2011 at 9:51 AM, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> wrote:
> On Wed, Dec 14, 2011 at 9:40 AM, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> wrote:
>> On Wed, Dec 14, 2011 at 9:25 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> writes:
>>>> Regarding caching, I tried caching it across calls by making the
>>>> TupleDesc static and only initializing it once.
>>>> When I tried that, I got:
>>>
>>>> ERROR:  number of columns (6769856) exceeds limit (1664)
>>>
>>>> I tried to find some documentation or examples that cache the
>>>> information, but couldn't find any.
>>>
>>> You might find reading record_in to be helpful.  What it caches is not
>>> exactly what you need to, I think, but it shows the general principles.
>>> There are lots of other functions that use fn_extra to cache info, too.
>>
>> I will definitely look into those. I'm probably doing it wrong, but in
>> the meantime, I allocated enough space (by way of MemoryContextAlloc)
>> in TopMemoryContext for an AttInMetadata pointer, switched to that
>> memory context (just the first time through), used CreateTupleDescCopy
>> + TupleDescGetAttInMetadata to duplicate (in the new memory context)
>> the TupleDesc, and then switched back. This approach seems to have
>> dropped the total run time to about 54 seconds, the bulk of which is
>> BuildTupleFromCStrings, a rather significant improvement.
>>
>> ....
>>
>> Looking at record_in, I think I see what I could be doing better.
>
> Indeed. I revised the code to make use of fcinfo->flinfo->fn_extra for
> storage and fcinfo->flinfo->fn_mcxt for the MemoryContext and
> everything seemed to work just fine.
>
> Assuming one *starts* with a char *some_var[8], would building Datum
> myself be faster than using BuildTupleFromCStrings?

The answer is: yes. At least, in my case it is.
The total run time is now down to about 32 seconds.
Versus the BuildTupleFromCStrings which takes about 54 seconds.
32 seconds is more than 10-15 seconds, but it's livable.

This experiment has been very worthwhile - thank you all for the help.

--
Jon

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Smith 2011-12-16 18:27:57 Slow nested loop execution on larger server
Previous Message Rural Hunter 2011-12-15 01:54:06 Re: Is it possible to use index on column for regexp match operator '~'?