Re: [HACKERS] 8.2 features?

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject: Re: [HACKERS] 8.2 features?
Date: 2006-07-29 06:50:05
Message-ID: 44CB051D.6090609@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:
> Tom Lane wrote:
>> I thought Joe was off in a corner doing a whole new version.
>> (I'm willing to help if he needs help...)
>
> Yeah, I was going to post the latest tonight.

Sorry for the delay. Ever see the movie "The Money Pit"? This afternoon
I started to think I lived in that house :-(

Anyway, as mentioned below, I think the attached works well for the
"INSERT ... VALUES (...), (...), ..." and related cases. There are still
things wrong that I have not even tried to fix with respect to FROM
clause VALUES lists. Namely column aliases have no effect, and neither
does "ORDER BY" clause (I'm pretty sure addRangeTableEntryForValues
needs work among other places).

From a memory usage standpoint, I got the following using 1,000,000
values targetlists:

sql length = 6000032

NOTICE: enter transformInsertStmt
MessageContext: 478142520 total in 66 blocks; 5750400 free (3 chunks);
472392120 used

NOTICE: enter transformRangeValues
MessageContext: 478142520 total in 66 blocks; 5749480 free (6 chunks);
472393040 used

NOTICE: enter updateTargetListEntry
MessageContext: 629137464 total in 84 blocks; 44742464 free (999991
chunks); 584395000 used

NOTICE: exit transformInsertStmt
MessageContext: 629137464 total in 84 blocks; 44742408 free (999991
chunks); 584395056 used

NOTICE: start ExecInitValuesScan
MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks);
1008399424 used

NOTICE: end ExecInitValuesScan
MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks);
1008399424 used
ExecutorState: 8024632 total in 3 blocks; 21256 free (8 chunks); 8003376
used

This shows original SQL statement is about 6MB, by the time we get to
parse analysis we're at almost 500 MB, and that memory is never
recovered. Transforming from ResTarget to TargetEntry chews up about
100MB. Then between exiting transformInsertStmt and entering
ExecInitValuesScan we double in memory usage to about 1 GB. It isn't
shown here, but we add another 200 MB or so during tuple projection. So
we top out at about 1.2 GB. Note that mysql tops out at about 600 MB for
this same SQL.

I'm not sure what if anything can be done to improve the above -- I'm
open to suggestions.

Please note that this patch requires an initdb, although I have not yet
bothered to bump CATVERSION.

Thanks for help, comments, suggestions, etc...

Joe

>
> I'm afraid though that after 2 or so days heading down the last path you
> suggested (namely making a new jointree leaf node) I was having trouble,
> and at the same time came to the conclusion that adding a new RTE was
> alot cleaner and made more sense to me. So I'm hoping you won't want to
> send me back to the drawing board again. I believe I have cleaned up the
> things you objected to:
>
> 1. Now I'm not doing both alternative -- the targetlists are only
> attached to the RTE from the point of parse analysis onward.
> 2. I've eliminated the tuplestore in favor of runtime evaluation
> of the targetlists which are in an array (allowing forward or
> backward scanning -- although I haven't tested the latter yet).
>
> I've also solved the INSERT related issues that I had earlier:
>
> 1. Fixed the rules regression test -- now all regression tests pass
> 2. Fixed evaluation of DEFAULT values
> 3. Improved memory consumption and speed some more -- basically
> we are approximately equal to mysql as long as we don't swap,
> and we consume about twice the RAM as mysql instead of several
> times as much. I have more analysis of memory use I'd also like
> to share later.
> 4. I think the INSERT part of this is ready to go basically, but
> I need a bit more time to test corner cases.
>
> I've made some progress on "SELECT ... FROM (VALUES ...) AS ..."
>
> 1. No more shift/reduce issues
> 2. The ValuesScan work and memory improvements mentioned above
> applies here too.
> 3. This part still needs the most work though.
>
> I'll post a patch in a few hours -- there is some debug code in there
> currently that I should clean up before I send it to the list.
>
> BTW, I'm reserving Saturday, Sunday, and Monday (taking Monday off from
> my day job) to work on outstanding issues. I can continue to work
> through the end of next Friday, 4 August. After that I'm heading to
> Germany on a business trip and my "spare" time will evaporate for a few
> weeks.
>

Attachment Content-Type Size
multi-insert-r17.diff.gz application/x-gzip 20.8 KB

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Bruce Momjian 2006-07-29 17:36:27 Re: Maintenance and External Projects (try 2)
Previous Message Joe Conway 2006-07-28 20:42:17 Re: [HACKERS] 8.2 features?

Browse pgsql-hackers by date

  From Date Subject
Next Message Tzahi Fadida 2006-07-29 10:37:44 Re: Formulating an sql query with CTID
Previous Message Luke Lonergan 2006-07-29 04:34:57 Re: On-disk bitmap index patch

Browse pgsql-patches by date

  From Date Subject
Next Message korryd@enterprisedb.com 2006-07-29 12:18:33 Re: [HACKERS] Possible explanation for Win32 stats regression test
Previous Message Bruce Momjian 2006-07-29 02:54:57 Re: Possible explanation for Win32 stats regression test