Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Date: 2008-03-17 02:21:24
Message-ID: 200803170221.m2H2LOp12205@momjian.us
Lists: pgsql-patches pgsql-performance


This has been applied by Tom.

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> Tom Lane wrote:
> > "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> >> Elsewhere in our codebase where we use arrays that are enlarged as
> >> needed, we keep track of the "allocated" size and the "used" size of the
> >> array separately, and only call repalloc when the array fills up, and
> >> repalloc a larger than necessary array when it does. I chose to just
> >> call repalloc every time instead, as repalloc is smart enough to fall
> >> out quickly if the chunk the allocation was made in is already larger
> >> than the new size. There might be some gain avoiding the repeated
> >> repalloc calls, but I doubt it's worth the code complexity, and calling
> >> repalloc with a larger than necessary size can actually force it to
> >> unnecessarily allocate a new, larger chunk instead of reusing the old
> >> one. Thoughts on that?
> >
> > Seems like a pretty bad idea to me, as the behavior you're counting on
> > only applies to chunks up to 8K or thereabouts.
>
> Oh, you're right. Though I'm sure libc realloc has all kinds of smarts
> as well, it does seem better to not rely too much on that.
>
> > In a situation where
> > you are subcommitting lots of XIDs one at a time, this is likely to have
> > quite awful behavior (or at least, you're at the mercy of the local
> > malloc library as to how bad it is). I'd go with the same
> > double-it-each-time-needed approach we use elsewhere.
>
> Yep, patch attached. I also changed xactGetCommittedChildren to return
> the original array instead of copying it, as Alvaro suggested.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
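
For readers following the thread, the double-it-each-time-needed approach discussed above can be sketched roughly as follows. This is only an illustration, not the applied patch: it uses plain libc realloc rather than repalloc so that it stands alone, and the names xid_t, XidArray, xid_array_append and xid_array_free, as well as the initial capacity of 8, are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transaction ID type. */
typedef uint32_t xid_t;

/* Growable array that tracks "used" and "allocated" sizes separately. */
typedef struct XidArray
{
    xid_t  *items;      /* backing storage, NULL while empty */
    size_t  used;       /* number of entries currently stored */
    size_t  allocated;  /* number of entries the storage can hold */
} XidArray;

/*
 * Append one XID.  The storage is enlarged only when it is full, and
 * then to twice its previous capacity, so n appends trigger O(log n)
 * reallocations instead of n.
 * Returns 0 on success, -1 if the allocation fails (array unchanged).
 */
static int
xid_array_append(XidArray *arr, xid_t xid)
{
    assert(arr != NULL);

    if (arr->used == arr->allocated)
    {
        /* Initial capacity of 8 is an arbitrary choice for this sketch. */
        size_t  new_alloc = arr->allocated ? arr->allocated * 2 : 8;
        xid_t  *p = realloc(arr->items, new_alloc * sizeof(xid_t));

        if (p == NULL)
            return -1;
        arr->items = p;
        arr->allocated = new_alloc;
    }
    arr->items[arr->used++] = xid;
    return 0;
}

/* Release the storage and reset the array to its empty state. */
static void
xid_array_free(XidArray *arr)
{
    assert(arr != NULL);

    free(arr->items);
    arr->items = NULL;
    arr->used = 0;
    arr->allocated = 0;
}
```

The point of the design is that the cost no longer depends on how the underlying allocator treats a request that is one element larger than the last one: adding 1000 entries one at a time reallocates only when the capacity steps through 8, 16, 32 and so on.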


--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
