Re: hash index improving v3

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Alex Hunsaker <badalex(at)gmail(dot)com>, Kenneth Marshall <ktm(at)rice(dot)edu>, Xiao Meng <mx(dot)cogito(at)gmail(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: hash index improving v3
Date: 2008-09-23 04:31:02
Message-ID: 1222144262.4445.265.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-09-23 at 00:05 -0400, Tom Lane wrote:
> "Jonah H. Harris" <jonah(dot)harris(at)gmail(dot)com> writes:
> > On Mon, Sep 22, 2008 at 11:25 PM, Alex Hunsaker <badalex(at)gmail(dot)com> wrote:
> >> On Sun, Sep 14, 2008 at 8:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> I'm considering changing hashbuild to switch over at shared_buffers instead
> >>> of effective_cache_size --- any thoughts about that?
> >>
> >> Switching to shared_buffers gets my vote, on my test table with
> >> 50,000,000 rows it takes about 8 minutes to create an index using the
> >> default effective_cache_size. With effective_cache_size set to 6GB
> >> (machine has 8GB) its still going an hour later.
>
> > Agreed. I think using shared_buffers as a cutoff is a much better idea as well.
>
> Already done in CVS a week or so back, but thanks for following up with
> some confirmation.

I think nobody noticed that change, or the discussion.

I wasn't very happy with effective_cache_size and not happy with
shared_buffers either. If building hash indexes is memory critical then
we just need to say so and encourage others to set memory use correctly.
People are already aware that maintenance_work_mem needs to be increased
for large index builds and we will confuse people if we ignore that and
use another parameter instead. At the very least, a controllable
parameter is the only appropriate choice for production systems, not one
that cannot be changed without restart. It would also throw away any
chance of having pg_restore of hash indexes run in parallel, since it
will not be able to control memory usage. Setting maintenance_work_mem
higher than shared buffers is easily possible now and we should just use
that rather than try to prejudge how people will and can configure their
systems. If maintenance_work_mem default is too low, lets increase it.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2008-09-23 04:44:34 Re: FSM, now without WAL-logging
Previous Message Tom Lane 2008-09-23 04:24:18 Re: pg_type.h regression?

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2008-09-23 04:48:36 Re: hash index improving v3
Previous Message Tom Lane 2008-09-23 04:05:59 Re: hash index improving v3