Re: GSoC on WAL-logging hash indexes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Tan Tran <tankimtran(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GSoC on WAL-logging hash indexes
Date: 2014-03-06 23:14:21
Message-ID: CA+Tgmoa=WOK7Pg8uza4LS43sucbibYk3XEhRcmRoqWXUDZQ6rw@mail.gmail.com
Lists: pgsql-advocacy pgsql-hackers pgsql-students

On Thu, Mar 6, 2014 at 3:44 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Thu, Mar 6, 2014 at 11:34 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Putting the split-in-progress flag in the new bucket's primary page
>> makes a lot of sense. I don't have any problem with dumping the rest
>> of it for a first cut if we have a different long-term plan for how to
>> improve concurrency, but I don't see much point in going to a lot of
>> work to implement a system for WAL logging if we're going to end up
>> having to afterwards throw it out and start from scratch to get rid of
>> the heavyweight locks - and it's not obvious to me how what you have
>> here could be extended to do that.
>
> +1 I don't think we have to improve concurrency at the same time as WAL
> logging, but we at least have to implement WAL logging in a way that doesn't
> foreclose future improvements to concurrency.
>
> I've been tempted to implement a new type of hash index that allows both WAL
> and high concurrency, simply by disallowing bucket splits. At the index
> creation time you use a storage parameter to specify the number of buckets,
> and that is that. If you mis-planned, build a new index with more buckets,
> possibly concurrently, and drop the too-small one.

Yeah, we could certainly do something like that. It sort of sucks,
though. I mean, it's probably pretty easy to know that starting with
the default 2 buckets is not going to be enough; most people will at
least be smart enough to start with, say, 1024. But are you going to
know whether you need 32768 or 1048576 or 33554432? A lot of people
won't, and we have more than enough reasons for performance to degrade
over time as it is.
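
For illustration only, a minimal sketch of what Jeff's proposal might look like at the SQL level. The "buckets" storage parameter here is purely hypothetical (hash indexes currently accept only fillfactor); CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY behave as they already do:

-- Hypothetical: fix the bucket count at creation time and never split.
-- ("buckets" is an assumed parameter name, not an existing one.)
CREATE INDEX idx_events_user ON events USING hash (user_id)
    WITH (buckets = 1048576);

-- If the estimate proves too small, build a larger replacement without
-- blocking writers, then drop the undersized index and take over its name.
CREATE INDEX CONCURRENTLY idx_events_user_new
    ON events USING hash (user_id)
    WITH (buckets = 33554432);
DROP INDEX CONCURRENTLY idx_events_user;
ALTER INDEX idx_events_user_new RENAME TO idx_events_user;

The rebuild-and-swap keeps a usable index in place throughout, at the cost of temporarily carrying both copies on disk.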

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
