Re: Multicolumn hash indexes

From: Nico Williams <nico(at)cryptonector(dot)com>
To: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomasz Ostrowski <tometzky+pg(at)ato(dot)waw(dot)pl>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multicolumn hash indexes
Date: 2017-09-27 15:33:59
Message-ID: 20170927153357.GB1251@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 27, 2017 at 09:56:26AM -0400, Jesper Pedersen wrote:
> On 09/26/2017 08:11 PM, Robert Haas wrote:
> >On Tue, Sep 26, 2017 at 7:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>Tomasz Ostrowski <tometzky+pg(at)ato(dot)waw(dot)pl> writes:
> >>>I've noticed that hash indexes can't currently (in PG10) be multicolumn.
> >>>Are they technically hard to implement or just nobody took such a feature?
> >>
> >>It's not simple, particularly not if you wish that the index would support
> >>queries specifying conditions for just a subset of the indexed columns
> >>(an assumption that's buried pretty deeply in the planner, for one thing).
> >>Then you couldn't compute the hash.
> >
> >Whoa, that seems like moving the goalposts. Somebody could design a
> >hash index that was intended to answer such queries, but it's not the
> >one we've got. I think we should just aim to support equality queries
> >on all columns. That seems like a fairly straightforward
> >generalization of what we've already got.
> >
>
> This would require that the applications that are using the database knows
> about the index structure in order to pass down the right columns.

That's hardly a problem. We (app devs) already do this all the time.
Sometimes we screw up and get a poor query plan, we figure it out, and
add an index or rewrite the query, or make some index more covering, ...

The real problems, according to Tom Lane, are that the query planner
makes deep assumptions about index keys having prefixes, which b-tree
indexes do but hash indexes do not, and also that optimizations that are
applied before selecting indexes can make it impossible to select a hash
index intended by the author of a query. The latter problem is not
fatal, but can lead to surprises if not fixed, while the former problem
is fatal until fixed.

> I would say that in most cases that applications doesn't know about index
> structures. So, multiple indexes would have to be created (col1), (col1,
> col2), (col1, col3), ... which isn't ideal.

Nonsense, and also irrelevant to the question of whether multi-column
hash indexes should be supported. It's not about what Joe User knows to
do, but about what <whoever wants this feature> knows to do.

BTW, one could probably use an expression hash index to get a multi-
column hash index anyways, by using row values, JSON encodings of row
values, or some ad-hoc text representation of row values.

Nico
--

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-09-27 15:34:22 Re: Multicolumn hash indexes
Previous Message Robert Haas 2017-09-27 15:32:37 Re: Multicolumn hash indexes