Re: Highly Variable Planning Times

From: Greg Stark <stark(at)mit(dot)edu>
To: Michael Malis <michaelmalis2(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Highly Variable Planning Times
Date: 2017-04-20 12:51:34
Message-ID: CAM-w4HPnTOH7Lvc=h3qOno5YSZrX0CYvh44SFp6H=bVRK1otrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19 April 2017 at 22:39, Michael Malis <michaelmalis2(at)gmail(dot)com> wrote:

>> *At best*, you're doing substantial work in the
>> planner to avoid the first tree descent step or two in a single
>> non-partial index.

Fwiw, in addition to replacing the first few levels of the descent
with planning-time work, there's also an advantage due to the smaller
keys. Effectively these partial indexes are emulating
prefix-compression in the btree.

> While the specific example I gave in the post could be replaced with a
> non-partial index, most of the partial indexes contain predicates that
> are not straightforward to index with non-partial indexes. About 85%
> of the partial indexes contain a regular expression in them for CSS
> selector matching. I've tried using a trigram GIN index, but it wound
> up not working too well due to the data being highly redundant (almost
> every CSS hierarchy contains 'div' in it). Additionally, most of the
> predicates for partial indexes are extremely specific making the
> partial indexes very small. The sum total size of all of the partial
> indexes is still much smaller than if we were to index every necessary
> field with regular indexes.

I wonder if you could implement a FTS parser that tokenized html in
just tokens representing the matching criteria. A GIN index using such
a parser would actually be very similar to what you have as GIN
indexes are basically a collection of btrees...

The operational problem with that is I think it would be even harder
to update a parser than adding a new partial index. I don't think you
can effectively upgrade a parser to include new tokens without
rebuilding any indexes using it. If you wanted to add new selector
critieria live you would probably end up deploying the new parser and
building a new index with CREATE INDEX CONCURRENTLY using the new
parser and then dropping the old index.

I'm not sure if it's possible to do a FTS parser for handling
arbitrary CSS selectors but if you managed that that would be a very
valuable addition to Postgres, IMHO

--
greg

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-04-20 12:57:04 Re: Use sync commit for logical replication apply in TAP tests
Previous Message Nikolay Shaplov 2017-04-20 12:48:16 Re: pgbench tap tests & minor fixes