Re: GSoC 2015 proposal. Bitmap Index-only Count

From: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GSoC 2015 proposal. Bitmap Index-only Count
Date: 2015-03-25 19:30:12
Message-ID: CAP4vRV7Hb74DoC=cMe+iYxk-u70BtH48_+2PQ9UpvJX06O7qTw@mail.gmail.com
Lists: pgsql-hackers

2015-03-24 18:01 GMT+04:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com> writes:
> > There is a problem of slow counting in PostgreSQL [1]. The reason why this
> > is slow is related to the *MVCC* implementation in PostgreSQL. Index-only
> > scans (implemented since PostgreSQL-9.2) providing some performance
> > improvements where the *visibility map* of the table allows it. That’s
> > good. But it works only for access methods which provide amgettuple method.
> > Unfortunately GIN supports only BitmapIndexScan and has no implementation
> > of index_getnext() interface [2].
>
> Right ...
>
> > As a GSoC student I will create new Node “Bitmap Index-Only Scan”, which
> > would catch tuples from Bitmap Index Scan node and pass them to Aggregate
> > node. Thus, new query plan will be as follow:
>
> I'm pretty hesitant about adding a whole new plan node type (which will
> require quite a lot of infrastructure) for such a narrow use-case.
> I think the odds are good that if you proceed down this path, you will
> end up with something that never gets committed to Postgres.
>

Thanks a lot for the reply. It was just an approximate idea. I thought it
wasn't very good.

> I wonder whether it'd be possible to teach GIN to support index_getnext
> instead. Initially it would probably work only for cases where the
> index didn't have to return any columns ... but if we did it, maybe the
> door would be open to cases where GIN could reconstruct actual values.
>

Another idea is to write index_getnext() for GIN which would return some
fake tuple, since it makes no difference to the COUNT aggregate what the
tuple contains. COUNT just wants to know whether we have a tuple that
satisfies the qual.
Is this idea better? Is it possible for the planner to use index_getnext()
for GIN only with the COUNT aggregate?
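
To make the idea a bit more concrete, here is a rough toy model in C. It is
not PostgreSQL code: ToyTid, ToyScan, toy_getnext() and toy_count() are
hypothetical names invented for this sketch, and the visibility map and heap
check are plain arrays. It only shows that a getnext-style call can hand back
a match without any column values, and that COUNT needs a heap visit only
when the page is not marked all-visible. It ignores lossy bitmap pages and
rechecks, which a real GIN implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not PostgreSQL structs. */
typedef struct {
    unsigned page;   /* heap page number */
    unsigned offset; /* item offset within the page */
} ToyTid;

typedef struct {
    const ToyTid *matches;    /* TIDs the index reports as satisfying the qual */
    size_t nmatches;
    const bool *all_visible;  /* toy visibility map, one flag per heap page */
    const bool *heap_visible; /* toy heap check result, parallel to matches */
    size_t next;              /* position of the next match to return */
} ToyScan;

/*
 * getnext-style call: report the position of the next match, or false when
 * the scan is exhausted. No tuple contents are built ("fake tuple").
 */
bool toy_getnext(ToyScan *scan, size_t *idx)
{
    assert(scan != NULL && idx != NULL);
    if (scan->next >= scan->nmatches)
        return false;
    *idx = scan->next++;
    return true;
}

/*
 * COUNT over the scan: a match on an all-visible page is counted directly;
 * otherwise the (simulated) heap is consulted to check visibility.
 */
size_t toy_count(ToyScan *scan, size_t *heap_fetches)
{
    size_t count = 0;
    size_t idx;

    assert(heap_fetches != NULL);
    *heap_fetches = 0;
    while (toy_getnext(scan, &idx)) {
        unsigned page = scan->matches[idx].page;

        if (scan->all_visible[page]) {
            count++;            /* no heap access needed */
            continue;
        }
        (*heap_fetches)++;      /* must look at the heap tuple */
        if (scan->heap_visible[idx])
            count++;
    }
    return count;
}
```

In this model the number of heap fetches depends only on how many matches
fall on pages that are not all-visible, which is the saving I hope to get for
COUNT over GIN.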

--
Best regards,
Lubennikova Anastasia
