Re: GIN Index for low cardinality

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: srkrishna1(at)aol(dot)com
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: GIN Index for low cardinality
Date: 2018-10-26 15:59:17
Message-ID: CAMkU=1yVe4n5pzMy4bSjGW+0hVbib6zkRwteydH72c_D+0vE-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Oct 17, 2018 at 6:47 AM Ravi Krishna <srkrishna1(at)aol(dot)com> wrote:

> In
> https://www.cybertec-postgresql.com/en/ideas-for-scaling-postgresql-to-multi-terabyte-and-beyond/
>
> it is mentioned:
>
> "GIN, the most know non-default index type perhaps, has been actually
> around for ages (full-text search) and in short is perfect for indexing
> columns where there are lot of repeating values – think all kinds of
> statuses or good old Mr/Mrs/Miss. GIN only stores every unique column value
> only once as for the default B-tree you’ll have e.g. 1 millon leaf nodes
> with the integer “1” in it."
>
>
> Does it mean that GIN is a very good choice for low cardinality columns.
>

For extremely low cardinality, like Mr. Mrs. Miss., I doubt any index would
be very useful. For less extreme cases, like say one million different
values present around 50 times each, yes, it can be useful to keep the
index size down. It will not support needing to deliver rows in sorted
order, for example to fulfill an ORDER BY or a merge join. Think carefully
about what setting you want for fast_update, and, if set to on, then what
value to use for gin_pending_list_limit.

Cheers,

Jeff

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Janes 2018-10-26 16:27:21 Re: GIN Index for low cardinality
Previous Message Jeff Janes 2018-10-26 15:42:59 Re: Different memory allocation strategy in Postgres 11?