Re: Hash partitioning.

From: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
To: Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Christopher Browne <cbbrowne(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, PostgreSQL Mailing Lists <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Yuri Levinsky <yuril(at)celltick(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: Hash partitioning.
Date: 2013-06-27 09:45:53
Message-ID: CA+CSw_s0fVZwepWxEpv+5yh4e3QTp246STbkPFpgk2uhqdm1KQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jun 27, 2013 12:24 PM, "Nicolas Barbier" <nicolas(dot)barbier(at)gmail(dot)com>
wrote:
>
> 2013/6/27 Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>:
>
> > When each index requires one extra I/O (because each index is
> > one level taller), that is 50 extra I/Os. In the partitioned case,
> > each index would require the normal smaller amount of I/Os.
>
> [..]
>
> > Using those other indexes (both for look-ups and
> > updates) in the non-partitioned case, will therefore pull a huge
> > portion of each index into cache (because of the “random distribution”
> > of the non-date data). In the partitioned case, more cache can be
> > spent on the indexes that correspond to the “hot partitions.”
>
> It seems that the system described by Claudio fixes this problem another
way:
>
> Claudio wrote:
>
> > Now I just have two indices. One that indexes only hot tuples, it's
> > very heavily queried and works blazingly fast, and one that indexes by
> > (hotness, key).

This is not really related to hash partitioning, but you can also do index
partitioning while having the tables unpartitioned. If the hotness field is
a timestamp like it often is, you can create a predicate index on (key,
tstamp) where tstamp > [some date in recent past], and replace the index
with a newer one every so often to keep the size small. This way you can
have a non-partitioned index for batch queries and a small one for the OLTP
workload. If we added the option to build indexes using an index only scan,
building the replacement index would be quite cheap.

Regards,
Ants Aasma

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Farmiloe 2013-06-27 09:49:25 Re: ASYNC Privileges proposal
Previous Message Magnus Hagander 2013-06-27 09:29:54 Re: Documentation/help for materialized and recursive views