Re: Fast insertion indexes: why no developments

From: Yann Fontana <yann(dot)fontana(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Leonardo Francalanci <m_lists(at)yahoo(dot)it>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast insertion indexes: why no developments
Date: 2013-10-30 14:34:07
Message-ID: CAAiUYKYi3xXS3HX7-F057aq7SSgyDEYvuV6+r7C0Q=JhGknf+w@mail.gmail.com
Lists: pgsql-hackers

On 30 October 2013 11:23, Leonardo Francalanci <m_lists(at)yahoo(dot)it> wrote:
>
> >> In terms of generality, do you think its worth a man year of developer
> >> effort to replicate what you have already achieved? Who would pay?
>

I work on an application that does exactly what Leonardo described. We hit
the exact same problem and came up with the exact same solution (down to
the 15-minute interval). I have also worked on various other datamarts (all
using Oracle), and they are all subject to this problem in some form:
B-tree indexes slow down bulk data inserts too much and need to be disabled
or dropped, then recreated after the load. In some cases this is easy
enough; in others it is more complicated. For example: every day, a process
imports from 1 million to 1 billion records into a table partition that may
contain from 0 to 1 billion records. To be as efficient as possible, you
need some logic that compares the number of rows to insert with the number
of rows already present, in order to decide whether to drop the indexes or
not.
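
As a rough illustration of that kind of decision logic (a sketch only, not
what any of these systems actually ran): the function name and the 10%
threshold below are hypothetical, and a real cutoff would have to be
measured per table and workload.

```python
def should_drop_indexes(rows_to_insert: int,
                        rows_present: int,
                        ratio_threshold: float = 0.1) -> bool:
    """Decide whether to drop indexes before a bulk load and rebuild after.

    ratio_threshold is an assumed, illustrative value: the fraction of the
    existing row count above which a full rebuild is presumed cheaper than
    maintaining the B-tree indexes row by row during the load.
    """
    if rows_to_insert <= 0:
        # Nothing to load, so leave the indexes alone.
        return False
    if rows_present == 0:
        # Empty partition: the rebuild only has to index the new rows.
        return True
    # Compare the size of the load with what is already in the partition.
    return rows_to_insert / rows_present >= ratio_threshold


# 1 million rows into a partition already holding 1 billion: keep indexes.
print(should_drop_indexes(1_000_000, 1_000_000_000))
# 1 billion rows into an empty partition: drop and rebuild.
print(should_drop_indexes(1_000_000_000, 0))
```

A simple ratio ignores index count, row width and available memory, which
is part of why this ends up as per-application workaround code.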

Basically, my point is that this is a common problem for datawarehouses and
datamarts. In my view, indexes that don't require developers to work around
poor insert performance would be a significant feature in a
"datawarehouse-ready" DBMS.

Yann
