Re: Postgres for a "data warehouse", 5-10 TB

From: Stefan Keller <sfkeller(at)gmail(dot)com>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Robert Klemme <shortcutter(at)googlemail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres for a "data warehouse", 5-10 TB
Date: 2011-09-13 18:57:30
Message-ID: CAFcOn2_PA=e4fbhfgRrzov8_OTyzmk7mrpii-NeuCXA7XuJ9wA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Interesting debate.

2011/9/13 Marti Raudsepp <marti(at)juffo(dot)org>:
> Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
> SAVEPOINT it can roll back to in case of an error.

Are you sure? In theory I always understood that there are no
"subtransactions".

In fact when looking at the docs there is chapter 39.6.6. saying "By
default, any error occurring in a PL/pgSQL function aborts execution
of the function, and indeed of the surrounding transaction as well.
You can trap errors and recover from them by using a BEGIN block with
an EXCEPTION clause."
(http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html
)

So the doc isn't totally explicit about this. But whatever: What would
be the the function of a subtransaction? To give the possibility to
recover and continue within the surrounding transaction?

Stefan

2011/9/13 Marti Raudsepp <marti(at)juffo(dot)org>:
> On Tue, Sep 13, 2011 at 19:34, Robert Klemme <shortcutter(at)googlemail(dot)com> wrote:
>> I don't think so.  You only need to catch the error (see attachment).
>> Or does this create a sub transaction?
>
> Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
> SAVEPOINT it can roll back to in case of an error.
>
>> Yes, I mentioned the speed issue.  But regardless of the solution for
>> MySQL's "INSERT..ON DUPLICATE KEY UPDATE" which Igor mentioned you
>> will have the locking problem anyhow if you plan to insert
>> concurrently into the same table and be robust.
>
> In a mass-loading application you can often divide the work between
> threads in a manner that doesn't cause conflicts.
>
> For example, if the unique key is foobar_id and you have 4 threads,
> thread 0 will handle rows where (foobar_id%4)=0, thread 1 takes
> (foobar_id%4)=1 etc. Or potentially hash foobar_id before dividing the
> work.
>
> I already suggested this in my original post.
>
> Regards,
> Marti
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Samuel Gendler 2011-09-13 19:13:57 raid array seek performance
Previous Message Carlo Stonebanks 2011-09-13 18:56:53 Re: Migrated from 8.3 to 9.0 - need to update config (re-post)