Re: WAL logging problem in 9.4.3?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging problem in 9.4.3?
Date: 2015-07-10 08:59:32
Message-ID: 20150710085932.GJ10242@alap3.anarazel.de
Lists: pgsql-hackers

On 2015-07-09 19:06:11 -0400, Tom Lane wrote:
> What evidence have you got to base that value judgement on?
>
> cab9a0656c36739f was based on an actual user complaint, so we have good
> evidence that there are people out there who care about the cost of
> truncating a table many times in one transaction. On the other hand,
> I know of no evidence that anyone's depending on multiple sequential
> COPYs, nor intermixed COPY and INSERT, to be fast. The original argument
> for having this COPY optimization at all was to make restoring pg_dump
> scripts in a single transaction fast; and that use-case doesn't care
> about anything but a single COPY into a virgin table.

Well, you'll hardly have heard complaints about COPY, given that we've
behaved the way we currently do for a long while.

I definitely know of ETL-like processes that have relied on subsequent
COPYs into truncated relations being cheaper. Can't remember the same
for intermixed COPY and INSERT, but it'd not surprise me if somebody
mixed COPY and UPDATEs rather freely for ETL.

> I think you're worrying about exactly the wrong case.
>
> > My tentative guess is that the best course is to
> > a) Make heap_truncate_one_rel() create a new relfilenode. That fixes the
> > truncation replay issue.
> > b) Force new pages to be used when using the heap_sync mode in
> > COPY. That avoids the INIT danger you found. It seems rather
> > reasonable to avoid using pages that have already been the target of
> > WAL logging here in general.
>
> And what reason is there to think that this would fix all the
> problems?

Yea, that's the big problem.

> Again, the only known field usage for the COPY optimization is the pg_dump
> scenario; were that not so, we'd have noticed the problem long since.
> So I don't have any faith that this is a well-tested area.

You need to crash at the right moment. I don't think that's exercised
all that frequently...
