Re: [PoC PATCH] Parallel dump to /dev/null

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Christoph Berg <christoph(dot)berg(at)credativ(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Andreas 'ads' Scherbaum <adsmail(at)wars-nicht(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PoC PATCH] Parallel dump to /dev/null
Date: 2018-03-20 23:19:32
Message-ID: 20180320231932.GN2416@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Christoph Berg (christoph(dot)berg(at)credativ(dot)de) wrote:
> Re: Tom Lane 2018-03-20 <12960(dot)1521557852(at)sss(dot)pgh(dot)pa(dot)us>
> > It might help if the patch were less enthusiastic about trying to
> > "optimize" by avoiding extra file opens/closes in the case of output
> > to /dev/null. That seems to account for a lot of the additional
> > complication, and I can't see a reason that it'd be worth it.
>
> Note that the last patch was just a PoC to check if the extra
> open/close could be avoided. The "real" patch is the 2nd last.

Even so, I'm really not a fan of this patch either. If we could do this
in a general way where we supported parallel mode with output to stdout
or to a file and then that file could happen to be /dev/null, I'd be
more interested because it's at least reasonable for someone to want
that beyond using pg_dump to (poorly) check for corruption.

As it is, this is an extremely special case which may even end up being
confusing for users (I can run a parallel pg_dump to /dev/null, but not
to a regular file?!).

Instead of trying to use pg_dump for this, we should provide a way to
actually check for corruption across everything (instead of just the
heap), and have all detected corruption reported in one pass.

One of the things that I really like about PostgreSQL is that we
typically try to provide appropriate tools for the job and avoid just
hacking things to give users a half-solution, which is what this seems
like to me. Let's write a proper tool (perhaps as a background
worker?) to scan the database (all of it) which will find and
report corruption anywhere. That'll take another release to do, but
hopefully pushing back on this will encourage that to happen, whereas
allowing this in would actively discourage someone from writing a proper
tool and we would be much worse off for that.

Thanks!

Stephen
