Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Andreas 'ads' Scherbaum <adsmail(at)wars-nicht(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PoC PATCH] Parallel dump to /dev/null
Date: 2018-03-05 20:41:51
Message-ID: 1520282511.22202.21.camel@credativ.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Alvaro,

Am Montag, den 05.03.2018, 13:56 -0300 schrieb Alvaro Herrera:
> I made a few amendments (here's v5) and was ready to push, when I
> noticed that _StartBlobs() does not seem to be doing the right thing.
> Did you test this with a few large objects?

I did test it on the regression database, which leaves one or two large
objects behind:

mba(at)fock:~$ psql -h /tmp -p 65432 -d regression -c '\dl'
                Large objects
  ID   | Owner |         Description          
-------+-------+------------------------------
  3001 | mba   | testing comments
 36867 | mba   | I Wandered Lonely as a Cloud
 47229 | mba   | 
(3 rows)

What exactly is wrong with _StartBlobs()? It calls setFilePath() which
makes sure /dev/null is opened and not something else.

> The reason I noticed is I wondered about the amount of open() calls
> (plus zlib function calls) we would save by keeping an FD open to
> /dev/null, rather than opening it over and over for each object -- ie.,
> maybe not touch setFilePath() at all, if possible. That looks perhaps
> too invasive, so maybe not. But do audit other callsites that may open
> files.

I see your point; I've hacked together the attached additional PoC
patch, which keeps the dataFH open. The parallel case was tricky, I had
to add an additional flag to lclContext so that DEVNULL is only opened
once for data files cause I could not find a place where to set it for
parallel workers and it crashed otherwise.

This cuts down the number of /dev/null opens from 352 to 6 for a -j 2
dump of the regression database to /dev/null.

In my opinion, I think this is too evolved and possibly error-prone for
being just an optimization, so I'd drop that for v11 for now and maybe
revisit it later.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachment Content-Type Size
parallel-pgdump-keep-dev-null-open.patch text/x-patch 4.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-03-05 20:52:06 Re: PATCH: psql tab completion for SELECT
Previous Message Thomas Munro 2018-03-05 20:37:01 Re: select_parallel test failure: gather sometimes losing tuples (maybe during rescans)?