Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers(at)postgresql(dot)org, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Date: 2020-03-15 21:27:29
Message-ID: 20200315212729.GC26184@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 15, 2020 at 06:15:02PM +0100, Fabien COELHO wrote:
> Some feedback on v10:

Thanks for looking. I'm hoping to hear from Alvaro what he thinks of this
approach (all functions to show isdir, rather than one function which lists
recursively).

> All patches apply cleanly, one on top of the previous. I really wish there
> would be less than 9 patches…

I kept them separate to allow the earlier patches to be applied.
And intended to make easier to review, even if it's more work for me..

If you mean that it's a pain to apply 9 patches, I will suggest to use:
|git am -3 ./mailbox
where ./mailbox is either a copy of the mail you received, or retrieved from
the web interface.

To test that each one works (compiles, passes tests, etc), I use git rebase -i
HEAD~11 and "e"edit the target (set of) patches.

> * v10.1 doc change: ok
>
> * v10.2 doc change: ok, not sure why it is not merged with previous

As I mentioned, separate since I'm proposing that they're backpatched to
different releases. Those could be applied now (and Tom already applied a
patch identical to 0001 in a prior patchset).

> * v10.3 test add: could be merge with both previous

> Tests seem a little contrived. I'm wondering whether something more
> straightforward could be proposed. For instance, once the tablespace is just
> created but not used yet, probably we do know that the tmp file exists and
> is empty?

The tmpdir *doesn't* exist until someone creates tmpfiles there.
As it mentions:
+-- This tests the missing_ok parameter, which causes pg_ls_tmpdir to succeed even if the tmpdir doesn't exist yet

> * v10.4 at least, some code!
> I'm not sure of the "FLAG_" prefix which seems too generic, even if it is
> local. I'd suggest "LS_DIR_*", maybe, as a more specific prefix.

Done.

> ISTM that Pg style requires spaces around operators. I'd suggest some
> parenthesis would help as well, eg: "flags&FLAG_MISSING_OK" -> "(flags &
> FLAG_MISSING_OK)" and other instances.

Partially took your suggestion.

> About:
>
> if (S_ISDIR(attrib.st_mode)) {
> if (flags&FLAG_SKIP_DIRS)
> continue;
> }
>
> and similars, why not the simpler:
>
> if (S_ISDIR(attrib.st_mode) && (flags & FLAG_SKIP_DIRS))
> continue;

That's not the same - if SKIP_DIRS isn't set, your way would fail that test for
dirs, and then hit the !ISREG test, and skip them anyway.
|else if (!S_ISREG(attrib.st_mode))
| continue

> Maybe I'd create defines for long and common flag specs, eg:
>
> #define ..._LS_SIMPLE (FLAG_SKIP_DIRS|FLAG_SKIP_HIDDEN|FLAG_SKIP_SPECIAL|FLAG_METADATA)

Done

> I'm not sure about these asserts:
>
> /* isdir depends on metadata */
> Assert(!(flags&FLAG_ISDIR) || (flags&FLAG_METADATA));
>
> Hmmm. Why?

It's not supported to show isdir without showing metadata (because that case
isn't needed to support the old and the new behaviors).

+ if (flags & FLAG_METADATA)
+ {
+ values[1] = Int64GetDatum((int64) attrib.st_size);
+ values[2] = TimestampTzGetDatum(time_t_to_timestamptz(attrib.st_mtime));
+ if (flags & FLAG_ISDIR)
+ values[3] = BoolGetDatum(S_ISDIR(attrib.st_mode));
+ }

> /* Unreasonable to show isdir and skip dirs */
> Assert(!(flags&FLAG_ISDIR) || !(flags&FLAG_SKIP_DIRS));
>
> Hmmm. Why would I prevent that, even if it has little sense, it should work.
> I do not see having false on the isdir column as an actual issue.

It's unimportant, but testing for intended use of flags during development.

> * v10.6 behavior change for existing functions, always show isdir column,
> and removal of IS_DIR flag.
>
> I'm unsure why the features are removed, some use case may benefit from the
> more complete function?
> Maybe flags defs should not be changed anyway?

Maybe. I put them back...but it means they're not being exercized by any
*used* case.

> I do not like much the "if (...) /* empty */;" code. Maybe it could be
> caught more cleanly later in the conditional structure.

This went away when I put back the SKIP_DIRS flag.

> * v10.7 adds "pg_ls_dir_recurse" function

> Doc looks incomplete and the example is very contrived and badly indented.

Why you think it's contrived? Listing a tmpdir recursively is the initial
motivation of this patch. Maybe you think I should list just the tmpdir for
one tablespace ? Note that for temp_tablespaces parameter:

|When there is more than one name in the list, PostgreSQL chooses a random member of the list each time a temporary object is to be created; except that within a transaction, successively created temporary objects are placed in successive tablespaces from the list.

> The function definition does not follow the style around: uppercase whereas
> all others are lowercase, "" instead of '', no "as"…

I used "" because of this:
| x.name||'/'||a.name
I don't know if there's a better way to join paths in SQL, or if that suggests
this is a bad way to do it.

> I do not understand why oid 8511 is given to the new function.

I used: ./src/include/catalog/unused_oids (maybe not correctly).

> I do not understand why UNION ALL and not UNION.

In general, union ALL can avoid a "distinct" plan node, but it doesn't seem to
have any effect here.

> I would have put the definition after "pg_ls_dir_metadata" definition.

Done

> pg_ls_dir_metadata seems defined as (text,bool,bool) but called as
> (text,bool,bool,bool).

fixed, thanks.

> Maybe a better alias could be given instead of x?
>
> There are no tests for the new function. I'm not sure it would work.

I added something which would've caught the issue with number of arguments.

> * v10.8 change flags & add test on pg_ls_logdir().
>
> I'm unsure why it is done at this stage.

I think it makes sense to allow ls_logdir to succeed even if ./log doesn't
exist, since it isn't created by initdb or during postmaster start, and since
we already using MISSING_OK for tmpdir.

But a separate patch since we didn't previous discuss changing logdir.

> * v10.9 change some ls functions and fix patch 10.7 issue
> I'm unsure why it is done at this stage. "make check" ok.

This is the last patch in the series, since I think it's least likely to be
agreed on.

--
Justin

Attachment Content-Type Size
v11-0001-Document-historic-behavior-about-hiding-director.patch text/x-diff 1.7 KB
v11-0002-Document-historic-behavior-about-hiding-director.patch text/x-diff 906 bytes
v11-0003-Add-tests-exercizing-pg_ls_tmpdir.patch text/x-diff 4.3 KB
v11-0004-Add-pg_ls_dir_metadata-to-list-a-dir-with-file-m.patch text/x-diff 11.6 KB
v11-0005-pg_ls_tmpdir-to-show-isdir-argument.patch text/x-diff 7.6 KB
v11-0006-pg_ls_-dir-to-show-directories-and-isdir-column.patch text/x-diff 10.6 KB
v11-0007-Add-pg_ls_dir_recurse-to-show-dir-recursively.patch text/x-diff 5.3 KB
v11-0008-pg_ls_logdir-to-ignore-error-if-initial-top-dir-.patch text/x-diff 3.0 KB
v11-0009-pg_ls_-dir-to-return-all-the-metadata-from-pg_st.patch text/x-diff 15.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2020-03-15 23:05:37 Re: Memory-Bounded Hash Aggregation
Previous Message James Coleman 2020-03-15 19:33:51 Re: [PATCH] Incremental sort (was: PoC: Partial sort)