Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Date: 2022-06-24 04:35:49
Message-ID: 20220624043549.GC22452@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This thread has been going for 2.5 years, so here's a(nother) recap.

This omits the patches for recursion, since they're optional and evidently a
distraction from the main patches.

On Fri, Dec 27, 2019 at 11:02:20AM -0600, Justin Pryzby wrote:
> The goal is to somehow show tmpfiles (or at least dirs) used by parallel
> workers.

On Thu, Jan 16, 2020 at 08:38:46AM -0600, Justin Pryzby wrote:
> I think if someone wants the full generality, they can do this:
>
> postgres=# SELECT name, s.size, s.modification, s.isdir FROM (SELECT 'base/pgsql_tmp'p)p, pg_ls_dir(p)name, pg_stat_file(p||'/'||name)s;
> name | size | modification | isdir
> ------+------+------------------------+-------
> .foo | 4096 | 2020-01-16 08:57:04-05 | t
>
> In my mind, pg_ls_tmpdir() is for showing tmpfiles, not just a shortcut to
> SELECT pg_ls_dir((SELECT 'base/pgsql_tmp'p)); -- or, for all tablespaces:
> WITH x AS (SELECT format('/PG_%s_%s', split_part(current_setting('server_version'), '.', 1), catalog_version_no) suffix FROM pg_control_system()), y AS (SELECT a, pg_ls_dir(a) AS d FROM (SELECT DISTINCT COALESCE(NULLIF(pg_tablespace_location(oid),'')||suffix, 'base') a FROM pg_tablespace,x)a) SELECT a, pg_ls_dir(a||'/pgsql_tmp') FROM y WHERE d='pgsql_tmp';

On Tue, Mar 10, 2020 at 01:30:37PM -0500, Justin Pryzby wrote:
> I took a step back, and I wondered whether we should add a generic function for
> listing a dir with metadata, possibly instead of changing the existing
> functions. Then one could do pg_ls_dir_metadata('pg_wal',false,false);
>
> Since pg8.1, we have pg_ls_dir() to show a list of files. Since pg10, we've
> had pg_ls_logdir and pg_ls_waldir, which show not only file names but also
> (some) metadata (size, mtime). And since pg12, we've had pg_ls_tmpfile and
> pg_ls_archive_statusdir, which also show metadata.
>
> ...but there's no a function which lists the metadata of an directory other
> than tmp, wal, log.
>
> One can do this:
> |SELECT b.*, c.* FROM (SELECT 'base' a)a, LATERAL (SELECT a||'/'||pg_ls_dir(a.a)b)b, pg_stat_file(b)c;
> ..but that's not as helpful as allowing:
> |SELECT * FROM pg_ls_dir_metadata('.',true,true);
>
> There's also no function which recurses into an arbitrary directory, so it
> seems shortsighted to provide a function to recursively list a tmpdir.
>
> Also, since pg_ls_dir_metadata indicates whether the path is a dir, one can
> write a SQL function to show the dir recursively. It'd be trivial to plug in
> wal/log/tmp (it seems like tmpdirs of other tablespace's are not entirely
> trivial).
> |SELECT * FROM pg_ls_dir_recurse('base/pgsql_tmp');

> It's pretty unfortunate if a function called
> pg_ls_tmpdir hides shared filesets, so maybe it really is best to change that
> (it's new in v12).

On Fri, Mar 13, 2020 at 08:12:32AM -0500, Justin Pryzby wrote:
> The merge conflict presents another opportunity to solicit comments on the new
> approach. Rather than making "recurse into tmpdir" the end goal:
>
> - add a function to show metadata of an arbitrary dir;
> - add isdir arguments to pg_ls_* functions (including pg_ls_tmpdir but not
> pg_ls_dir).
> - maybe add pg_ls_dir_recurse, which satisfies the original need;
> - retire pg_ls_dir (does this work with tuplestore?)
> - profit
>
> The alternative seems to be to go back to Alvaro's earlier proposal:
> - not only add "isdir", but also recurse;
>
> I think I would insist on adding a general function to recurse into any dir.
> And *optionally* change ps_ls_* to recurse (either by accepting an argument, or
> by making that a separate patch to debate).

On Tue, Mar 31, 2020 at 03:08:12PM -0500, Justin Pryzby wrote:
> The patch intends to fix the issue of "failing to show failed filesets"
> (because dirs are skipped) while also generalizing existing functions (to show
> directories and "isdir" column) and providing some more flexible ones (to list
> file and metadata of a dir, which is currently possible [only] for "special"
> directories, or by recursively calling pg_stat_file).

On Wed, Dec 23, 2020 at 01:17:10PM -0600, Justin Pryzby wrote:
> However, pg_ls_tmpdir is special since it handles tablespace tmpdirs, which it
> seems is not trivial to get from sql:
>
> +SELECT * FROM (SELECT DISTINCT COALESCE(NULLIF(pg_tablespace_location(b.oid),'')||suffix, 'base/pgsql_tmp') AS dir
> +FROM pg_tablespace b, pg_control_system() pcs,
> +LATERAL format('/PG_%s_%s', left(current_setting('server_version_num'), 2), pcs.catalog_version_no) AS suffix) AS dir,
> +LATERAL pg_ls_dir_recurse(dir) AS a;
>
> For context, the line of reasoning that led me to this patch series was
> something like this:
>
> 0) Why can't I list shared tempfiles (dirs) using pg_ls_tmpdir() ?
> 1) Implement recursion for pg_ls_tmpdir();
> 2) Eventually realize that it's silly to implement a function to recurse into
> one particular directory when no general feature exists;
> 3) Implement generic facility;

On Tue, Apr 06, 2021 at 11:01:31AM -0500, Justin Pryzby wrote:
> The first handful of patches address the original issue, and I think could be
> "ready":
>
> $ git log --oneline origin..pg-ls-dir-new |tac
> ... Document historic behavior of links to directories..
> ... Add tests on pg_ls_dir before changing it
> ... Add pg_ls_dir_metadata to list a dir with file metadata..
> ... pg_ls_tmpdir to show directories and "isdir" argument..
> ... pg_ls_*dir to show directories and "isdir" column..
>
> These others are optional:
> ... pg_ls_logdir to ignore error if initial/top dir is missing..
> ... pg_ls_*dir to return all the metadata from pg_stat_file..
>
> ..and these maybe requires more work for lstat on windows:
> ... pg_stat_file and pg_ls_dir_* to use lstat()..
> ... pg_ls_*/pg_stat_file to show file *type*..
> ... Preserve pg_stat_file() isdir..
> ... Add recursion option in pg_ls_dir_files..

On Tue, Jan 25, 2022 at 01:27:55PM -0600, Justin Pryzby wrote:
> The original motive for the patch was that pg_ls_tmpdir doesn't show shared
> filesets.

Attachment Content-Type Size
v36-0001-Document-historic-behavior-of-links-to-directori.patch text/x-diff 1.0 KB
v36-0002-Add-tests-before-changing-pg_ls_.patch text/x-diff 3.4 KB
v36-0003-Add-pg_ls_dir_metadata-to-list-a-dir-with-file-m.patch text/x-diff 17.8 KB
v36-0004-pg_ls_tmpdir-to-show-directories-and-isdir-argum.patch text/x-diff 6.6 KB
v36-0005-pg_ls_-dir-to-show-directories-and-isdir-column.patch text/x-diff 13.2 KB
v36-0006-pg_ls_logdir-to-ignore-error-if-initial-top-dir-.patch text/x-diff 3.1 KB
v36-0007-pg_ls_-dir-to-return-all-the-metadata-from-pg_st.patch text/x-diff 21.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-06-24 04:50:19 Re: Handle infinite recursion in logical replication setup
Previous Message Michael Paquier 2022-06-24 03:59:03 Re: amcheck is using a wrong macro to check compressed-ness