Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir
Date: 2021-10-22 10:48:04
Message-ID: CALj2ACXbkyA0FL7kKsiU6fh+mFHvxx=YOz+L+8a0hhPWgO3SfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 22, 2021 at 3:18 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Oct 8, 2021 at 4:39 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > At times, users want to know what are the files (snapshot and mapping
> > files) that are available under pg_logical directory and also the
> > spill files that are under pg_replslot directory and how much space
> > they occupy.
> >
>
> Why can't you use pg_ls_dir to see the contents of pg_replslot? To
> know the space taken by spilling, you might want to check
> pg_stat_replication_slots[1] as that gives information about
> spill_bytes.
>
> [1] - https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-SLOTS-VIEW

Thanks Amit!

pg_ls_dir gives the list of directories and files, but not their
sizes. And it looks like the spill_bytes from
pg_stat_replication_slots is the accumulated byte count (see [1]), not
the current size of the spills files, so it's not representing the
spill files and their size at that moment.

If we have pg_ls_logicaldir and pg_ls_replslotdir returning the
files, szies, and last modified times, it will be useful in production
environments to see the disk usage of those files at the current
moment. The data from these functions can be fed to an external
analytics tool invoking the functions at regular intervals of time and
report the disk usage of these folders. This will be super useful to
analyze the questions like: Was the disk usage more at time t1? What
happened to my database system at that time? etc. And, these
functions can run independent of the stats collector process which is
currently required for the pg_stat_replication_slots view.

Thoughts?

I plan to work on a patch if okay.

[1]
postgres=# select
pg_ls_dir('/home/bharath/postgres/inst/bin/data/pg_replslot/mysub');
pg_ls_dir
-----------
state
(1 row)

postgres=# select * from pg_stat_replication_slots;
slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
stream_count | stream_bytes | total_txns | total_bytes | stats_reset
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+-------------
mysub | 3 | 6 | 396000000 | 0 |
0 | 0 | 5 | 396001128 |
(1 row)

Regards,
Bharath Rupireddy.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhihong Yu 2021-10-22 10:50:33 Re: Multi-Column List Partitioning
Previous Message Bharath Rupireddy 2021-10-22 10:04:36 Re: add retry mechanism for achieving recovery target before emitting FATA error "recovery ended before configured recovery target was reached"