Re: making relfilenodes 56 bits

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: making relfilenodes 56 bits
Date: 2022-07-12 15:26:08
Message-ID: CAFiTN-td-KeCQd0VKu0ywhvx2k655RMHC2UmDY5knDDwPbXPHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 11, 2022 at 9:49 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>

> It also makes me wonder why we're using macros rather than static
> inline functions in buf_internals.h. I wonder whether we could do
> something like this, for example, and keep InvalidForkNumber as -1:
>
> static inline ForkNumber
> BufTagGetForkNum(BufferTag *tagPtr)
> {
> int8 ret;
>
> StaticAssertStmt(MAX_FORKNUM <= INT8_MAX);
> ret = (int8) ((tagPtr->relForkDetails[0] >> BUFFERTAG_RELNUMBER_BITS);
> return (ForkNumber) ret;
> }
>
> Even if we don't use that particular trick, I think we've generally
> been moving toward using static inline functions rather than macros,
> because it provides better type-safety and the code is often easier to
> read. Maybe we should also approach it that way here. Or even commit a
> preparatory patch replacing the existing macros with inline functions.
> Or maybe it's best to leave it alone, not sure.

I think it make sense to convert existing macros as well, I have
attached a patch for the same,
>
> > I had those changes in v7-0003, now I have merged with 0002. This has
> > assert check while replaying the WAL for smgr create and smgr
> > truncate, and while during normal path when allocating the new
> > relfilenumber we are asserting for any existing file.
>
> I think a test-and-elog might be better. Most users won't be running
> assert-enabled builds, but this seems worth checking regardless.

IMHO the recovery time asserts we can convert to elog but one which we
are doing after each GetNewRelFileNumber is better to keep as an
assert as we are doing the file access so it can be costly?

> > I have done some performance tests, with very small values I can see a
> > lot of wait events for RelFileNumberGen but with bigger numbers like
> > 256 or 512 it is not really bad. See results at the end of the
> > mail[1]
>
> It's a little hard to interpret these results because you don't say
> how often you were checking the wait events, or how often the
> operation took to complete. I suppose we can guess the relative time
> scale from the number of Activity events: if there were 190
> WalWriterMain events observed, then the time to complete the operation
> is probably 190 times how often you were checking the wait events, but
> was that every second or every half second or every tenth of a second?

I am executing it after every 0.5 sec using below script in psql
\t
select wait_event_type, wait_event from pg_stat_activity where pid !=
pg_backend_pid()
\watch 0.5

And running test for 60 sec
./pgbench -c 32 -j 32 -T 60 -f create_script.sql -p 54321 postgres

$ cat create_script.sql
select create_table(100);

// function body 'create_table'
CREATE OR REPLACE FUNCTION create_table(count int) RETURNS void AS $$
DECLARE
relname varchar;
pid int;
i int;
BEGIN
SELECT pg_backend_pid() INTO pid;
relname := 'test_' || pid;
FOR i IN 1..count LOOP
EXECUTE format('CREATE TABLE %s(a int)', relname);

EXECUTE format('DROP TABLE %s', relname);
END LOOP;
END;
$$ LANGUAGE plpgsql;

> > I have done these changes during GetNewRelFileNumber() this required
> > to track the last logged record pointer as well but I think this looks
> > clean. With this I can see some reduction in RelFileNumberGen wait
> > event[1]
>
> I find the code you wrote here a little bit magical. I believe it
> depends heavily on choosing to issue the new WAL record when we've
> exhausted exactly 50% of the available space. I suggest having two
> constants, one of which is the number of relfilenumber values per WAL
> record, and the other of which is the threshold for issuing a new WAL
> record. Maybe something like RFN_VALUES_PER_XLOG and
> RFN_NEW_XLOG_THRESHOLD, or something. And then work code that works
> correctly for any value of RFN_NEW_XLOG_THRESHOLD between 0 (don't log
> new RFNs until old allocation is completely exhausted) and
> RFN_VALUES_PER_XLOG - 1 (log new RFNs after using just 1 item from the
> previous allocation). That way, if in the future someone decides to
> change the constant values, they can do that and the code still works.

ok

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v1-0001-Convert-buf_internal.h-macros-to-static-inline-fu.patch text/x-patch 12.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2022-07-12 16:19:06 Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Previous Message Euler Taveira 2022-07-12 15:22:47 Re: Introduce "log_connection_stages" setting.