Re: WIP: Avoid creation of the free space map for small tables

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Avoid creation of the free space map for small tables
Date: 2019-01-29 11:49:13
Message-ID: CAD21AoBWxiGSStZGWkaBMcK4bPvekcoHMaTU661piEVwiJYTzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 29, 2019 at 9:29 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jan 29, 2019 at 12:37 AM John Naylor
> <john(dot)naylor(at)2ndquadrant(dot)com> wrote:
> >
> > On Mon, Jan 28, 2019 at 12:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > 2.
> > > @@ -15,13 +15,9 @@
> > > SELECT octet_length(get_raw_page('test_rel_forks', 'main', 100)) AS main_100;
> > > ERROR: block number 100 is out of range for relation "test_rel_forks"
> > > SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 0)) AS fsm_0;
> > > - fsm_0
> > > --------
> > > - 8192
> > > -(1 row)
> > > -
> > > +ERROR: could not open file "base/50769/50798_fsm": No such file or directory
> > > SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 10)) AS fsm_10;
> > > -ERROR: block number 10 is out of range for relation "test_rel_forks"
> > > +ERROR: could not open file "base/50769/50798_fsm": No such file or directory
> > >
> > > This indicates that even though the Vacuum is executed, but the FSM
> > > doesn't get created. This could be due to different BLCKSZ, but the
> > > failed machines don't seem to have a non-default value of it. I am
> > > not sure why this could happen, maybe we need to check once in the
> > > failed regression database to see the size of relation?
> >
> > I'm also having a hard time imagining why this failed. Just in case,
> > we could return ctid in a plpgsql loop and stop as soon as we see the
> > 5th block. I've done that for some tests during development and is a
> > safer method anyway.
> >
>
> I think we can devise some concrete way, but it is better first we try
> to understand why it failed, otherwise there is always a chance that
> we will repeat the mistake in some other case. I think we have no
> other choice, but to request the buildfarm owners to either give us
> the access to see what happens or help us in investigating the
> problem. The four buildfarms where it failed were lapwing, locust,
> dromedary, prairiedog. Among these, the owner of last two is Tom
> Lane and others I don't recognize. Tom, Andrew, can you help us in
> getting the access of one of those four? Yet another alternative is
> the owner can apply the patch attached (this is same what got
> committed) or reset to commit ac88d2962a and execute below statements
> and share the results:
>
> CREATE EXTENSION pageinspect;
>
> CREATE TABLE test_rel_forks (a int);
> INSERT INTO test_rel_forks SELECT i from generate_series(1,1000) i;
> VACUUM test_rel_forks;
> SELECT octet_length(get_raw_page('test_rel_forks', 'main', 0)) AS main_0;
> SELECT octet_length(get_raw_page('test_rel_forks', 'main', 100)) AS main_100;
>
> SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 0)) AS fsm_0;
> SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 10)) AS fsm_10;
>
> SELECT octet_length(get_raw_page('test_rel_forks', 'vm', 0)) AS vm_0;
> SELECT octet_length(get_raw_page('test_rel_forks', 'vm', 1)) AS vm_1;
>
> If the above statements give error: "ERROR: could not open file ...", then run:
> Analyze test_rel_forks;
> Select oid, relname, relpages, reltuples from pg_class where relname
> like 'test%';
>
> The result of the above tests will tell us whether there are 5 pages
> in the table or not. If the table contains 5 pages and throws an
> error, then there is some bug in our code, otherwise, there is
> something specific to those systems where the above insert doesn't
> result in 5 pages.

I'd suspect the alignment of integer. In my environemnt, the tuple
actual size is 28 bytes but the aligned size is 32 bytes (=
MAXALIGN(28)), so we can store 226 tuples to single page. But if
MAXALIGN(28) = 28 then we can store 255 tuples and 1000 tuples fits
within 4 pages. The MAXALIGN of four buildfarms seem 4 accroding to
the configure script so MAXALIGN(28) might be 28 on these buildfarms.

configure:16816: checking alignment of short
configure:16839: result: 2
configure:16851: checking alignment of int
configure:16874: result: 4
configure:16886: checking alignment of long
configure:16909: result: 4
configure:16922: checking alignment of long long int
configure:16945: result: 4

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Evgeniy Efimkin 2019-01-29 11:51:56 Re: [WIP] CREATE SUBSCRIPTION with FOR TABLES clause (table filter)
Previous Message Kyotaro HORIGUCHI 2019-01-29 11:32:54 ALTER SESSION