Re: WIP: BRIN multi-range indexes

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2020-10-01 18:41:33
Message-ID: 20201001184133.oq5uq75sb45pu3aw@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 30, 2020 at 07:57:19AM -0400, John Naylor wrote:
>On Mon, Sep 28, 2020 at 10:12 PM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>> Is it actually all that different from the existing BRIN indexes?
>> Consider this example:
>>
>> create table x (a text, b text, c text);
>>
>> create index on x using brin (a,b,c);
>>
>> create or replace function random_str(p_len int) returns text as $$
>> select string_agg(x, '') from (select chr(1 + (254 * random())::int ) as x from generate_series(1,$1)) foo;
>> $$ language sql;
>>
>> test=# insert into x select random_str(1000), random_str(1000), random_str(1000);
>> ERROR: index row size 9056 exceeds maximum 8152 for index "x_a_b_c_idx"
>
>Hmm, okay. As for which comes first, insert or index creation, I'm
>baffled, too. I also would expect the example above would take up a
>bit over 6000 bytes, but not 9000.
>

OK, so this seems like a data corruption bug in BRIN, actually.

The ~9000 bytes is actually about right, because the strings are in
UTF-8 so roughly 1.5B per character seems about right. And we have 6
values to store (3 columns, min/max for each), so 6 * 1500 = 9000.

The real question is how come INSERT + CREATE INDEX actually manages to
create an index tuple. And the answer is pretty simple - brin_form_tuple
kinda ignores toasting, happily building index tuples where some values
are toasted.

Consider this:

create table x (a text, b text, c text);
insert into x select random_str(1000), random_str(1000), random_str(1000);

create index on x using brin (a,b,c);
delete from x;
vacuum x;

set enable_seqscan=off;

insert into x select random_str(10), random_str(10), random_str(10);
ERROR: missing chunk number 0 for toast value 16530 in pg_toast_16525

explain analyze select * from x where a = 'xxx';
ERROR: missing chunk number 0 for toast value 16530 in pg_toast_16525

select * from brin_page_items(get_raw_page('x_a_b_c_idx', 2), 'x_a_b_c_idx'::regclass);
ERROR: missing chunk number 0 for toast value 16547 in pg_toast_16541

Interestingly enough, running the select before the insert seems to be
working - not sure why.

Anyway, it behaves like this since 9.5 :-(

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Verite 2020-10-01 18:46:53 Rejecting redundant options in Create Collation
Previous Message Andres Freund 2020-10-01 18:26:30 Re: Improving connection scalability: GetSnapshotData()