Re: Hash Indexes

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Indexes
Date: 2016-12-06 20:32:58
Message-ID: CAMkU=1ydfriLCOriJ=AxtF=hhBOUUcWtf172vquDrj=3T7yXmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 6, 2016 at 4:00 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Tue, Dec 6, 2016 at 1:29 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> >
> >
> > I just occasionally insert a bunch of equal tuples, which have to be in
> > overflow pages no matter how much splitting happens.
> >
> > I am getting vacuum errors against HEAD, after about 20 minutes or so (8
> > cores).
> >
> > 49233 XX002 2016-12-05 23:06:44.087 PST:ERROR: index "foo_index_idx"
> > contains unexpected zero page at block 64941
> > 49233 XX002 2016-12-05 23:06:44.087 PST:HINT: Please REINDEX it.
> > 49233 XX002 2016-12-05 23:06:44.087 PST:CONTEXT: automatic vacuum of
> table
> > "jjanes.public.foo"
> >
>
> Thanks for the report. This can happen due to vacuum trying to access
> a new page. Vacuum can do so if (a) the calculation for maxbuckets
> (in metapage) is wrong or (b) it is trying to free the overflow page
> twice. Offhand, I don't see that can happen in code. I will
> investigate further to see if there is any another reason why vacuum
> can access the new page. BTW, have you done the test after commit
> 2f4193c3, that doesn't appear to be the cause of this problem, but
> still, it is better to test after that fix. I am trying to reproduce
> the issue, but in the meantime, if by anychance, you have a callstack,
> then please share the same.
>

It looks like I compiled the code for testing a few minutes before 2f4193c3
went in.

I've run it again with cb9dcbc1eebd8, after promoting the ERROR in
_hash_checkpage, hashutil.c:174 to a PANIC so that I can get a coredump to
feed to gdb.

This time it was an INSERT, not autovac, that got the error:

35495 INSERT XX002 2016-12-06 09:25:09.517 PST:PANIC: XX002: index
"foo_index_idx" contains unexpected zero page at block 202328
35495 INSERT XX002 2016-12-06 09:25:09.517 PST:HINT: Please REINDEX it.
35495 INSERT XX002 2016-12-06 09:25:09.517 PST:LOCATION: _hash_checkpage,
hashutil.c:174
35495 INSERT XX002 2016-12-06 09:25:09.517 PST:STATEMENT: insert into foo
(index) select $1 from generate_series(1,10000)

#0 0x0000003838c325e5 in raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x0000003838c325e5 in raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000003838c33dc5 in abort () at abort.c:92
#2 0x00000000007d6adf in errfinish (dummy=<value optimized out>) at
elog.c:557
#3 0x0000000000498d93 in _hash_checkpage (rel=0x7f4d030906a0, buf=<value
optimized out>, flags=<value optimized out>) at hashutil.c:169
#4 0x00000000004967cf in _hash_getbuf_with_strategy (rel=0x7f4d030906a0,
blkno=<value optimized out>, access=2, flags=1, bstrategy=<value optimized
out>)
at hashpage.c:234
#5 0x0000000000493dbb in hashbucketcleanup (rel=0x7f4d030906a0,
cur_bucket=14544, bucket_buf=7801, bucket_blkno=22864, bstrategy=0x0,
maxbucket=276687,
highmask=524287, lowmask=262143, tuples_removed=0x0,
num_index_tuples=0x0, split_cleanup=1 '\001', callback=0,
callback_state=0x0) at hash.c:799
#6 0x0000000000497921 in _hash_expandtable (rel=0x7f4d030906a0,
metabuf=7961) at hashpage.c:668
#7 0x0000000000495596 in _hash_doinsert (rel=0x7f4d030906a0,
itup=0x1f426b0) at hashinsert.c:236
#8 0x0000000000494830 in hashinsert (rel=0x7f4d030906a0, values=<value
optimized out>, isnull=<value optimized out>, ht_ctid=0x7f4d03076404,
heapRel=<value optimized out>, checkUnique=<value optimized out>) at
hash.c:247
#9 0x00000000005c81bc in ExecInsertIndexTuples (slot=0x1f28940,
tupleid=0x7f4d03076404, estate=0x1f28280, noDupErr=0 '\000',
specConflict=0x0,
arbiterIndexes=0x0) at execIndexing.c:389
#10 0x00000000005e74ad in ExecInsert (node=0x1f284d0) at
nodeModifyTable.c:496
#11 ExecModifyTable (node=0x1f284d0) at nodeModifyTable.c:1511
#12 0x00000000005cc9d8 in ExecProcNode (node=0x1f284d0) at
execProcnode.c:396
#13 0x00000000005ca53a in ExecutePlan (queryDesc=0x1f21a30,
direction=<value optimized out>, count=0) at execMain.c:1567
#14 standard_ExecutorRun (queryDesc=0x1f21a30, direction=<value optimized
out>, count=0) at execMain.c:338
#15 0x00007f4d0c1a6dfb in pgss_ExecutorRun (queryDesc=0x1f21a30,
direction=ForwardScanDirection, count=0) at pg_stat_statements.c:877
#16 0x00000000006dfc8a in ProcessQuery (plan=<value optimized out>,
sourceText=0x1f21990 "insert into foo (index) select $1 from
generate_series(1,10000)",
params=0x1f219e0, dest=0xc191c0, completionTag=0x7ffe82a959d0 "") at
pquery.c:185
#17 0x00000000006dfeda in PortalRunMulti (portal=0x1e86900, isTopLevel=1
'\001', setHoldSnapshot=0 '\000', dest=0xc191c0, altdest=0xc191c0,
completionTag=0x7ffe82a959d0 "") at pquery.c:1299
#18 0x00000000006e056c in PortalRun (portal=0x1e86900,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x1eec870,
altdest=0x1eec870,
completionTag=0x7ffe82a959d0 "") at pquery.c:813
#19 0x00000000006de832 in exec_execute_message (argc=<value optimized out>,
argv=<value optimized out>, dbname=0x1e933b8 "jjanes",
username=<value optimized out>) at postgres.c:1977
#20 PostgresMain (argc=<value optimized out>, argv=<value optimized out>,
dbname=0x1e933b8 "jjanes", username=<value optimized out>) at
postgres.c:4132
#21 0x000000000067e8a4 in BackendRun (argc=<value optimized out>,
argv=<value optimized out>) at postmaster.c:4274
#22 BackendStartup (argc=<value optimized out>, argv=<value optimized out>)
at postmaster.c:3946
#23 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at
postmaster.c:1704
#24 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>)
at postmaster.c:1312
#25 0x0000000000606388 in main (argc=2, argv=0x1e68320) at main.c:228

Attached is the 'bt full' output.

Cheers,

Jeff

Attachment Content-Type Size
gdb.out application/octet-stream 12.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nico Williams 2016-12-06 20:35:43 Re: WIP: Faster Expression Processing and Tuple Deforming (including JIT)
Previous Message Robert Haas 2016-12-06 20:31:16 Re: Compiler warnings