Re: [sqlsmith] Failed assertion in _hash_splitbucket_guts

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andreas Seltenreich <seltenreich(at)gmx(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <rhaas(at)postgresql(dot)org>
Subject: Re: [sqlsmith] Failed assertion in _hash_splitbucket_guts
Date: 2016-12-03 11:50:11
Message-ID: CAA4eK1KkuKGAQRnWW61Sq+Oi+ycZrfM5aMnrNn3jtCdffRfNDQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 3, 2016 at 3:44 PM, Andreas Seltenreich <seltenreich(at)gmx(dot)de> wrote:
> Amit Kapila writes:
>
>> How should I connect to this database? If I use the user fdw
>> mentioned in pg_hba.conf (changed authentication method to trust in
>> pg_hba.conf), it says the user doesn't exist. Can you create a user
>> in the database which I can use?
>
> There is also a superuser "postgres" and an unprivileged user "smith"
> you should be able to login with. You could also start postgres in
> single-user mode to bypass the authentication altogether.
>

Thanks. I have checked, and my earlier speculation seems to be right:
the old bucket contains tuples left over from a previous split. At the
location of the Assert, I printed the old bucket, the new bucket, and
the actual bucket to which the tuple belongs; the result is below.

regression=# update public.hash_i4_heap set seqno = public.hash_i4_heap.random;
ERROR: wrong bucket, old bucket:37, new bucket:549, actual bucket:293

This means the tuple should belong to either bucket 37 or bucket 549,
but it actually belongs to bucket 293. Both 293 and 549 are buckets
that were split from bucket 37 (you can verify that using the
calculation in _hash_expandtable). I have checked the code again and
couldn't find any other reason except the one I mentioned in my
previous mail. So, let us wait for the results of your new test run.
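
To illustrate the bucket arithmetic, here is a minimal sketch. It is not
the PostgreSQL source: the helper names split_parent and key_to_bucket
are made up, and the hash value 293 is an assumed example. It only
mirrors the masking idea (new_bucket & lowmask for the parent, and
hashkey & highmask falling back to lowmask for the lookup).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Parent of a bucket created by a split: the new bucket number with its
 * highest set bit cleared. This equals "new_bucket & lowmask" for the
 * lowmask in effect when the split happens.
 * Illustrative helper, not a PostgreSQL function.
 */
uint32_t
split_parent(uint32_t new_bucket)
{
    uint32_t top = 1;

    assert(new_bucket > 0);     /* bucket 0 is never created by a split */
    while ((top << 1) != 0 && (top << 1) <= new_bucket)
        top <<= 1;
    return new_bucket & (top - 1);
}

/*
 * Map a hash value to a bucket for a given table size: mask with
 * highmask, and fall back to lowmask if that bucket does not exist yet.
 * Illustrative helper, not a PostgreSQL function.
 */
uint32_t
key_to_bucket(uint32_t hashkey, uint32_t maxbucket,
              uint32_t highmask, uint32_t lowmask)
{
    uint32_t bucket = hashkey & highmask;

    if (bucket > maxbucket)
        bucket &= lowmask;
    return bucket;
}
```

With these helpers, split_parent(293) and split_parent(549) both give
37. An assumed hash value of 293 maps to bucket 37 while maxbucket is
292 (key_to_bucket(293, 292, 511, 255)), and to bucket 293 once the
table has grown to maxbucket 549 (key_to_bucket(293, 549, 1023, 511)).
A tuple like that, left behind in bucket 37 by the earlier split, would
produce exactly the "actual bucket:293" seen above.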

> Amit Kapila writes:
>
>> Please find attached patch to fix above code. Now, if this is the
>> reason of the problem you are seeing, it won't fix your existing
>> database as it already contains some tuples in the wrong bucket. Can
>> you please re-run the test to see if you can reproduce the problem?
>
> Ok, I'll do testing with the patch applied.
>
> Btw, I also find entries like following in the logging database:
>
> ERROR: could not read block 2638 in file "base/16384/17256": read only 0 of 8192 bytes
>
> …with relfilenode being a hash index. I usually ignore these as they
> naturally start occurring after a recovery because of an unrelated crash.
> But since 11003eb, they also occur when the cluster has not yet suffered
> a crash.
>

Hmm, I am not sure if this is related to the previous problem, but it
could be. Is it possible to get the operation and/or call stack for
the above failure?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
