Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H
Date: 2015-06-22 05:21:22
Message-ID: CAMkU=1wSNXi5rZgNAGiXRwqmjPG-t1aSwavqsd5eLQVerHmQQQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 20, 2015 at 9:55 AM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> Hi,
>
> On 06/20/2015 03:01 AM, Jeff Janes wrote:
>
>>> I don't think we need to really assume the density to be constant,
>>> maybe we can verify that while collecting the sample? I mean, we're
>>> already reading the pages, so we can track the density, and either
>>> do the correction or not.
>>
>> Maybe. I don't know how that would work.
>>
>
> I was thinking about something like
>
> (1) compute average tuples per page
>
> (2) generate random sample of 'samplesize' blocks
>
> (3) for each block, see what is the actual number of tuples on the
> page and perform correction (e.g. if there's only 1/2 the average
> tuples, toss a coin and decide whether to really sample the block)
>
> (4) repeat until you have sample of the requested size
>

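For concreteness, here is a toy sketch of how I read steps (1) through
(4), in Python, with tuples_on_page() standing in for however the actual
per-block tuple count would be obtained (the names are mine, not real
code):

    import random

    def density_corrected_block_sample(nblocks, avg_tuples,
                                       tuples_on_page, samplesize):
        # Step (2): draw blocks at random.  Step (3): accept a block
        # with probability proportional to its fill relative to the
        # table average.  Step (4): repeat until the sample is full.
        # Note the acceptance probability is capped at 1, so this can
        # only down-weight sparse blocks, not up-weight dense ones.
        sample = []
        while len(sample) < samplesize:
            blk = random.randrange(nblocks)
            if random.random() < min(tuples_on_page(blk) / avg_tuples, 1.0):
                sample.append(blk)
        return sample
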
Are you producing a block sample or a row sample? If a block is sampled,
then what do you feed it to?

Are you just picking one row out of each block that survives step 3? If
so, that would be similar to my idea of counting a given value only once
per block it was in, except I was thinking of applying that only to
n_distinct, not to the entire stats collection process.
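
What I had in mind was roughly this: collapse the sample to (block,
value) pairs before counting, so a value clustered into a few blocks
can't masquerade as a broadly common one. A toy sketch, assuming the
sampled rows carry their block number:

    from collections import Counter

    def per_block_value_counts(sample_rows):
        # sample_rows: iterable of (blockno, value) pairs.  Count each
        # value at most once per block it appeared in, and feed the
        # resulting counts to the n_distinct estimator instead of the
        # raw per-row counts.
        return Counter(value for _, value in set(sample_rows))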

...

>> My solution was to just unbolt the block sampling from the top, and let
>> it sample the rows (still 300 * stats_target of them) from the whole
>> table rather than from 300 * 100 randomly chosen blocks of the table. On
>> tables of the size I was worried about, block sampling was not very
>> helpful anyway. Reading 30,000 blocks out of 250,000 blocks led to no
>> meaningful IO advantage on my hardware. Any advantage from skipped
>> blocks was cancelled out (and sometimes outweighed) by fouling up the
>> read-ahead mechanisms.
>>
>> With 1000 times more blocks, that probably won't work for you.
>> But I do wonder, how much time does it take to read a random 1/10,
>> 1/100, 1/1000, 1/10000 of a table of your size, just from an IO
>> perspective? How much are we gaining by doing the block sample?
>>
>
> Well, actually I think it would be even more appropriate for very large
> tables. With a 2.5TB table, you don't really care whether analyze collects
> a 5GB or an 8GB sample; the difference is rather minor compared to the I/O
> generated by the other queries etc.

My answer was to take out the block sampling entirely and read the whole
table. That is what probably won't work for you. (Come to think of it, I
was hesitant to run custom code in production, so I never actually
deployed it. Instead I cranked up the stats target and let the chips fall
where they may.)
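
For what it's worth, the whole-table version is just textbook reservoir
sampling (Vitter's Algorithm R). A minimal sketch, not the actual
analyze.c code:

    import random

    def reservoir_sample(rows, k):
        # Keep a uniform random sample of k rows while streaming over
        # every row exactly once, i.e. one full sequential scan.
        reservoir = []
        for i, row in enumerate(rows):
            if i < k:
                reservoir.append(row)
            else:
                j = random.randrange(i + 1)
                if j < k:
                    reservoir[j] = row
        return reservoir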

I think you want to go back to the table to find new blocks to replace ones
which were included in the original block sample but ended up contributing
no tuples to the final row sample, either because they were low-density
blocks or just through the luck of the draw. But I don't see how that fixes
the problem unless you also prevent a block from contributing more than one
tuple. And at that point, I worry about the uneven density again. If a
block has two rows selected by pure random chance, I think it would be fine
to keep only one (or to go find another random block to pull one row from
as a replacement). But if it has two rows selected because it has more rows
than average, then we would have to pull the replacement from a random
block *of similar density* to avoid swapping one bias for another.
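
In sketch form, with density_bucket() standing in for however we would
group blocks by tuple count (again, illustrative names only):

    import random

    def pick_replacement_block(block, dup_by_chance, all_blocks,
                               density_bucket):
        # If the block contributed two rows by pure luck, any random
        # block can supply the replacement row.  If it contributed two
        # rows because it is denser than average, draw only from blocks
        # of similar density, so we don't trade an over-sampling bias
        # for an under-sampling one.
        if dup_by_chance:
            candidates = all_blocks
        else:
            candidates = [b for b in all_blocks
                          if density_bucket(b) == density_bucket(block)]
        return random.choice(candidates)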

Cheers,

Jeff
