Re: BUG #16280: dead tuples (probably) effect plan and query performance

From: "Serbin, Ilya" <iserbin(at)bostonsd(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16280: dead tuples (probably) effect plan and query performance
Date: 2020-02-28 09:25:22
Message-ID: CALTXVijBNY+yqvDfRCmL_2ZAVmJN_hEvJvWLJ2mxJiFoTD=oZA@mail.gmail.com
Lists: pgsql-bugs

Hello, Tom!
Thanks for your answer. My concern is that the plan changes after a relatively
small number of dead tuples. The bad plan starts being generated when table1
contains 300-400 dead tuples, which is only 0.07%-0.08% of the whole table
(409k+ entries).
In addition, the table is growing and currently holds 425k entries. Yet even
at 425k rows the plan stays good until the number of dead tuples reaches
300-400; after that it switches to the bad one.
As I said, I tried analyzing the table with various default_statistics_target
values (100-1000 in steps of 100) - the plan stays bad. I also tried setting
random_page_cost=0.1 with seq_page_cost=1 (then 2, 3, 4, etc.); the plan only
switched to the good one starting from random_page_cost=0.1 and
seq_page_cost=8. However, once I ran VACUUM, the plan changed to the good one
and stayed there even with random_page_cost=30 and seq_page_cost=1 (rough
sketches of these commands are below).
I realize I could set per-table autovacuum thresholds so that autovacuum
triggers once dead tuples reach 300 (also sketched below), but that doesn't
feel right to me: this number of changes accumulates in something like 5
minutes, and the table is ~2GB as of now.
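
Roughly, the experiments above looked like this (a minimal sketch, not
verbatim; table1 is the table name as used earlier in the thread):

  -- try different statistics targets, then re-analyze
  SET default_statistics_target = 100;   -- repeated with 200, 300, ... 1000
  ANALYZE table1;

  -- try different cost factors in the same session and re-check the plan
  SET random_page_cost = 0.1;
  SET seq_page_cost = 8;                 -- the good plan only appeared from 8

  -- after a plain vacuum the good plan comes back
  VACUUM table1;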

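The per-table autovacuum override I mentioned would be something like this
(sketch only, not applied; setting the scale factor to 0 so that the absolute
threshold applies is my assumption):

  -- make autovacuum trigger on table1 after ~300 dead tuples
  ALTER TABLE table1 SET (
      autovacuum_vacuum_threshold = 300,
      autovacuum_vacuum_scale_factor = 0
  );
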
Why does such a small amount (0.07%) of dead tuples change the cost
estimates so dramatically? Or am I missing something and the dead tuples have
nothing to do with it?

Thanks in advance,
Ilya

On Thu, 27 Feb 2020 at 18:41, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> > Issue description:
> > After some minimal activity in database first plan changes to the second.
> > Analyze on table1 do not help (tried with various
> default_statistics_target
> > values).
> > content_idx index recreation helps for some time, but several minutes
> later
> > plan degrades back to second one.
> > The only thing helped (surprisingly) is vacuum. It also helps for some
> time,
> > but once number of dead tuples reaches something like 300-500 - plan
> > changes back to second one.
>
> Surely it is *not* a bug that dead tuples affect the plan choice.
> The density of live tuples is an important factor in the relative
> costs of different table scan techniques.
>
> In the case at hand, I wonder why your rowcount estimate is off
> by a factor of 50:
>
> ->  Bitmap Index Scan on content_idx  (cost=0.00..155.07 rows=409 width=0) (actual time=4.932..4.932 rows=21952 loops=1)
>       Index Cond: (content @> '{"anotherjsonkey": {"values": ["13"]}}'::jsonb)
>       Buffers: shared hit=48
>
> If you can't improve that you're not likely to get a good plan, and
> futzing around with cost factors to make this particular query do
> "the right thing" anyway is inevitably going to make things worse
> for other queries. Maybe a larger stats target for the content column
> would help, but I fear that this @> condition is just beyond
> the ability of the planner to estimate. You might need to redesign
> the data representation to make it a bit more SQL-friendly.
>
> regards, tom lane
>
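
Regarding a larger stats target for the content column specifically: what I
would try is something like the sketch below (the value 1000 is just an
example, not verified to change the estimate here):

  -- raise the stats target only for the jsonb column, then re-analyze
  ALTER TABLE table1 ALTER COLUMN content SET STATISTICS 1000;
  ANALYZE table1;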

--
Ilya Serbin | Senior Database Administrator
iserbin(at)bostonsd(dot)ru
Mobile: 8-918-895-05-96 | Telegram: @iserbin_61
<https://t.me/iserbin_61>
DBI 24/7 contact: 8-800-333-65-79 | Website: http://dbi.ru/
