Re: BRIN index which is much faster never chosen by planner

From: Jeremy Finzel <finzelj(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Michael Lewis <mlewis(at)entrata(dot)com>
Subject: Re: BRIN index which is much faster never chosen by planner
Date: 2019-10-11 14:31:08
Message-ID: CAMa1XUgpAD_NHa7cSnpmjPVqXGh3gBLYaTLfQsWQUVg5PU=KGg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 10, 2019 at 6:13 PM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:

>
> So this seems like a combination of multiple issues. Firstly, the bitmap
> index scan on rec_insert_time_brin_1000 estimate seems somewhat poor. It
> might be worth increasing stats target on that column, or something like
> that. Not sure, but it's about the only "fixable" thing here, I think.
>

In the OP I mentioned that I had already increased it to 5000, and it made
no difference. Ah fine... let's go ahead and try 10000. Still no change:

foo_prod=# ALTER TABLE log_table ALTER COLUMN rec_insert_time SET STATISTICS 10000;
ALTER TABLE
foo_prod=# ANALYZE log_table;
ANALYZE
foo_prod=# EXPLAIN
SELECT
    category,
    source,
    MIN(rec_insert_time) OVER (PARTITION BY source ORDER BY rec_insert_time) AS first_source_time,
    MAX(rec_insert_time) OVER (PARTITION BY source ORDER BY rec_insert_time) AS last_source_time
FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
          category, source(field1) AS source, rec_insert_time
      FROM log_table l
      INNER JOIN public.small_join_table filter ON filter.category = l.category
      WHERE field1 IS NOT NULL
        AND l.category = 'music'
        AND l.rec_insert_time >= now() - interval '10 days'
      ORDER BY brand_id, last_change, log_id, rec_insert_time DESC) unique_cases;

                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=24451299.20..24451313.21 rows=623 width=120)
   ->  Sort  (cost=24451299.20..24451300.75 rows=623 width=104)
         Sort Key: unique_cases.source, unique_cases.rec_insert_time
         ->  Subquery Scan on unique_cases  (cost=24451257.82..24451270.28 rows=623 width=104)
               ->  Unique  (cost=24451257.82..24451264.05 rows=623 width=124)
                     ->  Sort  (cost=24451257.82..24451259.38 rows=623 width=124)
                           Sort Key: l.brand_id, l.last_change, l.log_id, l.rec_insert_time DESC
                           ->  Nested Loop  (cost=0.00..24451228.90 rows=623 width=124)
                                 Join Filter: ((l.category)::text = filter.category)
                                 ->  Seq Scan on small_join_table filter  (cost=0.00..26.99 rows=1399 width=8)
                                 ->  Materialize  (cost=0.00..24435949.31 rows=623 width=99)
                                       ->  Seq Scan on log_table l  (cost=0.00..24435946.20 rows=623 width=99)
                                             Filter: ((field1 IS NOT NULL) AND (category = 'music'::name) AND (rec_insert_time >= (now() - '10 days'::interval)))
(13 rows)
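
For anyone reproducing this: the standard catalogs will confirm whether the
new target actually took effect, and pg_stats shows the physical correlation
for the column, which is the number that matters most for a BRIN range scan.
Something along these lines (nothing site-specific, just the stock views):

-- Per-column statistics target (should read 10000 after the ALTER above):
SELECT attstattarget
FROM pg_attribute
WHERE attrelid = 'log_table'::regclass
  AND attname = 'rec_insert_time';

-- What ANALYZE actually collected; correlation near 1.0 is what makes
-- a BRIN range scan cheap:
SELECT n_distinct, correlation
FROM pg_stats
WHERE tablename = 'log_table'
  AND attname = 'rec_insert_time';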

Thanks,
Jeremy
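
P.S. For anyone else poking at this: the simplest way I know to make the
planner price the BRIN path at all is to disable seq scans for a single
transaction and re-run the EXPLAIN, then compare the top-line cost estimates.
Illustrative sketch only, not a capture from this system (enable_seqscan is
the stock GUC, and SET LOCAL reverts with the transaction):

BEGIN;
SET LOCAL enable_seqscan = off;  -- seq scans get a prohibitive cost, so index paths win if any exist
EXPLAIN
SELECT
    category,
    source,
    MIN(rec_insert_time) OVER (PARTITION BY source ORDER BY rec_insert_time) AS first_source_time,
    MAX(rec_insert_time) OVER (PARTITION BY source ORDER BY rec_insert_time) AS last_source_time
FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
          category, source(field1) AS source, rec_insert_time
      FROM log_table l
      INNER JOIN public.small_join_table filter ON filter.category = l.category
      WHERE field1 IS NOT NULL
        AND l.category = 'music'
        AND l.rec_insert_time >= now() - interval '10 days'
      ORDER BY brand_id, last_change, log_id, rec_insert_time DESC) unique_cases;
ROLLBACK;  -- the session is back to normal planner settings here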
