Re: PostgreSQL 10.0 SELECT LIMIT performance problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Victor Yegorov <vyegorov(at)gmail(dot)com>
Cc: Mareks(dot)Kalnacs(at)datakom(dot)lv, pgsql-bugs(at)postgresql(dot)org, Maris(dot)Rucis(at)datakom(dot)lv
Subject: Re: PostgreSQL 10.0 SELECT LIMIT performance problem
Date: 2018-09-12 17:04:11
Message-ID: 9924.1536771851@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Victor Yegorov <vyegorov(at)gmail(dot)com> writes:
> The fact, that planner is not accurate on the estimates of JSON internal
> keys is expected, PostgreSQL is not parsing JSON values when gathering
> stats.
> You cannot expect planner to be picky about all possible corner cases, it
> would make planning time enormously huge.

Right. The fact that it doesn't make the right guesses without help
can't be considered to be a bug in all cases. These are engineering
tradeoffs we have to make.

> That is the reason I outlined, that important keys should be extracted into
> plain columns.

If that seems infeasible from an application standpoint, another
possibility is to make expression indexes on those important keys.
ANALYZE will gather stats on the values of indexed expressions, and then
perhaps the planner will have enough info to make better decisions.

In the other case mentioned, where the problem is a poor guess about
the selectivity of

where (
to_tsvector(
'simple'::regconfig,
a.ts_vector_fields
) @@ to_tsquery(
'simple',
' (decodedType5b6f0753ossAlarmsGenerator) '
)
)

it's the same problem: the planner has no stats that would let it
figure out the selectivity. It can't reasonably extract an
estimate on the fly --- if it did, you'd be complaining that
planning time was too long. The only way to get reasonable behavior
is to set things up so that ANALYZE will accumulate stats about
the values of "to_tsvector('simple',a.ts_vector_fields)". You
can arrange that either by extracting that into a column, or by
making an index on it.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Adrien Nayrat 2018-09-12 17:14:34 Re: PostgreSQL 10.0 SELECT LIMIT performance problem
Previous Message Victor Yegorov 2018-09-12 16:43:18 Re: PostgreSQL 10.0 SELECT LIMIT performance problem