Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark <mwchambers(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2
Date: 2019-01-02 18:19:25
Message-ID: 7363.1546453165@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers

Mark <mwchambers(at)gmail(dot)com> writes:
> Am I correct in my understanding that any row that has been modified (i.e.
> UPDATE) is in state HEAPTUPLE_INSERT_IN_PROGRESS so it will not be included
> in the sample?

An update will mark the existing tuple as delete-in-progress and then
insert a new tuple (row version) that's insert-in-progress.

A concurrent ANALYZE scan will definitely see the old tuple (modulo
sampling considerations) but it's timing-dependent which state it sees it
in --- it could still be "live" when we see it, or already
delete-in-progress. ANALYZE might or might not see the new tuple at all,
depending on timing and where the new tuple gets placed. So "count/sample
delete-in-progress but not insert-in-progress" seems like a good rule to
minimize the timing sensitivity of the results. It's not completely
bulletproof, but I think it's better than what we're doing now.
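
To illustrate why that rule is less timing-sensitive, here is a toy model in
Python. It is only a sketch, not PostgreSQL's actual ANALYZE code: the names
TupleState, scan_during_update and count_rows are made up for illustration,
and it assumes a single open transaction that has updated every row of the
table. The old row versions are always scanned, either as live or as
delete-in-progress, while the new versions may or may not be scanned.

```python
from enum import Enum, auto


class TupleState(Enum):
    """Hypothetical stand-ins for the tuple states discussed above."""
    LIVE = auto()
    DELETE_IN_PROGRESS = auto()
    INSERT_IN_PROGRESS = auto()


def scan_during_update(n_rows, old_already_marked, new_tuples_seen):
    """Tuple states one scan might observe while an open transaction
    has updated all n_rows rows (a simplifying assumption)."""
    old_state = (TupleState.DELETE_IN_PROGRESS if old_already_marked
                 else TupleState.LIVE)
    states = [old_state] * n_rows          # old versions are always seen
    if new_tuples_seen:
        states += [TupleState.INSERT_IN_PROGRESS] * n_rows
    return states


def count_rows(states, count_delete_in_progress=True,
               count_insert_in_progress=False):
    """Count rows under a given rule; the defaults are the rule
    'count delete-in-progress but not insert-in-progress'."""
    counted = {TupleState.LIVE}
    if count_delete_in_progress:
        counted.add(TupleState.DELETE_IN_PROGRESS)
    if count_insert_in_progress:
        counted.add(TupleState.INSERT_IN_PROGRESS)
    return sum(1 for s in states if s in counted)


# Try all four timing outcomes for a 100-row table.
for marked in (False, True):
    for seen in (False, True):
        states = scan_during_update(100, marked, seen)
        proposed = count_rows(states)
        opposite = count_rows(states, count_delete_in_progress=False,
                              count_insert_in_progress=True)
        print(marked, seen, proposed, opposite)
```

In this model the proposed rule reports 100 rows for every timing outcome,
whereas the opposite rule (count insert-in-progress, skip delete-in-progress)
reports anywhere from 0 to 200. The real scan also has to deal with sampling,
so this only shows the counting argument.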

> I'm going to rework the application so there is less time between the
> DELETE and the COMMIT so I will only see the problem if ANALYZE runs during
> this smaller time window.

Yeah, that's about the best you can do from the application side.

regards, tom lane
