From: | Vince McMahon <sippingonesandzeros(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Explain returns different number of rows |
Date: | 2022-10-24 14:31:44 |
Message-ID: | CAKS+vVb2hAzd1zORNuM35DWK47vb=i-cehuh8RaTkk7jF4=dMQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-performance |
Thanks for the clarification, Peter.
On Sat, Oct 22, 2022, 05:32 Peter J. Holzer <hjp-pgsql(at)hjp(dot)at> wrote:
> On 2022-10-20 09:56:23 -0700, Christophe Pettus wrote:
> > On Oct 20, 2022, at 09:52, Vince McMahon <sippingonesandzeros(at)gmail(dot)com>
> wrote:
> > > The number of rows are different.
> >
> > This isn't unexpected. EXPLAIN does not actually run the query and
> > determine how many rows are returned; it calculates an estimate based
> > on the current system statistics, which vary constantly depending on
> > activity in the database.
>
> EXPLAIN ANALYZE (which is what he did) does run the query and return the
> actual number of rows:
>
> #v+
> wdsah=> explain (analyze, buffers) select * from
> facttable_eurostat_comext_cpa2_1 ;
>
> ╔══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
> ║ QUERY
> PLAN ║
>
> ╟──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
> ║ Seq Scan on facttable_eurostat_comext_cpa2_1 (cost=0.00..1005741.32
> rows=39633432 width=85) (actual time=0.396..6541.701 rows=39633591 loops=1)
> ║
> ║ Buffers: shared read=609407
> ║
> ║ Planning Time: 1.650 ms
> ║
> ║ Execution Time: 7913.027 ms
> ║
>
> ╚══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
> (4 rows)
> #v-
>
> The first tuple (cost=0.00..1005741.32 rows=39633432 width=85) is an
> estimate used to plan the query. But the second one
> (actual time=0.396..6541.701 rows=39633591 loops=1)
> contains measurements from actually running the query.
>
> I think it's possible that the rows estimate in the first tuple changes
> without any actual data change (although the only reason I can think of
> right now would be an ANALYZE (in another session or by autovacuum)).
> But the actual rows definitely shouldn't change.
>
> hp
>
> --
> _ | Peter J. Holzer | Story must make more sense than reality.
> |_|_) | |
> | | | hjp(at)hjp(dot)at | -- Charles Stross, "Creative writing
> __/ | http://www.hjp.at/ | challenge!"
>
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2022-10-24 15:59:04 | Re: possible bug |
Previous Message | Ajin Cherian | 2022-10-24 12:29:34 | Re: Support logical replication of DDLs |
From | Date | Subject | |
---|---|---|---|
Next Message | Ramdip Gill | 2022-11-14 01:49:13 | =ANY(ARRAY) vs =ANY(ARRAY(expr)) performance |
Previous Message | Peter J. Holzer | 2022-10-22 09:32:32 | Re: Explain returns different number of rows |