Re: BUG #18909: Query creates millions of temporary files and stalls

From: Sergey Koposov <Sergey(dot)Koposov(at)ed(dot)ac(dot)uk>
To: "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "lepihov(at)gmail(dot)com" <lepihov(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18909: Query creates millions of temporary files and stalls
Date: 2025-05-03 23:05:31
Message-ID: f6fb5c3463006c1d4471bc30f9edb419cb777329.camel@ed.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sat, 2025-05-03 at 18:47 -0400, Tom Lane wrote:
> Sergey Koposov <Sergey(dot)Koposov(at)ed(dot)ac(dot)uk> writes:
> > wsdb=> set cursor_tuple_fraction TO 1;
> > SET
> > wsdb=> EXPLAIN DECLARE c CURSOR FOR select phot_g_mean_mag,phot_bp_mean_mag,phot_rp_mean_mag,
> > g.source_id,pmra,pmdec,parallax,parallax_error,d.ra,d.dec,d.ebv,d.gmeanpsfmag,rmeanpsfmag,imeanpsfmag,gkronmag,rkronmag from disk_sample1 as d left join
> > gaia_edr3_aux.panstarrs1bestneighbour as gaps1 on (gaps1.original_ext_source_id=d.objid) left join gaia_edr3.gaia_source as g on ( g.source_id =
> > gaps1.source_id
> > ) ;
> > QUERY PLAN
> > ---------------------------------------------------------------------------------------------------------------
> > Hash Right Join (cost=112529534.06..422118114.74 rows=65470868 width=100)
> > Hash Cond: (g.source_id = gaps1.source_id)
> > -> Seq Scan on gaia_source g (cost=0.00..124645680.12 rows=1811786112 width=48)
> > -> Hash (cost=111007847.21..111007847.21 rows=65470868 width=60)
> > -> Hash Right Join (cost=3231089.53..111007847.21 rows=65470868 width=60)
> > Hash Cond: (gaps1.original_ext_source_id = d.objid)
> > -> Seq Scan on panstarrs1bestneighbour gaps1 (cost=0.00..17443258.96 rows=951450496 width=16)
> > -> Hash (cost=1709402.68..1709402.68 rows=65470868 width=60)
> > -> Seq Scan on disk_sample1 d (cost=0.00..1709402.68 rows=65470868 width=60)
>
> Oh, *that* is interesting: now we have two hash joins, and neither one
> has panstarrs1bestneighbour as the table to hash, so it's not too
> clear which one is going crazy. But you showed the stats for
> disk_sample1.objid, and that looked pretty well distributed, so I'm
> going to guess that that hash is fine. That leaves the other join
> on panstarrs1bestneighbour.source_id as the one under suspicion.
> Can we see the stats for that column?
>

I don't quite know what you are looking for here, but this is the panstarrs1bestneighbour.source_id.
Those are object ids for ~ a billion objects, so basically I expect most of them to be different,
which is matches my expectation.

table_name | column_name | stanullfrac | stadistinct | stawidth | stakind1 |
stavalues1
| stakind2 | stavalues2 | stakind3 | stavalues3 | stakind4 | stavalues4 | stakind5 | stavalues5
---------------------------------------+-------------+-------------+-------------+----------+----------+---------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------+----------+------------+----------+------------+----------+------------+----------+------------
gaia_edr3_aux.panstarrs1bestneighbour | source_id | 0 | -1 | 8 | 2 |
{806904097912320,181280403379521536,231030517101341952,279340244824304640,395434550873163776,430064940901112960,465200384719646464,511243945963340288,55485098513
0936448,884099024809719424,1088756926588540928,1368729867335745024,1743677218036649344,1806876076961984000,1822446535998227072,1827513777076336640,18434769335570
31936,1864935449158255104,1920817367429277568,1968055999039378816,1987254777719980160,2004190452249389056,2020732153709040512,2027928972718120832,203348532469370
5856,2045418530529014272,2059732346543415936,2074956120121436416,2100442043737863296,2163620222393133568,2179506511844570496,2206141459472625664,2252050369194606
336,2591598075455078272,2906007067548492032,2949458190829542272,3032992517301187328,3068094632176092416,3120910944525529984,3215924142328292992,33363926191583658
24,3378822326196932864,3442282147328280064,3663373148030165504,4046707917046071296,4050407357085387776,4052784814808650880,4058733516320497280,406154888016680512
0,4064082807792646528,4067157871310616320,4070490250503721728,4075824943460652416,4079541052298279552,4089451977102617600,4091947563572467328,4096492394857354624
,4101247370255732864,4104701520432256512,4108024244563444608,4111583191923415552,4116149772934806784,4118390921289286016,4121139459806669056,4124851857730469120,
4135313745180553728,4145067478499619328,4150596918151346688,4157604522387932416,4169873659086912896,4186656398413699456,4202234455318397824,4208158108512840448,4
237268606566890112,4252491997962068864,4256170723281997056,4262931890855102848,4268558164899557376,4283472009313681920,4291562353308821376,4298242676736372736,43
08513249218792320,4314497062051819904,4320919893590099456,4369138185748815360,4459756193193297920,4481744016273417216,4507165210444471936,4516588295676538240,453
3771875708241024,4602125356804206464,5598986647730081280,5639341335818720640,5699272347395052928,5980496601182037504,6031590184872417536,6046668097063710464,6251
266251555253888,6755724660278190464,6794820098303412096,6917184811783233280} | 3 | | 0 | | 0 | | 0 |
(1 row)

The only thing I can add is that panstarrs1bestneighbour is ordered on disk by source_id. I don't know if that can be relevant.

S
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andrei Lepikhov 2025-05-04 07:52:13 Re: BUG #18909: Query creates millions of temporary files and stalls
Previous Message Tom Lane 2025-05-03 22:47:33 Re: BUG #18909: Query creates millions of temporary files and stalls