Re: BUG #16241: Degraded hash join performance

From: Thomas Butz <tbutz(at)optitool(dot)de>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16241: Degraded hash join performance
Date: 2020-02-04 14:49:10
Message-ID: 87490169.275301.1580827750088.JavaMail.zimbra@optitool.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I was able to reduce it to a minimal example:

SELECT localized_streetname
FROM planet_osm_line
WHERE "way" && ST_SetSRID('BOX3D(1222380.956336539 6339381.37785938,1233387.888409604 6350388.309932444)'::box3d, 3857)

Your diagnosis with osml10n_get_streetname_from_tags was spot on. I'm going to report it to the author of the osml10n extension.

Thank you for your help!

PS: I've created the execution plans using EXPLAIN (ANALYZE, VERBOSE, COSTS, BUFFERS, TIMING, SUMMARY, FORMAT json) but i wasn't aware that i might have to check the "Output" section of a node to find potential performance problems. How do you find such slow outputs without using perf?

> Von: "Andres Freund" <andres(at)anarazel(dot)de>
> An: "Thomas Butz" <tbutz(at)optitool(dot)de>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> CC: "pgsql-bugs" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
> Gesendet: Dienstag, 4. Februar 2020 14:44:08
> Betreff: Re: BUG #16241: Degraded hash join performance

> Hi,
>
>
> On 2020-02-04 10:02:45 +0100, Thomas Butz wrote:
>> I've executed the query twice and attached the generated reports.
>
> Interesting! The no-children one clearly shows that a lot of the the
> time is spent evaluating regular expressions (there's other regex
> functions in the profile too):
>
> 23.36% postgres postgres [.] subcolor
>
> and that this is inside a stack of plpgsql functions:
>
> | | |--18.95%--exec_stmt_block
> | | | exec_stmt
> | | | plpgsql_exec_function
> | | | plpgsql_call_handler
> | | | ExecInterpExpr
> | | | ExecEvalExpr (inlined)
> | | | exec_eval_simple_expr
> | | | (inlined)
> [...]
> | | | plpgsql_exec_function
> | | | plpgsql_call_handler
> | | | ExecInterpExpr
> | | | ExecEvalExpr (inlined)
> | | | exec_eval_simple_expr (inlined)
> [...]
> | | | plpgsql_exec_function
> | | | plpgsql_call_handler
> | | | ExecInterpExpr
> | | | ExecEvalExpr (inlined)
> [...]
>
>
> I'm not aware of any relevant regular expression evaluation changes
> between 11 and 12. Tom, does this trigger anything?
>
> Thomas, could you provide a EXPLAIN (ANALYZE, VERBOSE) for both
> versions? Clearly the hash join is where the time is spent:
>
> -> Hash Join (cost=1.000..6436.030 rows=44 width=517) (actual
> time=5.532..3383.213 rows=2779 loops=1)
> Buffers: shared hit=9077
> -> Index Scan using planet_osm_hstore_line_way_idx on planet_osm_hstore_line
> planet_osm_hstore_line (cost=0.550..6419.380 rows=491 width=352) (actual
> time=0.423..19.147 rows=2779 loops=1)
> Index Cond: (planet_osm_hstore_line.way &&
> '01030000A0110F00000100000005000000AF78D2F4ECA632411CD92E58CD2E58410000000000000000AF78D2F4ECA63241E4EED5138D3958410000000000000000D3CF6EE3EBD13241E4EED5138D3958410000000000000000D3CF6EE3EBD132411CD92E58CD2E58410000000000000000AF78D2F4ECA632411CD92E58CD2E58410000000000000000'::geometry)
> Filter: ((((planet_osm_hstore_line.tags -> 'name'::text) IS NOT NULL) OR
> ((planet_osm_hstore_line.tags -> 'oneway'::text) = ANY ('{yes,-1}'::text[])) OR
> ((planet_osm_hstore_line.tags -> 'junction'::text) = 'roundabout'::text)) AND
> ((planet_osm_hstore_line.tags -> 'highway'::text) = ANY
> ('{motorway,motorway_link,trunk,trunk_link,primary,primary_link,secondary,secondary_link,tertiary,tertiary_link,residential,unclassified,road,service,pedestrian,raceway,living_street,construction}'::text[])))
> Buffers: shared hit=9077
> -> Hash (cost=0.230..0.230 rows=18 width=32) (actual time=0.029..0.030 rows=18
> loops=1)
> -> Values Scan (cost=0.000..0.230 rows=18 width=32) (actual time=0.003..0.017
> rows=18 loops=1)
>
> but I don't think it's the hash condition itself - there's no plpgsql
> functions necessary for the join as far as I can tell. A VERBOSE EXPLAIN
> might help up with knowing what the function being called actually is.
>
> It seems quite possible that all that time is spent inside plpgsql
> functions that are part of postgis. And that the postgis version
> difference between 2.5 and 3.0 is the relevant part here, not the
> postgres version difference. Or it's just in the application code:
>
> The sort step shows among others the following column to be sorted on:
> osml10n_get_streetname_from_tags(planet_osm_hstore_line.tags, true, false, ' -
> '::text, 'de'::text, planet_osm_hstore_line.way, NULL::text)) DESC
> which is actually likely computed as the output of the HashJoin
> node. And is the only obvious candidate for a plpgsql function
> potentially taking a lot of the time.
>
> Is that function defined exactly the same between the version? Is the
> underlying dataset roughly the same size? I suggest running the query
> without it, and checking how long it takes.
>
> Regards,
>
> Andres

--

Thomas Butz

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2020-02-04 15:01:52 Re: BUG #16241: Degraded hash join performance
Previous Message Andres Freund 2020-02-04 13:44:08 Re: BUG #16241: Degraded hash join performance