Dynamic gathering the values for seq_page_cost/xxx_cost

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Dynamic gathering the values for seq_page_cost/xxx_cost
Date: 2019-11-26 00:59:22
Message-ID: CAKU4AWotEsUX3pF=KRm6xrnceAjffHELkE-jCDTsnvmvUhXtWA@mail.gmail.com
Lists: pgsql-hackers

The optimizer cost model usually needs two inputs: one represents the data
distribution, and the other represents the capacity of the hardware, such as
CPU and I/O. Let's call the latter system stats.
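
To make the two inputs concrete, here is a minimal sketch of how they combine for a plain sequential scan without a WHERE clause. It follows the formula given in the PostgreSQL documentation for EXPLAIN (pages read times seq_page_cost, plus rows scanned times cpu_tuple_cost). The function name and the sample table size are illustrative assumptions, not planner code.

```python
def seq_scan_cost(pages, tuples, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    """Rough total cost of an unfiltered sequential scan.

    pages and tuples stand in for the data statistics (per table);
    seq_page_cost and cpu_tuple_cost stand in for the system stats
    (the defaults are PostgreSQL's default settings).
    """
    io_cost = pages * seq_page_cost      # cost of fetching every page
    cpu_cost = tuples * cpu_tuple_cost   # cost of processing every row
    return io_cost + cpu_cost


# Illustrative table: 358 pages holding 10000 rows.
print(seq_scan_cost(pages=358, tuples=10000))
```

The same data statistics give a different cost as soon as the system-stats side changes, which is why the xxx_cost values matter.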

In Oracle Database, the system stats can be gathered on the running hardware
with dbms_stats.gather_system_stats [1]. In PostgreSQL, the values are set
based on experience (users can change them as well, but it is hard to decide
which values to use). The PG way is not perfect in theory (in practice, it may
or may not be good enough). For example, HDDs and SSDs differ in their
seq_page_cost/random_page_cost characteristics, and CPU costs may also differ
across hardware.
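
As a rough illustration of what gathering such a value on the running hardware could look like, the sketch below times sequential and random 8 kB page reads on a scratch file and reports their ratio, which is the quantity that random_page_cost expresses relative to seq_page_cost. This is only an assumption-laden toy: the helper names are hypothetical, it is neither the method from the paper [2] nor Oracle's procedure, and because the scratch file will mostly be served from the OS page cache, the result says little about the real device unless caching is bypassed.

```python
import os
import random
import tempfile
import time

PAGE_SIZE = 8192  # PostgreSQL's default block size


def time_page_reads(path, page_numbers):
    """Return the average seconds per page read, in the given page order."""
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        for n in page_numbers:
            f.seek(n * PAGE_SIZE)
            f.read(PAGE_SIZE)
        elapsed = time.perf_counter() - start
    return elapsed / len(page_numbers)


def estimate_random_to_seq_ratio(n_pages=1024, seed=0):
    """Hypothetical helper: ratio of random to sequential page read time."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(PAGE_SIZE * n_pages))
        path = tmp.name
    try:
        sequential = list(range(n_pages))
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)  # same pages, random order
        seq_time = time_page_reads(path, sequential)
        rand_time = time_page_reads(path, shuffled)
    finally:
        os.remove(path)
    # Guard against a zero denominator on very coarse clocks.
    return rand_time / max(seq_time, 1e-12)


if __name__ == "__main__":
    print("random/sequential read time ratio:", estimate_random_to_seq_ratio())
```

A real calibration would also need to cover the CPU-side parameters and use a data set larger than memory, so treat the number printed here as a demonstration of the idea only.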

I ran into a paper [2] that did some research on dynamically gathering the
values for xxx_cost, and it looks interesting. However, it doesn't provide
the code for others to do more research. Before I dive into this, it
would be great to hear some suggestions from experts.

So what do you think about this method? Has it been discussed before, and if
so, what was the result?

[1] https://docs.oracle.com/database/121/ARPLS/d_stats.htm#ARPLS68580
[2] https://dsl.cds.iisc.ac.in/publications/thesis/pankhuri.pdf

Thanks
