| From: | Tatsuya Kawata <kawatatatsuya0913(at)gmail(dot)com> |
|---|---|
| To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
| Cc: | "samimseih(at)gmail(dot)com" <samimseih(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: [PATCH] Add sampling statistics to autoanalyze log output |
| Date: | 2026-01-14 15:43:32 |
| Message-ID: | CAHza6qfenjMLtvdVVY04nvE5-+9u2ufBqdM6cGQgyQ+ujJLdyA@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Fujii-san, Chao-san,
Thank you for your review!
> With this change, we would end up reporting nearly the same sampling
> information twice, for example:
>
> INFO: analyzing "public.ft"
> INFO: "ft": table contains 10 rows, 10 rows in sample
> INFO: finished analyzing table "postgres.public.ft"
> sampling: 10 rows in sample, 10 estimated total rows
>
> Wouldn't it be less confusing to avoid reporting the second sampling line?
I agree, that would be better. I'll fix this.
> I'm not sure it's acceptable to change the FDW API and require
> FDW authors to update their extensions, especially since
> the benefit on the FDW side seems limited at this point.
I agree that it's premature to require FDW-side changes at this stage. I'll
remove those modifications.
> You may need to update src/tools/pgindent/typedefs.list.
Thank you! I'll check and update it.
> Based on my testing, the aggregated values look incorrect.
> In the example below, both t_0 and t_1 report 5,000,000 live rows,
> but the aggregated result is 6,779,052. Is that the expected
> aggregation behavior?
I had misunderstood the sampling behavior for inheritance tables. I
initially thought it would be straightforward to display the sum of child
table values for the parent table. However, the parent table's sampling
logic proportionally distributes samples from inherited child tables based
on their relative block counts, and stores that as the parent's statistics.
Therefore, simply changing the log output to show summed values would imply
a change in the sampling methodology itself. While such a modification
might be possible in the future, it would deviate from the original purpose
of this patch, which is to align the log output between ANALYZE VERBOSE and
autoanalyze. For now, I'll exclude this from the current patch.
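To illustrate the proportional distribution described above, here is a minimal sketch of how a sample budget could be split across child tables by relative block count. It is not the actual code in analyze.c; the function name allocate_child_samples and its parameters are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a PostgreSQL function): split a sample budget of
 * targrows across nchildren child tables in proportion to each child's
 * block count. The per-child share is written to childtargrows[].
 */
void
allocate_child_samples(const unsigned long *childblocks, size_t nchildren,
                       int targrows, int *childtargrows)
{
    unsigned long long totalblocks = 0;
    int         assigned = 0;
    size_t      i;

    for (i = 0; i < nchildren; i++)
        totalblocks += childblocks[i];

    for (i = 0; i < nchildren; i++)
    {
        int         share = 0;

        if (totalblocks > 0)
        {
            /* Integer division, rounded to the nearest whole row. */
            share = (int) (((unsigned long long) targrows * childblocks[i]
                            + totalblocks / 2) / totalblocks);
        }

        /* Never hand out more rows than remain in the overall budget. */
        if (share > targrows - assigned)
            share = targrows - assigned;

        childtargrows[i] = share;
        assigned += share;
    }
}
```

For example, with two children of 1000 and 3000 blocks and a budget of 30000 sample rows, this sketch assigns 7500 and 22500 rows respectively. Under this kind of scheme, the per-child numbers gathered while analyzing the parent are not the same as the numbers each child reports when analyzed on its own, which is why a simple sum in the log would not match the sampling that is actually performed.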
I plan to post an updated version addressing Chao-san's feedback as well.
Regards,
Tatsuya Kawata