Re: Declarative partitioning optimization for large amount of partitions

From: Andres Freund <andres(at)anarazel(dot)de>
To: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Declarative partitioning optimization for large amount of partitions
Date: 2017-03-06 20:01:50
Message-ID: 20170306200150.kez6a5frw6vv6twt@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

This issue has bothered me in non-partitioned use-cases recently, so
thanks for taking it up.

On 2017-03-06 18:22:17 +0300, Aleksander Alekseev wrote:
> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> index 2fb9a8bf58..fa906e7930 100644
> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -160,6 +160,16 @@ typedef struct TabStatusArray
>
> static TabStatusArray *pgStatTabList = NULL;

Why are we keeping that list / the "batch" allocation pattern? It
doesn't actually seem useful to me after these changes. Given that we
don't shrink hash-tables, we could just use the hash-table memory for
this, no?

I think a separate list that only contains things that are "active" in
the current transaction makes sense, because the current logic requires
iterating over a potentially very large array at transaction commit.

> +/* pgStatTabHash entry */
> +typedef struct TabStatHashEntry
> +{
> + Oid t_id;
> + PgStat_TableStatus* tsa_entry;
> +} TabStatHashEntry;
> +
> +/* Hash table for faster t_id -> tsa_entry lookup */
> +static HTAB *pgStatTabHash = NULL;
> +

'faster ... lookup' doesn't strike me as a useful comment, because it's
only useful in relation of the current code to the new code.

> /*
> * Backends store per-function info that's waiting to be sent to the collector
> * in this hash table (indexed by function OID).
> @@ -815,7 +825,13 @@ pgstat_report_stat(bool force)
> pgstat_send_tabstat(this_msg);
> this_msg->m_nentries = 0;
> }
> +
> + /*
> + * Entry will be freed soon so we need to remove it from the lookup table.
> + */
> + hash_search(pgStatTabHash, &entry->t_id, HASH_REMOVE, NULL);
> }

It's not really freed...

> static PgStat_TableStatus *
> get_tabstat_entry(Oid rel_id, bool isshared)
> {
> + TabStatHashEntry* hash_entry;
> PgStat_TableStatus *entry;
> TabStatusArray *tsa;
> - TabStatusArray *prev_tsa;
> - int i;
> +
> + /* Try to find an entry */
> + entry = find_tabstat_entry(rel_id);
> + if(entry != NULL)
> + return entry;

Arguably it'd be better to HASH_ENTER directly here, instead of doing
two lookups.

> /*
> - * Search the already-used tabstat slots for this relation.
> + * Entry doesn't exist - creating one.
> + * First make sure there is a free space in a last element of pgStatTabList.
> */
> - prev_tsa = NULL;
> - for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = tsa->tsa_next)
> + if (!pgStatTabList)
> {
> - for (i = 0; i < tsa->tsa_used; i++)
> - {
> - entry = &tsa->tsa_entries[i];
> - if (entry->t_id == rel_id)
> - return entry;
> - }
> + HASHCTL ctl;
>
> - if (tsa->tsa_used < TABSTAT_QUANTUM)
> + Assert(!pgStatTabHash);
> +
> + memset(&ctl, 0, sizeof(ctl));
> + ctl.keysize = sizeof(Oid);
> + ctl.entrysize = sizeof(TabStatHashEntry);
> + ctl.hcxt = TopMemoryContext;
> +
> + pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup table",
> + TABSTAT_QUANTUM, &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);

Think it'd be better to move the hash creation into its own function.

- Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Lukas Fittl 2017-03-06 20:01:52 Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements
Previous Message Adam Brightwell 2017-03-06 19:14:24 Re: RADIUS fallback servers