Re: BUG #13846: INSERT ON CONFLICT consumes sequencers on conflicts

From: Paul <paul(at)salesintel(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13846: INSERT ON CONFLICT consumes sequencers on conflicts
Date: 2016-01-06 22:00:17
Message-ID: 568d8e74.d1a9ca0a.d9558.ffff9026@mx.google.com
Lists: pgsql-bugs

I’m looking at math more like a single fact table holding 500 million records, with 10 dimension columns. If INTs were used for the dimension columns, that’s 20GB; if I had to change those dimension columns to BIGINTs, that’s 40GB. That can affect how much you can fit into server memory, where OLAP workloads like to live, and it slows down moving data between memory and disk, over the network, during backups, etc.
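As a quick sanity check on those numbers (a sketch counting raw column data only, ignoring per-tuple headers, padding, and index overhead):

```python
# Back-of-the-envelope: raw dimension-column data only.
rows = 500_000_000
dim_cols = 10

int4_bytes = rows * dim_cols * 4   # INT dimensions
int8_bytes = rows * dim_cols * 8   # BIGINT dimensions

print(int4_bytes)  # 20_000_000_000 -> the "20GB" figure
print(int8_bytes)  # 40_000_000_000 -> the "40GB" figure
```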

On a technical note, why is the following flow considered ‘fragile’?
1) Evaluate only columns used in conflict_target
a. Conflict-resolving columns with a default of nextval() increment the corresponding sequencer
i. And in this case, there will never be conflicts by definition, so ON CONFLICT can always be ignored
2) If conflict, DO UPDATE
a. If nextval()-defaulted column used in conflict_target, we never get here
b. Defaults never evaluated
3) Else Evaluate remaining columns not used in conflict_target and INSERT
a. Columns with nextval() increment their corresponding sequencer
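For reference, the behavior under discussion can be reproduced with a small sketch like this (table and column names are illustrative):

```sql
CREATE TABLE dim (
    id   serial PRIMARY KEY,
    name text UNIQUE
);

INSERT INTO dim (name) VALUES ('a');        -- id = 1
INSERT INTO dim (name) VALUES ('a')
    ON CONFLICT (name) DO NOTHING;          -- no row inserted, but the
                                            -- serial default was still
                                            -- evaluated, advancing the
                                            -- sequence
INSERT INTO dim (name) VALUES ('b');        -- id = 3, not 2
```

Defaults are evaluated as part of forming the tuple to insert, before the conflict is detected, which is why the sequence advances even when the row is discarded.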

From: Peter Geoghegan
Sent: Wednesday, January 6, 2016 2:13 PM
To: Paul
Cc: Andres Freund; pgsql-bugs
Subject: Re: [BUGS] BUG #13846: INSERT ON CONFLICT consumes sequencers on conflicts

On Wed, Jan 6, 2016 at 1:00 PM, Paul <paul(at)salesintel(dot)com> wrote:
> My only point is, there’s another great capability in Postgres, for doing
> efficient concurrent UPSERTS, but that one of the most common and prime use
> cases for it is going to cost more money from having to use 64bit numbers
> everywhere for no good reason.

Not really; at least, it won't increase the size of a plain BIGSERIAL
primary key index (which will generally not be composite):

postgres=# create table foo as select (random() * 100000000)::int4
pkey from generate_series(1, 100000);
SELECT 100000
postgres=# create index on foo (pkey);
CREATE INDEX

postgres=# \dt+ foo
List of relations
Schema │ Name │ Type │ Owner │ Size │ Description
────────┼──────┼───────┼───────┼─────────┼─────────────
public │ foo │ table │ pg │ 3544 kB │
(1 row)

postgres=# \di+ foo_pkey_idx
List of relations
Schema │ Name │ Type │ Owner │ Table │ Size │ Description
────────┼──────────────┼───────┼───────┼───────┼─────────┼─────────────
public │ foo_pkey_idx │ index │ pg │ foo │ 2208 kB │
(1 row)

postgres=# alter table foo alter column pkey type int8;
ALTER TABLE

postgres=# \di+ foo_pkey_idx
List of relations
Schema │ Name │ Type │ Owner │ Table │ Size │ Description
────────┼──────────────┼───────┼───────┼───────┼─────────┼─────────────
public │ foo_pkey_idx │ index │ pg │ foo │ 2208 kB │
(1 row)

The unchanged size of the index foo_pkey_idx seen here is due to
alignment considerations. Granted, the heap might still be a bit
larger than it would otherwise be, because it will usually be
"composite", but I think the "cost" of your using int8 will not break
the bank.
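The alignment effect can be sketched numerically: index tuples are padded out to a MAXALIGN boundary (8 bytes on typical 64-bit platforms), so a lone int4 key ends up occupying the same padded space as an int8 key. A rough model (the constants are simplifications; real index tuples also involve item pointers and page overhead):

```python
# Rough model: each B-tree index tuple is padded to a MAXALIGN boundary,
# so padding absorbs the int4-vs-int8 difference for a single-column key.
MAXALIGN = 8       # typical on 64-bit platforms
TUPLE_HEADER = 8   # simplified index tuple header size

def maxalign(n, align=MAXALIGN):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align

int4_tuple = maxalign(TUPLE_HEADER + 4)  # header + int4 key -> 16
int8_tuple = maxalign(TUPLE_HEADER + 8)  # header + int8 key -> 16

print(int4_tuple, int8_tuple)  # both 16 bytes under this model
```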

--
Peter Geoghegan
