Re: Why we lost Uber as a user

From: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why we lost Uber as a user
Date: 2016-07-28 14:53:44
Message-ID: CAB=Je-FNugDyxWOUoAuUUE5MPiKkJccTiQwh+_o+Ab=MQaS=pg@mail.gmail.com
Lists: pgsql-hackers

> >> That's a recipe for runaway table bloat; VACUUM can't do much because
> >> there's always some minutes-old transaction hanging around (and SNAPSHOT
> >> TOO OLD doesn't really help, we're talking about minutes here), and
> >> because of all of the indexes HOT isn't effective.

Just curious: what if PostgreSQL supported an index that stores the "primary
key" (or a unique key) instead of TIDs?
Am I right that this kind of index would not suffer from that bloat? I'm
assuming the primary key is not updated, so secondary indices built that
way should be much less prone to bloat when updates touch only other
columns (even if the TID moves, the row's PK does not change, so the
secondary index entry could be reused).

If that works, it could reduce index bloat and reduce the amount of WAL
(fewer indices would need to be updated). Of course it would make index
scans a bit slower, since each lookup goes through the primary key first,
but it looks like at least Uber is fine with that extra index scan cost.
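
To make the trade-off concrete, here is a toy model in Python. It is only a sketch under loud assumptions and has nothing to do with PostgreSQL internals: ToyTable, bump_counter, and find_by_email are made-up names, every update is treated as non-HOT (a new row version at a new TID), and old versions vanish immediately instead of waiting for VACUUM. It counts how many secondary-index writes a stream of updates to a non-indexed column causes, and how many index probes a lookup needs, in each scheme.

```python
# Toy model (hypothetical, not PostgreSQL code): a table with a primary key,
# one secondary index on "email", and a non-indexed "counter" column.

class ToyTable:
    def __init__(self, secondary_points_to):
        assert secondary_points_to in ("tid", "pk")
        self.mode = secondary_points_to
        self.heap = {}         # tid -> (pk, email, counter)
        self.pk_index = {}     # pk -> tid of the live row version
        self.email_index = {}  # email -> tid or pk, depending on mode
        self.next_tid = 0
        self.secondary_writes = 0

    def _new_version(self, pk, email, counter):
        # Every row version lands at a fresh TID; the PK index always
        # has to follow it, in both schemes.
        tid = self.next_tid
        self.next_tid += 1
        self.heap[tid] = (pk, email, counter)
        self.pk_index[pk] = tid
        return tid

    def insert(self, pk, email):
        tid = self._new_version(pk, email, 0)
        self.email_index[email] = tid if self.mode == "tid" else pk
        self.secondary_writes += 1

    def bump_counter(self, pk):
        # Assumption: the old version is dropped at once (no VACUUM modeled).
        old_tid = self.pk_index[pk]
        _, email, counter = self.heap.pop(old_tid)
        new_tid = self._new_version(pk, email, counter + 1)
        if self.mode == "tid":
            # TID-based secondary index must be repointed at the new TID.
            self.email_index[email] = new_tid
            self.secondary_writes += 1
        # PK-based secondary index: the entry still says "pk", nothing to do.

    def find_by_email(self, email):
        ref = self.email_index[email]
        if self.mode == "tid":
            return self.heap[ref], 1               # one index probe
        return self.heap[self.pk_index[ref]], 2    # secondary, then PK index


def run(mode, updates=1000):
    t = ToyTable(mode)
    t.insert(1, "a@example.com")
    for _ in range(updates):
        t.bump_counter(1)
    row, probes = t.find_by_email("a@example.com")
    return t.secondary_writes, probes, row


if __name__ == "__main__":
    for mode in ("tid", "pk"):
        writes, probes, row = run(mode)
        print(mode, "secondary writes:", writes, "probes per lookup:", probes)
```

In this model the TID-based index is written on every update, while the PK-based one is written only on insert, at the price of a second probe per lookup. Note that the PK index itself still has to be updated in both cases.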

Does it make sense to implement that kind of index as an access method?

Vladimir
