Re: questions about PG update performance

From: José Luis Tallón <jltallon(at)adv-solutions(dot)net>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Любен Каравелов <karavelov(at)mail(dot)bg>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kisung Kim <kskim(at)bitnine(dot)co(dot)kr>
Subject: Re: questions about PG update performance
Date: 2015-10-26 11:59:08
Message-ID: 562E158C.8050201@adv-solutions.net
Lists: pgsql-hackers

On 10/26/2015 05:49 AM, Amit Kapila wrote:
> On Mon, Oct 26, 2015 at 9:03 AM, Любен Каравелов <karavelov(at)mail(dot)bg> wrote:
> >
> >
> > ----- Quote from Kisung Kim (kskim(at)bitnine(dot)co(dot)kr), on 26.10.2015 at 04:36 -----
> >
> > > However, what I want to know is about the update performance difference
> > > between PG and Oracle, if there is any.
> > > The case I described is for exaggerating the difference between PG and
> > > Oracle.
> > >
> > > I want to explain to our clients that PG's update performance is
> > > comparable to Oracle's.
> > >
> >
> > Oracle is also using MVCC but copies the old row in the rollback
> > segment and rewrites the values in-place.
>
> I think Oracle just copies the changed part of the old row to the rollback
> segment.
> Also in Redo logs, it just writes the changed column value (both old and
> new). So for the case we are discussing in this thread (one changed
> column out of 200 columns), Oracle will just write the old value of that
> column in Redo and then in the rollback segment, and write the new value
> in Redo and then do the in-place update of the heap row.

IMV, whereas Oracle is heavily optimized for the assumptions that "most
DML transactions will commit successfully", "no long-running transactions
shall ever exist", and "not many transactions will have to read previous
snapshots" (reconstructed from prior images / PIs), Postgres does not
actually make any such assumptions.

Hence, for long-running transactions or massive concurrency, with many
clients reading and writing older snapshots, Postgres will be faster
(less work to do compared to reconstructing rows from PIs).

Plus, for updates where the size of the NEW row is bigger than the
previous one (think appending text), the overhead is actually greater for
Oracle (plus, they don't compress variable-length values by default / no
TOAST)... so here Postgres would be faster.
For text-intensive workloads, Postgres is measurably faster than Oracle
mostly due to this fact (plus it is much more efficient in its use of
storage/RAM...).

>
> In PostgreSQL, the whole new row is written in the heap, and a diff tuple
> (the difference between the old and new tuple; this optimization was done
> in 9.4, commit a3115f0d) in WAL. I think we can try to optimize and use
> the same technique for the heap as used for WAL to make PostgreSQL more
> efficient for such scenarios; however, as of today, my guess is that
> PostgreSQL's update would be lagging in this area.

Yup. But see above for a potential reason why it might not be that bad,
especially after the optimization you mention.
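
A rough way to see that WAL optimization at work, as a sketch only
(made-up table, 9.4-era function names), is to compare the WAL position
before and after a one-column update of a wide row:

    -- Stand-in for the 200-column case discussed up-thread; the low
    -- fillfactor keeps new row versions on the same page as the old.
    CREATE TABLE wide_t WITH (fillfactor = 50) AS
        SELECT g AS id, repeat('x', 100) AS c1, repeat('y', 100) AS c2,
               repeat('z', 100) AS c3
        FROM generate_series(1, 10000) g;

    SELECT pg_current_xlog_location();     -- note the LSN before
    UPDATE wide_t SET id = id + 100000;    -- touch only one narrow column
    SELECT pg_current_xlog_location();     -- note the LSN after

    -- pg_xlog_location_diff(after, before) gives the WAL bytes generated.
    -- When old and new versions land on the same page, the 9.4 format
    -- records roughly just the changed part of the tuple, even though the
    -- heap still gets a complete new version of every updated row.
    -- (Full-page images after an intervening checkpoint would inflate
    -- the number.)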

> > It is still 2 writes as in Postgres.
>
> The difference is in the amount of data written per write.

Yes, but in Postgres those writes are compressed (for varlena-based
datums), batched (group commit) and hence mostly sequential, and not
duplicated (a single WAL vs. REDO+UNDO).

So I guess the difference is quite small nowadays, and it will be heavily
influenced by the actual workload.

Just my 2 (euro-) cents.

/ J.L.
