Re: Column lookup in a row performance

From: Павлухин Иван <vololo100(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Column lookup in a row performance
Date: 2019-04-03 11:44:37
Message-ID: CAOykqKf6GuyZV+pq6kjM6R9ToK7whA7aMKi4RFFYCPhb_7jFwA@mail.gmail.com
Lists: pgsql-hackers

Tom, thanks for your answer. It definitely makes the picture in my mind
clearer.

Tue, 2 Apr 2019 at 18:41, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>
> Павлухин Иван <vololo100(at)gmail(dot)com> writes:
> >> (1) Backwards compatibility, and (2) it's not clear that a different
> >> layout would be a win for all cases.
>
> > I am curious regarding (2), for my understanding it is good to find
> > out at least one case when layout with lengths/offsets in a header
> > will be crucially worse. I will be happy if someone can elaborate.
>
> It seems like you think the only figure of merit here is how fast
> deform_heap_tuple runs. That's not the case. There are at least
> two issues:
>
> 1. You're not going to be able to do this without making tuples
> larger overall in many cases; but more data means more I/O which
> means less performance. I base this objection on the observation
> that our existing design allows single-byte length "words" in many
> common cases, but it's really hard to see how you could avoid
> storing a full-size offset for each column if you want to be able
> to access each column in O(1) time without any examination of other
> columns.
>
> 2. Our existing system design has an across-the-board assumption
> that each variable-length datum has its length embedded in it,
> so that a single pointer carries enough information for any called
> function to work with the value. If you remove the length word
> and expect the length to be computed by subtracting two offsets that
> are not even physically adjacent to the datum, that stops working.
> There is no fix for that that doesn't add performance costs and
> complexity.
>
> Practically speaking, even if we were willing to lose on-disk database
> compatibility, point 2 breaks so many internal and extension APIs that
> there's no chance whatever that we could remove the length-word datum
> headers. That means that the added fields in tuple headers would be
> pure added space with no offsetting savings in the data size, making
> point 1 quite a lot worse.
>
> regards, tom lane
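
A minimal sketch of the trade-off in points 1 and 2, under toy assumptions
rather than PostgreSQL's actual on-disk format: every value is short enough
for a 1-byte length prefix (payload of at most 126 bytes here), and the
alternative layout stores one fixed-width offset per column in the row
header. All function names and the offset width parameter are hypothetical,
chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy layout A: each value carries its own 1-byte length prefix.
 * Assumes every payload fits in a short header (<= 126 bytes). */
size_t inline_layout_size(const size_t *payload_lens, size_t ncols)
{
    size_t total = 0;
    for (size_t i = 0; i < ncols; i++) {
        assert(payload_lens[i] <= 126);
        total += 1 + payload_lens[i];   /* length byte + payload */
    }
    return total;
}

/* Toy layout B: the row header holds one offset of offset_width bytes per
 * column. If keep_inline_len is nonzero, values also keep their 1-byte
 * length prefix (point 2: callers expect self-describing datums). */
size_t offset_layout_size(const size_t *payload_lens, size_t ncols,
                          size_t offset_width, int keep_inline_len)
{
    size_t total = ncols * offset_width;          /* offset array */
    for (size_t i = 0; i < ncols; i++)
        total += payload_lens[i] + (keep_inline_len ? 1 : 0);
    return total;
}

/* Point 2: with an embedded length, a single pointer is enough for any
 * callee to learn the size of the value. */
size_t datum_len(const uint8_t *datum)
{
    return datum[0];
}

/* Layout A lookup: reaching column k means stepping over the k values
 * before it, so the cost grows with k. */
const uint8_t *inline_nth(const uint8_t *row, size_t k)
{
    const uint8_t *p = row;
    for (size_t i = 0; i < k; i++)
        p += 1 + datum_len(p);
    return p;
}
```

For four columns with payloads of 3, 5, 10 and 1 bytes, the inline layout
takes 23 bytes. With 2-byte offsets it takes 31 bytes if the length prefixes
must stay, and 27 bytes if they could be dropped.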

--
Best regards,
Ivan Pavlukhin
