Re: Use of "long" in incremental sort code

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: James Coleman <jtc331(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use of "long" in incremental sort code
Date: 2020-11-23 11:11:04
Message-ID: 005bd231-acff-6a56-e37b-8ccfc1ef9f93@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05.11.2020 02:53, James Coleman wrote:
> On Tue, Nov 3, 2020 at 4:42 PM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> On Tue, Nov 03, 2020 at 03:53:53AM +0100, Tomas Vondra wrote:
>>> Hi,
>>>
>>> I took another look at this, and 99% of the patch (the fixes to sort
>>> debug messages) seems fine to me. Attached is the part I plan to get
>>> committed, including commit message etc.
>>>
>> I've pushed this part. Thanks for the patch, Haiying Tang.
>>
>>> The one change I decided to remove is this change in tuplesort_free:
>>>
>>> - long spaceUsed;
>>> + int64 spaceUsed;
>>>
>>> The reason why I think this variable should be 'long' is that we're
>>> using it for this:
>>>
>>> spaceUsed = LogicalTapeSetBlocks(state->tapeset);
>>>
>>> and LogicalTapeSetBlocks is defined like this:
>>>
>>> extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);
>>>
>>> FWIW the "long" is not introduced by incremental sort - it used to be in
>>> tuplesort_end, the incremental sort patch just moved it to a different
>>> function. It's a bit confusing that tuplesort_updatemax has this:
>>>
>>> int64 spaceUsed;
>>>
>>> But I'd argue this is actually wrong, and should be "long" instead. (And
>>> this actually comes from the incremental sort patch, by me.)
>>>
>>>
>>> FWIW while looking at what the other places calling LogicalTapeSetBlocks
>>> do, and I noticed this:
>>>
>>> uint64 disk_used = LogicalTapeSetBlocks(...);
>>>
>>> in the disk-based hashagg patch. So that's a third data type ...
>>>
>> IMHO this should simply switch the current int64 variable to long, as it
>> was before. Not sure about about the hashagg uint64 variable.
> Is there anything that actually limits tape code to using at most 4GB
> on 32-bit systems?

At first glance, I haven't found anything that could limit tape code. It
uses BufFile, which is not limited by the OS file size limit.
Still, If we want to change 'long' in LogicalTapeSetBlocks, we should
probably also update nBlocksWritten and other variables.

As far as I see, the major part of the patch was committed, so l update
the status of the CF entry to "Committed". Feel free to create a new
entry, if you're going to continue working on the remaining issue.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2020-11-23 11:14:49 Re: Parallel Inserts in CREATE TABLE AS
Previous Message Bharath Rupireddy 2020-11-23 10:23:03 Re: Multi Inserts in CREATE TABLE AS - revived patch