Re: A Windows x64 port of PostgreSQL

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Ken Camann <kjcamann(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: A Windows x64 port of PostgreSQL
Date: 2008-07-03 07:19:49
Message-ID: 486C7D95.5010306@mark.mielke.cc
Lists: pgsql-hackers

A bit long - the summary is that "intptr_t" should probably be used,
assuming I understand the problem this thread is talking about:

Ken Camann wrote:
> 1. An object in memory can have size "Size" (= size_t). So its big
> (maybe 8 bytes).
> 2. An index into the buffer containing that object has index "Index"
> (= int) So its smaller (maybe 4 bytes). Now you can't index your big
> object, unless sizeof(size_t) = sizeof(int). But sizeof(size_t) must
> be at least 8 bytes on just about any 64-bit system. And sizeof(int)
> is still 4 most of the time, right

I believe one of the mistakes here is the assumption that "int" is
always the correct type to use for an index. It is not. "int" is
probably the most efficient word size for the target machine, and since
"int" is usually 32 bits these days, its range is sufficient for most
common operations, which is why it is commonly used. But the C and C++
specifications do not say that an index into an array is of type "int".
Rather, they define E1[E2] as *((E1) + (E2)), and the + operator is
defined such that if operand E1 is a pointer and operand E2 has an
integer type, the result is a pointer to the E2th element of E1 with
the same pointer type as E1. "Integer type" does not mean "int". It
means any integer type. If the useful range of the array is 256 values,
an "unsigned char" is acceptable, because "unsigned char" is an integer
type. The optimizer might promote it to a 32-bit or 64-bit machine
register before calculating the result of the addition, but this is
irrelevant to the definition of the C language.
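
To make the E1[E2] point concrete, here is a small sketch. The helper
names are made up for illustration and are not taken from PostgreSQL.
It reads the same array once with an "unsigned char" subscript and once
with a size_t offset written as explicit pointer arithmetic:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (hypothetical name): subscript with a narrow
 * integer type. unsigned char is an integer type, so it is a valid
 * subscript. */
int element_at_uchar(const int *table, unsigned char i)
{
    assert(table != NULL);
    return table[i];
}

/* The same read with a size_t offset, spelled as the pointer
 * arithmetic that the subscript operator is defined in terms of. */
int element_at_size(const int *table, size_t i)
{
    assert(table != NULL);
    return *(table + i);
}
```

Both functions compile to the same kind of address calculation. Only
the range of the index type differs.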

I think one could successfully argue that ptrdiff_t is the correct type
to use for an array index that might need a range larger than "int" on a
machine where sizeof(int) < sizeof(void*). ptrdiff_t represents the
difference between two pointers. If P and Q are char* pointing into the
same array, I is ptrdiff_t, and Q - P == I, then &P[I] == Q. Though, I
think it might be easier to use size_t. If I is of type size_t, and
P = malloc(I) succeeds, then P[0] ... P[I-1] are guaranteed to be
addressable using a size_t.
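
A minimal sketch of both options, assuming char* pointers into a single
malloc'd buffer (the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Distance, in elements, between two pointers into the same array.
 * The result type of pointer subtraction is ptrdiff_t. */
ptrdiff_t index_of(const char *base, const char *elem)
{
    return elem - base;
}

/* Walk an n-byte buffer with a size_t index. Every offset from 0 to
 * n-1 is representable, because n itself is a size_t. */
size_t fill_buffer(char *p, size_t n, char value)
{
    size_t i;

    assert(p != NULL || n == 0);
    for (i = 0; i < n; i++)
        p[i] = value;
    return n;
}
```

With P = malloc(16) and Q = P + 7, index_of(P, Q) gives 7 and &P[7] is
Q again, which is the identity described above.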

There is also the usable range, even on a machine with a 64-bit size_t.
I don't think any existing machine can actually address 64 bits worth
of contiguous memory. 48 bits perhaps. Technically, sizeof(size_t) does
not need to equal sizeof(void*), and in fact, the C standard has this
to say: "The types used for size_t and ptrdiff_t should not have an
integer conversion rank greater than that of signed long int unless the
implementation supports objects large enough to make this necessary." It
doesn't define sizeof(size_t) in terms of sizeof(void*).

The C standard specifies the minimum limits of long int as:
"Their implementation-defined values shall be equal or greater in
magnitude (absolute value) to those shown, with the same sign.
...
— minimum value for an object of type long int
LONG_MIN -2147483647 // −(2**31 − 1)
— maximum value for an object of type long int
LONG_MAX +2147483647 // 2**31 − 1"

Based upon this definition, it appears that Windows 64, where long int
stays 32 bits wide, is compatible with the standard. That GCC on most
64-bit Unix-like platforms took a different route (a 64-bit long int)
that is also compatible with the standard is inconvenient, but it is a
reality that should be dealt with.
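
Code that wants to know which route the current compiler took can ask
it directly. This is only a sketch with made-up function names. It
assumes nothing beyond <limits.h> and sizeof:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Storage width of long int in bits on the current platform. */
size_t long_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}

/* Storage width of an object pointer in bits. */
size_t pointer_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}

/* Nonzero when long is at least as wide as a pointer. The assert
 * restates the minimum the standard requires of LONG_MAX. */
int long_can_hold_pointer(void)
{
    assert(LONG_MAX >= 2147483647L);
    return sizeof(long) >= sizeof(void *);
}
```

On a platform with 32-bit long and 64-bit pointers,
long_can_hold_pointer() returns 0, and any code that stores a pointer
or a size in a long needs another type.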

More comments from the C standard on this issue: "Any pointer type may
be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in
the integer type, the behavior is undefined. The result need not be in
the range of values of any integer type."

The "portable" answer to this problem is supposed to be intptr_t:
"7.18.1.4 Integer types capable of holding object pointers
The following type designates a signed integer type with the property
that any valid pointer to void can be converted to this type, then
converted back to pointer to void, and the result will compare equal to
the original pointer:
intptr_t
The following type designates an unsigned integer type with the property
that any valid pointer to void can be converted to this type, then
converted back to pointer to void, and the result will compare equal to
the original pointer:
uintptr_t
These types are optional."

If Windows 64 has this type (not sure - I don't use Windows 64), then I
believe intptr_t is the portable way to solve this problem. Note,
though, that intptr_t is not guaranteed to hold every integer value.
For example, on a 32-bit platform, intptr_t might be 32 bits wide while
long long is 64 bits wide. There is also this portable type:
" 7.18.1.5 Greatest-width integer types
The following type designates a signed integer type capable of
representing any value of any signed integer type:
intmax_t
The following type designates an unsigned integer type capable of
representing any value of any unsigned integer type:
uintmax_t
These types are required."
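
As a sketch of the round-trip guarantee quoted from 7.18.1.4, assuming
the implementation provides the optional intptr_t in <stdint.h> (the
function name is made up):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pointer to intptr_t and back. Where intptr_t exists, the
 * standard says the result compares equal to the original pointer. */
void *round_trip(void *p)
{
    intptr_t as_integer = (intptr_t) p;
    void *back = (void *) as_integer;

    assert(back == p);
    return back;
}
```

The reverse direction has no such guarantee: an arbitrary long long or
intmax_t value need not fit in intptr_t.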

I think this means that if PostgreSQL were to be designed to support all
ISO C compliant platforms, PostgreSQL would have to use a union of
intptr_t and intmax_t. Or, PostgreSQL will choose to not support some
platforms. Windows 64 seems as if it may continue to be as popular as
Windows 32, and should probably be supported.
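
One possible reading of "a union of intptr_t and intmax_t", as a rough
sketch only. The type and function names below are hypothetical and are
not PostgreSQL's, and it again assumes the optional intptr_t exists:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value type wide enough for either a converted pointer
 * or the widest signed integer the implementation supports. */
typedef union
{
    intptr_t as_pointer_bits;
    intmax_t as_integer;
} WideValue;

WideValue wide_from_pointer(void *p)
{
    WideValue v;

    v.as_integer = 0;               /* clear the wider member first */
    v.as_pointer_bits = (intptr_t) p;
    assert((void *) v.as_pointer_bits == p);
    return v;
}

WideValue wide_from_integer(intmax_t i)
{
    WideValue v;

    v.as_integer = i;
    return v;
}

void *wide_to_pointer(WideValue v)
{
    return (void *) v.as_pointer_bits;
}

intmax_t wide_to_integer(WideValue v)
{
    return v.as_integer;
}
```

The caller has to remember which member was stored, as with any C
union. The size of the union is whatever the larger member needs on the
platform at hand.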

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>
