Proposal: 64-bit Extension in PostgreSQL 8.0.x

July 7th, 2005
Koichi Suzuki (NTT DATA Intellilink)

1. Background and Purpose

64-bit CPU architectures such as EM64T, AMD64 and IA64 are becoming more and more popular in Intel-based servers. Servers built on such CPUs can provide much more memory: tens of gigabytes are available in a node, typically 16 gigabytes or so. Obviously, 32-bit Linux and its applications can run on such a machine. However, from each process's point of view, the size of available memory is limited by the 32-bit address space, which is typically split into 1GB for the kernel and 3GB of user space for each process.

The PostgreSQL server uses shared memory to hold shared data, and much more of it should be available as PostgreSQL handles bigger databases. For this purpose, we need to extend PostgreSQL into a 64-bit program and make shared memory management handle shared memory beyond the 32-bit limit.

On the other hand, PostgreSQL is going to be used in mission-critical systems in enterprise environments. In such applications, we need to provide long periods of operation without stopping service. Currently, PostgreSQL's transaction ID is limited to a 32-bit integer, and when it is about to run out, we have to stop database operation and run VACUUM FREEZE so that older transaction ID values can be reused. VACUUM FREEZE scans the whole database space, so as databases in operation grow bigger, it takes longer. And because PostgreSQL is going to be used in busier systems, transaction IDs tend to run out sooner. To provide longer continuous operation, we need to make the transaction ID 64-bit.

2. How can we do it?

In principle, these two extensions are very simple. We can locate the relevant definitions and change them into 64-bit ones. However, there is much more to be done. We have to find every line that deals with these values and modify it so that no calculation loses precision.
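To illustrate the kind of precision loss meant here, the following is a minimal sketch and not code from the PostgreSQL source. The names buffer_bytes_32 and buffer_bytes_64 and the BLOCKSIZE value of 8KB are assumptions made for this example only. It shows how a byte count computed in 32-bit arithmetic wraps around, while widening one operand before the multiplication keeps the full value.

```c
#include <stdint.h>

/* Assumed block size of 8KB; illustrative only. */
#define BLOCKSIZE 8192u

/*
 * Hypothetical helper: computes the buffer pool size in 32-bit
 * arithmetic. The product wraps modulo 2^32 once it reaches 4GB.
 */
uint32_t buffer_bytes_32(uint32_t nbuffers)
{
    return nbuffers * BLOCKSIZE;
}

/*
 * Hypothetical helper: widens the operand to 64 bits before the
 * multiplication, so the full byte count is preserved.
 */
uint64_t buffer_bytes_64(uint32_t nbuffers)
{
    return (uint64_t) nbuffers * BLOCKSIZE;
}
```

With 2,097,152 buffers (16GB at 8KB per block), buffer_bytes_32 returns 0 because the product is exactly 2^34, while buffer_bytes_64 returns 17179869184. Every expression of this shape has to be found and widened.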
Detailed results will be given later.

3. Environment

Our code assumes the following:

1) The PostgreSQL server (i.e. the postmaster and postgres processes) runs only on server machines with a 64-bit CPU, that is, EM64T, AMD64 or IA64.
2) Both 64-bit (EM64T, AMD64 and IA64) and 32-bit (IA32) CPUs are allowed as clients.
3) Currently, we support only the Linux 2.6.x kernel.

4. Specification changes

Due to these two changes, PostgreSQL's specification changes as follows:

4.1 Shared memory size

The shared memory size is specified by the shared_buffers entry in the postgresql.conf file. In a 32-bit environment, it is limited to INT_MAX/BLOCKSIZE. The new limit is INT_MAX/2. This value specifies the number of blocks, so the actual memory that can be specified by this parameter is (INT_MAX/2)*BLOCKSIZE. A typical BLOCKSIZE is 8KB, so the typical maximum shared memory is about 8TB. This is beyond the limit of Linux user space for EM64T (512GB) and should be sufficient.

The reason why the shared memory size specification is limited to INT_MAX/2 is as follows: in 8.0.x buffer management, (buffer_number)*2 is used to produce buffer IDs, and this whole subsystem is still based on 32-bit calculation. We would like to keep the change as small as possible, and this limit has no bad influence on the maximum size of actual shared memory.

4.2 Transaction ID (XID)

The type of the transaction ID is pre-defined in the catalog, and there is no way for users to redefine it. In this extension, transaction IDs are handled as follows:

1) In the catalog, transaction IDs (such as XMIN and XMAX) are given the type "XID", as in the 32-bit environment.
2) If an application in a 32-bit environment tries to read a value of the type "XID", it has to read this value as an "unsigned long long" value.
3) In a 64-bit environment, a value of the type "XID" can be handled as an "unsigned long" value, depending on the compiler.
4) The length of the type "XID" is stored in the catalog "pg_type".
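
As an illustration of items 2) and 3) above, the following is a minimal sketch of how a client could read an XID value portably. It assumes the client receives the value in text form (as libpq's PQgetvalue returns for text-format results). The helper name parse_xid_text is hypothetical and is not part of libpq or of this extension. Since "unsigned long long" is at least 64 bits wide on both 32-bit and 64-bit clients, the same code can serve both.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a libpq function): converts the text form
 * of an XID into an unsigned long long.
 * Returns 0 on success and -1 on malformed or out-of-range input.
 */
int parse_xid_text(const char *text, unsigned long long *xid_out)
{
    char *end;
    unsigned long long value;

    /*
     * Require a leading digit. strtoull itself would accept leading
     * whitespace and a sign, which are not valid in an XID.
     */
    if (text == NULL || !isdigit((unsigned char) text[0]))
        return -1;

    errno = 0;
    value = strtoull(text, &end, 10);

    /* Reject values that overflow and inputs with trailing characters. */
    if (errno == ERANGE || *end != '\0')
        return -1;

    *xid_out = value;
    return 0;
}
```

A value such as "4294967296", which no longer fits into 32 bits, is parsed correctly by this sketch, whereas a client that stored the result in a 32-bit variable would silently truncate it.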

4.3 Configure

We have added two configuration options:

--enable-64bit-shared-memory
    enables 64-bit shared memory by defining USE_64BIT_SHARED.

--enable-64bit-transaction-id
    enables 64-bit transaction IDs by defining USE_64BIT_XID.

5. Acknowledgement

I would like to thank Mr. Tom Lane, Mr. Bruce Momjian and Mr. Jan Wieck for their stimulating discussions. Many people from Fujitsu Ltd., NTT DATA and NTT DATA Intellilink helped me.