Apache Cloudberry (Incubating) 2.0.0 Released - MPP Database for Analytics & AI
Posted on 2025-09-02 by Apache Cloudberry
The Apache Cloudberry (Incubating) community is pleased to announce the release of Apache Cloudberry 2.0.0, the project’s first official release under the Apache Software Foundation.
We’d like to express our gratitude to all contributors to this release, as well as to the mentors and the Apache Incubator community for their invaluable support. This significant milestone reflects a collaborative effort to meet ASF release requirements and establish Cloudberry as an open and community-driven project.
Key Highlights of Version 2.0.0
- PostgreSQL 14 Foundation: Built on PostgreSQL 14.x, bringing stable PostgreSQL features and improvements to the distributed analytics environment
- Performance Improvements:
  - Optimized Query Execution: Significant improvements in distributed query planning and execution
  - Enhanced Resource Management: Better memory and CPU utilization across cluster nodes
  - Improved Parallel Processing: More efficient data distribution and parallel query processing
- Backup and Recovery: Improved backup strategies for distributed environments
- Dynamic Tables: A new feature that enables automatic, scheduled refresh of query results, designed for scenarios requiring up-to-date data, such as real-time analytics, lakehouse architectures, and automated ETL pipelines
- PAX Storage Format: Introduces the PAX (Partition Attributes Across) storage format, a hybrid approach that combines the advantages of row-based and column-based storage. PAX delivers high performance for both data writes and analytical queries, making it well-suited for OLAP workloads and large-scale data analysis
- ASF Compliance: Updates on license headers, LICENSE/NOTICE/DISCLAIMER files, refined dependency attributions, and more
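For readers unfamiliar with the PAX idea mentioned in the highlights, the following Python sketch illustrates the general layout: rows are grouped into pages, and within each page the values are stored column by column. This is a conceptual toy only, not Cloudberry's on-disk format or code; the helper names (`to_pax_pages`, `scan_column`, `read_row`), the sample data, and the page size are all assumptions made for illustration.

```python
from typing import Any

Row = dict[str, Any]
Page = dict[str, list[Any]]


def to_pax_pages(rows: list[Row], columns: list[str], rows_per_page: int) -> list[Page]:
    """Split rows into fixed-size pages; inside each page, keep one list per column."""
    pages: list[Page] = []
    for start in range(0, len(rows), rows_per_page):
        chunk = rows[start:start + rows_per_page]
        pages.append({col: [row[col] for row in chunk] for col in columns})
    return pages


def scan_column(pages: list[Page], column: str) -> list[Any]:
    """Analytical-style read: visit only one column's values in every page."""
    return [value for page in pages for value in page[column]]


def read_row(pages: list[Page], columns: list[str], index: int, rows_per_page: int) -> Row:
    """Row-style read: all attributes of a given row sit in the same page."""
    page = pages[index // rows_per_page]
    offset = index % rows_per_page
    return {col: page[col][offset] for col in columns}


# Hypothetical sample data, used only to exercise the sketch.
COLUMNS = ["id", "region", "amount"]
ROWS = [
    {"id": 1, "region": "eu", "amount": 10.0},
    {"id": 2, "region": "us", "amount": 25.5},
    {"id": 3, "region": "eu", "amount": 7.25},
]
PAGES = to_pax_pages(ROWS, COLUMNS, rows_per_page=2)

amounts = scan_column(PAGES, "amount")
third_row = read_row(PAGES, COLUMNS, index=2, rows_per_page=2)
```

In this toy layout, a column scan touches a single list per page, while reassembling a row never has to leave its page, which is the intuition behind describing PAX as a hybrid of row-based and column-based storage.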
Download
Apache Cloudberry 2.0.0 is available for download at: https://cloudberry.apache.org/releases.
Useful links
About Apache Cloudberry
Apache Cloudberry (Incubating) is an open-source Massively Parallel Processing (MPP) database for large-scale data analytics, derived from PostgreSQL and the last open-source version of Greenplum Database. It is designed to support both on-premises and cloud deployments, providing a scalable foundation for data warehousing and advanced analytics.