Zedstore is an open source, compressed columnar storage layer for Postgres, developed from the ground up with the goal of being incorporated into core Postgres. Column-oriented storage has often been discussed in the Postgres community and is a popular ask from enterprise users. That demand was a major driver for the introduction of the table access method (AM) API into core Postgres in early 2019. The API, as its name suggests, provides a way to plug in a custom storage layer, and Zedstore is a strict implementation of that API. The Zedstore project started in early 2019, after the table access method API was committed to Postgres, as a collaboration between Heikki Linnakangas, a longtime Postgres committer, and members of the Greenplum Database Server team.
Greenplum is an open source MPP (massively parallel processing) database, forked from Postgres, with its own columnar storage engine. Zedstore was envisioned as a drop-in replacement for this engine, as well as a column store that every Postgres user could use. Because the Greenplum storage implementation predates the table access method API, it is entrenched in the Postgres backend, which makes merging with upstream Postgres difficult. What’s more, this entrenchment limits the number of features that can be realized without reimplementation, owing to decisions around the Postgres architecture. Zedstore is cleaner and far less invasive, and it uses the same Postgres infrastructure as Postgres’ row store. Case in point: it leverages Postgres’ buffer manager and, by extension, Postgres’ storage manager (which controls how Postgres talks to the file system). It co-locates metadata and data in the same physical file (relfile in Postgres parlance), obviating the need for additional files or catalog tables. It also repurposes Postgres’ fixed-size page layout to its advantage. All of this means that, to the end user and the DBA, a Zedstore table looks and feels the same as a heap table.
A side goal for Zedstore as a project is to improve the table access method interface in general and to ensure that other parts of the Postgres backend are access method-aware. For instance, to make Zedstore crash-safe, we had to write custom WAL (write-ahead log) records, and supporting them required patches to core Postgres. Similarly, we had to patch parts of the Postgres query planner and the table AM API itself to generate and pass column projection information to the storage layer. We aim to contribute these patches upstream so that the Postgres code is more amenable to other AMs, while continuing to optimize Zedstore itself.
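To make the value of column projection concrete, here is a minimal Python sketch. It is not Zedstore code and does not use the Postgres table AM API; `ToyColumnStore`, `scan`, and `columns_read` are hypothetical names invented for illustration. The assumption is simply that a column store keeps each column separately, so a scan that is told which columns the query needs can skip the rest.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


class ToyColumnStore:
    """Hypothetical in-memory column store: one list of values per column."""

    def __init__(self, columns: Dict[str, List[object]]) -> None:
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")
        self._columns = columns
        self.columns_read = 0  # how many columns scans have touched so far

    def scan(self, projection: Sequence[str]) -> Iterator[Tuple[object, ...]]:
        """Return an iterator of rows holding only the projected columns."""
        # Fetch just the requested columns; the others are never touched.
        selected = [self._columns[name] for name in projection]
        self.columns_read += len(selected)
        return zip(*selected)


# Illustrative data only.
store = ToyColumnStore(
    {
        "id": [1, 2, 3],
        "name": ["a", "b", "c"],
        "price": [9.5, 3.0, 7.25],
    }
)

# A query such as "SELECT id, price ..." needs only two of the three columns.
rows = list(store.scan(["id", "price"]))
print(rows)
print(store.columns_read)
```

In a row store, every column of a row is read together regardless of what the query asks for. That is why the planner has to hand the projection down to the storage layer before a column store can benefit from it.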
Apart from supporting and optimizing for OLAP workloads, Zedstore aims to support the entire feature set that the heap access method offers (MVCC compliance, crash safety, replication, updates, deletes, singleton inserts, all flavors of indexes, vacuum and storage-aware query plans) in order to be truly HTAP (hybrid transactional/analytical processing) compliant. Zedstore offers a fresh take on column store design, opting for B-Trees as the primary on-disk structure. Its flexible design allows for features such as column families, or even its use as a compressed row store.
The VMware team presented Zedstore at OSS North America 2020, and the replay is available after free registration on the event site. The talk discusses the kinds of workloads column stores are specialized for, details how Zedstore performs on those workloads, and dives into the value of its unique architecture. It also covers how Zedstore is benchmarked, how to monitor Zedstore deployments, and open areas of development and performance improvement.
Stay tuned to the Open Source Blog for more updates on Zedstore and PostgreSQL.