We just released Swarm64 DA 5.4.0, which features some important improvements to compressed columnstore index performance.
Columnstore indexes were introduced in version 5.0 last September as a new, easier way to get much faster query performance in PostgreSQL—especially for databases that support large-scale reporting, or complex queries.
Columnstore indexes – faster queries (but not always–hmmmm)
Since their introduction, we noticed that columnstore indexes on new tables, worked very fast. But on established, older schemas the indexes would sometimes have a smaller or inconsistent effect. We dug into the issue and discovered the cause… btrees!
Prior to installing the Swarm64 DA extension, many PostgreSQL users tried solving query performance issues by defining tons of btree indixes on their tables. This created two problems for columnstore indexes:
1. The PostgreSQL planner incorrectly thinks that one of the many btree indexes will perform better in an analytical query than the Swarm64 DA columnstore index.
2. Columnstore indexes are updated by the vacuum. With many btree indexes, the vacuum cycles become so long, that eventually the columnstore cache is almost devoid of valid data.
Columnstore index & btree index harmony in 5.4
Although the columnstore index typically replaces many of the btree indexes, in scenarios with mixed workloads (analytical and transactional queries) different queries benefit from different indexing mechanisms, so harmonious cooperation of btree and columnstore indexes is a great advantage. So, in 5.4.0, we added more harmony with the following improvements:
- We tweaked the cost estimations so that the planner makes a smarter decision about when to use a columnstore or a btree index. This leads to the best execution runtime for each query. For example a point lookup is better served by a btree index, while a larger scan for an aggregation is best served by a columnstore index.
- We decoupled the columnstore index update from autovacuum, and instead the columnstore now uses it’s own background worker to update the columnstore cache. On top of that, we improved management of the cache data on disk.
To further improve the interaction between the columnstore and PostgreSQL, we decided to add some functionality of the table access method interface, a fantastic work by the community that is only available since PostgreSQL 12. Therefore Swarm64 DA 5.4.0 only works with PostgreSQL versions 12 and 13.
In addition to the columnstore indexing improvements, we’ve also added support for deploying Swarm64 DA on Debian 10. This is the operating system chosen by the official PostgreSQL docker images, so now you can easily create your custom Swarm64 DA container.