As RDS databases increase in size, query complexity, or concurrent users—queries that used to run fast, may no longer do so, and people often require a way to correct this. In a recent blog post, Ervin Madaha showed how to accelerate and scale Amazon RDS PostgreSQL query performance by using a PG Nitrous PostgreSQL instance as a fast query replica.
In this post, we detail the query price-performance advantages of PG Nitrous by comparing TPC-H benchmark results for Amazon RDS and PG Nitrous.
In our tests on a 1-terabyte TPC-H dataset, PG Nitrous instances improve RDS query performance by 10x to 38x.
What’s PG Nitrous?
PG Nitrous is PostgreSQL 12 instance, hosted on Amazon EC2 and accelerated by the Swarm64 DA PostgreSQL extension. Therefore, in a PG Nitrous instance, the PostgreSQL query engine is enhanced with greater parallelism, query planning and execution efficiency, faster SQL JOIN algorithms, and columnstore indexing.
Using PG Nitrous as a fast query replica for RDS is a highly performant, easy, and cost-effective way to support large-scale reporting and end-user dashboards or advanced analytics, without having to rearchitect or switch off of PostgreSQL.
Off-loading the queries from RDS onto PG Nitrous may also allow RDS to process transactions faster since OLTP performance can also slow down when the database is supporting mixed transaction and analytic query workloads.
Our query performance test – the TPC-H benchmark
TPC-H is a widely used decision support database performance benchmark. It comprises 8 tables and 22 SQL queries. In our tests, we ran TPC-H at scale-factor 1000 (SF1000) on a 1-terabyte dataset containing 8.67 billion rows within the 8 tables.
- The Swarm64 DA open-source TPC-H benchmarking toolkit was used to execute all stages of the benchmark: data generation, data loading, query execution, and results gathering.
- The queries were executed in a specific order defined by the TPC-H standard in a query stream simulating a single active user (Power test).
- The query stream was run only once after the data loading step. This approach gives a more real-world comparison than warming the caches before executing the query stream or running each query twice.
- Each System Under Test (SUT) had a suitable instance and storage configuration (DB settings and indices) for the TPC-H benchmark.
Two tests – a small and a large PG Nitrous instance
In our tests, we ran two experiments comparing the query performance of Amazon RDS PostgreSQL on the smallest possible instance capable of running >70% of the 22 TPC-H queries without timing out, a 48-vCore r5d.12xlarge, versus:
- a similarly equipped PG Nitrous instance (48-vCore r5.12xlarge)
- the smallest available PG Nitrous instance (16-vCore r5.4xlarge)
The table below shows a summary of the selected instances and storage configuration for each SUT.
|PG Nitrous – 4xl||PG Nitrous – 12xl||RDS PostgreSQL|
|PostgreSQL version||Postgres 12||Postgres 12||Postgres 12|
|Virtual CPU cores||16||48||48|
|Storage||EBS gp2 2.4 TB||EBS gp2 7 TB||EBS gp2 4 TB|
|Number of instances||1||1||1|
RDS PostgreSQL setup
We also made some changes to the default RDS configuration to increase its parallelism and optimize it for query performance. To the best of our knowledge, the changes give PostgreSQL the best performance when running the TPC-H benchmark in our testing environment. The most relevant changes are listed below:
We also defined indexes for the TPC-H tables/queries in RDS PostgreSQL, which you can review here.
PG Nitrous setup
We set up the PG Nitrous instances similar to RDS PostgreSQL, using the same PostgreSQL parameter settings listed above for the similarly equipped instance and scaled-down values for the small instance.
Instead of using standard PostgreSQL indexes, PG Nitrous replicas use the Swarm64 DA columnstore index.
Comparing PG Nitrous to RDS PostgreSQL on systems with equal resources.
The graph below shows the runtimes of all TPC-H queries of PG Nitrous on r5.12xlarge and RDS PostgreSQL on r5d.12xlarge.
- PG Nitrous processes the entire 22-query stream within 0.43 hour.
- RDS PostgreSQL is unable to execute two of the queries (Q18 & Q21), and three queries (Q9, Q17, and Q20) each take longer than 1 hour to execute.
Now, let’s compare runtimes taking into account only the queries that RDS PostgreSQL can respond within 1 h., excluding the 5 queries mentioned above. In this case, the total runtime for the remaining 17 queries for RDS PostgreSQL is 9.22 h., while for PG Nitrous, it is 0.24 h. PG Nitrous delivers an overall speedup of 38x over RDS.
Notice that the 5 excluded from this calculation queries represent a significant portion of the total PG Nitrous runtime (0.19 h. from a total of 0.43 h.), not surprising that RDS PostgreSQL struggles to return results in 1h.
When analyzing each query individually, PG Nitrous query speedup varies between 10x for Q22 and 93x for Q4 with an average query speedup of 36x.
Comparing RDS PostgreSQL to PG Nitrous running on a smaller instance.
The graph below shows how the runtime of RDS PostgreSQL compares to PG Nitrous when the smallest possible PG Nitrous instance (16-core r5.4xlarge) is chosen.
PG Nitrous executes 21 queries in 1.23 h. It timed out on TPC-H Query 18, which joins the two biggest tables (Orders and Lineitem) in a correlated subquery (to successfully run this query with PG Nitrous on a 1 TB dataset, the instance should be an r5.8xlarge or larger).
If we compare only the queries for which RDS PostgreSQL returns results within 1 hour, PG Nitrous executes those 17 queries in 0.90 h, while RDS needs 9.22 h. on a 3x bigger instance (r5d.12xlarge).
PG Nitrous – linearly scalable query performance
One other observation worth noting is the fact that PG Nitrous query performance scales linearly as CPU core count changes. The 3x difference in core count for our two PG Nitrous instances (16 vs. 48 cores) resulted in a 3x difference in query performance
The table below summarizes the results for each SUT.
|PG Nitrous – 4xl||PG Nitrous – 12xl||RDS PostgreSQL|
|Total runtime (h.)||1.23||0.43||9.22|
|# queries completed||21||22||17|
|# queries that cannot run||1||0||2|
|Average query runtime (sec.)||206||71||1,952|
|Geometric mean (sec.)||148||58||1,324|
|Cost (3-year subscription)||$23,208||$69,625||$57,175|
The prices shown in the table above include instance and license cost. Costs are calculated based on reserves prices with a 3-year upfront payment.
This post has shown TPC-H SF 1000 benchmark results for PG Nitrous and RDS PostgreSQL.
With the speedup delivered by PG Nitrous you can:
1. Run the same workload faster in much smaller machines and reduce TCO. The TPC-H workload runs 10x faster with PG Nitrous in a 2.5x cheaper instance compared to RDS.
2. Run analytic queries 38x faster than RDS in an instance with equal resources with a 20% cost increase.
3. Since PG Nitrous is 100% PostgreSQL, with some simple steps you can add it to your existing Amazon RDS PostgreSQL deployments as read-replica, and quickly experience a performance boost in your query performance.
4. Reduce database performance tuning cost by solving your query performance issues in an easy and effective way. Run more complex queries or serve more users with your existing SaaS application.
Try PG Nitrous it yourself
I look forward to your feedback about PG Nitrous. If you have any questions or comments, let us know.