Data Systems / 2026-05-22

RDS vs StarRocks 20M Serving and Aggregation Benchmark

This benchmark compares Postgres/RDS and StarRocks for denormalized API serving tables at 20 million rows. It separates low-latency indexed row fetches from broader OLAP aggregations so the serving architecture can be chosen by query shape.

Published Research Native report page

Rows 20M

Snapshot rows in both systems

MV rows 20M

StarRocks async materialized view

MV refresh 113.6 s

Refresh time for the StarRocks MV

Aggregation winner StarRocks

Materialized views dominated broad scans

Problem

Decide whether an OLAP serving layer can replace or complement indexed RDS tables for API read paths at 20 million rows.

Approach

Built equivalent denormalized serving shapes in RDS/Postgres and StarRocks, then timed indexed row fetches and broad aggregations.

Result

RDS stayed faster for small full-card reads, while StarRocks materialized views dominated broad scans and grouped aggregations.

Technologies

AWS
RDS/PostgreSQL
StarRocks
Iceberg
S3 Tables
Materialized Views
Python

Executive summary

The result confirms the storage-engine tradeoff. RDS/Postgres is better for small indexed API-style row fetches, especially top-50 full-card reads. StarRocks is much faster for full-table aggregation over 20 million rows, and async materialized views improve aggregation reads further.

The practical conclusion is to use RDS/Postgres for low-latency indexed row sets and StarRocks with materialized views for analytical API shapes that aggregate or scan many rows.

Benchmark results

Query	RDS p50	RDS p95	StarRocks base p50	StarRocks MV p50	Fastest p50
Snapshot lookup	30.2 ms	50.1 ms	N/A	N/A	RDS
Full cards top 50 by rank	43.1 ms	43.7 ms	195.2 ms	187.9 ms	RDS
Full cards top 1000 by rank	414.6 ms	423.1 ms	216.5 ms	203.6 ms	StarRocks MV
Filtered full cards top 50 by views	39.5 ms	51.0 ms	186.0 ms	177.1 ms	RDS
Aggregate by region over 20M rows	8,012.4 ms	8,911.1 ms	212.3 ms	145.1 ms	StarRocks MV
Aggregate by country/language over 20M rows	4,081.6 ms	5,825.1 ms	280.0 ms	130.5 ms	StarRocks MV
Narrow top 1000 by rank	74.3 ms	230.4 ms	51.1 ms	48.6 ms	StarRocks MV
Count and sum over 20M rows	3,230.2 ms	3,344.4 ms	55.8 ms	54.9 ms	StarRocks MV

Interpretation

For top-50 full-card reads, RDS p50 was 43.1 ms while StarRocks MV p50 was 187.9 ms.
For top-1000 reads, StarRocks MV overtook RDS with a 203.6 ms p50 versus 414.6 ms.
For region aggregation, StarRocks MV delivered 145.1 ms p50 versus 8,012.4 ms on RDS.
For country/language aggregation, StarRocks MV improved base StarRocks from 280.0 ms p50 to 130.5 ms.

Recommendation

A hybrid serving design should use StarRocks when the API serves rollups, counts, sums, grouped summaries, or broad scans. RDS remains competitive and often faster for small selective full-row reads.

The StarRocks OLAP table and materialized views should be ordered and distributed around the actual API filters and sort keys, such as snapshot_id, tenant_id, dataset_id, platform, rank, activity_count, and published_at.

Detailed methodology and results

Supporting methodology, figures, and tables are rendered here as native page content with the same visual system as the rest of this website.

Detailed record of the AWS benchmark comparing the existing RDS/Postgres denormalized benchmark API tables with StarRocks OLAP tables and an async materialized view.

Executive Summary

The benchmark confirms the expected storage-engine tradeoff. RDS/Postgres was faster for small indexed API-style row fetches, especially top-50 full-card reads. StarRocks was much faster for full-table aggregation over 20 million rows, and the async materialized view improved those aggregation reads further.

Objects Touched

RDS/Postgres

Existing tables in the benchmark database were inspected and the main content-card benchmark table was expanded to the requested benchmark size. These objects were preserved after the benchmark.

Table	Rows	Notes
serving_api_snapshots	3	Snapshot metadata for the Analytics API overview.
serving_api_filter_options	66	Filter option metadata for the benchmark API.
serving_api_result_cards	20,005,000	Denormalized content-card serving table. Snapshot 1 was expanded to 20,000,000 rows.

StarRocks

The StarRocks benchmark objects were created under default_catalog.serving_benchmark . Probe objects used to test syntax were removed.

Object	Type	Rows	Status
default_catalog.serving_benchmark.serving_api_result_cards_base	OLAP table	20,000,000	Preserved
default_catalog.serving_benchmark.serving_api_result_cards_mv	Async materialized view	20,000,000	SUCCESS

Methodology

Benchmarks were run in an AWS benchmark environment using a temporary benchmark role.
The RDS table was populated with synthetic Analytics API rows to reach 20 million rows in snapshot 1.
Secondary indexes on the RDS content-card table were dropped before bulk insert, then rebuilt and analyzed afterward.
The StarRocks benchmark table used the same denormalized serving shape as the RDS content-card table, with StarRocks-native type mappings.
The StarRocks async MV was created over the base table and manually refreshed.
Each timed query used 2 warmup executions and 5 measured executions.
Reported p50 is the median of the 5 measured runs. Reported p95 is the nearest-rank p95 over those 5 runs.

RDS State After Load

Snapshot	Rows	Min ID	Max ID	Published range
1	20,000,000	1	20,005,000	2024-05-23 22:06:31 UTC to 2026-05-22 22:15:41 UTC
2	2,500	2	7,499	2026-02-18 19:34:50 UTC to 2026-05-18 19:34:50 UTC
3	2,500	3	7,500	2026-02-18 19:34:50 UTC to 2026-05-18 19:34:50 UTC

Snapshot 1 metadata was updated to reflect the benchmark scale: total_items = 20,000,000 , active_publishers = 100,000 , and total_engagements = 99,011,392,245,710 .

Benchmark Results

Times below are measured in milliseconds. Lower is better. Full-card queries select the denormalized API-facing columns. Narrow queries select only a small subset of numeric columns.

Query	RDS p50	RDS p95	StarRocks Base p50	StarRocks Base p95	StarRocks MV p50	StarRocks MV p95	Fastest p50
Snapshot lookup	30.2	50.1	N/A	N/A	N/A	N/A	RDS
Full cards top 50 by rank	43.1	43.7	195.2	218.2	187.9	190.8	RDS
Full cards top 1000 by rank	414.6	423.1	216.5	253.6	203.6	231.8	StarRocks MV
Filtered full cards top 50 by views	39.5	51.0	186.0	205.0	177.1	183.0	RDS
Aggregate by region over 20M rows	8,012.4	8,911.1	212.3	237.2	145.1	164.2	StarRocks MV
Aggregate by country/language over 20M rows	4,081.6	5,825.1	280.0	283.5	130.5	144.2	StarRocks MV
Narrow top 1000 by rank	74.3	230.4	51.1	56.4	48.6	55.8	StarRocks MV
Count and sum over 20M rows	3,230.2	3,344.4	55.8	76.7	54.9	70.3	StarRocks MV

Raw Timing Runs

The following are the five measured runs for each query after warmup. Values are milliseconds.

RDS/Postgres

Query	Runs	p50	p95	Rows Returned
snapshot_lookup	25.9, 28.7, 50.1, 30.2, 36.3	30.2	50.1	1
cards_rank_top50_full	43.1, 43.3, 43.7, 38.6, 38.5	43.1	43.7	50
cards_rank_top1000_full	423.1, 422.5, 414.6, 362.6, 349.7	414.6	423.1	1000
cards_views_top50_filtered	35.2, 39.1, 51.0, 48.0, 39.5	39.5	51.0	50
aggregate_by_region	4116.7, 8012.4, 4399.3, 8014.7, 8911.1	8012.4	8911.1	4
aggregate_by_country_language	5825.1, 4054.3, 4081.6, 4122.7, 3959.6	4081.6	5825.1	20
narrow_rank_top1000	46.8, 230.4, 60.1, 84.7, 74.3	74.3	230.4	1000
count_sum_all	3218.5, 3188.6, 3344.4, 3230.2, 3270.8	3230.2	3344.4	1

StarRocks Base Table

Query	Runs	p50	p95	Rows Returned
cards_rank_top50_full	192.7, 216.2, 218.2, 195.2, 183.0	195.2	218.2	50
cards_rank_top1000_full	253.6, 231.8, 216.5, 202.4, 205.9	216.5	253.6	1000
cards_views_top50_filtered	169.9, 186.0, 205.0, 160.9, 190.5	186.0	205.0	50
aggregate_by_region	232.9, 212.3, 210.1, 209.0, 237.2	212.3	237.2	5
aggregate_by_country_language	280.7, 273.3, 280.0, 279.3, 283.5	280.0	283.5	12
narrow_rank_top1000	56.4, 42.3, 53.4, 51.1, 38.8	51.1	56.4	1000
count_sum_all	54.2, 51.3, 76.7, 55.8, 56.4	55.8	76.7	1

StarRocks Async Materialized View

Query	Runs	p50	p95	Rows Returned
cards_rank_top50_full	190.8, 181.4, 188.1, 184.5, 187.9	187.9	190.8	50
cards_rank_top1000_full	203.6, 198.4, 231.8, 201.7, 209.5	203.6	231.8	1000
cards_views_top50_filtered	170.4, 177.1, 182.2, 183.0, 170.3	177.1	183.0	50
aggregate_by_region	137.5, 145.1, 136.3, 151.6, 164.2	145.1	164.2	5
aggregate_by_country_language	118.4, 119.3, 130.5, 144.2, 141.1	130.5	144.2	12
narrow_rank_top1000	39.2, 38.9, 48.6, 51.7, 55.8	48.6	55.8	1000
count_sum_all	51.4, 69.2, 54.9, 70.3, 49.9	54.9	70.3	1

Interpretation

Small Indexed Row Fetches

RDS/Postgres was clearly better for the top-50 full-card fetches where the query can use selective B-tree indexes and return a small set of complete rows. The p50 for full cards top 50 by rank was 43.1 ms on RDS versus 187.9 ms on the StarRocks MV .

This supports the claim that row-oriented storage can be better for fetching full rows through an indexed API path.

Larger Top-N Reads

For full cards top 1000 by rank, StarRocks became faster than RDS in this benchmark. RDS p50 was 414.6 ms , while the StarRocks MV p50 was 203.6 ms . This is likely because the larger row count starts to reduce the advantage of row-store index traversal and heap access, while StarRocks can scan and project columnar data efficiently.

Aggregations Across Many Rows

StarRocks was dramatically faster for aggregations over the full 20 million row snapshot. The region aggregation was 8,012.4 ms p50 on RDS versus 145.1 ms p50 on the StarRocks MV . The count/sum query was 3,230.2 ms p50 on RDS versus 54.9 ms p50 on the StarRocks MV .

This strongly supports using StarRocks for aggregate-heavy Analytics API API surfaces, especially if those aggregations can be served from async materialized views.

Base Table vs Async MV

The StarRocks MV was consistently faster than or close to the base table. The difference was most visible in aggregations: aggregate by country/language improved from 280.0 ms p50 on base to 130.5 ms p50 on MV .

Existing RDS Table Schemas

These are the existing denormalized benchmark API table schemas that should inform the StarRocks OLAP table design.

serving_api_snapshots

Column	Postgres Type	Nullable
id	bigint	NO
tenant_id	varchar	NO
workspace_id	varchar	NO
report_id	varchar	NO
platform	varchar	NO
brand_id	varchar	NO
brand_name	varchar	NO
filter_hash	varchar	NO
filters	jsonb	NO
generated_at	timestamptz	NO
title	varchar	NO
industry_vertical_label	varchar	NO
total_items	integer	NO
total_engagements	bigint	NO
brands_tracked	integer	NO
active_publishers	integer	NO
topics_identified	integer	NO
avg_sentiment	numeric	NO
tab_counts	jsonb	NO
created_at	timestamptz	NO
updated_at	timestamptz	NO

Column	Postgres Type	Nullable
id	bigint	NO
tenant_id	varchar	NO
report_id	varchar	NO
platform	varchar	NO
snapshot_id	bigint	NO
filter_type	varchar	NO
option_key	varchar	NO
label	varchar	NO
result_count	integer	NO
ordinal	integer	NO
metadata	jsonb	NO
created_at	timestamptz	NO
updated_at	timestamptz	NO

serving_api_result_cards

Column	Postgres Type	Nullable
id	bigint	NO
snapshot_id	bigint	NO
tenant_id	varchar	NO
report_id	varchar	NO
platform	varchar	NO
source_item_id	varchar	NO
rank	integer	NO
tab_keys	text[]	NO
brand_ids	text[]	NO
vertical_keys	text[]	NO
category_keys	text[]	NO
region_key	varchar	NO
country_code	varchar	NO
language_code	varchar	NO
channel_id	varchar	NO
channel_name	varchar	NO
channel_avatar_url	varchar	NO
creator_badge	varchar	NO
title	varchar	NO
thumbnail_url	varchar	NO
content_url	varchar	NO
published_at	timestamptz	NO
age_label	varchar	NO
views	bigint	NO
likes	bigint	NO
comments	bigint	NO
activity_per_day	numeric	NO
velocity_pct	numeric	NO
sentiment_score	numeric	NO
search_tsv	tsvector	NO
created_at	timestamptz	NO
updated_at	timestamptz	NO

RDS Indexes Restored

The following indexes existed on serving_api_result_cards after the load and rebuild.

StarRocks Design Notes

The StarRocks benchmark was created in the StarRocks native catalog. Fully qualified names were required because the default session database was resolving unqualified operations against the Iceberg catalog.

Type Mapping Used For Benchmark

Postgres Type	StarRocks Mapping	Reason
bigint	BIGINT	Preserve integer scale.
integer	INT	Natural StarRocks equivalent.
varchar/text	VARCHAR	API-facing string fields.
text[]	ARRAY VARCHAR	Retain array semantics for tab, brand, vertical, and category keys.
timestamptz	DATETIME	StarRocks DATETIME was used for benchmark timestamp values.
numeric	DECIMAL	Used for velocity, views per day, and sentiment values.
tsvector	search_text VARCHAR	Postgres-specific full-text vector replaced with plain searchable text.

Benchmark StarRocks MV Shape

The successful StarRocks MV metadata showed:

refresh_type = ASYNC
is_active = true
last_refresh_state = SUCCESS
rows = 20,000,000
query_rewrite_status = VALID

Cleanup Status

Item	Status	Reason
Temporary local helper scripts	Removed	They were only used to run the benchmark.
Temporary Python virtualenv	Removed	It was only used for temporary database clients.
Temporary AdministratorAccess AWS config	Removed	No local admin profile was left behind.
Temporary admin kubeconfig	Removed	No temporary kubeconfig was left behind.
StarRocks syntax probe objects	Removed	Only benchmark objects were preserved.
RDS 20M benchmark data	Preserved	Preserved because it is benchmark data in the existing denormalized API tables.
StarRocks benchmark base table and MV	Preserved	Preserved for follow-up validation and comparison.

Recommendation

For the new Analytics API serving design, the table should live in StarRocks if the API is going to serve analytical rollups, counts, sums, grouped summaries, or broad scans. StarRocks materially outperformed RDS for those access patterns in this benchmark.

If the endpoint primarily returns a small number of full rows through highly selective filters, RDS remains competitive and may be faster at top-50 style API reads. For StarRocks to be the better serving layer for those paths, the OLAP table or MV should be ordered and distributed around the actual API filters and sort keys, for example snapshot_id , tenant_id , report_id , platform , rank , views , and published_at .

The practical path is to create the canonical Analytics API OLAP table and async materialized views in StarRocks, then have this API repo read from those StarRocks objects instead of maintaining a separate RDS materialized-view serving copy.

Executive summary

Benchmark results

Interpretation

Recommendation

Detailed methodology and results

Executive Summary

Objects Touched

RDS/Postgres

StarRocks

Methodology

RDS State After Load

Benchmark Results

Raw Timing Runs

RDS/Postgres

StarRocks Base Table

StarRocks Async Materialized View

Interpretation

Small Indexed Row Fetches

Larger Top-N Reads

Aggregations Across Many Rows

Base Table vs Async MV

Existing RDS Table Schemas

serving_api_snapshots

serving_api_filter_options

serving_api_result_cards

RDS Indexes Restored

StarRocks Design Notes

Type Mapping Used For Benchmark

Benchmark StarRocks MV Shape

Cleanup Status

Recommendation