Back to work

Data Systems / 2026-05-22

RDS vs StarRocks 20M Serving and Aggregation Benchmark

This benchmark compares Postgres/RDS and StarRocks for denormalized API serving tables at 20 million rows. It separates low-latency indexed row fetches from broader OLAP aggregations so the serving architecture can be chosen by query shape.

Published Benchmark Technical writeup
Rows 20M

Snapshot rows in both systems

MV rows 20M

StarRocks async materialized view

MV refresh 113.6 s

Refresh time for the StarRocks MV

Aggregation winner StarRocks

Materialized views dominated broad scans

Problem

Decide whether an OLAP serving layer can replace or complement indexed RDS tables for API read paths at 20 million rows.

Approach

Built equivalent denormalized serving shapes in RDS/Postgres and StarRocks, then timed indexed row fetches and broad aggregations.

Result

RDS stayed faster for small full-card reads, while StarRocks materialized views dominated broad scans and grouped aggregations.

Technologies
  • AWS
  • RDS/PostgreSQL
  • StarRocks
  • Iceberg
  • S3 Tables
  • Materialized Views
  • Python

Executive summary

The result confirms the storage-engine tradeoff. RDS/Postgres is better for small indexed API-style row fetches, especially top-50 full-card reads. StarRocks is much faster for full-table aggregation over 20 million rows, and async materialized views improve aggregation reads further.

The practical conclusion is to use RDS/Postgres for low-latency indexed row sets and StarRocks with materialized views for analytical API shapes that aggregate or scan many rows.

Benchmark results

QueryRDS p50RDS p95StarRocks base p50StarRocks MV p50Fastest p50
Snapshot lookup30.2 ms50.1 msN/AN/ARDS
Full cards top 50 by rank43.1 ms43.7 ms195.2 ms187.9 msRDS
Full cards top 1000 by rank414.6 ms423.1 ms216.5 ms203.6 msStarRocks MV
Filtered full cards top 50 by views39.5 ms51.0 ms186.0 ms177.1 msRDS
Aggregate by region over 20M rows8,012.4 ms8,911.1 ms212.3 ms145.1 msStarRocks MV
Aggregate by country/language over 20M rows4,081.6 ms5,825.1 ms280.0 ms130.5 msStarRocks MV
Narrow top 1000 by rank74.3 ms230.4 ms51.1 ms48.6 msStarRocks MV
Count and sum over 20M rows3,230.2 ms3,344.4 ms55.8 ms54.9 msStarRocks MV

Interpretation

  • For top-50 full-card reads, RDS p50 was 43.1 ms while StarRocks MV p50 was 187.9 ms.
  • For top-1000 reads, StarRocks MV overtook RDS with a 203.6 ms p50 versus 414.6 ms.
  • For region aggregation, StarRocks MV delivered 145.1 ms p50 versus 8,012.4 ms on RDS.
  • For country/language aggregation, StarRocks MV improved base StarRocks from 280.0 ms p50 to 130.5 ms.

Recommendation

A hybrid serving design should use StarRocks when the API serves rollups, counts, sums, grouped summaries, or broad scans. RDS remains competitive and often faster for small selective full-row reads.

The StarRocks OLAP table and materialized views should be ordered and distributed around the actual API filters and sort keys, such as snapshot_id, tenant_id, dataset_id, platform, rank, activity_count, and published_at.

Detailed methodology and results

Supporting methodology, figures, and tables are rendered here as native page content with the same visual system as the rest of this website.

Detailed record of the AWS benchmark comparing the existing RDS/Postgres denormalized benchmark API tables with StarRocks OLAP tables and an async materialized view.

Executive Summary

The benchmark confirms the expected storage-engine tradeoff. RDS/Postgres was faster for small indexed API-style row fetches, especially top-50 full-card reads. StarRocks was much faster for full-table aggregation over 20 million rows, and the async materialized view improved those aggregation reads further.

Objects Touched

RDS/Postgres

Existing tables in the benchmark database were inspected and the main content-card benchmark table was expanded to the requested benchmark size. These objects were preserved after the benchmark.

TableRowsNotes
serving_api_snapshots3Snapshot metadata for the Analytics API overview.
serving_api_filter_options66Filter option metadata for the benchmark API.
serving_api_result_cards20,005,000Denormalized content-card serving table. Snapshot 1 was expanded to 20,000,000 rows.

StarRocks

The StarRocks benchmark objects were created under default_catalog.serving_benchmark . Probe objects used to test syntax were removed.

ObjectTypeRowsStatus
default_catalog.serving_benchmark.serving_api_result_cards_baseOLAP table20,000,000Preserved
default_catalog.serving_benchmark.serving_api_result_cards_mvAsync materialized view20,000,000SUCCESS

Methodology

  • Benchmarks were run in an AWS benchmark environment using a temporary benchmark role.
  • The RDS table was populated with synthetic Analytics API rows to reach 20 million rows in snapshot 1.
  • Secondary indexes on the RDS content-card table were dropped before bulk insert, then rebuilt and analyzed afterward.
  • The StarRocks benchmark table used the same denormalized serving shape as the RDS content-card table, with StarRocks-native type mappings.
  • The StarRocks async MV was created over the base table and manually refreshed.
  • Each timed query used 2 warmup executions and 5 measured executions.
  • Reported p50 is the median of the 5 measured runs. Reported p95 is the nearest-rank p95 over those 5 runs.

RDS State After Load

SnapshotRowsMin IDMax IDPublished range
120,000,000120,005,0002024-05-23 22:06:31 UTC to 2026-05-22 22:15:41 UTC
22,50027,4992026-02-18 19:34:50 UTC to 2026-05-18 19:34:50 UTC
32,50037,5002026-02-18 19:34:50 UTC to 2026-05-18 19:34:50 UTC

Snapshot 1 metadata was updated to reflect the benchmark scale: total_items = 20,000,000 , active_publishers = 100,000 , and total_engagements = 99,011,392,245,710 .

Benchmark Results

Times below are measured in milliseconds. Lower is better. Full-card queries select the denormalized API-facing columns. Narrow queries select only a small subset of numeric columns.

QueryRDS p50RDS p95StarRocks Base p50StarRocks Base p95StarRocks MV p50StarRocks MV p95Fastest p50
Snapshot lookup30.250.1N/AN/AN/AN/ARDS
Full cards top 50 by rank43.143.7195.2218.2187.9190.8RDS
Full cards top 1000 by rank414.6423.1216.5253.6203.6231.8StarRocks MV
Filtered full cards top 50 by views39.551.0186.0205.0177.1183.0RDS
Aggregate by region over 20M rows8,012.48,911.1212.3237.2145.1164.2StarRocks MV
Aggregate by country/language over 20M rows4,081.65,825.1280.0283.5130.5144.2StarRocks MV
Narrow top 1000 by rank74.3230.451.156.448.655.8StarRocks MV
Count and sum over 20M rows3,230.23,344.455.876.754.970.3StarRocks MV

Raw Timing Runs

The following are the five measured runs for each query after warmup. Values are milliseconds.

RDS/Postgres

QueryRunsp50p95Rows Returned
snapshot_lookup25.9, 28.7, 50.1, 30.2, 36.330.250.11
cards_rank_top50_full43.1, 43.3, 43.7, 38.6, 38.543.143.750
cards_rank_top1000_full423.1, 422.5, 414.6, 362.6, 349.7414.6423.11000
cards_views_top50_filtered35.2, 39.1, 51.0, 48.0, 39.539.551.050
aggregate_by_region4116.7, 8012.4, 4399.3, 8014.7, 8911.18012.48911.14
aggregate_by_country_language5825.1, 4054.3, 4081.6, 4122.7, 3959.64081.65825.120
narrow_rank_top100046.8, 230.4, 60.1, 84.7, 74.374.3230.41000
count_sum_all3218.5, 3188.6, 3344.4, 3230.2, 3270.83230.23344.41

StarRocks Base Table

QueryRunsp50p95Rows Returned
cards_rank_top50_full192.7, 216.2, 218.2, 195.2, 183.0195.2218.250
cards_rank_top1000_full253.6, 231.8, 216.5, 202.4, 205.9216.5253.61000
cards_views_top50_filtered169.9, 186.0, 205.0, 160.9, 190.5186.0205.050
aggregate_by_region232.9, 212.3, 210.1, 209.0, 237.2212.3237.25
aggregate_by_country_language280.7, 273.3, 280.0, 279.3, 283.5280.0283.512
narrow_rank_top100056.4, 42.3, 53.4, 51.1, 38.851.156.41000
count_sum_all54.2, 51.3, 76.7, 55.8, 56.455.876.71

StarRocks Async Materialized View

QueryRunsp50p95Rows Returned
cards_rank_top50_full190.8, 181.4, 188.1, 184.5, 187.9187.9190.850
cards_rank_top1000_full203.6, 198.4, 231.8, 201.7, 209.5203.6231.81000
cards_views_top50_filtered170.4, 177.1, 182.2, 183.0, 170.3177.1183.050
aggregate_by_region137.5, 145.1, 136.3, 151.6, 164.2145.1164.25
aggregate_by_country_language118.4, 119.3, 130.5, 144.2, 141.1130.5144.212
narrow_rank_top100039.2, 38.9, 48.6, 51.7, 55.848.655.81000
count_sum_all51.4, 69.2, 54.9, 70.3, 49.954.970.31

Interpretation

Small Indexed Row Fetches

RDS/Postgres was clearly better for the top-50 full-card fetches where the query can use selective B-tree indexes and return a small set of complete rows. The p50 for full cards top 50 by rank was 43.1 ms on RDS versus 187.9 ms on the StarRocks MV .

This supports the claim that row-oriented storage can be better for fetching full rows through an indexed API path.

Larger Top-N Reads

For full cards top 1000 by rank, StarRocks became faster than RDS in this benchmark. RDS p50 was 414.6 ms , while the StarRocks MV p50 was 203.6 ms . This is likely because the larger row count starts to reduce the advantage of row-store index traversal and heap access, while StarRocks can scan and project columnar data efficiently.

Aggregations Across Many Rows

StarRocks was dramatically faster for aggregations over the full 20 million row snapshot. The region aggregation was 8,012.4 ms p50 on RDS versus 145.1 ms p50 on the StarRocks MV . The count/sum query was 3,230.2 ms p50 on RDS versus 54.9 ms p50 on the StarRocks MV .

This strongly supports using StarRocks for aggregate-heavy Analytics API API surfaces, especially if those aggregations can be served from async materialized views.

Base Table vs Async MV

The StarRocks MV was consistently faster than or close to the base table. The difference was most visible in aggregations: aggregate by country/language improved from 280.0 ms p50 on base to 130.5 ms p50 on MV .

Existing RDS Table Schemas

These are the existing denormalized benchmark API table schemas that should inform the StarRocks OLAP table design.

serving_api_snapshots

ColumnPostgres TypeNullable
idbigintNO
tenant_idvarcharNO
workspace_idvarcharNO
report_idvarcharNO
platformvarcharNO
brand_idvarcharNO
brand_namevarcharNO
filter_hashvarcharNO
filtersjsonbNO
generated_attimestamptzNO
titlevarcharNO
industry_vertical_labelvarcharNO
total_itemsintegerNO
total_engagementsbigintNO
brands_trackedintegerNO
active_publishersintegerNO
topics_identifiedintegerNO
avg_sentimentnumericNO
tab_countsjsonbNO
created_attimestamptzNO
updated_attimestamptzNO

serving_api_filter_options

ColumnPostgres TypeNullable
idbigintNO
tenant_idvarcharNO
report_idvarcharNO
platformvarcharNO
snapshot_idbigintNO
filter_typevarcharNO
option_keyvarcharNO
labelvarcharNO
result_countintegerNO
ordinalintegerNO
metadatajsonbNO
created_attimestamptzNO
updated_attimestamptzNO

serving_api_result_cards

ColumnPostgres TypeNullable
idbigintNO
snapshot_idbigintNO
tenant_idvarcharNO
report_idvarcharNO
platformvarcharNO
source_item_idvarcharNO
rankintegerNO
tab_keystext[]NO
brand_idstext[]NO
vertical_keystext[]NO
category_keystext[]NO
region_keyvarcharNO
country_codevarcharNO
language_codevarcharNO
channel_idvarcharNO
channel_namevarcharNO
channel_avatar_urlvarcharNO
creator_badgevarcharNO
titlevarcharNO
thumbnail_urlvarcharNO
content_urlvarcharNO
published_attimestamptzNO
age_labelvarcharNO
viewsbigintNO
likesbigintNO
commentsbigintNO
activity_per_daynumericNO
velocity_pctnumericNO
sentiment_scorenumericNO
search_tsvtsvectorNO
created_attimestamptzNO
updated_attimestamptzNO

RDS Indexes Restored

The following indexes existed on serving_api_result_cards after the load and rebuild.

StarRocks Design Notes

The StarRocks benchmark was created in the StarRocks native catalog. Fully qualified names were required because the default session database was resolving unqualified operations against the Iceberg catalog.

Type Mapping Used For Benchmark

Postgres TypeStarRocks MappingReason
bigintBIGINTPreserve integer scale.
integerINTNatural StarRocks equivalent.
varchar/textVARCHARAPI-facing string fields.
text[]ARRAY VARCHARRetain array semantics for tab, brand, vertical, and category keys.
timestamptzDATETIMEStarRocks DATETIME was used for benchmark timestamp values.
numericDECIMALUsed for velocity, views per day, and sentiment values.
tsvectorsearch_text VARCHARPostgres-specific full-text vector replaced with plain searchable text.

Benchmark StarRocks MV Shape

The successful StarRocks MV metadata showed:

  • refresh_type = ASYNC
  • is_active = true
  • last_refresh_state = SUCCESS
  • rows = 20,000,000
  • query_rewrite_status = VALID

Cleanup Status

ItemStatusReason
Temporary local helper scriptsRemovedThey were only used to run the benchmark.
Temporary Python virtualenvRemovedIt was only used for temporary database clients.
Temporary AdministratorAccess AWS configRemovedNo local admin profile was left behind.
Temporary admin kubeconfigRemovedNo temporary kubeconfig was left behind.
StarRocks syntax probe objectsRemovedOnly benchmark objects were preserved.
RDS 20M benchmark dataPreservedPreserved because it is benchmark data in the existing denormalized API tables.
StarRocks benchmark base table and MVPreservedPreserved for follow-up validation and comparison.

Recommendation

For the new Analytics API serving design, the table should live in StarRocks if the API is going to serve analytical rollups, counts, sums, grouped summaries, or broad scans. StarRocks materially outperformed RDS for those access patterns in this benchmark.

If the endpoint primarily returns a small number of full rows through highly selective filters, RDS remains competitive and may be faster at top-50 style API reads. For StarRocks to be the better serving layer for those paths, the OLAP table or MV should be ordered and distributed around the actual API filters and sort keys, for example snapshot_id , tenant_id , report_id , platform , rank , views , and published_at .

The practical path is to create the canonical Analytics API OLAP table and async materialized views in StarRocks, then have this API repo read from those StarRocks objects instead of maintaining a separate RDS materialized-view serving copy.