Data Systems / 2026-05-22
RDS vs StarRocks 20M Serving and Aggregation Benchmark
This benchmark compares Postgres/RDS and StarRocks for denormalized API serving tables at 20 million rows. It separates low-latency indexed row fetches from broader OLAP aggregations so the serving architecture can be chosen by query shape.
Snapshot rows in both systems
StarRocks async materialized view
Refresh time for the StarRocks MV
Materialized views dominated broad scans
Decide whether an OLAP serving layer can replace or complement indexed RDS tables for API read paths at 20 million rows.
Built equivalent denormalized serving shapes in RDS/Postgres and StarRocks, then timed indexed row fetches and broad aggregations.
RDS stayed faster for small full-card reads, while StarRocks materialized views dominated broad scans and grouped aggregations.
- AWS
- RDS/PostgreSQL
- StarRocks
- Iceberg
- S3 Tables
- Materialized Views
- Python
Executive summary
The result confirms the storage-engine tradeoff. RDS/Postgres is better for small indexed API-style row fetches, especially top-50 full-card reads. StarRocks is much faster for full-table aggregation over 20 million rows, and async materialized views improve aggregation reads further.
The practical conclusion is to use RDS/Postgres for low-latency indexed row sets and StarRocks with materialized views for analytical API shapes that aggregate or scan many rows.
Benchmark results
| Query | RDS p50 | RDS p95 | StarRocks base p50 | StarRocks MV p50 | Fastest p50 |
|---|---|---|---|---|---|
| Snapshot lookup | 30.2 ms | 50.1 ms | N/A | N/A | RDS |
| Full cards top 50 by rank | 43.1 ms | 43.7 ms | 195.2 ms | 187.9 ms | RDS |
| Full cards top 1000 by rank | 414.6 ms | 423.1 ms | 216.5 ms | 203.6 ms | StarRocks MV |
| Filtered full cards top 50 by views | 39.5 ms | 51.0 ms | 186.0 ms | 177.1 ms | RDS |
| Aggregate by region over 20M rows | 8,012.4 ms | 8,911.1 ms | 212.3 ms | 145.1 ms | StarRocks MV |
| Aggregate by country/language over 20M rows | 4,081.6 ms | 5,825.1 ms | 280.0 ms | 130.5 ms | StarRocks MV |
| Narrow top 1000 by rank | 74.3 ms | 230.4 ms | 51.1 ms | 48.6 ms | StarRocks MV |
| Count and sum over 20M rows | 3,230.2 ms | 3,344.4 ms | 55.8 ms | 54.9 ms | StarRocks MV |
Interpretation
- For top-50 full-card reads, RDS p50 was 43.1 ms while StarRocks MV p50 was 187.9 ms.
- For top-1000 reads, StarRocks MV overtook RDS with a 203.6 ms p50 versus 414.6 ms.
- For region aggregation, StarRocks MV delivered 145.1 ms p50 versus 8,012.4 ms on RDS.
- For country/language aggregation, StarRocks MV improved base StarRocks from 280.0 ms p50 to 130.5 ms.
Recommendation
A hybrid serving design should use StarRocks when the API serves rollups, counts, sums, grouped summaries, or broad scans. RDS remains competitive and often faster for small selective full-row reads.
The StarRocks OLAP table and materialized views should be ordered and distributed around the actual API filters and sort keys, such as snapshot_id, tenant_id, dataset_id, platform, rank, activity_count, and published_at.
Detailed methodology and results
Supporting methodology, figures, and tables are rendered here as native page content with the same visual system as the rest of this website.
Detailed record of the AWS benchmark comparing the existing RDS/Postgres denormalized benchmark API tables with StarRocks OLAP tables and an async materialized view.
Executive Summary
The benchmark confirms the expected storage-engine tradeoff. RDS/Postgres was faster for small indexed API-style row fetches, especially top-50 full-card reads. StarRocks was much faster for full-table aggregation over 20 million rows, and the async materialized view improved those aggregation reads further.
Objects Touched
RDS/Postgres
Existing tables in the benchmark database were inspected and the main content-card benchmark table was expanded to the requested benchmark size. These objects were preserved after the benchmark.
| Table | Rows | Notes |
|---|---|---|
| serving_api_snapshots | 3 | Snapshot metadata for the Analytics API overview. |
| serving_api_filter_options | 66 | Filter option metadata for the benchmark API. |
| serving_api_result_cards | 20,005,000 | Denormalized content-card serving table. Snapshot 1 was expanded to 20,000,000 rows. |
StarRocks
The StarRocks benchmark objects were created under default_catalog.serving_benchmark . Probe objects used to test syntax were removed.
| Object | Type | Rows | Status |
|---|---|---|---|
| default_catalog.serving_benchmark.serving_api_result_cards_base | OLAP table | 20,000,000 | Preserved |
| default_catalog.serving_benchmark.serving_api_result_cards_mv | Async materialized view | 20,000,000 | SUCCESS |
Methodology
- Benchmarks were run in an AWS benchmark environment using a temporary benchmark role.
- The RDS table was populated with synthetic Analytics API rows to reach 20 million rows in snapshot 1.
- Secondary indexes on the RDS content-card table were dropped before bulk insert, then rebuilt and analyzed afterward.
- The StarRocks benchmark table used the same denormalized serving shape as the RDS content-card table, with StarRocks-native type mappings.
- The StarRocks async MV was created over the base table and manually refreshed.
- Each timed query used 2 warmup executions and 5 measured executions.
- Reported p50 is the median of the 5 measured runs. Reported p95 is the nearest-rank p95 over those 5 runs.
RDS State After Load
| Snapshot | Rows | Min ID | Max ID | Published range |
|---|---|---|---|---|
| 1 | 20,000,000 | 1 | 20,005,000 | 2024-05-23 22:06:31 UTC to 2026-05-22 22:15:41 UTC |
| 2 | 2,500 | 2 | 7,499 | 2026-02-18 19:34:50 UTC to 2026-05-18 19:34:50 UTC |
| 3 | 2,500 | 3 | 7,500 | 2026-02-18 19:34:50 UTC to 2026-05-18 19:34:50 UTC |
Snapshot 1 metadata was updated to reflect the benchmark scale: total_items = 20,000,000 , active_publishers = 100,000 , and total_engagements = 99,011,392,245,710 .
Benchmark Results
Times below are measured in milliseconds. Lower is better. Full-card queries select the denormalized API-facing columns. Narrow queries select only a small subset of numeric columns.
| Query | RDS p50 | RDS p95 | StarRocks Base p50 | StarRocks Base p95 | StarRocks MV p50 | StarRocks MV p95 | Fastest p50 |
|---|---|---|---|---|---|---|---|
| Snapshot lookup | 30.2 | 50.1 | N/A | N/A | N/A | N/A | RDS |
| Full cards top 50 by rank | 43.1 | 43.7 | 195.2 | 218.2 | 187.9 | 190.8 | RDS |
| Full cards top 1000 by rank | 414.6 | 423.1 | 216.5 | 253.6 | 203.6 | 231.8 | StarRocks MV |
| Filtered full cards top 50 by views | 39.5 | 51.0 | 186.0 | 205.0 | 177.1 | 183.0 | RDS |
| Aggregate by region over 20M rows | 8,012.4 | 8,911.1 | 212.3 | 237.2 | 145.1 | 164.2 | StarRocks MV |
| Aggregate by country/language over 20M rows | 4,081.6 | 5,825.1 | 280.0 | 283.5 | 130.5 | 144.2 | StarRocks MV |
| Narrow top 1000 by rank | 74.3 | 230.4 | 51.1 | 56.4 | 48.6 | 55.8 | StarRocks MV |
| Count and sum over 20M rows | 3,230.2 | 3,344.4 | 55.8 | 76.7 | 54.9 | 70.3 | StarRocks MV |
Raw Timing Runs
The following are the five measured runs for each query after warmup. Values are milliseconds.
RDS/Postgres
| Query | Runs | p50 | p95 | Rows Returned |
|---|---|---|---|---|
| snapshot_lookup | 25.9, 28.7, 50.1, 30.2, 36.3 | 30.2 | 50.1 | 1 |
| cards_rank_top50_full | 43.1, 43.3, 43.7, 38.6, 38.5 | 43.1 | 43.7 | 50 |
| cards_rank_top1000_full | 423.1, 422.5, 414.6, 362.6, 349.7 | 414.6 | 423.1 | 1000 |
| cards_views_top50_filtered | 35.2, 39.1, 51.0, 48.0, 39.5 | 39.5 | 51.0 | 50 |
| aggregate_by_region | 4116.7, 8012.4, 4399.3, 8014.7, 8911.1 | 8012.4 | 8911.1 | 4 |
| aggregate_by_country_language | 5825.1, 4054.3, 4081.6, 4122.7, 3959.6 | 4081.6 | 5825.1 | 20 |
| narrow_rank_top1000 | 46.8, 230.4, 60.1, 84.7, 74.3 | 74.3 | 230.4 | 1000 |
| count_sum_all | 3218.5, 3188.6, 3344.4, 3230.2, 3270.8 | 3230.2 | 3344.4 | 1 |
StarRocks Base Table
| Query | Runs | p50 | p95 | Rows Returned |
|---|---|---|---|---|
| cards_rank_top50_full | 192.7, 216.2, 218.2, 195.2, 183.0 | 195.2 | 218.2 | 50 |
| cards_rank_top1000_full | 253.6, 231.8, 216.5, 202.4, 205.9 | 216.5 | 253.6 | 1000 |
| cards_views_top50_filtered | 169.9, 186.0, 205.0, 160.9, 190.5 | 186.0 | 205.0 | 50 |
| aggregate_by_region | 232.9, 212.3, 210.1, 209.0, 237.2 | 212.3 | 237.2 | 5 |
| aggregate_by_country_language | 280.7, 273.3, 280.0, 279.3, 283.5 | 280.0 | 283.5 | 12 |
| narrow_rank_top1000 | 56.4, 42.3, 53.4, 51.1, 38.8 | 51.1 | 56.4 | 1000 |
| count_sum_all | 54.2, 51.3, 76.7, 55.8, 56.4 | 55.8 | 76.7 | 1 |
StarRocks Async Materialized View
| Query | Runs | p50 | p95 | Rows Returned |
|---|---|---|---|---|
| cards_rank_top50_full | 190.8, 181.4, 188.1, 184.5, 187.9 | 187.9 | 190.8 | 50 |
| cards_rank_top1000_full | 203.6, 198.4, 231.8, 201.7, 209.5 | 203.6 | 231.8 | 1000 |
| cards_views_top50_filtered | 170.4, 177.1, 182.2, 183.0, 170.3 | 177.1 | 183.0 | 50 |
| aggregate_by_region | 137.5, 145.1, 136.3, 151.6, 164.2 | 145.1 | 164.2 | 5 |
| aggregate_by_country_language | 118.4, 119.3, 130.5, 144.2, 141.1 | 130.5 | 144.2 | 12 |
| narrow_rank_top1000 | 39.2, 38.9, 48.6, 51.7, 55.8 | 48.6 | 55.8 | 1000 |
| count_sum_all | 51.4, 69.2, 54.9, 70.3, 49.9 | 54.9 | 70.3 | 1 |
Interpretation
Small Indexed Row Fetches
RDS/Postgres was clearly better for the top-50 full-card fetches where the query can use selective B-tree indexes and return a small set of complete rows. The p50 for full cards top 50 by rank was 43.1 ms on RDS versus 187.9 ms on the StarRocks MV .
This supports the claim that row-oriented storage can be better for fetching full rows through an indexed API path.
Larger Top-N Reads
For full cards top 1000 by rank, StarRocks became faster than RDS in this benchmark. RDS p50 was 414.6 ms , while the StarRocks MV p50 was 203.6 ms . This is likely because the larger row count starts to reduce the advantage of row-store index traversal and heap access, while StarRocks can scan and project columnar data efficiently.
Aggregations Across Many Rows
StarRocks was dramatically faster for aggregations over the full 20 million row snapshot. The region aggregation was 8,012.4 ms p50 on RDS versus 145.1 ms p50 on the StarRocks MV . The count/sum query was 3,230.2 ms p50 on RDS versus 54.9 ms p50 on the StarRocks MV .
This strongly supports using StarRocks for aggregate-heavy Analytics API API surfaces, especially if those aggregations can be served from async materialized views.
Base Table vs Async MV
The StarRocks MV was consistently faster than or close to the base table. The difference was most visible in aggregations: aggregate by country/language improved from 280.0 ms p50 on base to 130.5 ms p50 on MV .
Existing RDS Table Schemas
These are the existing denormalized benchmark API table schemas that should inform the StarRocks OLAP table design.
serving_api_snapshots
| Column | Postgres Type | Nullable |
|---|---|---|
| id | bigint | NO |
| tenant_id | varchar | NO |
| workspace_id | varchar | NO |
| report_id | varchar | NO |
| platform | varchar | NO |
| brand_id | varchar | NO |
| brand_name | varchar | NO |
| filter_hash | varchar | NO |
| filters | jsonb | NO |
| generated_at | timestamptz | NO |
| title | varchar | NO |
| industry_vertical_label | varchar | NO |
| total_items | integer | NO |
| total_engagements | bigint | NO |
| brands_tracked | integer | NO |
| active_publishers | integer | NO |
| topics_identified | integer | NO |
| avg_sentiment | numeric | NO |
| tab_counts | jsonb | NO |
| created_at | timestamptz | NO |
| updated_at | timestamptz | NO |
serving_api_filter_options
| Column | Postgres Type | Nullable |
|---|---|---|
| id | bigint | NO |
| tenant_id | varchar | NO |
| report_id | varchar | NO |
| platform | varchar | NO |
| snapshot_id | bigint | NO |
| filter_type | varchar | NO |
| option_key | varchar | NO |
| label | varchar | NO |
| result_count | integer | NO |
| ordinal | integer | NO |
| metadata | jsonb | NO |
| created_at | timestamptz | NO |
| updated_at | timestamptz | NO |
serving_api_result_cards
| Column | Postgres Type | Nullable |
|---|---|---|
| id | bigint | NO |
| snapshot_id | bigint | NO |
| tenant_id | varchar | NO |
| report_id | varchar | NO |
| platform | varchar | NO |
| source_item_id | varchar | NO |
| rank | integer | NO |
| tab_keys | text[] | NO |
| brand_ids | text[] | NO |
| vertical_keys | text[] | NO |
| category_keys | text[] | NO |
| region_key | varchar | NO |
| country_code | varchar | NO |
| language_code | varchar | NO |
| channel_id | varchar | NO |
| channel_name | varchar | NO |
| channel_avatar_url | varchar | NO |
| creator_badge | varchar | NO |
| title | varchar | NO |
| thumbnail_url | varchar | NO |
| content_url | varchar | NO |
| published_at | timestamptz | NO |
| age_label | varchar | NO |
| views | bigint | NO |
| likes | bigint | NO |
| comments | bigint | NO |
| activity_per_day | numeric | NO |
| velocity_pct | numeric | NO |
| sentiment_score | numeric | NO |
| search_tsv | tsvector | NO |
| created_at | timestamptz | NO |
| updated_at | timestamptz | NO |
RDS Indexes Restored
The following indexes existed on serving_api_result_cards after the load and rebuild.
StarRocks Design Notes
The StarRocks benchmark was created in the StarRocks native catalog. Fully qualified names were required because the default session database was resolving unqualified operations against the Iceberg catalog.
Type Mapping Used For Benchmark
| Postgres Type | StarRocks Mapping | Reason |
|---|---|---|
| bigint | BIGINT | Preserve integer scale. |
| integer | INT | Natural StarRocks equivalent. |
| varchar/text | VARCHAR | API-facing string fields. |
| text[] | ARRAY VARCHAR | Retain array semantics for tab, brand, vertical, and category keys. |
| timestamptz | DATETIME | StarRocks DATETIME was used for benchmark timestamp values. |
| numeric | DECIMAL | Used for velocity, views per day, and sentiment values. |
| tsvector | search_text VARCHAR | Postgres-specific full-text vector replaced with plain searchable text. |
Benchmark StarRocks MV Shape
The successful StarRocks MV metadata showed:
- refresh_type = ASYNC
- is_active = true
- last_refresh_state = SUCCESS
- rows = 20,000,000
- query_rewrite_status = VALID
Cleanup Status
| Item | Status | Reason |
|---|---|---|
| Temporary local helper scripts | Removed | They were only used to run the benchmark. |
| Temporary Python virtualenv | Removed | It was only used for temporary database clients. |
| Temporary AdministratorAccess AWS config | Removed | No local admin profile was left behind. |
| Temporary admin kubeconfig | Removed | No temporary kubeconfig was left behind. |
| StarRocks syntax probe objects | Removed | Only benchmark objects were preserved. |
| RDS 20M benchmark data | Preserved | Preserved because it is benchmark data in the existing denormalized API tables. |
| StarRocks benchmark base table and MV | Preserved | Preserved for follow-up validation and comparison. |
Recommendation
For the new Analytics API serving design, the table should live in StarRocks if the API is going to serve analytical rollups, counts, sums, grouped summaries, or broad scans. StarRocks materially outperformed RDS for those access patterns in this benchmark.
If the endpoint primarily returns a small number of full rows through highly selective filters, RDS remains competitive and may be faster at top-50 style API reads. For StarRocks to be the better serving layer for those paths, the OLAP table or MV should be ordered and distributed around the actual API filters and sort keys, for example snapshot_id , tenant_id , report_id , platform , rank , views , and published_at .
The practical path is to create the canonical Analytics API OLAP table and async materialized views in StarRocks, then have this API repo read from those StarRocks objects instead of maintaining a separate RDS materialized-view serving copy.