Order Aggregates¶
File: app/core/models/order_aggregates.py
Pre-computed daily order metrics for fast campaign reporting.
Key fields (DailyAggregate)¶
- Dimensions:
organization_id,campaign_id,day_bucket(DATE),is_holdout - Metrics (JSONB):
total_orders,revenue,first_purchase_orders/revenue,repeat_orders/revenue,subscription_orders/revenue,discount_matched_orders,unique_customers,new_subscribers,aov
Computation¶
- Frequency: Every 15 minutes via Celery Beat
- Source: Raw
attributionstable, filtered bypasses_rules=true,archived=false,order_value>0 - Idempotent: ON CONFLICT DO UPDATE for safe re-runs
- Progress tracking:
attribution_aggregation_progresstable trackslast_processed_dayper org - Sliding window: Each tick recomputes buckets touched in the last
ORDER_AGGREGATION_SLIDING_WINDOW_DAYS(default 2) days plus any updated since the previous run. Today's in-progress bucket is included — the upsert refreshes it every tick so dashboards show today's running total. (Hourly granularity used to filter out the in-progress hour; carrying that to daily would have stalled dashboards 24h — the per-tick refresh of today's bucket is the conscious trade-off.)
Holdout-Aware Aggregates¶
Daily aggregates are split by the is_holdout boolean dimension. Experiment analysis reads is_holdout=True rows directly from pre-computed aggregates instead of scanning raw attributions — this powers fast A/B test reporting on the dashboard.
Reporting Features¶
Bulk Campaign Stats¶
GET /v1/reporting/summary/<org_id>/campaigns — fetches stats for all campaigns in a single pass via generate_bulk_campaign_stats_from_aggregates(). Partitions daily + recipient totals by campaign and holdout status, computes experiment results per campaign.
Excluded Recipient Compensation¶
When certain recipients are excluded from stats (e.g., during experiment analysis), compute_excluded_recipient_stats() queries raw attributions for just those recipients and subtracts their totals from the pre-computed aggregates. This ensures excluded recipients match what the raw reporting path would produce.
Rolling Stats from Aggregates¶
generate_rolling_stats_from_daily_aggregates() takes daily aggregate rows (one per (campaign, day, holdout)) and applies rolling windows over the time series. AOV is computed from daily totals (revenue / orders) rather than averaging per-row AOVs — avoiding mean-of-means errors.
Summary-Only Mode¶
The exclude_rolling_stats flag on generate_order_stats_from_aggregates() skips the daily breakdown and uses fetch_reporting_recipient_totals() directly. This optimizes summary-only endpoints that don't need the full time series.
Admin rebuild¶
Operators can rebuild a single org or one campaign's aggregates out-of-band (e.g. after an attribution rule change or to validate a fix) without shelling into a container.
- Endpoint:
POST /api/v1/admin/aggregations/order/rebuild— admin-only, body{organization_id, campaign_id?, full_refresh?}.full_refreshdefaults tofalse(incremental); passtrueto additionally sweep stale rows after the rebuild. Returns202 {message, task_id, ...}. Seeroutes/aggregation_trigger_routes.py. - Task:
attribution_aggregates.rebuild_order_aggregates_for_organization(organization_id, campaign_id=None, full_refresh=False)on theslowqueue, deduped byQueueOnceon(organization_id, campaign_id). Delegates tomethods/aggregation_rebuild.backfill_order_aggregatesand resets the Redis reporting cache on completion. The same method powers thescripts/backfill_order_aggregates.pyCLI. - Mark-and-sweep on
full_refresh=True: existing rows stay visible the whole run. The orchestrator capturesdb_now()before the chunk loop, every chunk'sUPSERTrefresheslast_updated = NOW(), and at the enddelete_stale_org_aggregates(org_id, campaign_id, before=run_started_at)removes only rows the run did not touch. Readers never see an empty stats window.
Config-triggered rebuild¶
Saving a change to organizations.configuration.excluded_attributions from PUT /api/v1/organizations/<id> dispatches recalc_campaign_pipeline(org_id, campaign_id=None, full_refresh=True) (see app/celery/recalc_pipeline.py) rather than the order-only admin task. The pipeline runs order rebuild → recipient rebuild → (skipped at org scope) stats warm; the recipient rebuild is redundant for an attributions-only change but keeps one entry point for all recalc dispatches on the same admin, low-frequency operation. QueueOnce on (organization_id, campaign_id=None) dedupes rapid back-to-back saves on the same org.
Key files¶
- Model:
core/models/order_aggregates.py - Rebuild orchestration:
methods/aggregation_rebuild.py - Scheduled + admin-trigger task:
celery/attribution_aggregates.py - Config-triggered recalc pipeline:
celery/recalc_pipeline.py - Admin route:
routes/aggregation_trigger_routes.py - Backfill CLI:
scripts/backfill_order_aggregates.py - Reporting:
methods/reporting_aggregates.py - Routes:
routes/reporting_routes.py