Order Attribution Methodology¶
Executive Summary¶
PaperRun's attribution system matches customer orders to marketing campaigns using a cascade of matching methods (email, physical address, discount code) and validates each match against a timing window. The system handles both experimental groups (recipients who receive mailers) and control groups/holdouts (recipients excluded from mailings to measure campaign effectiveness).
Table of Contents¶
- Attribution Flow Overview
- Attribution Methods
- Critical Timing Constraints
- Order Syncing & Checkpoint Management
- Test vs Control Group Differences
- Multi-Order Attribution & Order Count
- Additional Business Rules
- Experiment Analysis
- Edge Cases & Important Behaviors
- Code References
1. Attribution Flow Overview¶
1.1 High-Level Process¶
- Order Syncing (every 5 minutes via app/celery/sync_orders.py:process_orders)
  - Fetches orders from integration providers (Klaviyo, Ometria, Shopify)
  - Uses checkpoint-based incremental syncing
  - Processes orders through attribution pipeline
- Order Attribution (attribute_order function)
  - Attempts to match orders to campaign recipients using cascading methods
  - Creates unprocessed attribution records
- Attribution Processing (process_attribution function)
  - Validates attribution timing windows
  - Applies business rules
  - Marks attributions as valid/invalid
1.2 Key Components¶
- Order Syncing: app/celery/sync_orders.py
- Attribution Logic: app/methods/attributions.py
- Holdout Logic: app/methods/holdouts.py
- Data Models: app/core/models/attributions.py, app/core/models/campaign_recipients.py
2. Attribution Methods (Priority Order)¶
2.1 Method Hierarchy¶
The system tries attribution methods in this order:
1. Email Match (AttributionMethod.Email)¶
- Match Criteria: Order email == recipient email (case-insensitive)
- Requirements:
  - Recipient status = Sent
  - sent_at timestamp must exist
- Timing: Uses sent_at as attribution window start
- Implementation: app/celery/sync_orders.py:check_for_matched_emails (line 210)
2. Physical Address Match (AttributionMethod.PhysicalAddress)¶
- Match Criteria: Order shipping address == recipient address
  - Concatenates: address1 + address2 + zip (lowercased, spaces removed)
- Requirements:
  - Recipient status = Sent
  - sent_at is not None
- Special Case: Skips attribution if active subscription existed before sent_at
- Timing: Uses sent_at as attribution window start
- Implementation: app/celery/sync_orders.py:check_for_physical_address_match (line 261)
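The address comparison described above can be sketched as follows. This is a minimal illustration, not the code in check_for_physical_address_match: the helper names address_key and addresses_match are hypothetical, and treating "spaces removed" as "all whitespace removed" is an assumption.

```python
from typing import Optional, Tuple

Address = Tuple[Optional[str], Optional[str], Optional[str]]


def address_key(address1: Optional[str], address2: Optional[str], zip_code: Optional[str]) -> str:
    """Concatenate address1 + address2 + zip, lowercase, and drop whitespace."""
    joined = "".join(part or "" for part in (address1, address2, zip_code))
    # str.split() with no arguments splits on any whitespace run,
    # so joining the pieces removes all whitespace (assumption).
    return "".join(joined.lower().split())


def addresses_match(order_address: Address, recipient_address: Address) -> bool:
    """Match only when both normalized keys are equal and non-empty."""
    order_key = address_key(*order_address)
    return order_key != "" and order_key == address_key(*recipient_address)


# Formatting differences disappear after normalization.
same = addresses_match(("12 High St", None, "AB1 2CD"), ("12 high st ", "", "ab12cd"))
```

Rejecting empty keys avoids matching two records that both lack an address; whether the real implementation guards against this is not stated in this document.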
3. Discount Code Match (AttributionMethod.DiscountCode)¶
- Match Criteria: Order discount code matches campaign or recipient code
- Two Types:
  - Campaign-level codes: Stored in campaign.settings.discount_code
  - Unique recipient codes: When campaign.settings.use_unique_discount = True
- Important: This is the ONLY method that passes rules for unsent recipients
- Timing: Can work without sent_at
- Implementation: app/celery/sync_orders.py:attribute_order (line 372)
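As a rough sketch of the two code types listed above, the lookup could check the campaign-level code first and then the unique recipient codes. The function name match_discount_code, the recipient_codes mapping, and the case-insensitive comparison are all assumptions made for illustration; the actual logic lives in attribute_order.

```python
from typing import Dict, Iterable, Optional, Tuple


def match_discount_code(
    order_codes: Iterable[str],
    campaign_code: Optional[str] = None,
    recipient_codes: Optional[Dict[int, str]] = None,
) -> Optional[Tuple[str, object]]:
    """Return ("campaign", code) or ("recipient", recipient_id), else None."""
    # Normalize the codes on the order (lowercase, trimmed) - an assumption.
    normalized = {code.lower().strip() for code in order_codes if code}

    # 1. Campaign-level code shared by every recipient.
    if campaign_code and campaign_code.lower().strip() in normalized:
        return ("campaign", campaign_code)

    # 2. Unique per-recipient codes (hypothetical recipient_id -> code mapping).
    for recipient_id, code in (recipient_codes or {}).items():
        if code and code.lower().strip() in normalized:
            return ("recipient", recipient_id)

    return None


campaign_hit = match_discount_code(["SPRING10 "], campaign_code="spring10")
recipient_hit = match_discount_code(["abc-123"], recipient_codes={7: "ABC-123"})
```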
4. Holdout Match (Fallback)¶
- When Checked: Only if no experimental group attribution found
- Match Criteria: Email or physical address match with recipient status = Holdout
- Creates: Separate holdout attribution with holdout=True flag
- Timing: Uses created_at as attribution window start (different from experimental!)
- Implementation: app/celery/sync_orders.py:check_for_holdout_order (line 78)
3. Critical Timing Constraints¶
3.1 Key Date Fields¶
Campaign Recipient Fields¶
- created_at: When recipient record was created in database
- sent_at: When mailer was successfully sent to print provider (status changed to Sent)
- status: Recipient lifecycle state
  - Pending → ProofingStaged → Proofing → Proofed → SendingStaged → Sending → Sent
  - Or: Holdout (never progresses to Sent)
Attribution Fields¶
- created: Order datetime (when customer placed order)
Campaign Fields¶
- start_date: Campaign start date
- end_date: Campaign end date (used for BFCM absolute cutoff)
3.2 Attribution Window Rules for Experimental Groups (Sent Recipients)¶
Location: app/methods/attributions.py:process_attribution (lines 150-171)
The attribution system validates orders against TWO timing constraints, both of which must be satisfied:
- Campaign timing: campaign.first_send_date + min_days
- Recipient timing: recipient.created_at + min_days
The order must occur on or after BOTH thresholds to be attributed. The system takes the maximum (latest) of these two dates as the effective attribution window start.
# Calculate campaign threshold
campaign_min_date = campaign.first_send_date + timedelta(days=min_days)
# Calculate recipient threshold
recipient_min_date = recipient.created_at + timedelta(days=min_days)
# Effective attribution window starts at the LATER of the two dates
effective_start = max(campaign_min_date, recipient_min_date)
# Order must be >= effective_start and within max_days window
# (max_days is a number of days, so it is converted to a timedelta)
window_end = effective_start + timedelta(days=max_days)
if effective_start <= order_date < window_end:
    attribution.passes_rules = True
else:
    attribution.passes_rules = False
Attribution Window Sizes¶
Location: app/core/models/campaigns.py:get_attribution_window_max/min
| Campaign Type | Min Days (Campaign & Recipient) | Max Days | Notes |
|---|---|---|---|
| Holdout Enabled | 1 | 60 | Both campaign and recipient use 1 day |
| FirstPurchase Flow | 3 | 180 | Longer window for acquisition |
| BFCM Flow | 3 | 63 | PLUS absolute cutoff at end_date + 2 days |
| Standard Campaign | 3 | 63 | Default window |
Important: The min_days value applies to BOTH the campaign first_send_date and the recipient created_at. This ensures orders are attributed only after both:
- Sufficient time has passed since the campaign first sent (to allow mail delivery)
- Sufficient time has passed since the recipient was created (to allow for recipient-specific processing)
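The window rule and the table of window sizes can be combined into a small runnable sketch. It is an illustration under assumptions, not the contents of process_attribution: the WINDOWS dictionary keys and the passes_window function are hypothetical names, and the window end is taken as effective_start + max_days, as in the snippet in section 3.2.

```python
from datetime import datetime, timedelta

# (min_days, max_days) per campaign type, copied from the table above.
# The dictionary keys are hypothetical labels for this sketch.
WINDOWS = {
    "holdout_enabled": (1, 60),
    "first_purchase": (3, 180),
    "bfcm": (3, 63),
    "standard": (3, 63),
}


def passes_window(
    order_date: datetime,
    first_send_date: datetime,
    recipient_created_at: datetime,
    campaign_type: str = "standard",
) -> bool:
    """Check the order against both the campaign and recipient thresholds."""
    min_days, max_days = WINDOWS[campaign_type]
    campaign_min = first_send_date + timedelta(days=min_days)
    recipient_min = recipient_created_at + timedelta(days=min_days)
    # The later of the two thresholds is the effective window start.
    effective_start = max(campaign_min, recipient_min)
    return effective_start <= order_date < effective_start + timedelta(days=max_days)


# Campaign first sent on 1 Jan, recipient created on 5 Jan (standard campaign):
# the recipient threshold (8 Jan) is later, so it sets the window start.
too_early = passes_window(datetime(2024, 1, 7), datetime(2024, 1, 1), datetime(2024, 1, 5))
in_window = passes_window(datetime(2024, 1, 8), datetime(2024, 1, 1), datetime(2024, 1, 5))
```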
BFCM Special Case¶
Location: app/methods/attributions.py:process_attribution (lines 164-171)
BFCM campaigns have an additional hard cutoff date:
- Calculated as: min(end_date + 2 days, December 31)
- 2-day buffer accounts for timezone differences (UK vs US campaigns)
- Prevents attribution windows from extending into next year
- Orders after this date fail attribution even if within min/max day window
Example:
# BFCM campaign with end_date = 2024-11-29
# absolute_cutoff = min(2024-12-01, 2024-12-31) = 2024-12-01
# Order placed on 2024-12-05 (35 days after sent_at)
# Within 63-day window BUT after absolute cutoff
# Result: passes_rules = False
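The cutoff arithmetic in this example can be checked with a short sketch. The function names are hypothetical, and working with plain dates (rather than datetimes) and with December 31 of the end_date's own year are assumptions.

```python
from datetime import date, timedelta


def bfcm_absolute_cutoff(end_date: date) -> date:
    """min(end_date + 2 days, December 31 of the same year)."""
    return min(end_date + timedelta(days=2), date(end_date.year, 12, 31))


def passes_bfcm_cutoff(order_date: date, end_date: date) -> bool:
    # Orders placed after the cutoff fail, regardless of the min/max day window.
    return order_date <= bfcm_absolute_cutoff(end_date)


cutoff = bfcm_absolute_cutoff(date(2024, 11, 29))                # 2024-12-01
late_order = passes_bfcm_cutoff(date(2024, 12, 5), date(2024, 11, 29))  # False
```

An end_date of December 30 shows the year-end clamp: adding two days would land in January, so the cutoff becomes December 31.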
3.3 Attribution Window Rules for Control Groups (Holdouts)¶
Location: app/celery/sync_orders.py:check_for_holdout_order (lines 78-207)
Critical Difference: Holdouts use created_at not sent_at because they never receive mailers!
Like experimental groups, holdout attribution also validates orders against TWO timing constraints:
- Campaign timing: campaign.first_send_date + min_days
- Recipient timing: recipient.created_at + min_days
The order must occur on or after BOTH thresholds to be attributed.
# Calculate campaign threshold
campaign_min_date = campaign.first_send_date + timedelta(days=min_days)
# Calculate recipient threshold (using created_at since holdouts never have sent_at)
recipient_min_date = holdout_recipient.created_at + timedelta(days=min_days)
# Effective attribution window starts at the LATER of the two dates
effective_start = max(campaign_min_date, recipient_min_date)
# Order must be >= effective_start and within max_days window
# (max_days is a number of days, so it is converted to a timedelta)
window_end = effective_start + timedelta(days=max_days)
if effective_start <= order_datetime < window_end:
    passes_attribution_rules = True
else:
    passes_attribution_rules = False
Holdout-Specific Rules¶
- Uses recipient created_at (not sent_at) as the recipient timing constraint
- Same min_days value as experimental group (from campaign settings)
  - 1 day for campaigns with holdout enabled
  - 3 days for standard campaigns
- Same max_days value as experimental group (from campaign settings)
- Orders must satisfy BOTH campaign and recipient timing thresholds
- Campaign must be in post-launch state (campaign.should_process_stats())
- Orders < $1 are excluded
3.4 Unsent Recipients - Special Handling¶
Location: app/methods/attributions.py:process_attribution (lines 140-148)
# Early exit for recipients without sent_at
if recipient.sent_at is None:
    logger.info(f"Did not process attribution...")
    # ONLY DiscountCode attributions pass rules for unsent recipients
    passes_rules = attribution.method == AttributionMethod.DiscountCode
    attribution.processed = True
    attribution.passes_rules = passes_rules
    att.upsert_attribution_by_provider_id(attribution=attribution)
    return
Key Insight: If sent_at is None:
- ✅ DiscountCode attributions → passes_rules = True
- ❌ All other methods (Email, PhysicalAddress, etc.) → passes_rules = False
Why?
- Discount codes can be used before mailers are sent (pre-campaign purchases)
- Email/address matching requires the mailer to have been sent to influence behavior
4. Order Syncing & Checkpoint Management¶
4.1 Sync Process¶
Frequency: Every 5 minutes (300 seconds)
Location: app/celery/sync_orders.py:process_orders
Flow:
1. Fetch all enabled organizations
2. For each organization with sync_orders = True:
   - Get mail integration (Klaviyo, Ometria, or Shopify)
   - Fetch orders since last checkpoint
   - Process each order through attribution pipeline
   - Update checkpoint to latest order datetime
4.2 Checkpoint Logic¶
Location: app/celery/sync_orders.py:get_organization_sync_checkpoint
def get_organization_sync_checkpoint(organization: Organization) -> Optional[datetime]:
    # 1. Try to use last_order_sync from organization config
    last_sync_checkpoint = organization.configuration.last_order_sync
    # 2. If no checkpoint, use timestamp of most recent order in DB
    last_order = get_last_order(organization=organization.id)
    if last_order is not None and not last_sync_checkpoint:
        last_sync_checkpoint = last_order.datetime
    return last_sync_checkpoint
Checkpoint Update: app/celery/sync_orders.py:sync_organization_page_order (line 613)
# After processing all orders, update checkpoint to latest order datetime
events = sorted(events, key=lambda x: x.datetime, reverse=True)
next_checkpoint = events[0].datetime
update_organization_order_checkpoint(organization, next_checkpoint)
4.3 Integration Providers¶
Klaviyo¶
- Uses UNIX timestamp for checkpoint
- Fetches "Placed Order" metric events via find_events_for_metric()
- Converts events to OrderEvent objects
Ometria¶
- Uses ISO datetime for checkpoint
- Fetches orders via list_orders(since=checkpoint, limit=150)
- Converts to OrderEvent objects
Shopify¶
- Uses ISO datetime for checkpoint
- Fetches order events via GraphQL: list_order_events_since(checkpoint, per_page=100, limit=1000)
- Converts to OrderEvent objects
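Because the providers expect different checkpoint formats, a single stored datetime has to be rendered per provider. The sketch below shows one way to do that; checkpoint_for_provider is a hypothetical helper, the provider strings are illustrative labels, and treating naive datetimes as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Union


def checkpoint_for_provider(checkpoint: datetime, provider: str) -> Union[int, str]:
    """Render a checkpoint as a UNIX timestamp (Klaviyo) or ISO string (Ometria, Shopify)."""
    if checkpoint.tzinfo is None:
        # Assumption: naive checkpoints are stored in UTC.
        checkpoint = checkpoint.replace(tzinfo=timezone.utc)
    if provider == "klaviyo":
        return int(checkpoint.timestamp())  # seconds since the epoch
    if provider in ("ometria", "shopify"):
        return checkpoint.isoformat()  # ISO 8601 string
    raise ValueError(f"Unknown provider: {provider}")


checkpoint = datetime(2024, 1, 1, tzinfo=timezone.utc)
klaviyo_value = checkpoint_for_provider(checkpoint, "klaviyo")
shopify_value = checkpoint_for_provider(checkpoint, "shopify")
```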
4.4 Attribution Caching¶
Location: app/celery/sync_orders.py (multiple functions)
The system uses Redis caching to pass attribution data between order processing and attribution creation:
# Cache key format
redis_key = f"attribution_helper_order_provider_id_{order.provider_id}"
# Cached for 600 seconds (10 minutes)
RedisClient.set_dict(redis_key, {
    "email": order.email,
    "datetime": order.datetime,
    "recipient_id": recipient.id,
    "method": att.AttributionMethod.Email,  # or other method
    "holdout": False,
    "discount_code": code,  # if discount code match
    "discount_match": True  # if discount matched
}, 600)
Read: app/methods/attributions.py:insert_unprocessed_attribution (line 56)
This cache is read when creating the initial attribution record and then deleted.
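The write, read-once, delete handoff can be illustrated without a Redis server. InMemoryCache below is a hypothetical stand-in written for this sketch (it is not the project's RedisClient), and pop_dict is an assumed name for "read and delete"; only the key format and the 600-second TTL come from the section above.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in for the Redis client, for illustration only."""

    def __init__(self):
        self._store = {}

    def set_dict(self, key, value, ttl_seconds):
        # Store a copy together with its expiry time.
        self._store[key] = (dict(value), time.monotonic() + ttl_seconds)

    def pop_dict(self, key):
        """Return the cached dict and delete it; None if missing or expired."""
        entry = self._store.pop(key, None)
        if entry is None:
            return None
        value, expires_at = entry
        return value if time.monotonic() < expires_at else None


def cache_key(provider_id):
    return f"attribution_helper_order_provider_id_{provider_id}"


cache = InMemoryCache()
# Sync step: stash the match details for 10 minutes.
cache.set_dict(cache_key("ord_123"), {"recipient_id": 42, "method": "Email", "holdout": False}, 600)
# Attribution step: read once, after which the entry is gone.
helper = cache.pop_dict(cache_key("ord_123"))
missing = cache.pop_dict(cache_key("ord_123"))
```

One consequence of the TTL is that an attribution record created more than 10 minutes after the sync step would find no cached helper data.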
5. Test vs Control Group Differences¶
5.1 Recipient Status Tracking¶
Experimental Group (recipients who receive mail)¶
- Status Progression: Pending → ProofingStaged → Proofing → Proofed → SendingStaged → Sending → Sent
- sent_at: Populated when status changes to Sent
- Attribution Start: Uses sent_at as window start
Control Group (holdout recipients)¶
- Status: Holdout (never progresses to Sent)
- sent_at: Remains None
- Attribution Start: Uses created_at as window start
5.2 Attribution Process Differences¶
| Aspect | Experimental Group | Control Group |
|---|---|---|
| Attribution Window Start | max(campaign.first_send_date + min_days, recipient.created_at + min_days) | max(campaign.first_send_date + min_days, recipient.created_at + min_days) |
| Recipient Date Used | recipient.created_at (for min_days calculation) | recipient.created_at (for min_days calculation) |
| Checked When | Primary attribution flow | Fallback if no experimental match |
| Methods Supported | Email, PhysicalAddress, DiscountCode | Email, PhysicalAddress |
| Order in Process | Checked first | Checked last (only if attributed_campaign <= 0) |
| Database Flag | attribution.holdout = False | attribution.holdout = True |
| Min Days Value | 1 for holdout campaigns, 3 for standard campaigns | Same as experimental (1 for holdout campaigns, 3 for standard) |
| Processed Status | Set during attribution processing job | Set immediately during sync |
Key Insight: Both experimental and control groups now use the same min_days logic for recipient timing. This creates fair A/B testing by ensuring both groups have the same minimum wait period from recipient creation before orders can be attributed.
5.3 Holdout Campaign Settings¶
Location: app/core/models/campaigns.py:CampaignHoldout
Campaigns can enable holdouts via settings:
campaign.settings.holdout = CampaignHoldout(
    enabled=True,
    holdout_percentage=10.0,  # 10% of recipients
    holdout_absolute_maximum=500  # Cap at 500 recipients
)
Important: When campaign.settings.holdout.enabled = True:
- ALL recipients (including experimental) use created_at for attribution window
- Min days = 1 (instead of 3)
- Max days = 60 (instead of 63)
This is different from having some recipients with status = Holdout. The holdout.enabled flag changes timing for everyone.
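To make the interaction between holdout_percentage and holdout_absolute_maximum concrete, the sketch below computes a holdout size from both settings. The function holdout_size is hypothetical, and rounding down is an assumption; this document does not describe how recipients are actually selected.

```python
import math


def holdout_size(total_recipients: int, holdout_percentage: float, holdout_absolute_maximum: int) -> int:
    """Percentage of the audience, capped at the absolute maximum."""
    by_percentage = math.floor(total_recipients * holdout_percentage / 100.0)
    return min(by_percentage, holdout_absolute_maximum)


small_campaign = holdout_size(2_000, 10.0, 500)   # percentage applies: 200
large_campaign = holdout_size(10_000, 10.0, 500)  # cap applies: 500
```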
5.4 Flow Diagram¶
┌─────────────────────────────────────────────────────────────┐
│ Order Received from Integration Provider │
└────────────────────┬────────────────────────────────────────┘
│
├─────────────────────────────────────────┐
│ │
┌───────────▼──────────┐ ┌─────────────▼────────────┐
│ Check Experimental │ │ Check Holdout Match │
│ Attribution Methods │ │ (if no experimental) │
└───────────┬──────────┘ └─────────────┬────────────┘
│ │
┌───────────▼──────────┐ ┌─────────────▼────────────┐
│ 1. Email Match │ │ Email/Address Match │
│ (sent_at required) │ │ Status = Holdout │
├───────────────────────┤ ├──────────────────────────┤
│ 2. Address Match │ │ Window Start: │
│ (sent_at required) │ │ created_at │
├───────────────────────┤ │ │
│ 3. Discount Code │ │ holdout = True │
│ (no sent_at req'd) │ │ │
└───────────┬──────────┘ └─────────────┬────────────┘
│ │
┌───────────▼──────────┐ ┌─────────────▼────────────┐
│ Window Start: │ │ Processed immediately │
│ sent_at │ │ during sync │
│ │ │ │
│ holdout = False │ │ │
│ │ │ │
└───────────┬──────────┘ └─────────────┬────────────┘
│ │
┌───────────▼──────────┐ │
│ Processed by │ │
│ background job │ │
└───────────┬──────────┘ │
│ │
└─────────────┬───────────────────────────┘
│
┌─────────────▼────────────┐
│ Attribution Record Saved │
│ in Database │
└──────────────────────────┘
6. Multi-Order Attribution & Order Count¶
6.1 Subsequent Orders¶
Location: app/methods/attributions.py:process_attribution (lines 173-189)
The system tracks multiple orders from the same recipient:
# Fetch all attributions for this recipient-campaign pair
all_attributions = att.list_attributions_for_recipient_and_campaign_id(
    recipient_id=attribution.recipient_id,
    campaign_id=attribution.campaign_id
)
# Exclude current attribution from the list
all_attributions = [other for other in all_attributions
                    if other.attribution_id != attribution.attribution_id]
# Find previous attributions (orders before current)
previous_attributions = [other for other in all_attributions
                         if other.created < attribution.created]
# Store order count (1 for first order, 2 for second, etc.)
attribution.json["order_count"] = len(previous_attributions) + 1
6.2 Fallback Attribution Logic¶
Location: app/methods/attributions.py:process_attribution (lines 182-189)
Important: If current order is outside attribution window BUT a previous order from the same recipient passed rules, the current order can still pass:
# Even if current order is outside the attribution window
if len(all_attributions) > 0 and not attribution.passes_rules:
    first_attribution = all_attributions[0]
    if first_attribution.passes_rules:
        # Current order ALSO passes rules!
        attribution.passes_rules = True
Example:
Recipient sent mail: 2024-01-01
First order: 2024-01-10 (9 days later) → Within window, passes_rules = True
Second order: 2024-03-15 (74 days later) → Outside 63-day window
BUT first order passed, so second order also passes_rules = True
This allows attributing repeat purchases even if they occur after the attribution window expires, as long as the customer's first purchase was within the window.
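The example above can be replayed with a small sketch. The Attr dataclass and apply_fallback function are hypothetical simplifications; in particular, the sketch assumes the list of other attributions is ordered oldest first, so that index 0 is the recipient's first order, which this document does not confirm.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Attr:
    created: date
    passes_rules: bool


def apply_fallback(current: Attr, others: List[Attr]) -> bool:
    """Inherit passes_rules from the first other attribution when the current one fails."""
    if others and not current.passes_rules:
        if others[0].passes_rules:
            return True
    return current.passes_rules


first = Attr(created=date(2024, 1, 10), passes_rules=True)    # within the window
second = Attr(created=date(2024, 3, 15), passes_rules=False)  # outside the 63-day window
second_passes = apply_fallback(second, [first])
```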
6.3 Order Count Usage¶
The order_count field is used in:
- Reporting and analytics
- Identifying first-time buyers vs repeat customers
- Calculating customer lifetime value (LTV)
- Holdout analysis for first purchase vs all purchases (ExperimentMetric.FirstOrder vs ExperimentMetric.AllOrders)
7. Additional Business Rules¶
7.1 Zero-Value Orders¶
Location: app/methods/attributions.py:process_attribution (lines 191-193)
# Always fail attribution for $0 orders
if attribution.json.get("order_value") == 0:
    attribution.passes_rules = False
This prevents test orders, cancelled orders, or data errors from being counted as conversions.
7.2 Archived Campaigns¶
Location: app/methods/attributions.py:process_attribution (lines 196-201)
Archived attributions are:
- Still processed and stored in database
- Excluded from reporting queries
- Used for historical analysis
7.3 Blacklist Filtering¶
Location: app/celery/sync_orders.py:check_for_blacklist (line 320)
Organization-specific blacklist logic at sync time:
# Example: Organization 10 excludes orders with "wholesale" discount codes
if order.attributed_campaign > 0 and order.organization == 10:
    discount_codes = order.json.get("discount_codes", [])
    if len(discount_codes) > 0:
        for code in discount_codes:
            code = code.lower().strip()
            if "wholesale" in code:
                order.attributed_campaign = -1  # Clear attribution
                break
This is organization-specific logic that prevents certain order types from being attributed.
7.5 Internal Orders¶
Location: app/celery/sync_orders.py:is_order_made_internally (line 480)
Orders from company email domains are excluded:
def is_order_made_internally(order: OrderEvent, organization: Organization) -> bool:
    organization_email_domain = organization.get_email_domain()
    valid_org_domain = organization_email_domain is not None and len(organization_email_domain) > 3
    if order.email is not None and valid_org_domain and order.email.endswith(organization_email_domain):
        return True
    return False
Example: If organization email is support@acmeco.com, orders from john@acmeco.com are excluded.
7.6 Active Subscription Check (Physical Address Only)¶
Location: app/celery/sync_orders.py:check_for_physical_address_match (lines 286-299)
For physical address matches only, check if customer had active subscription before mail was sent:
active_subscription_start = datetime.fromisoformat(
    order.json.get("included_profile", {}).get("active_subscription_start_date")
)
if active_subscription_start < recipient.sent_at:
    logger.info(f"active subscription found for recipient. Should ignore order {order.id}")
    return order  # No attribution
This prevents attributing orders from existing subscribers who were sent winback campaigns.
8. Experiment Analysis¶
8.1 Statistical Significance Testing¶
Location: app/methods/holdouts.py:ab_test_statistical_significance (line 205)
Uses Z-test for proportions to determine if experimental group outperformed control group:
def ab_test_statistical_significance(control, experiment, alpha=0.05):
    # a = control group, b = experimental group
    # (conv_rate_* and n_* are derived from the two inputs; that step is omitted here)
    # Pooled conversion rate
    pooled_prob = (conv_rate_a * n_a + conv_rate_b * n_b) / (n_a + n_b)
    # Standard error
    se = sqrt(pooled_prob * (1 - pooled_prob) * ((1/n_a) + (1/n_b)))
    # Z-score
    z_score = (conv_rate_b - conv_rate_a) / se
    # P-value (one-tailed)
    p_value = 1.0 - norm.cdf(z_score)
    # Win probability
    win_prob = norm.cdf(z_score)
    # Statistical significance (default alpha = 0.05)
    stat_significance = p_value < alpha
    return {
        "winner": "experiment" if win_prob > 0.5 else "control",
        "win_probability": win_prob,
        "statsig": stat_significance
    }
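A self-contained version of the same one-tailed Z-test is sketched below so the calculation can be run directly. It is an illustration rather than the code in app/methods/holdouts.py: the function name z_test and its count-based parameters are assumptions, the normal CDF is computed with math.erf instead of a statistics library, and the zero-variance guard is an addition of this sketch.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def z_test(control_conversions, control_n, experiment_conversions, experiment_n, alpha=0.05):
    """One-tailed two-proportion Z-test: does the experiment beat the control?"""
    rate_a = control_conversions / control_n
    rate_b = experiment_conversions / experiment_n
    # Pooled conversion rate across both groups.
    pooled = (control_conversions + experiment_conversions) / (control_n + experiment_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / experiment_n))
    if se == 0:
        # No variation at all (nobody or everybody converted): test is undefined.
        return {"winner": None, "win_probability": 0.5, "statsig": False}
    z_score = (rate_b - rate_a) / se
    win_prob = normal_cdf(z_score)
    p_value = 1.0 - win_prob
    return {
        "winner": "experiment" if win_prob > 0.5 else "control",
        "win_probability": win_prob,
        "statsig": p_value < alpha,
    }


# 2% vs 3% on 1,000 recipients each: experiment leads but is not significant.
modest = z_test(20, 1000, 30, 1000)
# 2% vs 4% on the same sample sizes: significant at alpha = 0.05.
strong = z_test(20, 1000, 40, 1000)
```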
8.2 Incremental Metrics¶
Location: app/methods/holdouts.py:calculate_holdout_analysis (line 130)
1. Conversion Uplift¶
metric_difference = experiment_conversion_rate - control_conversion_rate
metric_uplift = metric_difference / control_conversion_rate
Example: If control = 2% and experiment = 3%, uplift = 50%
2. Incremental Revenue¶
revenue_per_recipient_exp = experiment_revenue / experiment_recipients
revenue_per_recipient_ctrl = control_revenue / control_recipients
incremental_revenue = (revenue_per_recipient_exp - revenue_per_recipient_ctrl) * experiment_recipients
This calculates total additional revenue generated by the campaign.
3. Incremental ROAS (Return on Ad Spend)¶
Incremental ROAS is incremental revenue divided by campaign cost. Example: If incremental_revenue = $10,000 and cost = $2,000, ROAS = 5.0 (5x return)
4. Customer Acquisition Cost (CAC)¶
test_orders_per_send = experiment_orders / experiment_recipients
holdout_orders_per_send = control_orders / control_recipients
incremental_customers_per_send = test_orders_per_send - holdout_orders_per_send
incremental_customers = incremental_customers_per_send * experiment_recipients
cac = total_cost / incremental_customers
This calculates the cost to acquire one incremental customer through the campaign.
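The four metrics in this section can be computed together, as in the sketch below. The function incremental_metrics and its parameters are hypothetical; defining conversion rate as orders divided by recipients and returning None when a denominator is zero or non-positive are assumptions of this sketch.

```python
def incremental_metrics(
    exp_recipients, exp_orders, exp_revenue,
    ctrl_recipients, ctrl_orders, ctrl_revenue,
    total_cost,
):
    """Uplift, incremental revenue, incremental ROAS and CAC from group totals."""
    exp_rate = exp_orders / exp_recipients
    ctrl_rate = ctrl_orders / ctrl_recipients

    # 1. Conversion uplift relative to the control rate.
    uplift = (exp_rate - ctrl_rate) / ctrl_rate if ctrl_rate else None

    # 2. Incremental revenue, scaled to the experimental group size.
    incremental_revenue = (
        exp_revenue / exp_recipients - ctrl_revenue / ctrl_recipients
    ) * exp_recipients

    # 3. Incremental ROAS: incremental revenue per unit of cost.
    roas = incremental_revenue / total_cost if total_cost else None

    # 4. CAC: cost per incremental customer.
    incremental_customers = (exp_rate - ctrl_rate) * exp_recipients
    cac = total_cost / incremental_customers if incremental_customers > 0 else None

    return {
        "uplift": uplift,
        "incremental_revenue": incremental_revenue,
        "incremental_roas": roas,
        "cac": cac,
    }


# 3% vs 2% conversion, $3.00 vs $2.00 revenue per recipient, $2,000 cost.
metrics = incremental_metrics(10_000, 300, 30_000.0, 1_000, 20, 2_000.0, 2_000.0)
```

With these inputs the results reproduce the worked examples in this section (roughly 50% uplift, $10,000 incremental revenue, ROAS of 5.0) up to floating-point rounding.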
8.3 Experiment Metrics Modes¶
Location: app/methods/holdouts.py:ExperimentMetric (line 33)
Two analysis modes:
AllOrders Mode¶
- Counts all attributed orders (first + repeat purchases)
- Uses total revenue from all orders
- Better for understanding total campaign impact
FirstOrder Mode¶
- Counts only first purchase per customer (order_count == 1)
- Uses revenue only from first orders
- Better for understanding customer acquisition effectiveness
Example:
# Generate holdout stats with first order focus
holdout_stats = generate_holdout_stats(
    holdout_recipient_count=500,
    holdout_attributions=all_holdout_attributions,
    metric=ExperimentMetric.FirstOrder  # Only count first purchases
)
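The difference between the two modes comes down to a filter on order_count. The sketch below uses plain dictionaries and hypothetical helper names (select_attributions, summarize) rather than the real generate_holdout_stats; the order_value key is borrowed from the zero-value order rule.

```python
def select_attributions(attributions, first_order_only):
    """AllOrders keeps everything; FirstOrder keeps only order_count == 1."""
    if first_order_only:
        return [a for a in attributions if a.get("order_count") == 1]
    return list(attributions)


def summarize(attributions, first_order_only=False):
    selected = select_attributions(attributions, first_order_only)
    return {
        "orders": len(selected),
        "revenue": sum(a["order_value"] for a in selected),
    }


attributions = [
    {"order_count": 1, "order_value": 50.0},  # customer A, first order
    {"order_count": 2, "order_value": 30.0},  # customer A, repeat order
    {"order_count": 1, "order_value": 20.0},  # customer B, first order
]
all_orders = summarize(attributions)
first_orders = summarize(attributions, first_order_only=True)
```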
9. Edge Cases & Important Behaviors¶
9.1 Recipient Created but Not Sent¶
Scenario: Recipient exists with created_at but sent_at = None
Outcome:
- Discount code attributions: ✅ Pass rules
- All other methods: ❌ Fail rules
- Attribution record is still created and processed
- processed = True, but passes_rules depends on method
Why: Discount codes can work before sending (pre-campaign purchases), but email/address matching requires the mail to have influenced behavior.
Test Reference: app/tests/methods/test_attributions.py:TestProcessAttributionUnsentRecipient
9.2 Order Before Recipient Creation¶
Experimental Group¶
Not possible in normal flow because:
- Attribution window starts at sent_at
- sent_at must be after created_at
- Orders before created_at would have negative initial_days
Control Group (Holdout)¶
Explicitly checked and rejected:
order_newer_than_recipient = order_datetime > holdout_recipient.created_at
if not order_newer_than_recipient:
    return False  # No attribution
Location: app/celery/sync_orders.py:check_for_holdout_order (line 110)
9.3 Multiple Recipients with Same Email¶
Scenario: Two recipients in different campaigns have the same email, both status = Sent
Outcome: Order is attributed to most recently created recipient
Location: app/celery/sync_orders.py:check_for_matched_emails (line 223)
# Sort recipients by created_at, most recent first
recipients = sorted(recipients, key=lambda x: x.created_at, reverse=True)
for recipient in recipients:
    if matched is False:
        matched = True
        order.attributed_campaign = recipient.campaign_id
    else:
        # Log duplicate attribution
        logger.warning(
            "Order duplicate attributed",
            ...
        )
Additional recipients trigger an "Order duplicate attributed" log event but don't create separate attributions.
9.4 Both Email and Address Match¶
Scenario: Order matches both email and physical address for same recipient
Outcome: Email match wins (checked first)
Why: Email matching runs first, and address matching only executes if order.attributed_campaign < 0:
# Check email first
if order.email is not None and any_address_match:
    order = check_for_matched_emails(...)
# Only check address if no attribution yet
if order.attributed_campaign < 0 and any_address_match:
    order = check_for_physical_address_match(...)
Location: app/celery/sync_orders.py:attribute_order (lines 363-369)
9.5 Active Subscription Before Send (Address Match Only)¶
Scenario: Customer had active subscription starting before sent_at, then places order
Outcome: No attribution created (for physical address match only)
Why: Prevents attributing orders from existing subscribers who received winback campaigns
Location: app/celery/sync_orders.py:check_for_physical_address_match (lines 286-299)
active_subscription_start = datetime.fromisoformat(
    order.json.get("included_profile", {}).get("active_subscription_start_date")
)
if active_subscription_start < recipient.sent_at:
    return order  # Skip attribution
Note: This check only applies to physical address matches, not email matches.
9.6 Campaign Not in Post-Launch State (Holdout Only)¶
Scenario: Holdout recipient created, order comes in, but campaign status = Pending
Outcome: No attribution created
Location: app/celery/sync_orders.py:check_for_holdout_order (lines 119-121)
if campaign and not campaign.should_process_stats():
    return False
# should_process_stats() returns True for: Active, Completed, Paused
# Returns False for: Pending, Archived, Error
This prevents holdout attributions before campaign launches.
9.7 BFCM Campaign Without end_date¶
Scenario: Campaign has flow_trigger = BFCM but no end_date set
Outcome: Validation error when creating/updating campaign
Location: app/core/models/campaigns.py:validate_bfcm_end_date (line 276)
if self.settings.flow_trigger == FlowTrigger.Bfcm:
    if not self.end_date:
        raise BfcmEndDateValidationError("BFCM campaigns require an end_date")
    # Validate end_date is between Nov 13 - Dec 31
    month, day = self.end_date.month, self.end_date.day
    if not ((month == 11 and day >= 13) or (month == 12)):
        raise BfcmEndDateValidationError(
            f"BFCM end_date must be between November 13 - December 31, got {self.end_date}"
        )
10. Code References¶
10.1 Primary Files¶
| File | Purpose | Key Functions |
|---|---|---|
| app/celery/sync_orders.py | Order syncing and initial attribution | process_orders, attribute_order, check_for_holdout_order |
| app/methods/attributions.py | Attribution processing and validation | process_attribution, insert_unprocessed_attribution |
| app/methods/holdouts.py | Holdout analysis and statistics | calculate_holdout_analysis, ab_test_statistical_significance |
| app/core/models/attributions.py | Attribution data model and queries | Attribution, upsert_attribution_by_provider_id |
| app/core/models/campaign_recipients.py | Recipient data model and queries | CampaignRecipient, get_campaign_recipient |
| app/core/models/campaigns.py | Campaign data model and settings | Campaign, get_attribution_window_max/min |
10.2 Key Line References¶
Experimental Group Attribution Window¶
- Start date calculation: app/methods/attributions.py:151
- Holdout campaign override: app/methods/attributions.py:154-155
- Window validation: app/methods/attributions.py:159-162
- BFCM cutoff: app/methods/attributions.py:164-171
Control Group Attribution Window¶
- Holdout check: app/celery/sync_orders.py:78-207
- Date calculation: app/celery/sync_orders.py:109
- Window validation: app/celery/sync_orders.py:124-125
Unsent Recipient Handling¶
- Early exit: app/methods/attributions.py:140-148
- Discount code exception: app/methods/attributions.py:144
Attribution Methods¶
- Email match: app/celery/sync_orders.py:210-258
- Address match: app/celery/sync_orders.py:261-317
- Discount code: app/celery/sync_orders.py:372-423
- Holdout check: app/celery/sync_orders.py:430-431
Multi-Order Tracking¶
- Previous attributions: app/methods/attributions.py:173-179
- Fallback logic: app/methods/attributions.py:182-189
10.3 Test References¶
| Test File | Coverage |
|---|---|
| app/tests/methods/test_attributions.py | Unsent recipients, attribution timing |
| app/tests/methods/test_holdouts.py | Holdout analysis, statistics |
| app/tests/celery/test_sync_orders.py | Order syncing, attribution methods |
| app/tests/models/test_campaigns.py | Campaign settings, BFCM validation |
Appendix: Flow Diagram¶
┌─────────────────────────────────────────────────────────────┐
│ Background Job: process_orders() - Every 5 minutes │
└────────────────────┬────────────────────────────────────────┘
│
├─ For each enabled organization
│
┌───────────▼──────────┐
│ Get checkpoint date │
│ (last synced order) │
└───────────┬──────────┘
│
┌───────────▼──────────┐
│ Fetch orders from │
│ Klaviyo/Ometria/ │
│ Shopify since │
│ checkpoint │
└───────────┬──────────┘
│
┌───────────▼──────────┐
│ For each order: │
│ │
│ 1. Skip if internal │
│ 2. attribute_order() │
│ 3. check_blacklist() │
│ 4. Save order if │
│ attributed │
└───────────┬──────────┘
│
┌───────────▼──────────────────────────────────┐
│ attribute_order() - Cascading Match Logic │
├───────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ 1. Email Match (Status=Sent) │ │
│ │ - Match email │ │
│ │ - Require sent_at │ │
│ │ - Use most recent recipient │ │
│ │ - Cache in Redis │ │
│ └─────────────────────────────────────────┘ │
│ │ If no match │
│ ┌─────────────────▼─────────────────────┐ │
│ │ 2. Physical Address Match (Status=Sent) │ │
│ │ - Match address1+address2+zip │ │
│ │ - Require sent_at │ │
│ │ - Skip if subscription before send │ │
│ │ - Cache in Redis │ │
│ └─────────────────────────────────────────┘ │
│ │ If no match │
│ ┌─────────────────▼─────────────────────┐ │
│ │ 3. Discount Code Match │ │
│ │ - Match campaign-level codes │ │
│ │ - Match unique recipient codes │ │
│ │ - Works without sent_at! │ │
│ │ - Cache in Redis │ │
│ └─────────────────────────────────────────┘ │
│ │ If no match │
│ ┌─────────────────▼─────────────────────┐ │
│ │ 4. Holdout Match (Status=Holdout) │ │
│ │ - Match email or address │ │
│ │ - Use created_at (not sent_at!) │ │
│ │ - Check campaign post-launch │ │
│ │ - Create with holdout=True │ │
│ │ - Process & save immediately │ │
│ └─────────────────────────────────────────┘ │
│ │
└───────────────────┬───────────────────────────┘
│
┌───────────────────▼────────────────────────┐
│ If attributed_campaign > 0: │
│ - Save order to database │
│ - insert_unprocessed_attribution() │
│ (reads Redis cache, creates attribution) │
│ - Log OrderAttributed event │
└───────────────────┬────────────────────────┘
│
┌───────────────────▼────────────────────────┐
│ Update checkpoint to latest order datetime │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Background Job: process_attributions() - Separate worker │
└────────────────────┬────────────────────────────────────────┘
│
┌───────────▼──────────┐
│ Get all unprocessed │
│ attributions │
└───────────┬──────────┘
│
┌───────────▼──────────────────────────────────┐
│ For each attribution: │
│ process_attribution() │
├───────────────────────────────────────────────┤
│ │
│ 1. Load recipient, campaign, organization │
│ │
│ 2. Check if sent_at exists │
│ ├─ If None and method != DiscountCode: │
│ │ → Set passes_rules = False, processed = │
│ │ True, return │
│ └─ If None and method == DiscountCode: │
│ → Set passes_rules = True, processed = │
│ True, return │
│ │
│ 3. Calculate attribution window start │
│ ├─ Default: start_date = sent_at │
│ └─ If holdout.enabled: start_date = │
│ created_at │
│ │
│ 4. Calculate days_since_dispatch │
│ initial_days = (order_date - start_date) │
│ .days │
│ │
│ 5. Check if within window │
│ if min_days <= initial_days < max_days: │
│ passes_rules = True │
│ else: │
│ passes_rules = False │
│ │
│ 6. Apply BFCM absolute cutoff (if applicable) │
│ if BFCM and order_date > cutoff_date: │
│ passes_rules = False │
│ │
│ 7. Check previous attributions │
│ if first_attribution.passes_rules: │
│ current.passes_rules = True │
│ │
│ 8. Exclude $0 orders │
│ if order_value == 0: │
│ passes_rules = False │
│ │
│ 9. Mark archived campaigns │
│ if campaign.status == Archived: │
│ attribution.archived = True │
│ │
│ 10. Set processed = True, save │
│ │
└───────────────────────────────────────────────┘
Summary¶
The PaperRun attribution system combines cascading attribution methods, timing windows derived from both campaign and recipient dates, separate processing paths for experimental and control groups, and a set of edge-case rules.
Critical Insights for Test vs Control Groups¶
- Dual Attribution Window Constraints:
  - Both experimental and control groups validate against TWO timing constraints:
    - Campaign constraint: campaign.first_send_date + min_days
    - Recipient constraint: recipient.created_at + min_days
  - Order must occur on or after BOTH thresholds (the later of the two dates)
- Same min_days Logic for Both Groups:
  - Experimental recipients: recipient.created_at + min_days
  - Control recipients: recipient.created_at + min_days
  - min_days = 1 for holdout-enabled campaigns, 3 for standard campaigns
  - This ensures fair comparison by applying the same recipient timing rules to both groups
- Different Processing Paths:
  - Experimental: Checked first via email/address/discount methods
  - Control: Checked last as fallback, immediately processed
- Discount Code Exception:
  - ONLY method that works for unsent recipients
  - Can attribute orders before mailers are sent
- Holdout Campaign Setting:
  - When campaign.settings.holdout.enabled = True:
    - min_days = 1 (instead of 3) for ALL recipients
    - max_days = 60 (instead of 63)
  - Changes windows for everyone, ensuring fair A/B testing
- Fair Comparison:
  - Both groups use the same min_days from recipient creation
  - Both groups use the same min_days from campaign first send
  - Attribution windows align to ensure valid statistical comparison
  - Differences in attribution rates can be confidently attributed to the campaign treatment
This design enables accurate A/B testing by applying the same window thresholds to both groups: each order is validated against campaign.first_send_date + min_days and recipient.created_at + min_days. The groups differ in eligibility rather than in timing: experimental email and address matches require sent_at to be set, whereas holdout recipients never have a sent_at and are matched on their Holdout status.