Order Attribution Methodology¶

Executive Summary¶

PaperRun's attribution system tracks customer orders back to marketing campaigns using multiple matching methods with sophisticated timing constraints. The system handles both experimental groups (recipients who receive mailers) and control groups/holdouts (recipients excluded from mailings to measure campaign effectiveness).

1. Attribution Flow Overview¶

1.1 High-Level Process¶

1. Order Syncing (every 5 minutes via app/celery/sync_orders.py:process_orders)
   - Fetches orders from integration providers (Klaviyo, Ometria, Shopify)
   - Uses checkpoint-based incremental syncing
   - Processes orders through attribution pipeline
2. Order Attribution (attribute_order function)
   - Attempts to match orders to campaign recipients using cascading methods
   - Creates unprocessed attribution records
3. Attribution Processing (process_attribution function)
   - Validates attribution timing windows
   - Applies business rules
   - Marks attributions as valid/invalid

1.2 Key Components¶

Order Syncing: app/celery/sync_orders.py
Attribution Logic: app/methods/attributions.py
Holdout Logic: app/methods/holdouts.py
Data Models: app/core/models/attributions.py, app/core/models/campaign_recipients.py

2. Attribution Methods (Priority Order)¶

2.1 Method Hierarchy¶

The system tries attribution methods in this order:

1. Email Match (`AttributionMethod.Email`)¶

Match Criteria: Order email == recipient email (case-insensitive)
Requirements:
Recipient status = Sent
sent_at timestamp must exist
Timing: Uses sent_at as attribution window start
Implementation: app/celery/sync_orders.py:check_for_matched_emails (line 210)

2. Physical Address Match (`AttributionMethod.PhysicalAddress`)¶

Match Criteria: Order shipping address == recipient address
Concatenates: address1 + address2 + zip (lowercased, spaces removed)
Requirements:
Recipient status = Sent
sent_at is not None
Special Case: Skips attribution if active subscription existed before sent_at
Timing: Uses sent_at as attribution window start
Implementation: app/celery/sync_orders.py:check_for_physical_address_match (line 261)
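
The concatenate-and-normalize step can be sketched as follows. This is a minimal illustration, assuming the comparison key is built by joining the three fields, lowercasing, and stripping all whitespace. The function name `normalize_address_key` is hypothetical and is not taken from sync_orders.py, and treating missing fields as empty strings is an assumption.

```python
from typing import Optional


def normalize_address_key(
    address1: Optional[str], address2: Optional[str], zip_code: Optional[str]
) -> str:
    """Build a comparison key: address1 + address2 + zip, lowercased, whitespace removed.

    Hypothetical helper; missing fields are treated as empty strings (an assumption).
    """
    joined = "".join(part or "" for part in (address1, address2, zip_code))
    return "".join(joined.lower().split())


order_key = normalize_address_key("12 High Street", "Flat 3", "SW1A 1AA")
recipient_key = normalize_address_key("12 high street", "flat 3", "sw1a1aa")
print(order_key == recipient_key)  # True
```

Because the key drops all spacing and casing, cosmetic differences between the order's shipping address and the recipient record do not block a match; genuinely different spellings (for example "St" vs "Street") still would.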

3. Discount Code Match (`AttributionMethod.DiscountCode`)¶

Match Criteria: Order discount code matches campaign or recipient code
Two Types:
Campaign-level codes: Stored in campaign.settings.discount_code
Unique recipient codes: When campaign.settings.use_unique_discount = True
Important: This is the ONLY method that passes rules for unsent recipients
Timing: Can work without sent_at
Implementation: app/celery/sync_orders.py:attribute_order (line 372)

4. Holdout Match (Fallback)¶

When Checked: Only if no experimental group attribution found
Match Criteria: Email or physical address match with recipient status = Holdout
Creates: Separate holdout attribution with holdout=True flag
Timing: Uses created_at as attribution window start (different from experimental!)
Implementation: app/celery/sync_orders.py:check_for_holdout_order (line 78)

3. Critical Timing Constraints¶

3.1 Key Date Fields¶

Campaign Recipient Fields¶

created_at: When recipient record was created in database
sent_at: When mailer was successfully sent to print provider (status changed to Sent)
status: Recipient lifecycle state
Pending → ProofingStaged → Proofing → Proofed → SendingStaged → Sending → Sent
Or: Holdout (never progresses to Sent)

Attribution Fields¶

created: Order datetime (when customer placed order)

Campaign Fields¶

start_date: Campaign start date
end_date: Campaign end date (used for BFCM absolute cutoff)

3.2 Attribution Window Rules for Experimental Groups (Sent Recipients)¶

Location: app/methods/attributions.py:process_attribution (lines 150-171)

The attribution system validates orders based on TWO timing constraints that must both be satisfied:

Campaign timing: campaign.first_send_date + min_days
Recipient timing: recipient.created_at + min_days

The order must occur on or after BOTH thresholds to be attributed. The system takes the maximum (latest) of these two dates as the effective attribution window start.

from datetime import timedelta

# Calculate campaign threshold
campaign_min_date = campaign.first_send_date + timedelta(days=min_days)

# Calculate recipient threshold
recipient_min_date = recipient.created_at + timedelta(days=min_days)

# Effective attribution window starts at the LATER of the two dates
effective_start = max(campaign_min_date, recipient_min_date)

# Order must fall in [effective_start, effective_start + max_days).
# max_days is a day count, so it is converted to a timedelta before adding.
window_end = effective_start + timedelta(days=max_days)
if effective_start <= order_date < window_end:
    attribution.passes_rules = True
else:
    attribution.passes_rules = False
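
A worked example may make the two-threshold rule easier to follow. The sketch below is self-contained and assumes that min_days and max_days are plain day counts and that the window length is measured from the effective start; `passes_window` is a hypothetical name, not a function in attributions.py.

```python
from datetime import datetime, timedelta


def passes_window(order_date, first_send_date, recipient_created_at, min_days, max_days):
    """Return True if the order falls inside [effective_start, effective_start + max_days)."""
    campaign_min_date = first_send_date + timedelta(days=min_days)
    recipient_min_date = recipient_created_at + timedelta(days=min_days)
    effective_start = max(campaign_min_date, recipient_min_date)
    return effective_start <= order_date < effective_start + timedelta(days=max_days)


first_send = datetime(2024, 1, 1)
created = datetime(2024, 1, 5)  # recipient added after the campaign's first send

# Standard campaign (min_days=3, max_days=63): effective_start is 2024-01-08
print(passes_window(datetime(2024, 1, 6), first_send, created, 3, 63))   # False (too early)
print(passes_window(datetime(2024, 1, 10), first_send, created, 3, 63))  # True
print(passes_window(datetime(2024, 3, 11), first_send, created, 3, 63))  # False (window closed)
```

Here the recipient threshold (2024-01-08) is later than the campaign threshold (2024-01-04), so it determines where the window starts.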

Attribution Window Sizes¶

Location: app/core/models/campaigns.py:get_attribution_window_max/min

| Campaign Type | Min Days (Campaign & Recipient) | Max Days | Notes |
| --- | --- | --- | --- |
| Holdout Enabled | 1 | 60 | Both campaign and recipient use 1 day |
| FirstPurchase Flow | 3 | 180 | Longer window for acquisition |
| BFCM Flow | 3 | 63 | PLUS absolute cutoff at `end_date + 2 days` |
| Standard Campaign | 3 | 63 | Default window |
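
The table can be read as a simple lookup. The sketch below is illustrative only: the dictionary keys are made-up labels rather than real enum members, `get_window` is a hypothetical name, and it assumes the holdout setting takes precedence over the flow type.

```python
from typing import NamedTuple


class Window(NamedTuple):
    min_days: int
    max_days: int


# Values copied from the table above; keys are illustrative labels, not real enum members.
WINDOWS = {
    "holdout_enabled": Window(1, 60),
    "first_purchase": Window(3, 180),
    "bfcm": Window(3, 63),
    "standard": Window(3, 63),
}


def get_window(holdout_enabled: bool, flow: str = "standard") -> Window:
    """Pick the window for a campaign; assumes holdout settings override the flow type."""
    if holdout_enabled:
        return WINDOWS["holdout_enabled"]
    return WINDOWS.get(flow, WINDOWS["standard"])


print(get_window(False, "first_purchase"))  # Window(min_days=3, max_days=180)
print(get_window(True, "bfcm"))             # Window(min_days=1, max_days=60)
```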

Important: The min_days value applies to BOTH the campaign first_send_date and the recipient created_at. This ensures orders are attributed only after both:

- Sufficient time has passed since the campaign first sent (to allow mail delivery)
- Sufficient time has passed since the recipient was created (to allow for recipient-specific processing)

BFCM Special Case¶

Location: app/methods/attributions.py:process_attribution (lines 164-171)

BFCM campaigns have an additional hard cutoff date:

- Calculated as: min(end_date + 2 days, December 31)
- 2-day buffer accounts for timezone differences (UK vs US campaigns)
- Prevents attribution windows from extending into next year
- Orders after this date fail attribution even if within min/max day window

Example:

from datetime import date, timedelta

# BFCM campaign with end_date = 2024-11-29
end_date = date(2024, 11, 29)
absolute_cutoff = min(end_date + timedelta(days=2), date(end_date.year, 12, 31))  # 2024-12-01

# Order placed on 2024-12-05 (35 days after sent_at)
# Within 63-day window BUT after absolute cutoff
# Result: passes_rules = False

3.3 Attribution Window Rules for Control Groups (Holdouts)¶

Location: app/celery/sync_orders.py:check_for_holdout_order (lines 78-207)

Critical Difference: Holdouts use created_at, not sent_at, because they never receive mailers.

Like experimental groups, holdout attribution also validates orders based on TWO timing constraints:

Campaign timing: campaign.first_send_date + min_days
Recipient timing: recipient.created_at + min_days

The order must occur on or after BOTH thresholds to be attributed.

from datetime import timedelta

# Calculate campaign threshold
campaign_min_date = campaign.first_send_date + timedelta(days=min_days)

# Calculate recipient threshold (using created_at since holdouts never have sent_at)
recipient_min_date = holdout_recipient.created_at + timedelta(days=min_days)

# Effective attribution window starts at the LATER of the two dates
effective_start = max(campaign_min_date, recipient_min_date)

# Order must fall in [effective_start, effective_start + max_days).
# max_days is a day count, so it is converted to a timedelta before adding.
window_end = effective_start + timedelta(days=max_days)
if effective_start <= order_datetime < window_end:
    passes_attribution_rules = True
else:
    passes_attribution_rules = False

Holdout-Specific Rules¶

Uses recipient created_at (not sent_at) as the recipient timing constraint
Same min_days value as experimental group (from campaign settings):
  - 1 day for campaigns with holdout enabled
  - 3 days for standard campaigns
Same max_days value as experimental group (from campaign settings)
Orders must satisfy BOTH campaign and recipient timing thresholds
Campaign must be in post-launch state (campaign.should_process_stats())
Orders < $1 are excluded

3.4 Unsent Recipients - Special Handling¶

Location: app/methods/attributions.py:process_attribution (lines 140-148)

# Early exit for recipients without sent_at
if recipient.sent_at is None:
    logger.info(f"Did not process attribution...")

    # ONLY DiscountCode attributions pass rules for unsent recipients
    passes_rules = attribution.method == AttributionMethod.DiscountCode

    attribution.processed = True
    attribution.passes_rules = passes_rules
    att.upsert_attribution_by_provider_id(attribution=attribution)
    return

Key Insight: If sent_at is None:

- ✅ DiscountCode attributions → passes_rules = True
- ❌ All other methods (Email, PhysicalAddress, etc.) → passes_rules = False

Why?

- Discount codes can be used before mailers are sent (pre-campaign purchases)
- Email/address matching requires the mailer to have been sent to influence behavior

4. Order Syncing & Checkpoint Management¶

4.1 Sync Process¶

Frequency: Every 5 minutes (300 seconds)

Location: app/celery/sync_orders.py:process_orders

Flow:

1. Fetch all enabled organizations
2. For each organization with sync_orders = True:
   - Get mail integration (Klaviyo, Ometria, or Shopify)
   - Fetch orders since last checkpoint
   - Process each order through attribution pipeline
   - Update checkpoint to latest order datetime

4.2 Checkpoint Logic¶

Location: app/celery/sync_orders.py:get_organization_sync_checkpoint

def get_organization_sync_checkpoint(organization: Organization) -> Optional[datetime]:
    # 1. Try to use last_order_sync from organization config
    last_sync_checkpoint = organization.configuration.last_order_sync

    # 2. If no checkpoint, use timestamp of most recent order in DB
    last_order = get_last_order(organization=organization.id)
    if last_order is not None and not last_sync_checkpoint:
        last_sync_checkpoint = last_order.datetime

    return last_sync_checkpoint

Checkpoint Update: app/celery/sync_orders.py:sync_organization_page_order (line 613)

# After processing all orders, update checkpoint to latest order datetime
events = sorted(events, key=lambda x: x.datetime, reverse=True)
next_checkpoint = events[0].datetime
update_organization_order_checkpoint(organization, next_checkpoint)
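
The excerpt above indexes events[0], which would raise an IndexError on an empty page unless the caller guards against it (the source may well do so elsewhere). A minimal sketch of the same "latest order datetime" rule with an explicit empty-page guard is shown below; `next_checkpoint` is a hypothetical helper, and keeping the current checkpoint when nothing was fetched is an assumption.

```python
from datetime import datetime
from typing import Optional, Sequence


def next_checkpoint(
    event_datetimes: Sequence[datetime], current: Optional[datetime]
) -> Optional[datetime]:
    """Return the newest datetime in the page, or keep the current checkpoint if the page is empty."""
    if not event_datetimes:
        return current
    return max(event_datetimes)


page = [datetime(2024, 1, 2), datetime(2024, 1, 5), datetime(2024, 1, 3)]
print(next_checkpoint(page, None))                 # 2024-01-05 00:00:00
print(next_checkpoint([], datetime(2024, 1, 1)))   # 2024-01-01 00:00:00
```

Using max() avoids sorting the whole page just to read its first element.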

4.3 Integration Providers¶

Klaviyo¶

Uses UNIX timestamp for checkpoint
Fetches "Placed Order" metric events via find_events_for_metric()
Converts events to OrderEvent objects

Ometria¶

Uses ISO datetime for checkpoint
Fetches orders via list_orders(since=checkpoint, limit=150)
Converts to OrderEvent objects

Shopify¶

Uses ISO datetime for checkpoint
Fetches order events via GraphQL: list_order_events_since(checkpoint, per_page=100, limit=1000)
Converts to OrderEvent objects
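
Since the providers expect different checkpoint representations, a single conversion point can keep the stored checkpoint provider-neutral. The sketch below assumes the checkpoint is held as a timezone-aware datetime, that Klaviyo's UNIX timestamp is in whole seconds, and that lowercase provider names are used as keys; `format_checkpoint` is a hypothetical helper.

```python
from datetime import datetime, timezone


def format_checkpoint(provider: str, checkpoint: datetime):
    """Convert a checkpoint to the representation each provider section describes."""
    if provider == "klaviyo":
        return int(checkpoint.timestamp())  # UNIX timestamp in seconds (assumed unit)
    if provider in ("ometria", "shopify"):
        return checkpoint.isoformat()  # ISO 8601 string
    raise ValueError(f"Unknown provider: {provider}")


checkpoint = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(format_checkpoint("klaviyo", checkpoint))  # 1704110400
print(format_checkpoint("shopify", checkpoint))  # 2024-01-01T12:00:00+00:00
```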

4.4 Attribution Caching¶

Location: app/celery/sync_orders.py (multiple functions)

The system uses Redis caching to pass attribution data between order processing and attribution creation:

# Cache key format
redis_key = f"attribution_helper_order_provider_id_{order.provider_id}"

# Cached for 600 seconds (10 minutes)
RedisClient.set_dict(redis_key, {
    "email": order.email,
    "datetime": order.datetime,
    "recipient_id": recipient.id,
    "method": att.AttributionMethod.Email,  # or other method
    "holdout": False,
    "discount_code": code,  # if discount code match
    "discount_match": True  # if discount matched
}, 600)

Read: app/methods/attributions.py:insert_unprocessed_attribution (line 56)

This cache is read when creating the initial attribution record and then deleted.
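
The write, read-once, delete lifecycle can be illustrated without a Redis server. The class below is a hypothetical in-memory stand-in, not the project's RedisClient; it assumes values are stored as JSON, which is why the order datetime is shown as an ISO string (json.dumps cannot serialize datetime objects directly).

```python
import json
import time
from typing import Optional


class InMemoryCache:
    """Hypothetical stand-in for the Redis client, for illustration only."""

    def __init__(self) -> None:
        self._store = {}

    def set_dict(self, key: str, value: dict, ttl_seconds: int) -> None:
        # Store the serialized value together with its expiry time
        self._store[key] = (json.dumps(value), time.monotonic() + ttl_seconds)

    def pop_dict(self, key: str) -> Optional[dict]:
        """Read and delete in one step; returns None if missing or expired."""
        entry = self._store.pop(key, None)
        if entry is None or entry[1] < time.monotonic():
            return None
        return json.loads(entry[0])


cache = InMemoryCache()
key = "attribution_helper_order_provider_id_abc123"
cache.set_dict(
    key,
    {"email": "a@example.com", "datetime": "2024-01-10T09:00:00",
     "recipient_id": 42, "method": "Email", "holdout": False},
    600,
)

helper = cache.pop_dict(key)   # first read returns the cached dict
missing = cache.pop_dict(key)  # second read returns None: the entry was deleted
```

One consequence of this design is that an attribution created more than 10 minutes after the match would find no helper data, so the two steps need to run close together.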

5. Test vs Control Group Differences¶

5.1 Recipient Status Tracking¶

Experimental Group (recipients who receive mail)¶

Status Progression:

Pending → ProofingStaged → Proofing → Proofed →
SendingStaged → Sending → Sent

sent_at: Populated when status changes to Sent
Attribution Start: Uses sent_at as window start

Control Group (holdout recipients)¶

Status: Holdout (never progresses to Sent)
sent_at: Remains None
Attribution Start: Uses created_at as window start

5.2 Attribution Process Differences¶

| Aspect | Experimental Group | Control Group |
| --- | --- | --- |
| Attribution Window Start | `max(campaign.first_send_date + min_days, recipient.created_at + min_days)` | `max(campaign.first_send_date + min_days, recipient.created_at + min_days)` |
| Recipient Date Used | `recipient.created_at` (for min_days calculation) | `recipient.created_at` (for min_days calculation) |
| Checked When | Primary attribution flow | Fallback if no experimental match |
| Methods Supported | Email, PhysicalAddress, DiscountCode | Email, PhysicalAddress |
| Order in Process | Checked first | Checked last (only if `attributed_campaign <= 0`) |
| Database Flag | `attribution.holdout = False` | `attribution.holdout = True` |
| Min Days Value | 1 for holdout campaigns, 3 for standard campaigns | Same as experimental (1 for holdout campaigns, 3 for standard) |
| Processed Status | Set during attribution processing job | Set immediately during sync |

Key Insight: Both experimental and control groups now use the same min_days logic for recipient timing. This creates fair A/B testing by ensuring both groups have the same minimum wait period from recipient creation before orders can be attributed.

5.3 Holdout Campaign Settings¶

Location: app/core/models/campaigns.py:CampaignHoldout

Campaigns can enable holdouts via settings:

campaign.settings.holdout = CampaignHoldout(
    enabled=True,
    holdout_percentage=10.0,  # 10% of recipients
    holdout_absolute_maximum=500  # Cap at 500 recipients
)
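
How the percentage and the absolute cap interact can be sketched as follows. This is an illustration only: `holdout_size` is a hypothetical function, and rounding down is an assumption; the actual selection logic may round differently.

```python
def holdout_size(
    recipient_count: int, holdout_percentage: float, holdout_absolute_maximum: int
) -> int:
    """Number of recipients to hold out: a percentage of the audience, capped at a maximum.

    Rounding down is an assumption; the real implementation may differ.
    """
    by_percentage = int(recipient_count * holdout_percentage / 100)
    return min(by_percentage, holdout_absolute_maximum)


print(holdout_size(2_000, 10.0, 500))   # 200
print(holdout_size(20_000, 10.0, 500))  # 500 (capped)
```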

Important: When campaign.settings.holdout.enabled = True:

- ALL recipients (including experimental) use created_at for attribution window
- Min days = 1 (instead of 3)
- Max days = 60 (instead of 63)

This is different from having some recipients with status = Holdout. The holdout.enabled flag changes timing for everyone.

5.4 Flow Diagram¶

┌─────────────────────────────────────────────────────────────┐
│ Order Received from Integration Provider                    │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ├─────────────────────────────────────────┐
                     │                                         │
         ┌───────────▼──────────┐               ┌─────────────▼────────────┐
         │ Check Experimental    │               │ Check Holdout Match     │
         │ Attribution Methods   │               │ (if no experimental)    │
         └───────────┬──────────┘               └─────────────┬────────────┘
                     │                                         │
         ┌───────────▼──────────┐               ┌─────────────▼────────────┐
         │ 1. Email Match        │               │ Email/Address Match      │
         │    (sent_at required) │               │ Status = Holdout         │
         ├───────────────────────┤               ├──────────────────────────┤
         │ 2. Address Match      │               │ Window Start:            │
         │    (sent_at required) │               │   created_at             │
         ├───────────────────────┤               │                          │
         │ 3. Discount Code      │               │ holdout = True           │
         │    (no sent_at req'd) │               │                          │
         └───────────┬──────────┘               └─────────────┬────────────┘
                     │                                         │
         ┌───────────▼──────────┐               ┌─────────────▼────────────┐
         │ Window Start:         │               │ Processed immediately    │
         │   sent_at             │               │ during sync              │
         │                       │               │                          │
         │ holdout = False       │               │                          │
         │                       │               │                          │
         └───────────┬──────────┘               └─────────────┬────────────┘
                     │                                         │
         ┌───────────▼──────────┐                             │
         │ Processed by          │                             │
         │ background job        │                             │
         └───────────┬──────────┘                             │
                     │                                         │
                     └─────────────┬───────────────────────────┘
                                   │
                     ┌─────────────▼────────────┐
                     │ Attribution Record Saved │
                     │ in Database              │
                     └──────────────────────────┘

6. Multi-Order Attribution & Order Count¶

6.1 Subsequent Orders¶

Location: app/methods/attributions.py:process_attribution (lines 173-189)

The system tracks multiple orders from the same recipient:

# Fetch all attributions for this recipient-campaign pair
all_attributions = att.list_attributions_for_recipient_and_campaign_id(
    recipient_id=attribution.recipient_id,
    campaign_id=attribution.campaign_id
)

# Exclude current attribution from the list
all_attributions = [other for other in all_attributions
                   if other.attribution_id != attribution.attribution_id]

# Find previous attributions (orders before current)
previous_attributions = [other for other in all_attributions
                        if other.created < attribution.created]

# Store order count (1 for first order, 2 for second, etc.)
attribution.json["order_count"] = len(previous_attributions) + 1

6.2 Fallback Attribution Logic¶

Location: app/methods/attributions.py:process_attribution (lines 182-189)

Important: If the current order is outside the attribution window BUT a previous order from the same recipient passed rules, the current order can still pass:

# Even if current order is outside the attribution window
if len(all_attributions) > 0 and not attribution.passes_rules:
    first_attribution = all_attributions[0]
    if first_attribution.passes_rules:
        # Current order ALSO passes rules!
        attribution.passes_rules = True

Example:

Recipient sent mail: 2024-01-01
First order: 2024-01-10 (9 days later) → Within window, passes_rules = True
Second order: 2024-03-15 (74 days later) → Outside 63-day window
    BUT first order passed, so second order also passes_rules = True

This allows attributing repeat purchases even if they occur after the attribution window expires, as long as the customer's first purchase was within the window.
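
The order-count and fallback rules can be combined in one small, runnable sketch. `Attr` is a simplified, hypothetical stand-in for the Attribution model, and `apply_multi_order_rules` is not a function from the codebase. Note one assumption: the sketch picks the earliest other order explicitly, whereas the excerpt above takes all_attributions[0], whose ordering depends on the underlying query.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Attr:  # hypothetical, simplified stand-in for the Attribution model
    attribution_id: int
    created: datetime  # order datetime
    passes_rules: bool


def apply_multi_order_rules(current: Attr, all_for_recipient: List[Attr]) -> int:
    """Return the order count and apply the fallback rule to `current` in place."""
    others = [a for a in all_for_recipient if a.attribution_id != current.attribution_id]
    previous = [a for a in others if a.created < current.created]
    # Assumption: "first" means the earliest other order for this recipient and campaign
    if not current.passes_rules and others:
        first = min(others, key=lambda a: a.created)
        if first.passes_rules:
            current.passes_rules = True
    return len(previous) + 1


first = Attr(1, datetime(2024, 1, 10), True)
second = Attr(2, datetime(2024, 3, 15), False)  # outside the 63-day window on its own
count = apply_multi_order_rules(second, [first, second])
print(count, second.passes_rules)  # 2 True
```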

6.3 Order Count Usage¶

The order_count field is used in:

- Reporting and analytics
- Identifying first-time buyers vs repeat customers
- Calculating customer lifetime value (LTV)
- Holdout analysis for first purchase vs all purchases (ExperimentMetric.FirstOrder vs ExperimentMetric.AllOrders)

7. Additional Business Rules¶

7.1 Zero-Value Orders¶

Location: app/methods/attributions.py:process_attribution (lines 191-193)

# Always fail attribution for $0 orders
if attribution.json.get("order_value") == 0:
    attribution.passes_rules = False

This prevents test orders, cancelled orders, or data errors from being counted as conversions.

7.2 Archived Campaigns¶

Location: app/methods/attributions.py:process_attribution (lines 196-201)

if campaign.status == CampaignStatus.Archived:
    attribution.archived = True

Archived attributions are:

- Still processed and stored in database
- Excluded from reporting queries
- Used for historical analysis

7.3 Blacklist Filtering¶

Location: app/celery/sync_orders.py:check_for_blacklist (line 320)

Organization-specific blacklist logic at sync time:

# Example: Organization 10 excludes orders with "wholesale" discount codes
if order.attributed_campaign > 0 and order.organization == 10:
    discount_codes = order.json.get("discount_codes", [])
    if len(discount_codes) > 0:
        for code in discount_codes:
            code = code.lower().strip()
            if "wholesale" in code:
                order.attributed_campaign = -1  # Clear attribution
                break

This is organization-specific logic that prevents certain order types from being attributed.

7.5 Internal Orders¶

Location: app/celery/sync_orders.py:is_order_made_internally (line 480)

Orders from company email domains are excluded:

def is_order_made_internally(order: OrderEvent, organization: Organization) -> bool:
    organization_email_domain = organization.get_email_domain()
    valid_org_domain = organization_email_domain is not None and len(organization_email_domain) > 3

    if order.email is not None and valid_org_domain and order.email.endswith(organization_email_domain):
        return True
    return False

Example: If organization email is support@acmeco.com, orders from john@acmeco.com are excluded.
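
One thing to be aware of with a suffix check: if get_email_domain() returns a bare domain such as acmeco.com (an assumption; it may instead include the leading @), then an address like jane@notacmeco.com would also match. The sketch below shows a stricter variant that compares the domain part exactly. It is an alternative for illustration, not the behavior of is_order_made_internally, and `is_internal_email` is a hypothetical name.

```python
from typing import Optional


def is_internal_email(order_email: Optional[str], org_domain: Optional[str]) -> bool:
    """Compare the domain part exactly instead of using a suffix match."""
    if not order_email or not org_domain or len(org_domain) <= 3:
        return False
    # Everything after the last "@" is the email's domain
    _, _, email_domain = order_email.rpartition("@")
    return email_domain.lower() == org_domain.lower().lstrip("@")


print(is_internal_email("john@acmeco.com", "acmeco.com"))     # True
print(is_internal_email("jane@notacmeco.com", "acmeco.com"))  # False
```

The trade-off is that exact comparison no longer treats subdomains (for example mail.acmeco.com) as internal.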

7.6 Active Subscription Check (Physical Address Only)¶

Location: app/celery/sync_orders.py:check_for_physical_address_match (lines 286-299)

For physical address matches only, the system checks whether the customer had an active subscription before the mail was sent:

active_subscription_start = datetime.fromisoformat(
    order.json.get("included_profile", {}).get("active_subscription_start_date")
)

if active_subscription_start < recipient.sent_at:
    logger.info(f"active subscription found for recipient. Should ignore order {order.id}")
    return order  # No attribution

This prevents attributing orders from existing subscribers who were sent winback campaigns.
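
datetime.fromisoformat() raises a TypeError when given None, so the excerpt presumably relies on surrounding code (not shown here) to handle profiles without a subscription date. A defensive version of the comparison might look like the sketch below; `subscription_predates_send` is a hypothetical helper, and treating a missing or unparseable date as "no prior subscription" is an assumed policy.

```python
from datetime import datetime
from typing import Optional


def subscription_predates_send(order_json: dict, sent_at: datetime) -> bool:
    """True if the profile's active subscription started before the mailer was sent.

    Returns False when the date is missing or unparseable (an assumed policy).
    """
    profile = order_json.get("included_profile") or {}
    raw: Optional[str] = profile.get("active_subscription_start_date")
    if not raw:
        return False
    try:
        started = datetime.fromisoformat(raw)
    except ValueError:
        return False
    return started < sent_at


sent = datetime(2024, 2, 1)
order = {"included_profile": {"active_subscription_start_date": "2024-01-15T00:00:00"}}
print(subscription_predates_send(order, sent))  # True
print(subscription_predates_send({}, sent))     # False
```

Both datetimes must be either timezone-aware or naive; comparing one of each raises a TypeError in Python.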

8. Experiment Analysis¶

8.1 Statistical Significance Testing¶

Location: app/methods/holdouts.py:ab_test_statistical_significance (line 205)

Uses Z-test for proportions to determine if experimental group outperformed control group:

def ab_test_statistical_significance(control, experiment, alpha=0.05):
    # Pooled conversion rate
    pooled_prob = (conv_rate_a * n_a + conv_rate_b * n_b) / (n_a + n_b)

    # Standard error
    se = sqrt(pooled_prob * (1 - pooled_prob) * ((1/n_a) + (1/n_b)))

    # Z-score
    z_score = (conv_rate_b - conv_rate_a) / se

    # P-value (one-tailed)
    p_value = 1.0 - norm.cdf(z_score)

    # Win probability
    win_prob = norm.cdf(z_score)

    # Statistical significance (default alpha = 0.05)
    stat_significance = p_value < alpha

    return {
        "winner": "experiment" if win_prob > 0.5 else "control",
        "win_probability": win_prob,
        "statsig": stat_significance
    }
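
The excerpt above omits how the rates and sample sizes are derived from its arguments. A self-contained version of the same pooled one-tailed z-test is sketched below, using only the standard library (statistics.NormalDist in place of scipy's norm). The function name and the count-based signature are assumptions made for illustration, as is the handling of a zero standard error.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportions(conv_a: int, n_a: int, conv_b: int, n_b: int, alpha: float = 0.05) -> dict:
    """One-tailed pooled z-test: is group B's conversion rate higher than group A's?

    A is the control group and B is the experimental group.
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:  # nobody converted, or everybody did: no evidence either way
        return {"winner": None, "z_score": 0.0, "p_value": 1.0,
                "win_probability": 0.5, "statsig": False}
    z_score = (rate_b - rate_a) / se
    win_prob = NormalDist().cdf(z_score)
    p_value = 1.0 - win_prob
    return {
        "winner": "experiment" if win_prob > 0.5 else "control",
        "z_score": z_score,
        "p_value": p_value,
        "win_probability": win_prob,
        "statsig": p_value < alpha,
    }


# Control: 200 of 10,000 converted (2%); experiment: 300 of 10,000 (3%)
result = z_test_proportions(200, 10_000, 300, 10_000)
print(result["winner"], result["statsig"])  # experiment True
```

Note that win_probability here is simply 1 - p_value from the same normal approximation, so it carries the same information as the p-value rather than being an independent estimate.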

8.2 Incremental Metrics¶

Location: app/methods/holdouts.py:calculate_holdout_analysis (line 130)

1. Conversion Uplift¶

metric_difference = experiment_conversion_rate - control_conversion_rate
metric_uplift = metric_difference / control_conversion_rate

Example: If control = 2% and experiment = 3%, uplift = 50%

2. Incremental Revenue¶

revenue_per_recipient_exp = experiment_revenue / experiment_recipients
revenue_per_recipient_ctrl = control_revenue / control_recipients
incremental_revenue = (revenue_per_recipient_exp - revenue_per_recipient_ctrl) * experiment_recipients

This calculates total additional revenue generated by the campaign.

3. Incremental ROAS (Return on Ad Spend)¶

incremental_roas = incremental_revenue / total_campaign_cost

Example: If incremental_revenue = $10,000 and cost = $2,000, ROAS = 5.0 (5x return)

4. Customer Acquisition Cost (CAC)¶

test_orders_per_send = experiment_orders / experiment_recipients
holdout_orders_per_send = control_orders / control_recipients
incremental_customers_per_send = test_orders_per_send - holdout_orders_per_send
incremental_customers = incremental_customers_per_send * experiment_recipients

cac = total_cost / incremental_customers

This calculates the cost to acquire one incremental customer through the campaign.
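
The four formulas above can be tied together with one set of numbers. The sketch below is a worked example under stated assumptions: conversion rate is taken to be orders divided by recipients, `incremental_metrics` is a hypothetical function, and returning None when a denominator is zero or the increment is not positive is a choice made here, not documented behavior.

```python
def incremental_metrics(exp_recipients, exp_orders, exp_revenue,
                        ctrl_recipients, ctrl_orders, ctrl_revenue, total_cost):
    """Compute uplift, incremental revenue, incremental ROAS and CAC from group totals."""
    exp_rate = exp_orders / exp_recipients      # assumed definition of conversion rate
    ctrl_rate = ctrl_orders / ctrl_recipients
    uplift = (exp_rate - ctrl_rate) / ctrl_rate if ctrl_rate else None
    incremental_revenue = (
        exp_revenue / exp_recipients - ctrl_revenue / ctrl_recipients
    ) * exp_recipients
    incremental_customers = (exp_rate - ctrl_rate) * exp_recipients
    return {
        "uplift": uplift,
        "incremental_revenue": incremental_revenue,
        "incremental_roas": incremental_revenue / total_cost if total_cost else None,
        "cac": total_cost / incremental_customers if incremental_customers > 0 else None,
    }


# Experiment: 10,000 recipients, 300 orders, $30,000 revenue
# Control:     1,000 recipients,  20 orders,  $2,000 revenue; campaign cost $5,000
metrics = incremental_metrics(10_000, 300, 30_000.0, 1_000, 20, 2_000.0, 5_000.0)
print({name: round(value, 2) for name, value in metrics.items()})
# {'uplift': 0.5, 'incremental_revenue': 10000.0, 'incremental_roas': 2.0, 'cac': 50.0}
```

Scaling the per-recipient difference by the experimental group size is what lets groups of very different sizes (10,000 vs 1,000 here) be compared fairly.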

8.3 Experiment Metrics Modes¶

Location: app/methods/holdouts.py:ExperimentMetric (line 33)

Two analysis modes:

AllOrders Mode¶

Counts all attributed orders (first + repeat purchases)
Uses total revenue from all orders
Better for understanding total campaign impact

FirstOrder Mode¶

Counts only first purchase per customer (order_count == 1)
Uses revenue only from first orders
Better for understanding customer acquisition effectiveness

Example:

# Generate holdout stats with first order focus
holdout_stats = generate_holdout_stats(
    holdout_recipient_count=500,
    holdout_attributions=all_holdout_attributions,
    metric=ExperimentMetric.FirstOrder  # Only count first purchases
)
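
The difference between the two modes comes down to which attributions are counted. A minimal sketch of that filter is shown below; `Metric` is a hypothetical stand-in for ExperimentMetric, attributions are represented as plain dicts for brevity, and `select_for_metric` is not a function from holdouts.py.

```python
from enum import Enum


class Metric(Enum):  # hypothetical stand-in for ExperimentMetric
    AllOrders = "all_orders"
    FirstOrder = "first_order"


def select_for_metric(attributions, metric):
    """Keep every attribution for AllOrders; only order_count == 1 for FirstOrder."""
    if metric is Metric.FirstOrder:
        return [a for a in attributions if a.get("order_count") == 1]
    return list(attributions)


attributions = [
    {"recipient_id": 1, "order_count": 1, "order_value": 40.0},
    {"recipient_id": 1, "order_count": 2, "order_value": 25.0},
    {"recipient_id": 2, "order_count": 1, "order_value": 60.0},
]
first_only = select_for_metric(attributions, Metric.FirstOrder)
print(len(first_only), sum(a["order_value"] for a in first_only))  # 2 100.0
```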

9. Edge Cases & Important Behaviors¶

9.1 Recipient Created but Not Sent¶

Scenario: Recipient exists with created_at but sent_at = None

Outcome:

- Discount code attributions: ✅ Pass rules
- All other methods: ❌ Fail rules
- Attribution record is still created and processed
- processed = True, but passes_rules depends on method

Why: Discount codes can work before sending (pre-campaign purchases), but email/address matching requires the mail to have influenced behavior.

Test Reference: app/tests/methods/test_attributions.py:TestProcessAttributionUnsentRecipient

9.2 Order Before Recipient Creation¶

Experimental Group¶

Not possible in normal flow because:

- Attribution window starts at sent_at
- sent_at must be after created_at
- Orders before created_at would have negative initial_days

Control Group (Holdout)¶

Explicitly checked and rejected:

order_newer_than_recipient = order_datetime > holdout_recipient.created_at
if not order_newer_than_recipient:
    return False  # No attribution

Location: app/celery/sync_orders.py:check_for_holdout_order (line 110)

9.3 Multiple Recipients with Same Email¶

Scenario: Two recipients in different campaigns have the same email, both status = Sent

Outcome: Order is attributed to most recently created recipient

Location: app/celery/sync_orders.py:check_for_matched_emails (line 223)

# Sort recipients by created_at, most recent first
recipients = sorted(recipients, key=lambda x: x.created_at, reverse=True)

for recipient in recipients:
    if matched is False:
        matched = True
        order.attributed_campaign = recipient.campaign_id
    else:
        # Log duplicate attribution
        logger.warning(
            "Order duplicate attributed",
            ...
        )

Additional recipients trigger an "Order duplicate attributed" log event but don't create separate attributions.

9.4 Both Email and Address Match¶

Scenario: Order matches both email and physical address for same recipient

Outcome: Email match wins (checked first)

Why: Email matching runs first, and address matching only executes if order.attributed_campaign < 0:

# Check email first
if order.email is not None and any_address_match:
    order = check_for_matched_emails(...)

# Only check address if no attribution yet
if order.attributed_campaign < 0 and any_address_match:
    order = check_for_physical_address_match(...)

Location: app/celery/sync_orders.py:attribute_order (lines 363-369)

9.5 Active Subscription Before Send (Address Match Only)¶

Scenario: Customer had active subscription starting before sent_at, then places order

Outcome: No attribution created (for physical address match only)

Why: Prevents attributing orders from existing subscribers who received winback campaigns

Location: app/celery/sync_orders.py:check_for_physical_address_match (lines 286-299)

active_subscription_start = datetime.fromisoformat(
    order.json.get("included_profile", {}).get("active_subscription_start_date")
)

if active_subscription_start < recipient.sent_at:
    return order  # Skip attribution

Note: This check only applies to physical address matches, not email matches.

9.6 Campaign Not in Post-Launch State (Holdout Only)¶

Scenario: Holdout recipient created, order comes in, but campaign status = Pending

Outcome: No attribution created

Location: app/celery/sync_orders.py:check_for_holdout_order (lines 119-121)

if campaign and not campaign.should_process_stats():
    return False

# should_process_stats() returns True for: Active, Completed, Paused
# Returns False for: Pending, Archived, Error

This prevents holdout attributions before campaign launches.

9.7 BFCM Campaign Without end_date¶

Scenario: Campaign has flow_trigger = BFCM but no end_date set

Outcome: Validation error when creating/updating campaign

Location: app/core/models/campaigns.py:validate_bfcm_end_date (line 276)

if self.settings.flow_trigger == FlowTrigger.Bfcm:
    if not self.end_date:
        raise BfcmEndDateValidationError("BFCM campaigns require an end_date")

    # Validate end_date is between Nov 13 - Dec 31
    month, day = self.end_date.month, self.end_date.day
    if not ((month == 11 and day >= 13) or (month == 12)):
        raise BfcmEndDateValidationError(
            f"BFCM end_date must be between November 13 - December 31, got {self.end_date}"
        )

10. Code References¶

10.1 Primary Files¶

| File | Purpose | Key Functions |
| --- | --- | --- |
| `app/celery/sync_orders.py` | Order syncing and initial attribution | `process_orders`, `attribute_order`, `check_for_holdout_order` |
| `app/methods/attributions.py` | Attribution processing and validation | `process_attribution`, `insert_unprocessed_attribution` |
| `app/methods/holdouts.py` | Holdout analysis and statistics | `calculate_holdout_analysis`, `ab_test_statistical_significance` |
| `app/core/models/attributions.py` | Attribution data model and queries | `Attribution`, `upsert_attribution_by_provider_id` |
| `app/core/models/campaign_recipients.py` | Recipient data model and queries | `CampaignRecipient`, `get_campaign_recipient` |
| `app/core/models/campaigns.py` | Campaign data model and settings | `Campaign`, `get_attribution_window_max/min` |

10.2 Key Line References¶

Experimental Group Attribution Window¶

Start date calculation: app/methods/attributions.py:151
Holdout campaign override: app/methods/attributions.py:154-155
Window validation: app/methods/attributions.py:159-162
BFCM cutoff: app/methods/attributions.py:164-171

Control Group Attribution Window¶

Holdout check: app/celery/sync_orders.py:78-207
Date calculation: app/celery/sync_orders.py:109
Window validation: app/celery/sync_orders.py:124-125

Unsent Recipient Handling¶

Early exit: app/methods/attributions.py:140-148
Discount code exception: app/methods/attributions.py:144

Attribution Methods¶

Email match: app/celery/sync_orders.py:210-258
Address match: app/celery/sync_orders.py:261-317
Discount code: app/celery/sync_orders.py:372-423
Holdout check: app/celery/sync_orders.py:430-431

Multi-Order Tracking¶

Previous attributions: app/methods/attributions.py:173-179
Fallback logic: app/methods/attributions.py:182-189

10.3 Test References¶

| Test File | Coverage |
| --- | --- |
| `app/tests/methods/test_attributions.py` | Unsent recipients, attribution timing |
| `app/tests/methods/test_holdouts.py` | Holdout analysis, statistics |
| `app/tests/celery/test_sync_orders.py` | Order syncing, attribution methods |
| `app/tests/models/test_campaigns.py` | Campaign settings, BFCM validation |

Appendix: Flow Diagram¶

┌─────────────────────────────────────────────────────────────┐
│ Background Job: process_orders() - Every 5 minutes          │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ├─ For each enabled organization
                     │
         ┌───────────▼──────────┐
         │ Get checkpoint date   │
         │ (last synced order)   │
         └───────────┬──────────┘
                     │
         ┌───────────▼──────────┐
         │ Fetch orders from     │
         │ Klaviyo/Ometria/      │
         │ Shopify since         │
         │ checkpoint            │
         └───────────┬──────────┘
                     │
         ┌───────────▼──────────┐
         │ For each order:       │
         │                       │
         │ 1. Skip if internal   │
         │ 2. attribute_order()  │
         │ 3. check_blacklist()  │
         │ 4. Save order if      │
         │    attributed         │
         └───────────┬──────────┘
                     │
         ┌───────────▼──────────────────────────────────┐
         │ attribute_order() - Cascading Match Logic     │
         ├───────────────────────────────────────────────┤
         │                                               │
         │ ┌─────────────────────────────────────────┐  │
         │ │ 1. Email Match (Status=Sent)            │  │
         │ │    - Match email                        │  │
         │ │    - Require sent_at                    │  │
         │ │    - Use most recent recipient          │  │
         │ │    - Cache in Redis                     │  │
         │ └─────────────────────────────────────────┘  │
         │                     │ If no match             │
         │ ┌─────────────────▼─────────────────────┐  │
         │ │ 2. Physical Address Match (Status=Sent) │  │
         │ │    - Match address1+address2+zip        │  │
         │ │    - Require sent_at                    │  │
         │ │    - Skip if subscription before send   │  │
         │ │    - Cache in Redis                     │  │
         │ └─────────────────────────────────────────┘  │
         │                     │ If no match             │
         │ ┌─────────────────▼─────────────────────┐  │
         │ │ 3. Discount Code Match                  │  │
         │ │    - Match campaign-level codes         │  │
         │ │    - Match unique recipient codes       │  │
         │ │    - Works without sent_at!             │  │
         │ │    - Cache in Redis                     │  │
         │ └─────────────────────────────────────────┘  │
         │                     │ If no match             │
         │ ┌─────────────────▼─────────────────────┐  │
         │ │ 4. Holdout Match (Status=Holdout)       │  │
         │ │    - Match email or address             │  │
         │ │    - Use created_at (not sent_at!)      │  │
         │ │    - Check campaign post-launch         │  │
         │ │    - Create with holdout=True           │  │
         │ │    - Process & save immediately         │  │
         │ └─────────────────────────────────────────┘  │
         │                                               │
         └───────────────────┬───────────────────────────┘
                             │
         ┌───────────────────▼────────────────────────┐
         │ If attributed_campaign > 0:                 │
         │ - Save order to database                    │
         │ - insert_unprocessed_attribution()          │
         │   (reads Redis cache, creates attribution)  │
         │ - Log OrderAttributed event                 │
         └───────────────────┬────────────────────────┘
                             │
         ┌───────────────────▼────────────────────────┐
         │ Update checkpoint to latest order datetime  │
         └─────────────────────────────────────────────┘
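
The cascading match in the attribute_order() box can be sketched roughly as follows. This is a minimal illustration, not the code in app/celery/sync_orders.py: the `Match` dataclass, the `cascade` helper, the stub matcher functions, and the in-memory lookup dicts are all hypothetical names introduced here. The only behavior taken from the diagram is that the methods are tried in priority order and the first match wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Match:
    """Hypothetical result of a successful match (not the real model)."""
    campaign_id: int
    method: str
    holdout: bool = False


# A matcher takes an order dict and returns a Match or None.
Matcher = Callable[[dict], Optional[Match]]


def cascade(order: dict, matchers: Sequence[Matcher]) -> Optional[Match]:
    """Return the first match in priority order, or None if nothing matches."""
    for matcher in matchers:
        match = matcher(order)
        if match is not None:
            return match
    return None


# Stub matchers standing in for the four methods in the diagram.
def email_match(order: dict) -> Optional[Match]:
    sent = {"a@example.com": 10}  # assumed: email -> campaign_id for Sent recipients
    campaign_id = sent.get(order.get("email", "").lower())
    return Match(campaign_id, "Email") if campaign_id else None


def address_match(order: dict) -> Optional[Match]:
    return None  # stub: this sketch carries no address data


def discount_code_match(order: dict) -> Optional[Match]:
    codes = {"SPRING10": 20}  # assumed: discount code -> campaign_id
    campaign_id = codes.get(order.get("discount_code", ""))
    return Match(campaign_id, "DiscountCode") if campaign_id else None


def holdout_match(order: dict) -> Optional[Match]:
    holdouts = {"h@example.com": 30}  # assumed: email -> campaign_id for Holdout recipients
    campaign_id = holdouts.get(order.get("email", "").lower())
    return Match(campaign_id, "Holdout", holdout=True) if campaign_id else None


# Priority order as shown in the diagram.
MATCHERS = [email_match, address_match, discount_code_match, holdout_match]
```

Keeping the matchers in an ordered list makes the priority explicit: an order that matches both by email and by discount code is attributed by email, because that matcher runs first.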

┌─────────────────────────────────────────────────────────────┐
│ Background Job: process_attributions() - Separate worker    │
└────────────────────┬────────────────────────────────────────┘
                     │
         ┌───────────▼──────────┐
         │ Get all unprocessed   │
         │ attributions          │
         └───────────┬──────────┘
                     │
         ┌───────────▼──────────────────────────────────┐
         │ For each attribution:                         │
         │ process_attribution()                         │
         ├───────────────────────────────────────────────┤
         │                                               │
         │ 1. Load recipient, campaign, organization     │
         │                                               │
         │ 2. Check if sent_at exists                    │
         │    ├─ If None and method != DiscountCode:     │
         │    │  → Set passes_rules = False, processed = │
         │    │    True, return                           │
         │    └─ If None and method == DiscountCode:     │
         │       → Set passes_rules = True, processed =  │
         │         True, return                           │
         │                                               │
         │ 3. Calculate attribution window start         │
         │    ├─ Default: start_date = sent_at           │
         │    └─ If holdout.enabled: start_date =        │
         │       created_at                               │
         │                                               │
         │ 4. Calculate days_since_dispatch              │
         │    initial_days = (order_date - start_date)   │
         │                   .days                        │
         │                                               │
         │ 5. Check if within window                     │
         │    if min_days <= initial_days < max_days:    │
         │       passes_rules = True                     │
         │    else:                                      │
         │       passes_rules = False                    │
         │                                               │
         │ 6. Apply BFCM absolute cutoff (if applicable) │
         │    if BFCM and order_date > cutoff_date:      │
         │       passes_rules = False                    │
         │                                               │
         │ 7. Check previous attributions                │
         │    if first_attribution.passes_rules:         │
         │       current.passes_rules = True             │
         │                                               │
         │ 8. Exclude $0 orders                          │
         │    if order_value == 0:                       │
         │       passes_rules = False                    │
         │                                               │
         │ 9. Mark archived campaigns                    │
         │    if campaign.status == Archived:            │
         │       attribution.archived = True             │
         │                                               │
         │ 10. Set processed = True, save                │
         │                                               │
         └───────────────────────────────────────────────┘
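
Steps 2 through 8 of the process_attribution() box can be read as a single decision function. The sketch below follows the diagram's step order literally. It is an illustration under stated assumptions, not the code in app/methods/attributions.py: `AttributionInput`, its field names, and `passes_rules_sketch` are hypothetical, and the default window of 3 to 63 days is the standard-campaign value quoted in this document.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class AttributionInput:
    """Hypothetical flattened view of an attribution and its related records."""
    method: str                       # "Email", "PhysicalAddress" or "DiscountCode"
    order_date: datetime
    order_value: float
    sent_at: Optional[datetime]
    created_at: datetime
    holdout_enabled: bool = False     # campaign.settings.holdout.enabled
    min_days: int = 3                 # 1 for holdout-enabled campaigns
    max_days: int = 63                # 60 for holdout-enabled campaigns
    bfcm_cutoff: Optional[datetime] = None
    first_attribution_passed: bool = False


def passes_rules_sketch(a: AttributionInput) -> bool:
    # Step 2: without sent_at, only discount-code attributions pass.
    if a.sent_at is None:
        return a.method == "DiscountCode"

    # Step 3: choose the start of the attribution window.
    start_date = a.created_at if a.holdout_enabled else a.sent_at

    # Step 4: whole days between window start and the order.
    initial_days = (a.order_date - start_date).days

    # Step 5: lower bound inclusive, upper bound exclusive.
    passes = a.min_days <= initial_days < a.max_days

    # Step 6: BFCM absolute cutoff overrides the window.
    if a.bfcm_cutoff is not None and a.order_date > a.bfcm_cutoff:
        passes = False

    # Step 7: a valid first attribution carries later orders through.
    if a.first_attribution_passed:
        passes = True

    # Step 8: $0 orders never pass.
    if a.order_value == 0:
        passes = False

    return passes
```

Because later steps overwrite earlier ones, the order matters: in this reading a $0 order fails even when an earlier attribution for the same recipient passed.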

Summary¶

The PaperRun attribution system applies different timing rules to experimental and control groups, tries multiple attribution methods in cascading priority order, and handles several edge cases (unsent recipients, BFCM cutoffs, $0 orders, archived campaigns).

Critical Insights for Test vs Control Groups¶

Dual Attribution Window Constraints:
Both experimental and control groups validate against two timing constraints:
- Campaign constraint: campaign.first_send_date + min_days
- Recipient constraint: recipient.created_at + min_days
The order must occur on or after both thresholds, that is, on or after the later of the two dates.

Same min_days Logic for Both Groups:
- Experimental recipients: recipient.created_at + min_days
- Control recipients: recipient.created_at + min_days
- min_days = 1 for holdout-enabled campaigns, 3 for standard campaigns
- Applying the same recipient timing rule to both groups keeps the comparison fair

Different Processing Paths:
- Experimental: checked first via the email, address, and discount code methods
- Control: checked last as a fallback and processed immediately

Discount Code Exception:
- The only method that works for unsent recipients
- Can attribute orders placed before mailers are sent

Holdout Campaign Setting:
When campaign.settings.holdout.enabled = True:
- min_days = 1 (instead of 3) for all recipients
- max_days = 60 (instead of 63)
- The window changes for everyone in the campaign, not only for holdouts

Fair Comparison:
- Both groups use the same min_days from recipient creation
- Both groups use the same min_days from campaign first send
- Attribution windows align, so the two groups can be compared statistically
- Differences in attribution rates can then be attributed to the campaign treatment rather than to differing windows

This design supports accurate A/B testing: experimental and control groups share the same recipient timing constraint (based on created_at) and differ only in the date that starts their attribution window (sent_at for the experimental group's sent-based logic, created_at for holdouts).
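
The dual constraint described under Dual Attribution Window Constraints reduces to taking the later of two dates. The following is a minimal sketch: `earliest_eligible_order_date` is a hypothetical helper, not a function in the codebase, and it assumes min_days is 1 for holdout-enabled campaigns and 3 otherwise, as stated in this section.

```python
from datetime import datetime, timedelta


def earliest_eligible_order_date(
    first_send_date: datetime,
    recipient_created_at: datetime,
    holdout_enabled: bool,
) -> datetime:
    """Earliest order date that satisfies both timing constraints."""
    # Assumed values, taken from this document's summary.
    min_days = 1 if holdout_enabled else 3

    campaign_threshold = first_send_date + timedelta(days=min_days)
    recipient_threshold = recipient_created_at + timedelta(days=min_days)

    # The order must fall on or after both thresholds, i.e. the later one.
    return max(campaign_threshold, recipient_threshold)
```

For a recipient created before the campaign's first send, the campaign constraint is the binding one. For a recipient added after the first send, the recipient constraint is. The same function applies to experimental and control recipients, which is the point of the fair-comparison rule.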
