
Order Attribution Methodology

Executive Summary

PaperRun's attribution system traces customer orders back to marketing campaigns using a cascade of matching methods (email, physical address, discount code), each subject to timing constraints. The system handles both experimental groups (recipients who receive mailers) and control groups, or holdouts (recipients excluded from mailings so that campaign effectiveness can be measured).


Table of Contents

  1. Attribution Flow Overview
  2. Attribution Methods
  3. Critical Timing Constraints
  4. Order Syncing & Checkpoint Management
  5. Test vs Control Group Differences
  6. Multi-Order Attribution & Order Count
  7. Additional Business Rules
  8. Experiment Analysis
  9. Edge Cases & Important Behaviors
  10. Code References

1. Attribution Flow Overview

1.1 High-Level Process

  1. Order Syncing (every 5 minutes via app/celery/sync_orders.py:process_orders)
     • Fetches orders from integration providers (Klaviyo, Ometria, Shopify)
     • Uses checkpoint-based incremental syncing
     • Processes orders through attribution pipeline

  2. Order Attribution (attribute_order function)
     • Attempts to match orders to campaign recipients using cascading methods
     • Creates unprocessed attribution records

  3. Attribution Processing (process_attribution function)
     • Validates attribution timing windows
     • Applies business rules
     • Marks attributions as valid/invalid

1.2 Key Components

  • Order Syncing: app/celery/sync_orders.py
  • Attribution Logic: app/methods/attributions.py
  • Holdout Logic: app/methods/holdouts.py
  • Data Models: app/core/models/attributions.py, app/core/models/campaign_recipients.py

2. Attribution Methods (Priority Order)

2.1 Method Hierarchy

The system tries attribution methods in this order:

1. Email Match (AttributionMethod.Email)

  • Match Criteria: Order email == recipient email (case-insensitive)
  • Requirements:
    • Recipient status = Sent
    • sent_at timestamp must exist
  • Timing: Uses sent_at as attribution window start
  • Implementation: app/celery/sync_orders.py:check_for_matched_emails (line 210)

2. Physical Address Match (AttributionMethod.PhysicalAddress)

  • Match Criteria: Order shipping address == recipient address
  • Concatenates: address1 + address2 + zip (lowercased, spaces removed)
  • Requirements:
    • Recipient status = Sent
    • sent_at is not None
  • Special Case: Skips attribution if active subscription existed before sent_at
  • Timing: Uses sent_at as attribution window start
  • Implementation: app/celery/sync_orders.py:check_for_physical_address_match (line 261)
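
The address comparison described above can be sketched as follows. This is a minimal illustration, not the code in sync_orders.py: the function name normalize_address and the handling of missing fields are assumptions.

```python
def normalize_address(address1, address2, zip_code):
    """Build a comparison key: address1 + address2 + zip, lowercased, spaces removed.

    Hypothetical helper; treating missing parts as empty strings is an assumption.
    """
    parts = [address1 or "", address2 or "", zip_code or ""]
    return "".join(parts).lower().replace(" ", "")


# An order and a recipient that differ only in case and spacing produce the same key.
order_key = normalize_address("12 High Street", "Flat 3", "SW1A 1AA")
recipient_key = normalize_address("12 high street", "flat 3", "sw1a1aa")
print(order_key == recipient_key)
```

Because only case and spaces are normalized, variants such as "St" vs "Street" would still be treated as different addresses under this scheme.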

3. Discount Code Match (AttributionMethod.DiscountCode)

  • Match Criteria: Order discount code matches campaign or recipient code
  • Two Types:
    • Campaign-level codes: Stored in campaign.settings.discount_code
    • Unique recipient codes: When campaign.settings.use_unique_discount = True
  • Important: This is the ONLY method that passes rules for unsent recipients
  • Timing: Can work without sent_at
  • Implementation: app/celery/sync_orders.py:attribute_order (line 372)

4. Holdout Match (Fallback)

  • When Checked: Only if no experimental group attribution found
  • Match Criteria: Email or physical address match with recipient status = Holdout
  • Creates: Separate holdout attribution with holdout=True flag
  • Timing: Uses created_at as attribution window start (different from experimental!)
  • Implementation: app/celery/sync_orders.py:check_for_holdout_order (line 78)

3. Critical Timing Constraints

3.1 Key Date Fields

Campaign Recipient Fields

  • created_at: When recipient record was created in database
  • sent_at: When mailer was successfully sent to print provider (status changed to Sent)
  • status: Recipient lifecycle state
    • Pending → ProofingStaged → Proofing → Proofed → SendingStaged → Sending → Sent
    • Or: Holdout (never progresses to Sent)

Attribution Fields

  • created: Order datetime (when customer placed order)

Campaign Fields

  • start_date: Campaign start date
  • end_date: Campaign end date (used for BFCM absolute cutoff)

3.2 Attribution Window Rules for Experimental Groups (Sent Recipients)

Location: app/methods/attributions.py:process_attribution (lines 150-171)

The attribution system validates orders based on TWO timing constraints that must both be satisfied:

  1. Campaign timing: campaign.first_send_date + min_days
  2. Recipient timing: recipient.created_at + min_days

The order must occur on or after BOTH thresholds to be attributed. The system takes the maximum (latest) of these two dates as the effective attribution window start.

# Calculate campaign threshold
campaign_min_date = campaign.first_send_date + timedelta(days=min_days)

# Calculate recipient threshold
recipient_min_date = recipient.created_at + timedelta(days=min_days)

# Effective attribution window starts at the LATER of the two dates
effective_start = max(campaign_min_date, recipient_min_date)

# Order must be >= effective_start and within max_days window
# (max_days is a number of days, so it is converted to a timedelta before adding)
if order_date >= effective_start and order_date < (effective_start + timedelta(days=max_days)):
    attribution.passes_rules = True
else:
    attribution.passes_rules = False
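
To make the boundary behavior concrete, here is a self-contained sketch of the same check. The function name passes_window and its flat argument list are illustrative assumptions; the real logic lives in process_attribution and reads these values from the campaign and recipient objects.

```python
from datetime import datetime, timedelta


def passes_window(order_date, first_send_date, recipient_created_at, min_days, max_days):
    """Return True when the order falls inside the attribution window.

    The window opens at the later of the two thresholds and is half-open:
    effective_start <= order_date < effective_start + max_days.
    """
    campaign_min_date = first_send_date + timedelta(days=min_days)
    recipient_min_date = recipient_created_at + timedelta(days=min_days)
    effective_start = max(campaign_min_date, recipient_min_date)
    window_end = effective_start + timedelta(days=max_days)
    return effective_start <= order_date < window_end


# Standard campaign (min 3, max 63); the recipient was created after the first send,
# so the recipient threshold (2024-01-08) is the effective start.
first_send = datetime(2024, 1, 1)
created = datetime(2024, 1, 5)
print(passes_window(datetime(2024, 1, 7), first_send, created, 3, 63))   # before the window opens
print(passes_window(datetime(2024, 1, 8), first_send, created, 3, 63))   # exactly at the start
print(passes_window(datetime(2024, 3, 11), first_send, created, 3, 63))  # exactly at the end (excluded)
```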

Attribution Window Sizes

Location: app/core/models/campaigns.py:get_attribution_window_max/min

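| Campaign Type      | Min Days (Campaign & Recipient) | Max Days | Notes                                     |
|--------------------|---------------------------------|----------|-------------------------------------------|
| Holdout Enabled    | 1                               | 60       | Both campaign and recipient use 1 day     |
| FirstPurchase Flow | 3                               | 180      | Longer window for acquisition             |
| BFCM Flow          | 3                               | 63       | PLUS absolute cutoff at end_date + 2 days |
| Standard Campaign  | 3                               | 63       | Default window                            |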

Important: The min_days value applies to BOTH the campaign first_send_date and the recipient created_at. This ensures orders are attributed only after both:

  • Sufficient time has passed since the campaign first sent (to allow mail delivery)
  • Sufficient time has passed since the recipient was created (to allow for recipient-specific processing)

BFCM Special Case

Location: app/methods/attributions.py:process_attribution (lines 164-171)

BFCM campaigns have an additional hard cutoff date:

  • Calculated as: min(end_date + 2 days, December 31)
  • 2-day buffer accounts for timezone differences (UK vs US campaigns)
  • Prevents attribution windows from extending into next year
  • Orders after this date fail attribution even if within min/max day window

Example:

# BFCM campaign with end_date = 2024-11-29
absolute_cutoff = min(2024-12-01, 2024-12-31) = 2024-12-01

# Order placed on 2024-12-05 (35 days after sent_at)
# Within 63-day window BUT after absolute cutoff
# Result: passes_rules = False
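
The cutoff rule can be written out as a small runnable sketch. The helper names are hypothetical, and comparing at date granularity (rather than full datetimes) is an assumption made for readability.

```python
from datetime import date, timedelta


def bfcm_absolute_cutoff(end_date):
    """min(end_date + 2 days, December 31 of the end_date's year)."""
    buffered = end_date + timedelta(days=2)
    year_end = date(end_date.year, 12, 31)
    return min(buffered, year_end)


def passes_bfcm_cutoff(order_date, end_date):
    """Orders after the absolute cutoff fail, regardless of the min/max day window."""
    return order_date <= bfcm_absolute_cutoff(end_date)


print(bfcm_absolute_cutoff(date(2024, 11, 29)))                  # cutoff for the example above
print(passes_bfcm_cutoff(date(2024, 12, 5), date(2024, 11, 29)))  # the 2024-12-05 order
print(bfcm_absolute_cutoff(date(2024, 12, 31)))                  # buffer is clamped to year end
```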

3.3 Attribution Window Rules for Control Groups (Holdouts)

Location: app/celery/sync_orders.py:check_for_holdout_order (lines 78-207)

Critical Difference: Holdouts use created_at not sent_at because they never receive mailers!

Like experimental groups, holdout attribution also validates orders based on TWO timing constraints:

  1. Campaign timing: campaign.first_send_date + min_days
  2. Recipient timing: recipient.created_at + min_days

The order must occur on or after BOTH thresholds to be attributed.

# Calculate campaign threshold
campaign_min_date = campaign.first_send_date + timedelta(days=min_days)

# Calculate recipient threshold (using created_at since holdouts never have sent_at)
recipient_min_date = holdout_recipient.created_at + timedelta(days=min_days)

# Effective attribution window starts at the LATER of the two dates
effective_start = max(campaign_min_date, recipient_min_date)

# Order must be >= effective_start and within max_days window
# (max_days is a number of days, so it is converted to a timedelta before adding)
if order_datetime >= effective_start and order_datetime < (effective_start + timedelta(days=max_days)):
    passes_attribution_rules = True
else:
    passes_attribution_rules = False

Holdout-Specific Rules

  • Uses recipient created_at (not sent_at) as the recipient timing constraint
  • Same min_days value as experimental group (from campaign settings)
    • 1 day for campaigns with holdout enabled
    • 3 days for standard campaigns
  • Same max_days value as experimental group (from campaign settings)
  • Orders must satisfy BOTH campaign and recipient timing thresholds
  • Campaign must be in post-launch state (campaign.should_process_stats())
  • Orders < $1 are excluded

3.4 Unsent Recipients - Special Handling

Location: app/methods/attributions.py:process_attribution (lines 140-148)

# Early exit for recipients without sent_at
if recipient.sent_at is None:
    logger.info(f"Did not process attribution...")

    # ONLY DiscountCode attributions pass rules for unsent recipients
    passes_rules = attribution.method == AttributionMethod.DiscountCode

    attribution.processed = True
    attribution.passes_rules = passes_rules
    att.upsert_attribution_by_provider_id(attribution=attribution)
    return

Key Insight: If sent_at is None:

  • ✅ DiscountCode attributions → passes_rules = True
  • ❌ All other methods (Email, PhysicalAddress, etc.) → passes_rules = False

Why?

  • Discount codes can be used before mailers are sent (pre-campaign purchases)
  • Email/address matching requires the mailer to have been sent to influence behavior


4. Order Syncing & Checkpoint Management

4.1 Sync Process

Frequency: Every 5 minutes (300 seconds)

Location: app/celery/sync_orders.py:process_orders

Flow:

  1. Fetch all enabled organizations
  2. For each organization with sync_orders = True:
     • Get mail integration (Klaviyo, Ometria, or Shopify)
     • Fetch orders since last checkpoint
     • Process each order through attribution pipeline
     • Update checkpoint to latest order datetime

4.2 Checkpoint Logic

Location: app/celery/sync_orders.py:get_organization_sync_checkpoint

def get_organization_sync_checkpoint(organization: Organization) -> Optional[datetime]:
    # 1. Try to use last_order_sync from organization config
    last_sync_checkpoint = organization.configuration.last_order_sync

    # 2. If no checkpoint, use timestamp of most recent order in DB
    last_order = get_last_order(organization=organization.id)
    if last_order is not None and not last_sync_checkpoint:
        last_sync_checkpoint = last_order.datetime

    return last_sync_checkpoint

Checkpoint Update: app/celery/sync_orders.py:sync_organization_page_order (line 613)

# After processing all orders, update checkpoint to latest order datetime
events = sorted(events, key=lambda x: x.datetime, reverse=True)
next_checkpoint = events[0].datetime
update_organization_order_checkpoint(organization, next_checkpoint)

4.3 Integration Providers

Klaviyo

  • Uses UNIX timestamp for checkpoint
  • Fetches "Placed Order" metric events via find_events_for_metric()
  • Converts events to OrderEvent objects

Ometria

  • Uses ISO datetime for checkpoint
  • Fetches orders via list_orders(since=checkpoint, limit=150)
  • Converts to OrderEvent objects

Shopify

  • Uses ISO datetime for checkpoint
  • Fetches order events via GraphQL: list_order_events_since(checkpoint, per_page=100, limit=1000)
  • Converts to OrderEvent objects
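
Since the providers expect the checkpoint in different formats, the conversion might look like the sketch below. The function format_checkpoint, the lowercase provider strings, and the decision to treat naive datetimes as UTC are all assumptions for illustration; the actual integration clients may handle this differently.

```python
from datetime import datetime, timezone


def format_checkpoint(provider, checkpoint):
    """Convert a checkpoint datetime into the format each provider is described as using."""
    if checkpoint.tzinfo is None:
        # Assumption: naive checkpoints are stored in UTC.
        checkpoint = checkpoint.replace(tzinfo=timezone.utc)

    if provider == "klaviyo":
        return int(checkpoint.timestamp())  # UNIX timestamp (seconds)
    if provider in ("ometria", "shopify"):
        return checkpoint.isoformat()       # ISO 8601 datetime string
    raise ValueError(f"Unknown provider: {provider}")


checkpoint = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(format_checkpoint("klaviyo", checkpoint))
print(format_checkpoint("shopify", checkpoint))
```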

4.4 Attribution Caching

Location: app/celery/sync_orders.py (multiple functions)

The system uses Redis caching to pass attribution data between order processing and attribution creation:

# Cache key format
redis_key = f"attribution_helper_order_provider_id_{order.provider_id}"

# Cached for 600 seconds (10 minutes)
RedisClient.set_dict(redis_key, {
    "email": order.email,
    "datetime": order.datetime,
    "recipient_id": recipient.id,
    "method": att.AttributionMethod.Email,  # or other method
    "holdout": False,
    "discount_code": code,  # if discount code match
    "discount_match": True  # if discount matched
}, 600)

Read: app/methods/attributions.py:insert_unprocessed_attribution (line 56)

This cache is read when creating the initial attribution record and then deleted.


5. Test vs Control Group Differences

5.1 Recipient Status Tracking

Experimental Group (recipients who receive mail)

  • Status Progression:
    Pending → ProofingStaged → Proofing → Proofed →
    SendingStaged → Sending → Sent
    
  • sent_at: Populated when status changes to Sent
  • Attribution Start: Uses sent_at as window start

Control Group (holdout recipients)

  • Status: Holdout (never progresses to Sent)
  • sent_at: Remains None
  • Attribution Start: Uses created_at as window start

5.2 Attribution Process Differences

| Aspect                   | Experimental Group                                                          | Control Group                                                               |
|--------------------------|-----------------------------------------------------------------------------|-----------------------------------------------------------------------------|
| Attribution Window Start | max(campaign.first_send_date + min_days, recipient.created_at + min_days)  | max(campaign.first_send_date + min_days, recipient.created_at + min_days)  |
| Recipient Date Used      | recipient.created_at (for min_days calculation)                             | recipient.created_at (for min_days calculation)                             |
| Checked When             | Primary attribution flow                                                    | Fallback if no experimental match                                           |
| Methods Supported        | Email, PhysicalAddress, DiscountCode                                        | Email, PhysicalAddress                                                      |
| Order in Process         | Checked first                                                               | Checked last (only if attributed_campaign <= 0)                             |
| Database Flag            | attribution.holdout = False                                                 | attribution.holdout = True                                                  |
| Min Days Value           | 1 for holdout campaigns, 3 for standard campaigns                           | Same as experimental (1 for holdout campaigns, 3 for standard)              |
| Processed Status         | Set during attribution processing job                                       | Set immediately during sync                                                 |

Key Insight: Both experimental and control groups now use the same min_days logic for recipient timing. This creates fair A/B testing by ensuring both groups have the same minimum wait period from recipient creation before orders can be attributed.

5.3 Holdout Campaign Settings

Location: app/core/models/campaigns.py:CampaignHoldout

Campaigns can enable holdouts via settings:

campaign.settings.holdout = CampaignHoldout(
    enabled=True,
    holdout_percentage=10.0,  # 10% of recipients
    holdout_absolute_maximum=500  # Cap at 500 recipients
)

Important: When campaign.settings.holdout.enabled = True:

  • ALL recipients (including experimental) use created_at for attribution window
  • Min days = 1 (instead of 3)
  • Max days = 60 (instead of 63)

This is different from having some recipients with status = Holdout. The holdout.enabled flag changes timing for everyone.
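
As a rough illustration of how the two holdout settings could interact, the sketch below sizes a holdout group from a percentage and then applies the absolute cap. The function holdout_size is hypothetical, and rounding down is an assumption; the source only states that the percentage is capped at the absolute maximum.

```python
def holdout_size(total_recipients, holdout_percentage, holdout_absolute_maximum):
    """Number of recipients to hold out: percentage of the audience, capped at the maximum."""
    by_percentage = int(total_recipients * holdout_percentage / 100)  # assumption: round down
    return min(by_percentage, holdout_absolute_maximum)


print(holdout_size(2_000, 10.0, 500))   # percentage applies
print(holdout_size(10_000, 10.0, 500))  # cap applies
```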

5.4 Flow Diagram

┌─────────────────────────────────────────────────────────────┐
│ Order Received from Integration Provider                    │
└────────────────────┬────────────────────────────────────────┘
                     ├─────────────────────────────────────────┐
                     │                                         │
         ┌───────────▼──────────┐               ┌─────────────▼────────────┐
         │ Check Experimental    │               │ Check Holdout Match     │
         │ Attribution Methods   │               │ (if no experimental)    │
         └───────────┬──────────┘               └─────────────┬────────────┘
                     │                                         │
         ┌───────────▼──────────┐               ┌─────────────▼────────────┐
         │ 1. Email Match        │               │ Email/Address Match      │
         │    (sent_at required) │               │ Status = Holdout         │
         ├───────────────────────┤               ├──────────────────────────┤
         │ 2. Address Match      │               │ Window Start:            │
         │    (sent_at required) │               │   created_at             │
         ├───────────────────────┤               │                          │
         │ 3. Discount Code      │               │ holdout = True           │
         │    (no sent_at req'd) │               │                          │
         └───────────┬──────────┘               └─────────────┬────────────┘
                     │                                         │
         ┌───────────▼──────────┐               ┌─────────────▼────────────┐
         │ Window Start:         │               │ Processed immediately    │
         │   sent_at             │               │ during sync              │
         │                       │               │                          │
         │ holdout = False       │               │                          │
         │                       │               │                          │
         └───────────┬──────────┘               └─────────────┬────────────┘
                     │                                         │
         ┌───────────▼──────────┐                             │
         │ Processed by          │                             │
         │ background job        │                             │
         └───────────┬──────────┘                             │
                     │                                         │
                     └─────────────┬───────────────────────────┘
                     ┌─────────────▼────────────┐
                     │ Attribution Record Saved │
                     │ in Database              │
                     └──────────────────────────┘

6. Multi-Order Attribution & Order Count

6.1 Subsequent Orders

Location: app/methods/attributions.py:process_attribution (lines 173-189)

The system tracks multiple orders from the same recipient:

# Fetch all attributions for this recipient-campaign pair
all_attributions = att.list_attributions_for_recipient_and_campaign_id(
    recipient_id=attribution.recipient_id,
    campaign_id=attribution.campaign_id
)

# Exclude current attribution from the list
all_attributions = [other for other in all_attributions
                   if other.attribution_id != attribution.attribution_id]

# Find previous attributions (orders before current)
previous_attributions = [other for other in all_attributions
                        if other.created < attribution.created]

# Store order count (1 for first order, 2 for second, etc.)
attribution.json["order_count"] = len(previous_attributions) + 1

6.2 Fallback Attribution Logic

Location: app/methods/attributions.py:process_attribution (lines 182-189)

Important: If current order is outside attribution window BUT a previous order from the same recipient passed rules, the current order can still pass:

# Even if current order is outside the attribution window
if len(all_attributions) > 0 and not attribution.passes_rules:
    first_attribution = all_attributions[0]
    if first_attribution.passes_rules:
        # Current order ALSO passes rules!
        attribution.passes_rules = True

Example:

Recipient sent mail: 2024-01-01
First order: 2024-01-10 (9 days later) → Within window, passes_rules = True
Second order: 2024-03-15 (74 days later) → Outside 63-day window
    BUT first order passed, so second order also passes_rules = True

This allows attributing repeat purchases even if they occur after the attribution window expires, as long as the customer's first purchase was within the window.
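
The order-count and fallback rules from 6.1 and 6.2 can be exercised together in a small simulation. The Attr dataclass and apply_multi_order_rules are stand-ins for the real Attribution model and process_attribution. Sorting the other attributions by created so that index 0 is the earliest order is an assumption, since the source does not show how the query result is ordered.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Attr:
    """Hypothetical stand-in for the Attribution model."""
    attribution_id: int
    created: datetime
    passes_rules: bool
    order_count: int = 0


def apply_multi_order_rules(current, all_attributions):
    # Exclude the current attribution and order the rest by order datetime.
    others = sorted(
        (a for a in all_attributions if a.attribution_id != current.attribution_id),
        key=lambda a: a.created,
    )
    # order_count: 1 for the first order, 2 for the second, and so on.
    previous = [a for a in others if a.created < current.created]
    current.order_count = len(previous) + 1
    # Fallback: a failing order inherits a pass from the recipient's first attribution.
    if others and not current.passes_rules and others[0].passes_rules:
        current.passes_rules = True
    return current


first = Attr(1, datetime(2024, 1, 10), passes_rules=True)
second = Attr(2, datetime(2024, 3, 15), passes_rules=False)  # outside the 63-day window
result = apply_multi_order_rules(second, [first, second])
print(result.order_count, result.passes_rules)
```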

6.3 Order Count Usage

The order_count field is used in:

  • Reporting and analytics
  • Identifying first-time buyers vs repeat customers
  • Calculating customer lifetime value (LTV)
  • Holdout analysis for first purchase vs all purchases (ExperimentMetric.FirstOrder vs ExperimentMetric.AllOrders)


7. Additional Business Rules

7.1 Zero-Value Orders

Location: app/methods/attributions.py:process_attribution (lines 191-193)

# Always fail attribution for $0 orders
if attribution.json.get("order_value") == 0:
    attribution.passes_rules = False

This prevents test orders, cancelled orders, or data errors from being counted as conversions.

7.2 Archived Campaigns

Location: app/methods/attributions.py:process_attribution (lines 196-201)

if campaign.status == CampaignStatus.Archived:
    attribution.archived = True

Archived attributions are:

  • Still processed and stored in database
  • Excluded from reporting queries
  • Used for historical analysis

7.3 Blacklist Filtering

Location: app/celery/sync_orders.py:check_for_blacklist (line 320)

Organization-specific blacklist logic at sync time:

# Example: Organization 10 excludes orders with "wholesale" discount codes
if order.attributed_campaign > 0 and order.organization == 10:
    discount_codes = order.json.get("discount_codes", [])
    if len(discount_codes) > 0:
        for code in discount_codes:
            code = code.lower().strip()
            if "wholesale" in code:
                order.attributed_campaign = -1  # Clear attribution
                break

This is organization-specific logic that prevents certain order types from being attributed.

7.5 Internal Orders

Location: app/celery/sync_orders.py:is_order_made_internally (line 480)

Orders from company email domains are excluded:

def is_order_made_internally(order: OrderEvent, organization: Organization) -> bool:
    organization_email_domain = organization.get_email_domain()
    valid_org_domain = organization_email_domain is not None and len(organization_email_domain) > 3

    if order.email is not None and valid_org_domain and order.email.endswith(organization_email_domain):
        return True
    return False

Example: If organization email is support@acmeco.com, orders from john@acmeco.com are excluded.

7.6 Active Subscription Check (Physical Address Only)

Location: app/celery/sync_orders.py:check_for_physical_address_match (lines 286-299)

For physical address matches only, check if customer had active subscription before mail was sent:

active_subscription_start = datetime.fromisoformat(
    order.json.get("included_profile", {}).get("active_subscription_start_date")
)

if active_subscription_start < recipient.sent_at:
    logger.info(f"active subscription found for recipient. Should ignore order {order.id}")
    return order  # No attribution

This prevents attributing orders from existing subscribers who were sent winback campaigns.


8. Experiment Analysis

8.1 Statistical Significance Testing

Location: app/methods/holdouts.py:ab_test_statistical_significance (line 205)

Uses Z-test for proportions to determine if experimental group outperformed control group:

def ab_test_statistical_significance(control, experiment, alpha=0.05):
    # Pooled conversion rate
    pooled_prob = (conv_rate_a * n_a + conv_rate_b * n_b) / (n_a + n_b)

    # Standard error
    se = sqrt(pooled_prob * (1 - pooled_prob) * ((1/n_a) + (1/n_b)))

    # Z-score
    z_score = (conv_rate_b - conv_rate_a) / se

    # P-value (one-tailed)
    p_value = 1.0 - norm.cdf(z_score)

    # Win probability
    win_prob = norm.cdf(z_score)

    # Statistical significance (default alpha = 0.05)
    stat_significance = p_value < alpha

    return {
        "winner": "experiment" if win_prob > 0.5 else "control",
        "win_probability": win_prob,
        "statsig": stat_significance
    }
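
The excerpt above omits how the conversion rates and sample sizes are obtained, so here is a self-contained sketch of the same one-tailed pooled z-test using only the standard library. The function name, its count-based arguments, and the zero-variance guard are assumptions; the real function takes control and experiment objects whose shape is not shown here.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportions(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """One-tailed z-test for two proportions (a = control, b = experiment)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled conversion rate and standard error under the null hypothesis
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Assumption: with no variance (all or no conversions), report no detectable difference.
        return {"winner": "control", "win_probability": 0.5, "statsig": False}

    z_score = (rate_b - rate_a) / se
    win_prob = NormalDist().cdf(z_score)  # probability mass below the observed z-score
    p_value = 1.0 - win_prob              # one-tailed p-value

    return {
        "winner": "experiment" if win_prob > 0.5 else "control",
        "win_probability": win_prob,
        "statsig": p_value < alpha,
    }


# 2% vs 3% conversion: the same rates are inconclusive at n=1,000 but significant at n=10,000.
small = z_test_proportions(20, 1_000, 30, 1_000)
large = z_test_proportions(200, 10_000, 300, 10_000)
print(small["winner"], small["statsig"])
print(large["winner"], large["statsig"])
```

The comparison between the two sample sizes shows why a holdout group that is too small can leave a real uplift statistically undetectable.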

8.2 Incremental Metrics

Location: app/methods/holdouts.py:calculate_holdout_analysis (line 130)

1. Conversion Uplift

metric_difference = experiment_conversion_rate - control_conversion_rate
metric_uplift = metric_difference / control_conversion_rate

Example: If control = 2% and experiment = 3%, uplift = 50%

2. Incremental Revenue

revenue_per_recipient_exp = experiment_revenue / experiment_recipients
revenue_per_recipient_ctrl = control_revenue / control_recipients
incremental_revenue = (revenue_per_recipient_exp - revenue_per_recipient_ctrl) * experiment_recipients

This calculates total additional revenue generated by the campaign.

3. Incremental ROAS (Return on Ad Spend)

incremental_roas = incremental_revenue / total_campaign_cost

Example: If incremental_revenue = $10,000 and cost = $2,000, ROAS = 5.0 (5x return)

4. Customer Acquisition Cost (CAC)

test_orders_per_send = experiment_orders / experiment_recipients
holdout_orders_per_send = control_orders / control_recipients
incremental_customers_per_send = test_orders_per_send - holdout_orders_per_send
incremental_customers = incremental_customers_per_send * experiment_recipients

cac = total_cost / incremental_customers

This calculates the cost to acquire one incremental customer through the campaign.
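
Putting the four formulas together, the following sketch reproduces the worked numbers used in this section (2% vs 3% conversion, $10,000 incremental revenue, $2,000 cost). The function incremental_metrics and its flat argument list are illustrative assumptions, and orders per recipient is used as the conversion rate.

```python
def incremental_metrics(exp_recipients, exp_orders, exp_revenue,
                        ctrl_recipients, ctrl_orders, ctrl_revenue, total_cost):
    """Compute uplift, incremental revenue, incremental ROAS, and CAC from group totals."""
    exp_rate = exp_orders / exp_recipients
    ctrl_rate = ctrl_orders / ctrl_recipients

    uplift = (exp_rate - ctrl_rate) / ctrl_rate
    incremental_revenue = (
        exp_revenue / exp_recipients - ctrl_revenue / ctrl_recipients
    ) * exp_recipients
    incremental_customers = (exp_rate - ctrl_rate) * exp_recipients

    return {
        "uplift": uplift,
        "incremental_revenue": incremental_revenue,
        "incremental_roas": incremental_revenue / total_cost,
        "cac": total_cost / incremental_customers,
    }


metrics = incremental_metrics(
    exp_recipients=10_000, exp_orders=300, exp_revenue=30_000.0,
    ctrl_recipients=1_000, ctrl_orders=20, ctrl_revenue=2_000.0,
    total_cost=2_000.0,
)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Note that the sketch does not guard against a control conversion rate of zero or a non-positive number of incremental customers; a production version would need to handle both.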

8.3 Experiment Metrics Modes

Location: app/methods/holdouts.py:ExperimentMetric (line 33)

Two analysis modes:

AllOrders Mode

  • Counts all attributed orders (first + repeat purchases)
  • Uses total revenue from all orders
  • Better for understanding total campaign impact

FirstOrder Mode

  • Counts only first purchase per customer (order_count == 1)
  • Uses revenue only from first orders
  • Better for understanding customer acquisition effectiveness

Example:

# Generate holdout stats with first order focus
holdout_stats = generate_holdout_stats(
    holdout_recipient_count=500,
    holdout_attributions=all_holdout_attributions,
    metric=ExperimentMetric.FirstOrder  # Only count first purchases
)


9. Edge Cases & Important Behaviors

9.1 Recipient Created but Not Sent

Scenario: Recipient exists with created_at but sent_at = None

Outcome:

  • Discount code attributions: ✅ Pass rules
  • All other methods: ❌ Fail rules
  • Attribution record is still created and processed
  • processed = True, but passes_rules depends on method

Why: Discount codes can work before sending (pre-campaign purchases), but email/address matching requires the mail to have influenced behavior.

Test Reference: app/tests/methods/test_attributions.py:TestProcessAttributionUnsentRecipient

9.2 Order Before Recipient Creation

Experimental Group

Not possible in normal flow because:

  • Attribution window starts at sent_at
  • sent_at must be after created_at
  • Orders before created_at would have negative initial_days

Control Group (Holdout)

Explicitly checked and rejected:

order_newer_than_recipient = order_datetime > holdout_recipient.created_at
if not order_newer_than_recipient:
    return False  # No attribution

Location: app/celery/sync_orders.py:check_for_holdout_order (line 110)

9.3 Multiple Recipients with Same Email

Scenario: Two recipients in different campaigns have the same email, both status = Sent

Outcome: Order is attributed to most recently created recipient

Location: app/celery/sync_orders.py:check_for_matched_emails (line 223)

# Sort recipients by created_at, most recent first
recipients = sorted(recipients, key=lambda x: x.created_at, reverse=True)

for recipient in recipients:
    if matched is False:
        matched = True
        order.attributed_campaign = recipient.campaign_id
    else:
        # Log duplicate attribution
        logger.warning(
            "Order duplicate attributed",
            ...
        )

Additional recipients trigger an "Order duplicate attributed" log event but don't create separate attributions.

9.4 Both Email and Address Match

Scenario: Order matches both email and physical address for same recipient

Outcome: Email match wins (checked first)

Why: Email matching runs first, and address matching only executes if order.attributed_campaign < 0:

# Check email first
if order.email is not None and any_address_match:
    order = check_for_matched_emails(...)

# Only check address if no attribution yet
if order.attributed_campaign < 0 and any_address_match:
    order = check_for_physical_address_match(...)

Location: app/celery/sync_orders.py:attribute_order (lines 363-369)

9.5 Active Subscription Before Send (Address Match Only)

Scenario: Customer had active subscription starting before sent_at, then places order

Outcome: No attribution created (for physical address match only)

Why: Prevents attributing orders from existing subscribers who received winback campaigns

Location: app/celery/sync_orders.py:check_for_physical_address_match (lines 286-299)

active_subscription_start = datetime.fromisoformat(
    order.json.get("included_profile", {}).get("active_subscription_start_date")
)

if active_subscription_start < recipient.sent_at:
    return order  # Skip attribution

Note: This check only applies to physical address matches, not email matches.

9.6 Campaign Not in Post-Launch State (Holdout Only)

Scenario: Holdout recipient created, order comes in, but campaign status = Pending

Outcome: No attribution created

Location: app/celery/sync_orders.py:check_for_holdout_order (lines 119-121)

if campaign and not campaign.should_process_stats():
    return False

# should_process_stats() returns True for: Active, Completed, Paused
# Returns False for: Pending, Archived, Error

This prevents holdout attributions before campaign launches.

9.7 BFCM Campaign Without end_date

Scenario: Campaign has flow_trigger = BFCM but no end_date set

Outcome: Validation error when creating/updating campaign

Location: app/core/models/campaigns.py:validate_bfcm_end_date (line 276)

if self.settings.flow_trigger == FlowTrigger.Bfcm:
    if not self.end_date:
        raise BfcmEndDateValidationError("BFCM campaigns require an end_date")

    # month and day of end_date (assumed here to be read directly from the date)
    month, day = self.end_date.month, self.end_date.day

    # Validate end_date is between Nov 13 - Dec 31
    if not ((month == 11 and day >= 13) or (month == 12)):
        raise BfcmEndDateValidationError(
            f"BFCM end_date must be between November 13 - December 31, got {self.end_date}"
        )
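
The date-range rule can be checked in isolation with a small predicate. This is a sketch with a hypothetical name, is_valid_bfcm_end_date, and it returns a boolean instead of raising BfcmEndDateValidationError as the model validator does.

```python
from datetime import date


def is_valid_bfcm_end_date(end_date):
    """True when end_date is set and falls between November 13 and December 31."""
    if end_date is None:
        return False
    return (end_date.month == 11 and end_date.day >= 13) or end_date.month == 12


for candidate in (None, date(2024, 11, 12), date(2024, 11, 13), date(2024, 12, 31), date(2025, 1, 1)):
    print(candidate, is_valid_bfcm_end_date(candidate))
```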

10. Code References

10.1 Primary Files

| File                                   | Purpose                                | Key Functions                                               |
|----------------------------------------|----------------------------------------|-------------------------------------------------------------|
| app/celery/sync_orders.py              | Order syncing and initial attribution  | process_orders, attribute_order, check_for_holdout_order    |
| app/methods/attributions.py            | Attribution processing and validation  | process_attribution, insert_unprocessed_attribution         |
| app/methods/holdouts.py                | Holdout analysis and statistics        | calculate_holdout_analysis, ab_test_statistical_significance |
| app/core/models/attributions.py        | Attribution data model and queries     | Attribution, upsert_attribution_by_provider_id              |
| app/core/models/campaign_recipients.py | Recipient data model and queries       | CampaignRecipient, get_campaign_recipient                   |
| app/core/models/campaigns.py           | Campaign data model and settings       | Campaign, get_attribution_window_max/min                    |

10.2 Key Line References

Experimental Group Attribution Window

  • Start date calculation: app/methods/attributions.py:151
  • Holdout campaign override: app/methods/attributions.py:154-155
  • Window validation: app/methods/attributions.py:159-162
  • BFCM cutoff: app/methods/attributions.py:164-171

Control Group Attribution Window

  • Holdout check: app/celery/sync_orders.py:78-207
  • Date calculation: app/celery/sync_orders.py:109
  • Window validation: app/celery/sync_orders.py:124-125

Unsent Recipient Handling

  • Early exit: app/methods/attributions.py:140-148
  • Discount code exception: app/methods/attributions.py:144

Attribution Methods

  • Email match: app/celery/sync_orders.py:210-258
  • Address match: app/celery/sync_orders.py:261-317
  • Discount code: app/celery/sync_orders.py:372-423
  • Holdout check: app/celery/sync_orders.py:430-431

Multi-Order Tracking

  • Previous attributions: app/methods/attributions.py:173-179
  • Fallback logic: app/methods/attributions.py:182-189

10.3 Test References

| Test File                              | Coverage                                |
|----------------------------------------|-----------------------------------------|
| app/tests/methods/test_attributions.py | Unsent recipients, attribution timing   |
| app/tests/methods/test_holdouts.py     | Holdout analysis, statistics            |
| app/tests/celery/test_sync_orders.py   | Order syncing, attribution methods      |
| app/tests/models/test_campaigns.py     | Campaign settings, BFCM validation      |

Appendix: Flow Diagram

┌─────────────────────────────────────────────────────────────┐
│ Background Job: process_orders() - Every 5 minutes          │
└────────────────────┬────────────────────────────────────────┘
                     ├─ For each enabled organization
         ┌───────────▼──────────┐
         │ Get checkpoint date   │
         │ (last synced order)   │
         └───────────┬──────────┘
         ┌───────────▼──────────┐
         │ Fetch orders from     │
         │ Klaviyo/Ometria/      │
         │ Shopify since         │
         │ checkpoint            │
         └───────────┬──────────┘
         ┌───────────▼──────────┐
         │ For each order:       │
         │                       │
         │ 1. Skip if internal   │
         │ 2. attribute_order()  │
         │ 3. check_blacklist()  │
         │ 4. Save order if      │
         │    attributed         │
         └───────────┬──────────┘
         ┌───────────▼──────────────────────────────────┐
         │ attribute_order() - Cascading Match Logic     │
         ├───────────────────────────────────────────────┤
         │                                               │
         │ ┌─────────────────────────────────────────┐  │
         │ │ 1. Email Match (Status=Sent)            │  │
         │ │    - Match email                        │  │
         │ │    - Require sent_at                    │  │
         │ │    - Use most recent recipient          │  │
         │ │    - Cache in Redis                     │  │
         │ └─────────────────────────────────────────┘  │
         │                     │ If no match             │
         │ ┌─────────────────▼─────────────────────┐  │
         │ │ 2. Physical Address Match (Status=Sent) │  │
         │ │    - Match address1+address2+zip        │  │
         │ │    - Require sent_at                    │  │
         │ │    - Skip if subscription before send   │  │
         │ │    - Cache in Redis                     │  │
         │ └─────────────────────────────────────────┘  │
         │                     │ If no match             │
         │ ┌─────────────────▼─────────────────────┐  │
         │ │ 3. Discount Code Match                  │  │
         │ │    - Match campaign-level codes         │  │
         │ │    - Match unique recipient codes       │  │
         │ │    - Works without sent_at!             │  │
         │ │    - Cache in Redis                     │  │
         │ └─────────────────────────────────────────┘  │
         │                     │ If no match             │
         │ ┌─────────────────▼─────────────────────┐  │
         │ │ 4. Holdout Match (Status=Holdout)       │  │
         │ │    - Match email or address             │  │
         │ │    - Use created_at (not sent_at!)      │  │
         │ │    - Check campaign post-launch         │  │
         │ │    - Create with holdout=True           │  │
         │ │    - Process & save immediately         │  │
         │ └─────────────────────────────────────────┘  │
         │                                               │
         └───────────────────┬───────────────────────────┘
         ┌───────────────────▼────────────────────────┐
         │ If attributed_campaign > 0:                 │
         │ - Save order to database                    │
         │ - insert_unprocessed_attribution()          │
         │   (reads Redis cache, creates attribution)  │
         │ - Log OrderAttributed event                 │
         └───────────────────┬────────────────────────┘
         ┌───────────────────▼────────────────────────┐
         │ Update checkpoint to latest order datetime  │
         └─────────────────────────────────────────────┘
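
The cascade in the diagram above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in app/celery/sync_orders.py: the `Recipient` and `Order` dataclasses, the `address_key` field, the "Queued" status used in the usage note, and the function name `attribute_order_sketch` are all hypothetical. It also leaves out Redis caching, campaign-level discount codes, the subscription-before-send rule, and the campaign post-launch check, and it assumes the "most recent recipient" tie-break applies to address matches as well as email matches.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Recipient:
    # Hypothetical, simplified stand-in for a campaign recipient row.
    campaign_id: int
    email: str
    address_key: str  # assumed: normalized address1 + address2 + zip
    status: str       # "Sent", "Holdout", or another (unsent) status
    sent_at: Optional[datetime] = None
    discount_code: Optional[str] = None


@dataclass
class Order:
    # Hypothetical, simplified stand-in for a synced order.
    email: str
    address_key: str
    discount_codes: tuple = ()


def attribute_order_sketch(order, recipients):
    """Return (method, recipient) for the first method that matches, else None."""
    # Email and address matching only consider recipients that were sent.
    sent = [r for r in recipients if r.status == "Sent" and r.sent_at]

    # 1. Email match: case-insensitive, most recently sent recipient wins.
    by_email = [r for r in sent if r.email.lower() == order.email.lower()]
    if by_email:
        return "Email", max(by_email, key=lambda r: r.sent_at)

    # 2. Physical address match (tie-break by sent_at is an assumption).
    by_address = [r for r in sent if r.address_key == order.address_key]
    if by_address:
        return "Address", max(by_address, key=lambda r: r.sent_at)

    # 3. Discount code match: does not require sent_at.
    for r in recipients:
        if r.status != "Holdout" and r.discount_code in order.discount_codes:
            return "DiscountCode", r

    # 4. Holdout match: checked last, by email or address.
    for r in recipients:
        if r.status == "Holdout" and (
            r.email.lower() == order.email.lower()
            or r.address_key == order.address_key
        ):
            return "Holdout", r

    return None
```

With one sent recipient, one unsent recipient holding a unique code, and one holdout, an order that matches only the code resolves to "DiscountCode", and an order that matches only the holdout's email resolves to "Holdout". An order that matches nothing returns None, which corresponds to the order not being saved.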

┌─────────────────────────────────────────────────────────────┐
│ Background Job: process_attributions() - Separate worker    │
└────────────────────┬────────────────────────────────────────┘
         ┌───────────▼──────────┐
         │ Get all unprocessed   │
         │ attributions          │
         └───────────┬──────────┘
         ┌───────────▼──────────────────────────────────┐
         │ For each attribution:                         │
         │ process_attribution()                         │
         ├───────────────────────────────────────────────┤
         │                                               │
         │ 1. Load recipient, campaign, organization     │
         │                                               │
         │ 2. Check if sent_at exists                    │
         │    ├─ If None and method != DiscountCode:     │
         │    │  → Set passes_rules = False, processed = │
         │    │    True, return                           │
         │    └─ If None and method == DiscountCode:     │
         │       → Set passes_rules = True, processed =  │
         │         True, return                           │
         │                                               │
         │ 3. Calculate attribution window start         │
         │    ├─ Default: start_date = sent_at           │
         │    └─ If holdout.enabled: start_date =        │
         │       created_at                               │
         │                                               │
         │ 4. Calculate days_since_dispatch              │
         │    initial_days = (order_date - start_date)   │
         │                   .days                        │
         │                                               │
         │ 5. Check if within window                     │
         │    if min_days <= initial_days < max_days:    │
         │       passes_rules = True                     │
         │    else:                                      │
         │       passes_rules = False                    │
         │                                               │
         │ 6. Apply BFCM absolute cutoff (if applicable) │
         │    if BFCM and order_date > cutoff_date:      │
         │       passes_rules = False                    │
         │                                               │
         │ 7. Check previous attributions                │
         │    if first_attribution.passes_rules:         │
         │       current.passes_rules = True             │
         │                                               │
         │ 8. Exclude $0 orders                          │
         │    if order_value == 0:                       │
         │       passes_rules = False                    │
         │                                               │
         │ 9. Mark archived campaigns                    │
         │    if campaign.status == Archived:            │
         │       attribution.archived = True             │
         │                                               │
         │ 10. Set processed = True, save                │
         │                                               │
         └───────────────────────────────────────────────┘
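
Steps 2 through 8 of the diagram above reduce to a short sequence of checks. The following is a minimal sketch that mirrors the diagram, not the contents of app/methods/attributions.py: the `AttributionInput` dataclass, its field names, and `passes_rules_sketch` are hypothetical, and the recipient, campaign, and organization loaded in step 1 are flattened into a single record for brevity.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class AttributionInput:
    # Hypothetical flattened view of the records loaded in step 1.
    method: str                      # "Email", "Address", "DiscountCode", ...
    order_date: datetime
    order_value: float
    sent_at: Optional[datetime]
    created_at: datetime
    holdout_enabled: bool
    min_days: int
    max_days: int
    bfcm_cutoff: Optional[datetime] = None   # set only for BFCM campaigns
    first_attribution_passed: bool = False   # result of an earlier attribution


def passes_rules_sketch(a):
    """Return the passes_rules value the diagram would assign."""
    # Step 2: early exit when sent_at is missing; only discount codes pass.
    if a.sent_at is None:
        return a.method == "DiscountCode"

    # Step 3: holdout-enabled campaigns start the window at created_at.
    start_date = a.created_at if a.holdout_enabled else a.sent_at

    # Steps 4-5: whole elapsed days, checked against the half-open
    # window [min_days, max_days).
    initial_days = (a.order_date - start_date).days
    passes = a.min_days <= initial_days < a.max_days

    # Step 6: BFCM absolute cutoff.
    if a.bfcm_cutoff is not None and a.order_date > a.bfcm_cutoff:
        passes = False

    # Step 7: a passing first attribution carries over to later orders.
    if a.first_attribution_passed:
        passes = True

    # Step 8: $0 orders never pass.
    if a.order_value == 0:
        passes = False

    return passes
```

Two details are easy to miss. `timedelta.days` counts whole 24-hour periods, so an order placed 2 days and 12 hours after `start_date` has `initial_days == 2`. The upper bound is exclusive, so with `max_days = 63` an order on day 63 fails while an order on day 62 passes.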

Summary

The PaperRun attribution system combines multiple attribution methods tried in cascading priority, separate timing rules for experimental and control groups, and explicit handling of edge cases such as unsent recipients, $0 orders, and BFCM cutoffs.

Critical Insights for Test vs Control Groups

  1. Dual Attribution Window Constraints:
    • Both experimental and control groups validate against TWO timing constraints:
      • Campaign constraint: campaign.first_send_date + min_days
      • Recipient constraint: recipient.created_at + min_days
    • Order must occur on or after BOTH thresholds (the later of the two dates)

  2. Same min_days Logic for Both Groups:
    • Experimental recipients: recipient.created_at + min_days
    • Control recipients: recipient.created_at + min_days
    • min_days = 1 for holdout-enabled campaigns, 3 for standard campaigns
    • This ensures fair comparison by applying the same recipient timing rules to both groups

  3. Different Processing Paths:
    • Experimental: Checked first via email/address/discount methods
    • Control: Checked last as fallback, immediately processed

  4. Discount Code Exception:
    • ONLY method that works for unsent recipients
    • Can attribute orders before mailers are sent

  5. Holdout Campaign Setting:
    • When campaign.settings.holdout.enabled = True
    • min_days = 1 (instead of 3) for ALL recipients
    • max_days = 60 (instead of 63)
    • Changes windows for everyone, ensuring fair A/B testing

  6. Fair Comparison:
    • Both groups use the same min_days from recipient creation
    • Both groups use the same min_days from campaign first send
    • Attribution windows align to ensure valid statistical comparison
    • Differences in attribution rates can be confidently attributed to the campaign treatment
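
The dual constraint and the holdout window settings listed above can be worked through with a small example. This is a sketch under stated assumptions: `window_days` and `earliest_eligible` are hypothetical helper names, and the real system reads these values from campaign settings rather than hard-coding them.

```python
from datetime import datetime, timedelta


def window_days(holdout_enabled):
    """Return (min_days, max_days) as listed in this summary."""
    return (1, 60) if holdout_enabled else (3, 63)


def earliest_eligible(first_send_date, recipient_created_at, holdout_enabled):
    """Earliest order date that satisfies BOTH timing constraints."""
    min_days, _ = window_days(holdout_enabled)
    offset = timedelta(days=min_days)
    campaign_threshold = first_send_date + offset
    recipient_threshold = recipient_created_at + offset
    # The order must fall on or after the later of the two thresholds.
    return max(campaign_threshold, recipient_threshold)
```

For a campaign first sent on 1 January with a recipient created on 5 January, the recipient constraint is the binding one: the earliest eligible order date is 8 January for a standard campaign (min_days = 3) and 6 January for a holdout-enabled campaign (min_days = 1).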

This design supports accurate A/B testing because experimental and control groups share the same timing logic. The one difference is the date that anchors the sent-based logic: experimental recipients use sent_at, while holdouts, which are never mailed, use created_at. Both groups use created_at for the recipient timing constraint.