# Source Patterns

Use this reference to decide which system is authoritative for a claim and how to work across common product-data stacks.

## Source Categories

- Product database or warehouse
  - Best for users, accounts, workspaces, teams, projects, content items, orders, bookings, and subscription state.
- Analytics events
  - Best for paths, screens, clicks, sessions, event timing, step completion, and client-side drop-off.
- Billing or revenue system
  - Best for trial starts, upgrades, renewals, cancellations, refunds, and plan mix.
- Release or experiment history
  - Best for pre/post rollout cohorts, exposed vs unexposed users, and change attribution.
- Support feedback or surveys
  - Best for turning complaints or qualitative claims into hypotheses to test behaviorally.
- Logs
  - Best for backfilling missing instrumentation or verifying whether an event path exists at all.

## Authority Rules

- Name the source that is authoritative for each claim.
- Do not average disagreeing systems. Quantify the gap.
- When the event stream and the entity records disagree, say which one answers the question better.
- Keep an `uncertain` bucket when identity stitching, bot filtering, or source coverage is incomplete.
- Always report excluded traffic volume when internal, test, bot, or automation filters materially change the denominator.
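
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: `SourceCount`, `quantify_gap`, and `denominator_report` are hypothetical names, and treating the `uncertain` bucket as outside the confirmed denominator is an assumption you may want to change.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SourceCount:
    source: str  # hypothetical label, e.g. "product_db" or "analytics_events"
    count: int


def quantify_gap(authoritative: SourceCount, other: SourceCount) -> dict:
    """Report the disagreement between two systems instead of averaging them."""
    diff = other.count - authoritative.count
    # Relative gap is measured against the authoritative source.
    pct = diff / authoritative.count * 100 if authoritative.count else None
    return {
        "authoritative": authoritative.source,
        "compared": other.source,
        "absolute_gap": diff,
        "relative_gap_pct": round(pct, 1) if pct is not None else None,
    }


def denominator_report(raw_total: int, excluded: dict[str, int], uncertain: int) -> dict:
    """Show how filters and the uncertain bucket change the denominator."""
    excluded_total = sum(excluded.values())
    return {
        "raw_total": raw_total,
        "excluded_by_filter": excluded,
        "excluded_total": excluded_total,
        "uncertain": uncertain,
        # Assumption: uncertain rows are reported separately, not counted as confirmed.
        "denominator": raw_total - excluded_total - uncertain,
    }
```

Reporting `excluded_by_filter` per filter, rather than only the final denominator, lets a reader see which exclusion moved the number.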

## Common Stack Patterns

### Product DB + PostHog

- Use the database for user, account, workspace, content, order, or subscription truth.
- Use PostHog for funnel sequencing, page or screen paths, and timing cliffs.
- Reconcile identity carefully when anonymous traffic later becomes signed-in.
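
One way to reconcile anonymous and signed-in traffic is sketched below. The field names `anon_id` and `user_id` are hypothetical and do not reflect PostHog's own schema. The sketch assumes that any event carrying both identifiers is a valid link between them.

```python
def stitch_identities(events):
    """Resolve anonymous events to users where a link exists; keep the rest as uncertain."""
    # Build an anon_id -> user_id map from events that carry both identifiers.
    alias = {}
    for e in events:
        if e.get("anon_id") and e.get("user_id"):
            alias[e["anon_id"]] = e["user_id"]

    stitched, uncertain = [], []
    for e in events:
        user = e.get("user_id") or alias.get(e.get("anon_id"))
        if user:
            stitched.append({**e, "resolved_user": user})
        else:
            # No link found: do not guess, report these separately.
            uncertain.append(e)
    return stitched, uncertain
```

Events that never resolve stay in the `uncertain` list so their volume can be reported rather than silently dropped.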

### Product DB + Amplitude

- Use the database for canonical entities and monetization joins.
- Use Amplitude for pathing, retention curves, and behavioral cohorts.
- Be explicit about which Amplitude events are client-side proxies vs backend-confirmed actions.

### Product DB + Mixpanel

- Use the database for authoritative state transitions and historical truth.
- Use Mixpanel for event-level funnels and repeated behavior patterns.
- Watch for duplicate event names or mixed client/server instrumentation under one label.
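
A quick check for mixed instrumentation might look like the following. It assumes each exported event row has a `name` and a `source` field (for example `"client"` or `"server"`). Both field names are hypothetical and would need mapping to whatever your export actually contains.

```python
from collections import defaultdict


def mixed_instrumentation(events):
    """Return event names that are emitted by more than one instrumentation source."""
    sources = defaultdict(set)
    for e in events:
        sources[e["name"]].add(e["source"])
    # Keep only names seen from two or more sources.
    return {name: sorted(s) for name, s in sources.items() if len(s) > 1}
```

Any name this returns deserves a closer look before it is used as a funnel step, since client and server copies of the same action can double-count.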

### GA4 + Backend Events

- Use GA4 for traffic source, landing behavior, and broad site conversion flow.
- Use backend events or the database for account creation, checkout completion, fulfillment, and durable value events.
- Expect attribution gaps between anonymous web traffic and logged-in product usage.

### Stripe + Product Usage

- Use Stripe for trial, paid conversion, cancellations, renewals, and refunds.
- Use product usage data for value realization before and after billing moments.
- Test whether churn-like complaints map to weak usage before billing, short one-session use, or successful one-time completion.
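
The churn test in the last bullet can be framed as a simple classification over cancellations joined to usage. Everything here is illustrative: the field names are hypothetical (they are not Stripe API fields), and the session threshold is an arbitrary placeholder to tune against your own data.

```python
from collections import Counter

WEAK_USAGE_MAX_SESSIONS = 2  # hypothetical threshold, not a recommended value


def classify_cancellation(sessions_before_billing: int, completed_core_task: bool) -> str:
    """Map one cancelled account to a usage pattern."""
    if sessions_before_billing == 1 and completed_core_task:
        return "successful_one_time_completion"
    if sessions_before_billing == 1:
        return "short_one_session_use"
    if sessions_before_billing <= WEAK_USAGE_MAX_SESSIONS:
        return "weak_usage_before_billing"
    return "other"


def summarize(cancellations):
    """Count patterns and keep the denominator alongside them."""
    counts = Counter(
        classify_cancellation(c["sessions_before_billing"], c["completed_core_task"])
        for c in cancellations
    )
    return {"denominator": len(cancellations), "by_pattern": dict(counts)}
```

A large `other` share suggests the three hypotheses do not explain the complaints and a different cut is needed.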

### Warehouse-First Setups

- Use warehouse models as the reporting surface, but still state which upstream source owns each field.
- Check model freshness before drawing pre/post rollout conclusions.
- If model logic changed recently, treat it as part of release context.
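
The freshness and logic-change checks could be expressed as below. This sketch assumes you can obtain a last-build timestamp and a last-logic-change timestamp for the model. `check_model` and its parameters are hypothetical names, and all timestamps are expected to be timezone-aware.

```python
from datetime import datetime, timezone
from typing import Optional


def check_model(
    last_built_at: datetime,
    logic_changed_at: Optional[datetime],
    window_start: datetime,
    window_end: datetime,
) -> dict:
    """Flag stale models and logic changes that fall inside the analysis window."""
    return {
        # A model built before the window closed cannot contain the full post period.
        "covers_window": last_built_at >= window_end,
        # A logic change inside the window is release context, not product signal.
        "logic_changed_in_window": (
            logic_changed_at is not None
            and window_start <= logic_changed_at <= window_end
        ),
    }
```

If `covers_window` is false, the post-rollout cohort is incomplete and any pre/post comparison should wait for a rebuild.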

## Release Context Questions

- What shipped recently in the funnel being studied?
- What copy, pricing, paywall, onboarding, or experiment changes define the relevant cohorts?
- Which high-traffic surfaces have not changed in weeks?
- If the metric moved without a related product change, is traffic mix, seasonality, or instrumentation a better explanation?

## Output Discipline

- Always show denominators.
- Always state the time window.
- Separate signal from speculation.
- If a metric is based on a proxy, say so plainly.
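
As a small illustration of these rules, a metric formatter can refuse to emit a rate without its denominator and time window. `format_metric` is a hypothetical helper, and the output format is only one possible convention.

```python
def format_metric(name, numerator, denominator, window_start, window_end, is_proxy=False):
    """Render a rate together with its counts, time window, and proxy flag."""
    if denominator:
        rate = f"{numerator / denominator * 100:.1f}%"
    else:
        # Avoid dividing by zero; an empty denominator is itself worth reporting.
        rate = "n/a"
    line = f"{name}: {rate} ({numerator}/{denominator}), window {window_start} to {window_end}"
    if is_proxy:
        line += " [proxy]"
    return line


print(format_metric("Checkout completion", 45, 300, "2024-01-01", "2024-01-31", is_proxy=True))
# prints: Checkout completion: 15.0% (45/300), window 2024-01-01 to 2024-01-31 [proxy]
```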
