Platform Guide

Know Who Your Visitors Are — Even Across Devices

ClickStream's 5-layer identity stack automatically resolves anonymous visitors into unified profiles. Here's what it does, why it matters, and how it works under the hood.

March 2026 • 16 min read

What You'll See in the Dashboard

Open the Visitors tab at einstein.clickstream.com and click any visitor profile. You'll see a unified timeline showing every session across every device (desktop, mobile, tablet), stitched together automatically. No login is required from the visitor. The identity stack described below is what makes that possible. Beyond the tracking domain you set up during onboarding, there is nothing to configure or code.

The 5-Layer Identity Stack

ClickStream automatically resolves visitor identity using 5 layers of signals, ordered from the deterministic anchor (Layer 1) to the most probabilistic fallback (Layer 5). Each layer fills gaps where the layers above it have no signal. Together, they build as complete a visitor profile as the available signals allow, all handled by the platform.
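To make the "each layer fills gaps" idea concrete, here is a minimal sketch of a priority fallback. It is not ClickStream's internal code: the signal keys (`cookie`, `hashedEmail`, `maid`, `clickId`, `signature`) and the function name are hypothetical, chosen only to mirror the five layers.

```javascript
// Layer order from the guide: 1 = first-party cookie ... 5 = probabilistic.
// The signal keys below are hypothetical names chosen for this sketch.
const LAYER_PRIORITY = [
    { layer: 1, key: 'cookie' },
    { layer: 2, key: 'hashedEmail' },
    { layer: 3, key: 'maid' },
    { layer: 4, key: 'clickId' },
    { layer: 5, key: 'signature' },
];

// Return the highest-priority layer that has a non-empty signal, or null.
function strongestSignal(signals) {
    for (const { layer, key } of LAYER_PRIORITY) {
        const value = signals[key];
        if (typeof value === 'string' && value.length > 0) {
            return { layer, key, value };
        }
    }
    return null;
}
```

A visitor with no cookie but a click ID and a browser signature would resolve through Layer 4 here; a visitor with no signals at all yields `null`.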

Layer 1: First-Party Cookie (The Anchor)

What it does for you: Recognizes returning visitors reliably — even in Safari and Firefox — by using a server-set first-party cookie on your domain. The platform sets this up automatically during onboarding when you configure your tracking domain.

Attribute | Value
Signal type | Deterministic
Accuracy | High (deterministic match)
Persistence | 365 days
Coverage | 70-92% of visitors (depends on consent)
Cross-device | No (per-browser)
Privacy impact | Low (first-party, consent-gated)

The first-party cookie is the anchor of the identity graph. Every other signal resolves back to this identifier. When a visitor provides an email on one device, and that same email appears in a session on another device, both device cookies are linked in the identity graph through the email as the connecting node.
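For readers curious what a server-set first-party cookie looks like on the wire, the sketch below builds a `Set-Cookie` header value with the 365-day persistence listed above. The cookie name `cs_vid`, the function name, and the `Secure`/`HttpOnly`/`SameSite` choices are assumptions for illustration; the platform's actual cookie may differ.

```javascript
// Build a Set-Cookie header value for a server-set first-party visitor cookie.
// The cookie name "cs_vid" and the attribute choices are assumptions.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 365-day persistence

function buildVisitorCookie(visitorId, trackingDomain) {
    return [
        `cs_vid=${encodeURIComponent(visitorId)}`,
        `Domain=${trackingDomain}`, // your own domain, so the cookie is first-party
        'Path=/',
        `Max-Age=${ONE_YEAR_SECONDS}`,
        'Secure',   // only sent over HTTPS
        'HttpOnly', // set and read by the server, not by page scripts
        'SameSite=Lax',
    ].join('; ');
}
```

Setting the cookie from the server in an HTTP response, rather than from page JavaScript, is the part that matters for the Safari and Firefox behavior described above.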

Layer 2: Hashed Email / Phone / CRM IDs

What it does for you: When a visitor fills a form, logs in, or matches a CRM record, ClickStream links their anonymous sessions to a known identity — and connects all their devices. The platform stores only one-way hashes (SHA-256), never raw emails or phone numbers.

Attribute | Value
Signal type | Deterministic
Accuracy | 99%+ (when available)
Persistence | Indefinite (tied to CRM)
Coverage | 10-30% of visitors (authentication required)
Cross-device | Yes (same email/phone across devices)
Privacy impact | Medium (PII-derived, consent required)

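Hashed emails are the most powerful cross-device signal because people use the same email address everywhere. A SHA-256 hash cannot be reversed directly into the original email, but the same normalized email always produces the same hash, so it can be matched across sessions and devices. Because common addresses can be guessed and re-hashed, the hash is still treated as pseudonymized personal data rather than anonymous data.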

Under the Hood: How Hashing Works
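// Hashing example (SHA-256)
// Input:  "user@example.com" (lowercased, trimmed)
// Output: "b4c9a289323b21a01c3e940f150eb9b8c542587f1abfd8f0e1cc1ffc5e475514"

// ClickStream stores ONLY the hash, never the raw email
async function hashEmail(email) {
    // Normalize first so "User@Example.com " and "user@example.com" hash identically
    const normalized = email.trim().toLowerCase();
    // crypto.subtle.digest resolves to an ArrayBuffer, not a string
    const digest = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(normalized));
    // Convert the raw bytes to a lowercase hex string for storage and matching
    return Array.from(new Uint8Array(digest))
        .map(byte => byte.toString(16).padStart(2, '0'))
        .join('');
}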

Layer 3: Mobile Ad IDs (MAIDs) and Social IDs

What it does for you: When available, mobile advertising IDs help stitch app-to-web sessions. ClickStream captures these automatically via SDK integrations — no configuration needed.

Attribute | Value
Signal type | Deterministic (device-level)
Accuracy | High (when available)
Persistence | Until user resets
Coverage | 15-25% (iOS ATT reduced IDFA availability)
Cross-device | No (per-device)
Privacy impact | High (device tracking, consent required)

Apple's App Tracking Transparency (ATT) dramatically reduced IDFA availability from ~70% to ~25% of iOS users. GAID remains more available on Android but Google has announced deprecation plans. MAIDs are declining in importance but remain valuable for app-to-web identity stitching.

Layer 4: Click IDs and UTM Parameters

What it does for you: ClickStream automatically captures click IDs (gclid, fbclid, msclkid, ttclid) and UTM parameters from ad platforms on landing. These are stored against the visitor's cookie for accurate multi-touch attribution — visible in the Campaigns tab.

Attribute | Value
Signal type | Deterministic (session-level)
Accuracy | 100% (for that click)
Persistence | Session only (URL parameter)
Coverage | 30-60% of sessions (paid traffic only)
Cross-device | No
Privacy impact | Low (campaign metadata)

Click IDs are the bridge between ad platforms and your site. They're 100% accurate for the session they appear in, but they exist only in the URL of the initial ad click. Without a persistent cookie to store them against, they're lost when the user navigates to a second page or returns later.
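As a rough illustration of capture-on-landing, the sketch below pulls the click IDs named above and any `utm_` parameters out of a landing URL and files them under the visitor's cookie. The function names and the in-memory `campaignStore` are hypothetical stand-ins; the platform's real storage is not shown in this guide.

```javascript
// Click ID parameters named in this guide.
const CLICK_ID_PARAMS = ['gclid', 'fbclid', 'msclkid', 'ttclid'];

// Extract click IDs and UTM parameters from a landing URL.
function extractCampaignSignals(landingUrl) {
    const params = new URL(landingUrl).searchParams;
    const clickIds = {};
    const utm = {};
    for (const [name, value] of params) {
        if (CLICK_ID_PARAMS.includes(name)) {
            clickIds[name] = value;
        } else if (name.startsWith('utm_')) {
            utm[name] = value;
        }
    }
    return { clickIds, utm };
}

// Hypothetical in-memory store keyed by visitor cookie (stand-in for real storage).
const campaignStore = new Map();

// Persist campaign signals against the cookie so they survive later page views.
function recordLanding(cookieId, landingUrl) {
    const signals = extractCampaignSignals(landingUrl);
    const hasSignals =
        Object.keys(signals.clickIds).length > 0 || Object.keys(signals.utm).length > 0;
    if (hasSignals) {
        const history = campaignStore.get(cookieId) ?? [];
        history.push(signals);
        campaignStore.set(cookieId, history);
    }
    return signals;
}
```

Calling `recordLanding` on a second page view without campaign parameters leaves the stored history untouched, which is the point: the click ID is kept with the cookie rather than with the URL.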

Layer 5: Behavioral Signature / IP Household / Probabilistic

What it does for you: For visitors with no cookie or login, ClickStream falls back to probabilistic signals — browser characteristics, IP-based household clustering, and behavioral biometrics — to maintain partial identity resolution. You'll see these visitors in the dashboard with a "probabilistic" confidence badge.

Attribute | Value
Signal type | Probabilistic
Accuracy | 40-65%
Persistence | Session only (regenerated)
Coverage | 100% (always available)
Cross-device | Limited (IP household only)
Privacy impact | Varies (device recognition may violate EU regulations)

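Layer 5 signals include browser signatures (browser characteristics), IP-based household clustering, and behavioral biometrics.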

Example Journey: What You'll See in the Dashboard

Here's how the 5 identity layers converge across a real visitor journey — and what each step looks like in your ClickStream dashboard:

Session 1: Google Ads Click (Monday)

Identity graph: v_abc123 = anonymous visitor from Google Ads brand campaign.

Session 2: Organic Return (Wednesday)

Identity graph: v_abc123 has 2 sessions, one paid, one organic. Journey stitched.

Session 3: Form Fill (Thursday)

Identity graph: v_abc123 is now linked to hashed email b4c9a2.... Previously anonymous visitor is now a known contact.

Session 4: Mobile Visit (Friday)

Identity graph: v_xyz789 (mobile) is linked to v_abc123 (desktop) via shared hashed email. Cross-device identity resolved.

Session 5: Conversion (Saturday, Desktop)

Complete journey: Google Ads click (Mon) → Organic return (Wed) → Form fill (Thu) → Mobile research (Fri) → Conversion (Sat). Five sessions, two devices, one unified identity. In the Visitors tab, you'd see all five sessions in a single timeline under one visitor profile, with the attribution chain visible in the Campaigns tab.
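The cross-device step in this journey can be sketched as a small union-find over identifiers: any two cookies that share a hashed email land in the same profile. This is an illustrative model only. The event shape (`cookie`, `hashedEmail`) and the function name are assumptions, not ClickStream's actual schema.

```javascript
// Group cookie IDs into profiles: cookies that share a hashed email
// end up in the same profile. The input event shape is an assumption.
function linkCookiesByHashedEmail(events) {
    const parent = new Map(); // union-find parent pointers over node IDs

    function find(node) {
        if (!parent.has(node)) parent.set(node, node);
        let root = node;
        while (parent.get(root) !== root) root = parent.get(root);
        parent.set(node, root); // light path compression
        return root;
    }

    function union(a, b) {
        const rootA = find(a);
        const rootB = find(b);
        if (rootA !== rootB) parent.set(rootB, rootA);
    }

    for (const { cookie, hashedEmail } of events) {
        find(`cookie:${cookie}`); // register the cookie even if it stays anonymous
        if (hashedEmail) union(`cookie:${cookie}`, `hem:${hashedEmail}`);
    }

    // Collect the cookies that ended up under each root.
    const profiles = new Map();
    for (const node of parent.keys()) {
        if (!node.startsWith('cookie:')) continue;
        const root = find(node);
        if (!profiles.has(root)) profiles.set(root, []);
        profiles.get(root).push(node.slice('cookie:'.length));
    }
    return [...profiles.values()];
}
```

Fed the five sessions above, with the hashed email present on the Thursday and Friday events, the desktop cookie `v_abc123` and the mobile cookie `v_xyz789` come back in one group.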

Under the Hood: Identity Signal Storage

ClickStream stores identity signals in a pipe-delimited format within Analytics Engine's 20-blob field limit. This encoding maximizes information density per field:

// Identity signal storage format
// blob1: identity_signals
"cookie:v_abc123|hem:b4c9a289...|gclid:EAIaIQob..."

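// Parsing function
function parseIdentitySignals(blob) {
    const signals = {};
    if (!blob) return signals; // empty or missing blob: nothing to parse
    for (const pair of blob.split('|')) {
        // Split on the FIRST colon only, so values that contain ':' stay intact
        const separator = pair.indexOf(':');
        if (separator <= 0) continue; // skip malformed pairs with no key
        signals[pair.slice(0, separator)] = pair.slice(separator + 1);
    }
    return signals;
}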

// Result:
// {
//     cookie: "v_abc123",
//     hem: "b4c9a289...",
//     gclid: "EAIaIQob..."
// }
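The write side of this format is not shown in the guide, so here is one way it might look. `serializeIdentitySignals` is a hypothetical helper, and rejecting reserved delimiter characters is a design choice of this sketch rather than documented platform behavior.

```javascript
// Encode identity signals as "key:value" pairs joined by "|".
// Rejecting reserved characters is a design choice of this sketch.
function serializeIdentitySignals(signals) {
    return Object.entries(signals)
        // Drop signals that are absent so the blob stays compact
        .filter(([, value]) => value !== undefined && value !== null && value !== '')
        .map(([key, value]) => {
            const text = String(value);
            // "|" would break pair splitting; ":" in a key would break key/value splitting
            if (key.includes(':') || key.includes('|') || text.includes('|')) {
                throw new Error(`Signal "${key}" contains a reserved delimiter`);
            }
            return `${key}:${text}`;
        })
        .join('|');
}
```

Failing loudly on a stray `|` is safer than writing a blob that silently parses into the wrong keys.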

Privacy Considerations by Signal Type

Each identity layer has different privacy implications. This table maps each signal to its regulatory classification and consent requirements:

Layer | Signal | GDPR Classification | Consent Required? | Opt-Out Mechanism
1 | First-party cookie | Online identifier (Art. 4) | Yes (EU/UK), No (US opt-out) | Cookie consent banner, browser settings
2 | Hashed email | Pseudonymized personal data | Yes (explicit) | Account deletion, erasure request
2 | Hashed phone | Pseudonymized personal data | Yes (explicit) | Account deletion, erasure request
3 | IDFA/GAID | Online identifier | Yes (ATT on iOS) | OS settings, ATT prompt
4 | Click IDs | Not personal data (campaign metadata) | No (functional) | N/A
4 | UTM parameters | Not personal data | No | N/A
5 | Browser signature | Online identifier (Art. 4) | Yes (EU), debated | Browser anti-device recognition settings
5 | IP address | Personal data (EU courts) | Legitimate interest (analytics) | VPN, proxy
5 | Behavioral biometrics | Potentially special category data | Yes (explicit recommended) | SDK opt-out

ClickStream's consent framework respects these classifications. When a visitor in the EU declines analytics cookies, Layers 1-3 are disabled. Layer 4 (click IDs/UTMs, which are campaign metadata) remains available. Layer 5 is only activated as a last-resort fallback with appropriate legal basis.
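The gating rule in the paragraph above can be written down as a small function. This is a simplified sketch: the parameter names (`inEU`, `analyticsConsent`, `hasLayer5LegalBasis`) are hypothetical, it models only the EU-decline case the guide describes, and it treats every other case as permitted, which is an assumption rather than documented behavior.

```javascript
// Decide which identity layers may run for a visitor.
// Only the rule stated in this guide is modeled; parameter names are hypothetical.
function allowedLayers({ inEU, analyticsConsent, hasLayer5LegalBasis }) {
    const allowed = [];
    const declinedInEU = inEU && !analyticsConsent;
    // Layers 1-3 are disabled when an EU visitor declines analytics cookies
    if (!declinedInEU) allowed.push(1, 2, 3);
    // Layer 4 (click IDs / UTMs) remains available as campaign metadata
    allowed.push(4);
    // Layer 5 is a last-resort fallback and needs an appropriate legal basis
    if (hasLayer5LegalBasis) allowed.push(5);
    return allowed;
}
```

An EU visitor who declines, with no Layer 5 legal basis established, is left with Layer 4 only.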

The Identity Graph Architecture

All five layers feed into ClickStream's identity graph, which maintains relationships between identifiers:

// Identity graph data model (simplified)
{
    "canonical_id": "profile_f7a82b",
    "identifiers": {
        "cookies": ["v_abc123", "v_xyz789"],
        "hashed_emails": ["b4c9a289..."],
        "hashed_phones": [],
        "maids": [],
        "click_ids": {
            "gclid": ["EAIaIQob..."],
            "fbclid": []
        }
    },
    "devices": [
        { "type": "desktop", "browser": "Chrome", "cookie": "v_abc123" },
        { "type": "mobile", "browser": "Safari", "cookie": "v_xyz789" }
    ],
    "first_seen": "2026-02-24T10:30:00Z",
    "last_seen": "2026-03-01T14:22:00Z",
    "total_sessions": 5,
    "touchpoints": [
        { "session": 1, "source": "google_ads", "device": "desktop" },
        { "session": 2, "source": "organic", "device": "desktop" },
        { "session": 3, "source": "direct", "device": "desktop" },
        { "session": 4, "source": "direct", "device": "mobile" },
        { "session": 5, "source": "direct", "device": "desktop" }
    ]
}

The canonical_id is the unified identifier that resolves all signals back to a single person. Every analytics query, behavioral score, and attribution model operates on the canonical ID, not on individual cookies or sessions.

The identity graph is not a database. It's a resolution layer. Raw events reference cookies. The graph resolves cookies to people. Analytics operate on people.
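A short sketch of that cookies-to-people resolution: build a cookie index from graph profiles shaped like the simplified data model above, then count sessions per person instead of per cookie. Both function names and the `unresolved:` fallback label are hypothetical.

```javascript
// Build a cookie -> canonical_id index from graph profiles
// (profiles follow the simplified data model shown in this guide).
function buildCookieIndex(profiles) {
    const index = new Map();
    for (const profile of profiles) {
        for (const cookie of profile.identifiers.cookies) {
            index.set(cookie, profile.canonical_id);
        }
    }
    return index;
}

// Count sessions per person rather than per cookie.
function sessionsPerPerson(events, cookieIndex) {
    const counts = new Map();
    for (const { cookie } of events) {
        // Cookies not yet in the graph fall back to a per-cookie placeholder
        const person = cookieIndex.get(cookie) ?? `unresolved:${cookie}`;
        counts.set(person, (counts.get(person) ?? 0) + 1);
    }
    return counts;
}
```

With the example profile, four desktop sessions on `v_abc123` and one mobile session on `v_xyz789` roll up to five sessions for `profile_f7a82b`.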

Summary: When to Use Each Layer

Layer | Best For | Limitations
1. First-party cookie | Primary identity, cross-session tracking, behavioral scoring | Per-browser, per-device only
2. Hashed email/phone | Cross-device resolution, CRM matching | Requires authentication (10-30% coverage)
3. MAIDs | App-to-web stitching, mobile identity | Declining availability (ATT)
4. Click IDs / UTMs | Attribution, campaign measurement | Session-level only, paid traffic only
5. Signature / IP | Last-resort fallback, household clustering | Low accuracy, privacy concerns

The stack works because no single layer provides complete coverage. First-party cookies cover 70-92% of visitors but can't cross devices. Hashed emails cross devices but only cover 10-30%. Click IDs provide perfect attribution for paid traffic but nothing for organic. Together, the layers create comprehensive identity resolution.

Make Every Ad Dollar Reach a Real Customer

Five layers of identity resolution ensure your retargeting, attribution, and personalization reach real people: not bots, not duplicates, not ghosts.
