What You'll See in the Dashboard
Open the Visitors tab at einstein.clickstream.com and click any visitor profile. You'll see a unified timeline showing every session across every device — desktop, mobile, tablet — stitched together automatically. No login required from the visitor. The identity stack described below is what makes that possible. You don't need to configure or code anything — it works out of the box.
The 5-Layer Identity Stack
ClickStream automatically resolves visitor identity using five layers of signals, ordered from most deterministic (Layer 1) to most probabilistic (Layer 5). Each layer fills gaps where the layer above has no signal. Together, they create the most complete visitor profile possible, all handled by the platform.
Layer 1: First-Party Cookie (The Anchor)
What it does for you: Recognizes returning visitors reliably — even in Safari and Firefox — by using a server-set first-party cookie on your domain. The platform sets this up automatically during onboarding when you configure your tracking domain.
| Attribute | Value |
|---|---|
| Signal type | Deterministic |
| Accuracy | High (deterministic match) |
| Persistence | 365 days |
| Coverage | 70-92% of visitors (depends on consent) |
| Cross-device | No (per-browser) |
| Privacy impact | Low (first-party, consent-gated) |
The first-party cookie is the anchor of the identity graph. Every other signal resolves back to this identifier. When a visitor provides an email on one device, and that same email appears in a session on another device, both device cookies are linked in the identity graph through the email as the connecting node.
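As a rough illustration of what a server-set first-party cookie with 365-day persistence could look like, the sketch below builds a Set-Cookie header value. The `_cs_id` name comes from the example journey in this article; the helper name `buildIdentityCookie` and the attribute choices (Secure, HttpOnly, SameSite=Lax) are assumptions for illustration, not ClickStream's documented configuration.

```javascript
// Hypothetical helper: builds a Set-Cookie header value for the visitor ID.
// Attribute choices are assumptions, not ClickStream's documented settings.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 365-day persistence

function buildIdentityCookie(visitorId, domain) {
  return [
    `_cs_id=${encodeURIComponent(visitorId)}`,
    `Max-Age=${ONE_YEAR_SECONDS}`,
    `Domain=${domain}`,
    'Path=/',
    'Secure',
    'HttpOnly', // assumed: cookie is read server-side only
    'SameSite=Lax',
  ].join('; ');
}

const header = buildIdentityCookie('v_abc123', 'example.com');
```

The resulting string would be sent by the server in a `Set-Cookie` response header on the tracking domain.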
Layer 2: Hashed Email / Phone / CRM IDs
What it does for you: When a visitor fills a form, logs in, or matches a CRM record, ClickStream links their anonymous sessions to a known identity — and connects all their devices. The platform stores only one-way hashes (SHA-256), never raw emails or phone numbers.
| Attribute | Value |
|---|---|
| Signal type | Deterministic |
| Accuracy | 99%+ (when available) |
| Persistence | Indefinite (tied to CRM) |
| Coverage | 10-30% of visitors (authentication required) |
| Cross-device | Yes (same email/phone across devices) |
| Privacy impact | Medium (PII-derived, consent required) |
Hashed emails are the most powerful cross-device signal because people use the same email address everywhere. When stored as SHA-256 hashes, the original email cannot be recovered, but the hash can be matched across sessions and devices.
Under the Hood: How Hashing Works
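// Hashing example (SHA-256)
// Input: "user@example.com" (lowercased, trimmed)
// Output: "b4c9a289323b21a01c3e940f150eb9b8c542587f1abfd8f0e1cc1ffc5e475514"
// ClickStream stores ONLY the hash, never the raw email
const digest = await crypto.subtle.digest(
  'SHA-256',
  new TextEncoder().encode(email.toLowerCase().trim())
);
// digest() resolves to an ArrayBuffer; convert it to a hex string for storage
const hashedEmail = Array.from(new Uint8Array(digest))
  .map((byte) => byte.toString(16).padStart(2, '0'))
  .join('');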
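Matching across devices only works if every device normalizes the input the same way before hashing. The following Node.js sketch shows that idea for both emails and phone numbers. The function names and the phone normalization rule (digits only, keeping a leading "+") are assumptions for illustration; the article does not specify how ClickStream normalizes phone numbers.

```javascript
// Node.js sketch; the normalization rules are illustrative assumptions.
const { createHash } = require('node:crypto');

function normalizeIdentifier(kind, raw) {
  const value = raw.trim();
  if (kind === 'email') return value.toLowerCase();
  if (kind === 'phone') {
    // Keep digits only, preserving a leading "+" if present.
    const digits = value.replace(/\D/g, '');
    return value.startsWith('+') ? `+${digits}` : digits;
  }
  throw new Error(`Unsupported identifier kind: ${kind}`);
}

function hashIdentifier(kind, raw) {
  return createHash('sha256')
    .update(normalizeIdentifier(kind, raw), 'utf8')
    .digest('hex');
}

// Different casing and whitespace still produce the same hash.
const desktopHash = hashIdentifier('email', ' User@Example.com ');
const mobileHash = hashIdentifier('email', 'user@example.com');
const sameVisitor = desktopHash === mobileHash;
```

Because both inputs normalize to the same string, the two hashes are identical and the sessions can be linked without storing the raw address.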
Layer 3: Mobile Ad IDs (MAIDs) and Social IDs
What it does for you: When available, mobile advertising IDs help stitch app-to-web sessions. ClickStream captures these automatically via SDK integrations — no configuration needed.
| Attribute | Value |
|---|---|
| Signal type | Deterministic (device-level) |
| Accuracy | High (when available) |
| Persistence | Until user resets |
| Coverage | 15-25% (iOS ATT reduced IDFA availability) |
| Cross-device | No (per-device) |
| Privacy impact | High (device tracking, consent required) |
Apple's App Tracking Transparency (ATT) dramatically reduced IDFA availability from ~70% to ~25% of iOS users. GAID remains more available on Android but Google has announced deprecation plans. MAIDs are declining in importance but remain valuable for app-to-web identity stitching.
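One practical consequence of ATT is that an app not authorized to track receives an all-zero IDFA, which must not be treated as an identifier. The sketch below filters such values before stitching. `isUsableMaid` is a hypothetical helper, not part of any ClickStream SDK.

```javascript
// Sketch: decide whether a mobile ad ID is usable for app-to-web stitching.
// isUsableMaid is a hypothetical helper name.
const ZEROED_ID = '00000000-0000-0000-0000-000000000000';
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUsableMaid(maid) {
  if (typeof maid !== 'string') return false;
  // Reject malformed values and the all-zero placeholder.
  return UUID_PATTERN.test(maid) && maid !== ZEROED_ID;
}
```

Without a check like this, every opted-out device would share the same placeholder ID and could be merged into a single profile.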
Layer 4: Click IDs and UTM Parameters
What it does for you: ClickStream automatically captures click IDs (gclid, fbclid, msclkid, ttclid) and UTM parameters from ad platforms on landing. These are stored against the visitor's cookie for accurate multi-touch attribution — visible in the Campaigns tab.
| Attribute | Value |
|---|---|
| Signal type | Deterministic (session-level) |
| Accuracy | 100% (for that click) |
| Persistence | Session only (URL parameter) |
| Coverage | 30-60% of sessions (paid traffic only) |
| Cross-device | No |
| Privacy impact | Low (campaign metadata) |
Click IDs are the bridge between ad platforms and your site. They're 100% accurate for the session they appear in, but they exist only in the URL of the initial ad click. Without a persistent cookie to store them against, they're lost when the user navigates to a second page or returns later.
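Capturing these parameters on landing can be sketched with the standard `URL` API. The list of click ID keys comes from this article; the function name `extractCampaignParams` and the shape of the returned object are assumptions for illustration.

```javascript
// Click ID parameters named in this article.
const CLICK_ID_KEYS = ['gclid', 'fbclid', 'msclkid', 'ttclid'];

// Hypothetical helper: pulls click IDs and UTM parameters out of a landing URL.
function extractCampaignParams(landingUrl) {
  const params = new URL(landingUrl).searchParams;
  const result = { clickIds: {}, utm: {} };
  for (const [key, value] of params) {
    if (CLICK_ID_KEYS.includes(key)) result.clickIds[key] = value;
    else if (key.startsWith('utm_')) result.utm[key] = value;
    // Any other query parameter is ignored.
  }
  return result;
}

const landing = extractCampaignParams(
  'https://example.com/pricing?gclid=abc123&utm_source=google&utm_medium=cpc&ref=nav'
);
```

The captured values would then be stored against the visitor's cookie so they survive beyond the first page view.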
Layer 5: Behavioral Signature / IP Household / Probabilistic
What it does for you: For visitors with no cookie or login, ClickStream falls back to probabilistic signals — browser characteristics, IP-based household clustering, and behavioral biometrics — to maintain partial identity resolution. You'll see these visitors in the dashboard with a "probabilistic" confidence badge.
| Attribute | Value |
|---|---|
| Signal type | Probabilistic |
| Accuracy | 40-65% |
| Persistence | Session only (regenerated) |
| Coverage | 100% (always available) |
| Cross-device | Limited (IP household only) |
| Privacy impact | Varies (device recognition may violate EU regulations) |
Layer 5 signals include:
- Browser signature: User agent, screen resolution, timezone, language, hardware concurrency, WebGL renderer
- IP-based household clustering: Group visitors by IP as a proxy for household/office identity
- Behavioral biometrics: Mouse movement patterns, typing cadence, and scroll behavior, which can be distinctive enough to suggest identity
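Of the three signals above, IP-based household clustering is the simplest to sketch: sessions are grouped by IP address. The function name and session shape below are assumptions for illustration, and the IP addresses are documentation-range placeholders.

```javascript
// Sketch: group visitor IDs by IP address as a rough household/office proxy.
function clusterByIp(sessions) {
  const households = new Map();
  for (const { ip, visitorId } of sessions) {
    if (!households.has(ip)) households.set(ip, new Set());
    households.get(ip).add(visitorId);
  }
  return households;
}

const households = clusterByIp([
  { ip: '203.0.113.7', visitorId: 'v_abc123' },
  { ip: '203.0.113.7', visitorId: 'v_xyz789' },
  { ip: '198.51.100.4', visitorId: 'v_def456' },
  { ip: '203.0.113.7', visitorId: 'v_abc123' }, // repeat visit, same visitor
]);
```

A shared IP only suggests a shared household or office, which is why these matches carry a probabilistic rather than deterministic confidence.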
Example Journey: What You'll See in the Dashboard
Here's how the 5 identity layers converge across a real visitor journey — and what each step looks like in your ClickStream dashboard:
Session 1: Google Ads Click (Monday)
- Layer 1: First-party cookie set: _cs_id=v_abc123
- Layer 4: Click ID captured: gclid=EAIaIQobChMI...
- Layer 4: UTM captured: utm_source=google&utm_medium=cpc&utm_campaign=brand
- Layer 5: Device signature generated as backup
Identity graph: v_abc123 = anonymous visitor from Google Ads brand campaign.
Session 2: Organic Return (Wednesday)
- Layer 1: Cookie recognized: _cs_id=v_abc123 (same visitor!)
- Layer 4: No click ID (organic search), referrer captured
Identity graph: v_abc123 has 2 sessions, one paid, one organic. Journey stitched.
Session 3: Form Fill (Thursday)
- Layer 1: Cookie recognized: _cs_id=v_abc123
- Layer 2: Email provided: SHA-256(user@example.com)=b4c9a2...
Identity graph: v_abc123 is now linked to hashed email b4c9a2.... Previously anonymous visitor is now a known contact.
Session 4: Mobile Visit (Friday)
- Layer 1: New cookie on mobile: _cs_id=v_xyz789 (different device)
- Layer 2: Logs in with same email → SHA-256(user@example.com)=b4c9a2...
Identity graph: v_xyz789 (mobile) is linked to v_abc123 (desktop) via shared hashed email. Cross-device identity resolved.
Session 5: Conversion (Saturday, Desktop)
- Layer 1: Cookie recognized: _cs_id=v_abc123
Complete journey: Google Ads click (Mon) → Organic return (Wed) → Form fill (Thu) → Mobile research (Fri) → Conversion (Sat). Five sessions, two devices, one unified identity. In the Visitors tab, you'd see all five sessions in a single timeline under one visitor profile, with the attribution chain visible in the Campaigns tab.
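The cross-device link in this journey can be sketched as a small union-find over identifiers: any cookie seen together with a hashed email joins that email's group. This is an illustrative sketch, not ClickStream's actual resolution code; the function name and session shape are assumptions.

```javascript
// Minimal union-find sketch: cookies sharing a hashed email form one profile.
function linkCookies(sessions) {
  const parent = new Map();
  const find = (node) => {
    if (!parent.has(node)) parent.set(node, node);
    let root = node;
    while (parent.get(root) !== root) root = parent.get(root);
    parent.set(node, root); // shortcut for later lookups
    return root;
  };

  for (const { cookie, hashedEmail } of sessions) {
    const cookieRoot = find(`cookie:${cookie}`);
    // Merge the cookie's group into the email's group when an email is present.
    if (hashedEmail) parent.set(cookieRoot, find(`hem:${hashedEmail}`));
  }

  // Group cookies by their resolved root.
  const profiles = new Map();
  for (const { cookie } of sessions) {
    const root = find(`cookie:${cookie}`);
    if (!profiles.has(root)) profiles.set(root, new Set());
    profiles.get(root).add(cookie);
  }
  return [...profiles.values()].map((cookies) => [...cookies]);
}

const journeyProfiles = linkCookies([
  { cookie: 'v_abc123' },                             // Monday, paid click
  { cookie: 'v_abc123' },                             // Wednesday, organic
  { cookie: 'v_abc123', hashedEmail: 'b4c9a289...' }, // Thursday, form fill
  { cookie: 'v_xyz789', hashedEmail: 'b4c9a289...' }, // Friday, mobile login
  { cookie: 'v_abc123' },                             // Saturday, conversion
]);
```

For the five sessions above, both cookies end up in a single group, which corresponds to the one unified profile shown in the Visitors tab.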
Under the Hood: Identity Signal Storage
ClickStream stores identity signals in a pipe-delimited format within Analytics Engine's 20-blob field limit. This encoding maximizes information density per field:
// Identity signal storage format
// blob1: identity_signals
"cookie:v_abc123|hem:b4c9a289...|gclid:EAIaIQob..."
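// Parsing function
function parseIdentitySignals(blob) {
  const signals = {};
  blob.split('|').forEach(pair => {
    // Split on the first ':' only, so values that contain ':' stay intact
    const separator = pair.indexOf(':');
    if (separator === -1) return; // skip malformed or empty segments
    signals[pair.slice(0, separator)] = pair.slice(separator + 1);
  });
  return signals;
}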
// Result:
// {
// cookie: "v_abc123",
// hem: "b4c9a289...",
// gclid: "EAIaIQob..."
// }
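The reverse direction, writing signals into the blob, can be sketched the same way. `encodeIdentitySignals` is a hypothetical helper; rejecting values that contain the `|` delimiter is an assumption about how collisions might be handled, not documented ClickStream behavior.

```javascript
// Sketch: serialize a signals object into the pipe-delimited blob format.
function encodeIdentitySignals(signals) {
  return Object.entries(signals)
    // Drop signals that have no value.
    .filter(([, value]) => value !== undefined && value !== null && value !== '')
    .map(([key, value]) => {
      // Keys may not contain either delimiter; values may not contain '|'.
      if (key.includes(':') || key.includes('|') || String(value).includes('|')) {
        throw new Error(`Signal "${key}" contains a reserved delimiter`);
      }
      return `${key}:${value}`;
    })
    .join('|');
}

const blob1 = encodeIdentitySignals({
  cookie: 'v_abc123',
  hem: 'b4c9a289...',
  gclid: 'EAIaIQob...',
  fbclid: '', // empty, so omitted from the blob
});
```

Omitting empty signals keeps the blob short, which matters when many signals share a single field.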
Privacy Considerations by Signal Type
Each identity layer has different privacy implications. This table maps each signal to its regulatory classification and consent requirements:
| Layer | Signal | GDPR Classification | Consent Required? | Opt-Out Mechanism |
|---|---|---|---|---|
| 1 | First-party cookie | Online identifier (Art. 4) | Yes (EU/UK), No (US opt-out) | Cookie consent banner, browser settings |
| 2 | Hashed email | Pseudonymized personal data | Yes (explicit) | Account deletion, erasure request |
| 2 | Hashed phone | Pseudonymized personal data | Yes (explicit) | Account deletion, erasure request |
| 3 | IDFA/GAID | Online identifier | Yes (ATT on iOS) | OS settings, ATT prompt |
| 4 | Click IDs | Not personal data (campaign metadata) | No (functional) | N/A |
| 4 | UTM parameters | Not personal data | No | N/A |
| 5 | Browser signature | Online identifier (Art. 4) | Yes (EU), debated | Browser anti-fingerprinting settings |
| 5 | IP address | Personal data (EU courts) | Legitimate interest (analytics) | VPN, proxy |
| 5 | Behavioral biometrics | Potentially special category data | Yes (explicit recommended) | SDK opt-out |
ClickStream's consent framework respects these classifications. When a visitor in the EU declines analytics cookies, Layers 1-3 are disabled. Layer 4 (click IDs/UTMs, which are campaign metadata) remains available. Layer 5 is only activated as a last-resort fallback with appropriate legal basis.
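The gating rule in the paragraph above can be sketched as a small function. The parameter names and the `'EU'` region code are assumptions, and treating every non-EU region as fully enabled is a simplification for illustration; the article only describes the EU decline case.

```javascript
// Sketch of the consent gating described above; not ClickStream's actual logic.
function enabledLayers({ region, analyticsConsent, probabilisticLegalBasis }) {
  const layers = new Set([4]); // click IDs and UTMs remain available
  const declinedInEu = region === 'EU' && !analyticsConsent;
  // Layers 1-3 are disabled when an EU visitor declines analytics cookies.
  if (!declinedInEu) [1, 2, 3].forEach((layer) => layers.add(layer));
  // Layer 5 requires an appropriate legal basis (assumed to be a boolean flag).
  if (probabilisticLegalBasis) layers.add(5);
  return [...layers].sort((a, b) => a - b);
}

const declined = enabledLayers({
  region: 'EU',
  analyticsConsent: false,
  probabilisticLegalBasis: false,
});
const accepted = enabledLayers({
  region: 'EU',
  analyticsConsent: true,
  probabilisticLegalBasis: false,
});
```

Here `declined` contains only Layer 4, while `accepted` contains Layers 1 through 4.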
The Identity Graph Architecture
All five layers feed into ClickStream's identity graph, which maintains relationships between identifiers:
// Identity graph data model (simplified)
{
"canonical_id": "profile_f7a82b",
"identifiers": {
"cookies": ["v_abc123", "v_xyz789"],
"hashed_emails": ["b4c9a289..."],
"hashed_phones": [],
"maids": [],
"click_ids": {
"gclid": ["EAIaIQob..."],
"fbclid": []
}
},
"devices": [
{ "type": "desktop", "browser": "Chrome", "cookie": "v_abc123" },
{ "type": "mobile", "browser": "Safari", "cookie": "v_xyz789" }
],
"first_seen": "2026-02-24T10:30:00Z",
"last_seen": "2026-03-01T14:22:00Z",
"total_sessions": 5,
"touchpoints": [
{ "session": 1, "source": "google_ads", "device": "desktop" },
{ "session": 2, "source": "organic", "device": "desktop" },
{ "session": 3, "source": "direct", "device": "desktop" },
{ "session": 4, "source": "direct", "device": "mobile" },
{ "session": 5, "source": "direct", "device": "desktop" }
]
}
The canonical_id is the unified identifier that resolves all signals back to a single person. Every analytics query, behavioral score, and attribution model operates on the canonical ID, not on individual cookies or sessions.
The identity graph is not a database. It's a resolution layer. Raw events reference cookies. The graph resolves cookies to people. Analytics operate on people.
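The resolution step can be sketched as a lookup from cookie to canonical ID, built from profiles shaped like the data model above. The function names and the event shape are assumptions for illustration.

```javascript
// Sketch: build a cookie -> canonical_id index from identity graph profiles.
function buildCookieIndex(profiles) {
  const index = new Map();
  for (const profile of profiles) {
    for (const cookie of profile.identifiers.cookies) {
      index.set(cookie, profile.canonical_id);
    }
  }
  return index;
}

// Attach a canonical ID to each raw event; unknown cookies resolve to null.
function resolveEvents(events, index) {
  return events.map((event) => ({
    ...event,
    canonical_id: index.get(event.cookie) ?? null,
  }));
}

const cookieIndex = buildCookieIndex([
  {
    canonical_id: 'profile_f7a82b',
    identifiers: { cookies: ['v_abc123', 'v_xyz789'] },
  },
]);
const resolved = resolveEvents(
  [
    { cookie: 'v_abc123', page: '/pricing' },
    { cookie: 'v_xyz789', page: '/login' },
    { cookie: 'v_unknown', page: '/' },
  ],
  cookieIndex
);
```

Events from both the desktop and mobile cookies resolve to `profile_f7a82b`, so downstream analytics can aggregate by person rather than by cookie.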
Summary: When to Use Each Layer
| Layer | Best For | Limitations |
|---|---|---|
| 1. First-party cookie | Primary identity, cross-session tracking, behavioral scoring | Per-browser, per-device only |
| 2. Hashed email/phone | Cross-device resolution, CRM matching | Requires authentication (10-30% coverage) |
| 3. MAIDs | App-to-web stitching, mobile identity | Declining availability (ATT) |
| 4. Click IDs / UTMs | Attribution, campaign measurement | Session-level only, paid traffic only |
| 5. Signature / IP | Last-resort fallback, household clustering | Low accuracy, privacy concerns |
The stack works because no single layer provides complete coverage. First-party cookies cover 70-92% of visitors but can't cross devices. Hashed emails cross devices but only cover 10-30%. Click IDs provide perfect attribution for paid traffic but nothing for organic. Together, the layers create comprehensive identity resolution.
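The gap-filling order can be sketched as a walk down the layers that stops at the first one with a signal. The signal key names (`cookie`, `hashedEmail`, `maid`, `clickId`, `signature`) are hypothetical; the layer numbers and signal types follow the tables in this article.

```javascript
// Layer order from this article; key names are illustrative assumptions.
const LAYER_ORDER = [
  { layer: 1, key: 'cookie', type: 'deterministic' },
  { layer: 2, key: 'hashedEmail', type: 'deterministic' },
  { layer: 3, key: 'maid', type: 'deterministic' },
  { layer: 4, key: 'clickId', type: 'deterministic' },
  { layer: 5, key: 'signature', type: 'probabilistic' },
];

// All layers that have a signal for this visitor, strongest first.
function availableLayers(signals) {
  return LAYER_ORDER.filter(({ key }) => Boolean(signals[key]));
}

// The highest layer with a signal, or null when nothing is available.
function primaryLayer(signals) {
  return availableLayers(signals)[0] ?? null;
}

const cookieless = primaryLayer({ signature: 'sig_01' });
const returning = primaryLayer({ cookie: 'v_abc123', signature: 'sig_01' });
```

In this sketch a cookieless visitor falls through to Layer 5, while a returning visitor with a cookie resolves at Layer 1 even though a signature is also present.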