PII in Analytics Data
Personally Identifiable Information (PII) accidentally collected in analytics platforms creates serious privacy compliance risks, potential fines, and loss of user trust. PII includes names, emails, phone numbers, addresses, social security numbers, and other data that can identify individuals.
What This Means
PII leaking into analytics happens through:
- URL parameters - ?email=user@example.com&phone=555-1234
- Page titles - "Thank you, John Smith" in title tag
- Form fields - Accidentally tracking form input values
- Custom dimensions - Passing user data to custom fields
- Site search - Users searching for their own name/email
- Ecommerce data - Including customer names in product fields
Why This Is Critical
Legal consequences:
- GDPR violations - Up to €20M or 4% of global annual turnover, whichever is higher
- CCPA violations - Up to $7,500 per intentional violation
- HIPAA violations - Up to $1.5M per year for healthcare data
- Class action lawsuits - Data privacy litigation
Platform consequences:
- GA4/Adobe Analytics - Terms of Service violations
- Account suspension - Platforms may suspend accounts
- Data deletion requirements - Must delete historical PII data
Business consequences:
- Loss of user trust
- Regulatory audits
- Negative publicity
- Insurance premium increases
How to Diagnose
1. Audit URL Parameters
Check for PII in URLs:
High-risk pages:
- Form confirmation pages (/thank-you?email=...)
- User account pages (/account?user_id=john.smith)
- Search results (/search?q=john+doe+phone+number)
- Password reset pages (/reset?token=...&email=...)
- Checkout pages (/checkout?name=...&address=...)
How to check:
In GA4:
- Reports → Engagement → Pages and screens
- Search for @, email, name, phone, address
- Look at Page path + query string dimension
In Adobe Analytics:
Browser console:
// Check what URLs are being sent
console.log(window.location.href);
// Check dataLayer for PII
console.log(window.dataLayer);
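Eyeballing the URL works for one page, but a small helper makes the check repeatable. The sketch below is only an illustration: the function name auditUrlForPii and the list of suspicious parameter names are assumptions, not part of any analytics library. It deliberately over-flags (for example, "username" matches "name") so that a human reviews the result.

```javascript
// Hypothetical helper: flags query parameters whose name or value looks like PII.
const SUSPECT_PARAM_NAMES = ['email', 'name', 'phone', 'address', 'ssn', 'user'];
const EMAIL_PATTERN = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/;

function auditUrlForPii(urlString) {
  const url = new URL(urlString);
  const findings = [];
  // searchParams yields decoded values, so %40 is already "@" here
  url.searchParams.forEach((value, key) => {
    const lowerKey = key.toLowerCase();
    const nameHit = SUSPECT_PARAM_NAMES.some(name => lowerKey.includes(name));
    const valueHit = EMAIL_PATTERN.test(value);
    if (nameHit || valueHit) {
      findings.push({
        param: key,
        reason: nameHit ? 'suspicious name' : 'email-like value',
      });
    }
  });
  return findings;
}

// In a browser console: auditUrlForPii(window.location.href)
```

An empty array means nothing matched these heuristics; it does not prove the URL is free of PII.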
2. Review Page Titles
Page titles are tracked automatically:
Risky patterns:
<!-- BAD: Contains PII -->
<title>Welcome back, john.smith@example.com - Dashboard</title>
<title>Order confirmation for John Smith - Store</title>
<title>Reset password for user: jane.doe</title>
<!-- GOOD: Generic titles -->
<title>Dashboard - Store</title>
<title>Order Confirmation - Store</title>
<title>Reset Password - Store</title>
Check in GA4:
- Reports → Engagement → Pages and screens
- Review "Page title" dimension
- Search for patterns: @, names, emails
3. Inspect Custom Dimensions and Metrics
Review what's being sent to custom dimensions:
GA4 custom dimensions:
- Admin → Custom Definitions
- Review each custom dimension
- Check example values for PII
Common PII leaks in custom dimensions:
- User ID dimension contains email instead of hash
- Customer tier includes customer name
- User properties include phone numbers
- Custom metrics include personal data
Test by viewing in DebugView:
// GA4 DebugView - look at custom parameters
gtag('config', 'G-XXXXXX', {
  'debug_mode': true
});
4. Check Site Search Tracking
Users may search for their own information:
Risky searches:
- "john smith order status"
- "my account email@example.com"
- "track order 555-1234" (phone as order number)
In GA4:
- Reports → Engagement → Search terms
- Review for names, emails, phone patterns
- Check search query parameter configuration
5. Review Ecommerce Tracking
Common PII in ecommerce data:
// BAD: Contains customer name
gtag('event', 'purchase', {
  transaction_id: 'T12345',
  value: 99.99,
  items: [{
    item_name: 'Gift for John Smith', // PII!
    item_id: 'SKU123'
  }]
});
// GOOD: No PII
gtag('event', 'purchase', {
  transaction_id: 'T12345',
  value: 99.99,
  items: [{
    item_name: 'Blue Widget',
    item_id: 'SKU123'
  }]
});
6. Audit Form Tracking
Check if form field values are being captured:
Test in browser console:
// Before submitting form, check what's tracked
console.log(window.dataLayer);
// Submit form, check again
// Look for form field values in events
Look for:
- Form field values in event parameters
- Input values in custom dimensions
- User data in event labels
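Reading a long dataLayer by eye is error-prone, so the check can be scripted. The following is a rough sketch under stated assumptions: findPiiInDataLayer and PII_CHECKS are hypothetical names, the two patterns only cover email-like and US-style phone values, and the depth limit is an arbitrary guard against very large objects such as DOM elements that tag managers sometimes push.

```javascript
// Hypothetical helper: walks dataLayer-style entries and reports PII-like values.
const PII_CHECKS = [
  { type: 'email', pattern: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/ },
  { type: 'phone', pattern: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/ },
];

function findPiiInDataLayer(entries, maxDepth = 6) {
  const findings = [];
  const seen = new WeakSet(); // guards against circular references

  const visit = (value, path, depth) => {
    if (value === null || value === undefined || depth > maxDepth) return;
    if (typeof value === 'function') return;
    if (typeof value === 'object') {
      if (seen.has(value)) return;
      seen.add(value);
      Object.keys(value).forEach(key => visit(value[key], `${path}.${key}`, depth + 1));
      return;
    }
    const text = String(value);
    PII_CHECKS.forEach(({ type, pattern }) => {
      if (pattern.test(text)) findings.push({ path, type });
    });
  };

  Array.from(entries).forEach((entry, index) => visit(entry, `[${index}]`, 0));
  return findings;
}

// In a browser console, after submitting the form:
//   findPiiInDataLayer(window.dataLayer)
```

Each finding records where the value sits (for example "[3].form.email_field") rather than the value itself, so the audit output does not become another copy of the PII.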
7. Use Automated Scanning Tools
Scan for PII patterns:
Regex patterns to search for:
- Email: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
- Phone: \d{3}[-.\s]?\d{3}[-.\s]?\d{4}
- SSN: \d{3}-\d{2}-\d{4}
- Credit card: \d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}
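As one possible way to apply these patterns, the sketch below runs them over a list of exported strings (page paths, titles, search terms). The function name scanForPii is hypothetical, and the word boundaries (\b) added around the numeric patterns are an assumption meant to reduce matches inside longer digit runs such as timestamps. Expect false positives either way: order numbers can look like phone numbers.

```javascript
// The four patterns from the list above, with \b added to the numeric ones (assumption).
const PII_PATTERNS = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  creditCard: /\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b/g,
};

// rows: array of strings exported from an analytics report
function scanForPii(rows) {
  const hits = [];
  rows.forEach((row, index) => {
    Object.entries(PII_PATTERNS).forEach(([type, pattern]) => {
      const matches = String(row).match(pattern);
      if (matches) hits.push({ row: index, type, count: matches.length });
    });
  });
  return hits;
}

const report = scanForPii([
  '/search?q=john@example.com',
  '/products/blue-widget',
  'call 555-123-4567',
]);
console.log(report);
```

The output lists row indexes and pattern types only, which keeps the matched values out of the scan report.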
Tools:
- Google Analytics PII Scanner (browser extension)
- Custom BigQuery queries (for GA4 exports)
- Adobe Analytics Data Warehouse exports + regex search
General Fixes
1. Remove PII from URLs
Strip PII query parameters before analytics tags read the URL:
Client-side URL cleaning:
// Remove PII parameters from the URL before analytics reads it
(function() {
  const piiParams = ['email', 'name', 'phone', 'address', 'ssn', 'user'];
  const url = new URL(window.location.href);
  let modified = false;
  piiParams.forEach(param => {
    if (url.searchParams.has(param)) {
      url.searchParams.delete(param);
      modified = true;
    }
  });
  if (modified) {
    // Update URL without page reload
    window.history.replaceState({}, '', url);
  }
})();
Run before analytics loads:
<!-- Load PII removal BEFORE GTM/analytics -->
<script src="/js/pii-removal.js"></script>
<script async src="https://www.googletagmanager.com/gtm.js?id=GTM-XXXXX"></script>
Server-side redirect (preferred):
// Node.js/Express example
app.get('/thank-you', (req, res) => {
  // Strip PII from URL
  const cleanUrl = '/thank-you';
  if (req.query.email || req.query.name) {
    return res.redirect(302, cleanUrl);
  }
  res.render('thank-you');
});
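The redirect above drops every query parameter, including harmless ones such as campaign tags. A possible refinement, sketched here under assumptions, removes only the PII parameters and keeps the rest. The function name cleanRedirectTarget is hypothetical, and the parameter list simply reuses the one from the client-side example. Any value the page still needs (for instance, the email to display) would have to be stored server-side before redirecting.

```javascript
const PII_PARAMS = ['email', 'name', 'phone', 'address', 'ssn', 'user'];

// Returns the cleaned path + query if PII parameters were present, otherwise null.
function cleanRedirectTarget(originalUrl) {
  // The base exists only so URL can parse a relative path; it is never returned.
  const url = new URL(originalUrl, 'http://placeholder.invalid');
  let removed = false;
  PII_PARAMS.forEach(param => {
    if (url.searchParams.has(param)) {
      url.searchParams.delete(param);
      removed = true;
    }
  });
  // Only path and query are returned, so the redirect stays on the same origin
  return removed ? url.pathname + url.search : null;
}

// Usage inside an Express handler (Express itself is not imported here):
//   const target = cleanRedirectTarget(req.originalUrl);
//   if (target) return res.redirect(302, target);
```

Returning null when nothing was removed avoids a redirect loop on already-clean URLs.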
2. Configure GA4 Data Redaction
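Enable data redaction in GA4 (menu labels may differ slightly between GA4 versions):
- Admin → Data Streams → select the web stream → Redact data
- Turn on email redaction and list any URL query parameters that should be redacted
- This covers email addresses and the listed query parameters in URLs only; other PII still has to be removed at the source
Note: Only available in GA4, not Universal Analytics
Configure in gtag.js:
gtag('config', 'G-XXXXXX', {
  // 'anonymize_ip' was a Universal Analytics setting; GA4 does not log or store IP addresses
  'allow_google_signals': false // Disable Google signals (used for remarketing)
});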
3. Implement URL Filtering in GTM
Google Tag Manager variable filtering:
Create a custom variable:
// Variable Name: Clean Page URL (Custom JavaScript variable)
function() {
  // Written in ES5 style (var, function) in case the GTM editor rejects newer syntax
  var url = new URL({{Page URL}});
  var piiParams = ['email', 'name', 'phone', 'address', 'user', 'token'];
  piiParams.forEach(function(param) {
    url.searchParams.delete(param);
  });
  return url.toString();
}
Use in GA4 config tag:
- Instead of {{Page URL}}, set the page_location field to {{Clean Page URL}}
4. Hash User Identifiers
Instead of sending raw user IDs or emails:
// BAD: Sending email directly
gtag('config', 'G-XXXXXX', {
  'user_id': 'john.smith@example.com'
});
// GOOD: Hash the email first
async function hashEmail(email) {
  // Normalize so the same address always produces the same hash
  const encoder = new TextEncoder();
  const data = encoder.encode(email.toLowerCase().trim());
  const hashBuffer = await crypto.subtle.digest('SHA-256', data);
  const hashArray = Array.from(new Uint8Array(hashBuffer));
  // Convert the bytes to a 64-character hex string
  return hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
}
// Use .then() because top-level await only works inside ES modules
hashEmail('john.smith@example.com').then(hashedEmail => {
  gtag('config', 'G-XXXXXX', {
    'user_id': hashedEmail // 64-character hex digest
  });
});
Benefits:
- Can still identify unique users
- The raw email never reaches the analytics platform
- Not full anonymization: a hashed identifier is still pseudonymous personal data under GDPR
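A plain SHA-256 of an email can be recomputed by anyone who already holds a list of candidate addresses. One way to hedge against that is a keyed hash (HMAC) computed on the server, where the key can stay secret. The sketch below uses Node's built-in crypto module; the function name pseudonymizeUserId and the idea of passing the key in as an argument are assumptions for illustration, and key storage and rotation are out of scope here.

```javascript
const { createHmac } = require('node:crypto');

// Hypothetical server-side helper: keyed SHA-256 (HMAC) of a normalized identifier.
// secretKey must never be shipped to the browser.
function pseudonymizeUserId(userId, secretKey) {
  return createHmac('sha256', secretKey)
    .update(String(userId).trim().toLowerCase())
    .digest('hex');
}

// Example: compute on the server, then render the result into the page for gtag's user_id.
// 'example-secret' is a placeholder; load the real key from server configuration.
const analyticsId = pseudonymizeUserId('John.Smith@example.com', 'example-secret');
console.log(analyticsId.length); // 64 hex characters
```

The same input and key always give the same identifier, so unique-user counts still work, while someone without the key cannot rebuild the mapping from a list of emails.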
5. Use Generic Page Titles
Set consistent, PII-free page titles:
// Set title before analytics loads
document.title = 'Order Confirmation - Store Name';
// Or in React/SPA:
useEffect(() => {
  document.title = 'Order Confirmation - Store Name';
}, []);
Dynamic title strategy:
// Instead of: "Welcome, John Smith"
// Use: "Welcome Back"
const pageTitle = isLoggedIn ? 'Dashboard' : 'Login';
document.title = `${pageTitle} - Store Name`;
6. Implement Enhanced Measurement Filters
GA4 Enhanced Measurement - review form interaction tracking:
- Admin → Data Streams → Configure tag settings
- Enhanced Measurement → Settings
- Form interactions: Disable or configure exclusions
Mark sensitive forms so custom tracking can skip them:
<!-- The class has no effect on its own; custom tracking code or GTM triggers must check for it -->
<form class="pii-form">
  <input type="email" name="email">
</form>
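For custom form tracking, the exclusion class only helps if the tracking code actually checks it. The sketch below shows one way that check might look. Everything in it is hypothetical: the buildFormEvent name, the custom_form_submit event name, and the plain-object input shape (chosen so the function can run outside a browser). It sends field names only and never reads field values.

```javascript
const EXCLUDED_FORM_CLASS = 'pii-form';

// form: { id: string, classNames: string[], fieldNames: string[] }
// Returns an event payload, or null when the form is marked as sensitive.
function buildFormEvent(form) {
  if (form.classNames.includes(EXCLUDED_FORM_CLASS)) return null;
  return {
    event: 'custom_form_submit',
    form_id: form.id,
    field_names: form.fieldNames, // names only, never input values
  };
}

// In a browser, the input could be assembled from a real form element:
//   const payload = buildFormEvent({
//     id: formEl.id,
//     classNames: Array.from(formEl.classList),
//     fieldNames: Array.from(formEl.elements).map(el => el.name).filter(Boolean),
//   });
//   if (payload) window.dataLayer.push(payload);
```

Keeping the decision in one function gives a single place to review when a new form is added.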
7. Create Data Deletion Requests
If PII already exists in analytics:
GA4 Data Deletion:
- Admin → Data Settings → Data Deletion Requests
- Create new request
- Specify parameter and value to delete (e.g., email = "user@example.com")
- Select date range
- Submit request (takes several days to process)
Adobe Analytics Data Deletion:
- Privacy Service UI → Create Request
- GDPR/CCPA deletion request
- Specify identifiers
- Submit through Privacy API
- Use Google Ads API for data deletion
- Contact support for account-level issues
8. Audit and Filter Custom Dimensions
Review all custom dimensions:
// BEFORE: Sending raw user data
gtag('event', 'login', {
  'user_email': user.email, // PII!
  'user_name': user.name // PII!
});
// AFTER: Sending hashed or generic data
gtag('event', 'login', {
  'user_id_hash': hashUserId(user.id), // hashUserId: your own hashing helper
  'user_type': user.subscription_tier // Not PII
});
Implement safeguards:
function trackEvent(eventName, params) {
  // Validate no PII in params
  const piiPatterns = [
    /[^\s@]+@[^\s@]+\.[^\s@]+/, // Email pattern
    /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/, // Phone pattern
  ];
  const hasPii = Object.values(params).some(value =>
    piiPatterns.some(pattern => pattern.test(String(value)))
  );
  if (hasPii) {
    // Log the event name only, so the PII itself stays out of logs
    console.error('PII detected in tracking params; event not sent:', eventName);
    return false;
  }
  gtag('event', eventName, params);
  return true;
}
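Dropping the whole event is the strictest response. The other option mentioned in the safeguard, removing only the offending parameter, might look like the sketch below. The name removePiiParams and the two patterns are assumptions, and the check only inspects top-level values: nested objects such as an items array are stringified and would need their own pass.

```javascript
const PII_VALUE_PATTERNS = [
  /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/, // email-like
  /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/, // US-style phone
];

// Returns a copy of params without PII-like values, plus the names of removed keys.
function removePiiParams(params) {
  const clean = {};
  const removed = [];
  Object.entries(params).forEach(([key, value]) => {
    const looksLikePii = PII_VALUE_PATTERNS.some(pattern => pattern.test(String(value)));
    if (looksLikePii) {
      removed.push(key);
    } else {
      clean[key] = value;
    }
  });
  return { clean, removed };
}

const result = removePiiParams({ user_email: 'a@b.com', plan: 'pro', value: 99.99 });
console.log(result.removed); // keys that were stripped before sending
// gtag('event', 'login', result.clean);
```

Logging the removed key names (not the values) gives developers a signal about which call sites need fixing.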
9. Implement Content Security Policy for Analytics
Restrict and monitor where the page can send data. Deliver the policy as an HTTP response header, because report-uri is ignored when the policy is set through a <meta> tag:
Content-Security-Policy: default-src 'self'; script-src 'self' https://www.googletagmanager.com; connect-src 'self' https://www.google-analytics.com; report-uri /csp-report
Monitor CSP reports to detect:
- Unauthorized tracking scripts
- Data being sent to unexpected domains
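Reports sent to a report-uri endpoint arrive as JSON with a top-level "csp-report" object. A minimal sketch of triaging them is shown below; the function name unexpectedHostFromReport and the allowlist are assumptions that mirror the example policy, and a real endpoint would also need request parsing and rate limiting.

```javascript
// Hosts the example policy is expected to talk to (assumption for this sketch).
const ALLOWED_HOSTS = ['www.googletagmanager.com', 'www.google-analytics.com'];

// reportBody: parsed JSON body of a CSP violation report.
// Returns the blocked host if it is not on the allowlist, otherwise null.
function unexpectedHostFromReport(reportBody, allowedHosts = ALLOWED_HOSTS) {
  const report = reportBody && reportBody['csp-report'];
  if (!report || typeof report['blocked-uri'] !== 'string') return null;
  let host;
  try {
    host = new URL(report['blocked-uri']).hostname;
  } catch (err) {
    // blocked-uri can be a keyword such as "inline" or "eval" instead of a URL
    return null;
  }
  return allowedHosts.includes(host) ? null : host;
}

const sampleReport = {
  'csp-report': {
    'document-uri': 'https://shop.example.com/checkout',
    'violated-directive': 'connect-src',
    'blocked-uri': 'https://tracker.example.net/collect',
  },
};
console.log(unexpectedHostFromReport(sampleReport));
```

A host that shows up repeatedly is a prompt to find out which script is trying to send data there.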
10. Train Team on PII Prevention
Create internal guidelines:
DO:
- Hash user identifiers
- Use generic page titles
- Strip URL parameters
- Validate before sending to analytics
DON'T:
- Send emails, names, phone numbers
- Track form field values
- Include PII in product names
- Use PII in custom dimensions
Review checklist for new features:
- URLs checked for PII parameters
- Page titles are generic
- Custom dimensions validated
- Form tracking excludes PII fields
- Ecommerce data sanitized
Platform-Specific Guides
| Platform | Guide |
|---|---|
| Shopify | Shopify privacy and PII handling |
| WordPress | WordPress GDPR compliance |
| GA4 | GA4 data redaction settings |
| Adobe Analytics | Adobe privacy controls |
Further Reading
- GDPR Compliance - European privacy regulation
- CCPA Compliance - California privacy law
- HIPAA Compliance - Healthcare data protection
- GA4 PII Policy - Google's PII guidelines
- Privacy Issues - Related privacy topics
- Data Deletion in GA4 - Remove PII from GA4