Pseudonymization vs Anonymization: Reversible and Irreversible Data Protection under GDPR

Pseudonymization and anonymization are two different approaches to protecting personal data. The distinction matters for GDPR compliance and for building analytics that balance privacy and utility.

Pseudonymization: Reversible Protection

Pseudonymization replaces identifying information with pseudonyms or artificial identifiers, so that records can be linked back to a person only by using additional information that is stored separately.

GDPR Definition

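According to Article 4(5) GDPR, pseudonymization means "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person."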

Key Characteristics

Reversible. Original data can be restored when the key or mapping table is available.

Still personal data. Pseudonymized data remains personal data under GDPR as long as it can be attributed to a person by using additional information (Recital 26).

Technical pattern:

graph TD
    A[Original Data] --> B[Generate Pseudonym]
    B --> C[Create Mapping Table]
    C --> D[Secure Key Storage]
    B --> E[Pseudonymized Data]
    D --> F[Recovery Capability]
    E --> F

Methods

Tokenization. Replace sensitive data with random tokens and keep the token-to-value mapping in a protected database.

  • Tokens have no mathematical link to the original
  • High security, provided the mapping database is protected
  • Full control over re-identification
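
As a rough illustration of the token approach described above, the following sketch mints random tokens and keeps the mapping in memory. The `TokenVault` class, the `USR_` prefix, and the sample address are assumptions made for this example; a real vault would be a separate, access-controlled database rather than a pair of `Map` objects.

```javascript
const crypto = require('node:crypto');

// Hypothetical in-memory vault, for illustration only.
class TokenVault {
  constructor() {
    this.tokenToValue = new Map();
    this.valueToToken = new Map();
  }

  // Return the existing token for a value, or mint a new random one.
  tokenize(value) {
    const existing = this.valueToToken.get(value);
    if (existing) return existing;
    // The token is random, so it has no mathematical link to the value.
    const token = 'USR_' + crypto.randomBytes(8).toString('hex');
    this.tokenToValue.set(token, value);
    this.valueToToken.set(value, token);
    return token;
  }

  // Re-identification is only possible through the vault.
  detokenize(token) {
    return this.tokenToValue.get(token);
  }
}

const vault = new TokenVault();
const token = vault.tokenize('user@example.com');
```

Because the token is random, anyone holding only the tokenized data has nothing to attack mathematically; the protection then rests on who can reach the vault.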

Keyed cryptographic functions. Apply a keyed function, such as an HMAC, to the identifier.

  • Deterministic for identical inputs
  • Values can be compared without re-identification
  • Strength depends on keeping the key secret
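
One possible keyed function is HMAC-SHA256 from Node's built-in `crypto` module, sketched below. The `pseudonymize` helper and the randomly generated key are assumptions for this example; in practice the key would be loaded from a secrets store and kept apart from the data.

```javascript
const crypto = require('node:crypto');

// Assumption: a real deployment loads this key from a secrets manager.
const secretKey = crypto.randomBytes(32);

// Keyed hash: the same input and key always yield the same pseudonym.
function pseudonymize(value, key) {
  return crypto.createHmac('sha256', key).update(value).digest('hex');
}

const a = pseudonymize('192.168.1.100', secretKey);
const b = pseudonymize('192.168.1.100', secretKey);
const c = pseudonymize('192.168.1.101', secretKey);
// a and b are equal, so records can be joined on the pseudonym;
// c differs because the input differs.
```

An HMAC cannot be inverted directly. Linking a pseudonym back to its input means either keeping a mapping table or recomputing the HMAC over candidate values with the key, which is why the key needs the same protection as a mapping table.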

Masking. Partially conceal identifiers while keeping their structure.

  • Format preserved
  • Reversible only if the concealed part is kept separately or a format-preserving cipher is used
  • Useful for test environments
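
A minimal sketch of structure-preserving masking might look like the following. The `maskEmail` and `maskDigits` helpers and their rules (keep the first character of the local part, keep the last four digits) are assumptions chosen for illustration. Note that this simple form discards the hidden characters, so undoing it would require keeping the original value somewhere else.

```javascript
// Keep the first character of the local part and the full domain.
function maskEmail(email) {
  const at = email.lastIndexOf('@');
  // Not email-shaped: hide everything rather than leak the raw value.
  if (at < 1) return '*'.repeat(email.length);
  const local = email.slice(0, at);
  const domain = email.slice(at);
  return local[0] + '*'.repeat(local.length - 1) + domain;
}

// Keep only the last `visible` digits, leaving separators in place.
function maskDigits(value, visible = 4) {
  const totalDigits = (value.match(/\d/g) || []).length;
  let seen = 0;
  return value.replace(/\d/g, (d) => (++seen <= totalDigits - visible ? '*' : d));
}

console.log(maskEmail('jane.doe@example.com')); // j*******@example.com
console.log(maskDigits('555-123-4567'));        // ***-***-4567
```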

Pseudonymization Example in Analytics

Original User Data:

Email: user@example.com
IP: 192.168.1.100
Device ID: ABC123XYZ

After Pseudonymization:

User Token: USR_789456123
IP Hash: 4f3d2a1b9c8e7f6a
Device Hash: DEV_445566778

Mapping Table (stored separately):

USR_789456123 → user@example.com
4f3d2a1b9c8e7f6a → 192.168.1.100
DEV_445566778 → ABC123XYZ

Anonymization: Irreversible Protection

Anonymization is an irreversible process that prevents direct or indirect identification.

GDPR Principles

Irreversibility. Recital 26 GDPR describes anonymous information as "information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable."

Out of GDPR scope. Properly anonymized data sits outside GDPR.

Techniques

Aggregation. Combine records into groups and report only group-level statistics.

Pros:

  • Strong individual protection when groups are large enough
  • Preserves aggregate statistics
  • Suitable for reporting

Cons:

  • Loses detail
  • Limited analysis options
  • No individual tracking
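
To make the aggregation trade-off concrete, here is a small sketch that reduces event-level data to per-page counts. The `aggregateByPage` function and the event shape (`page`, `user`) are assumptions for this example, not part of any particular analytics API.

```javascript
// Count views and distinct visitors per page from event-level data.
function aggregateByPage(events) {
  const groups = new Map();
  for (const { page, user } of events) {
    if (!groups.has(page)) groups.set(page, { views: 0, visitors: new Set() });
    const group = groups.get(page);
    group.views += 1;
    group.visitors.add(user);
  }
  // Only counts leave this function; the per-user sets are dropped.
  return [...groups].map(([page, g]) => ({
    page,
    views: g.views,
    uniqueVisitors: g.visitors.size,
  }));
}

const events = [
  { page: '/home', user: 'u1' },
  { page: '/home', user: 'u2' },
  { page: '/home', user: 'u1' },
  { page: '/products', user: 'u3' },
];
const report = aggregateByPage(events);
```

In this sample the `/products` row describes a single visitor, which shows why very small groups can still point to an individual even after aggregation.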

Generalization. Replace specific values with broader categories.

  • Age ranges instead of exact age
  • Regional grouping instead of precise addresses
  • Time intervals instead of exact timestamps
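
The three generalizations listed above could be sketched as follows. The helper names, the 10-year band width, the hourly bucket, and the use of a postal-code prefix as a stand-in for regional grouping are all assumptions for illustration.

```javascript
// Exact age -> band of `width` years, e.g. 34 -> "30-39".
function ageBand(age, width = 10) {
  const low = Math.floor(age / width) * width;
  return `${low}-${low + width - 1}`;
}

// Postal code -> coarse region: keep a prefix, hide the rest.
function postalRegion(postalCode, keep = 2) {
  const hidden = Math.max(postalCode.length - keep, 0);
  return postalCode.slice(0, keep) + '*'.repeat(hidden);
}

// Exact timestamp (ms) -> start of the containing UTC hour, as ISO text.
function hourBucket(timestampMs) {
  const hourMs = 60 * 60 * 1000;
  return new Date(Math.floor(timestampMs / hourMs) * hourMs).toISOString();
}

console.log(ageBand(34));           // 30-39
console.log(postalRegion('10115')); // 10***
console.log(hourBucket(Date.UTC(2024, 0, 15, 13, 47, 12))); // 2024-01-15T13:00:00.000Z
```

Wider bands protect better but cost more analytical detail, so the band sizes are a judgment call per dataset.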

Suppression. Remove identifying fields or records entirely.

  • Drop direct identifiers
  • Exclude unique records
  • Filter rare values
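
A hedged sketch of these removal steps: drop direct identifier fields, then drop records whose value in a chosen field is too rare. The `suppress` function, its option names, and the `minCount` threshold are hypothetical and would need tuning to the dataset.

```javascript
// Remove direct identifiers and drop records with rare values.
function suppress(records, { dropFields, rareField, minCount }) {
  // Count how often each value of rareField occurs.
  const counts = new Map();
  for (const r of records) {
    counts.set(r[rareField], (counts.get(r[rareField]) || 0) + 1);
  }
  return records
    .filter((r) => counts.get(r[rareField]) >= minCount)
    .map((r) => {
      const copy = { ...r };
      for (const field of dropFields) delete copy[field];
      return copy;
    });
}

const records = [
  { email: 'a@example.com', browser: 'Firefox', country: 'DE' },
  { email: 'b@example.com', browser: 'Firefox', country: 'DE' },
  { email: 'c@example.com', browser: 'Lynx', country: 'DE' },
];
const cleaned = suppress(records, {
  dropFields: ['email'],
  rareField: 'browser',
  minCount: 2,
});
// The single Lynx record is excluded and no email field remains.
```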

Pseudo-anonymization Risks

Many techniques described as anonymization are in practice pseudonymization, because the data can still be re-identified by combining it with additional information or by applying modern analysis methods.
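
One common way to probe this risk, not named in the article itself, is a k-anonymity style check: find the smallest group of records that share the same combination of indirectly identifying fields. The sketch below assumes a hypothetical `smallestGroupSize` helper and made-up field names.

```javascript
// Smallest group size over the chosen quasi-identifiers (the "k").
function smallestGroupSize(records, quasiIdentifiers) {
  const counts = new Map();
  for (const r of records) {
    // Records with identical quasi-identifier values share a key.
    const key = JSON.stringify(quasiIdentifiers.map((q) => r[q]));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts.size === 0 ? 0 : Math.min(...counts.values());
}

const rows = [
  { ageBand: '30-39', region: 'Berlin', browser: 'Firefox' },
  { ageBand: '30-39', region: 'Berlin', browser: 'Chrome' },
  { ageBand: '40-49', region: 'Munich', browser: 'Chrome' },
];
const k = smallestGroupSize(rows, ['ageBand', 'region']);
// k is 1 here: the Munich record is unique on (ageBand, region).
```

A result of 1 means at least one record is unique on those fields and could be singled out by someone who knows them, even though no name or email is present. A higher k lowers that risk but does not by itself prove the data is anonymous.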

Comparison

| Criteria | Pseudonymization | Anonymization |
|---|---|---|
| Reversibility | Reversible | Irreversible |
| GDPR status | Personal data | Non-personal data |
| Re-identification risk | Low with proper implementation | Negligible when done properly, but needs regular reassessment |
| Data utility | High analytical value | Limited detail |
| Security focus | Key and mapping protection | Verifying irreversibility |

Application in Analytics

When to Pseudonymize

User sessions. Track behavior across sessions with the option to link data when needed.

A/B testing. Keep each user in a stable experiment group while still being able to analyze results per user.

Personalization. Serve personalized content without revealing the user's real identity to the analytics layer.

When to Anonymize

Public reporting. Aggregated reports for publication without disclosure risk.

Research. Provide data for scientific work or for developing new algorithms.

Long-term storage. Historical archives for trend analysis.

Practical Implementation

Pseudonymization for User Journeys:

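// generatePseudonym and trackEvent stand in for your own helpers.
// The user gets a stable pseudonym derived from the real ID.
const userPseudonym = generatePseudonym(realUserId);

// Events are linked to the pseudonym, never to the real ID.
trackEvent('page_view', {
    user: userPseudonym,
    page: '/products',
    timestamp: Date.now()
});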

Anonymization for Aggregated Reporting:

// Only aggregate figures are stored; event-level records are not kept.
const aggregatedStats = {
    timeRange: 'daily',
    totalViews: 15420,
    uniqueVisitors: 8756,
    topPages: ['/home', '/products', '/about'],
    // Individual users cannot be recovered from these totals.
};

GDPR Requirements for Pseudonymization

Technical:

  • Store re-identification keys separately from the pseudonymized data
  • Restrict access to mapping tables
  • Encrypt additional information

Organizational:

  • Separate roles and responsibilities
  • Audit every access to re-identification
  • Maintain key management policies
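
The access restriction and audit requirements above might be combined in a single lookup function, as in this sketch. The role name, the in-memory mapping table, and the audit log array are hypothetical placeholders; a real system would use its own access control and a tamper-resistant log.

```javascript
// Hypothetical roles and in-memory stores, for illustration only.
const ALLOWED_ROLES = new Set(['privacy-officer']);
const mappingTable = new Map([['USR_789456123', 'user@example.com']]);
const auditLog = [];

// Every lookup attempt is logged, whether or not it is allowed.
function reidentify(token, actor) {
  const allowed = ALLOWED_ROLES.has(actor.role);
  auditLog.push({
    at: new Date().toISOString(),
    actor: actor.id,
    token,
    allowed,
  });
  if (!allowed) {
    throw new Error('re-identification not permitted for this role');
  }
  return mappingTable.get(token);
}

const officer = { id: 'emp-1', role: 'privacy-officer' };
const analyst = { id: 'emp-2', role: 'analyst' };
const email = reidentify('USR_789456123', officer);
```

Logging before the permission check means denied attempts are recorded too, which is usually what an audit needs.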

Recommendations

Pseudonymize when:

  • User tracking over time is needed
  • Personalization or targeting is required
  • Longitudinal studies are planned

Anonymize when:

  • Data goes public
  • Strict confidentiality is required
  • Individual identification is unnecessary

Best practices for pseudonymization:

  • Use cryptographically strong algorithms
  • Secure key storage
  • Rotate pseudonyms
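
One way to rotate pseudonyms, sketched here under assumptions, is to mix a time period into the keyed hash so the pseudonym changes when the period does. The `rotatingPseudonym` helper, the monthly period, and the generated key are illustrative choices only.

```javascript
const crypto = require('node:crypto');

// The UTC calendar month is mixed into the HMAC input, so the
// pseudonym for a given user changes every month.
function rotatingPseudonym(userId, key, date = new Date()) {
  const period = date.toISOString().slice(0, 7); // e.g. "2024-01"
  return crypto
    .createHmac('sha256', key)
    .update(`${period}:${userId}`)
    .digest('hex');
}

// Assumption: a real deployment loads this key from a secrets manager.
const key = crypto.randomBytes(32);
const early = rotatingPseudonym('user-42', key, new Date(Date.UTC(2024, 0, 5)));
const late = rotatingPseudonym('user-42', key, new Date(Date.UTC(2024, 0, 28)));
const next = rotatingPseudonym('user-42', key, new Date(Date.UTC(2024, 1, 1)));
// early and late match (same month); next differs (new month).
```

Rotation limits how long one pseudonym can accumulate behavior, at the cost of breaking links across periods, so the period length has to be weighed against any need for long-term tracking.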

Best practices for anonymization:

  • Combine multiple techniques
  • Run regular re-identification risk assessments
  • Document procedures and decisions

We have studied the effectiveness of different protection approaches. The choice between pseudonymization and anonymization depends on business needs, analytics requirements, and acceptable risk.

graph TD
    A[Personal Data] --> B{Data Linking Required?}
    B -->|Yes| C[Pseudonymization]
    B -->|No| D[Anonymization]
    C --> E[Reversible Protection]
    D --> F[Irreversible Protection]
    E --> G[Remains Personal Data]
    F --> H[Not Personal Data]

Effective protection requires planning and clear understanding of the available methods. The right choice keeps you compliant and keeps analytics valuable.

