Text Anonymization

The Text – Anonymize node anonymizes sensitive information in input text using pre-trained named entity recognition (NER) models. It replaces detected entities (such as names or organizations) with a specified masking character.

Input:

  • Text – Raw input for anonymization
  • Classifications – Optional filters

Output:

  • Text – Anonymized version

How It Works

  • Scans text for sensitive entities (names, organizations, etc.) using the selected NER model
  • Replaces detected entities with your chosen masking character
  • Preserves overall text structure while protecting privacy

Configuration:

  • Model Selection – Choose an NER model based on language and domain
  • Masking Character – Character used to replace entities, e.g., █ or *

Example:

Input: John Smith is a patient at St. Mary’s Hospital.

Output: ████ █████ is a patient at ██████████████████.
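
The masking step in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the node's actual implementation: the function names `find_spans` and `mask_spans` are hypothetical, and a hand-supplied list of entity strings stands in for the output of an NER model. The sketch masks every character inside a detected span, including spaces; the node's exact handling of whitespace inside an entity may differ.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def find_spans(text: str, entity_texts: List[str]) -> List[Span]:
    """Hypothetical stand-in for an NER model: locate known entity strings."""
    spans = []
    for entity in entity_texts:
        if not entity:
            continue  # an empty string would match everywhere
        start = text.find(entity)
        while start != -1:
            spans.append((start, start + len(entity)))
            start = text.find(entity, start + len(entity))
    return sorted(spans)


def mask_spans(text: str, spans: List[Span], mask_char: str = "█") -> str:
    """Replace every character inside each span with mask_char."""
    chars = list(text)
    for start, end in spans:
        for i in range(start, end):
            chars[i] = mask_char
    return "".join(chars)


text = "John Smith is a patient at St. Mary's Hospital."
# Assumed entity list; a real NER model would produce these spans itself.
spans = find_spans(text, ["John", "Smith", "St. Mary's Hospital"])
masked = mask_spans(text, spans)
print(masked)
```

Because each character is replaced one for one, the masked text has the same length as the input, so the surrounding words and punctuation stay in place. Passing a different `mask_char` (for example `"*"`) changes only the replacement character.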

 

Best Practices:

  • Select an appropriate model for your content language and domain
  • Use with preprocessor modules for optimal text handling
  • Integrate into classification pipelines for comprehensive processing