GenogramAI
Security & Compliance

HIPAA De-identification Policy

How GenogramAI protects patient health information when using AI features

Last updated: March 9, 2026 | Version: 1.0

1. Overview

GenogramAI uses AI (Google Gemini) to help users build and analyze family genograms. Before any data is sent to the AI, we apply HIPAA Safe Harbor de-identification to strip all 18 categories of protected health information (PHI). This ensures no individually identifiable health information ever leaves your device for AI processing.

This policy documents exactly what data is sent, what is redacted, and how we maintain an auditable trail of compliance.

2. De-identification Method: HIPAA Safe Harbor

We follow the Safe Harbor method defined in 45 CFR 164.514(b)(2), which requires the removal of 18 specific identifiers. Our implementation addresses every category:

Safe Harbor Identifier          | Our Approach
--------------------------------|--------------------------------------------------------------
Names                           | Replaced with Person_1, Person_2, etc.
Geographic data (below state)   | City, address, birthPlace, deathPlace stripped; only state/country sent
Dates (except year)             | Birth/death month and day stripped; only year sent
Phone numbers                   | Not collected
Fax numbers                     | Not collected
Email addresses                 | Never sent to AI
SSN                             | Not collected
Medical record numbers          | Not collected
Health plan beneficiary numbers | Not collected
Account numbers                 | Not collected
Certificate/license numbers     | Not collected
Vehicle identifiers             | Not collected
Device identifiers              | Not collected
Web URLs                        | Not sent to AI
IP addresses                    | Not sent to AI
Biometric identifiers           | Not collected
Full-face photos                | Not sent to AI (image processing extracts only shapes/lines)
Any unique identifying number   | Internal IDs replaced with short tokens (n1, n2, etc.)

3. What Data Is Sent to AI

Only de-identified attributes are transmitted to the AI model:

Sent to AI (Safe Fields)

  • Gender
  • Birth year (year only)
  • Death year (year only)
  • Living/deceased status
  • Index person flag
  • Occupation
  • Education level
  • Religion
  • Social class
  • Country code
  • State/province
  • Sexual orientation
  • Heritage label
  • Twin type
  • Pet species
  • Relationship type
  • Emotional connection type
  • Child connection type

Never Sent (Redacted)

  • First name
  • Last name
  • Maiden name
  • Middle name
  • Nickname
  • Alternative/changed name
  • Birth month & day
  • Birth place
  • Death month & day
  • Death place
  • Cause of death
  • City
  • Street address/location
  • Burial place
  • Notes in their original form (sent only after names are replaced with Person_N tokens)
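
One way to enforce the split between the two lists above is an allowlist: copy only the fields named as safe and drop everything else by default. The sketch below is illustrative only. The field names (birthYear, isIndexPerson, and so on) and the pickSafeFields helper are assumptions, since this policy does not show the actual GenogramAI schema.

```typescript
// Hypothetical field names; the real GenogramAI schema is not shown in this policy.
const SAFE_FIELDS = [
  "gender", "birthYear", "deathYear", "isDeceased", "isIndexPerson",
  "occupation", "educationLevel", "religion", "socialClass",
  "countryCode", "state", "sexualOrientation", "heritage", "twinType",
] as const;

type SafeField = (typeof SAFE_FIELDS)[number];
type PersonRecord = Record<string, unknown>;

// Copy only allowlisted keys; any field not listed is dropped by default.
function pickSafeFields(person: PersonRecord): Partial<Record<SafeField, unknown>> {
  const out: Partial<Record<SafeField, unknown>> = {};
  for (const field of SAFE_FIELDS) {
    if (person[field] !== undefined) {
      out[field] = person[field];
    }
  }
  return out;
}

// firstName, city, and causeOfDeath are not on the allowlist, so they are omitted.
const example = pickSafeFields({
  firstName: "Maria",
  gender: "female",
  birthYear: 1950,
  city: "Springfield",
  causeOfDeath: "stroke",
});
console.log(example);
```

An allowlist fails closed: a field added to the data model later is withheld until someone deliberately marks it as safe.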

4. Technical Implementation

4.1 Name Anonymization

Every person in the genogram is assigned a sequential anonymous identifier (Person_1, Person_2, etc.) before any data is transmitted. A mapping table is maintained only in the client's browser memory and is never persisted or transmitted. After the AI responds, anonymous identifiers are converted back to real names for display.
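
A minimal sketch of this round trip, assuming each person has an internal id and a display name. The Person and NameMapping types and both function names are hypothetical and are not taken from the GenogramAI codebase.

```typescript
// Hypothetical shape of a person record held in browser memory.
interface Person {
  id: string;
  fullName: string;
}

interface NameMapping {
  tokenById: Map<string, string>;   // internal id -> Person_N
  nameByToken: Map<string, string>; // Person_N -> real name (never transmitted)
}

// Assign Person_1, Person_2, ... in input order.
function buildNameMapping(people: Person[]): NameMapping {
  const tokenById = new Map<string, string>();
  const nameByToken = new Map<string, string>();
  people.forEach((person, index) => {
    const token = `Person_${index + 1}`;
    tokenById.set(person.id, token);
    nameByToken.set(token, person.fullName);
  });
  return { tokenById, nameByToken };
}

// Replace Person_N tokens in the model's response with real names for display.
// Tokens that are not in the mapping are left unchanged.
function reidentify(text: string, mapping: NameMapping): string {
  return text.replace(/Person_\d+/g, (token) => mapping.nameByToken.get(token) ?? token);
}
```

Keying the mapping by internal id rather than by name means two people who share a name still receive distinct tokens.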

4.2 Date Truncation

All dates are truncated to year-only precision before transmission. Birth month, birth day, death month, and death day are stripped entirely, and only birth year and death year are sent. Safe Harbor permits the year element of a date; the one exception in the rule is that ages over 89, and any date elements (including year) that indicate such an age, must be aggregated into a single category of age 90 or older.
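
As a sketch of the truncation step, assume dates are stored as ISO-style strings that begin with a four-digit year (for example "1950-07-04"). The storage format and the truncateToYear name are assumptions; the policy does not specify either.

```typescript
// Returns the leading four-digit year, or null when none is present.
// Assumes ISO-style input such as "1950-07-04" or "1950".
function truncateToYear(date: string | null | undefined): number | null {
  if (!date) {
    return null;
  }
  const match = /^(\d{4})/.exec(date.trim());
  return match ? Number(match[1]) : null;
}

console.log(truncateToYear("1950-07-04")); // 1950
console.log(truncateToYear(""));           // null
```

Returning null for unrecognized input means a date in an unexpected format is withheld rather than passed through unmodified.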

4.3 Geographic Generalization

Geographic data is limited to state/province and country level. City names, street addresses, ZIP codes, birth places, death places, and burial places are never transmitted. Only the country code (ISO 3166-1) and state/province are sent when available.
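
The generalization can be sketched as a function that builds a new object from the two permitted fields instead of deleting the disallowed ones. The Location shape and the generalizeLocation name are hypothetical, and the country code is assumed to be a two-letter ISO 3166-1 alpha-2 value.

```typescript
// Hypothetical location record as entered by the user.
interface Location {
  street?: string;
  city?: string;
  zip?: string;
  state?: string;
  countryCode?: string; // assumed ISO 3166-1 alpha-2, e.g. "US"
}

interface GeneralizedLocation {
  state?: string;
  countryCode?: string;
}

// Build the output from permitted fields only, so street, city, and ZIP
// can never be carried over by accident.
function generalizeLocation(loc: Location): GeneralizedLocation {
  const out: GeneralizedLocation = {};
  if (loc.state) {
    out.state = loc.state;
  }
  if (loc.countryCode) {
    out.countryCode = loc.countryCode.toUpperCase();
  }
  return out;
}
```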

4.4 Free-Text Anonymization

User-entered notes and free-text prompts are processed through a token-replacement system that substitutes any real names found in the text with their corresponding Person_N identifiers before transmission. This prevents accidental PHI disclosure through narrative text.
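
A simplified sketch of the token replacement, assuming a lookup from each known name (and any short forms of it) to its Person_N token. The anonymizeText and escapeRegExp helpers are hypothetical. This approach only catches names that are already in the lookup; a misspelled name or a person who is not in the genogram would pass through unchanged.

```typescript
// Escape regex metacharacters so a name is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every known name in free text with its Person_N token.
function anonymizeText(text: string, tokenByName: Map<string, string>): string {
  // Longest names first, so "Ana Ortiz" is replaced before the shorter "Ana".
  const names = Array.from(tokenByName.keys()).sort((a, b) => b.length - a.length);
  let result = text;
  for (const name of names) {
    const token = tokenByName.get(name);
    if (token === undefined) {
      continue;
    }
    // Word boundaries avoid matching "Ana" inside "Banana"; "i" ignores case.
    const pattern = new RegExp(`\\b${escapeRegExp(name)}\\b`, "gi");
    result = result.replace(pattern, () => token);
  }
  return result;
}
```

Word boundaries behave best for names that start and end with letters or digits; names ending in punctuation would need a different boundary rule.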

4.5 Internal ID Obfuscation

Internal database identifiers (UUIDs) are replaced with short sequential tokens (n1, n2, n3, etc.) before transmission. This prevents any possibility of cross-referencing records via ID values.
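
The ID replacement has to cover relationship records as well as people, since edges refer to people by ID. The sketch below assumes a simple Edge shape with source and target fields; that shape and the shortenIds name are illustrative, not the actual data model.

```typescript
// Hypothetical relationship record that refers to people by internal ID.
interface Edge {
  source: string;
  target: string;
  type: string;
}

// Map each internal ID to n1, n2, ... and rewrite edge endpoints to match.
function shortenIds(
  nodeIds: string[],
  edges: Edge[],
): { idMap: Map<string, string>; edges: Edge[] } {
  const idMap = new Map<string, string>();
  nodeIds.forEach((id, index) => idMap.set(id, `n${index + 1}`));

  const remap = (id: string): string => {
    const token = idMap.get(id);
    if (token === undefined) {
      // The raw ID is deliberately kept out of the error message.
      throw new Error("Edge references an unknown node");
    }
    return token;
  };

  const remappedEdges = edges.map((edge) => ({
    ...edge,
    source: remap(edge.source),
    target: remap(edge.target),
  }));
  return { idMap, edges: remappedEdges };
}
```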

5. Audit Trail

Every AI API call generates an immutable audit log entry that records:

  • What function was called (e.g., streamEditGenogramWithChat, streamGenogramInsights)
  • What fields were sent (the safe field categories)
  • What fields were redacted (the PHI field categories)
  • De-identification confirmations: names anonymized, dates truncated, geography generalized
  • Request metadata: prompt character count (not content), response time, success status
  • Timestamp and user context

The audit log never stores the actual prompt content, AI responses, or any PHI. It only records metadata proving that de-identification was applied before each AI call.

Audit Log Schema

ai_deid_audit_logs
├── id (auto-increment)
├── created_at (timestamp)
├── user_id (UUID, nullable)
├── ai_function (text) — which AI function was called
├── model (text) — e.g., "gemini-2.5-flash"
├── ai_version (text) — e.g., "8.0"
├── node_count (int) — number of people in genogram
├── edge_count (int) — number of relationships
├── fields_sent (text[]) — safe field categories sent
├── fields_redacted (text[]) — PHI field categories redacted
├── names_anonymized (boolean) — always true
├── dates_truncated (boolean) — always true
├── geo_generalized (boolean) — always true
├── prompt_length (int) — character count only
├── response_time_ms (int)
├── was_successful (boolean)
└── user_agent (text)
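
To illustrate how an entry can record metadata without retaining content, the sketch below mirrors the schema columns as a TypeScript type and builds an entry from a request. The AuditInput shape and the buildAuditEntry name are assumptions, and id and created_at are assumed to be filled in by the database.

```typescript
// Mirrors the ai_deid_audit_logs columns; id and created_at are assumed
// to be set by the database on insert.
interface AuditLogEntry {
  user_id: string | null;
  ai_function: string;
  model: string;
  ai_version: string;
  node_count: number;
  edge_count: number;
  fields_sent: string[];
  fields_redacted: string[];
  names_anonymized: true;
  dates_truncated: true;
  geo_generalized: true;
  prompt_length: number;
  response_time_ms: number;
  was_successful: boolean;
  user_agent: string;
}

// Hypothetical input. The prompt is accepted only so its length can be measured.
interface AuditInput {
  userId: string | null;
  aiFunction: string;
  model: string;
  aiVersion: string;
  nodeCount: number;
  edgeCount: number;
  fieldsSent: string[];
  fieldsRedacted: string[];
  prompt: string;
  responseTimeMs: number;
  wasSuccessful: boolean;
  userAgent: string;
}

function buildAuditEntry(input: AuditInput): AuditLogEntry {
  return {
    user_id: input.userId,
    ai_function: input.aiFunction,
    model: input.model,
    ai_version: input.aiVersion,
    node_count: input.nodeCount,
    edge_count: input.edgeCount,
    fields_sent: input.fieldsSent,
    fields_redacted: input.fieldsRedacted,
    names_anonymized: true,
    dates_truncated: true,
    geo_generalized: true,
    prompt_length: input.prompt.length, // character count only; the text is not kept
    response_time_ms: input.responseTimeMs,
    was_successful: input.wasSuccessful,
    user_agent: input.userAgent,
  };
}
```

Typing the three confirmation flags as the literal true matches the "always true" note in the schema: an entry claiming otherwise would not compile.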

6. Clinical Mode (Unlimited Plan)

For users on the Unlimited plan, GenogramAI offers a Clinical Mode that provides an additional layer of protection:

  • All genogram data is encrypted locally using AES-256 encryption
  • Data is stored only on the user's device — never synced to cloud storage
  • AI features still apply the same de-identification before any AI calls
  • This provides defense-in-depth: even if de-identification had a gap, the data at rest is encrypted

7. Data Flow Summary

Step 1: User enters data or sends a request
Names, dates, locations, and notes are stored locally in the browser.

Step 2: De-identification applied
Names → Person_N tokens. Dates → year only. Geography → state/country only. IDs → short tokens. Notes → anonymized.

Step 3: Audit log recorded
An immutable entry is logged with proof of de-identification (field lists, boolean confirmations, metadata only).

Step 4: De-identified data sent to AI
Only safe fields (gender, birth year, occupation, etc.) are transmitted to Google Gemini.

Step 5: AI response re-identified locally
Person_N tokens in the response are mapped back to real names in the browser only.
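
The ordering of the last four steps can be sketched as a single function. Everything here is illustrative: runRequest, ModelCall, and AuditSink are hypothetical names, the model call is a synchronous stub rather than a real Gemini API call, and only name replacement is shown.

```typescript
// Stub types: a real model call would be asynchronous and go over the network.
type ModelCall = (deidentifiedPrompt: string) => string;
type AuditSink = (entry: { prompt_length: number; names_anonymized: true }) => void;

function runRequest(
  prompt: string,
  tokenByName: Map<string, string>,
  callModel: ModelCall,
  writeAudit: AuditSink,
): string {
  // De-identify: plain substring replacement, kept simple for brevity.
  let safePrompt = prompt;
  tokenByName.forEach((token, name) => {
    safePrompt = safePrompt.split(name).join(token);
  });

  // Audit: metadata only, written before anything is transmitted.
  writeAudit({ prompt_length: safePrompt.length, names_anonymized: true });

  // Send: only the de-identified prompt reaches the model.
  const response = callModel(safePrompt);

  // Re-identify locally. The lookahead keeps Person_1 from matching inside Person_10.
  let display = response;
  tokenByName.forEach((token, name) => {
    display = display.replace(new RegExp(`${token}(?!\\d)`, "g"), () => name);
  });
  return display;
}
```

Passing the model call and the audit sink in as parameters keeps the ordering testable without a network connection.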

8. Important Disclaimer

GenogramAI applies de-identification before transmission so that the data sent to AI services does not constitute PHI under HIPAA. De-identified data is not subject to HIPAA requirements per 45 CFR 164.514(a).

GenogramAI does not currently hold a Business Associate Agreement (BAA) with Google for Gemini API services. A BAA is not required for these calls because data that is properly de-identified under Safe Harbor is not PHI. We nonetheless review and improve our de-identification processes on an ongoing basis.

For users handling actual patient data in clinical settings, we recommend using Clinical Mode (Unlimited plan) for the additional protection of local AES-256 encryption with no cloud sync.

9. Contact

For questions about our de-identification practices or to report a privacy concern, contact us at support@genogramai.com.