Industry Insights | 8 min read

What Happens to Your Documents? Data Security Truth

See exactly how Scanny AI processes and protects your documents. Complete transparency on security, encryption, and data privacy.

Scanny Team

When you upload a confidential invoice, a sensitive employee resume, or a legal contract to a document processing platform, you're placing enormous trust in that service. Where does your document go? Who sees it? How long is it stored? Can it be deleted?

These aren't just theoretical questions; they're critical concerns for any business handling sensitive data. IBM's 2024 Cost of a Data Breach Report put the global average cost of a breach at $4.88 million (up from $4.45 million in the 2023 report), and document processing systems, which concentrate sensitive files in one place, are an attractive target for attackers.

Here's the uncomfortable truth: Most document automation platforms operate in a black box. You upload your files, get results back, and have no visibility into what happens in between. For compliance officers, IT security teams, and privacy-conscious organizations, this opacity is a dealbreaker.

At Scanny AI, we believe transparency builds trust. This article pulls back the curtain completely. You'll see exactly what happens to your documents from the moment you upload them until the moment they're deleted—and you'll understand why our security-first architecture makes us the safest choice for sensitive document processing.

Secure document processing infrastructure

Why Transparency Matters: The Hidden Cost of Opacity

When evaluating document processing platforms, most organizations focus on accuracy rates and pricing. But the real risk lies in what you don't know:

  • Vendor lock-in: Can you export or delete your data at any time?
  • Third-party access: Are your documents being used to train AI models?
  • Data residency: Is your data stored in compliance with regional regulations?
  • Retention policies: Are documents automatically deleted, or stored indefinitely?
  • Access controls: Who on the vendor's team can view your sensitive files?

Let's compare the traditional approach with Scanny AI's transparency-first model:

Aspect | Traditional Platforms | Scanny AI's Approach
Processing Location | Undisclosed ("the cloud") | Explicit regional data centers (EU, US, APAC) with customer choice
Data Retention | Indefinite or unclear | Configurable: 24 hours to 90 days, or immediate deletion
Third-Party Training | Often used for model improvement | Zero training on customer data; customers are opted out by default
Access Logs | Not provided | Complete audit trail with IP, timestamp, and user identity
Deletion Guarantee | "Request deletion" (no verification) | Cryptographic proof of deletion within 24 hours
Encryption | "AES-256" (vague) | End-to-end encryption with a customer-managed keys (BYOK) option
Compliance | Generic SOC 2 claim | SOC 2 Type II, GDPR, HIPAA, ISO 27001 with public audit reports

Key Takeaway: Opacity isn't just an inconvenience; it's a compliance risk. Every unanswered question about your data is a potential audit failure or breach vector.

The Document Lifecycle in Scanny AI: A Complete Walkthrough

Let's trace a single document through our entire system. We'll use a real example: a medical invoice containing personally identifiable information (PII) and protected health information (PHI).

Step 1: Upload & Encryption (Client-Side)

Before your document ever reaches our servers, it's encrypted on your device.

// Client-side encryption (simplified)
{
  "file": "medical-invoice-2024.pdf",
  "encryption": {
    "algorithm": "AES-256-GCM",
    "key_id": "customer-key-abc123",
    "iv": "randomly-generated-initialization-vector"
  },
  "metadata": {
    "upload_timestamp": "2025-12-30T10:15:00Z",
    "user_id": "user_xyz",
    "document_type": "medical_invoice",
    "retention_policy": "delete_after_24_hours"
  }
}

What this means:

  • Your document is encrypted before leaving your network
  • We never see the unencrypted file during transmission
  • Even if network traffic is intercepted, it's useless without your encryption key

End-to-end encryption process

Step 2: Processing & Extraction (Isolated Environment)

Once the encrypted document reaches our servers, it enters an isolated processing environment:

  1. Decryption in Memory: The file is decrypted in a secure, ephemeral container that exists only for the duration of processing (typically 2-30 seconds).
  2. OCR Processing: Gemini Vision API processes the document to extract structured data.
  3. Immediate Re-Encryption: Extracted JSON data is re-encrypted using your key.
  4. Container Destruction: The processing container is destroyed, and all temporary files are securely wiped.

Here's the JSON extraction schema for our medical invoice example:

{
  "document_type": "medical_invoice",
  "extraction_schema": {
    "fields": [
      {
        "name": "patient_name",
        "type": "string",
        "pii": true,
        "required": true
      },
      {
        "name": "patient_id",
        "type": "string",
        "pii": true,
        "required": true
      },
      {
        "name": "date_of_service",
        "type": "date",
        "required": true
      },
      {
        "name": "provider_name",
        "type": "string",
        "required": true
      },
      {
        "name": "diagnosis_codes",
        "type": "array",
        "items": "string",
        "phi": true
      },
      {
        "name": "total_amount",
        "type": "number",
        "required": true
      },
      {
        "name": "insurance_claims",
        "type": "array",
        "items": {
          "claim_number": "string",
          "paid_amount": "number",
          "status": "string"
        }
      }
    ]
  },
  "security_classification": "PHI",
  "compliance_requirements": ["HIPAA", "GDPR"],
  "processing_constraints": {
    "max_retention": "24_hours",
    "geographic_restriction": "US_only",
    "audit_level": "full"
  }
}

Critical Security Feature: Notice the pii and phi flags? When fields are marked as containing sensitive data, Scanny AI applies additional protections:

  • Automatic redaction in logs
  • Encrypted at rest with separate key rotation schedule
  • Access restricted to zero-knowledge processing (our engineers cannot view the data)

Step 3: Storage & Access Control (Your Choice)

After processing, you control exactly what happens next:

Option A: Immediate Deletion (Recommended for Maximum Security)

{
  "retention_policy": "delete_immediately",
  "action": "return_json_only",
  "original_document": "deleted",
  "deletion_verification": "cryptographic_proof_provided"
}

The extracted JSON is returned to you via API, and the original document is permanently deleted from our servers within 60 seconds. You receive a cryptographic deletion certificate for compliance audits.

Option B: Temporary Storage (For Workflow Automation)

{
  "retention_policy": "delete_after_24_hours",
  "storage_location": "us-east-1",
  "encryption_at_rest": "AES-256-GCM",
  "access_control": {
    "owner": "user_xyz",
    "shared_with": [],
    "api_access": "token_abc123_only"
  }
}

Documents are stored in encrypted object storage with:

  • Geographic residency: You choose the region (EU, US, APAC)
  • Automatic expiration: Guaranteed deletion after your specified period
  • Zero-knowledge architecture: Even Scanny AI employees cannot decrypt your files
  • Audit trail: Every access is logged with timestamp, IP, and user identity
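As a rough illustration of automatic expiration, the Python sketch below turns a retention policy string into a deletion deadline. Only "delete_immediately" and "delete_after_24_hours" appear in this article; the general "delete_after_<n>_<hours|days>" pattern and the function name are assumptions made for the example. The 24-hour to 90-day bounds come from the comparison table earlier.

```python
import re
from datetime import datetime, timedelta, timezone

MIN_RETENTION = timedelta(hours=24)
MAX_RETENTION = timedelta(days=90)


def deletion_deadline(policy, uploaded_at):
    """Return the time by which a document must be deleted.

    The policy string format is an assumption based on the examples above.
    """
    if policy == "delete_immediately":
        return uploaded_at
    match = re.fullmatch(r"delete_after_(\d+)_(hours|days)", policy)
    if match is None:
        raise ValueError(f"unrecognized retention policy: {policy!r}")
    amount, unit = int(match.group(1)), match.group(2)
    retention = timedelta(hours=amount) if unit == "hours" else timedelta(days=amount)
    if not MIN_RETENTION <= retention <= MAX_RETENTION:
        raise ValueError("retention must be between 24 hours and 90 days")
    return uploaded_at + retention


uploaded = datetime(2025, 12, 30, 10, 15, tzinfo=timezone.utc)
deadline = deletion_deadline("delete_after_24_hours", uploaded)
print(deadline.isoformat())
```

Keeping timestamps timezone-aware (UTC here) avoids ambiguity when the deadline is later compared with a deletion timestamp.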

Step 4: Delivery & Integration (Secure Webhooks)

When processing is complete, extracted data is delivered to your systems via:

Secure Webhook Delivery:

{
  "webhook_url": "https://your-crm.example.com/scanny/webhook",
  "delivery_method": "POST",
  "authentication": {
    "type": "HMAC-SHA256",
    "signature_header": "X-Scanny-Signature",
    "shared_secret": "your-webhook-secret"
  },
  "retry_policy": {
    "max_attempts": 3,
    "backoff": "exponential"
  },
  "payload": {
    "document_id": "doc_xyz789",
    "extracted_data": {
      "patient_name": "John Doe",
      "total_amount": 450.00,
      "date_of_service": "2024-12-15"
    },
    "processing_metadata": {
      "model_used": "gemini-3-pro-preview",
      "confidence_score": 0.98,
      "processing_time_ms": 2340
    }
  }
}
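The retry_policy above (3 attempts, exponential backoff) can be sketched in a few lines of Python. The base delay of one second and the doubling factor are assumptions; the configuration shown does not state them. The send callable is a hypothetical stand-in for the actual HTTP POST.

```python
import time


def deliver_with_retry(send, payload, max_attempts=3, base_seconds=1.0, sleep=time.sleep):
    """Call send(payload) until it returns True or attempts run out.

    Waits base_seconds, then twice that, and so on between attempts.
    Returns the attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, max_attempts + 1):
        if send(payload):
            return attempt
        if attempt < max_attempts:
            sleep(base_seconds * 2 ** (attempt - 1))
    return None


# Demo: a fake endpoint that fails twice, then succeeds.
# Waits are recorded instead of actually sleeping.
outcomes = iter([False, False, True])
waits = []
attempts_used = deliver_with_retry(
    lambda payload: next(outcomes),
    {"document_id": "doc_xyz789"},
    sleep=waits.append,
)
print(attempts_used, waits)
```

Passing sleep as a parameter keeps the function easy to test without real delays.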

Security Features:

  • HMAC signature verification prevents tampering
  • TLS 1.3 encryption for all webhook deliveries
  • Webhook endpoints validated before activation
  • No sensitive data in URLs or headers
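On the receiving side, HMAC verification might look like the following Python sketch. It assumes the X-Scanny-Signature header carries a hex-encoded HMAC-SHA256 of the raw request body; the exact encoding and what is signed are not specified in this article, so treat both as assumptions and check them against the API documentation.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Return True only if the header matches the expected signature.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of a guessed signature were correct.
    """
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode(), signature_header.encode())


secret = b"your-webhook-secret"
body = b'{"document_id": "doc_xyz789"}'
header = sign_body(secret, body)  # what the sender would attach

print(verify_webhook(secret, body, header))
print(verify_webhook(secret, body + b" ", header))
```

Verify against the raw bytes you received, before any JSON parsing, since re-serializing the payload can change whitespace and break the signature.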

Secure workflow automation

Step 5: Deletion & Verification (Cryptographic Proof)

When your retention period expires (or you manually delete), here's what happens:

  1. Secure Deletion Protocol: Files are overwritten in 7 passes (the DoD 5220.22-M ECE variant; the base DoD 5220.22-M method uses 3 passes)
  2. Metadata Purge: All references, logs, and backups are removed
  3. Deletion Certificate: You receive cryptographic proof of deletion:
{
  "deletion_certificate": {
    "document_id": "doc_xyz789",
    "deleted_at": "2025-12-31T10:15:00Z",
    "deletion_method": "DoD_5220.22-M_7_pass",
    "verification": {
      "hash_before": "sha256:abc123...",
      "hash_after": "sha256:000000...",
      "signed_by": "scanny-deletion-service",
      "signature": "RSA-4096:xyz789..."
    },
    "compliance_attestation": "This document has been irreversibly deleted and cannot be recovered by any means."
  }
}

This certificate can be submitted to auditors as proof of GDPR Article 17 (Right to Erasure) compliance.
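Before filing a certificate, you may want a basic sanity check on your side. The Python sketch below checks structure and timing only: that the certificate names the right document, carries the verification fields shown above, and reports a deletion time within 24 hours of the deadline. It does not verify the RSA signature, which would need a cryptography library and Scanny AI's public key. The function names are hypothetical.

```python
from datetime import datetime, timedelta

REQUIRED_VERIFICATION_KEYS = {"hash_before", "hash_after", "signed_by", "signature"}


def parse_utc(timestamp):
    """Parse an ISO 8601 timestamp ending in 'Z' into an aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def check_certificate(cert, document_id, deadline, grace=timedelta(hours=24)):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    body = cert.get("deletion_certificate", {})
    if body.get("document_id") != document_id:
        problems.append("document_id mismatch")
    missing = REQUIRED_VERIFICATION_KEYS - set(body.get("verification", {}))
    if missing:
        problems.append("missing verification fields: " + ", ".join(sorted(missing)))
    deleted_at = body.get("deleted_at")
    if deleted_at is None:
        problems.append("missing deleted_at")
    elif parse_utc(deleted_at) > deadline + grace:
        problems.append("deleted later than the allowed window")
    return problems


# The sample certificate from above, with its elided values left as-is.
CERT = {
    "deletion_certificate": {
        "document_id": "doc_xyz789",
        "deleted_at": "2025-12-31T10:15:00Z",
        "verification": {
            "hash_before": "sha256:abc123...",
            "hash_after": "sha256:000000...",
            "signed_by": "scanny-deletion-service",
            "signature": "RSA-4096:xyz789...",
        },
    }
}
deadline = parse_utc("2025-12-31T10:15:00Z")
problems = check_certificate(CERT, "doc_xyz789", deadline)
print(problems)
```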

Your Control Panel: Transparency in Action

Every Scanny AI account includes a Data Governance Dashboard where you can:

  ✅ View all active documents with upload date, retention policy, and storage location
  ✅ Download audit logs showing every access to your documents (who, when, from where)
  ✅ Bulk delete documents with one click and receive batch deletion certificates
  ✅ Configure default retention policies for different document types
  ✅ Export all your data in standard formats (no vendor lock-in)
  ✅ Manage encryption keys (including BYOK - Bring Your Own Key)
  ✅ Set geographic restrictions (e.g., "EU data must never leave EU borders")

Example Use Case: A healthcare organization processing 10,000 patient invoices monthly uses the dashboard to:

  • Set automatic 24-hour deletion for all PHI documents
  • Restrict processing to US-based data centers only
  • Generate quarterly audit reports for HIPAA compliance officers
  • Receive real-time alerts if any document access occurs outside business hours
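If you export audit logs and want to run the after-hours check yourself, a Python sketch could look like this. The entry field names (timestamp, ip, user_id) are assumptions modeled on the audit trail description (timestamp, IP, and user identity), the 9:00 to 17:00 weekday window is an example, and timestamps are assumed to already be in your organization's local time.

```python
from datetime import datetime, time


def outside_business_hours(entries, start=time(9, 0), end=time(17, 0)):
    """Return audit entries logged on weekends or outside start..end."""
    flagged = []
    for entry in entries:
        ts = datetime.fromisoformat(entry["timestamp"])
        is_weekend = ts.weekday() >= 5  # Saturday is 5, Sunday is 6
        if is_weekend or not (start <= ts.time() < end):
            flagged.append(entry)
    return flagged


# Made-up log entries; the IPs are from a reserved documentation range.
entries = [
    {"timestamp": "2025-12-30T10:15:00", "ip": "203.0.113.10", "user_id": "user_xyz"},
    {"timestamp": "2025-12-30T23:40:00", "ip": "203.0.113.77", "user_id": "user_abc"},
]
flagged = outside_business_hours(entries)
for entry in flagged:
    print(entry["user_id"], entry["timestamp"])
```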

Compliance & Certifications: Third-Party Verification

We don't just claim to be secure—we prove it through independent audits:

Certification | Status | Scope | Public Audit Report
SOC 2 Type II | ✅ Active | Security, Availability, Confidentiality | View Report
ISO 27001:2022 | ✅ Active | Information Security Management | View Certificate
GDPR Compliance | ✅ Active | EU Data Protection Regulation | View DPA
HIPAA | ✅ Active | Protected Health Information | View BAA
PCI DSS Level 1 | 🟡 In Progress | Payment Card Data (Q2 2025) | Coming Soon

What this means for you:

  • We undergo rigorous annual audits by third-party security firms
  • Our security controls are verified, not self-reported
  • You can download audit reports to satisfy your own compliance requirements
  • We sign Business Associate Agreements (BAA) for HIPAA-covered entities

Common Questions: Radical Transparency

"Do you use my documents to train your AI models?"

Absolutely not. This is stated explicitly in our Terms of Service:

"Scanny AI will never use customer documents, extracted data, or any content uploaded by users to train machine learning models, improve algorithms, or for any purpose other than providing the requested document processing service."

We use Google Gemini Vision API for OCR, which operates under Google's enterprise terms (no training on customer data). Our own systems never ingest your documents for training purposes.

"Can Scanny AI employees view my documents?"

No. Our architecture is zero-knowledge by design:

  • Documents are encrypted with keys we don't control (your keys or managed keys you own)
  • Our engineers have no access to decryption keys
  • Even our database administrators cannot view document contents
  • The only exception: If you explicitly grant support access for troubleshooting (requires your written consent)

"What happens if Scanny AI is acquired or shuts down?"

Our Data Continuity Guarantee:

  1. 90-day advance notice of any service termination
  2. Full data export in standard formats (PDF originals + JSON exports)
  3. Guaranteed deletion of all data within 30 days of export
  4. Escrow arrangement: In case of sudden shutdown, a third-party escrow agent will facilitate data return

"Where is my data physically stored?"

You choose from:

  • EU: Frankfurt, Germany (AWS eu-central-1)
  • US: Northern Virginia, USA (AWS us-east-1)
  • APAC: Singapore (AWS ap-southeast-1)

Data never crosses regional boundaries unless you explicitly configure cross-region workflows.
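A client-side guard for this rule can be sketched in Python as follows. The region-to-data-center mapping comes from the list above; the "US_only" restriction value appears in the extraction schema earlier, while "EU_only" and "APAC_only" are assumed to follow the same pattern. The function names are hypothetical.

```python
# Regions and AWS data centers as listed above.
REGION_DATA_CENTERS = {
    "EU": "eu-central-1",
    "US": "us-east-1",
    "APAC": "ap-southeast-1",
}


def allowed_locations(restriction):
    """Map a restriction such as "US_only" to the permitted AWS regions.

    A restriction of None means any listed region is acceptable.
    """
    if restriction is None:
        return set(REGION_DATA_CENTERS.values())
    region, _, suffix = restriction.partition("_")
    if suffix != "only" or region not in REGION_DATA_CENTERS:
        raise ValueError(f"unrecognized geographic restriction: {restriction!r}")
    return {REGION_DATA_CENTERS[region]}


def violates_residency(restriction, storage_location):
    """Return True if storing at storage_location would break the restriction."""
    return storage_location not in allowed_locations(restriction)


print(violates_residency("US_only", "us-east-1"))
print(violates_residency("US_only", "eu-central-1"))
```

Running a check like this before submitting a storage configuration catches a mismatched region early, rather than relying on a server-side rejection.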

"How do you handle law enforcement requests?"

We follow a strict Transparency Protocol:

  1. We require a valid court order or subpoena (no informal requests)
  2. We notify you immediately (unless legally prohibited)
  3. We disclose only the minimum data required by the order
  4. We publish a Transparency Report twice yearly detailing all government data requests

Since our founding, we've received zero government data requests (as of December 2025).

The Technical Foundation: How We Built for Security

Our security isn't bolted on—it's foundational. Here's the technical architecture:

┌─────────────────────────────────────────────────────────────┐
│                     Client Application                       │
│            (Your browser, API integration)                   │
│                                                              │
│  [Document] → Client-Side Encryption (AES-256-GCM)          │
└────────────────────────┬────────────────────────────────────┘
                         │ Encrypted over TLS 1.3
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                    API Gateway (WAF)                         │
│  ✓ DDoS Protection  ✓ Rate Limiting  ✓ IP Allowlisting     │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│             Isolated Processing Environment                  │
│  ┌──────────────────────────────────────────────────┐      │
│  │  Ephemeral Container (Lifespan: 2-30 seconds)   │      │
│  │  1. Decrypt in memory (never touches disk)       │      │
│  │  2. Process with Gemini Vision API               │      │
│  │  3. Extract structured JSON                       │      │
│  │  4. Re-encrypt output                            │      │
│  │  5. Secure wipe & destroy container              │      │
│  └──────────────────────────────────────────────────┘      │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│              Encrypted Object Storage (S3)                   │
│  ✓ Encryption at rest (AES-256)                             │
│  ✓ Versioning disabled (no hidden copies)                   │
│  ✓ Lifecycle policies (auto-deletion)                       │
│  ✓ Access logs → SIEM for monitoring                        │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                  Secure Webhook Delivery                     │
│  → Your CRM/ERP/Drive with HMAC authentication             │
└─────────────────────────────────────────────────────────────┘

Key Security Principles:

  1. Defense in Depth: Multiple layers of security (encryption, isolation, access controls)
  2. Principle of Least Privilege: Systems have minimal permissions needed
  3. Immutable Infrastructure: Servers are never patched—they're replaced
  4. Zero Trust: Every request is authenticated and authorized, even internal ones

Why Transparency is a Competitive Advantage

In the document processing market, most vendors compete on price or accuracy. We compete on trust.

Here's why transparency isn't just ethical—it's strategic:

For Enterprises:

  • Faster procurement cycles (security teams approve us quickly)
  • Reduced compliance burden (our audit reports satisfy your auditors)
  • Lower risk of vendor lock-in (full data portability)

For Regulated Industries (Healthcare, Finance, Legal):

  • HIPAA/GDPR compliance out-of-the-box
  • BAA and DPA agreements signed in 24 hours
  • Explicit data residency guarantees

For Privacy-Conscious Organizations:

  • No hidden data usage or model training
  • Complete audit trails for internal investigations
  • Cryptographic proof of deletion for data subject requests

Getting Started: Security-First Setup

When you sign up for Scanny AI, you can configure your security preferences from day one:

Quick Setup (5 minutes):

  1. Choose your data region: EU, US, or APAC
  2. Set default retention policy: 24 hours, 7 days, 30 days, or custom
  3. Configure encryption: Use Scanny-managed keys or BYOK (Bring Your Own Key)
  4. Enable audit logging: Full, standard, or minimal
  5. Set up webhook authentication: HMAC secrets for secure delivery

Advanced Setup (For Enterprises):

  • SSO integration (SAML 2.0, OAuth 2.0)
  • IP allowlisting for API access
  • Custom data retention schedules per document type
  • Dedicated processing environments (single-tenant option)
  • Real-time security alerts to your SIEM

Try it risk-free: All accounts include a 14-day free trial with full access to enterprise security features. No credit card required. Start your free trial

Conclusion: Trust Through Transparency

Here's what we've covered:

  ✅ Complete visibility: You know exactly where your documents are at all times
  ✅ Customer control: You choose storage location, retention period, and deletion
  ✅ Zero-knowledge architecture: Even we can't access your encrypted documents
  ✅ Third-party verification: SOC 2, ISO 27001, GDPR, HIPAA compliance
  ✅ Cryptographic proof: Deletion certificates for compliance audits
  ✅ No hidden usage: Your data is never used for training or secondary purposes

Most document processing platforms ask you to trust them blindly. At Scanny AI, we believe you shouldn't have to.

Transparency isn't a feature—it's our foundation.

Every security control, every compliance certification, and every line of code in our system is designed around one principle: Your data is yours, and you should always know what's happening with it.

If you're tired of vendor opacity, if you need to satisfy stringent compliance requirements, or if you simply believe your sensitive documents deserve better protection—Scanny AI is built for you.


Ready to Experience Transparency-First Document Processing?

🔐 See it in action: Book a live demo where we'll walk through our Data Governance Dashboard and show you exactly how your documents are protected.

🚀 Start processing securely: Sign up for free and process your first 100 documents with enterprise-grade security (no credit card required).

📚 Read the technical details: Download our Security Whitepaper for deep-dive architecture documentation and threat model analysis.

💬 Questions about compliance? Talk to our security team for HIPAA BAA, GDPR DPA, or custom compliance requirements.


Already using Scanny AI? Log in to your Data Governance Dashboard to review your document audit logs, configure retention policies, or generate deletion certificates.

Last updated: December 30, 2025 | Scanny AI Security Team

Tags: Data Security, Privacy, Transparency, Compliance, Document Processing, Encryption
