Every prompt you type into a commercial AI system becomes a data point. It is routed through infrastructure you don’t control, processed by systems you can’t inspect, stored in databases governed by policies you didn’t negotiate, and potentially absorbed into training pipelines that will reproduce fragments of your input to strangers for years to come.

This is not speculation. It is the documented, default behavior of every major AI provider operating today. OpenAI, Google, Anthropic, Meta, and their peers have built extraordinarily powerful language systems on a foundation of data extraction. The consumer interfaces – ChatGPT, Gemini, Claude.ai, Meta AI – are designed to be frictionless. That frictionlessness is the problem.

This guide provides a concrete, layered approach to using AI systems while minimizing the data you surrender. No single technique provides complete protection. The goal is defense in depth: multiple overlapping strategies that collectively reduce your exposure surface from total to manageable.

Step 1: Understand What You’re Exposing

Before you can protect anything, you need to understand what’s at risk. Every interaction with a commercial AI system exposes at least four categories of data.

Prompt content. The text you type, including any personally identifiable information (PII), proprietary business data, legally privileged information, or medical details embedded in your questions. This is the most obvious exposure vector and the one most people think about.

Behavioral metadata. Session duration, prompt frequency, topic patterns, time-of-day usage, feature engagement, and interaction sequences. This data is almost never covered by training opt-outs because providers classify it as product analytics. It is, nonetheless, a detailed behavioral fingerprint.

Network metadata. Your IP address, browser fingerprint, device characteristics, timezone, language settings, and geolocation. These are transmitted with every HTTP request before your prompt even reaches the AI model.

Account identity. Your email address, phone number (if required for verification), payment information (if on a paid tier), and the full history of every conversation tied to your account.

The architecture of exposure is layered. Addressing only one layer – say, turning on an opt-out toggle – while ignoring the others provides a false sense of security. Effective privacy requires action at every layer.

Step 2: Choose the Right Access Tier

The single most impactful decision you can make is how you access the AI system. Consumer interfaces and API access operate under fundamentally different privacy regimes.

Consumer Tier (ChatGPT, Claude.ai, Gemini)

Consumer tiers are the default experience. You sign up with an email address, optionally provide a phone number, and interact through a web or mobile interface. The privacy characteristics of this tier are consistently poor across providers:

  • Training data defaults. Your conversations are used for model training unless you actively opt out. Even then, opt-out mechanisms are architecturally insufficient: they apply only going forward and cannot remove what has already been absorbed into a trained model.
  • Conversation persistence. Your entire conversation history is stored server-side, often indefinitely, linked to your account.
  • Full identity binding. Every prompt is associated with your verified identity (email, phone, payment method).
  • Broad data sharing. Terms of service typically permit sharing with subprocessors, affiliates, and in response to legal requests.

If you must use the consumer tier, the minimum steps are:

  1. Create a dedicated, compartmentalized account. Use a privacy-focused email provider (ProtonMail, Tuta) with no connection to your primary identity. Do not reuse passwords.
  2. Disable training data usage. In ChatGPT: Settings > Data Controls > disable “Improve the model for everyone.” In Claude.ai: the toggle is under Privacy settings. In Gemini: navigate through Google’s activity controls. These toggles are necessary but insufficient – they cannot undo retroactive training.
  3. Disable conversation history where possible. In ChatGPT, you can disable history (which also disables training for those conversations). In Claude.ai, similar options exist in settings.
  4. Never input PII, credentials, proprietary code, or legally privileged information. Treat the consumer interface as a public terminal. If you wouldn’t write it on a whiteboard in a conference center, don’t type it into a consumer AI chat.

API Tier

API access is structurally different. OpenAI, Anthropic, Google, and most other providers maintain explicit contractual commitments that API data is not used for model training. This is not because they respect your privacy more – it’s because enterprise and developer customers demanded it as a contractual requirement.

The privacy advantages of API access include:

  • No training by default. API inputs and outputs are excluded from training pipelines per the API terms of service and, for enterprise customers, per Data Processing Agreements (DPAs).
  • Reduced data retention. Most providers retain API data for 30 days for abuse monitoring, then shred it. Some offer zero-retention tiers.
  • No account-linked history. API requests are associated with an API key, not a user profile with email and phone number.
  • Programmatic control. You control what data is sent, how it is formatted, and can implement client-side PII stripping before any data leaves your infrastructure.

To use the API tier effectively:

  1. Obtain API access from your preferred provider. This requires a developer account and payment method.
  2. Use the API through a local client – not through the provider’s web playground, which may operate under consumer terms. Tools like llm (Simon Willison’s CLI), aichat, or custom scripts give you direct API access from your terminal.
  3. Implement client-side PII detection and tokenization. Before any prompt reaches the API, scan it for names, email addresses, phone numbers, addresses, and other identifiers. Replace them with tokens. Re-inject the real values into the response on the client side. This is the approach Stealth Cloud’s PII engine takes by default.
  4. Review the provider’s DPA and confirm training exclusion, retention period, and subprocessor list. Do not assume – verify.
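The tokenization step can be sketched in a few dozen lines. The patterns below are illustrative only – a production engine would use NER models and far broader coverage than three regexes – but the shape of the approach is the same: detect, substitute, keep the mapping client-side, re-inject on the way back.

```python
import re

# Illustrative patterns only; real pipelines need far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def tokenize(text):
    """Replace detected PII with placeholder tokens. Returns the sanitized
    text plus a token->value mapping that never leaves the client."""
    mapping = {}
    counts = {}
    for kind, pattern in PATTERNS.items():
        def replace(match, kind=kind):
            counts[kind] = counts.get(kind, 0) + 1
            token = f"[{kind}_{counts[kind]}]"
            mapping[token] = match.group(0)
            return token
        text = pattern.sub(replace, text)
    return text, mapping

def detokenize(text, mapping):
    """Re-inject the real values into the model's response, client-side."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text
```

Only the sanitized text crosses the network; the mapping stays in client memory, so the provider never sees the originals.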

The cost difference between consumer and API tiers is often negligible for individual users. At current pricing, a heavy individual user might spend $20-40/month on API calls – comparable to a ChatGPT Plus or Claude Pro subscription, but with fundamentally stronger privacy guarantees.

Step 3: Harden Your Network Layer

Even with API access and training opt-outs, your network traffic reveals metadata that identifies you. IP addresses, TLS fingerprints, and DNS queries all leak information about who is making requests and from where.

VPN Considerations

A reputable VPN hides your IP address from the AI provider. This prevents the provider from associating your requests with your geographic location and ISP. However, not all VPNs are equally trustworthy.

Requirements for a privacy-appropriate VPN:

  • No-log policy verified by independent audit. Marketing claims are insufficient. Look for audits by firms like Cure53, Deloitte, or PricewaterhouseCoopers. Mullvad, IVPN, and Proton VPN have all published audit results.
  • Payment without identity. Mullvad accepts cash sent by mail. IVPN accepts Monero. If you pay for a VPN with a credit card linked to your real name, you’ve traded IP-level anonymity for payment-level identity linkage.
  • Jurisdiction outside Five/Nine/Fourteen Eyes is preferable but not strictly necessary if the no-log policy is cryptographically enforced (i.e., the provider genuinely cannot produce logs even under compulsion, because the logs don’t exist).

Tor Considerations

The Tor network provides stronger anonymity than VPNs by routing traffic through three independent relays, preventing any single node from knowing both the source and destination of a request. However, using Tor with AI APIs introduces practical challenges:

  • Latency. Tor adds 200-800ms of latency per request. For streaming responses, this compounds into noticeably degraded performance.
  • Exit node blocking. Many AI providers rate-limit or block requests from known Tor exit nodes.
  • Account association. If you authenticate with an API key over Tor, the API key itself becomes the identifier. Tor protects your IP, not your API key identity.

Tor is most useful for one-off, unauthenticated interactions with AI systems – for example, using a provider’s free tier through the Tor Browser without creating an account. For sustained API usage, a properly vetted VPN is more practical.

DNS Privacy

Ensure your DNS queries don’t leak to your ISP. Use DNS-over-HTTPS (DoH) or DNS-over-TLS (DoT) with a privacy-respecting resolver such as Quad9 (9.9.9.9) or Cloudflare’s 1.1.1.1 (with the privacy-focused configuration that doesn’t log query data).
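On Linux with systemd-resolved, DoT takes only a few lines of configuration (the resolver shown is Quad9; substitute your preferred provider):

```ini
# /etc/systemd/resolved.conf
[Resolve]
DNS=9.9.9.9#dns.quad9.net
DNSOverTLS=yes
# Apply with: sudo systemctl restart systemd-resolved
```

The `#hostname` suffix tells systemd-resolved which TLS server name to validate against, so queries fail closed rather than falling back to plaintext.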

Step 4: Run Models Locally

The most effective way to prevent data exposure to AI providers is to never send data to AI providers at all. Local model execution keeps your prompts, your data, and your behavioral patterns entirely on hardware you control.

The Current State of Local Models

The local model ecosystem has matured rapidly. As of early 2026, models running on consumer hardware can match or approach the capabilities of cloud-hosted systems for many tasks:

  • Llama 3.3 70B (Meta, open weights) delivers strong general-purpose performance and runs on a workstation with 48GB+ of VRAM or quantized on Apple Silicon with 64GB+ of unified memory.
  • Mistral Large and Mixtral variants provide competitive reasoning and coding capabilities.
  • Qwen 2.5 series offers strong multilingual performance.
  • DeepSeek-V3 and DeepSeek-R1 provide excellent reasoning at various parameter counts.

For most knowledge work – drafting, summarizing, brainstorming, code review, analysis – a well-tuned 13B-70B parameter model running locally is adequate.

Local Inference Tools

  • Ollama is the simplest entry point. A single command (ollama run llama3.3) downloads and runs a model locally. It exposes an OpenAI-compatible API on localhost, making it a drop-in replacement for cloud API calls in most toolchains.
  • llama.cpp provides optimized inference for GGUF-quantized models across CPU and GPU. It’s the engine behind Ollama and many other tools.
  • LM Studio offers a graphical interface for downloading, configuring, and chatting with local models. Useful for users who prefer a GUI over terminal commands.
  • vLLM and TGI (Text Generation Inference) are production-grade serving frameworks for teams running local models at scale.
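Because Ollama speaks the OpenAI wire format on localhost, pointing existing tooling at it is a one-line change. A minimal sketch using only the Python standard library (the port is Ollama's documented default; the model name assumes you've already pulled llama3.3):

```python
import json
import urllib.request

# Ollama's default OpenAI-compatible endpoint on localhost.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_payload(prompt, model="llama3.3"):
    """OpenAI-style chat payload; no account, no cloud, no telemetry."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask_local(prompt, model="llama3.3"):
    """Send a prompt to the local model and return its reply text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["choices"][0]["message"]["content"]
```

Nothing in this path leaves your machine: the prompt, the response, and the request metadata all terminate at localhost.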

Hardware Requirements

Running models locally requires adequate hardware. The practical minimums:

Model Size              RAM/VRAM Required    Example Hardware
7-13B (quantized)       8-16 GB              M2/M3 MacBook, gaming GPU
30-34B (quantized)      24-32 GB             M2/M3 Pro/Max, RTX 4090
70B (quantized)         48-64 GB             M2/M3 Ultra, dual GPU
70B (full precision)    140+ GB              Multi-GPU server
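These figures follow a rough rule of thumb: weight memory is parameters × bits per weight ÷ 8, plus headroom for the KV cache and activations. A quick sketch (the 20% overhead factor is an assumption for illustration, not a vendor figure):

```python
def approx_memory_gb(params_billions, bits_per_weight=4, overhead=1.2):
    """Rough memory estimate: weights (params * bits / 8) plus ~20%
    headroom for KV cache and activations (assumed overhead factor)."""
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb * overhead

# approx_memory_gb(70) is ~42 GB at 4-bit, near the 48-64 GB band above;
# at 16-bit it is ~168 GB, matching the 140+ GB full-precision row.
```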

Apple Silicon Macs with unified memory are particularly effective for local inference because the CPU and GPU share the same memory pool, eliminating the VRAM bottleneck that limits NVIDIA GPU deployments.

Limitations of Local Models

Local models are not a universal solution. They currently lag behind frontier cloud models (GPT-4o, Claude Opus, Gemini Ultra) in complex reasoning, nuanced instruction following, and very long-context tasks. For high-stakes work requiring the best available model, you may still need to access cloud APIs – but you can do so through the privacy-hardened API approach described in Step 2, with PII stripping, rather than through consumer interfaces.

Step 5: Use Privacy-First Intermediaries

A growing category of tools sits between you and AI providers, acting as privacy-preserving proxies. These intermediaries strip identifying information before forwarding your requests.

What to look for in a privacy intermediary:

  • Client-side PII detection and tokenization. The intermediary should identify and replace personal information before any data leaves your device or enters the intermediary’s infrastructure.
  • Zero-persistence architecture. The intermediary should not store your prompts, responses, or session data. Processing should occur in memory only, with cryptographic shredding on session termination.
  • Metadata stripping. The intermediary should remove your IP address, user agent, and other identifying headers before forwarding requests to the AI provider.
  • Transparent architecture. The intermediary’s privacy claims should be verifiable – ideally through open-source code, independent audits, or architectures that make data collection technically impossible rather than merely prohibited by policy.
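Metadata stripping is the easiest of these properties to reason about: the intermediary forwards only the headers the upstream API actually needs and drops everything else. A sketch using an allow-list (the header set is illustrative):

```python
# Forward only what the upstream API needs; everything else is metadata.
FORWARDABLE = {"content-type", "authorization", "accept"}

def strip_metadata(headers):
    """Drop identifying headers (user agent, cookies, forwarded IPs)
    before relaying a request to the AI provider."""
    return {k: v for k, v in headers.items() if k.lower() in FORWARDABLE}
```

An allow-list fails safe: a new tracking header added by a browser or SDK is dropped by default, whereas a deny-list would silently forward it.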

This is the architectural approach that Stealth Cloud’s Ghost Chat implements: client-side PII tokenization via a WebAssembly engine, AES-256-GCM encryption of sanitized prompts, metadata stripping at the edge, and zero-persistence infrastructure that cannot retain data even if compelled to.

Step 6: Practice Prompt Hygiene

Technology alone is insufficient. The most privacy-hardened infrastructure cannot protect you from yourself. Prompt hygiene – the discipline of controlling what you input – is the most reliable privacy measure available.

Rules of prompt hygiene:

  1. Never include real names. Use pseudonyms or role descriptions (“the client,” “the patient,” “Employee A”).
  2. Generalize specifics. Instead of “our Q3 revenue was $4.2M,” write “quarterly revenue in the low single-digit millions.” The AI can work with approximations.
  3. Strip context that identifies. Remove company names, project codenames, dates of specific events, and any detail that could be cross-referenced to identify you or your organization.
  4. Decompose sensitive queries. Instead of asking one question that reveals your full situation, break it into multiple abstract questions across separate sessions. No single prompt should contain enough context to reconstruct your identity or business situation.
  5. Review before sending. Read every prompt before submission and ask: if this prompt were leaked publicly, could it be traced back to me or cause harm? If yes, redact or restructure.

Step 7: Audit and Verify

Privacy is not a configuration you set once. It’s a practice you maintain.

Monthly verification checklist:

  • Confirm training opt-out settings are still active (providers have been known to reset these during updates).
  • Review your conversation history for any sessions that should have been burned but weren’t.
  • Check provider changelog and terms of service updates for privacy-relevant changes.
  • Verify your VPN’s no-log audit is current (audits older than 18 months are stale).
  • Test your PII stripping pipeline with known PII inputs to confirm detection rates.
  • Review API key permissions and rotate keys quarterly.
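The PII-pipeline check above is easy to automate. A sketch that measures detection rate against labeled samples – the `sanitize` stub below stands in for whatever stripping pipeline you actually run:

```python
import re

def sanitize(text):
    """Stand-in for your real PII stripping pipeline: emails only."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)

def detection_rate(pipeline, labeled_samples):
    """labeled_samples: (text, pii_substring) pairs. A sample counts as
    detected when the PII no longer appears in the sanitized output."""
    detected = sum(1 for text, pii in labeled_samples if pii not in pipeline(text))
    return detected / len(labeled_samples)

samples = [
    ("Reach me at alice@example.com", "alice@example.com"),
    ("My SSN is 123-45-6789", "123-45-6789"),  # not covered by the stub
]
# detection_rate(sanitize, samples) returns 0.5 here -- the gap is the
# audit finding: the stub misses SSNs entirely.
```

Run this monthly with a growing sample set; a rate that drifts below 1.0 means your pipeline has a hole before your prompts do.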

The Layered Defense Model

No single technique in this guide provides complete privacy. The architecture of protection is layered:

Layer             Technique                    Protects Against
Access tier       API over consumer            Training data ingestion
Network           VPN/Tor + DoH                IP and metadata correlation
Client            PII stripping + encryption   Content exposure
Infrastructure    Privacy intermediary         Provider-side logging
Execution         Local models                 All cloud-based risks
Behavioral        Prompt hygiene               Self-inflicted exposure
Operational       Regular audits               Configuration drift

Each layer addresses a different threat vector. The more layers you implement, the smaller your exposure surface. The first three layers – API access, VPN, and PII stripping – provide the highest impact-to-effort ratio for most users.

What Stealth Cloud Builds Toward

The steps in this guide are practical but manual. They require vigilance, technical knowledge, and ongoing discipline. Stealth Cloud exists to automate this entire stack: client-side PII tokenization, zero-persistence infrastructure, wallet-based authentication that requires no email or identity, and cryptographic shredding that makes data recovery physically impossible.

The goal is not to make privacy harder. It’s to make privacy the default – an architecture where you don’t have to think about every layer because every layer is handled before your first prompt leaves the client.

Until that infrastructure is universally available, the steps above represent the current best practice. Execute them. Maintain them. And treat every AI interaction as what it is: a data transaction with terms you should negotiate, not accept.