Ask First
Every chatbot company is losing money on free users. OpenAI’s inference bill is projected at $14 billion for 2026. Subscriptions have a ceiling. Advertising is the only business model that’s ever scaled with free usage. Chatbot companies will run ads. The question is how to do it without destroying the trust that makes a chatbot worth using.
ChatGPT rolled out ads on February 9. Perplexity tried and killed them. Both failed the same test: the moment a chatbot shows an ad it selected, the user wonders whether the answer is honest. The conflict of interest is structural. No amount of labeling fixes it.
The fix is to separate the signal from the sale, and to separate both from the chatbot. But first, it helps to distinguish three things that all get called “ads.”
Three Kinds of Ad
Ads you endure. A pre-roll video before the thing you came for. A sponsored box in your ChatGPT answer. A billboard on the highway. The ad is a tax on your attention, the price of accessing something else. You tolerate it. You never asked for it. Every platform that forces these ads is betting that the content is worth more than the interruption costs. Sometimes it is. The resentment accumulates anyway.
Ads disguised as experience. An affiliate review that reads like editorial. A native ad styled as a feed post. A chatbot recommendation that’s quietly sponsored. The commercial intent is hidden because revealing it would break the spell. These feel more pleasant than interruptions, until you realize the advice wasn’t honest. The trust damage is worse precisely because you didn’t see it coming.
Ads you’d miss if they were gone. You open Google Maps and search “restaurants near me.” The pins are ads. Nobody gets upset because removing them would make the product useless. Amazon search results are ads, and they’re exactly what you came for. Yelp listings are ads, and 90% of users buy within a week. When your intent is already commercial, relevant advertising is a service. Remove it and the experience gets worse.
The first two categories are what people mean when they say they hate ads. The third is what people mean when they say “I don’t mind ads if they’re relevant,” but the relevance has to be real, and the intent has to be theirs.
Forced ads poison the chatbot conversation even when the user’s intent is commercial, because the user can’t tell whether the answer is honest. Disguised ads are worse: they poison the conversation especially when it seems honest. The two-phase model is designed for the third category: surface commercial information only when the user’s conversation is already headed somewhere commercial, and only when the user asks.

Two Phases
Phase one: proximity. A separate system monitors the conversation’s embedding, the vector representation of what’s being discussed. As the conversation moves through embedding space, a small indicator in the UI reflects how close the nearest advertiser is. A dot that brightens. A line that extends. Something peripheral, readable at a glance, ignorable by default.
The indicator maps directly to cosine distance. No interpretation layer, no algorithm deciding “is this relevant enough to show.” How close is the nearest expertise to where this conversation currently is? The answer is a number. The number is a brightness. If no advertiser has positioned anywhere nearby, the indicator is dark. There’s nothing to see.
The chatbot doesn’t know this indicator exists. It’s produced by a separate system that reads the conversation’s meaning but cannot write to the conversation. No auction has run. No advertiser has been selected. No money has moved. The user sees proximity to a region of expertise, not a business name.
This works because advertiser embeddings are public. An advertiser’s position in embedding space is a claim of expertise; hiding it would defeat the purpose. The exchange publishes the full catalog, and the publisher caches a local copy. Proximity is computed entirely on the publisher’s infrastructure: cosine distance between the conversation embedding and cached advertiser embeddings. No network call to the exchange, no request to log. The exchange doesn’t know the user exists until they tap.
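The phase-one computation is small enough to sketch. Here is a minimal illustration in Python with NumPy, assuming unit-normalized embeddings and a linear brightness falloff; both of those choices, and all the names, are mine for illustration, not specified by the design. The property that matters is that nothing here touches the network:

```python
import numpy as np

# Illustrative sketch of phase one, not production code. Assumptions:
# advertiser embeddings in the cached catalog are unit-normalized, and
# brightness falls off linearly with cosine distance. Everything runs
# on the publisher's own infrastructure: no network call, no auction,
# no request for the exchange to log.

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance between unit vectors: 1 - cos(angle)."""
    return 1.0 - float(np.dot(a, b))

def indicator_brightness(conversation_emb, cached_catalog, falloff=1.0):
    """Distance to the nearest cached advertiser embedding, rendered
    as a brightness in [0, 1]. Dark if nobody positioned nearby."""
    if not cached_catalog:
        return 0.0  # no advertiser anywhere: nothing to see
    nearest = min(cosine_distance(conversation_emb, adv)
                  for adv in cached_catalog)
    return max(0.0, 1.0 - nearest / falloff)
```

A dark indicator is the empty-catalog or far-catalog case falling out of the arithmetic, not a policy decision: there is no "is this relevant enough" branch anywhere in the code.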
Phase two: auction. The user taps the indicator. Only then does the full auction fire, inside a trusted execution environment (TEE): score = log(bid) - distance² / σ². Winner selected, second-price payment, result presented. The advertiser pays only for impressions the user asked for.
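The scoring rule can be sketched end to end. This is a toy second-price auction in Python; the data shapes, the σ default, and the lone-bidder reserve handling are my assumptions, and a real implementation would run sealed inside the attested enclave rather than as a plain function:

```python
import math

# Toy sketch of phase two. The scoring rule is the one from the text:
# score = log(bid) - distance^2 / sigma^2. The second-price charge is
# the smallest bid the winner could have made and still won, solved
# from score(price, winner_distance) == runner-up score.

def second_price_auction(bidders, sigma=1.0, reserve=0.01):
    """bidders: list of (advertiser_id, bid, distance_to_conversation).
    Returns (winner_id, price), or None if there are no bidders."""
    def score(bid, dist):
        return math.log(bid) - dist ** 2 / sigma ** 2

    if not bidders:
        return None
    ranked = sorted(bidders, key=lambda x: score(x[1], x[2]), reverse=True)
    winner_id, winner_bid, winner_dist = ranked[0]
    if len(ranked) == 1:
        price = reserve  # a lone bidder pays only the reserve (assumption)
    else:
        runner_up_score = score(ranked[1][1], ranked[1][2])
        # Smallest p with log(p) - d_w^2/sigma^2 == runner-up score:
        price = math.exp(runner_up_score + winner_dist ** 2 / sigma ** 2)
    return winner_id, min(price, winner_bid)
```

Note how the distance penalty interacts with the price: a closer advertiser can win against a higher bid, and what the winner pays depends on how close the runner-up was, so accurate positioning is rewarded twice.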
Phase one is a passive signal: continuous, ambient, no commercial transaction. Phase two is the market: competitive, priced, consented to. The user decides when to cross the boundary. There’s no threshold for the system to game. Just a distance rendered as a brightness, and a person who chooses when to act on it.
The order matters. Traditional advertising: pay money, get shown, hope for relevance. Permission marketing: ask for permission, then try to be relevant. This model: prove relevance, then earn permission, then get shown. The advertiser must position accurately in embedding space, close to the problem they actually solve. The indicator is the proof rendered visually. The user tapping is permission granted on the basis of proven relevance. The auction is the sale, after both conditions are met. Relevance is the prerequisite for an impression, not something optimized after the fact.
Browsing the Ad Space
Each message in the conversation produces a new embedding, a new position in the space. The indicator updates continuously.
The user says “my basement floods every spring.” The dot warms slightly. Waterproofing contractors exist in this region. The user continues: “I think it’s the sewer line.” Brighter. Sewer repair specialists are closer. The user asks: “what about French drains?” The brightness shifts. Each conversational turn moves the user through embedding space, and the indicator reflects what’s nearby at each step.
The user doesn’t have to notice. The natural conversation is the browse. But they could watch the indicator and deliberately explore, asking about alternatives, narrowing their problem, following the signal toward denser regions of expertise. Window shopping in embedding space. No store is entered until the user taps. No auction fires until they ask.
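The browse loop above can be mimicked with a toy embedding. This sketch substitutes a bag-of-words vector for a real embedding model; the vocabulary, the embed function, and the advertiser vector are all invented for illustration. The point is only the shape of the loop: each turn re-embeds the conversation, and the brightness trace follows it through the space:

```python
import numpy as np

# Toy stand-in for a real embedding model (assumption): a normalized
# bag-of-words vector over a tiny vocabulary, just to make the sketch
# self-contained and deterministic.
VOCAB = ["basement", "flood", "sewer", "drain", "french"]

def embed(text: str) -> np.ndarray:
    v = np.array([1.0 if w in text.lower() else 0.0 for w in VOCAB])
    n = np.linalg.norm(v)
    return v / n if n else v

def brightness_trace(messages, advertiser_catalog):
    """Re-embed the conversation after each turn and return the
    indicator brightness (1 - nearest cosine distance) per turn."""
    trace = []
    for i in range(1, len(messages) + 1):
        conv = embed(" ".join(messages[:i]))
        nearest = min((1.0 - float(conv @ adv) for adv in advertiser_catalog),
                      default=1.0)
        trace.append(max(0.0, 1.0 - nearest))
    return trace
```

Run on the flooding-basement conversation from the text against a single "sewer drain" advertiser, the trace brightens turn by turn as the conversation narrows toward the advertiser's region, which is the window-shopping behavior described above.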
Seth Godin argued in Permission Marketing (1999) that advertising should be anticipated, personal, and relevant, and that marketers should seek permission before communicating. Doc Searls extended this in The Intention Economy (2012): buyers broadcast intent to the marketplace, and the market responds. For twenty years, every attempt to build this failed. Yellcast, Intently, Budggy, Ubokia all required users to fill out forms, structure their needs, manage responses. The friction was fatal. Google captured intent implicitly through search queries, with less work.
Nobody implemented permission marketing at the per-impression level until now. Conversational browsing is zero-friction intentcasting. You just talk. Each message casts your intent into the embedding space, and the market responds with what’s nearby. The twenty-year failure was a failure of the interface, not the idea.
The UX Is the Trust
The architecture means nothing if the experience doesn’t feel right.
Pinterest proved this. In a study with MAGNA of 6,200 participants, the same ad shown on a positively perceived platform was rated twice as trustworthy and drove 94% higher purchase intent. The ad didn’t change. The context did.
WeChat mini-programs are the strongest precedent. They process $100 billion in annual transactions with 945 million users, and they architecturally cannot push messages. They can only respond when users initiate. That constraint is the reason the platform works for commerce. Users trust it because the system can’t interrupt them.
The data on voluntary commercial engagement is unambiguous. Amazon Sponsored Products (user already shopping) convert at 9.5% vs general ecommerce at 1.33%, a roughly 7x gap. Google Search converts at 7-13x the rate of Display. Yelp users: 90% purchase within a week, 42% within 24 hours, because they arrived with declared intent. When people enter a commercial context voluntarily because the information serves them, every downstream metric improves by an order of magnitude.
Don Marti argued that perfectly targeted surveillance ads are perfectly worthless. Targeting eliminates the waste that signals a company’s commitment and solvency. A retargeted ad says “this company knows you were on their website.” An embedding-matched indicator in an opt-in context says “this company positioned themselves near your problem.” The second is a credible claim of expertise. The two-phase model restores the signaling value that surveillance targeting destroyed.
A chatbot where a faint signal reflects proximity to expertise, outside the conversation, leading somewhere only when the user taps. That’s a context where commercial information feels like a resource.
The architecture provides the guarantee: attested separation, one-directional data flow, auditable code. But the guarantee is not the trust. Trust is earned over time. The user taps the indicator, gets something useful, and the conversation continues honestly. They ignore it, and nothing changes. Over weeks and months of that consistency, the system earns what no whitepaper can claim. The architecture makes sure the consistency isn’t a lie.
What Could Go Wrong
The indicator itself is commercial. Even as a passive signal, its presence means “this chatbot has an ad system.” A faint dot during a conversation about a medical diagnosis could feel wrong. The indicator’s subtlety helps because it demands no attention, but its existence changes the product. A passive proximity signal almost certainly costs less trust than forced ads. The cost still isn’t zero.
Context leakage. When the user taps the indicator, the conversation embedding enters the auction enclave. The user may have shared sensitive information they’d tell a chatbot but not an ad system. The two-phase model limits this: phase one runs against a cached copy of advertiser embeddings on the publisher’s own infrastructure, so no data reaches the exchange at all. Phase two sends the conversation embedding into a sealed enclave that returns only a winner and a price. The exchange never sees the raw embedding. But “sealed” is only as trustworthy as the attestation infrastructure, and TEE side-channel attacks exist.
Low tap rates. Many users will never tap. High-intent queries may be the only viable inventory. That might be fine because high-intent is where the value is, but it limits the model’s reach.
This is a design, not a deployment. The two-phase mechanism has not been built. No one has shipped per-impression permission marketing in a chatbot. The closest analogue is Google Ad Intents, contextual chips that delay ad loading until click, but those chips are styled as content navigation rather than a commercial consent gate. The individual components exist in production (TEE-attested auctions, embedding infrastructure, confidential computing). The integrated system does not.
Competition is the real check. The platform that deploys the chatbot still chooses the embedding model and designs the indicator’s UX. A platform that makes the indicator too prominent loses users to one that doesn’t. Switching costs between chatbots are near zero. That competitive pressure is the constraint that keeps the system honest, and it requires no engineering.
Written with Claude Opus 4.6 via Claude Code. I directed the argument and framing; Claude researched prior art and drafted prose.
Part of the Vector Space series. The architecture that guarantees the UX isn’t lying is described in Model Blindness.