Dissecting a ChatGPT Web HAR: Architecture, Conversation Flow and Data

Executive Summary

A full ChatGPT web-session HAR shows a layered system:

  • A browser SPA pulling static bundles from OpenAI CDNs
  • A single public-facing BFF/API gateway at chatgpt.com/backend-api/*
  • Multiple ancillary planes (telemetry, experiments, bot mitigation)
  • A security gate (Turnstile + proof-of-work) before sensitive actions
  • A realtime channel bootstrapped via celsius and upgraded to wss://ws.chatgpt.com
  • A tool ecosystem that renders widgets via /ecosystem/widget while executing tools via /ecosystem/call_mcp

Conversation state is hydrated by a single "conversation object" fetch whose node graph (mapping) contains all turns, roles, and message payloads, plus tool metadata (search result groups, citations, async task IDs, invoked resources) when tools were used.

⚠️ Disclaimer

This HAR analysis is not a "source code leak."

It is still enough to reconstruct system boundaries, the runtime contract between UI and backend, the gating model, the realtime plumbing, and the exact shape of conversation state and tool outputs that the client consumes.

For advanced marketers and developers, that contract is the useful part: it reveals how generative interfaces are assembled, where tool results enter the conversation, how widget UIs are sandboxed, and which objects the client treats as canonical truth.

  • A HAR captures only client-side network activity. Server-to-server calls (tool providers, retrieval, vendor APIs, ranking systems) are invisible unless proxied through the client.
  • Names visible in bundles or flags do not prove a separate backend service. Only hosts/endpoints do.
  • Any tokens, IDs, cookies, file IDs, session IDs, and personally identifying data must be removed before publication. Treat this data as sensitive operational telemetry.
  • This writeup describes observable interfaces and flows, not internal implementation guarantees. Interfaces can change without notice.

Part 1: Architecture

1) Planes and boundaries

What the HAR makes unambiguous:

A. Static asset plane (SPA runtime)

  • cdn.oaistatic.com: bulk JS/CSS chunk delivery (the SPA code and UI modules).
  • persistent.oaistatic.com: long-lived/static resources (e.g., onboarding assets).
Interpretation: the browser is executing a modular SPA where most "feature names" exist as bundle code, not as independent backend services.

B. Edge/bot mitigation plane

  • Cloudflare edge termination is visible via Cloudflare response headers and challenge assets.
  • Requests to chatgpt.com/cdn-cgi/challenge-platform/* indicate a Cloudflare challenge script path used for bot mitigation and fingerprinting.
Interpretation: there is a perimeter layer before application semantics. This is independent of the application's own anti-abuse checks.

C. Application API plane (single public BFF/API gateway)

  • Dominant interface: https://chatgpt.com/backend-api/*
  • This behaves like a BFF ("backend-for-frontend"): the client calls one origin and receives normalized objects across identity, settings, conversations, gizmos, subscriptions, connectors, tasks, images, etc.
Interpretation: the browser has a simplified contract. Many backend services may exist, but they are fronted by a single coherent gateway path.

D. Security gate plane (application-layer)

  • POST /backend-api/sentinel/chat-requirements/prepare
  • POST /backend-api/sentinel/chat-requirements/finalize
  • Payloads show Turnstile and proof-of-work requirements in prepare, and a token + expiry in finalize.
Interpretation: beyond Cloudflare, the app enforces per-session/per-action gates that can demand CAPTCHA and compute work before allowing certain operations.
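A minimal sketch of that two-step gate from the client's side, using Python's requests library. Only the prepare → finalize pairing, the Turnstile/proof-of-work requirement categories, and the token + expiry result are taken from the HAR; the key names and payload shapes below are illustrative assumptions, and a real session would also need the browser's auth cookies.

import requests

BASE = "https://chatgpt.com/backend-api"
session = requests.Session()  # assumes the ChatGPT session cookies/headers are already attached

# 1) prepare: the response describes what must be solved before the gated action
prep = session.post(f"{BASE}/sentinel/chat-requirements/prepare").json()
turnstile_req = prep.get("turnstile")    # hypothetical key name; Turnstile challenge parameters
pow_req = prep.get("proofofwork")        # hypothetical key name; proof-of-work puzzle parameters

# 2) the browser solves both requirements (CAPTCHA widget + compute work); not reproduced here
solution = {"turnstile": "<solved>", "proofofwork": "<solved>"}  # placeholder structure

# 3) finalize: exchange the solution for a short-lived token + expiry used on later requests
fin = session.post(f"{BASE}/sentinel/chat-requirements/finalize", json=solution).json()
token, expires = fin.get("token"), fin.get("expires_at")         # hypothetical key names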

E. Realtime plane

  • GET /backend-api/celsius/ws/user returns a websocket_url.
  • Browser upgrades to wss://ws.chatgpt.com/... (101 Switching Protocols).
Interpretation: the realtime channel is not hard-coded; it is dynamically issued by the backend, which allows sharding, migration, and access control.
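The bootstrap step itself is a single authenticated GET. A minimal sketch, assuming the session cookies are present; only the endpoint path and the websocket_url field are taken from the HAR, and the actual wss:// upgrade (101 Switching Protocols) is performed by the browser afterwards.

import requests

session = requests.Session()  # assumes ChatGPT session cookies are present

resp = session.get("https://chatgpt.com/backend-api/celsius/ws/user").json()
ws_url = resp["websocket_url"]   # observed field; expected to point at wss://ws.chatgpt.com/...
print(ws_url)
# Whatever auth the wss:// upgrade itself requires is not visible at this level of the sketch.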

F. Telemetry and experiments plane

  • POST /ces/v1/t (high-frequency event ingestion).
  • POST /ces/statsc/flush (metrics flush).
  • POST https://ab.chatgpt.com/v1/rgstr (experiment registry / assignment; 202 Accepted).
Interpretation: UI behavior and feature exposure are shaped by experimentation and measured continuously.

G. Tool ecosystem plane

  • UI embed: GET /backend-api/ecosystem/widget returns HTML-as-data (an embedded widget document).
  • Tool execution: POST /backend-api/ecosystem/call_mcp invokes tools (example observed: a shopping connector tool get_all_products).
  • Widget runtime loads from a sandbox host: connector_openai_shopping.web-sandbox.oaiusercontent.com (its own JS/CSS).
  • The browser then fetches images from:
    • OpenAI thumbnails/proxy plane: images.openai.com/thumbnails/*
    • Merchant CDNs (direct, client-side): multiple e-commerce/image hosts.
Interpretation: tool output is separated into (1) a sandboxed UI surface and (2) an invocation API. Rendering assets can be a mix of proxied and direct-to-vendor.

H. Blob/content store plane

  • GET /backend-api/estuary/content?id=file-... returns binary (e.g., image/jpeg).
Interpretation: attachments and media referenced in conversation state are retrieved via an internal blob service fronted by the same gateway path.
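Retrieval is a plain authenticated GET against the gateway. A tiny sketch, with the file ID left as a redacted placeholder (real IDs are sensitive):

import requests

session = requests.Session()      # assumes ChatGPT session cookies
FILE_ID = "file-..."              # placeholder; redacted
blob = session.get("https://chatgpt.com/backend-api/estuary/content", params={"id": FILE_ID})
open("attachment.jpg", "wb").write(blob.content)   # e.g. image/jpeg, per the observed response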

2) Architecture diagram

Browser SPA
├── loads JS/CSS from cdn.oaistatic.com
├── passes Cloudflare challenge scripts (cdn-cgi)
├── calls chatgpt.com/backend-api (BFF)
│   ├── Sentinel gate (Turnstile + PoW token)
│   ├── WS bootstrap (celsius) → ws.chatgpt.com (realtime)
│   ├── Conversations, Settings, Gizmos, Subscriptions, Tasks, etc.
│   ├── Estuary content fetch (binary blobs)
│   └── Ecosystem:
│       ├── widget HTML (embed)
│       ├── call_mcp (tool execution)
│       └── sandbox widget app loads from oaiusercontent.com
└── browser fetches thumbnails from images.openai.com and vendor CDNs

3) Interactive architecture visualization

(Interactive diagram, available in the web version.) The visualization lays out the observed components by plane: Browser (user agent); Edge & CDN (Cloudflare, CF challenge, cdn.oaistatic, persistent); the chatgpt.com origin and BFF gateway with its logical services (identity, settings, hints, conversation init/state, gizmos, connectors, Sentinel, Celsius, Estuary, subscriptions, misc); the tool surface (widget, MCP, WebSocket, MCP runtime, model, sandbox); external fetches (images.openai.com, merchant CDNs, gizmo assets); and telemetry (CES, A/B). Nodes are grouped as Client, Edge/CDN, API Gateway, Security, Realtime, Backend, External, and Telemetry.

Part 2: Conversation Flow

1) What "complete conversation flow" means in a HAR

A full HAR often shows:

  • Hydration: the client fetching the entire current conversation object in one request.
  • Tool surfaces: widget fetch + tool invocations if a tool UI is displayed.
  • Realtime hookup: WebSocket bootstrapping and upgrade.

It may not show message-send requests if the HAR was captured on a reload after the conversation already existed, or if filtering omitted XHR/fetch at send-time.

2) Observed chronological phases

Phase A — Page navigation and SPA boot

  • GET /c/<conversation_id> (HTML shell)
  • Pull of SPA chunks from cdn.oaistatic.com

Phase B — Hydration

  • GET /backend-api/conversation/<id> (full conversation graph) and POST /backend-api/conversation/init
  • Parallel fetches of identity, settings, gizmos, connectors, and notifications

Phase C — Security gating and realtime

  • POST /sentinel/chat-requirements/prepare → finalize (Turnstile + proof-of-work, then token + expiry)
  • GET /celsius/ws/user followed by the upgrade to wss://ws.chatgpt.com

Phase D — Tool surface

  • GET /ecosystem/widget, then POST /ecosystem/call_mcp
  • Sandbox widget assets and image fetches (images.openai.com, merchant CDNs)

3) "All turns" timeline diagram

This is how the client models a conversation:

  • The "truth" of the conversation is fetched as a single graph from /conversation/<id>.
  • Tool usage appears inside message metadata (and may also create separate network activity: widget + tool calls).
  • Realtime updates arrive via WebSocket; state is still represented as nodes/messages.

Browser SPA
├── Hydration (/backend-api)
│   ├── GET /conversation/<id>       → conversation graph
│   └── POST /conversation/init      → model_slug + limits
├── Gate + Realtime
│   ├── POST /sentinel/prepare       → turnstile + pow
│   ├── POST /sentinel/finalize      → token + expiry
│   └── WSS upgrade (ws.chatgpt.com) → realtime events
├── Tool Surface (/ecosystem)
│   ├── GET /ecosystem/widget        → HTML payload
│   └── POST /ecosystem/call_mcp     → tool output (JSON)
└── Measurement (telemetry)
    ├── POST /ces/v1/t
    └── POST /v1/rgstr

4) Interactive conversation flow

This visualization shows how a single message flows through ChatGPT's architecture, from user input through model inference to response rendering.

(Interactive diagram, available in the web version.) The flow is laid out in lanes for User Input, Server/API, Sonic, Model, Tools/MCP, Streaming, Data, and Render. Its nodes cover the user message and context assembly, the Sentinel prepare step, the POST to /f/conversation, conversation state, system messages and routing thresholds, hints, model reasoning and response, tool calls via MCP with widget and external fetches, SSE and WebSocket streaming into a buffer, the mapping/metadata/citations written to conversation state, Estuary blobs, moderation and persistence, and final rendering with products, follow-ups, and feedback.

Part 3: Conversation Data

1) The canonical conversation object: what the client consumes

The endpoint GET /backend-api/conversation/<id> returns a structured object that functions as the client's authoritative local cache. Key properties typically observed:

  • title (conversation title)
  • current_node (node ID for the "active" tip of the main branch)
  • mapping (a dictionary of nodes representing the conversation graph)

Each node commonly includes:

  • id
  • parent and children (graph edges; supports edits/regenerations/branches)
  • message object:
    • author.role (user/assistant/tool/system)
    • content (often parts[] containing text blocks)
    • metadata (dense; holds tool traces, model identifiers, citations, etc.)

2) How turns are represented

A "turn" is not a flat list entry; it is a node chain:

User message node → assistant message node → next user message node → …

Regenerations/edits appear as alternative children under the same parent.

The UI can choose which branch is "active" by following current_node back through parents.

3) Where "prompts" exist

In a web UI conversation object:

  • User prompts are plain message content in user-role nodes.
  • Assistant outputs are in assistant-role nodes.
  • System/developer prompts are not necessarily delivered as raw text inside the conversation object; they can be implicit in server policy and only reflected via behavior, model slug, or system message surfaces.

Additional system content surfaces exist via:

  • GET /backend-api/user_system_messages
  • GET /backend-api/system_hints

4) Tool traces inside conversation data

When tools are used, evidence often appears in message metadata, even if server-side details remain opaque. Observed categories:

  • search_result_groups: server-returned groups of web results used by a "search" tool.
  • citations: references the UI can render alongside claims.
  • invoked_resource: identifies which tool/widget was invoked.
  • async_task_id: indicates tool execution may be asynchronous and tracked separately from the message text.

Independently of message metadata, the HAR shows the tool runtime contract (a hedged request sketch follows this list):

  • /ecosystem/widget → returns an HTML document payload for embedding.
  • /ecosystem/call_mcp → executes a named tool (tool_name) against a connector URI (app_uri), passing tool_input that includes a session_id.
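A hedged sketch of that invocation in Python. The app_uri, tool_name, and tool_input.session_id fields mirror what the HAR shows for the shopping connector; the surrounding payload envelope and any additional required headers are assumptions.

import requests

session = requests.Session()  # assumes an authenticated ChatGPT web session

payload = {
    "app_uri": "connectors://connector_openai_shopping",  # observed connector URI
    "tool_name": "get_all_products",                       # observed tool name
    "tool_input": {
        "session_id": "...",                               # observed key; value redacted
    },
}
resp = session.post("https://chatgpt.com/backend-api/ecosystem/call_mcp", json=payload)
tool_output = resp.json()   # structured JSON that feeds the sandboxed widget UI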

5) Data implications for SEO/GEO (mechanics, not folklore)

The HAR-level contract clarifies where generative "answers" can be influenced by external data:

  • Direct web results (search tool): results arrive as structured groups and citations. The UI treats them as first-class objects, not scraped HTML at render time.
  • Tool connectors (shopping): a tool returns structured product objects that drive a widget UI and image fetches. The browser may pull images directly from merchant CDNs, meaning the final visual layer can depend on vendor image hosting and caching behavior.
  • Thumbnail proxying: images.openai.com/thumbnails/* indicates an intermediate image plane for safe/fast rendering, separate from raw vendor image URLs.
  • Conversation persistence: the conversation graph is the persisted substrate; "what happened" is represented as nodes with metadata, not as ephemeral chat bubbles.

6) Minimal schema sketch (for developers)

This is a faithful conceptual model of what the browser treats as the conversation state:

Conversation {
  id
  title
  current_node: NodeId
  mapping: { NodeId: Node }
}

Node {
  id
  parent: NodeId | null
  children: NodeId[]
  message: Message | null
}

Message {
  id
  author: { role: "user" | "assistant" | "system" | "tool" }
  content: { parts: string[] | structured_blocks[] }
  metadata: {
    model_slug?
    citations?
    search_result_groups?
    invoked_resource?
    async_task_id?
    ...feature flags and render hints...
  }
  create_time?
  update_time?
}
Key insight: This is the practical core — everything else (WS, sentinel, widgets) exists to safely produce, update, and render this state object.
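For developers who want to work with this object directly, here is a minimal sketch of the active-branch walk in Python, using the field names from the schema above and assuming the hydration response from GET /backend-api/conversation/<id> has been saved to a JSON file:

def active_branch(conversation: dict) -> list:
    """Return the messages on the active branch, oldest first."""
    mapping = conversation["mapping"]
    node_id = conversation["current_node"]
    turns = []
    # Walk from the active tip back to the root via parent pointers.
    while node_id is not None:
        node = mapping[node_id]
        if node.get("message"):              # root placeholder nodes may carry no message
            turns.append(node["message"])
        node_id = node.get("parent")
    return list(reversed(turns))

# Example use against a saved hydration response:
# import json
# convo = json.load(open("conversation.json"))
# for msg in active_branch(convo):
#     print(msg["author"]["role"], str(msg.get("content", {}).get("parts", ""))[:80])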

Part 4: Methodology

How this HAR analysis was produced.

4.1 Inputs and scope control

The analysis was based on two HAR captures:

  • A smaller capture (chat.har) used to establish the initial system boundary map.
  • A full capture (full.har) used to validate, correct, and extend the picture with missing planes (static assets, Cloudflare challenge, widget sandbox, external image fetches, blob/content retrieval, additional config endpoints).
Working rule: Only claim components and flows that are directly evidenced by URLs, methods, status codes, timing, headers, and payload fragments present in the HAR.

4.2 Step 1 — Inventory: hosts, counts, and dominant paths

The first pass was a mechanical inventory (a minimal script is sketched at the end of this step):

  • Extract all unique host values.
  • Count requests per host and per path prefix.
  • Rank endpoints by frequency and by early appearance in the timeline.

This immediately exposed the system's high-level segmentation:

  • A dominant application API plane (chatgpt.com/backend-api/*)
  • A telemetry plane (chatgpt.com/ces/*)
  • An experiments plane (ab.chatgpt.com/*)
  • CDNs for the SPA (cdn.oaistatic.com, persistent.oaistatic.com)
  • Cloudflare challenge assets (/cdn-cgi/challenge-platform/*)
  • Optional third-party monitoring in the small HAR (Datadog RUM), absent in the full HAR.
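This pass is easy to reproduce with a few lines of Python over the HAR JSON; the filename and the two-segment prefix depth are arbitrary choices, not part of the HAR format.

import json
from collections import Counter
from urllib.parse import urlsplit

with open("full.har", encoding="utf-8") as f:
    entries = json.load(f)["log"]["entries"]

hosts, prefixes = Counter(), Counter()
for e in entries:
    url = urlsplit(e["request"]["url"])
    hosts[url.netloc] += 1
    # Cluster by the first two path segments, e.g. /backend-api/conversation
    prefixes[(url.netloc, "/".join(url.path.split("/")[:3]))] += 1

for host, n in hosts.most_common(10):
    print(f"{n:5d}  {host}")
for (host, prefix), n in prefixes.most_common(15):
    print(f"{n:5d}  {host}{prefix}")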

4.3 Step 2 — Timeline reconstruction (critical path vs. parallel hydration)

Next, requests were sorted by startedDateTime and grouped into phases:

  1. Navigation and asset boot (HTML + JS/CSS chunks).
  2. Parallel hydration burst against /backend-api (identity, settings, conversation state, connectors, gizmos, notifications).
  3. Security gating (Sentinel prepare/finalize).
  4. Realtime bootstrap and WebSocket upgrade (celsius → ws.chatgpt.com).
  5. Tool/widget activity (if present): /ecosystem/widget followed by /ecosystem/call_mcp.
  6. Late hydration (subscriptions, tasks, blob fetches, additional gizmo conversation lists).
Key insight: This phase-based view is what allowed "complete flow" statements without guessing internal service topology.
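A rough illustration of the same bucketing in Python: sort entries by startedDateTime and assign each request a phase from its URL. The prefix-to-phase table is a deliberate simplification of the clustering described in the next step.

import json

entries = json.load(open("full.har", encoding="utf-8"))["log"]["entries"]

# Order matters: more specific prefixes must match before the /backend-api catch-all.
PHASES = [
    ("asset boot", ("cdn.oaistatic.com", "persistent.oaistatic.com")),
    ("security",   ("/backend-api/sentinel/",)),
    ("realtime",   ("/backend-api/celsius/", "ws.chatgpt.com")),
    ("tools",      ("/backend-api/ecosystem/",)),
    ("hydration",  ("/backend-api/",)),
    ("telemetry",  ("/ces/", "ab.chatgpt.com")),
]

def phase_of(url: str) -> str:
    for name, needles in PHASES:
        if any(n in url for n in needles):
            return name
    return "other"

for e in sorted(entries, key=lambda e: e["startedDateTime"]):
    print(e["startedDateTime"], phase_of(e["request"]["url"]), e["request"]["url"][:80])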

4.4 Step 3 — Endpoint clustering into logical services

Endpoints were clustered by namespace and functional semantics:

  • Identity/session: /me, /accounts/check/*
  • User config and feature hints: /settings/user, /system_hints, /user_system_messages, /client/strings, /settings/voices
  • Conversation domain: /conversation/<id>, /conversation/init, /conversations, /stream_status, /textdocs
  • Gizmos/custom GPT surfaces: /gizmos/bootstrap, /gizmos/snorlax/sidebar, /gizmos/<id>/conversations
  • Connector registry and availability: /aip/connectors/*, /connectors/check
  • Security gating: /sentinel/chat-requirements/*
  • Realtime bootstrap: /celsius/ws/user
  • Blob/content retrieval: /estuary/content
  • Commerce and ops: /subscriptions, /tasks, /notifications, /beacons/home, /user_surveys/active
  • Tool ecosystem: /ecosystem/widget, /ecosystem/call_mcp
Methodological constraint: These are "logical services" inferred from stable URL namespaces, not claims about physical microservices.

4.5 Step 4 — Proving realtime and gating (not assumptions)

Two observations were treated as hard proofs because they are protocol-level artifacts:

  • WebSocket upgrade was proven by 101 Switching Protocols to wss://ws.chatgpt.com/...
  • Sentinel gating was proven by the explicit prepare → finalize pair with Turnstile and proof-of-work fields in the payload.
This prevented the common mistake of describing realtime or security as "probably present" rather than "observed."
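Both proofs can be extracted mechanically from the capture; a short sketch, again assuming the HAR file name:

import json

entries = json.load(open("full.har", encoding="utf-8"))["log"]["entries"]

ws_upgrades = [e["request"]["url"] for e in entries if e["response"]["status"] == 101]
sentinel_calls = [e["request"]["url"] for e in entries
                  if "/sentinel/chat-requirements/" in e["request"]["url"]]

print("WebSocket upgrades (101 Switching Protocols):", ws_upgrades)
print("Sentinel gate calls (expect prepare then finalize):", sentinel_calls)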

4.6 Step 5 — Tool ecosystem tracing (widget UI vs. tool execution)

Tool behavior was reconstructed by following a strict causality chain visible in the HAR:

  1. The client requests /backend-api/ecosystem/widget and receives HTML packaged as JSON.
  2. The client loads a sandboxed widget app from connector_openai_shopping.web-sandbox.oaiusercontent.com (its own JS/CSS).
  3. The client invokes tools via /backend-api/ecosystem/call_mcp with:
    • app_uri=connectors://connector_openai_shopping
    • tool_name=get_all_products
    • tool_input.session_id=...
  4. After tool output arrives, the browser fetches images from:
    • images.openai.com/thumbnails/* (proxy/thumbnail plane)
    • merchant CDNs (direct client-side fetches)
Core insight: The separation—embed surface versus execution API—is fully derived from network traces.

4.7 Step 6 — Conversation "all turns" reconstruction from state hydration

The "all turns" view came from the structure of the conversation-state fetch:

GET /backend-api/conversation/<id> returns a node graph (mapping) and a pointer (current_node).

That implies:

  • Turns are nodes, not a flat list
  • Branching/regeneration is represented by multiple children per parent
  • Metadata is the container for tool traces (citations, search result groups, invoked resources, async task IDs)
The HAR did not need to contain "send message" requests to reconstruct the existing conversation's turns; a single state object fetch is sufficient.

4.8 Step 7 — Validation by contradiction (diagram correction loop)

Initial flow diagrams contained internal names (e.g., Sonic, Sonicberry, Mercury, Hive). The methodology for correction was:

  1. Search the HAR for those names as:
    • hosts
    • endpoint prefixes
    • request payload fields
    • response payload fields
    • delivered configuration keys
    • static bundle strings (only as weak evidence)
  2. Promote to "observed component" only if it appears as a host/endpoint or as a config object with clear operational semantics.

Result:

  • "Sonic" was supported only as a UI/announcement flag (hasDismissedSonicSidebar) and bundle strings, not as a callable API surface.
  • "Sonicberry" was not present at all; therefore it could not be asserted as a feature used in that captured session.

4.9 Step 8 — Output artifacts: diagrams and narrative

The final outputs were built directly from the clustered endpoint map and timeline phases:

  • Architecture diagram: planes + boundaries + tool ecosystem.
  • Flow diagrams: hydration, gating, realtime, tool invocation, telemetry.
  • Data model sketch: conversation graph and message metadata.
Every diagram element corresponds to at least one observed request/host/prefix in the HAR, and every "unknown internal service" was kept explicitly behind the /backend-api boundary rather than invented as a separate box.

4.10 Reproducibility checklist

To repeat this work:

  1. Capture HAR during:
    • A cold load of /c/<id>
    • At least one tool invocation (e.g., shopping widget)
    • Optionally one fresh message send (to capture write endpoints)
  2. Enumerate hosts and prefix clusters.
  3. Build a phase timeline (boot → hydration → gate → WS → tools → late hydration).
  4. Identify protocol proofs (101 WS upgrade, gate prepare/finalize).
  5. Extract conversation state schema from /conversation/<id>.
  6. Separate "observed" vs. "inferred" vs. "unseen server-side."

Part 5: Validation of the Study

This section evaluates the claims made in this study against the strength of evidence available from HAR analysis. Every assertion is classified by its evidentiary basis.

5.1 Correct (supported by HAR data)

The following findings are directly observable and verifiable in the HAR:

  • Single BFF/API gateway (/backend-api/*)
    Plausible and consistent with known OpenAI web clients. HAR files show exactly this pattern: a unified frontend contract behind which internal services are hidden.
  • SPA architecture with CDN delivery (cdn.oaistatic.com)
    Factually correct. Matches standard practice and is directly visible in the HAR.
  • Cloudflare as upstream edge and bot mitigation
    Correct. Challenge scripts and headers are hard evidence.
  • Sentinel gating (Turnstile + proof-of-work)
    Correct. The prepare/finalize sequence is an explicit, observable mechanism. No speculation.
  • Dynamic WebSocket bootstrapping (celsius → ws.chatgpt.com)
    Correct. 101 Switching Protocols is a protocol-level proof with no interpretive ambiguity.
  • Conversation object as canonical truth
    Correct. The node-graph structure has been known for a long time and is clearly evidenced by /conversation/<id>.
  • Branching model for regenerations/edits
    Correct. The parent/children structure precisely explains the UI behavior.
  • Tool separation: widget (UI) vs. tool execution (call_mcp)
    Correct. This is one of the strongest points of the study and is cleanly derived from the HAR.
  • Telemetry and experiment plane (/ces, ab.chatgpt.com)
    Correct. Visible, functionally clear, and not overstated.

5.2 Plausible, but not fully provable

The following claims are architecturally sound but cannot be definitively proven from HAR data alone:

  • "Many backends behind one gateway"
    Architecturally highly plausible, but not provable from HAR alone. HAR shows only the gateway, not the internal service topology.
  • Implicit system/developer prompts outside the conversation object
    Plausible. Matches OpenAI design patterns, but HAR only proves that they are not in the object—not where or how they are injected.
  • SEO/GEO implications
    Mechanically described correctly (where structured data enters), but any claims about actual ranking or influence strength remain speculative. The study is mostly disciplined here, but occasionally crosses into interpretation.
  • Sharding/migration as motivation for dynamic WebSocket URLs
    Plausible, but an assumption about intent. HAR shows the what, not the why.

5.3 Overstated or potentially incorrect

The following points require careful interpretation or may overstate what can be concluded from the evidence:

  • "Exact shape of conversation state" as a stable contract
    Overstated. HAR shows a snapshot in time. Although the study notes mutability, it sometimes treats the schema as more stable than is realistically justified.
  • Implicit equation of "client-consumed = system-canonical"
    Partially incorrect. The client receives a canonical client view, not necessarily the complete server-side truth (e.g., moderation layers, hidden state, policy layers).
  • Absence of internal services as negative proof
    Methodologically correct in wording, but risky in reception: "Not visible in the HAR" means not exposed to the client, not nonexistent in the system. The study states this, but readers may overlook it.
  • Tool metadata as a complete representation of tool usage
    Incomplete. Server-side tool orchestration, ranking, retrieval, and post-processing are explicitly invisible and implicitly underweighted.

Part 6: Sources

None.

This study intentionally does not cite external sources. The reason is methodological, not incidental.

The analysis is based exclusively on first-hand empirical observation of client-side network traffic captured in HAR (HTTP Archive) files. All claims are derived directly from observable artifacts: request URLs, hosts, methods, status codes, headers, timing, and payload fragments present in the captured sessions.

HAR analysis is, by definition, a primary-source method. It does not rely on documentation, blog posts, marketing materials, leaked code, or third-party interpretations. Introducing external sources would not increase evidentiary strength and could instead dilute the core methodological guarantee: every assertion is traceable to something that was actually observed on the wire by the client.

Where the HAR does not provide direct evidence, the study either:

  • explicitly marks statements as plausible rather than proven, or
  • refrains from making claims about internal systems, motivations, or server-side logic.

Accordingly, the absence of traditional sources is a feature of the study's design, not a gap.
The HAR itself is the source.

Acknowledgments

Even though there are no formal sources, this contribution is the result of many intensive discussions over the past months.

I would like to express my particular thanks to Alexander Holl for reigniting the joy of learning in his unmistakable way.

To Hanns Kronnenberg for his tireless work, and especially for his generous sharing of knowledge.

To Metehan Yesilyurt, who set me on the right track.

…last but not least,

ChatGPT 5.1 / 5.2, Opus 4.5, and Gemini 3.

Without this support, it would not have been possible for me to conceive and execute this contribution on my own within such a short timeframe.

Try It Yourself

Apply this methodology to your own ChatGPT session. Capture a HAR file from your browser's DevTools and drop it here to explore the conversation structure.


How to capture a HAR: Open DevTools (F12) → Network tab → Load a ChatGPT conversation → Right-click → "Save all as HAR"

Are my HAR data stored when using the tool?

No. No data are stored externally or transmitted.

The HAR processing happens entirely client-side in the browser:

  • No server upload – There are no fetch() or POST requests to external servers.
  • IndexedDB only temporarily – HAR data are held in IndexedDB just long enough to transfer them between pages and are deleted automatically after a single use.
  • Pure JavaScript analysis – Parsing and analysis are performed entirely within the user's browser.

The HAR file therefore never leaves your local machine.

Validate this study using an AI model.

Use ChatGPT, Gemini, or another model to critically assess the claims, assumptions, and conclusions based on the evidence presented.