Dissecting a ChatGPT Web HAR: Architecture, Conversation Flow and Data

Executive Summary

A full ChatGPT web-session HAR shows a layered system:

  • A browser SPA pulling static bundles from OpenAI CDNs
  • A single public-facing BFF/API gateway at chatgpt.com/backend-api/*
  • Multiple ancillary planes (telemetry, experiments, bot mitigation)
  • A security gate (Turnstile + proof-of-work) before sensitive actions
  • A realtime channel bootstrapped via celsius and upgraded to wss://ws.chatgpt.com
  • A tool ecosystem that renders widgets via /ecosystem/widget while executing tools via /ecosystem/call_mcp

Conversation state is hydrated by a single "conversation object" fetch whose node graph (mapping) contains all turns, roles, and message payloads, plus tool metadata (search result groups, citations, async task IDs, invoked resources) when tools were used.

⚠️ Disclaimer

This HAR analysis is not a "source code leak."

It is still enough to reconstruct system boundaries, the runtime contract between UI and backend, the gating model, the realtime plumbing, and the exact shape of conversation state and tool outputs that the client consumes.

For advanced marketers and developers, that contract is the useful part: it reveals how generative interfaces are assembled, where tool results enter the conversation, how widget UIs are sandboxed, and which objects the client treats as canonical truth.

  • A HAR captures only client-side network activity. Server-to-server calls (tool providers, retrieval, vendor APIs, ranking systems) are invisible unless proxied through the client.
  • Names visible in bundles or flags do not prove a separate backend service. Only hosts/endpoints do.
  • Any tokens, IDs, cookies, file IDs, session IDs, and personally identifying data must be removed before publication. Treat this data as sensitive operational telemetry.
  • This writeup describes observable interfaces and flows, not internal implementation guarantees. Interfaces can change without notice.

Part 1: Architecture

1) Planes and boundaries

What the HAR makes unambiguous:

A. Static asset plane (SPA runtime)

  • cdn.oaistatic.com: bulk JS/CSS chunk delivery (the SPA code and UI modules).
  • persistent.oaistatic.com: long-lived/static resources (e.g., onboarding assets).
Interpretation: the browser is executing a modular SPA where most "feature names" exist as bundle code, not as independent backend services.

B. Edge/bot mitigation plane

  • Cloudflare edge termination is visible via Cloudflare response headers and challenge assets.
  • Requests to chatgpt.com/cdn-cgi/challenge-platform/* indicate a Cloudflare challenge script path used for bot mitigation and fingerprinting.
Interpretation: there is a perimeter layer before application semantics. This is independent of the application's own anti-abuse checks.

C. Application API plane (single public BFF/API gateway)

  • Dominant interface: https://chatgpt.com/backend-api/*
  • This behaves like a BFF ("backend-for-frontend"): the client calls one origin and receives normalized objects across identity, settings, conversations, gizmos, subscriptions, connectors, tasks, images, etc.
Interpretation: the browser has a simplified contract. Many backend services may exist, but they are fronted by a single coherent gateway path.

D. Security gate plane (application-layer)

  • POST /backend-api/sentinel/chat-requirements/prepare
  • POST /backend-api/sentinel/chat-requirements/finalize
  • Payloads show Turnstile and proof-of-work requirements in prepare, and a token + expiry in finalize.
Interpretation: beyond Cloudflare, the app enforces per-session/per-action gates that can demand CAPTCHA and compute work before allowing certain operations.
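A minimal sketch of that two-step gate from the client's side, using Python's requests library. Only the prepare → finalize pairing, the Turnstile/proof-of-work requirement categories, and the token + expiry result are taken from the HAR; the key names and payload shapes below are illustrative assumptions, and a real session would also need the browser's auth cookies.

import requests

BASE = "https://chatgpt.com/backend-api"
session = requests.Session()  # assumes the ChatGPT session cookies/headers are already attached

# 1) prepare: the response describes what must be solved before the gated action
prep = session.post(f"{BASE}/sentinel/chat-requirements/prepare").json()
turnstile_req = prep.get("turnstile")    # hypothetical key name; Turnstile challenge parameters
pow_req = prep.get("proofofwork")        # hypothetical key name; proof-of-work puzzle parameters

# 2) the browser solves both requirements (CAPTCHA widget + compute work); not reproduced here
solution = {"turnstile": "<solved>", "proofofwork": "<solved>"}  # placeholder structure

# 3) finalize: exchange the solution for a short-lived token + expiry used on later requests
fin = session.post(f"{BASE}/sentinel/chat-requirements/finalize", json=solution).json()
token, expires = fin.get("token"), fin.get("expires_at")         # hypothetical key names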

E. Realtime plane

  • GET /backend-api/celsius/ws/user returns a websocket_url.
  • Browser upgrades to wss://ws.chatgpt.com/... (101 Switching Protocols).
Interpretation: the realtime channel is not hard-coded; it is dynamically issued by the backend, which allows sharding, migration, and access control.
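The bootstrap step itself is a single authenticated GET. A minimal sketch, assuming the session cookies are present; only the endpoint path and the websocket_url field are taken from the HAR, and the actual wss:// upgrade (101 Switching Protocols) is performed by the browser afterwards.

import requests

session = requests.Session()  # assumes ChatGPT session cookies are present

resp = session.get("https://chatgpt.com/backend-api/celsius/ws/user").json()
ws_url = resp["websocket_url"]   # observed field; expected to point at wss://ws.chatgpt.com/...
print(ws_url)
# Whatever auth the wss:// upgrade itself requires is not visible at this level of the sketch.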

F. Telemetry and experiments plane

  • POST /ces/v1/t (high-frequency event ingestion).
  • POST /ces/statsc/flush (metrics flush).
  • POST https://ab.chatgpt.com/v1/rgstr (experiment registry / assignment; 202 Accepted).
Interpretation: UI behavior and feature exposure are shaped by experimentation and measured continuously.

G. Tool ecosystem plane

  • UI embed: GET /backend-api/ecosystem/widget returns HTML-as-data (an embedded widget document).
  • Tool execution: POST /backend-api/ecosystem/call_mcp invokes tools (example observed: a shopping connector tool get_all_products).
  • Widget runtime loads from a sandbox host: connector_openai_shopping.web-sandbox.oaiusercontent.com (its own JS/CSS).
  • The browser then fetches images from:
    • OpenAI thumbnails/proxy plane: images.openai.com/thumbnails/*
    • Merchant CDNs (direct, client-side): multiple e-commerce/image hosts.
Interpretation: tool output is separated into (1) a sandboxed UI surface and (2) an invocation API. Rendering assets can be a mix of proxied and direct-to-vendor.

H. Blob/content store plane

  • GET /backend-api/estuary/content?id=file-... returns binary (e.g., image/jpeg).
Interpretation: attachments and media referenced in conversation state are retrieved via an internal blob service fronted by the same gateway path.
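Retrieval is a plain authenticated GET against the gateway. A tiny sketch, with the file ID left as a redacted placeholder (real IDs are sensitive):

import requests

session = requests.Session()      # assumes ChatGPT session cookies
FILE_ID = "file-..."              # placeholder; redacted
blob = session.get("https://chatgpt.com/backend-api/estuary/content", params={"id": FILE_ID})
open("attachment.jpg", "wb").write(blob.content)   # e.g. image/jpeg, per the observed response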

2) Architecture diagram

Browser SPA
├── loads JS/CSS from cdn.oaistatic.com
├── passes Cloudflare challenge scripts (cdn-cgi)
├── calls chatgpt.com/backend-api (BFF)
│   ├── Sentinel gate (Turnstile + PoW token)
│   ├── WS bootstrap (celsius) → ws.chatgpt.com (realtime)
│   ├── Conversations, Settings, Gizmos, Subscriptions, Tasks, etc.
│   ├── Estuary content fetch (binary blobs)
│   └── Ecosystem:
│       ├── widget HTML (embed)
│       ├── call_mcp (tool execution)
│       └── sandbox widget app loads from oaiusercontent.com
└── browser fetches thumbnails from images.openai.com and vendor CDNs

3) Interactive architecture visualization

(Interactive diagram, available in the web version.) The visualization lays out the observed components by plane: Browser (user agent); Edge & CDN (Cloudflare, CF challenge, cdn.oaistatic, persistent); the chatgpt.com origin and BFF gateway with its logical services (identity, settings, hints, conversation init/state, gizmos, connectors, Sentinel, Celsius, Estuary, subscriptions, misc); the tool surface (widget, MCP, WebSocket, MCP runtime, model, sandbox); external fetches (images.openai.com, merchant CDNs, gizmo assets); and telemetry (CES, A/B). Nodes are grouped as Client, Edge/CDN, API Gateway, Security, Realtime, Backend, External, and Telemetry.

Part 2: Conversation Flow

1) What "complete conversation flow" means in a HAR

A full HAR often shows:

  • Hydration: the client fetching the entire current conversation object in one request.
  • Tool surfaces: widget fetch + tool invocations if a tool UI is displayed.
  • Realtime hookup: WebSocket bootstrapping and upgrade.

It may not show message-send requests if the HAR was captured on a reload after the conversation already existed, or if filtering omitted XHR/fetch at send-time.

2) Observed chronological phases

Phase A — Page navigation and SPA boot

  • GET /c/<conversation_id> (HTML shell)
  • Pull of SPA chunks from cdn.oaistatic.com

Phase B — Hydration

  • GET /backend-api/conversation/<id> (full conversation graph) and POST /backend-api/conversation/init
  • Parallel fetches of identity, settings, gizmos, connectors, and notifications

Phase C — Security gating and realtime

  • POST /sentinel/chat-requirements/prepare → finalize (Turnstile + proof-of-work, then token + expiry)
  • GET /celsius/ws/user followed by the upgrade to wss://ws.chatgpt.com

Phase D — Tool surface

  • GET /ecosystem/widget, then POST /ecosystem/call_mcp
  • Sandbox widget assets and image fetches (images.openai.com, merchant CDNs)

3) "All turns" timeline diagram

This is how the client models a conversation:

  • The "truth" of the conversation is fetched as a single graph from /conversation/<id>.
  • Tool usage appears inside message metadata (and may also create separate network activity: widget + tool calls).
  • Realtime updates arrive via WebSocket; state is still represented as nodes/messages.

Browser SPA
├── Hydration (/backend-api)
│   ├── GET /conversation/<id>       → conversation graph
│   └── POST /conversation/init      → model_slug + limits
├── Gate + Realtime
│   ├── POST /sentinel/prepare       → turnstile + pow
│   ├── POST /sentinel/finalize      → token + expiry
│   └── WSS upgrade (ws.chatgpt.com) → realtime events
├── Tool Surface (/ecosystem)
│   ├── GET /ecosystem/widget        → HTML payload
│   └── POST /ecosystem/call_mcp     → tool output (JSON)
└── Measurement (telemetry)
    ├── POST /ces/v1/t
    └── POST /v1/rgstr

4) Interactive conversation flow

This visualization shows how a single message flows through ChatGPT's architecture, from user input through model inference to response rendering.

(Interactive diagram, available in the web version.) The flow is laid out in lanes for User Input, Server/API, Sonic, Model, Tools/MCP, Streaming, Data, and Render. Its nodes cover the user message and context assembly, the Sentinel prepare step, the POST to /f/conversation, conversation state, system messages and routing thresholds, hints, model reasoning and response, tool calls via MCP with widget and external fetches, SSE and WebSocket streaming into a buffer, the mapping/metadata/citations written to conversation state, Estuary blobs, moderation and persistence, and final rendering with products, follow-ups, and feedback.

Part 3: Conversation Data

1) The canonical conversation object: what the client consumes

The endpoint GET /backend-api/conversation/<id> returns a structured object that functions as the client's authoritative local cache. Key properties typically observed:

  • title (conversation title)
  • current_node (node ID for the "active" tip of the main branch)
  • mapping (a dictionary of nodes representing the conversation graph)

Each node commonly includes:

  • id
  • parent and children (graph edges; supports edits/regenerations/branches)
  • message object:
    • author.role (user/assistant/tool/system)
    • content (often parts[] containing text blocks)
    • metadata (dense; holds tool traces, model identifiers, citations, etc.)

2) How turns are represented

A "turn" is not a flat list entry; it is a node chain:

User message node → assistant message node → next user message node → …

Regenerations/edits appear as alternative children under the same parent.

The UI can choose which branch is "active" by following current_node back through parents.

3) Where "prompts" exist

In a web UI conversation object:

  • User prompts are plain message content in user-role nodes.
  • Assistant outputs are in assistant-role nodes.
  • System/developer prompts are not necessarily delivered as raw text inside the conversation object; they can be implicit in server policy and only reflected via behavior, model slug, or system message surfaces.

Additional system content surfaces exist via:

  • GET /backend-api/user_system_messages
  • GET /backend-api/system_hints

4) Tool traces inside conversation data

When tools are used, evidence often appears in message metadata, even if server-side details remain opaque. Observed categories:

  • search_result_groups: server-returned groups of web results used by a "search" tool.
  • citations: references the UI can render alongside claims.
  • invoked_resource: identifies which tool/widget was invoked.
  • async_task_id: indicates tool execution may be asynchronous and tracked separately from the message text.

Independently of message metadata, the HAR shows the tool runtime contract (a hedged request sketch follows this list):

  • /ecosystem/widget → returns an HTML document payload for embedding.
  • /ecosystem/call_mcp → executes a named tool (tool_name) against a connector URI (app_uri), passing tool_input that includes a session_id.
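A hedged sketch of that invocation in Python. The app_uri, tool_name, and tool_input.session_id fields mirror what the HAR shows for the shopping connector; the surrounding payload envelope and any additional required headers are assumptions.

import requests

session = requests.Session()  # assumes an authenticated ChatGPT web session

payload = {
    "app_uri": "connectors://connector_openai_shopping",  # observed connector URI
    "tool_name": "get_all_products",                       # observed tool name
    "tool_input": {
        "session_id": "...",                               # observed key; value redacted
    },
}
resp = session.post("https://chatgpt.com/backend-api/ecosystem/call_mcp", json=payload)
tool_output = resp.json()   # structured JSON that feeds the sandboxed widget UI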

5) Data implications for SEO/GEO (mechanics, not folklore)

The HAR-level contract clarifies where generative "answers" can be influenced by external data:

  • Direct web results (search tool): results arrive as structured groups and citations. The UI treats them as first-class objects, not scraped HTML at render time.
  • Tool connectors (shopping): a tool returns structured product objects that drive a widget UI and image fetches. The browser may pull images directly from merchant CDNs, meaning the final visual layer can depend on vendor image hosting and caching behavior.
  • Thumbnail proxying: images.openai.com/thumbnails/* indicates an intermediate image plane for safe/fast rendering, separate from raw vendor image URLs.
  • Conversation persistence: the conversation graph is the persisted substrate; "what happened" is represented as nodes with metadata, not as ephemeral chat bubbles.

6) Minimal schema sketch (for developers)

This is a faithful conceptual model of what the browser treats as the conversation state:

Conversation {
  id
  title
  current_node: NodeId
  mapping: { NodeId: Node }
}

Node {
  id
  parent: NodeId | null
  children: NodeId[]
  message: Message | null
}

Message {
  id
  author: { role: "user" | "assistant" | "system" | "tool" }
  content: { parts: string[] | structured_blocks[] }
  metadata: {
    model_slug?
    citations?
    search_result_groups?
    invoked_resource?
    async_task_id?
    ...feature flags and render hints...
  }
  create_time?
  update_time?
}
Key insight: This is the practical core — everything else (WS, sentinel, widgets) exists to safely produce, update, and render this state object.
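For developers who want to work with this object directly, here is a minimal sketch of the active-branch walk in Python, using the field names from the schema above and assuming the hydration response from GET /backend-api/conversation/<id> has been saved to a JSON file:

def active_branch(conversation: dict) -> list:
    """Return the messages on the active branch, oldest first."""
    mapping = conversation["mapping"]
    node_id = conversation["current_node"]
    turns = []
    # Walk from the active tip back to the root via parent pointers.
    while node_id is not None:
        node = mapping[node_id]
        if node.get("message"):              # root placeholder nodes may carry no message
            turns.append(node["message"])
        node_id = node.get("parent")
    return list(reversed(turns))

# Example use against a saved hydration response:
# import json
# convo = json.load(open("conversation.json"))
# for msg in active_branch(convo):
#     print(msg["author"]["role"], str(msg.get("content", {}).get("parts", ""))[:80])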

Part 4: Methodology

How this HAR analysis was produced.

4.1 Inputs and scope control

The analysis was based on two HAR captures:

  • A smaller capture (chat.har) used to establish the initial system boundary map.
  • A full capture (full.har) used to validate, correct, and extend the picture with missing planes (static assets, Cloudflare challenge, widget sandbox, external image fetches, blob/content retrieval, additional config endpoints).
Working rule: Only claim components and flows that are directly evidenced by URLs, methods, status codes, timing, headers, and payload fragments present in the HAR.

4.2 Step 1 — Inventory: hosts, counts, and dominant paths

The first pass was a mechanical inventory (a minimal script is sketched at the end of this step):

  • Extract all unique host values.
  • Count requests per host and per path prefix.
  • Rank endpoints by frequency and by early appearance in the timeline.

This immediately exposed the system's high-level segmentation:

  • A dominant application API plane (chatgpt.com/backend-api/*)
  • A telemetry plane (chatgpt.com/ces/*)
  • An experiments plane (ab.chatgpt.com/*)
  • CDNs for the SPA (cdn.oaistatic.com, persistent.oaistatic.com)
  • Cloudflare challenge assets (/cdn-cgi/challenge-platform/*)
  • Optional third-party monitoring in the small HAR (Datadog RUM), absent in the full HAR.
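This pass is easy to reproduce with a few lines of Python over the HAR JSON; the filename and the two-segment prefix depth are arbitrary choices, not part of the HAR format.

import json
from collections import Counter
from urllib.parse import urlsplit

with open("full.har", encoding="utf-8") as f:
    entries = json.load(f)["log"]["entries"]

hosts, prefixes = Counter(), Counter()
for e in entries:
    url = urlsplit(e["request"]["url"])
    hosts[url.netloc] += 1
    # Cluster by the first two path segments, e.g. /backend-api/conversation
    prefixes[(url.netloc, "/".join(url.path.split("/")[:3]))] += 1

for host, n in hosts.most_common(10):
    print(f"{n:5d}  {host}")
for (host, prefix), n in prefixes.most_common(15):
    print(f"{n:5d}  {host}{prefix}")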

4.3 Step 2 — Timeline reconstruction (critical path vs. parallel hydration)

Next, requests were sorted by startedDateTime and grouped into phases:

  1. Navigation and asset boot (HTML + JS/CSS chunks).
  2. Parallel hydration burst against /backend-api (identity, settings, conversation state, connectors, gizmos, notifications).
  3. Security gating (Sentinel prepare/finalize).
  4. Realtime bootstrap and WebSocket upgrade (celsius → ws.chatgpt.com).
  5. Tool/widget activity (if present): /ecosystem/widget followed by /ecosystem/call_mcp.
  6. Late hydration (subscriptions, tasks, blob fetches, additional gizmo conversation lists).
Key insight: This phase-based view is what allowed "complete flow" statements without guessing internal service topology.
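A rough illustration of the same bucketing in Python: sort entries by startedDateTime and assign each request a phase from its URL. The prefix-to-phase table is a deliberate simplification of the clustering described in the next step.

import json

entries = json.load(open("full.har", encoding="utf-8"))["log"]["entries"]

# Order matters: more specific prefixes must match before the /backend-api catch-all.
PHASES = [
    ("asset boot", ("cdn.oaistatic.com", "persistent.oaistatic.com")),
    ("security",   ("/backend-api/sentinel/",)),
    ("realtime",   ("/backend-api/celsius/", "ws.chatgpt.com")),
    ("tools",      ("/backend-api/ecosystem/",)),
    ("hydration",  ("/backend-api/",)),
    ("telemetry",  ("/ces/", "ab.chatgpt.com")),
]

def phase_of(url: str) -> str:
    for name, needles in PHASES:
        if any(n in url for n in needles):
            return name
    return "other"

for e in sorted(entries, key=lambda e: e["startedDateTime"]):
    print(e["startedDateTime"], phase_of(e["request"]["url"]), e["request"]["url"][:80])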

4.4 Step 3 — Endpoint clustering into logical services

Endpoints were clustered by namespace and functional semantics:

  • Identity/session: /me, /accounts/check/*
  • User config and feature hints: /settings/user, /system_hints, /user_system_messages, /client/strings, /settings/voices
  • Conversation domain: /conversation/<id>, /conversation/init, /conversations, /stream_status, /textdocs
  • Gizmos/custom GPT surfaces: /gizmos/bootstrap, /gizmos/snorlax/sidebar, /gizmos/<id>/conversations
  • Connector registry and availability: /aip/connectors/*, /connectors/check
  • Security gating: /sentinel/chat-requirements/*
  • Realtime bootstrap: /celsius/ws/user
  • Blob/content retrieval: /estuary/content
  • Commerce and ops: /subscriptions, /tasks, /notifications, /beacons/home, /user_surveys/active
  • Tool ecosystem: /ecosystem/widget, /ecosystem/call_mcp
Methodological constraint: These are "logical services" inferred from stable URL namespaces, not claims about physical microservices.

4.5 Step 4 — Proving realtime and gating (not assumptions)

Two observations were treated as hard proofs because they are protocol-level artifacts:

  • WebSocket upgrade was proven by 101 Switching Protocols to wss://ws.chatgpt.com/...
  • Sentinel gating was proven by the explicit prepare → finalize pair with Turnstile and proof-of-work fields in the payload.
This prevented the common mistake of describing realtime or security as "probably present" rather than "observed."
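Both proofs can be extracted mechanically from the capture; a short sketch, again assuming the HAR file name:

import json

entries = json.load(open("full.har", encoding="utf-8"))["log"]["entries"]

ws_upgrades = [e["request"]["url"] for e in entries if e["response"]["status"] == 101]
sentinel_calls = [e["request"]["url"] for e in entries
                  if "/sentinel/chat-requirements/" in e["request"]["url"]]

print("WebSocket upgrades (101 Switching Protocols):", ws_upgrades)
print("Sentinel gate calls (expect prepare then finalize):", sentinel_calls)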

4.6 Step 5 — Tool ecosystem tracing (widget UI vs. tool execution)

Tool behavior was reconstructed by following a strict causality chain visible in the HAR:

  1. The client requests /backend-api/ecosystem/widget and receives HTML packaged as JSON.
  2. The client loads a sandboxed widget app from connector_openai_shopping.web-sandbox.oaiusercontent.com (its own JS/CSS).
  3. The client invokes tools via /backend-api/ecosystem/call_mcp with:
    • app_uri=connectors://connector_openai_shopping
    • tool_name=get_all_products
    • tool_input.session_id=...
  4. After tool output arrives, the browser fetches images from:
    • images.openai.com/thumbnails/* (proxy/thumbnail plane)
    • merchant CDNs (direct client-side fetches)
Core insight: The separation—embed surface versus execution API—is fully derived from network traces.

4.7 Step 6 — Conversation "all turns" reconstruction from state hydration

The "all turns" view came from the structure of the conversation-state fetch:

GET /backend-api/conversation/<id> returns a node graph (mapping) and a pointer (current_node).

That implies:

  • Turns are nodes, not a flat list
  • Branching/regeneration is represented by multiple children per parent
  • Metadata is the container for tool traces (citations, search result groups, invoked resources, async task IDs)
The HAR did not need to contain "send message" requests to reconstruct the existing conversation's turns; a single state object fetch is sufficient.

4.8 Step 7 — Validation by contradiction (diagram correction loop)

Initial flow diagrams contained internal names (e.g., Sonic, Sonicberry, Mercury, Hive). The methodology for correction was:

  1. Search the HAR for those names as:
    • hosts
    • endpoint prefixes
    • request payload fields
    • response payload fields
    • delivered configuration keys
    • static bundle strings (only as weak evidence)
  2. Promote to "observed component" only if it appears as a host/endpoint or as a config object with clear operational semantics.

Result:

  • "Sonic" was supported only as a UI/announcement flag (hasDismissedSonicSidebar) and bundle strings, not as a callable API surface.
  • "Sonicberry" was not present at all; therefore it could not be asserted as a feature used in that captured session.

4.9 Step 8 — Output artifacts: diagrams and narrative

The final outputs were built directly from the clustered endpoint map and timeline phases:

  • Architecture diagram: planes + boundaries + tool ecosystem.
  • Flow diagrams: hydration, gating, realtime, tool invocation, telemetry.
  • Data model sketch: conversation graph and message metadata.
Every diagram element corresponds to at least one observed request/host/prefix in the HAR, and every "unknown internal service" was kept explicitly behind the /backend-api boundary rather than invented as a separate box.

4.10 Reproducibility checklist

To repeat this work:

  1. Capture HAR during:
    • A cold load of /c/<id>
    • At least one tool invocation (e.g., shopping widget)
    • Optionally one fresh message send (to capture write endpoints)
  2. Enumerate hosts and prefix clusters.
  3. Build a phase timeline (boot → hydration → gate → WS → tools → late hydration).
  4. Identify protocol proofs (101 WS upgrade, gate prepare/finalize).
  5. Extract conversation state schema from /conversation/<id>.
  6. Separate "observed" vs. "inferred" vs. "unseen server-side."

Part 5: Validation of the Study

This section evaluates the claims made in this study against the strength of evidence available from HAR analysis. Every assertion is classified by its evidentiary basis.

5.1 Correct (supported by HAR data)

The following findings are directly observable and verifiable in the HAR:

  • Single BFF/API gateway (/backend-api/*)
    Plausible and consistent with known OpenAI web clients. HAR files show exactly this pattern: a unified frontend contract behind which internal services are hidden.
  • SPA architecture with CDN delivery (cdn.oaistatic.com)
    Factually correct. Matches standard practice and is directly visible in the HAR.
  • Cloudflare as upstream edge and bot mitigation
    Correct. Challenge scripts and headers are hard evidence.
  • Sentinel gating (Turnstile + proof-of-work)
    Correct. The prepare/finalize sequence is an explicit, observable mechanism. No speculation.
  • Dynamic WebSocket bootstrapping (celsius → ws.chatgpt.com)
    Correct. 101 Switching Protocols is a protocol-level proof with no interpretive ambiguity.
  • Conversation object as canonical truth
    Correct. The node-graph structure has been known for a long time and is clearly evidenced by /conversation/<id>.
  • Branching model for regenerations/edits
    Correct. The parent/children structure precisely explains the UI behavior.
  • Tool separation: widget (UI) vs. tool execution (call_mcp)
    Correct. This is one of the strongest points of the study and is cleanly derived from the HAR.
  • Telemetry and experiment plane (/ces, ab.chatgpt.com)
    Correct. Visible, functionally clear, and not overstated.

5.2 Plausible, but not fully provable

The following claims are architecturally sound but cannot be definitively proven from HAR data alone:

  • "Many backends behind one gateway"
    Architecturally highly plausible, but not provable from HAR alone. HAR shows only the gateway, not the internal service topology.
  • Implicit system/developer prompts outside the conversation object
    Plausible. Matches OpenAI design patterns, but HAR only proves that they are not in the object—not where or how they are injected.
  • SEO/GEO implications
    Mechanically described correctly (where structured data enters), but any claims about actual ranking or influence strength remain speculative. The study is mostly disciplined here, but occasionally crosses into interpretation.
  • Sharding/migration as motivation for dynamic WebSocket URLs
    Plausible, but an assumption about intent. HAR shows the what, not the why.

5.3 Overstated or potentially incorrect

The following points require careful interpretation or may overstate what can be concluded from the evidence:

  • "Exact shape of conversation state" as a stable contract
    Overstated. HAR shows a snapshot in time. Although the study notes mutability, it sometimes treats the schema as more stable than is realistically justified.
  • Implicit equation of "client-consumed = system-canonical"
    Partially incorrect. The client receives a canonical client view, not necessarily the complete server-side truth (e.g., moderation layers, hidden state, policy layers).
  • Absence of internal services as negative proof
    Methodologically correct in wording, but risky in reception: "Not visible in the HAR" means not exposed to the client, not nonexistent in the system. The study states this, but readers may overlook it.
  • Tool metadata as a complete representation of tool usage
    Incomplete. Server-side tool orchestration, ranking, retrieval, and post-processing are explicitly invisible and implicitly underweighted.

Part 6: Sources

None.

This study intentionally does not cite external sources. The reason is methodological, not incidental.

The analysis is based exclusively on first-hand empirical observation of client-side network traffic captured in HAR (HTTP Archive) files. All claims are derived directly from observable artifacts: request URLs, hosts, methods, status codes, headers, timing, and payload fragments present in the captured sessions.

HAR analysis is, by definition, a primary-source method. It does not rely on documentation, blog posts, marketing materials, leaked code, or third-party interpretations. Introducing external sources would not increase evidentiary strength and could instead dilute the core methodological guarantee: every assertion is traceable to something that was actually observed on the wire by the client.

Where the HAR does not provide direct evidence, the study either:

  • explicitly marks statements as plausible rather than proven, or
  • refrains from making claims about internal systems, motivations, or server-side logic.

Accordingly, the absence of traditional sources is a feature of the study's design, not a gap.
The HAR itself is the source.

Acknowledgments

Even though there are no formal sources, this contribution is the result of many intensive discussions over the past months.

I would like to express my particular thanks to Alexander Holl for reigniting the joy of learning in his unmistakable way.

To Hanns Kronnenberg for his tireless work, and especially for his generous sharing of knowledge.

To Metehan Yesilyurt, who set me on the right track.

…last but not least,

ChatGPT 5.1 / 5.2, Opus 4.5, and Gemini 3.

Without this support, it would not have been possible for me to conceive and execute this contribution on my own within such a short timeframe.

Try It Yourself

Apply this methodology to your own ChatGPT session. Capture a HAR file from your browser's DevTools and drop it here to explore the conversation structure.


How to capture a HAR: Open DevTools (F12) → Network tab → Load a ChatGPT conversation → Right-click → "Save all as HAR"

Are my HAR data stored when using the tool?

No. No data are stored externally or transmitted.

The HAR processing happens entirely client-side in the browser:

  • No server upload – There are no fetch() or POST requests to external servers.
  • IndexedDB only temporarily – HAR data are held in IndexedDB just long enough to transfer them between pages and are deleted automatically after a single use.
  • Pure JavaScript analysis – Parsing and analysis are performed entirely within the user's browser.

The HAR file therefore never leaves your local machine.

Validate this study using an AI model.

Use ChatGPT, Gemini, or another model to critically assess the claims, assumptions, and conclusions based on the evidence presented.