ION AI Data Access Platform
The platform should not be a giant natural-language SQL machine. It should be a governed access layer where agents, chat interfaces, and internal tools use approved business-level tools with identity, permissions, citations, and audit logs built in.
One governed front door. Many controlled data tools.
Claude, ChatGPT, internal agents, and chat apps should not each invent their own database path. They should call the same governed tool layer, and that layer should decide which source API, MCP, or curated analytics path is allowed.
Important correction to the original plan: SQL is still useful, but it should be a controlled analytics capability inside the system. The platform should not depend on the LLM understanding every raw table and column to answer business questions.
Use managed platforms to move faster without giving up control.
Google Cloud and Vanna can help the team get live faster, but they should sit behind the same governed access model. They accelerate analytics, retrieval, evaluation, and chat UX. They do not replace the MCP Gateway or the business-level tool catalog.
Zach's MCP Gateway
Owns identity, tool allowlists, client/location scope, audit decisions, and source routing for every interface.
Google Cloud + Vanna
Provides managed BigQuery MCP, Conversational Analytics, retrieval/search infrastructure, Vanna chat UX, golden query memory, tracing, and evaluation support.
Faster, safer answers
Agents answer with citations, freshness, permission checks, and consistent metric ownership instead of improvising over raw data.
Each source gets a controlled access path.
The original document correctly listed many data sources, but the access pattern should shift from raw warehouse querying to source-specific APIs, MCP tools, and curated analytics surfaces.
| Source | Primary Agent Access | Use It For | Warehouse Role | Notes |
|---|---|---|---|---|
| Comm Manager | Existing API + MCP | Calls, VTT transcripts, recaps, client emails, phone calls, agendas, customer timeline, meeting-created tasks. | Curated extracts only if needed. | No direct agent access to production Postgres. Use API/MCP tools for permissioned business operations. |
| Agency Framework | GraphQL API + MCP | Aggregated performance metrics, normalized client/account performance summaries. | Can feed BigQuery for snapshots or trend reporting. | Confirm whether this is the preferred source for rollups before duplicating metric logic elsewhere. |
| SearchAtlas | API + MCP | GBP heatmaps/geogrids, KRT/SERP tracking, keyword/rank data, LLM visibility, SEO metrics, local search performance. | Historical snapshots or fallback extracts. | Keep live SEO/local data routed to SearchAtlas unless a specific report needs warehouse history. |
| Slack | Official Slack MCP or Real-time Search API | Internal decisions, blockers, handoffs, escalations, project context, client escalation context. | Avoid broad raw sync initially. | Search grouped channels through the identity resolver. Fetch full threads and return confidence, matched alias, and permalink citations. |
| BigQuery | HIP Analytics MCP/API + Google BigQuery MCP where useful | Facebook Ads, Google Ads, TikTok, Snapchat, GSC, GA4 if confirmed, GHL tables, Supermetrics, relationship tables, historical snapshots. | Primary analytics warehouse. | Expose curated views, business tools, and selected data agents. Keep broad SQL restricted and scoped by customer/location. |
| ClickUp | API/MCP + BigQuery sync | Tasks, docs, agendas, project state, list/folder structure, custom fields. | Reporting snapshots and relationship mapping. | Fresh workflow questions may need API/MCP; reporting can use BigQuery snapshots. |
| GHL / Practice Beacon | Existing sync/API + BigQuery | CRM, contacts, opportunities, appointments, tasks, notes, SMS/text communication layer. | Core reporting and relationship identity tables. | Use locationId as the backbone, but add ownership, drift detection, aliases, inactive clients, and manual overrides. |
| Vanna AI | Analytics chat MVP over curated views/tools | Repeated analytics questions, streaming tables/charts, golden query memory, SQL pattern learning. | Uses BigQuery/Postgres only through approved datasets or business tools. | Useful accelerator, not the central source of truth. Validate enterprise support path before critical reliance. |
Messy names need a resolver, not hope.
Slack threads, ClickUp tasks, emails, agendas, and CRM records will not use perfect client names. The platform needs a client identity resolver that maps shorthand, misspellings, doctor names, location names, former names, and system IDs back to a canonical customer and locationId.
Tools should accept canonical IDs
search_client_slack_context( locationId: string, topics: string[], lookbackDays: number ) Internally expands: - canonical client name - aliases and shorthand - doctor/location names - common misspellings - ClickUp and Comm Manager IDs
Let AI propose aliases, not silently approve them
| Confidence Band | Required Evidence | Runtime Behavior | Storage Status |
|---|---|---|---|
| High | Multiple systems agree, client/location context matches, and recent usage is consistent. | Use normally, with citations and matched alias shown when useful. | Eligible for approval or policy-based auto-approval. |
| Medium | Likely phrase match, but only one strong source or incomplete contextual proof. | Return as possible related context, not as confirmed fact. | Review queue with evidence. |
| Low | Weak text similarity or ambiguous abbreviation. | Do not use in final answer unless the user explicitly asks to broaden search. | Candidate only; wait for repeat evidence. |
Tools execute. Skills route.
A skill or instruction set should help the model choose the right tool and answer format. The tool itself should enforce parameters, permissions, source routing, and output shape. For common workflows, use composite tools instead of making the model manually chain several low-level tools.
Preferred user-facing access
prepare_client_call_briefComposite workflow for call prep across comms, tasks, performance, CRM, Slack, and agenda context.
resolve_client_identityMaps user input to canonical customerId, locationId, aliases, and confidence evidence.
get_paid_media_performanceComposite summary across active paid channels for a client/date range.
get_google_ads_performanceChannel-specific Google Ads metrics and trend context.
get_facebook_ads_performanceFacebook Ads spend, leads, conversion, and campaign performance.
get_tiktok_ads_performanceTikTok performance when the client has that service active.
get_snapchat_ads_performanceSnapchat performance scoped by mapped ad account IDs.
get_gsc_performanceSearch Console query/page performance when confirmed in BigQuery or SearchAtlas.
get_ga4_traffic_summaryGA4 traffic summary, pending final source confirmation.
get_crm_hygiene_reportGHL/Practice Beacon hygiene checks for stale opportunities, missing fields, appointment drift, and owner issues.
get_client_timelineRecent meetings, recaps, email threads, tasks, and relevant updates.
search_client_commsPermissioned search across calls, recaps, client emails, and docs.
search_client_slack_contextUser-scoped Slack context search using locationId, aliases, grouped channels, thread citations, and confidence labels.
get_searchatlas_local_visibilityGBP heatmap/geogrid and local search visibility.
get_searchatlas_krt_rankingsWebsite SERP/KRT keyword ranking performance.
Admin and data-team fallback
run_readonly_bigquery_sql( sql: string, reason: string ) Required enforcement: - table allowlist - SELECT-only parser - dry-run cost check - row and byte limits - locationId scope when applicable - sensitive-column blocklist - full audit log
Do not make this the default user path. Most users should never need raw SQL. They should ask questions that map to business tools like paid media performance, client timeline, or SearchAtlas visibility.
Composite Tool Example
get_paid_media_performance can call Google Ads, Facebook Ads, TikTok, and Snapchat under one governed workflow.
Channel Tool Example
get_google_ads_performance should only touch Google Ads data and relevant relationship tables.
Skill Role
Skills decide which tool to use and how to explain the answer. They should not be trusted as the security layer.
Stop duplicate answers before agents amplify them.
If multiple systems can answer the same business question, the agent needs a priority order. Otherwise two interfaces can give different answers for the same client and date range.
| Question Type | Preferred Source | Fallback |
|---|---|---|
| Paid media rollups | Agency Framework if normalized | BigQuery / Supermetrics |
| Raw channel metrics | BigQuery / Supermetrics | Agency Framework |
| GBP geogrid/local visibility | SearchAtlas | Historical BigQuery snapshot if synced |
| Website SERP/KRT rankings | SearchAtlas | None unless exported |
| Client communication timeline | Comm Manager | ClickUp docs for linked agendas/recaps |
| Internal decisions/context | Slack MCP / RTS API | ClickUp comments/docs if relevant |
| Tasks/docs/agendas | ClickUp + Comm Manager links | BigQuery snapshot for reporting |
Agency Framework vs. BigQuery for performance rollups
The team should confirm which system owns normalized client performance summaries. If Agency Framework already applies the correct definitions, use it as the preferred rollup source and keep BigQuery for raw channel metrics, historical snapshots, relationship tables, and deeper data-team analysis.
Composite tools should encode real team workflows.
Call prep is the clearest example. The agent should not manually improvise the process every time. A skill can route the workflow, but a composite tool should execute the repeatable data gathering with stable inputs, permissions, citations, and freshness labels.
prepare_client_call_brief(locationId, meetingId)
What the workflow touches
Comm ManagerLast call, VTT highlights, recap, emails, phone calls, timeline.
ClickUpAgenda doc, tasks, overdue work, linked post-call actions.
Agency FrameworkNormalized performance rollups where authoritative.
BigQueryChannel metrics, GHL tables, Supermetrics, snapshots.
GHLOpportunities, appointments, CRM hygiene, SMS layer.
SlackGrouped-channel decisions, blockers, handoffs, escalations.
SearchAtlasGBP geogrids, KRT/SERP ranks, local visibility, SEO signals.
Identity ResolverAliases, locationId, account mappings, confidence evidence.
Fast responses come from prepared data, not shallow answers.
The ideal system should feel fast, but quality matters more than raw latency. For 500+ clients, millions of emails, and thousands of calls, the answer is caching, snapshots, indexing, and progressive refreshes rather than forcing every user question through fresh deep retrieval.
Serve a recent client brief snapshot first when the workflow allows it, then refresh volatile sections in parallel.
Every generated answer should identify source records, source freshness, confidence, and gaps instead of presenting unsupported synthesis.
For large email/call/Slack searches, return a useful brief quickly and continue deeper retrieval as a background refresh.
| Use Case | Primary Path | Accelerator | Avoid |
|---|---|---|---|
| Paid media question | Business metric tool with source ownership rules. | Vanna or BigQuery Conversational Analytics over curated views. | Raw SQL over every marketing table. |
| Call prep | Composite call prep tool with cached brief snapshot and live refresh. | Unstructured retrieval index, parallel tools, source freshness labels. | Model manually searching each system in an ad hoc order. |
| Ad hoc analyst question | Curated BigQuery MCP/API with row, table, and cost controls. | Vanna tool memory, golden queries, BigQuery data agents. | Unrestricted production database credentials. |
| Internal blocker search | Client-scoped Slack context tool using alias resolver and thread retrieval. | Slack MCP, Slack Real-time Search API, or Gemini Enterprise federation if available. | Trusting fuzzy matches without evidence or confidence labels. |
Permission checks live outside the model.
Prompts and skills improve behavior, but they are not security controls. The Gateway, APIs, MCP servers, and curated data layer must enforce the rules before records are fetched.
Authenticated user context
Every request carries who the user is, where the request came from, and which customer/location scope they can access.
Tool and source allowlists
The Gateway exposes only the tools and data classes allowed for that user, interface, and workflow.
Minimized model input
Tools should return the least sensitive useful data, with summaries and citations where possible.
Answer traceability
Log the question, selected tools, source records, permission decisions, and final answer metadata.
Preserve from the original plan
Keep- Data quality monitoring and backfill for GHL and Supermetrics.
- Relationship tables like location_master, location_platform_map, client_services, and SearchAtlas mappings.
- Permission matrix thinking for transcript, recording, recap, task, and client access levels.
- Known caveats like missing transcripts, left joins, stale touchpoint refreshes, and GHL timestamp casting.
Change from the original plan
Replace- Replace direct LLM-to-Postgres with Comm Manager API/MCP tools.
- Replace broad natural-language SQL with business-level analytics tools.
- Add Slack as an internal context source using official Slack MCP or RTS API.
- Add Agency Framework and SearchAtlas as first-class API/MCP sources.
Build the access layer before scaling agent usage.
The phases keep the original plan's foundation-first sequencing, but move the user-facing AI work toward governed tools rather than unrestricted SQL generation.
Source inventory and data ownership
Confirm what lives in Comm Manager, Agency Framework, SearchAtlas, BigQuery, ClickUp, Slack, and GHL. Decide authoritative source per metric and data class.
Identity, permissions, and Gateway policy
Wire authenticated user context into Zach's MCP Gateway, define role/location/service access, and enforce tool allowlists before any source calls.
Client identity and alias resolver
Build the customer/location resolver, alias table, evidence scoring, and review workflow before relying on Slack, ClickUp, or email search for client context.
Business-level tool catalog
Implement approved tools for client timelines, paid media, channel performance, SearchAtlas visibility, Slack context, ClickUp tasks, GHL reporting, and call prep.
Unstructured retrieval and brief snapshots
Index calls, VTT transcripts, recaps, client emails, ClickUp docs, Slack threads, and meeting agendas. Generate cached client brief snapshots with citations and freshness.
Analytics accelerators
Expose curated analytics tools over Facebook Ads, Google Ads, TikTok, Snapchat, GSC, GA4 if confirmed, GHL, relationship tables, and historical snapshots. Compare Vanna, BigQuery MCP, and Conversational Analytics for the chat MVP.
Skills, test questions, audit logs, rollout
Define repeatable skills, test the top business questions, verify permission failures, review answer citations, add evals/tracing, and roll out internal-only before client-facing use.
External source notes.
These sources support the updated direction: use official APIs/MCPs and managed enterprise services where they help, while keeping the Gateway and business tools as the control point.
Slack
Use official Slack MCP or Slack Real-time Search API for user-scoped internal context retrieval.
SearchAtlas
Use SearchAtlas API/MCP as the live access path for local SEO, GBP heatmaps, KRT/SERP tracking, LLM visibility, and ranking data. Endpoint-level tools should be finalized from their API documentation.
Google Cloud
Use Google Cloud's BigQuery MCP, Conversational Analytics, MCP governance, and Agent Platform evaluation/logging where they accelerate governed analytics and retrieval.
Vanna AI
Use Vanna as an analytics chat accelerator for curated views, golden queries, and streaming tables/charts. Validate current enterprise support before treating it as platform-critical.
Message to send with the document.
Use this with the HTML attachment so the team understands the architecture change and where Zach's MCP Gateway fits.