Skip to main content

Snowflake Analytics — Usage Guide

Legion analytics data is replicated from MongoDB into Snowflake via DMS and Snowpipe. This doc covers how to access, explore, query, and build dashboards.


Table of Contents

  1. Access
  2. Inviting New Users
  3. Data Architecture
  4. Exploring Data (Data Preview)
  5. Canonical Organization ID
  6. Querying
  7. Dashboards
  8. Maintaining the Schema
  9. Infrastructure Reference

Access

  1. Go to app.snowflake.com and sign in
  2. Select the LEGION_ANALYTICS database from the left sidebar

Inviting New Users

Users with the ACCOUNTADMIN or USERADMIN role can invite new users to Snowflake.

  1. Go to Governance & Security → Users tab
  2. Click Invite
  3. Enter the user's Legion email address
  4. Assign the SYSADMIN role
  5. Send the invite

After accepting the invite, the new user should enable MFA on their account.


Data Architecture

Two schemas are relevant:

SchemaPurpose
RAW_CLEANPipeline objects — raw tables, deduplication, pipe, stage. Do not query directly unless debugging ingestion.
ANALYTICSViews for everyday querying — start here.
┌─────────────────────────────────────────────────────────────────┐
│ RAW_CLEAN schema │
│ │
│ S3/GCS Stage │
│ │ │
│ │ PIPE_EVENTS (Snowpipe — auto-ingests JSON files) │
│ ▼ │
│ RAW_EVENTS append-only, one row per DMS event │
│ │ │
│ │ deduplication (keep latest per object, every 1h) │
│ ▼ │
│ EVENTS_DEDUPED latest snapshot per (type, id, region) │
│ │
└──────────────────────────────┬──────────────────────────────────┘
│ views read from EVENTS_DEDUPED

┌─────────────────────────────────────────────────────────────────┐
│ ANALYTICS schema │
│ │
│ AUDIT_V audit events (recordings, runs) │
│ WORKFLOWS_V workflow records │
│ WORKFLOW_GRAPHS_V workflow graph versions │
│ USECASES_V use case / template records │
│ USERS_V user records │
│ DECISIONS_V AI step decisions │
│ INVESTIGATION_STEP_EVENTS_V step lifecycle events │
│ ORGANIZATIONS_V org name + ID mapping │
│ │
└─────────────────────────────────────────────────────────────────┘

Object Reference

ObjectSchemaTypeDescription
RAW_EVENTSRAW_CLEANTableRaw ingest — one row per DMS event
EVENTS_DEDUPEDRAW_CLEANDynamic TableLatest version per (SOURCE_TYPE, OBJECT_ID, REGION), refreshed every 1h
ORGANIZATIONS_VANALYTICSViewMaps auth_idcustomer_id + org name
AUDIT_VANALYTICSViewAudit events — recordings, workflow runs, decisions
WORKFLOWS_VANALYTICSViewWorkflow records
WORKFLOW_GRAPHS_VANALYTICSViewWorkflow graph versions
USECASES_VANALYTICSViewUse case / template records
USERS_VANALYTICSViewUser records (org_id, sub)
DECISIONS_VANALYTICSViewAI step decisions with recommendation and accuracy
INVESTIGATION_STEP_EVENTS_VANALYTICSViewInvestigation step lifecycle events

Exploring Data (Data Preview)

Before writing queries, use Snowflake's built-in preview to see what columns and values exist in a view.

  1. In the left sidebar, expand LEGION_ANALYTICS → ANALYTICS
  2. Click any view name (e.g. AUDIT_V)
  3. In the panel that opens on the right, click the Data Preview tab
  4. Sample rows with all columns and values are shown

Use this to confirm column names and value formats before writing a query.


Canonical Organization ID

Every org can be identified in two ways depending on the source:

FieldSource
auth_idWorkOS auth system (e.g. org_01JM7E9VQ1S8Y7JWP64MGYYBAN)
customer_idLegion internal ID (e.g. org_HJ8em7FhxL7DhZEE)

ORGANIZATIONS_V exposes a canonical_org_id field that resolves this automatically — it returns customer_id when a mapping exists, otherwise falls back to auth_id.

Always use canonical_org_id when joining across views. It is the stable, consistent identifier.

Find an org's canonical ID:

SELECT organization_name, canonical_org_id
FROM LEGION_ANALYTICS.ANALYTICS.ORGANIZATIONS_V
WHERE organization_name ILIKE '%acme%';

Filter another view using it:

SELECT * FROM LEGION_ANALYTICS.ANALYTICS.AUDIT_V
WHERE customer_id = '<canonical_org_id>'
LIMIT 50;

Querying

No version history. Snowflake does not version worksheets or dashboard tiles. Overwriting a query is permanent. Always develop and test in a new worksheet — only update a shared dashboard tile once you are confident the query is correct.

Open a Worksheet

  1. Click Worksheets in the left sidebar → + Worksheet
  2. Set context in the top bar: Database = LEGION_ANALYTICS, Schema = ANALYTICS

Examples

List all orgs — start here to get canonical IDs for other queries

SELECT organization_name, canonical_org_id
FROM ORGANIZATIONS_V
ORDER BY organization_name;

All recordings for an org

SELECT created_at, type, status, usecase_id
FROM AUDIT_V
WHERE customer_id = '<canonical_org_id>'
AND type = 'recording'
ORDER BY created_at DESC;

All workflows for an org

SELECT workflow_id, usecase_id, created_at
FROM WORKFLOWS_V
WHERE customer_id = '<canonical_org_id>'
AND is_preview = false
AND deleted IS NULL
ORDER BY created_at DESC;

Workflow count across all orgs

SELECT o.organization_name, COUNT(*) AS workflows
FROM WORKFLOWS_V w
JOIN ORGANIZATIONS_V o ON w.customer_id = o.canonical_org_id
WHERE w.is_preview = false
AND w.deleted IS NULL
GROUP BY 1
ORDER BY 2 DESC;

Dashboards

Same warning applies — dashboard tiles have no version history. Test queries in a worksheet before editing a tile.

Create a New Dashboard

  1. Click Dashboards in the left sidebar → + Dashboard
  2. Name it → New TileNew from Worksheet
  3. Write and run your query
  4. Click the chart icon (top right of results) to switch to a visualization
  5. Choose chart type, configure axes, then click Return to <Dashboard Name>

Add a Tile to an Existing Dashboard

  1. Open the dashboard → + Tile (top right) → New from Worksheet

Add Filters to a Dashboard

Filters let viewers slice all tiles by a value (e.g. filter by org) without editing any query.

  1. Open the dashboard → click the filter icon (top right, next to + Tile)
  2. Click + Filter and give it a display name (e.g. Organization)
  3. Set the type (Text, Date, Number, etc.)
  4. Click Done

Then wire each tile to the filter:

  1. Click the ... menu on a tile → Edit query
  2. In the query, replace the hardcoded value with the filter keyword using :filter_name syntax:
    WHERE customer_id = :organization
  3. Run the query — Snowflake will prompt you to map :organization to the filter you created
  4. Save and return to the dashboard — the filter now controls that tile

Wire the filter to every tile that should respond to it. Tiles not wired are unaffected.

Suggested Chart Types

MetricChart
Workflow count over timeLine
Recordings per orgBar
Time-to-workflowBar (sorted ascending)
Single KPI (e.g. total workflows)Scorecard

Maintaining the Schema

Adding a New Org Mapping

ORGANIZATIONS_V uses a hardcoded auth_id → customer_id map. When a new customer is onboarded:

  1. Open snowflake/organization-view.sql
  2. Add a row to the org_map VALUES block:
    ('org_<AUTH_ID>', 'org_<CUSTOMER_ID>'),
  3. Run the CREATE OR REPLACE VIEW statement in a Snowflake worksheet to apply the change

Infrastructure Reference

ComponentDetail
PipePIPE_EVENTS — auto-ingests .json / .json.gz files from the stage
StageLEGION_ANALYTICS.RAW_CLEAN.CLEAN_LEGION_STAGE (S3/GCS)
WarehouseANALYTICS_WH (runs dynamic table refreshes)
Dedup lag1 hour — EVENTS_DEDUPED refreshes automatically on this cadence