SQL Models

SQL models let you write queries against your BigQuery tables to create custom views, aggregations, and transformations. The output can be synced to any compatible destination.

Creating a SQL Model

Navigate to Models and click New Model
Select SQL Model from the template gallery
Enter a name and optional description
Choose a data type for the output (this determines which destinations can receive the data)
Write your SQL query
Click Validate to check the query against your BigQuery schema
Save the model

Data Types

The data type tells Vendo how to treat the model’s output when syncing to destinations:

Data Type	Use Case	Example
Events	Time-series event data	Aggregated daily ad spend by campaign
User Profiles	One row per user with attributes	Customer LTV scores, purchase segments
Group Profiles	One row per group/company	Account-level metrics for B2B
Lookup Tables	Reference/enrichment data	Product category mappings
Custom	Freeform output for export	Any table for BigQuery or S3 export

Writing Queries

Your SQL runs against BigQuery. Reference tables in your Vendo dataset directly:


SELECT
  customer_id,
  SUM(total_price) AS lifetime_value,
  COUNT(*) AS order_count,
  MIN(created_at) AS first_order_date,
  MAX(created_at) AS last_order_date
FROM orders
WHERE financial_status = 'paid'
GROUP BY customer_id

Validation

When you click Validate, Vendo:

Runs the query against your BigQuery schema (dry run — no data scanned)
Returns the column names and types
Flags any syntax or schema errors

Fix any errors before saving. Only validated models can be synced to destinations.

Incremental Syncing

For large datasets, configure incremental syncing to avoid re-processing all data on every run:

Primary Key Columns — Columns that uniquely identify each row (e.g., customer_id)
Incremental Column — A timestamp or monotonically increasing column used to detect new/updated rows (e.g., updated_at)

When incremental syncing is enabled, Vendo only processes rows where the incremental column is greater than the last sync’s maximum value.

Scheduling

SQL models run when:

Manually triggered from the model detail page
On a schedule — set a frequency from every 15 minutes to weekly
After a source syncs — trigger the model when a specific source app finishes importing
After an upstream model — chain models together so downstream models run after upstream ones complete

Example: Daily Revenue by Channel


SELECT
  DATE(o.created_at) AS date,
  COALESCE(a.utm_source, 'direct') AS channel,
  COUNT(DISTINCT o.id) AS orders,
  SUM(o.total_price) AS revenue
FROM orders o
LEFT JOIN events a
  ON o.customer_id = a.user_id
  AND a.event = 'checkout_started'
WHERE o.financial_status = 'paid'
GROUP BY date, channel
ORDER BY date DESC

Models Overview
Python Models — For advanced transformations with pandas and scikit-learn
Audiences — For rule-based user segments