Skip to Content

SQL Models

SQL models let you write queries against your BigQuery tables to create custom views, aggregations, and transformations. The output can be synced to any compatible destination.

Creating a SQL Model

  1. Navigate to Models and click New Model
  2. Select SQL Model from the template gallery
  3. Enter a name and optional description
  4. Choose a data type for the output (this determines which destinations can receive the data)
  5. Write your SQL query
  6. Click Validate to check the query against your BigQuery schema
  7. Save the model

Data Types

The data type tells Vendo how to treat the model’s output when syncing to destinations:

Data TypeUse CaseExample
EventsTime-series event dataAggregated daily ad spend by campaign
User ProfilesOne row per user with attributesCustomer LTV scores, purchase segments
Group ProfilesOne row per group/companyAccount-level metrics for B2B
Lookup TablesReference/enrichment dataProduct category mappings
CustomFreeform output for exportAny table for BigQuery or S3 export

Writing Queries

Your SQL runs against BigQuery. Reference tables in your Vendo dataset directly:

SELECT customer_id, SUM(total_price) AS lifetime_value, COUNT(*) AS order_count, MIN(created_at) AS first_order_date, MAX(created_at) AS last_order_date FROM orders WHERE financial_status = 'paid' GROUP BY customer_id

Validation

When you click Validate, Vendo:

  1. Runs the query against your BigQuery schema (dry run — no data scanned)
  2. Returns the column names and types
  3. Flags any syntax or schema errors

Fix any errors before saving. Only validated models can be synced to destinations.

Incremental Syncing

For large datasets, configure incremental syncing to avoid re-processing all data on every run:

  • Primary Key Columns — Columns that uniquely identify each row (e.g., customer_id)
  • Incremental Column — A timestamp or monotonically increasing column used to detect new/updated rows (e.g., updated_at)

When incremental syncing is enabled, Vendo only processes rows where the incremental column is greater than the last sync’s maximum value.

Scheduling

SQL models run when:

  • Manually triggered from the model detail page
  • On a schedule — set a frequency from every 15 minutes to weekly
  • After a source syncs — trigger the model when a specific source app finishes importing
  • After an upstream model — chain models together so downstream models run after upstream ones complete

Example: Daily Revenue by Channel

SELECT DATE(o.created_at) AS date, COALESCE(a.utm_source, 'direct') AS channel, COUNT(DISTINCT o.id) AS orders, SUM(o.total_price) AS revenue FROM orders o LEFT JOIN events a ON o.customer_id = a.user_id AND a.event = 'checkout_started' WHERE o.financial_status = 'paid' GROUP BY date, channel ORDER BY date DESC
Last updated on