BigQuery
Source + Destination — Vendo streams your Shopify e-commerce data directly to Google BigQuery, giving you a powerful data warehouse for advanced analytics, custom reporting, and machine learning. Own your data and run unlimited SQL queries against your complete customer dataset.
Key Benefits
- Raw Data Access — Complete, unsampled Shopify data in your BigQuery project
- Real-time Streaming — Client-side events streamed directly to BigQuery
- Historical Sync — Full backfill of orders, customers, products, and more
- SQL Analytics — Run complex queries for custom analysis
- Data Ownership — Your data in your Google Cloud project
Data Tables
Vendo creates and maintains these tables in your BigQuery dataset:
| Table Name | Description | Update Frequency |
|---|---|---|
| orders | Complete order data | Real-time + backfill |
| customers | Customer profiles | Every few hours |
| products | Product catalog | Daily |
| events | Client-side tracking events | Real-time streaming |
| abandoned_checkouts | Abandoned cart data | Hourly |
| fulfillments | Order fulfillment data | Real-time |
| inventory_items | Inventory levels | Daily |
| errors | Integration error logs | Real-time |
If ad platform integrations are connected, ad data is also written to BigQuery:
| Table | Description |
|---|---|
| export_ad_data | Daily ad metrics (impressions, clicks, spend) per ad |
| change_history | Ad account change events (Google Ads, Meta Ads) |
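As a rough illustration of what the daily ad metrics allow, the sketch below derives click-through rate and cost per click from rows shaped like `export_ad_data` records. The field names `impressions`, `clicks`, and `spend` are assumptions taken from the table description, not a confirmed schema.

```python
from typing import Iterable, Mapping, Optional


def ad_summary(rows: Iterable[Mapping[str, float]]) -> dict:
    """Aggregate ad rows and derive CTR and CPC.

    Assumes each row has 'impressions', 'clicks', and 'spend' keys
    (hypothetical column names based on the table description).
    """
    impressions = clicks = 0
    spend = 0.0
    for row in rows:
        impressions += row["impressions"]
        clicks += row["clicks"]
        spend += row["spend"]

    # Guard against division by zero when there is no traffic.
    ctr: Optional[float] = clicks / impressions if impressions else None
    cpc: Optional[float] = spend / clicks if clicks else None
    return {
        "impressions": impressions,
        "clicks": clicks,
        "spend": spend,
        "ctr": ctr,
        "cpc": cpc,
    }
```

In practice the rows would come from a query against the table; the same arithmetic could equally be written in SQL.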
Dataset Structure
Data is organized in a dataset named vendo_{shop_name} in your GCP project.
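The naming pattern can be turned into fully qualified table references for queries. This is a minimal sketch that assumes the shop name is used verbatim in the dataset name; any normalization Vendo applies (for example to hyphens) is not documented here. The helper names and the `my-project` and `acme` values are hypothetical.

```python
def dataset_id(shop_name: str) -> str:
    """Return the dataset name following the vendo_{shop_name} pattern."""
    return f"vendo_{shop_name}"


def table_ref(project: str, shop_name: str, table: str) -> str:
    """Build a `project.dataset.table` reference for use in SQL."""
    return f"{project}.{dataset_id(shop_name)}.{table}"


def count_rows_sql(project: str, shop_name: str, table: str) -> str:
    """Build a simple row-count query against one table."""
    return f"SELECT COUNT(*) AS row_count FROM `{table_ref(project, shop_name, table)}`"


if __name__ == "__main__":
    # Hypothetical project and shop names, for illustration only.
    print(count_rows_sql("my-project", "acme", "orders"))
```

The resulting string can be run in the BigQuery console or passed to `google.cloud.bigquery.Client().query(...)` from the `google-cloud-bigquery` package.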
Platform Details
| Setting | Value |
|---|---|
| Dataset Naming | vendo_{shop_name} |
| Sync Method | Server-side via BigQuery API + real-time client streaming |
| Identity | Shopify Customer ID |
| Historical Backfill | Full order/customer history |
Identity & Deduplication
- Identity — Shopify Customer ID
- Deduplication — Records are keyed by source IDs within each table
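One way to picture keyed deduplication is a mapping from source ID to record, where a later record for the same ID replaces the earlier one. The sketch below illustrates that idea only; it is not Vendo's implementation, and the `id` field name is an assumption.

```python
from typing import Any, Dict, Iterable, List


def dedupe_by_key(records: Iterable[Dict[str, Any]], key: str = "id") -> List[Dict[str, Any]]:
    """Keep one record per source ID; the last record seen for an ID wins."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for record in records:
        # Assigning to an existing key overwrites the earlier record
        # while keeping the position of its first appearance.
        latest[record[key]] = record
    return list(latest.values())
```

For example, two versions of order 1 followed by order 2 would collapse to the newer version of order 1 plus order 2.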
What to Expect After Setup
- Dataset Created — `vendo_{shop_name}` dataset appears in your BigQuery project
- Tables Created — All schema tables are created automatically
- Historical Backfill — Past data loads within hours
- Real-time Events — Client-side events stream immediately
Data Freshness
| Data Type | Latency |
|---|---|
| Client-side events | Real-time (seconds) |
| Orders | Near real-time (minutes) |
| Customers | Every few hours |
| Products | Daily |
| Abandoned carts | Hourly |
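The latencies above can be turned into a simple staleness check. The sketch below compares the newest record timestamp in a table against a maximum expected age. The concrete thresholds are assumptions chosen to approximate "seconds", "minutes", and "every few hours"; adjust them to what you observe in your own store.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed upper bounds per table, loosely based on the latency table.
MAX_AGE = {
    "events": timedelta(minutes=1),
    "orders": timedelta(minutes=15),
    "customers": timedelta(hours=6),
    "products": timedelta(days=1),
    "abandoned_checkouts": timedelta(hours=1),
}


def is_stale(table: str, latest_record_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if the newest record is older than the assumed maximum age."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - latest_record_at > MAX_AGE[table]
```

A quiet store with no recent orders or events will also look stale under this check, so treat a positive result as a prompt to investigate rather than proof of a sync failure.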
Verify Setup
- Confirm the `vendo_{shop_name}` dataset exists in BigQuery
- Check that new orders or events are appearing in the relevant tables
- Review the `errors` table for any failed syncs
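The first check can be partly automated by comparing the tables found in the dataset against the core tables listed under Data Tables. This is a sketch; the function name is hypothetical, and the ad tables are left out because they only exist when ad integrations are connected.

```python
from typing import Iterable, List

# Core tables listed in the Data Tables section.
EXPECTED_TABLES = {
    "orders",
    "customers",
    "products",
    "events",
    "abandoned_checkouts",
    "fulfillments",
    "inventory_items",
    "errors",
}


def missing_tables(found: Iterable[str]) -> List[str]:
    """Return the expected tables that are absent, sorted by name."""
    return sorted(EXPECTED_TABLES - set(found))
```

With the `google-cloud-bigquery` package, the table names could be gathered as `[t.table_id for t in client.list_tables(dataset)]` and passed to `missing_tables`; an empty result means every core table exists.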
Estimated Storage Costs
BigQuery pricing (approximate):
| Metric | Cost |
|---|---|
| Storage | ~$0.02/GB/month |
| Queries | ~$5/TB scanned (first 1TB free/month) |
Estimated data volume by store size:
| Store Size | Estimated Data Volume |
|---|---|
| Small store | < 1 GB/month |
| Medium store | 1–10 GB/month |
| Large store | 10–100 GB/month |
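To make the pricing arithmetic concrete, the sketch below estimates a monthly bill from the approximate rates quoted above. The constants mirror this page's figures and are not authoritative; check current BigQuery pricing for your region before relying on the result.

```python
# Approximate rates taken from the pricing table on this page.
STORAGE_USD_PER_GB_MONTH = 0.02
QUERY_USD_PER_TB = 5.0
FREE_QUERY_TB_PER_MONTH = 1.0


def monthly_cost(stored_gb: float, scanned_tb: float) -> float:
    """Estimate monthly cost in USD for storage plus on-demand queries."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    # Only the volume scanned beyond the free tier is billed.
    billable_tb = max(0.0, scanned_tb - FREE_QUERY_TB_PER_MONTH)
    queries = billable_tb * QUERY_USD_PER_TB
    return round(storage + queries, 2)
```

For instance, 120 GB stored (roughly a year of data at 10 GB/month) and 3 TB scanned in a month works out to 120 × 0.02 + (3 − 1) × 5, or about $12.40.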
Compatible Sources
BigQuery accepts table exports from all Vendo sources:
| Source | What Vendo Exports |
|---|---|
| Shopify | Orders, customers, products, events, abandoned checkouts, fulfillments |
| Stripe | Payments, subscriptions, customers, invoices |
| Google Ads | Ad performance metrics, geo data, change history |
| Meta Ads | Ad performance metrics, geo data, change history |
| TikTok Ads | Ad performance metrics, geo data |
| Snap Ads | Ad performance metrics, geo data |
| Microsoft Ads | Ad performance metrics, geo data |
| LinkedIn Ads | Ad performance metrics, geo data |
| X Ads | Ad performance metrics, geo data |
| Mixpanel | Event and user data |
| Segment | Event and user data |
| Amplitude | Event and user data |
BigQuery also receives output from SQL models, Python models, and audiences.