Building Models

The agent can generate complete Python data models (code, input table configuration, column mapping, and output settings) from a natural-language description.

How It Works

When you ask the agent to build a model, it works through these six steps:

Browse datasets — Finds your available BigQuery tables
Inspect schemas — Reads column names, types, and sample data
Generate code — Writes a Python script tailored to your data
Configure inputs — Maps BigQuery columns to code variables
Run preview — Executes the model against sample data to validate it
Display config — Shows the complete model configuration in the right panel

What You Can Build

Model Type	Example Prompt
Churn prediction	“Build a churn model using my Mixpanel event data”
User segmentation	“Segment users by behavior patterns”
Revenue analysis	“Calculate monthly recurring revenue from Stripe data”
Cohort analysis	“Create weekly cohorts based on signup date”
Feature engineering	“Build a feature table with user engagement metrics”
Data transformation	“Normalize and clean my ad spend data”

The Model Config Panel

When the agent generates a model, the right panel switches to the Model Builder view with:

Model Details

Name and description (editable)
Input table references with remove buttons

Column Mapping

Zapier-style two-column layout showing BigQuery column names mapped to code variable names
Editable — adjust column mappings before saving
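
As a rough mental model, the mapping can be pictured as a column rename applied before your code runs. The sketch below assumes that behavior; this page does not describe the actual mechanism, and the column names (`distinct_id`, `evt_cnt`) and the `column_mapping` dict are hypothetical.

```python
import pandas as pd

# Hypothetical raw BigQuery columns, invented for illustration.
raw = pd.DataFrame({"distinct_id": ["u1", "u2"], "evt_cnt": [12, 3]})

# Hypothetical mapping: BigQuery column name -> code variable name.
column_mapping = {"distinct_id": "user_id", "evt_cnt": "event_count"}

# Assumption: the mapping acts like a rename, so the script only sees mapped names.
input_data = raw.rename(columns=column_mapping)

print(list(input_data.columns))
```

Under this assumption, a script would refer to `user_id` and `event_count`, never to `distinct_id` or `evt_cnt`.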

Python Code

Syntax-highlighted code viewer with line numbers
Toggle between view and edit mode
Python keywords, builtins, and mapped variables are color-coded

Preview Results

Click Run Preview to test the model against live data
Shows a sample of output rows with column types
Errors are displayed with the full error message

Output Configuration

Set the output BigQuery table name
Choose write mode (Replace or Append)
Click Save Model to add the model to your pipeline
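
The difference between the two write modes can be sketched with pandas, assuming Replace overwrites the output table and Append adds rows to it. That is the usual meaning of those terms; this page does not spell out the exact semantics. The `write` helper and the sample frames are hypothetical and are meant to run locally, not inside the sandbox.

```python
import pandas as pd

def write(existing, new_rows, mode):
    """Hypothetical stand-in for writing model output to a table."""
    if mode == "replace":
        # The table ends up holding only the rows from the latest run.
        return new_rows.reset_index(drop=True)
    if mode == "append":
        # Rows from the latest run are added below the existing rows.
        return pd.concat([existing, new_rows], ignore_index=True)
    raise ValueError(f"unknown write mode: {mode}")

existing = pd.DataFrame({"user_id": ["u1"], "score": [0.4]})
new_rows = pd.DataFrame({"user_id": ["u2", "u3"], "score": [0.9, 0.1]})

replaced = write(existing, new_rows, "replace")
appended = write(existing, new_rows, "append")
```

If Append behaves this way, rerunning a model over the same input would add the same rows again, so Replace is likely the safer choice for models that recompute the full result each run.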

Python Code Rules

The agent generates code for Vendo’s sandboxed executor:

No imports — pandas is pre-loaded as pd
Flat script — No function definitions (def)
input_data — A pandas DataFrame pre-loaded with your first input table
output — Assign your result DataFrame to this variable
Column mapping — Use the mapped variable names, not raw BigQuery column names


# Example: User engagement scoring
df = input_data.copy()  # work on a copy so input_data stays unchanged
df['signup_date'] = pd.to_datetime(df['signup_date'])
df['days_since_signup'] = (pd.Timestamp.now() - df['signup_date']).dt.days
# Count same-day signups as 1 day to avoid dividing by zero
df['engagement_score'] = df['event_count'] / df['days_since_signup'].clip(lower=1)
output = df[['user_id', 'engagement_score', 'days_since_signup']]
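
The same rules apply to the other model types listed earlier. Below is a minimal sketch of a weekly cohort script in the same flat style. The `import` line and the sample `input_data` exist only so the sketch runs outside the sandbox and would be dropped inside Vendo; the column names `user_id` and `signup_date` are assumed mapped variable names.

```python
import pandas as pd  # local only: omit in Vendo, where pd is pre-loaded

# Local stand-in for the pre-loaded input_data: omit in Vendo.
input_data = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "signup_date": ["2024-01-01", "2024-01-03", "2024-01-08"],
})

df = input_data.copy()
df["signup_date"] = pd.to_datetime(df["signup_date"])

# Label each user with the Monday that starts their signup week.
df["cohort_week"] = df["signup_date"].dt.to_period("W").dt.start_time

# One row per cohort week with the number of distinct users who signed up.
output = df.groupby("cohort_week", as_index=False).agg(users=("user_id", "nunique"))
```

With the sample rows, the first two users fall in the week starting 2024-01-01 and the third in the week starting 2024-01-08.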

Tips

Start broad — “Build a churn model” and let the agent discover what data is available
The agent self-debugs — If the preview fails, it will fix the code and retry automatically
Edit after generation — Use the code editor and column mapping UI to fine-tune before saving
Check the preview — Always verify the preview results before saving the model
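
When reviewing a preview, a few mechanical checks catch many problems. The sketch below shows the kind of checks you might run locally on a copy of the preview rows. The sample `output` frame and the `expected_columns` list are hypothetical, and none of this is a built-in Vendo feature.

```python
import pandas as pd

# Hypothetical preview output, invented for illustration.
output = pd.DataFrame({
    "user_id": ["u1", "u2", "u2"],
    "engagement_score": [0.5, None, 1.2],
})

# Columns the model is supposed to emit (an assumption for this example).
expected_columns = ["user_id", "engagement_score"]

# Expected columns that are absent from the output.
missing = [c for c in expected_columns if c not in output.columns]

# Number of null values per column.
null_counts = output.isna().sum().to_dict()

# Number of rows that repeat an earlier user_id.
duplicate_ids = int(output["user_id"].duplicated().sum())

print(missing, null_counts, duplicate_ids)
```

Here the checks would flag one null score and one repeated `user_id`, both worth resolving before saving.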

The agent currently builds Python models. Vendo also supports SQL models and rule-based audiences, which are created through the Models UI:

Models Overview — All model types (SQL, Python, Audiences)
SQL Models — Query-based transformations
Audiences — Rule-based user segments for ad platforms and CRMs