Building Models
The agent can generate complete Python data models — including code, input table configuration, column mapping, and output settings — all from a natural language description.
How It Works
When you ask the agent to build a model, it works through the following steps:
- Browse datasets — Finds your available BigQuery tables
- Inspect schemas — Reads column names, types, and sample data
- Generate code — Writes a Python script tailored to your data
- Configure inputs — Maps BigQuery columns to code variables
- Run preview — Executes the model against sample data to validate it
- Display config — Shows the complete model configuration in the right panel
What You Can Build
| Model Type | Example Prompt |
|---|---|
| Churn prediction | “Build a churn model using my Mixpanel event data” |
| User segmentation | “Segment users by behavior patterns” |
| Revenue analysis | “Calculate monthly recurring revenue from Stripe data” |
| Cohort analysis | “Create weekly cohorts based on signup date” |
| Feature engineering | “Build a feature table with user engagement metrics” |
| Data transformation | “Normalize and clean my ad spend data” |
The Model Config Panel
When the agent generates a model, the right panel switches to the Model Builder view with:
Model Details
- Name and description (editable)
- Input table references with remove buttons
Column Mapping
- Zapier-style two-column layout showing BigQuery column names mapped to code variable names
- Editable — adjust column mappings before saving
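One way to picture the mapping is as a column rename applied before the script runs. The sketch below assumes that behavior; this page does not describe how Vendo applies mappings internally, and the column names and the `column_mapping` dict are hypothetical.

```python
import pandas as pd

# Hypothetical raw BigQuery columns (names invented for illustration).
raw = pd.DataFrame({
    "distinct_id": ["u1", "u2"],
    "mp_event_count": [12, 3],
})

# Hypothetical mapping: BigQuery column name -> code variable name.
column_mapping = {
    "distinct_id": "user_id",
    "mp_event_count": "event_count",
}

# Assumption: the mapping acts like a rename, so the script sees only mapped names.
input_data = raw.rename(columns=column_mapping)

print(list(input_data.columns))
```

Under this assumption, a script would refer to `user_id` and `event_count`, never to `distinct_id` or `mp_event_count`.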
Python Code
- Syntax-highlighted code viewer with line numbers
- Toggle between view and edit mode
- Python keywords, builtins, and mapped variables are color-coded
Preview Results
- Click Run Preview to test the model against live data
- Shows a sample of output rows with column types
- Errors are displayed with the full error message
Output Configuration
- Set the output BigQuery table name
- Choose write mode (Replace or Append)
- Save Model to add it to your pipeline
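The difference between the two write modes can be sketched in plain Python. This only illustrates the semantics the names suggest (Replace overwrites the table contents, Append adds rows to them); `WriteMode` and `write_rows` are hypothetical names, not part of Vendo.

```python
from enum import Enum


class WriteMode(Enum):
    # Hypothetical stand-ins for the two options in the Output Configuration panel.
    REPLACE = "replace"
    APPEND = "append"


def write_rows(existing, new_rows, mode):
    """Return the table contents after a write, given the chosen mode."""
    if mode is WriteMode.REPLACE:
        return list(new_rows)  # previous rows are discarded
    return list(existing) + list(new_rows)  # previous rows are kept


table = [{"user_id": "u1", "score": 0.4}]
run_output = [{"user_id": "u2", "score": 0.9}]

replaced = write_rows(table, run_output, WriteMode.REPLACE)
appended = write_rows(table, run_output, WriteMode.APPEND)
print(len(replaced), len(appended))
```

If Append behaves as assumed here, re-running a model on unchanged input would add the same rows a second time, so Replace is likely the safer choice for scripts that recompute the whole table.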
Python Code Rules
The agent generates code for Vendo’s sandboxed executor:
- No imports — `pandas` is pre-loaded as `pd`
- Flat script — No function definitions (`def`)
- `input_data` — A pandas DataFrame pre-loaded with your first input table
- `output` — Assign your result DataFrame to this variable
- Column mapping — Use the mapped variable names, not raw BigQuery column names
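The first, second, and fourth rules above can be checked mechanically before a hand-edited script is pasted into the editor. The following is a minimal sketch using Python's standard `ast` module; `check_script` is a hypothetical helper that runs on your own machine, and Vendo's executor may enforce these rules differently.

```python
import ast


def check_script(source):
    """Return a list of rule violations found in a model script."""
    problems = []
    assigns_output = False
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append(f"line {node.lineno}: import statement")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            problems.append(f"line {node.lineno}: function definition")
        elif (
            isinstance(node, ast.Name)
            and node.id == "output"
            and isinstance(node.ctx, ast.Store)
        ):
            # The script assigns something to the name `output`.
            assigns_output = True
    if not assigns_output:
        problems.append("script never assigns to output")
    return problems


good = "df = input_data\noutput = df[['user_id']]\n"
bad = "import numpy as np\ndef f(x):\n    return x\n"

print(check_script(good))
print(check_script(bad))
```

The check is purely syntactic: it does not confirm that `output` is a DataFrame or that the column names match the mapping, so the preview is still needed.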
# Example: User engagement scoring
df = input_data
df['signup_date'] = pd.to_datetime(df['signup_date'])
# Whole days since signup; clip to 1 so same-day signups do not divide by zero
df['days_since_signup'] = (pd.Timestamp.now() - df['signup_date']).dt.days.clip(lower=1)
# Average events per day since signup
df['engagement_score'] = df['event_count'] / df['days_since_signup']
output = df[['user_id', 'engagement_score', 'days_since_signup']]
Tips
- Start broad — “Build a churn model” and let the agent discover what data is available
- The agent self-debugs — If the preview fails, it will fix the code and retry automatically
- Edit after generation — Use the code editor and column mapping UI to fine-tune before saving
- Check the preview — Always verify the preview results before saving the model
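To check a script outside the app, the sandbox conventions can be approximated locally. The sketch below assumes only what the Python Code Rules state (`pd` and `input_data` are pre-defined, and the result is read from `output`). `run_model_script` is a hypothetical helper, the sample rows and column names are invented, and Vendo's real executor will differ.

```python
import pandas as pd


def run_model_script(source, input_data):
    """Execute a flat script with pd and input_data pre-defined; return output."""
    # Copy the input so the script cannot modify the caller's DataFrame.
    namespace = {"pd": pd, "input_data": input_data.copy()}
    exec(source, namespace)
    if "output" not in namespace:
        raise ValueError("script did not assign a DataFrame to output")
    return namespace["output"]


# A flat script in the style the agent generates: no imports, no functions.
script = """
df = input_data
df['events_per_session'] = df['event_count'] / df['session_count']
output = df[['user_id', 'events_per_session']]
"""

# Invented sample rows standing in for a mapped input table.
sample = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "event_count": [10, 9],
    "session_count": [5, 3],
})

preview = run_model_script(script, sample)
print(preview)
```

A local run like this only catches errors in the script logic; Run Preview remains the check that matters, because it uses the actual input table and column mapping.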
Related
The agent currently builds Python models. Vendo also supports SQL models and rule-based audiences, which are created through the Models UI:
- Models Overview — All model types (SQL, Python, Audiences)
- SQL Models — Query-based transformations
- Audiences — Rule-based user segments for ad platforms and CRMs