Building Models

The agent can generate complete Python data models — including code, input table configuration, column mapping, and output settings — all from a natural language description.

How It Works

When you ask the agent to build a model, it follows a complete workflow:

  1. Browse datasets — Finds your available BigQuery tables
  2. Inspect schemas — Reads column names, types, and sample data
  3. Generate code — Writes a Python script tailored to your data
  4. Configure inputs — Maps BigQuery columns to code variables
  5. Run preview — Executes the model against sample data to validate it
  6. Display config — Shows the complete model configuration in the right panel

What You Can Build

Model Type             Example Prompt
Churn prediction       “Build a churn model using my Mixpanel event data”
User segmentation      “Segment users by behavior patterns”
Revenue analysis       “Calculate monthly recurring revenue from Stripe data”
Cohort analysis        “Create weekly cohorts based on signup date”
Feature engineering    “Build a feature table with user engagement metrics”
Data transformation    “Normalize and clean my ad spend data”

The Model Config Panel

When the agent generates a model, the right panel switches to the Model Builder view with:

Model Details

  • Name and description (editable)
  • Input table references with remove buttons

Column Mapping

  • Zapier-style two-column layout showing BigQuery column names mapped to code variable names
  • Editable — adjust column mappings before saving
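
One way to picture the column mapping, as a sketch rather than a description of Vendo's internals: each BigQuery column is renamed to its code variable name before the script reads the data. The column names and the `column_mapping` dictionary below are hypothetical.

```python
import pandas as pd

# Hypothetical raw table using BigQuery-style column names.
raw = pd.DataFrame({
    "distinct_id": ["u1", "u2"],
    "mp_event_count": [12, 3],
})

# Hypothetical mapping: BigQuery column name -> code variable name.
column_mapping = {
    "distinct_id": "user_id",
    "mp_event_count": "event_count",
}

# After renaming, the script only needs to know the mapped names.
input_data = raw.rename(columns=column_mapping)
print(list(input_data.columns))
```

Under this reading, editing a mapping in the panel changes the name the code must use, not the underlying BigQuery column.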

Python Code

  • Syntax-highlighted code viewer with line numbers
  • Toggle between view and edit mode
  • Python keywords, builtins, and mapped variables are color-coded

Preview Results

  • Click Run Preview to test the model against live data
  • Shows a sample of output rows with column types
  • Errors are displayed with the full error message

Output Configuration

  • Set the output BigQuery table name
  • Choose write mode (Replace or Append)
  • Save Model to add it to your pipeline
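
The two write modes differ in what happens to rows already in the output table. The sketch below imitates that with in-memory DataFrames. It assumes Replace overwrites the table and Append adds rows to it, comparable to BigQuery's WRITE_TRUNCATE and WRITE_APPEND dispositions; this page does not confirm how Vendo implements them, and the table contents are made up.

```python
import pandas as pd

# Hypothetical rows already in the output table.
existing = pd.DataFrame({"user_id": ["u1"], "score": [0.4]})

# Hypothetical rows produced by the latest model run.
new_rows = pd.DataFrame({"user_id": ["u2", "u3"], "score": [0.9, 0.1]})

# Replace: the table ends up holding only the latest run.
after_replace = new_rows.copy()

# Append: the latest run is added after the existing rows.
after_append = pd.concat([existing, new_rows], ignore_index=True)

print(len(after_replace), len(after_append))
```

If a model recomputes every row on each run, appending would likely duplicate rows, so Replace is probably the safer choice in that case.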

Python Code Rules

The agent generates code for Vendo’s sandboxed executor:

  • No imports — pandas is pre-loaded as pd
  • Flat script — No function definitions (def)
  • input_data — A pandas DataFrame pre-loaded with your first input table
  • output — Assign your result DataFrame to this variable
  • Column mapping — Use the mapped variable names, not raw BigQuery column names

    # Example: User engagement scoring
    df = input_data.copy()  # work on a copy so input_data stays unchanged
    df['signup_date'] = pd.to_datetime(df['signup_date'])
    df['days_since_signup'] = (pd.Timestamp.now() - df['signup_date']).dt.days
    # Treat same-day signups as one day to avoid dividing by zero
    df['engagement_score'] = df['event_count'] / df['days_since_signup'].clip(lower=1)
    output = df[['user_id', 'engagement_score', 'days_since_signup']]
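
A second sketch written to the same rules, this time for the "weekly cohorts based on signup date" prompt. The `import` line and the hand-built `input_data` exist only so the snippet runs outside the sandbox; in the executor both are assumed to be provided already. The sample rows and the `cohort_week` and `users` column names are hypothetical.

```python
import pandas as pd  # only for running locally; the sandbox pre-loads pd

# Stand-in for the DataFrame the executor would provide (hypothetical data).
input_data = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "signup_date": ["2024-01-01", "2024-01-03", "2024-01-10"],
})

# Model code starts here: flat script, no imports, no def.
df = input_data.copy()
df["signup_date"] = pd.to_datetime(df["signup_date"])

# Label each user with the Monday that starts their signup week.
df["cohort_week"] = df["signup_date"].dt.to_period("W").dt.start_time

# One row per cohort with the number of distinct users who signed up that week.
output = (
    df.groupby("cohort_week", as_index=False)
      .agg(users=("user_id", "nunique"))
)
print(output)
```

Only the lines after the "Model code starts here" comment would go into the code editor.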

Tips

  • Start broad — “Build a churn model” and let the agent discover what data is available
  • The agent self-debugs — If the preview fails, it will fix the code and retry automatically
  • Edit after generation — Use the code editor and column mapping UI to fine-tune before saving
  • Check the preview — Always verify the preview results before saving the model
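
When checking a preview, a few quick counts can catch common problems before a model is saved. This is a minimal sketch that assumes the preview rows are available as a pandas DataFrame; the `preview` data and the particular checks are illustrative, not a Vendo feature.

```python
import numpy as np
import pandas as pd

# Hypothetical preview rows, as they might look after Run Preview.
preview = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "engagement_score": [0.5, np.inf, None],
})

scores = preview["engagement_score"]
checks = {
    "row_count": len(preview),
    "duplicate_user_ids": int(preview["user_id"].duplicated().sum()),
    "missing_scores": int(scores.isna().sum()),
    # Fill missing values first so only true infinities are counted.
    "infinite_scores": int(np.isinf(scores.fillna(0)).sum()),
}
print(checks)
```

Non-zero counts for duplicates, missing values, or infinities usually point back to a join, a mapping, or a division in the generated code.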

The agent currently builds Python models. Vendo also supports SQL models and rule-based audiences, which are created through the Models UI:

  • Models Overview — All model types (SQL, Python, Audiences)
  • SQL Models — Query-based transformations
  • Audiences — Rule-based user segments for ad platforms and CRMs