
Models

Models are Python-based data transformations that process data in BigQuery. Use models for custom calculations, aggregations, ML predictions, and advanced data processing.

List Models

Retrieve all models for your account.

GET /api/v1/models

Query Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| limit | integer | Number of items to return (default: 20, max: 100) |
| offset | integer | Number of items to skip (default: 0) |
| sort | string | Sort field and order (e.g., created_at:desc) |
| state | string | Filter by state: active, inactive |
| status | string | Filter by status |

Example Request

curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://app.vendodata.com/api/v1/models?state=active"

Example Response

{
  "data": [
    {
      "id": "ff0e8400-e29b-41d4-a716-446655440010",
      "accountId": "123e4567-e89b-12d3-a456-426614174000",
      "name": "Customer LTV Model",
      "description": "Calculate customer lifetime value from order history",
      "language": "python",
      "state": "active",
      "status": "completed",
      "lastRunAt": "2024-03-04T06:00:00Z",
      "lastError": null,
      "createdAt": "2024-02-01T10:00:00Z",
      "updatedAt": "2024-03-04T06:00:00Z"
    }
  ],
  "meta": {
    "pagination": {
      "total": 1,
      "limit": 20,
      "offset": 0,
      "hasMore": false
    }
  }
}
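The limit and offset parameters, together with the hasMore flag in the response, are enough to walk through every page of results. The following is a minimal sketch of that loop, not an official client: build_list_url, iter_models, and the fetch_page callable are hypothetical names, and fetch_page is assumed to perform the authenticated GET and return the decoded JSON body.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode

BASE_URL = "https://app.vendodata.com/api/v1/models"


def build_list_url(limit: int = 20, offset: int = 0, **filters: str) -> str:
    """Build a List Models URL from the documented query parameters."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"limit": limit, "offset": offset, **filters}
    return f"{BASE_URL}?{urlencode(params)}"


def iter_models(
    fetch_page: Callable[[str], dict], limit: int = 20, **filters: str
) -> Iterator[dict]:
    """Yield every model, following meta.pagination.hasMore.

    fetch_page is a caller-supplied (hypothetical) function that sends the
    authenticated GET request and returns the parsed JSON response.
    """
    offset = 0
    while True:
        page = fetch_page(build_list_url(limit=limit, offset=offset, **filters))
        yield from page["data"]
        if not page["meta"]["pagination"]["hasMore"]:
            break
        offset += limit
```

Passing the HTTP call in as a function keeps the paging logic independent of any particular HTTP library and makes it easy to exercise with canned responses.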

Get Model Details

Retrieve details for a specific model.

GET /api/v1/models/{modelId}

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| modelId | string (UUID) | The model’s unique identifier |

Example Request

curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://app.vendodata.com/api/v1/models/ff0e8400-e29b-41d4-a716-446655440010

Example Response

{
  "data": {
    "id": "ff0e8400-e29b-41d4-a716-446655440010",
    "accountId": "123e4567-e89b-12d3-a456-426614174000",
    "name": "Customer LTV Model",
    "description": "Calculate customer lifetime value from order history",
    "language": "python",
    "code": "import pandas as pd\n\ndef transform(input_df):\n # Calculate LTV per customer\n ltv = input_df.groupby('customer_id').agg({\n 'order_total': 'sum',\n 'order_id': 'count'\n }).reset_index()\n ltv.columns = ['customer_id', 'lifetime_value', 'order_count']\n return ltv",
    "inputTable": "orders",
    "outputTable": "customer_ltv",
    "schedule": {
      "frequency_value": 1,
      "frequency_unit": "days",
      "daily_option": "morning"
    },
    "state": "active",
    "status": "completed",
    "lastRunAt": "2024-03-04T06:00:00Z",
    "lastError": null,
    "createdAt": "2024-02-01T10:00:00Z",
    "updatedAt": "2024-03-04T06:00:00Z"
  }
}

Create Model

Create a new Python data model.

POST /api/v1/models

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Model name |
| description | string | No | Description of what the model does |
| code | string | Yes | Python code with a transform(input_df) function |
| inputTable | string | Yes | BigQuery table to read from |
| outputTable | string | Yes | BigQuery table to write results to |
| schedule | object | No | Execution schedule |

Code Requirements

Your Python code must define a transform function that:

  • Accepts a pandas DataFrame as input
  • Returns a pandas DataFrame as output

import pandas as pd

def transform(input_df: pd.DataFrame) -> pd.DataFrame:
    # Your transformation logic here
    result = input_df.copy()
    result['new_column'] = result['existing_column'] * 2
    return result
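Before submitting code, it can help to confirm that the source at least has the required shape. The sketch below uses Python's standard ast module to check for a top-level transform function that takes one argument and returns something. The function name find_transform_problems is hypothetical, and this is a local convenience check only; it says nothing about how the service itself validates code.

```python
import ast


def find_transform_problems(source: str) -> list[str]:
    """Return a list of problems; an empty list means the basic shape looks right."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    # Only top-level definitions count; nested functions are ignored here.
    funcs = [
        node for node in tree.body
        if isinstance(node, ast.FunctionDef) and node.name == "transform"
    ]
    if not funcs:
        return ["no top-level function named 'transform'"]

    problems = []
    func = funcs[-1]  # a later definition shadows an earlier one
    positional = func.args.posonlyargs + func.args.args
    required = len(positional) - len(func.args.defaults)
    if not positional and func.args.vararg is None:
        problems.append("transform must accept the input DataFrame as an argument")
    elif required > 1:
        problems.append("transform must be callable with a single argument")

    # Static check only: it cannot tell whether the returned value is a DataFrame.
    has_return = any(
        isinstance(node, ast.Return) and node.value is not None
        for node in ast.walk(func)
    )
    if not has_return:
        problems.append("transform never returns a value")
    return problems
```

Because the source is parsed rather than executed, the check runs even on a machine without pandas installed.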

Example Request

curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily Revenue Summary",
    "description": "Aggregate daily revenue by product category",
    "code": "import pandas as pd\n\ndef transform(input_df):\n return input_df.groupby([\"date\", \"category\"]).agg({\"revenue\": \"sum\"}).reset_index()",
    "inputTable": "orders",
    "outputTable": "daily_revenue",
    "schedule": {
      "frequencyValue": 1,
      "frequencyUnit": "days",
      "dailyOption": "morning"
    }
  }' \
  https://app.vendodata.com/api/v1/models

Example Response

{
  "data": {
    "id": "110e8400-e29b-41d4-a716-446655440011",
    "accountId": "123e4567-e89b-12d3-a456-426614174000",
    "name": "Daily Revenue Summary",
    "language": "python",
    "state": "active",
    "status": "pending",
    "createdAt": "2024-03-04T15:00:00Z"
  }
}
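Embedding Python source in a JSON string by hand means escaping every newline and quote, which is easy to get wrong. As a sketch of one way around that, the helper below lets json.dumps do the escaping and builds the request with the standard library. The name build_create_request is hypothetical; the endpoint, headers, and field names are the ones documented in this section.

```python
import json
from typing import Optional
from urllib.request import Request

API_URL = "https://app.vendodata.com/api/v1/models"


def build_create_request(
    api_key: str,
    name: str,
    code: str,
    input_table: str,
    output_table: str,
    description: Optional[str] = None,
    schedule: Optional[dict] = None,
) -> Request:
    """Build (but do not send) a Create Model request."""
    payload = {
        "name": name,
        "code": code,
        "inputTable": input_table,
        "outputTable": output_table,
    }
    missing = [field for field, value in payload.items() if not value]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    # Optional fields are only sent when the caller provides them.
    if description is not None:
        payload["description"] = description
    if schedule is not None:
        payload["schedule"] = schedule

    return Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),  # json.dumps escapes the code string
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

The returned object can be passed to urllib.request.urlopen to send it; the model source can be read from a regular .py file so it never has to be escaped manually.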

Update Model

Update an existing model.

PATCH /api/v1/models/{modelId}

Request Body

| Field | Type | Description |
| --- | --- | --- |
| name | string | Updated name |
| description | string | Updated description |
| code | string | Updated Python code |
| inputTable | string | Updated input table |
| outputTable | string | Updated output table |
| schedule | object | Updated schedule |

Example Request

curl -X PATCH \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "schedule": {
      "frequencyValue": 6,
      "frequencyUnit": "hours"
    }
  }' \
  https://app.vendodata.com/api/v1/models/ff0e8400-e29b-41d4-a716-446655440010

Delete Model

Soft-delete a model.

DELETE /api/v1/models/{modelId}

Example Request

curl -X DELETE \
  -H "Authorization: Bearer YOUR_API_KEY" \
  https://app.vendodata.com/api/v1/models/110e8400-e29b-41d4-a716-446655440011

Run Model

Manually trigger a model execution.

POST /api/v1/models/{modelId}/run

Example Request

curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  https://app.vendodata.com/api/v1/models/ff0e8400-e29b-41d4-a716-446655440010/run

Example Response

{
  "data": {
    "jobId": "job_model_12345",
    "status": "dispatched",
    "message": "Model execution job has been dispatched"
  }
}
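The run endpoint only confirms that a job was dispatched, and this page does not document an endpoint for looking up a job by jobId. One possible way to wait for the result is to poll Get Model Details until lastRunAt changes. The sketch below rests on several assumptions that should be verified against the real API: that lastRunAt is updated when a run finishes, that status then reads "completed", and that a "failed" status exists (it does not appear in this document). wait_for_run and get_model are hypothetical names.

```python
import time
from typing import Callable, Optional

# "completed" appears in the examples on this page; "failed" is an assumption.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_run(
    get_model: Callable[[], dict],
    previous_run_at: Optional[str],
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the model records a run newer than previous_run_at.

    get_model is a caller-supplied (hypothetical) function that calls
    GET /api/v1/models/{modelId} and returns the parsed JSON response.
    """
    deadline = clock() + timeout_s
    while True:
        model = get_model()["data"]
        is_new_run = model.get("lastRunAt") != previous_run_at
        if is_new_run and model.get("status") in TERMINAL_STATUSES:
            return model
        if clock() >= deadline:
            raise TimeoutError("model run did not finish before the timeout")
        sleep(interval_s)
```

Record the model's lastRunAt before triggering the run and pass it as previous_run_at, so a stale "completed" status from an earlier run is not mistaken for the new one.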

Available Python Libraries

Models run in a secure Python environment with these libraries pre-installed:

| Library | Version | Description |
| --- | --- | --- |
| pandas | 2.x | Data manipulation and analysis |
| numpy | 1.x | Numerical computing |
| scikit-learn | 1.x | Machine learning |
| google-cloud-bigquery | 3.x | BigQuery client (for advanced queries) |

Note: External network access is disabled for security. All data must come from the input table.
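When developing locally, a mismatch between your installed libraries and the versions listed above can cause code to behave differently once deployed. The following sketch compares local major versions against the table using the standard importlib.metadata module. find_version_mismatches is a hypothetical helper, and only the major version is compared because the table gives no finer detail.

```python
from importlib import metadata
from typing import Callable

# Major versions taken from the library table in this section.
DOCUMENTED_MAJORS = {
    "pandas": 2,
    "numpy": 1,
    "scikit-learn": 1,
    "google-cloud-bigquery": 3,
}


def find_version_mismatches(
    get_version: Callable[[str], str] = metadata.version,
) -> dict[str, str]:
    """Map each package whose local major version differs to a short explanation."""
    mismatches = {}
    for package, major in DOCUMENTED_MAJORS.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            mismatches[package] = "not installed"
            continue
        if installed.split(".")[0] != str(major):
            mismatches[package] = f"installed {installed}, documented {major}.x"
    return mismatches
```

Calling find_version_mismatches() with no arguments inspects the current environment; an empty dictionary means every listed library matches at the major-version level.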


Best Practices

  1. Keep models focused: each model should do one thing well.
  2. Use meaningful names: name output tables descriptively (e.g., customer_segments, daily_revenue).
  3. Handle edge cases: check for empty DataFrames and null values.
  4. Test locally first: develop and test your transform function locally before deploying.
  5. Monitor execution time: long-running models may time out, so optimize for performance.
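As an illustration of the edge-case and local-testing advice, here is a sketch of a transform that tolerates an empty input table and null values. The column names customer_id and order_total are borrowed from the Customer LTV example earlier on this page; treating a missing order total as zero and dropping rows without a customer are illustrative choices, not platform requirements.

```python
import pandas as pd


def transform(input_df: pd.DataFrame) -> pd.DataFrame:
    """Total order value per customer, tolerating empty input and nulls."""
    output_columns = ["customer_id", "lifetime_value"]

    # An empty input still returns a DataFrame with the expected columns.
    if input_df.empty:
        return pd.DataFrame(columns=output_columns)

    # Rows without a customer cannot be attributed, so they are dropped.
    cleaned = input_df.dropna(subset=["customer_id"]).copy()
    # Treat a missing order total as zero (an illustrative choice).
    cleaned["order_total"] = cleaned["order_total"].fillna(0)

    result = cleaned.groupby("customer_id", as_index=False)["order_total"].sum()
    return result.rename(columns={"order_total": "lifetime_value"})
```

To test locally, build a small DataFrame by hand that includes a null customer_id and a null order_total, call transform on it, and then call it again on an empty DataFrame with the same columns.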