Migrating from Timbr.ai's ontology-based semantic layer to Databricks Unity Catalog Metric Views requires converting knowledge graphs to YAML metric definitions, reimplementing relationships as joins, and translating semantic SQL to MEASURE() syntax. Typedef's Fenic framework bridges the migration gap by providing declarative semantic operations that work with Databricks, enabling teams to maintain semantic intelligence while gaining native lakehouse performance and governance.
Why Organizations Migrate from Timbr.ai to Databricks
The Consolidation Imperative
Organizations running Timbr.ai typically face infrastructure complexity: semantic layer sits atop existing databases, creating an additional query hop and operational overhead. When Databricks already serves as the primary data platform for ML, data engineering, and analytics, maintaining separate semantic infrastructure introduces friction.
The consolidation drivers:
- Single platform operations: Eliminate Timbr.ai licensing and infrastructure costs
- Native Unity Catalog integration: Metrics become first-class catalog objects with lineage
- Unified governance: RBAC, audit logs, and access control managed in one place
- Performance gains: Query execution happens natively in Spark/Photon without external hops
- Lakehouse alignment: Metrics work seamlessly across SQL, Python notebooks, and ML pipelines
When Migration Makes Sense
Migrate from Timbr.ai when:
- Databricks handles 80%+ of analytical workloads
- Data science and BI teams need unified metric definitions
- Infrastructure consolidation drives cost reduction
- Unity Catalog already governs data assets
- Native Databricks AI/BI features (Genie, Assistant) are strategic
Maintain Timbr.ai when:
- Multi-warehouse architecture spans Snowflake, BigQuery, Databricks
- SPARQL queries and RDF ontologies are requirements
- Heavy investment in OWL-based domain ontologies
- Data federation across external sources is core use case
Architecture Comparison: Timbr.ai vs Databricks Metric Views
Timbr.ai Architecture
Virtual knowledge graph sits between consumers and data sources:
BI Tools / Notebooks
↓
Semantic SQL (Timbr)
↓
SQL Knowledge Graph
↓
Ontology Mapping
↓
Physical Databases/Data Lakes
Key characteristics:
- Ontology-driven: OWL/RDFS concepts mapped to SQL tables
- Relationship-first: Explicit semantic relationships replace JOINs
- Data virtualization: No data movement, query pushdown to sources
- Inference capabilities: Semantic reasoning over relationships
- SQL interface: Standard SQL queries against ontology concepts
Databricks Metric Views Architecture
Native catalog objects within the lakehouse:
BI Tools / Notebooks / Genie
↓
MEASURE() SQL syntax
↓
Unity Catalog Metric Views
↓
Spark SQL / Photon
↓
Delta Lake Tables
Key characteristics:
- Measure-driven: Aggregations defined independently of dimensions
- Schema-based joins: Star/snowflake schema relationships in YAML
- Native execution: Spark SQL compiles measures into optimized queries
- Unity Catalog integration: Metrics are governed catalog objects
- Flexible aggregation: Query measures across any dimension at runtime
Critical Differences
| Aspect | Timbr.ai | Databricks Metric Views |
|---|---|---|
| Semantic Model | Ontology (concepts, properties, relationships) | Metric View (measures, dimensions, joins) |
| Query Syntax | Semantic SQL (relationships implicit) | Standard SQL + MEASURE() clause |
| Execution | Virtual layer with pushdown | Native Spark SQL / Photon |
| Data Location | Federated across sources | Databricks Delta Lake |
| Governance | Timbr RBAC + source permissions | Unity Catalog RBAC |
| Inference | Built-in semantic reasoning | No inference (explicit definitions) |
Migration Process Overview
Phase 1: Assessment and Planning
1. Inventory Timbr.ai Assets
Document existing semantic infrastructure:
sql-- Query Timbr metadata to list all ontologies SELECT ontology_name, concept_count, relationship_count FROM timbr.metadata.ontologies; -- Export ontology definitions SELECT concept_name, properties, relationships FROM timbr.metadata.concepts WHERE ontology_name = 'sales_ontology';
Catalog:
- Ontologies: Business domains modeled (Sales, Customer, Product)
- Concepts: Entity types defined (Customer, Order, Product)
- Relationships: Connections between concepts (Customer → Order, Order → Product)
- Measures: Aggregations and calculations
- Access patterns: Most queried concepts and relationships
2. Map Data Sources to Databricks
Identify where Timbr.ai queries data:
Timbr SQL Knowledge Graph
↓
Federated Sources:
- Oracle (CRM data)
- PostgreSQL (Orders)
- Snowflake (Analytics)
Migration strategy per source:
- Already in Databricks: Direct mapping to Delta tables
- External databases: Replicate to Databricks via Delta Live Tables or ETL
- Real-time sources: Consider Databricks streaming ingestion
3. Assess Query Patterns
Analyze how business users query Timbr.ai:
sql-- Example Timbr semantic SQL SELECT customer.region, SUM(order.amount) as total_revenue FROM customer WHERE customer.tier = 'enterprise'
Note implicit relationships: Timbr automatically joined customer ↔ order via ontology relationships. Databricks requires explicit join definitions in metric views.
Phase 2: Data Consolidation
1. Replicate External Data to Databricks
For data outside Databricks, establish ingestion pipelines:
python# Delta Live Tables pipeline example import dlt from pyspark.sql.functions import * @dlt.table( name="customers_bronze", comment="Raw customer data from Oracle CRM" ) def customers_bronze(): return ( spark.readStream .format("jdbc") .option("url", "jdbc:oracle:thin:@crm-db:1521/ORCL") .option("dbtable", "customers") .option("user", dbutils.secrets.get("db", "oracle_user")) .option("password", dbutils.secrets.get("db", "oracle_password")) .load() ) @dlt.table( name="customers_silver", comment="Cleansed customer data with quality checks" ) @dlt.expect_or_drop("valid_email", "email IS NOT NULL AND email LIKE '%@%'") def customers_silver(): return ( dlt.read("customers_bronze") .withColumn("ingestion_timestamp", current_timestamp()) .dropDuplicates(["customer_id"]) )
2. Build Star Schema in Delta Lake
Transform data into dimensional models:
sql-- Create dimension tables CREATE TABLE IF NOT EXISTS sales.dim_customers ( customer_key BIGINT GENERATED ALWAYS AS IDENTITY, customer_id STRING, customer_name STRING, region STRING, tier STRING, created_date DATE ); CREATE TABLE IF NOT EXISTS sales.dim_products ( product_key BIGINT GENERATED ALWAYS AS IDENTITY, product_id STRING, product_name STRING, category STRING, price DECIMAL(10,2) ); -- Create fact table CREATE TABLE IF NOT EXISTS sales.fact_orders ( order_key BIGINT GENERATED ALWAYS AS IDENTITY, customer_key BIGINT, product_key BIGINT, order_date DATE, order_amount DECIMAL(10,2), quantity INT, CONSTRAINT fk_customer FOREIGN KEY (customer_key) REFERENCES sales.dim_customers(customer_key), CONSTRAINT fk_product FOREIGN KEY (product_key) REFERENCES sales.dim_products(product_key) );
Phase 3: Convert Ontologies to Metric Views
1. Map Timbr.ai Concepts to Databricks Tables
Timbr.ai ontology:
Concept: Customer
Properties: customer_id, name, region, tier
Concept: Order
Properties: order_id, order_date, amount
Relationship: Order.customer → Customer
Concept: Product
Properties: product_id, name, category
Relationship: Order.product → Product
Databricks mapping:
dim_customers → Customer concept
fact_orders → Order concept
dim_products → Product concept
Joins define relationships
2. Translate Measures to Metric View YAML
Timbr.ai measure definitions (conceptual):
sql-- Measures in Timbr typically defined in UI or metadata MEASURE total_revenue = SUM(Order.amount) MEASURE order_count = COUNT(Order.order_id) MEASURE avg_order_value = SUM(Order.amount) / COUNT(Order.order_id)
Databricks Metric View YAML:
yamlmetric_view: name: sales_metrics source_table: sales.fact_orders joins: - name: customers source: sales.dim_customers on: fact_orders.customer_key = customers.customer_key - name: products source: sales.dim_products on: fact_orders.product_key = products.product_key dimensions: - name: customer_region expr: customers.region description: "Customer geographic region" - name: customer_tier expr: customers.tier description: "Customer tier: enterprise, pro, free" - name: product_category expr: products.category description: "Product category" - name: order_date expr: fact_orders.order_date description: "Date order was placed" measures: - name: total_revenue expr: SUM(order_amount) description: "Total revenue from orders" - name: order_count expr: COUNT(*) description: "Number of orders" - name: avg_order_value expr: SUM(order_amount) / COUNT(*) description: "Average revenue per order" - name: unique_customers expr: COUNT(DISTINCT customer_key) description: "Number of distinct customers"
3. Create Metric View in Unity Catalog
sql-- Create metric view from YAML CREATE METRIC VIEW sales.sales_metrics FROM 'dbfs:/metric_views/sales_metrics.yaml'; -- Or define inline using SQL CREATE METRIC VIEW sales.sales_metrics AS SELECT SUM(order_amount) AS total_revenue, COUNT(*) AS order_count, COUNT(DISTINCT customer_key) AS unique_customers, customers.region AS customer_region, customers.tier AS customer_tier, products.category AS product_category FROM sales.fact_orders JOIN sales.dim_customers customers ON fact_orders.customer_key = customers.customer_key JOIN sales.dim_products products ON fact_orders.product_key = products.product_key;
Phase 4: Query Translation
From Timbr.ai Semantic SQL:
sql-- Implicit relationships via ontology SELECT customer.region, SUM(order.amount) as revenue FROM customer JOIN order ON customer.customer_id = order.customer_id WHERE customer.tier = 'enterprise' GROUP BY customer.region;
To Databricks Metric View:
sql-- Explicit MEASURE() syntax SELECT customer_region, MEASURE(total_revenue) AS revenue FROM sales.sales_metrics WHERE customer_tier = 'enterprise' GROUP BY customer_region;
Phase 5: Semantic Intelligence with Typedef
Databricks Metric Views handle aggregation but lack Timbr.ai's semantic extraction and enrichment. Typedef's Fenic framework fills this gap.
Semantic extraction from unstructured data:
pythonimport fenic as fc from pydantic import BaseModel, Field from typing import Literal # Define schema for customer feedback class FeedbackInsight(BaseModel): sentiment: Literal["positive", "neutral", "negative"] key_issues: list[str] product_mentions: list[str] urgency: Literal["low", "medium", "high"] # Configure Fenic session config = fc.SessionConfig( app_name="customer_feedback_enrichment", semantic=fc.SemanticConfig( language_models={ "fast": fc.OpenAILanguageModel(model_name="gpt-4o-mini", rpm=100, tpm=50000) }, default_language_model="fast" ) ) session = fc.Session.get_or_create(config) # Read feedback from Delta table feedback_df = session.read.delta("sales.customer_feedback") # Extract structured insights enriched_feedback = ( feedback_df .with_column("insights", fc.semantic.extract(fc.col("feedback_text"), FeedbackInsight)) .unnest("insights") .filter(fc.col("sentiment") == "negative") .filter(fc.col("urgency") == "high") ) # Write back to Delta Lake for metric views to consume enriched_feedback.write.delta("sales.feedback_insights")
Now Databricks Metric Views can aggregate over enriched semantic data:
yamlmetric_view: name: feedback_metrics source_table: sales.feedback_insights dimensions: - name: sentiment expr: sentiment - name: urgency expr: urgency measures: - name: feedback_count expr: COUNT(*) - name: negative_high_urgency_count expr: COUNT_IF(sentiment = 'negative' AND urgency = 'high')
Semantic join for entity resolution:
python# Join customer feedback with product catalog using semantic similarity products_df = session.read.parquet("sales/dim_products.parquet") # Semantic join matches product mentions in feedback to actual products matched_feedback = ( enriched_feedback .semantic.join( other=products_df, predicate=""" The product mention "{{ left_on }}" refers to the product "{{ right_on }}" """, left_on=fc.col("product_mentions"), right_on=fc.col("product_name") ) ) matched_feedback.write.parquet("sales/feedback_products.parquet")
This pattern bridges Timbr.ai's semantic intelligence with Databricks' native performance.
Migration Patterns for Complex Use Cases
Pattern 1: Hierarchical Dimensions
Timbr.ai approach: Ontology naturally represents hierarchies (Product → Category → Department)
Databricks approach: Define snowflake schema joins
yamlmetric_view: name: product_hierarchy_metrics source_table: sales.fact_orders joins: - name: products source: sales.dim_products on: fact_orders.product_key = products.product_key - name: categories source: sales.dim_categories on: products.category_key = categories.category_key - name: departments source: sales.dim_departments on: categories.department_key = departments.department_key dimensions: - name: product_name expr: products.product_name - name: category_name expr: categories.category_name - name: department_name expr: departments.department_name measures: - name: total_sales expr: SUM(order_amount)
Query at any hierarchy level:
sql-- Department level SELECT department_name, MEASURE(total_sales) FROM sales.product_hierarchy_metrics GROUP BY department_name; -- Category level SELECT category_name, MEASURE(total_sales) FROM sales.product_hierarchy_metrics GROUP BY category_name;
Pattern 2: Time-Based Analytics
Timbr.ai: Often handles date hierarchies through ontology relationships
Databricks: Define time dimensions explicitly
yamldimensions: - name: order_date expr: fact_orders.order_date - name: order_month expr: DATE_TRUNC('month', fact_orders.order_date) - name: order_quarter expr: DATE_TRUNC('quarter', fact_orders.order_date) - name: order_year expr: YEAR(fact_orders.order_date)
Period-over-period comparisons:
sqlWITH current_period AS ( SELECT customer_region, MEASURE(total_revenue) AS current_revenue FROM sales.sales_metrics WHERE order_quarter = DATE_TRUNC('quarter', CURRENT_DATE()) GROUP BY customer_region ), prior_period AS ( SELECT customer_region, MEASURE(total_revenue) AS prior_revenue FROM sales.sales_metrics WHERE order_quarter = DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE(), -3)) GROUP BY customer_region ) SELECT c.customer_region, c.current_revenue, p.prior_revenue, ((c.current_revenue - p.prior_revenue) / p.prior_revenue) * 100 AS growth_pct FROM current_period c JOIN prior_period p ON c.customer_region = p.customer_region;
Pattern 3: Complex Business Rules
Timbr.ai: Business logic embedded in ontology or measures
Databricks: Use filters and derived measures
yamlmetric_view: name: customer_segmentation_metrics source_table: sales.fact_orders # Base filter applied to all queries filter: "order_date >= CURRENT_DATE() - INTERVAL 365 DAYS" joins: - name: customers source: sales.dim_customers on: fact_orders.customer_key = customers.customer_key dimensions: - name: customer_segment expr: | CASE WHEN customers.tier = 'enterprise' AND customers.annual_spend > 100000 THEN 'enterprise_high' WHEN customers.tier = 'enterprise' THEN 'enterprise_standard' WHEN customers.tier = 'pro' THEN 'professional' ELSE 'standard' END measures: - name: active_customer_count expr: COUNT(DISTINCT CASE WHEN order_date >= CURRENT_DATE() - 90 THEN customer_key END) - name: churn_rate expr: | (COUNT(DISTINCT CASE WHEN last_order_date < CURRENT_DATE() - 90 THEN customer_key END) / COUNT(DISTINCT customer_key)) * 100
Testing and Validation
1. Compare Query Results
Run parallel queries against Timbr.ai and Databricks:
sql-- Timbr.ai SELECT customer.region, SUM(order.amount) as revenue, COUNT(order.order_id) as order_count FROM customer JOIN order ON customer.customer_id = order.customer_id WHERE order.order_date >= '2025-01-01' GROUP BY customer.region;
sql-- Databricks Metric View SELECT customer_region, MEASURE(total_revenue) as revenue, MEASURE(order_count) as order_count FROM sales.sales_metrics WHERE order_date >= '2025-01-01' GROUP BY customer_region;
Validate results match within acceptable tolerance (account for timestamp differences, rounding).
2. Performance Benchmarking
Compare query execution times:
pythonimport time # Timbr query via JDBC timbr_start = time.time() timbr_result = spark.read \ .format("jdbc") \ .option("url", "jdbc:timbr://...") \ .option("query", "SELECT ...") \ .load() timbr_duration = time.time() - timbr_start # Databricks native query databricks_start = time.time() databricks_result = spark.sql(""" SELECT customer_region, MEASURE(total_revenue) FROM sales.sales_metrics GROUP BY customer_region """) databricks_duration = time.time() - databricks_start print(f"Timbr: {timbr_duration:.2f}s, Databricks: {databricks_duration:.2f}s")
3. Semantic Quality Validation
When using Typedef for semantic enrichment, validate extraction accuracy:
python# Sample validation set sample_feedback = [ "Product is terrible, completely broken", "Loving the new features, works great", "App crashes on startup, urgent fix needed" ] # Extract insights results = ( session.create_dataframe({"feedback": sample_feedback}) .with_column("insights", fc.semantic.extract(fc.col("feedback"), FeedbackInsight)) .unnest("insights") .collect() ) # Manual validation for row in results: print(f"Feedback: {row['feedback']}") print(f"Detected sentiment: {row['sentiment']}") print(f"Urgency: {row['urgency']}") print("---")
Production Deployment Strategy
1. Phased Rollout
Phase A: Read-Only Validation (Weeks 1-2)
- Metric views created in
devcatalog - BI tools read from both Timbr.ai and Databricks
- Compare results in dashboards side-by-side
- No production traffic to Databricks yet
Phase B: Pilot Users (Weeks 3-4)
- Select 5-10 early adopters
- Switch their dashboards to Databricks metric views
- Collect feedback on performance and accuracy
- Timbr.ai remains primary for most users
Phase C: Gradual Migration (Weeks 5-8)
- Migrate teams incrementally (10% → 25% → 50% → 100%)
- Monitor query performance via Databricks Query History
- Track adoption through Unity Catalog insights
- Maintain Timbr.ai as fallback
Phase D: Complete Cutover (Week 9+)
- All users on Databricks metric views
- Decommission Timbr.ai infrastructure
- Archive ontology definitions for reference
2. Access Control Migration
Map Timbr.ai permissions to Unity Catalog:
sql-- Create groups matching Timbr.ai roles CREATE GROUP IF NOT EXISTS sales_analysts; CREATE GROUP IF NOT EXISTS finance_team; CREATE GROUP IF NOT EXISTS executive_viewers; -- Grant permissions on metric views GRANT SELECT ON METRIC VIEW sales.sales_metrics TO sales_analysts; GRANT SELECT ON METRIC VIEW sales.sales_metrics TO finance_team; GRANT SELECT ON METRIC VIEW sales.sales_metrics TO executive_viewers; -- Apply row-level security if needed ALTER TABLE sales.fact_orders SET ROW FILTER region_filter ON (region = current_user_region());
3. Documentation and Training
Create migration artifacts:
Metric view catalog (resource):
- List all metric views with descriptions
- Map old Timbr.ai concept names to new metric view names
- Document MEASURE() syntax examples
- Provide query patterns for common use cases
Training materials:
- Side-by-side query comparisons (Timbr.ai SQL vs Databricks)
- Video walkthrough of Catalog Explorer
- FAQs addressing common migration questions
Advanced: Maintaining Semantic Intelligence
Databricks Metric Views excel at aggregation but don't natively handle:
- Semantic extraction from unstructured text
- Entity resolution across inconsistent naming
- Classification into business taxonomies
- Relationship inference beyond explicit joins
Typedef's Fenic framework provides these capabilities through declarative semantic operators.
Semantic Enrichment Pipeline
pythonimport fenic as fc from pydantic import BaseModel, Field from typing import List, Literal # Schema for extracting structured data from support tickets class TicketExtraction(BaseModel): issue_category: Literal["bug", "feature_request", "question", "complaint"] affected_product: str = Field(description="Name of the product mentioned") severity: Literal["low", "medium", "high", "critical"] customer_sentiment: Literal["positive", "neutral", "negative"] action_items: List[str] # Configure session config = fc.SessionConfig( app_name="support_ticket_enrichment", semantic=fc.SemanticConfig( language_models={ "extractor": fc.OpenAILanguageModel(model_name="gpt-4o-mini", rpm=100, tpm=50000), "classifier": fc.AnthropicLanguageModel(model_name="claude-3-5-haiku-latest", rpm=50, input_tpm=100000, output_tpm=50000) }, default_language_model="extractor" ) ) session = fc.Session.get_or_create(config) # Read raw tickets from Delta tickets = session.read.parquet("support/raw_tickets.parquet") # Semantic enrichment pipeline enriched_tickets = ( tickets # Extract structured information .with_column("extracted", fc.semantic.extract(fc.col("ticket_text"), TicketExtraction, model_alias="extractor") ) .unnest("extracted") # Classify customer tier using semantic predicate .with_column("high_value_customer", fc.semantic.predicate( "Is this a high-value customer based on their history? {{ history }}", history=fc.col("customer_history"), model_alias="classifier" ) ) # Semantic join to match affected product to product catalog .semantic.join( other=session.read.parquet("products/catalog.parquet"), predicate=""" The mentioned product "{{ left_on }}" refers to catalog product "{{ right_on }}" """, left_on=fc.col("affected_product"), right_on=fc.col("product_name") ) # Generate embeddings for similarity search .with_column("ticket_embedding", fc.semantic.embed(fc.col("ticket_text")) ) ) # Write enriched data back to Delta enriched_tickets.write.parquet("support/enriched_tickets.parquet")
Now create a metric view over enriched data:
sqlCREATE METRIC VIEW support.ticket_metrics FROM 'dbfs:/metric_views/ticket_metrics.yaml';
yamlmetric_view: name: ticket_metrics source_table: support.enriched_tickets joins: - name: products source: products.catalog on: enriched_tickets.product_key = products.product_key dimensions: - name: issue_category expr: issue_category - name: severity expr: severity - name: customer_sentiment expr: customer_sentiment - name: product_name expr: products.product_name measures: - name: ticket_count expr: COUNT(*) - name: critical_issues expr: COUNT_IF(severity = 'critical') - name: negative_sentiment_pct expr: (COUNT_IF(customer_sentiment = 'negative') / COUNT(*)) * 100 - name: avg_resolution_time expr: AVG(resolution_time_hours)
Query semantically enriched metrics:
sqlSELECT product_name, MEASURE(ticket_count), MEASURE(critical_issues), MEASURE(negative_sentiment_pct) FROM support.ticket_metrics WHERE severity IN ('high', 'critical') GROUP BY product_name ORDER BY MEASURE(critical_issues) DESC;
Common Migration Challenges and Solutions
Challenge 1: Timbr.ai Inference Rules
Problem: Timbr.ai ontologies support inference (e.g., "Enterprise customers are VIP customers"). Databricks has no native inference.
Solution: Materialize inferred relationships in Delta tables or use SQL CASE expressions:
sql-- Materialized inference CREATE OR REPLACE TABLE sales.customer_classifications AS SELECT customer_key, tier, CASE WHEN tier = 'enterprise' THEN TRUE WHEN lifetime_value > 50000 THEN TRUE ELSE FALSE END AS is_vip FROM sales.dim_customers; -- Reference in metric view metric_view: joins: - name: customers source: sales.customer_classifications on: fact_orders.customer_key = customers.customer_key dimensions: - name: is_vip expr: customers.is_vip
Challenge 2: Query Performance Regression
Problem: Initial Databricks queries slower than Timbr.ai due to unoptimized joins.
Solution: Optimize star schema and leverage Databricks features:
sql-- Create optimized tables with Z-ordering OPTIMIZE sales.fact_orders ZORDER BY (order_date, customer_key, product_key); -- Use liquid clustering for dimension tables ALTER TABLE sales.dim_customers CLUSTER BY (customer_key, region); -- Enable Photon for SQL warehouses (automatic in newer Databricks runtimes) -- Set compute settings: Photon Acceleration = ON
Challenge 3: Missing Semantic Context
Problem: Databricks metric views aggregate numbers but don't extract meaning from text fields.
Solution: Typedef enrichment pipeline (learn more):
python# Weekly batch enrichment job def enrich_feedback_batch(): session = fc.Session.get_or_create(config) # Read new feedback since last run new_feedback = ( session.read.parquet("support/raw_feedback.parquet") .filter(fc.col("processed") == False) ) # Semantic extraction enriched = ( new_feedback .with_column("insights", fc.semantic.extract(fc.col("feedback_text"), FeedbackSchema)) .unnest("insights") .with_column("embedding", fc.semantic.embed(fc.col("feedback_text"))) .with_column("processed", fc.lit(True)) ) # Merge back to source enriched.write.parquet("support/raw_feedback.parquet", mode="overwrite") # Schedule as Databricks job # Runs daily at 2 AM UTC
Monitoring and Observability
Track Migration Success
Unity Catalog Insights:
sql-- Query frequency by metric view SELECT table_name, COUNT(*) as query_count, AVG(execution_time_ms) as avg_execution_ms FROM system.access.audit WHERE table_catalog = 'sales' AND table_schema = 'metrics' AND action_name = 'SELECT' AND event_date >= CURRENT_DATE() - 30 GROUP BY table_name ORDER BY query_count DESC;
Typedef Pipeline Metrics:
python# Metrics are always included in QueryResult - no parameter needed result = enriched_tickets.collect() print(f"Total LLM tokens: {result.metrics.lm_metrics.total_tokens}") print(f"Estimated cost: ${result.metrics.lm_metrics.estimated_cost:.4f}") print(f"Pipeline execution time: {result.metrics.execution_time}s")
Conclusion
Migrating from Timbr.ai to Databricks Unity Catalog Metric Views consolidates semantic layer infrastructure while gaining native lakehouse performance. The migration requires converting ontology concepts to metric view YAML, reimplementing relationships as explicit joins, and translating semantic SQL to MEASURE() syntax.
For organizations that valued Timbr.ai's semantic intelligence—extracting meaning from text, resolving entities, inferring relationships—Typedef's Fenic framework bridges the gap. Declarative semantic operators provide the extraction and classification capabilities Databricks lacks, while Unity Catalog Metric Views handle governed aggregation at scale.
The result: unified metrics definitions across SQL, Python, and AI tools, native Unity Catalog governance, and the semantic processing power needed for modern data workloads.
Migration Resources:
- Building Reliable AI Pipelines with Fenic Semantic Operators
- Create Composable Semantic Operators for Data Transformation
- Eliminate Fragile Glue Code in AI Data Processing
Get Started:
