Data Ops & Analytics: Unlocking the Hidden Value of Runtime Data

A wide gap has traditionally separated software engineering from data engineering. Application code is usually responsible only for writing data to the database, while downstream tasks such as data quality validation, schema drift detection, and real-time dashboard analytics are delegated to heavy offline ETL pipelines or specialized CDC (Change Data Capture) tools like Debezium.

As a next-generation data runtime, TeaQL moves many of these traditionally offline analysis tasks into the application layer, enforcing quality control and preliminary dimension reduction at the point where data is created and moves through the system.

(Note: Some advanced governance features are already implemented in the Java version and are on the roadmap to be ported progressively to the Rust version.)

1. Schema Drift Detection & Self-Healing

In fast-iterating microservice architectures, the entity structure in code often drifts away from the actual database table structure, and the mismatch can cause serious failures at runtime.

At startup, TeaQL's first priority is to construct a robust in-memory metadata registry (EntityDescriptor). Building on this, the DataServiceExecutor provides powerful self-bootstrapping and validation capabilities:

Automatic Table Creation & Column Appending: If the table for a new entity does not exist, TeaQL can automatically issue a CREATE TABLE statement. If new fields are added to an existing entity, TeaQL detects the missing columns and runs ALTER TABLE ADD COLUMN.
Initial Data Seeding (Initial Graphs): When an enumeration or configuration table is newly created, TeaQL lets you register initial graph mutations during the UserContext startup phase. The system checks whether the table already contains data before inserting, which makes the seeding idempotent.

This can remove the need for development teams to maintain complex external SQL migration scripts, which reduces operational cognitive load in monoliths and lightweight microservice deployments.
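The detection step described above can be sketched as a pure function that compares the columns the code expects with the columns the database reports, and returns the DDL needed to close the gap. This is a minimal illustration, not TeaQL's implementation: `EntitySchema` and `plan_ddl` are hypothetical names, and the real EntityDescriptor carries far more metadata.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for an entity descriptor: a table name
/// plus the columns the code expects, as (column name, SQL type) pairs.
pub struct EntitySchema {
    pub table: String,
    pub columns: Vec<(String, String)>,
}

/// Compare the expected schema with the columns actually present in the
/// database and return the DDL statements needed to close the gap.
/// `actual_columns` is `None` when the table does not exist yet.
pub fn plan_ddl(
    expected: &EntitySchema,
    actual_columns: Option<&BTreeSet<String>>,
) -> Vec<String> {
    match actual_columns {
        // Table missing: create it with every expected column.
        None => {
            let cols: Vec<String> = expected
                .columns
                .iter()
                .map(|(name, ty)| format!("{} {}", name, ty))
                .collect();
            vec![format!("CREATE TABLE {} ({})", expected.table, cols.join(", "))]
        }
        // Table present: append only the columns that are missing.
        Some(actual) => expected
            .columns
            .iter()
            .filter(|(name, _)| !actual.contains(name))
            .map(|(name, ty)| {
                format!("ALTER TABLE {} ADD COLUMN {} {}", expected.table, name, ty)
            })
            .collect(),
    }
}
```

Keeping the planning step free of side effects means the statements can be logged or reviewed before anything is executed. An empty result means the table already matches the code.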

2. Facet: Dimension Reduction for Dashboards

When developing backend dashboards, it is common to display the active user list alongside aggregate data such as "user count distribution by membership level" or "daily active trends over the past seven days".

To achieve this traditionally, developers must issue multiple queries: one SELECT * to fetch details, and several SELECT COUNT(*) ... GROUP BY queries to gather statistics.

TeaQL's built-in SmartList<T> combined with facet_by expressions resolves this redundancy directly in the programming model:

// A single query retrieves both the detail list and the aggregate facets
let active_users = Q::users()
    .comment("Query users")
    .purpose("Load data")
    .with_status_eq("ACTIVE")
    // Facet: count distribution by user level
    .facet_by_level_as(
        "level_distribution",
        Q::user_level()
            .comment("Query user_level")
            .purpose("Load data")
            .count_users(),
    )
    .execute_for_smart_list(&ctx)
    .await?;

// The detail list itself
println!("Returned {} user records", active_users.items().len());

// The aggregate facet returned alongside the list
let distribution = active_users.facets().get("level_distribution");
// Illustrative contents: [ {"VIP": 120}, {"Normal": 800} ]

This pattern can cut the code needed for backend statistics interfaces by about 80%, and it reduces the number of round trips to the database.
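To make the shape of the result concrete, here is a minimal in-memory sketch of a list that carries a named facet alongside its items. It illustrates the "details plus aggregates from one request" idea only. `User`, `FacetedList`, and `active_users_with_level_facet` are hypothetical names, this is not TeaQL's SmartList<T>, and in a real deployment the aggregation would presumably run in the database rather than in application memory.

```rust
use std::collections::BTreeMap;

/// Hypothetical user row; only the fields needed for the example.
#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub name: String,
    pub level: String,
    pub status: String,
}

/// Illustrative stand-in for a list that carries named facets with it.
/// Each facet maps a group key (for example a level name) to a count.
#[derive(Debug, Default)]
pub struct FacetedList {
    pub items: Vec<User>,
    pub facets: BTreeMap<String, BTreeMap<String, usize>>,
}

/// Keep only ACTIVE users and count them per level in the same pass,
/// so the caller gets the detail list and the distribution together.
pub fn active_users_with_level_facet(users: &[User]) -> FacetedList {
    let mut result = FacetedList::default();
    let mut by_level: BTreeMap<String, usize> = BTreeMap::new();
    for user in users.iter().filter(|u| u.status == "ACTIVE") {
        *by_level.entry(user.level.clone()).or_insert(0) += 1;
        result.items.push(user.clone());
    }
    result
        .facets
        .insert("level_distribution".to_string(), by_level);
    result
}
```

The caller reads `items` for the table view and `facets["level_distribution"]` for the chart, without issuing a second request.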

3. Toward Zero-ETL: Runtime CDC Event Streaming

With the rise of real-time analytics, synchronizing business changes to analytical databases such as ClickHouse or Doris with sub-second latency is now a standard requirement.

TeaQL takes a different approach to Change Data Capture (CDC). Because all mutations are funneled through entity.audit_as("Save operation").save(&ctx) and generate precise EntityEvent objects (detailing the modified dirty fields and the business intent), the framework already has a complete stream of data changes at runtime.
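The idea of an event that records both the dirty fields and the business intent can be sketched by diffing two field snapshots. This is an assumption-laden illustration: `ChangeEvent` and `diff_to_event` are hypothetical names, fields are modeled as plain strings, and the real EntityEvent type is not shown in this document.

```rust
use std::collections::BTreeMap;

/// Hypothetical change event: which entity changed, why, and which fields.
#[derive(Debug, PartialEq)]
pub struct ChangeEvent {
    pub entity_name: String,
    /// Business intent, e.g. the text passed to an audit call.
    pub intent: String,
    /// Field name -> (old value if the field existed before, new value).
    pub dirty_fields: BTreeMap<String, (Option<String>, String)>,
}

/// Diff two field snapshots and return an event only if something changed.
/// Fields present in `before` but absent from `after` are ignored here.
pub fn diff_to_event(
    entity_name: &str,
    intent: &str,
    before: &BTreeMap<String, String>,
    after: &BTreeMap<String, String>,
) -> Option<ChangeEvent> {
    let mut dirty_fields = BTreeMap::new();
    for (field, new_value) in after {
        let old_value = before.get(field);
        if old_value != Some(new_value) {
            dirty_fields.insert(field.clone(), (old_value.cloned(), new_value.clone()));
        }
    }
    if dirty_fields.is_empty() {
        return None;
    }
    Some(ChangeEvent {
        entity_name: entity_name.to_string(),
        intent: intent.to_string(),
        dirty_fields,
    })
}
```

Because unchanged fields never enter the event, a consumer sees only what actually changed and the stated reason for the change.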

In the future, by registering custom EntityEventSink plugins, TeaQL will enable CDC event streaming without any additional capture infrastructure:

// Concept demo: stream precise events directly to Kafka without deploying external listeners
use async_trait::async_trait;

#[async_trait]
impl EntityEventSink for RealtimeDataWarehouseSink {
    async fn on_updated(&self, ctx: &UserContext, event: EntityUpdatedEvent) {
        if event.entity_name == "Order" {
            // Push precise order status changes directly to the stream processing engine.
            // `kafka_producer` is assumed to be a field on the sink; delivery errors are ignored in this demo.
            let _ = self.kafka_producer.send(event.to_json_bytes()).await;
        }
    }
}

Compared to reading low-level database binlogs, this application-level interception captures changes together with their domain intent, providing a cleaner, more direct data source for downstream analytics.
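The sink mechanism can be tried out without any broker by using a standard-library channel as a stand-in for the stream. The sketch below is synchronous and self-contained. `UpdatedEvent`, `EventSink`, `ChannelSink`, and `publish` are hypothetical names chosen for illustration, and the EntityEventSink trait described above is async and may differ in shape.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical update event, reduced to what the example needs.
#[derive(Debug, Clone, PartialEq)]
pub struct UpdatedEvent {
    pub entity_name: String,
    pub payload: String,
}

/// Hypothetical synchronous sink trait used only for this sketch.
pub trait EventSink {
    fn on_updated(&self, event: &UpdatedEvent);
}

/// Forwards events for a single entity type into an in-process channel,
/// which stands in for a Kafka topic or similar stream.
pub struct ChannelSink {
    entity_name: String,
    tx: Sender<UpdatedEvent>,
}

impl ChannelSink {
    /// Create the sink together with the receiving end of its channel.
    pub fn new(entity_name: &str) -> (Self, Receiver<UpdatedEvent>) {
        let (tx, rx) = channel();
        let sink = ChannelSink {
            entity_name: entity_name.to_string(),
            tx,
        };
        (sink, rx)
    }
}

impl EventSink for ChannelSink {
    fn on_updated(&self, event: &UpdatedEvent) {
        if event.entity_name == self.entity_name {
            // A send error only means the receiver was dropped; ignore it here.
            let _ = self.tx.send(event.clone());
        }
    }
}

/// Fan one event out to every registered sink.
pub fn publish(sinks: &[Box<dyn EventSink>], event: &UpdatedEvent) {
    for sink in sinks {
        sink.on_updated(event);
    }
}
```

Filtering inside the sink keeps the publishing side unaware of who consumes which entity, so new consumers can be added by registering another sink.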

Summary

Looking forward, data is about more than just saving and retrieving. TeaQL integrates a code-defined schema protection system, a multi-dimensional statistics engine, and real-time change streams into a unified runtime, equipping your application with powerful Data Ops capabilities from day one.
