Beyond Batch Reports
Business users expect real-time visibility — live order counts, active user metrics, revenue tracking, and operational health dashboards that update in seconds, not hours. Building real-time analytics requires a different architecture from traditional batch reporting. At Nexis Limited, our products include real-time dashboards — Bondorix tracks shipments in real time, and Ultimate HRM displays live workforce metrics.
Architecture Overview
A real-time analytics pipeline has four layers:
- Data ingestion: Capture events from application services as they happen.
- Stream processing: Transform, aggregate, and enrich events in real time.
- Storage: Store processed data in a query-optimized database.
- Visualization: Display data in dashboards with real-time updates.
Data Ingestion
Capture events at the source — application services publish events to a message broker (Kafka, AWS Kinesis, or Redis Streams) whenever significant actions occur: user signups, orders placed, shipments dispatched, errors encountered. Events should include a timestamp, event type, and all relevant context.
Design events to be immutable facts: "Order #123 was placed at 2026-07-30T14:23:00Z by User #456 for $89.50." Events are append-only — they are never modified or deleted.
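One way to model an immutable event like this in Python is a frozen dataclass. This is a minimal sketch — the class and field names are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen=True makes instances immutable
class OrderPlaced:
    order_id: int
    user_id: int
    amount_usd: float
    event_type: str = "order.placed"
    # Timestamp is captured once, at event creation, in UTC.
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

event = OrderPlaced(order_id=123, user_id=456, amount_usd=89.50)
# Any attempt to mutate the event raises FrozenInstanceError.
# Serialize it and publish the payload to the broker as-is.
payload = asdict(event)
```

Because the instance cannot be modified after creation, corrections are expressed as new events (for example, an `order.cancelled` event) rather than edits to history.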
Stream Processing
Stream processing transforms raw events into analytics-ready data:
- Aggregation: Count events per time window (orders per minute, signups per hour).
- Enrichment: Join event data with reference data (add customer name to order events).
- Filtering: Remove noise and extract relevant events for specific dashboards.
- Windowing: Group events into time windows (tumbling, sliding, session) for time-series analysis.
Tools include Apache Flink, Kafka Streams, Apache Spark Streaming, and AWS Kinesis Data Analytics. For simpler use cases, application-level aggregation with periodic database writes may be sufficient.
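To make the aggregation and windowing ideas concrete, here is a pure-Python sketch of tumbling-window counting — the kind of logic Flink or Kafka Streams would run for you at scale. The event shape and one-minute window size are assumptions for illustration:

```python
from collections import Counter

WINDOW_SECONDS = 60  # tumbling one-minute windows

def window_start(epoch_seconds: int) -> int:
    """Map an event timestamp to the start of its tumbling window."""
    return epoch_seconds - (epoch_seconds % WINDOW_SECONDS)

def count_per_window(events: list[dict]) -> Counter:
    """Count events per (window, event_type) pair, e.g. orders per minute."""
    counts: Counter = Counter()
    for e in events:
        counts[(window_start(e["ts"]), e["type"])] += 1
    return counts

events = [
    {"ts": 100, "type": "order.placed"},
    {"ts": 119, "type": "order.placed"},
    {"ts": 125, "type": "signup"},  # falls in the next window (starts at 120)
]
counts = count_per_window(events)
# counts[(60, "order.placed")] == 2; counts[(120, "signup")] == 1
```

Tumbling windows are non-overlapping; a sliding window would assign each event to every window it overlaps, and a session window would close after a gap of inactivity.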
Analytics Storage
Choose storage optimized for analytical queries:
- ClickHouse: Column-oriented database designed for real-time analytics. Excellent performance for aggregation-heavy queries on large datasets. Our preferred choice for analytics workloads.
- TimescaleDB: Time-series database built on PostgreSQL. Good for teams already using PostgreSQL who need time-series analytics.
- Apache Druid: Real-time analytics database designed for sub-second queries on event data.
- Materialized views in PostgreSQL: For simpler use cases, materialized views that refresh periodically can serve analytics queries from your existing database.
Dashboard Visualization
Real-Time Updates
Dashboards receive updates through:
- Polling: Frontend polls the API periodically (every 5-30 seconds). Simplest to implement.
- WebSockets: Server pushes updates to the frontend as they happen. Lower latency but more complex.
- Server-Sent Events (SSE): Server streams updates to the frontend. Simpler than WebSockets for one-directional data flow.
Visualization Libraries
- Recharts / Nivo: React-based charting libraries. Good for custom dashboards.
- D3.js: Low-level visualization library for custom, interactive charts.
- Grafana: Open-source dashboarding platform with built-in support for many data sources. Excellent for operational dashboards.
- Apache Superset: Open-source BI and visualization platform.
Performance Considerations
- Pre-aggregate: Compute aggregations during stream processing, not at query time. Dashboards should read pre-computed values, not run complex queries on raw data.
- Limit data granularity: Real-time dashboards typically show data at minute or hour granularity. Second-level granularity is rarely needed and increases storage and query costs.
- Cache dashboard queries: Cache API responses for frequently accessed dashboards. A 5-second cache is still "real-time" for most business purposes.
- Paginate and filter: Large datasets should be filtered by time range and aggregation level — do not attempt to load all data at once.
Conclusion
Real-time analytics dashboards require a purpose-built data pipeline — from event ingestion through stream processing to optimized storage and visualization. Start with clear requirements on latency, scale, and the specific metrics you need to display. Choose components that match your scale — simpler solutions (polling + PostgreSQL views) work for many applications before investing in full streaming architectures.
Building analytics capabilities? Our team designs and builds real-time analytics platforms.