How InfraSight is Leveraging Databricks to Scale Data Services with Flexibility and Efficiency

Mateus Oliveira – VP of Data Services, InfraSight

In a modern data organization, delivering valuable insights while maintaining operational efficiency is no longer optional — it’s expected. At InfraSight Software, we’re committed to building data platforms that are not only scalable and reliable but also designed to evolve with the changing needs of our clients and with our own innovation roadmap.

To that end, Databricks has become our most important strategic platform.

While it’s often viewed as a solution for large enterprises with complex AI workloads, we’ve shown that with the right approach, Databricks can be used with precision — and cost-efficiency — even in lean, agile environments.

The Starting Point

Our original telemetry architecture was based on traditional Spark job development, a proven approach for batch and streaming workloads. It enabled us to process OpenTelemetry data across multiple client environments.
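To make that starting point concrete, each client essentially got its own scheduled Spark job along these lines. This is a simplified, illustrative sketch rather than our actual code — the paths, columns, and table names are made up:

```python
# Illustrative per-client batch job (hypothetical paths, columns, and table names).
# Jobs like this had to be cloned, configured, and scheduled separately for each client.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("otel-telemetry-client-a").getOrCreate()

# Read the raw OpenTelemetry export landed for one client.
raw = spark.read.json("s3://telemetry-landing/client_a/otel/")

# Flatten and roll up into daily metrics.
daily_metrics = (
    raw.withColumn("event_date", F.to_date("timestamp"))
       .groupBy("event_date", "service_name", "metric_name")
       .agg(F.avg("metric_value").alias("avg_value"))
)

daily_metrics.write.mode("overwrite").saveAsTable("telemetry.client_a_daily_metrics")
```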

However, as the scope of our services expanded, we faced new challenges:

  • The pipelines required manual orchestration and maintenance, increasing engineering effort;
  • Scaling to new clients introduced operational complexity;
  • Running large Spark jobs daily led to significant infrastructure expenses.

The architecture served us well, but it lacked the flexibility and automation needed to support our next phase of growth.

The Shift

To modernize our data services and improve operational efficiency, we migrated to a modular, declarative pipeline framework using Lakeflow Declarative Pipelines (formerly Delta Live Tables, or DLT).

This shift was not about simply reducing code — it was about building a pipeline engine that adapts dynamically to each customer and use case, without sacrificing performance or governance.

What We Achieved:

1. Reusable Pipeline Framework

We built parameterized DLT notebooks that allow each telemetry pipeline to be dynamically instantiated and configured. This eliminated the need to clone or manually maintain pipeline logic for each client.
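As a rough illustration of the pattern (simplified, with hypothetical configuration keys and table names), the same notebook reads its client-specific settings from the pipeline configuration and declares its tables from them:

```python
# Simplified sketch of a parameterized pipeline notebook (hypothetical config keys and names).
# In a DLT/Lakeflow notebook, `spark` and the pipeline configuration are provided by the
# runtime; each client gets its own pipeline instance pointing at this same code.
import dlt
from pyspark.sql import functions as F

client_id = spark.conf.get("pipeline.client_id")      # e.g. "client_a"
source_path = spark.conf.get("pipeline.source_path")  # per-client landing location


@dlt.table(name=f"bronze_telemetry_{client_id}", comment="Raw telemetry, ingested incrementally")
def bronze_telemetry():
    return (
        spark.readStream.format("cloudFiles")          # Auto Loader picks up only new files
        .option("cloudFiles.format", "json")
        .load(source_path)
        .withColumn("client_id", F.lit(client_id))
    )
```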

2. Customization Without Overhead

Using Databricks’ native APIs in Python, we retained full control over data quality rules, schema evolution, scheduling, and exception handling — while keeping the logic simple and maintainable.
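For example, data quality rules can be declared as expectations directly on a table definition, so invalid records are tracked or dropped without hand-written exception handling. Again, this is a simplified sketch — the rules and columns are illustrative:

```python
# Illustrative expectations on a cleaned "silver" table (hypothetical rules and columns).
import dlt
from pyspark.sql import functions as F

client_id = spark.conf.get("pipeline.client_id")  # same pipeline configuration as above


@dlt.table(name=f"silver_telemetry_{client_id}")
@dlt.expect_or_drop("valid_timestamp", "event_ts IS NOT NULL")   # drop unusable rows
@dlt.expect("known_service", "service_name IS NOT NULL")         # keep rows, track violations
def silver_telemetry():
    return (
        dlt.read_stream(f"bronze_telemetry_{client_id}")
        .withColumn("event_ts", F.to_timestamp("timestamp"))
        .select("client_id", "service_name", "event_ts", "metric_name", "metric_value")
    )
```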

3. Governed, Incremental Processing

DLT’s native support for incremental updates and lineage tracking allows us to deliver fresh insights efficiently and reliably, even as data volumes grow.
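Because downstream tables read their upstream tables as streams, each pipeline update processes only new records, and table-to-table lineage is captured automatically. Here is a simplified example of a downstream rollup, with illustrative names:

```python
# Illustrative incremental rollup (hypothetical table and column names).
import dlt
from pyspark.sql import functions as F


@dlt.table(name="gold_service_health", comment="5-minute service health rollup")
def gold_service_health():
    return (
        dlt.read_stream("silver_telemetry")        # only records added since the last update
        .withWatermark("event_ts", "10 minutes")   # bound state for late-arriving data
        .groupBy(F.window("event_ts", "5 minutes"), "service_name")
        .agg(
            F.avg("metric_value").alias("avg_value"),
            F.count("*").alias("sample_count"),
        )
    )
```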

Lower Costs, Higher Throughput, Stronger Foundation

This transformation has helped us move faster, scale more cleanly, and reduce costs — all while improving data quality, transparency, and reliability.

  • Runtime: from 14 hours/day (cumulative) with traditional Spark jobs to 1 hour/day with the DLT framework, a 93% reduction
  • Infrastructure cost: 55% total savings across Databricks and AWS
  • Pipeline maintainability: from manual, per-job logic to a modular, parameterized framework, with lower engineering effort

Closing the Loop

With our new Databricks-powered architecture, we’ve created more than just a data pipeline — we’ve built a foundation for intelligence.

The same telemetry data that powers observability dashboards is now enabling us to build next-gen features for clients and AI use cases.

We’ve closed the loop between observability and actionability, and Databricks plays a central role.

Why It Matters for C-Suite Leaders

For technology executives evaluating data infrastructure investments, here are the key insights from our experience:

  • Enterprise tools can be cost-effective when used surgically. We’ve reduced operational costs without giving up advanced capabilities.
  • Automation and customization don’t need to be opposites. We’ve achieved both through smart use of Databricks APIs and pipeline design patterns.
  • Well-structured data pipelines unlock long-term value. From observability to AI, the return on clean, governed telemetry is exponential.

This is not just about processing data efficiently. It’s about creating an ecosystem where data fuels innovation — both for our clients and for our own roadmap.

What’s Next

We continue to evolve our data infrastructure, using Databricks to push the boundaries of what’s possible in telemetry and financial data processing, observability, and AI integration.

The path forward is clear: modular, intelligent, and cost-conscious data services — built on a platform that gives us the flexibility to grow without compromise.

If you’re exploring how to modernize your telemetry or observability stack, I’d be glad to share more.

Let’s build data infrastructure that works smarter — not just harder. Think Strategic.

- Mateus
