Addressing the limitations of traditional BI tools for complex analyses

TL;DR: BI tools are great for standardized reports, but they're holding teams back when it comes to the messy, complex analysis that actually drives business decisions. While BI has its place, data teams need a flexible toolkit that includes code-first approaches (SQL/Python), collaborative tools, and AI-assisted analytics. This shift from rigid BI-only environments to a balanced multi-tool approach helps transform data teams from report factories into strategic partners who can tackle any analytical challenge thrown their way.

Back when I needed data as a product manager, I typically had three choices:

  1. Open up a BI dashboard
  2. Open up a SaaS product analytics solution like Posthog, Amplitude, or Mixpanel
  3. Run a SQL query, export the data, and dump it into a spreadsheet or Jupyter notebook

Time and time again I faced this decision (despite the BI team’s best efforts to provide self-serve data) and realized there was an obvious pattern: BI was great if I needed a report on something I was sure had an answer, but if my use case or question was a bit fuzzy (or super unique to my own work) I was left wrestling with SQL. 

For example, if I was looking for our enterprise net dollar retention over time, I could almost guarantee I could find those numbers in the latest board report (assuming I had permission). On the other hand, if I was conducting an ad hoc analysis where I needed to figure out whether there was an issue with our integrations and whether the team should prioritize a fix, it was trickier. I might need to answer a super specific question like “How many customers have we onboarded in the last 30 days that already have an expired Salesforce token?”

This example, and countless others that data and non-data teams alike run into every day, illustrates a broader point: BI has grown rapidly in the US and globally in recent years, and it’s forecasted to keep growing at a 10% compound annual growth rate. And yet it is not a complete solution for making full use of the massive amounts of critical data we continue to accumulate.

Below, I’ll talk through a few of the limitations of “traditional” BI as we know it and some of the ways modern teams can overcome them to better meet their needs.

Traditional BI tools: Their strengths and intended use cases

BI tools are perfectly designed for one thing: Providing reports and dashboards that are extremely tightly defined, leaving little room for error or interpretation. Generally, this is a great thing. It means that, as a data team, you can define a metric in one specific way and ensure your stakeholders are viewing the right number, no matter how they slice it. On the flip side, as a business stakeholder, you can use those numbers with confidence (at least, that’s the theory. Things are a bit messier in practice).

BI tools can offer these guardrails thanks to their semantic layer. Whether or not it’s labeled as such, every BI tool has some flavor of one. For example, Looker has LookML as its semantic layer. Other modern BI alternatives either have their own version or integrate with third-party semantic layers such as dbt and Cube. This semantic layer is a way to codify measures and dimensions in a supervised manner.

Let’s revisit the Superdope company that we’ve explored in past posts and say we want to define ARR in Looker. It might look something like:

view: superdope_subscriptions {
  sql_table_name: superdope_db.subscriptions ;;
  
  dimension: subscription_id {
    primary_key: yes
    type: string
    sql: ${TABLE}.subscription_id ;;
  }
  
  dimension: mrr {
    label: "Monthly Recurring Revenue"
    type: number
    value_format_name: usd
    sql: ${TABLE}.mrr ;;
  }
  
  dimension: contract_end_date {
    type: date
    sql: ${TABLE}.contract_end_date ;;
  }
  
  measure: arr {
    label: "Annual Recurring Revenue"
    type: sum
    # Annualize MRR, but only for contracts that are still active
    sql: CASE
           WHEN ${contract_end_date} > CURRENT_DATE THEN ${mrr} * 12
           ELSE 0
         END ;;
    value_format_name: usd
  }
}

Once defined in this format, a label for “Annual Recurring Revenue” will appear in the Looker user interface, which non-technical users can then drag and drop to slice in different ways. This is fantastic, provided the measures are well defined.

At this stage, it’s also worth calling out the BI paradox: The more you try to define every metric in BI, the more surface area you have to maintain to prevent definition drift. This also creates more opportunities for ad hoc questions, because the numbers sometimes don’t seem right to end users.

If you have a single metric in BI, you can put a lot of effort into ensuring that metric is correct. If you have 1,000 metrics, it becomes much harder. Even if drift doesn’t happen, or you catch it all with solid data engineering best practices, keeping track of every metric as your business evolves will only get more complicated.

The complexity gap: Where traditional BI falls short

As we’ve seen in the example above, maintaining these measure and dimension configurations requires fairly deep technical expertise and time. And the rigor that gives the semantic layer its power is precisely what makes traditional BI fall short when it comes to more complex, exploratory data analysis.

There are three main reasons that this is the case: 

1. Limited data manipulation capabilities

Traditional BI tools often lack the flexibility to handle ad hoc, complex queries with data from multiple sources. This forces analysts to use time-consuming, manual processes when gathering and manipulating their data.

Let’s revisit the PM example I talked about before. I want to find out how many Superdope customers we’ve onboarded in the last 30 days and already have an expired Salesforce token. There’s a slim chance someone on the data team defined those metrics and made them available.

Chances are, our workflow looks like the following:

  • Go to Salesforce, Hubspot, or some other source to find a list of customers and their onboarded date. Export the data.
  • Run a few SQL queries against your read-only production database replica to find a list of organizations with expired Salesforce tokens. Export the data.
  • Import the data into a spreadsheet or Jupyter notebook to blend the data and conduct the analysis.
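To make this concrete, the final blending step might look something like the following Python sketch. (The file names and columns here are hypothetical stand-ins for whatever your exports actually contain.)

import pandas as pd
from datetime import datetime, timedelta

# Hypothetical exports from the two manual steps above
customers = pd.read_csv("crm_customers.csv", parse_dates=["onboarded_date"])
expired_tokens = pd.read_csv("expired_sf_tokens.csv")  # one org_id per row

# Keep only customers onboarded in the last 30 days
cutoff = datetime.utcnow() - timedelta(days=30)
recent = customers[customers["onboarded_date"] >= cutoff]

# Blend: recent customers whose org appears in the expired-token export
affected = recent.merge(expired_tokens, on="org_id", how="inner")
print(f"{len(affected)} recently onboarded customers have an expired Salesforce token")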

This convoluted workflow exemplifies the limitations of traditional BI tools when it comes to complex, multi-source analyses. 

Instead of providing a streamlined process within a single environment, it forces you to juggle multiple tools, export and import data repeatedly, and manually blend information from disparate sources. Not only is this time-consuming and error-prone, but it also hinders the agility and speed of your decision making. 

2. Inflexibility in visualization options

Traditional BI tools often fall short when it comes to creating custom or complex visualizations, limiting your ability to effectively communicate insights and tell compelling data stories.

Charting in BI works if the data is clean and you’re using a standard chart. If your data is a little bit messy, or you want to get just a little creative with your visualizations, it can get complicated quickly. Even creating simple conditional formatting in tables can be tedious in traditional BI solutions.

This rigidity in visualization options is a big blocker when it comes to effective data communication. When analysts are constrained to basic charts or struggle with even simple customizations like conditional formatting, they lose the ability to tailor visual storytelling to their specific audience and message. 

After all, the power of data lies not just in its analysis, but in how it's presented and understood by stakeholders. By forcing data professionals to work within narrow visual constraints or resort to external tools for more advanced visualizations, traditional BI solutions create unnecessary barriers to impactful data-driven decision making. 

Flexible, code-based visualization options, like the Python Sankey diagram sketched below, offer a level of customization and creativity that’s key to bringing data insights to life.
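As a rough illustration, here’s a minimal Sankey diagram built with Plotly, one of several Python charting libraries that support this chart type. The funnel stages and values are purely illustrative.

import plotly.graph_objects as go

# Hypothetical onboarding funnel: where do new signups end up?
labels = ["Signed up", "Onboarded", "Churned", "Active", "Expired SF token"]
fig = go.Figure(go.Sankey(
    node=dict(label=labels, pad=20, thickness=15),
    link=dict(
        source=[0, 0, 1, 1],  # Signed up -> Onboarded/Churned, Onboarded -> Active/Expired
        target=[1, 2, 3, 4],
        value=[800, 200, 650, 150],
    ),
))
fig.update_layout(title_text="Customer onboarding flow (illustrative data)")
fig.show()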

3. Difficulties in incorporating advanced statistical methods and machine learning

Not all analyses are as simple as a SQL query or pivot table, which is where traditional BI thrives. Certain analyses, often the most important and strategic ones, require more advanced methods such as machine learning and statistical models. These sophisticated techniques are typically beyond the scope of what traditional BI tools can handle natively.

For example, let's say our friends at Superdope want to study their customers to identify distinct profiles to help them craft better marketing campaigns. This type of analysis might require a clustering algorithm such as K-Means, which relies on specialized machine learning packages. Traditional BI tools generally don't support such advanced algorithms out of the box, nor do they provide the flexibility to integrate custom code for these purposes.
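In Python, by contrast, this is a few lines of scikit-learn. A minimal sketch, assuming a hypothetical customer export with made-up feature columns:

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer features pulled via SQL into a CSV
df = pd.read_csv("superdope_customers.csv")
features = ["mrr", "seats", "logins_per_week", "support_tickets"]

# K-Means is sensitive to scale, so standardize the features first
X = StandardScaler().fit_transform(df[features])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
df["segment"] = kmeans.fit_predict(X)

# Inspect the average profile of each segment
print(df.groupby("segment")[features].mean())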

Other examples of advanced analyses that traditional BI struggles with include:

  • Predictive modeling for forecasting future trends
  • Natural language processing for analyzing customer feedback
  • Time series analysis for detecting anomalies in system performance
  • Deep learning for image recognition or complex pattern detection
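To pick just one of these: a first-pass anomaly detection over a system metric can be as simple as a rolling z-score in pandas. (The file, column, and threshold here are hypothetical.)

import pandas as pd

# Hypothetical daily p95 latency export
ts = pd.read_csv("system_latency.csv", parse_dates=["day"], index_col="day")["p95_ms"]

# Flag days that deviate more than 3 sigma from the trailing 30-day behavior
rolling_mean = ts.rolling(30).mean()
rolling_std = ts.rolling(30).std()
z_score = (ts - rolling_mean) / rolling_std
print(ts[z_score.abs() > 3])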

The inability to seamlessly incorporate these advanced analytical methods within traditional BI environments often forces data folks to switch between multiple tools and platforms, breaking the flow of analysis and potentially introducing errors in the process. This not only slows time to insight but also limits the depth of what we can learn from the data.

As businesses increasingly rely on sophisticated data techniques to gain a competitive edge, the constraints of traditional BI in this area will feel tighter and tighter. 

The impact of BI limitations on data teams and decision-making

The limits of traditional BI tools have far-reaching consequences within modern companies. 

Teams that attempt to conduct all their data analysis within the confines of traditional BI tools often find themselves restricted to easily definable metrics and standardized reports. This limitation tends to pigeon-hole data teams into becoming mere report builders rather than strategic partners to the business.

This causes more complex, nuanced questions to be deprioritized or overlooked entirely, which is bad news both for the data team’s ability to provide comprehensive insights and for whether they’re viewed as a strategic advisor within the organization.

The reality is that most data teams recognize the need to answer these more complex questions. However, without the right tools at their disposal, they're forced to resort to time-consuming workarounds, such as: 

  1. Re-writing complex SQL queries for each new analysis
  2. Exporting data from multiple sources
  3. Manipulating data in spreadsheets or Jupyter notebooks
  4. Creating ad-hoc visualizations outside of the BI platform

All of the above workarounds produce reports that are neither reproducible nor traceable, foster poor collaboration and miscommunication between teams, and, worst of all, lead to missed opportunities to drive revenue and growth with data.

Modern solutions for complex data analyses

Thankfully, where traditional BI tools fall short on complex analysis, there are a handful of modern options that offer the flexibility, power, and collaboration capabilities to handle more advanced use cases.

Code-first approaches (SQL, Python)

Complex, ad hoc analyses require advanced and flexible tools. SQL and Python in tandem form a powerful combination: SQL to query data and perform basic manipulation, and Python for more advanced statistical analysis, machine learning, and visualization.
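A minimal sketch of the combination, assuming a hypothetical read-only Postgres replica and table names:

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical read-only replica connection
engine = create_engine("postgresql://readonly@replica.internal/superdope")

# SQL does the heavy lifting of filtering and joining at the source
df = pd.read_sql(
    """
    SELECT o.org_id, o.onboarded_at, t.expired_at
    FROM organizations o
    JOIN salesforce_tokens t USING (org_id)
    WHERE o.onboarded_at >= CURRENT_DATE - INTERVAL '30 days'
      AND t.expired_at < CURRENT_DATE
    """,
    engine,
)

# Python takes over for anything SQL can't express cleanly
print(df.describe())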

A code-first approach helps you use proper versioning and follow engineering best practices, which, in turn, ensures your work is traceable and fully reproducible by your future self and your coworkers (more on this when we discuss collaboration below).

The versatility of these languages empowers data teams to explore data deeply, apply sophisticated algorithms, and generate insights that would be impossible within the constraints of traditional BI tools.

Notebook environments

Jupyter notebooks have long been popular in the data community for their step-by-step approach to analysis. They offer a simple way to build an analysis incrementally using code, and they can help tell the story behind the analysis and the data. This format is particularly well suited for ad hoc and exploratory data analysis: even if you aren’t sure where the data may lead you, you can break the problem down into individual steps until you reach a conclusion.

However, traditional notebooks have limitations: variable state management can hinder result reproduction, and sharing/collaboration can be challenging. Modern notebook environments address these issues with improved version control and collaborative features, enhancing their utility for team-based data analysis while maintaining their strength in exploratory work.

Collaborative data analysis tools

Complex data questions require richer collaboration. Features like row highlighting and tagging coworkers in comments, familiar from tools like Google Sheets, are crucial for focused discussion on specific data points. This keeps teams working in a unified environment, avoiding context loss through screenshots. 

Beyond comments, true collaboration involves technical team members working simultaneously on analyses. While traditional BI lacks version control, code-first approaches enable tandem work without conflicts. This method allows team members to revisit and understand peers' analyses step-by-step, fostering knowledge sharing and ensuring analytical continuity within data teams.

AI-assisted analytics

We can’t talk about modern data analysis solutions without talking about AI. An AI solution configured specifically for your data and with a dedicated purpose of analyzing data can instantly 10X a data analyst’s or data scientist's productivity. AI can help with everything from ideating around a hypothesis to exploring data and schemas to summarizing findings to generating or optimizing code.

A well-tuned AI can automate data insights and help you extract value from your data in a way that simply isn’t possible in BI due to its purposely-designed rigidity.

These modern solutions (code-first approaches, advanced notebook environments, collaborative platforms, and AI-assisted analytics) form a powerful toolkit that helps data teams break free from the constraints of traditional BI so they can dive deeper into insights and support more meaningful business decisions.

Balancing BI and advanced analytics tools in your data stack

BI platforms are essential for any data team nowadays. Even (read: especially) in the age of AI, they give data teams a way to put trustworthy reports in the hands of stakeholders. 

However, no matter how much you’ve invested in BI, you’ll always face ad hoc data analysis questions. In most cases, these ad hoc questions are important asks that require one-off reports and advanced data analysis capabilities.

Building a strategic data team that can impact the growth trajectory of the business requires more than building data models and layering a BI tool on top. You need to balance a variety of tools to provide you and your stakeholders with the right capabilities for the right job. To adapt quickly to changing priorities, we have to make sure we’re never holding up a decision due to a lack of data or, worse, making critical decisions based on gut instinct alone.

Embracing a multi-tool approach for comprehensive data analysis

We’ve long been promised (and hoped for) a unified platform that would solve all our data needs. However, as the data needs of modern companies have grown more and more complicated, it’s clear that a multi-tool approach offers companies the flexibility, advanced capabilities, and interoperability necessary to respond to those needs.

The key is not to have one platform to rule them all. Rather, data teams should focus on finding the best tool for each specific job they need done and ensuring those tools play well within their existing stack. 

When done well, this approach can shift a data team from dashboard builder to strategic partner in the business, one who can have a meaningful impact on the revenue and growth trajectory of the company.

If you’re ready to rethink how you handle complex and ad hoc analysis, Fabi.ai offers all the modern solutions you need, from code-first flexibility to a collaborative notebook and reporting experience to AI-assisted analytics, all in one integrated platform.

Getting set up with a free trial takes less than five minutes.
