Introducing Analyst Agent: Deploy custom AI agents for self-service analytics in minutes

TL;DR: Analyst Agent helps data teams make self-serve data analytics a real, trustworthy possibility thanks to specialized AI data agents. Business users can explore data independently with AI assistance, while data teams maintain full control over data quality and access.

In early 2023, Fabi.ai’s now-CTO, Lei, called me to get my feedback on an idea. 

With the recent release of ChatGPT, what if we made a self-service BI solution that could answer all of the business’ questions? 

In our careers, we had both acutely felt the pain of frustrating data analysis workflows, from both sides. Lei had led data teams and dealt with constant requests from the business during his time at (fill in). And I knew, from my time as a product manager at (fill in), how terrible it felt to be the data consumer constantly asking questions. 

The promise of an AI data tool that could answer any question from folks like me and spare data experts like Lei the manual exploratory work sounded magical. However, after much experimentation and iteration, we've concluded that a monolithic AI connected to a data source (with or without a semantic layer) simply isn't the answer. Instead, we've found a new unlock: specialized AI data agents that can be configured, managed, and deployed by data teams.

Today, we're excited to announce Fabi.ai Analyst Agent, which helps data and business experts alike experience this magic themselves in a novel way. With Analyst Agent, you can:

  • Deploy specialized agents in minutes: Connect to any data source, merge multiple sources in memory, and launch customized AI agents quickly - all while maintaining full control over data quality and access on an individual agent level.
  • Run advanced analytics without code: Leverage Python-powered AI for complex analysis like regression models and propensity scoring, going far beyond simple data filtering and basic SQL queries.
  • Control exactly what data users can access: Deploy AI agents that only work with curated datasets you've approved, ensuring nearly 100% accuracy and maintaining confidence in every response.

Check out our short introduction video to see it in action:

Why data and business experts alike need dedicated AI agents 

Despite the text-to-SQL prowess of LLMs and the improvements of the last year, AI data tools often fall short of their marketing promises and create more headaches than they eliminate. This comes down to three core issues: 

  1. AI requires a pristine semantic layer to answer any question. For AI to be trusted as the first line of defense for business questions, it must answer most of them reliably. That level of reliability demands a pristine, up-to-date semantic layer, and maintaining one is a Sisyphean task as businesses and their data models constantly evolve.
  2. The analytics team’s reputation is on the line. Even with a semantic layer that's 90% accurate, a single wrong answer to a CFO's urgent question could damage the data team's credibility. Like self-driving cars, the question of accountability when things go wrong makes widespread adoption challenging.
  3. Business users usually don’t really know what they’re asking for. Business users often approach the data team with surface-level questions that mask deeper needs, or ask misaligned questions due to unfamiliarity with the data model (through no fault of their own). A great data team's value lies not just in writing SQL queries, but in understanding the business context and providing truly helpful insights.

Faced with these challenges in the current ecosystem, we set out to build something that could solve the collaboration challenge between data teams and business teams: help the data team focus on big questions, spending less time building dashboards and moving widgets, while giving the business team more room to explore the data. 

This is where Analyst Agent came into play. 

Introducing Analyst Agent: A new way to collaborate around data

Typically, reporting requests that the data team deals with fall into two categories:

  1. Adjustments to existing dashboards: This might be a request like, “Can I see the data by X?” Or it could be an export for users to analyze their data in a familiar setting.
  2. Bigger, more nuanced exploratory questions: Once you've agreed on the question being asked and what data to pull, the answer typically gets shared back as a rigid report, which leads you right back to #1.

We saw a solution to both of these problems: dedicated AI agents trained on specific datasets curated by a data expert. Data teams curate the datasets, then use SQL, Python, and AI to clean and wrangle the data before deploying these specialized agents in minutes.

Today we’re introducing an entirely new way to build and deploy specialized data agents to enable business teams to explore data on their own while letting the data team stay in full control.

For non-technical team members, Fabi.ai Analyst Agent lets them feel confident exploring datasets on their own with the help of AI blessed by their data team. 

For data teams, Analyst Agent means no more plugging AI into a messy data warehouse with a partially configured semantic layer and hoping for the best. Instead, data experts can build targeted datasets with the confidence that the AI is only going to use that dataset they’ve cleaned and curated for a specific use case. 

Here’s how: 

  • Tight guardrails: Data teams create agents that only work with specific, curated datasets and inform users when questions are out of scope. This focused approach gives Analyst Agent nearly 100% accuracy in its responses and builds confidence for both data teams and users.
  • Python-first analysis: Unlike basic AI data tools and text-to-SQL solutions, Analyst Agent leverages Python packages for deeper analysis like regression models and propensity scoring. The AI auto-installs and manages packages, enabling complex analysis without code.
  • Built-in collaboration: Thanks to purpose-built infrastructure, every user gets their own AI agent within shared reports, accessing live data without affecting others. Users can filter and explore curated datasets independently while their AI agent stays in sync with the latest changes.
  • Universal data connectivity: Connect to any data source and join multiple sources in memory - from spreadsheets to Snowflake tables. Query, merge, and deploy agents with just a few clicks.
  • Stress-free data cleaning and curation: Data practitioners can query, clean, and configure data in minutes from any source in Fabi.ai Smartbooks using their favorite tools: SQL, Python, AI, and no-code.
  • Easy testing: Gain confidence that your AI agent will represent your team well by testing it before deployment with configurable datasets and probing its responses. Analyst Agent lets you quickly verify the accuracy of its answers. We also plan to surface the questions users ask to the agent's builder, to help iterate on and improve these agents.
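The scoping idea in the first bullet can be sketched with a toy example. To be clear, this is purely illustrative and not how Analyst Agent is implemented: a hypothetical agent is limited to a curated accounts dataset and declines anything else.

```python
# Fields from a hypothetical curated dataset the agent is allowed to use.
CURATED_COLUMNS = {"account", "arr", "segment", "seats"}

def in_scope(question: str) -> bool:
    # Toy relevance check: the question must mention at least one curated field.
    return any(col in question.lower() for col in CURATED_COLUMNS)

def answer(question: str) -> str:
    if not in_scope(question):
        return "Out of scope: this agent only covers the curated accounts dataset."
    return "Running analysis on the curated dataset..."

print(answer("What is total ARR by segment?"))            # in scope
print(answer("What was our website traffic last week?"))  # out of scope
```

A real implementation would use the model itself to judge scope, but the contract is the same: answer from the curated data, or say the question is out of bounds.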

Getting started: Build better data analysis workflows with Analyst Agent

You can build and share your first specialized agent with Analyst Agent in 30 minutes or less.

Here’s how:

Step 1: Create a Fabi.ai account and connect or upload your data

Log in to Fabi.ai and create your account.

Fabi.ai will prompt you to connect your data source or upload a file. We support all common data sources, or you can simply upload a CSV or Excel file.

Connect all your data sources to Fabi.ai

ℹ️ Note: Security and privacy are built into Fabi.ai. We're SOC 2 compliant, and you can review our policies and security stance here.

Step 2: Prepare your dataset

Once you’ve connected your data, you can use SQL, Python, and AI to wrangle your dataset. This is your opportunity to prepare and clean your dataset to make it AI-ready, no matter how messy the original dataset is.

Use SQL, Python and AI to prepare your data for Analyst Agent

ℹ️ Tip: Before diving in, consider writing down the types of questions you want this AI agent to answer. The more aggregated the data, the fewer questions the agent can handle, but too much granularity can make the dataset too large and hurt latency. For example, an agent meant only to answer questions like “What’s my total ARR for enterprise accounts?” could work from pre-aggregated totals, but one that also needs to answer “Which mid-market customers are close to enterprise?” should instead list each account with its ARR and number of seats.
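To make that granularity tradeoff concrete, here's a hypothetical pandas sketch of this step (the column names and values are invented for illustration):

```python
import pandas as pd

# Hypothetical raw account data; in practice this would come from a SQL cell
# or an uploaded file inside a Smartbook.
raw = pd.DataFrame({
    "account": ["Acme", "Globex", "Initech", "Umbrella"],
    "segment": ["enterprise", "mid-market", "enterprise", "mid-market"],
    "arr": [120_000, 45_000, 250_000, 80_000],
    "seats": [300, 40, 800, 95],
})

# Account-level granularity: one row per account can answer both
# "total ARR for enterprise" and "which mid-market accounts are close to enterprise?"
curated = raw.sort_values("arr", ascending=False).reset_index(drop=True)

# A pre-aggregated view is smaller, but can only answer the coarser question.
arr_by_segment = curated.groupby("segment")["arr"].sum()
print(arr_by_segment["enterprise"])  # 370000
```

Keeping one clean row per entity, plus the handful of columns your target questions need, is usually the sweet spot between breadth and latency.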

Step 3: Configure your AI agent and publish your report

Now that you have your datasets ready to go, it’s time to publish your agent. Simply navigate to the Report Builder, and in the right-hand configuration panel, look for “AI Agent Configuration.” Search for the Python DataFrame that you’ve curated and give it a name and description.

This description is helpful both for the AI and for your stakeholders, who will see it among the listed artifacts in the report's AI interface.

Configure and label your datasets and artifacts for Analyst Agent

At this stage, you can also adjust the report layout. If you've added text descriptors, filters, or charts, you can design this report to look just the way you want. Once you're ready to share your work with the world, click “Publish.” This will launch a report with Analyst Agent embedded.

And that's it! Now you, or anyone you've shared this report with, can see which datasets (artifacts) the AI agent has access to and start asking questions.

Analyst Agent can answer questions directly in Smart Reports.

Under the hood: The magic of our AI data tools

We’ll cover how we built Analyst Agent in full detail in a future post, but here’s a quick rundown of how it actually works under the hood. 👀

Autonomous AI agent

Fabi.ai customers have used our AI for a while in enterprise settings, so we’ve had some time to observe real user behavior and gather feedback in the wild. 

We learned a few things thanks to these real-world use cases:

  • The AI needs the right amount of context. Retrieval-augmented generation (RAG) is strong, and we still use it, but finding the right balance was tough. With too little context, the AI couldn't answer the question or, even worse, would hallucinate. With too much, it got confused and ignored parts of the prompt.
  • User questions vary widely. One user might request the AI to create account scores with a regression model. Another user could ask for a revenue forecast using a time series model. How should we handle this when the Python packages needed to answer these questions are drastically different?
  • Conversations change over time. Users usually begin with a question, refine it, and then move on. The AI needs to track this with some sort of memory system. Simply feeding the AI the entire conversation history could overload the context and reduce accuracy (see the first point). Coupling RAG with a selective memory retrieval tool seemed like a promising idea.
  • One-shot code generation doesn't work. How many times have you asked AI to write code, only to hit an error when you run it? It's easy to get stuck in a cycle of asking AI to fix the error, rerunning the code, and rinsing and repeating. Instead, we wanted the AI itself to decide whether to dry-run the code and verify that it works before giving you a response.
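The dry-run idea in the last bullet can be sketched as a generate-execute-repair loop. Here `generate_code` is a stand-in for an LLM call, and the whole thing is a toy, not our production agent:

```python
import traceback

def generate_code(question, error=None):
    # Stand-in for an LLM call; purely illustrative. On retry it "fixes"
    # the missing import that broke the first draft.
    if error is None:
        return "result = mean([1, 2, 3])"  # buggy first draft: mean is undefined
    return "from statistics import mean\nresult = mean([1, 2, 3])"

def answer_with_dry_run(question, max_attempts=3):
    error = None
    for _ in range(max_attempts):
        code = generate_code(question, error)
        scope = {}
        try:
            exec(code, scope)            # dry-run the draft before responding
            return scope["result"]       # only surface output from code that ran
        except Exception:
            error = traceback.format_exc()  # feed the error back to the "model"
    raise RuntimeError("could not produce runnable code")

print(answer_with_dry_run("average of 1, 2, 3"))  # prints 2
```

The key design choice: the user only ever sees the output of code that actually executed, instead of being drafted into the fix-rerun loop themselves.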

Shortly after AI agents entered the general discourse, we had an “Aha!” moment. We were already exploring ways to make our AI more dynamic and truly independent - beyond just retrieving information through RAG. We needed an AI that could handle tasks like installing its own Python packages when needed. This led us to completely rebuild our architecture from the ground up around AI. 

This approach helped us unlock a vision of AI that was more dynamic and could act autonomously to do things like find, install, and manage required packages to reduce friction. 

Here's what this looks like in practice: First, our AI agent drafts a plan. Then it can invoke any number of tools to get the job done. For example, our agents can decide on their own whether a question needs more historical context from the conversation, making the AI's memory more dynamic. As the conversation evolves, the AI can choose to rewind and look for clues about what you're asking from 10 questions ago.
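That plan-then-tools flow, with selective memory retrieval, can be sketched like this. The relevance filter here is a toy keyword match, not our actual retrieval, and all names are illustrative:

```python
def recall_context(question, history):
    # Toy relevance filter: keep only earlier turns that share a word with
    # the current question (real retrieval would use embeddings / RAG).
    words = set(question.lower().split())
    return [turn for turn in history if words & set(turn.lower().split())]

def run_agent(question, history):
    # Step 1: draft a plan (here, a fixed single-tool plan for illustration).
    plan = [recall_context]
    # Step 2: invoke the tools the plan calls for.
    context = []
    for tool in plan:
        context = tool(question, history)
    return context

history = [
    "show revenue by region",
    "what about churn",
    "filter revenue to 2024",
]
print(run_agent("plot revenue trend", history))
# → ['show revenue by region', 'filter revenue to 2024']
```

The point of the pattern: irrelevant turns (the churn question) never enter the context window, so a long conversation doesn't degrade accuracy.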

Python kernel gymnastics

The AI agent is only half the magic. We also had a few requirements that pushed us to design our infrastructure to be able to pull ready-to-go Python kernels off the shelf:

  • Business users should be able to interact with their own agents on shared reports.
  • AI agents should be ready instantly, as soon as a user navigates to a report.
  • Artifacts accessed by the agent should be up-to-date as users update them using filters in the report. For example, you can add a “Created date” filter to your accounts and set it to show the last 30 days by default, or further if needed. 
  • Reports and agents should be scalable and secure to safeguard the valuable data entrusted to us by our customers. 

Now, when a user goes to a report, we preload the report with the latest cached data using a Python kernel that we’ve already warmed up. Each user gets their own kernel, which is spun down when they’re no longer using it to help reduce costs.
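A minimal sketch of that warm-pool pattern, with a stand-in Kernel class instead of a real Python kernel process (all names here are illustrative, not Fabi.ai internals):

```python
from collections import deque

class Kernel:
    """Stand-in for a real kernel process; imagine slow startup happening here."""
    def __init__(self):
        self.data = None

POOL_SIZE = 2
pool = deque(Kernel() for _ in range(POOL_SIZE))  # pre-warmed kernels
active = {}  # user_id -> kernel currently serving that user

def checkout(user_id, cached_data):
    kernel = pool.popleft() if pool else Kernel()  # cold start only if pool is empty
    kernel.data = cached_data                      # preload the latest cached report data
    active[user_id] = kernel
    pool.append(Kernel())                          # top the pool back up
    return kernel

def release(user_id):
    active.pop(user_id, None)  # spin the kernel down when the user leaves

kernel = checkout("alice", {"accounts_df": "latest cached data"})
release("alice")
```

Because the expensive startup work happens before a user arrives, checking out a kernel is nearly instant, and per-user kernels keep one user's filters from clobbering another's.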

This kernel system means the AI must stay in sync with the report: the state of the core kernel powering the report has to be mirrored in each user's individual kernel.

We’ll dive into more of the details here in a future article.

The future of AI agents for data teams

So, where do we go from here? Our users have already sent us many requests that are shaping our vision. At a high level, we see this agent evolving beyond just answering data questions (even though that's already super cool!).

In the future, users will be able to send AI outputs as requests to the dashboard builder. By that same token, we want to continue giving builders the tools they need to supervise the AI and improve on what they build. This means providing insights into the types of questions users are asking and how AI is handling the questions.

Thinking bigger, we truly think AI agents will change the way we get our work done. Not only will the Fabi.ai agent be able to answer complex and subtle questions about specific business contexts, but they’ll be equipped to invoke other agents, along with a whole constellation of tools built by us and our users. We think getting an executive summary delivered to your team’s Slack channel every Monday should be as simple as asking: “Could you please send this summary to #marketing-team every Monday morning, pretty please?” (Just remember to be polite if you want our future AI overlords to be nice to you in turn 🤖).

Ready to get started with your AI data agent in less than 10 minutes? Sign up for free.
