A tutorial on building ML and data monitoring dashboards with Evidently and Streamlit

Last updated: April 15, 2025

Published: March 22, 2023

Are you looking for an open-source tool to build an ML monitoring dashboard from scratch? Or have you already started using Evidently Reports for ML monitoring and are now looking for a way to turn them into a web app?

One option is to use Evidently together with Streamlit, an open-source Python tool for creating shareable web apps. In this tutorial, you will go through an example of creating and customizing an ML monitoring dashboard using the two open-source libraries.

Code example: if you prefer to head straight to the code, open this example folder on GitHub.

⚠️ Disclaimer:
This example uses the Evidently API as available in version 0.6.7 or lower. Please ensure you are using the correct version when running this example. For updated and new examples, visit our documentation.

Background

If you are new to Evidently, Streamlit, or the idea of ML monitoring and observability, here is a quick recap.

When do you need an ML monitoring dashboard?

The ultimate objective of an ML project is to deploy a valuable model that benefits its end users. However, even if an ML model is exceptional after training, things might go wrong in production. You might face data drift, concept drift or data quality issues that would lead to ML model quality decay, affecting user experience and business outcomes.

To address this, you need to monitor your production model performance, keeping tabs on the input data and model outputs. One approach is to run periodic checks on the model and data quality. Depending on the model usage scenario and ground truth delay, you can run them, say, daily, weekly, or monthly.

You can then create an ML performance dashboard as a web application displaying key metrics and supporting visuals. This dashboard helps to communicate the data and model quality to model creators and other stakeholders and expose and explore any issues.

Evidently

Evidently is an open-source Python library that helps evaluate, test, and monitor ML models from validation to production. Under the hood, it has over a hundred metrics and tests that assess various aspects of the data and model quality.

Evidently has several different components: Reports, Test Suites, and the Monitoring UI. In this tutorial, you will use Evidently Reports. A visual Report combines multiple metrics and generates rich interactive visualizations.

You can call and configure one of the default Reports, for example, using Data Drift or Data Quality presets (see the documentation). Or, you can create a Report from scratch by combining individual metrics.

Streamlit

Streamlit is an open-source Python library for building interactive web apps for data science and machine learning. It lets you create web-based user interfaces with minimal code.

Streamlit includes pre-built components one can easily add to a web app. They include charts, sliders, text inputs, and other widgets necessary to create interactive user interfaces.

Streamlit supports deployment to various platforms, including local servers, cloud services, and containerization platforms like Docker. This makes it easy to share web apps with others.

How the integration works

The idea behind this example is to “embed” the Evidently Report as part of a Streamlit web app.

Evidently already provides all the necessary building blocks, metrics, and visuals to design the components of the ML monitoring and data quality dashboard. For example, you can easily create an interactive widget that detects and visualizes data distribution drift:

You can explore the Evidently Reports directly in Jupyter notebook or Colab and export them as standalone HTML files. The HTML file is already shareable but is not always convenient. For example, organizing and navigating multiple reports for past periods might take a lot of work.
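As a minimal sketch of the export step, the snippet below builds a Data Drift Report and writes it to a standalone HTML file. It assumes the legacy Evidently API (version 0.6.7 or lower, as noted in the disclaimer) and two pandas DataFrames with the same columns. The function name `export_drift_report` and its arguments are hypothetical and do not come from the example repository.

```python
from pathlib import Path


def export_drift_report(reference, current, output_path):
    """Build a Data Drift Report and save it as a standalone HTML file.

    `reference` and `current` are assumed to be pandas DataFrames
    that share the same columns.
    """
    # Legacy Evidently API (version 0.6.7 or lower). Imported inside the
    # function so the module loads even where Evidently is not installed.
    from evidently.metric_preset import DataDriftPreset
    from evidently.report import Report

    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)

    # Run the preset on the two datasets and export the result.
    report = Report(metrics=[DataDriftPreset()])
    report.run(reference_data=reference, current_data=current)
    report.save_html(str(output_path))
    return output_path
```

The returned path can then be placed wherever the web app expects to find its reports.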

In this case, you can use Streamlit to create a web application that will add a user interface to organize and query multiple Evidently Reports. This is exactly what we are going to do!

Tutorial scope

In this tutorial, you will learn how to create a minimalistic ML monitoring dashboard using Evidently and Streamlit. You will track and visualize data quality, data drift, and ML model performance for different periods.

You can also take this example as an inspiration and further customize the Report contents (using Evidently) and the web application interface (using Streamlit).

Pre-requisites

You have basic knowledge of Python.
You went through the Evidently Get Started Tutorial and can generate visual reports.

All code examples are based on Python version 3.8 or above. Note that Python 3.9.7 is an exception since at the moment of writing Streamlit does not work correctly with this version.

Note: we tested this example on macOS/Linux.

Tutorial structure

The tutorial is structured into two parts.

First, you will launch and interact with a pre-built ML monitoring app. It uses toy data. You can explore this example to understand the integration principle and how the components connect.

This should be quick and painless. If you are familiar with both tools, the example can be self-explanatory and inspire you to create a custom app!

Second, you will follow the steps to adapt the app to your use case. You will learn how to generate and add new Reports to the application. You can use these instructions to make it work for your model and dataset.

This second part is optional. You can bookmark and return to it whenever you have a specific dataset you are working on.

1. Pre-built ML monitoring application

This section guides you through the steps to launch the pre-built Monitoring Dashboard app and explains its components.

Launch the example

We pre-built the example using the bike demand forecasting dataset. We created a toy ML model and populated the app with data on its performance.

To launch the example, head to the GitHub folder and follow the instructions in the readme. You will need to:

clone the repository
set up a Python virtual environment and install the required libraries.

After installing the dependencies, you can run the app! You will launch a local Streamlit server and open the application in your default browser.

You can immediately see the Reports for the Bike Sharing project in the app interface:

You can also navigate between the different Reports. To do that, select the project, the monitoring period, and the report's name. You can also change the app color scheme in the Settings menu.

How does the app work?

This Monitoring Dashboard is a Streamlit application.

The app UI consists of three main blocks: a sidebar menu, a header, and a report. Each block is implemented using the corresponding Streamlit component:

The Sidebar menu is based on the st.sidebar widget and includes several st.selectbox input widgets.
The Header uses the st.header text element to display text in header formatting.
The Report block is an HTML component that embeds Evidently HTML reports inside the Streamlit app.
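The three blocks can be sketched in a few lines of Streamlit. This is an illustration only, not the code from app.py: the names `read_report_html` and `render_app`, and the `report_paths` mapping from a display name to an HTML file path, are assumptions made for the example.

```python
from pathlib import Path


def read_report_html(report_path):
    """Return the contents of an Evidently HTML report as a string."""
    return Path(report_path).read_text(encoding="utf-8")


def render_app(report_paths):
    """Render a sidebar selector, a header, and the selected report.

    `report_paths` is a hypothetical dict that maps a display name
    to the path of an HTML report.
    """
    # Imported lazily so read_report_html can be reused without Streamlit.
    import streamlit as st
    import streamlit.components.v1 as components

    # Sidebar menu with a select box.
    report_name = st.sidebar.selectbox("Select report", list(report_paths))

    # Header block.
    st.header(f"Report: {report_name}")

    # Report block: embed the saved HTML inside the app.
    html = read_report_html(report_paths[report_name])
    components.html(html, height=1000, scrolling=True)
```

To try it, call `render_app(...)` with your own mapping at the end of a script and start it with `streamlit run`. The fixed height of 1000 pixels is an arbitrary choice; long reports rely on the `scrolling=True` flag.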

The Monitoring Dashboard application has only three Python scripts. You will find them in the streamlit-app directory:

app.py file is the main application that implements the app workflow.
src/ui.py file contains functions for visual components: header, selectors, content blocks, etc.
src/utils.py file contains some Python utility functions for the app.

Generating reports

Now, let’s understand how the data gets into the app.

To make the demo example, we trained a simple regression model. We used a part of the toy dataset for training. Then, we generated the predictions for a later period in the data. This way, we simulated having model prediction logs with known labels.

Once we had the predictions and labels, we generated the Evidently Reports. In this case, we chose several reports on Data Drift, Data Quality, Target Drift, and Regression Performance (see the documentation).

We saved the HTML reports in a designated folder. The Streamlit application parses this folder to get the reports. This way, they appear in the app UI.

Tip: while this tutorial shows the Evidently Reports, you can do the same for the Evidently Test Suites. With Test Suites, you compare the metrics against a condition and get an explicit pass or fail result. The HTML output has a different look. Keep this option in mind when designing your monitoring dashboard! Both Reports and Test Suites are covered in the Getting Started tutorial.

To better understand the example, let’s add more Reports!

To do that, run the Jupyter notebook bicycle_demand_monitoring.ipynb. This notebook generates the predictions and Reports for the next week (you can adjust it!) and saves the HTML files in the destination folder. Once the HTML files are there, the Reports will appear in the app!

Let’s look in more detail at how we can organize the Reports.

Organizing reports and projects

The evidently-streamlit app generates the Monitoring Dashboard by parsing Evidently Reports in the projects/ directory.

There are two projects in the repository's main folder. Following the folder structure and naming convention, you may easily add a new project to the app.

We introduced this convention for the project directory to make the app extendable. It expects that each project contains a reports/ directory with HTML files inside.

projects/
	bike-sharing/
		reports/
	your-project/
		reports/
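To illustrate the convention, here is a small sketch that lists every folder under projects/ that contains a reports/ directory. The helper name `list_projects` is hypothetical; the app's own parsing logic lives in src/utils.py and may differ.

```python
from pathlib import Path


def list_projects(projects_root):
    """Return names of sub-directories that contain a reports/ folder."""
    root = Path(projects_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "reports").is_dir()
    )
```

Folders without a reports/ directory are skipped, so stray files or notes next to the projects do not show up in a project selector.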

Inside the reports/ directory, you may organize Reports in different ways, for example:

by dates
by model names

Tip: you can technically use any Report name, and it will appear in the dropdown menu. Don’t forget to change the name of the corresponding navigation section in the app UI if you change the logic!

This tutorial example uses the monitoring period start and end dates as the folder names. For example, a directory named “2011-02-12_2011-02-18” contains the reports for the period from 12 February 2011 to 18 February 2011.

Following this approach, you may store the Evidently model monitoring Reports in different directories and select them from the app UI.
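A folder name in this format is easy to turn back into dates, for example to sort periods or validate a new folder before adding it. The sketch below uses only the standard library; `parse_period` is a hypothetical helper, not part of the app.

```python
from datetime import datetime


def parse_period(folder_name):
    """Split a 'YYYY-MM-DD_YYYY-MM-DD' folder name into start and end dates."""
    start_text, end_text = folder_name.split("_")
    start = datetime.strptime(start_text, "%Y-%m-%d").date()
    end = datetime.strptime(end_text, "%Y-%m-%d").date()
    if start > end:
        raise ValueError(f"Period starts after it ends: {folder_name}")
    return start, end


period = parse_period("2011-02-12_2011-02-18")
print(period[0].isoformat(), period[1].isoformat())  # 2011-02-12 2011-02-18
```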

You can also add hierarchy by adding sub-directories in the reports/ folder. In this case, Reports from the sub-directories appear in the Streamlit Tabs.

For example, if you add more Reports inside the “data_quality” and “model_performance” sub-folders, they will appear as tabs on top of the dashboard.
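One way to picture this hierarchy is as a mapping from each sub-directory to the HTML reports it holds. The sketch below builds that mapping with the standard library. The function name `collect_tabs` is an assumption for illustration; how the app itself turns these entries into tabs is defined in its own code.

```python
from pathlib import Path


def collect_tabs(period_dir):
    """Map each sub-directory name to the sorted HTML report names inside it."""
    period_dir = Path(period_dir)
    tabs = {}
    for sub_dir in sorted(p for p in period_dir.iterdir() if p.is_dir()):
        reports = sorted(f.name for f in sub_dir.glob("*.html"))
        if reports:  # skip sub-directories without reports
            tabs[sub_dir.name] = reports
    return tabs
```

In a Streamlit app, a list of labels taken from such a mapping could be passed to `st.tabs` to create one tab per entry.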

2. Create an ML monitoring dashboard

Now, let’s understand how to adapt this example to your data.

We assume you already went through part 1 of the tutorial and have the local copy of the Monitoring Dashboard app. You will continue working with it.

Note that this part of the tutorial describes the steps to add new data to the existing app. You need to prepare your dataset.

Tip: if you do not have a dataset but want to follow along, you can copy the folder with the Bike-Sharing project and give it a new name to treat it as a new project.

Add new reports to the project

Prepare the data. If you already have an ML model in production, you can simply use the latest prediction logs. Ground truth is optional. To generate the reports like Data Drift and Data Quality, it is enough to have the model input features. For the Model Quality reports, you need both predictions and actuals.

Let’s say we want to add a new Grocery Sales Prediction project! There is a nice Kaggle dataset we worked with to create the demo illustrations.

Generate the Evidently Reports. Next, you need to create the Reports on your data. You can generate Reports in a Jupyter notebook, run a Python script, or use a workflow manager like Airflow to schedule the Report generation process.

For simplicity, you can start with a manual process, generate the Reports locally, and then move them to the folder with a Streamlit app.

Tip: If you need a refresher on customizing the Evidently Reports, head to the documentation. You can start with presets and later customize the Reports by selecting specific metrics.

Once you generate the Reports for your data, save them as HTML files in the target folder following a naming convention. If you want to keep the existing UI that allows browsing the Reports by time range, continue naming the folders after the time period.
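As a small sketch of the naming convention, the helper below builds the target path for a report from the period start and end dates. The project folder name `grocery-sales` and the report name `data_drift` are hypothetical examples, as is the helper `report_path`.

```python
from datetime import date
from pathlib import Path


def report_path(project_dir, start, end, report_name):
    """Return the path for a report inside a 'START_END' period folder."""
    period = f"{start.isoformat()}_{end.isoformat()}"
    return Path(project_dir) / "reports" / period / f"{report_name}.html"


path = report_path(
    "projects/grocery-sales",  # hypothetical project folder
    date(2016, 1, 19),
    date(2016, 1, 26),
    "data_drift",  # hypothetical report name
)
print(path.as_posix())
# projects/grocery-sales/reports/2016-01-19_2016-01-26/data_drift.html
```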

Move the Reports to the Evidently-Streamlit directory. For the app to run, you should place the newly-generated Reports directly where they are expected.

If you created the Reports in your repo, simply copy them to the target destination.

Here is the workflow to add the Reports to a new project inside the app directory:

Create a new project directory: evidently-streamlit/projects/PROJECT.
Copy HTML reports to the folder: evidently-streamlit/projects/PROJECT/reports.
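If you prefer to script these two steps, a sketch with the standard library could look like this. The function name `add_reports_to_project` and its arguments are assumptions; `source_dir` stands for the folder where you generated the period folders with HTML reports.

```python
import shutil
from pathlib import Path


def add_reports_to_project(source_dir, app_dir, project_name):
    """Copy a folder of reports into <app_dir>/projects/<project_name>/reports."""
    target = Path(app_dir) / "projects" / project_name / "reports"
    # copytree creates missing parent folders; dirs_exist_ok (Python 3.8+)
    # lets you add new periods to a project that already exists.
    shutil.copytree(source_dir, target, dirs_exist_ok=True)
    return target
```

Running it again with a new period folder in `source_dir` adds that period to the existing project without removing earlier ones.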

Here is what you should get:

Launch the app

Now, you can browse the newly added Reports from the Streamlit app UI.

You can navigate to the new Reports by selecting the project name and the monitoring period. Monitoring periods displayed in the Select period widget in the app follow the folder names inside the reports/ directory.

The Select report widget also uses parsed names of the Report files to allow switching between them.
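Because the widget labels come from file names, it helps to name the HTML files with the label in mind. As a rough illustration of such parsing (the app's actual rules may differ, and `report_label` is a hypothetical helper), a file name can be turned into a readable label like this:

```python
from pathlib import Path


def report_label(file_name):
    """Turn a file name such as 'data_drift.html' into a readable label."""
    stem = Path(file_name).stem
    return stem.replace("_", " ").replace("-", " ").capitalize()


print(report_label("data_drift.html"))  # Data drift
```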

Customize the UI

In many cases, keeping all HTML reports in a single folder for a specific period is OK.

However, you may also want to add an extra level to your reports folder structure.

Here are the two common scenarios:

Your reports are very long, and you want to split them to make navigation easier.
You want to group the reports by topic, model, etc.

The Monitoring Dashboard app supports such scenarios as well! As we’ve already seen, if you put additional folders inside the directory named after the time period, these folders will appear in the interface as Tabs.

To repeat this for your project, add the new Reports to sub-directories. In our example, we generated new Evidently Reports for a period from 2016-01-19 to 2016-01-26. We customized each Report to include a limited set of metrics to make it shorter.

We then put multiple data quality and model performance reports in the corresponding sub-directories named “data_quality” and “model_performance.”

Evidently - Streamlit app UI customization

The Monitoring Dashboard app automatically extracts the names of the HTML files inside the sub-directories and treats them as the names of Tabs. It’s easier to navigate between separate reports using Tabs than to scroll through long HTML reports.

In another scenario, we might generate separate Reports for different categories in our dataset.

The overall model performance reports show only aggregate quality. In our example dataset, we predicted sales for 16 unique products (item_nbr). We might want to explore how the model performs for each product.

Here is the code we used to generate separate Evidently Reports for every product in the dataset and store the HTML in a defined sub-directory.

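# Legacy Evidently API (version 0.6.7 or lower)
from evidently.metrics import (
    RegressionErrorDistribution,
    RegressionErrorNormality,
    RegressionPredictedVsActualPlot,
    RegressionPredictedVsActualScatter,
    RegressionQualityMetric,
)
from evidently.report import Report

# `reference`, `current`, `column_mapping`, and `product_report_dir`
# (a pathlib.Path to the target sub-directory) are assumed to be defined earlier.

# Get product IDs
items = reference.item_nbr.unique()

# Generate a separate report for each product
for product in items:

    # Define the metrics to include in the report
    product_report = Report(metrics=[
        RegressionQualityMetric(),
        RegressionPredictedVsActualPlot(),
        RegressionPredictedVsActualScatter(),
        RegressionErrorDistribution(),
        RegressionErrorNormality(),
    ])

    # Build the report on the rows for this product only
    product_report.run(
        reference_data=reference[reference.item_nbr == product],
        current_data=current[current.item_nbr == product],
        column_mapping=column_mapping,
    )

    # Save the report as an HTML file named after the product ID
    product_report_path = product_report_dir / f'{product}.html'
    product_report.save_html(str(product_report_path))

    # Release the report object before the next iteration
    del product_report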

As a result, we get a bunch of reports organized using Tabs.

Evidently - Streamlit app organizing reports by tabs

Support Evidently

Did you enjoy the blog? Star Evidently on GitHub to contribute back! This helps us continue creating free, open-source tools and content for the community.

⭐️ Star on GitHub ⟶

Summing up

In this tutorial, you went through the steps to build a machine learning and data monitoring dashboard using Evidently and Streamlit. You can take it as an inspiration and build other applications to showcase and share your ML model performance.

You can further automate how you generate and add new Reports to the app, for example, using a workflow management tool like Airflow.

This approach is lightweight and lets you quickly share the Report results with other team members by deploying a Streamlit app. Overall, this can be a good starting point to organize Report-based monitoring for a few batch models before you decide to scale it up.

However, this example has limitations. An app like this works for batch model checks when you occasionally review the dashboard manually, but it does not automatically alert on any changes. You also need to generate a Report for each new period separately and re-generate Reports outside the app if you need to update anything.

As an alternative, you can use the Evidently ML monitoring dashboard that helps track the metrics over time by pulling them from individual Reports.

Here is how it looks:

You can self-host a monitoring dashboard, or sign up for Evidently Cloud to get a hosted version.

Evidently is a flexible tool and can integrate into alternative architectures, including tools like Grafana, Airflow, MLflow, and others. Browse the documentation for more example integrations, and sign up for the newsletter to get updates on new blogs and tutorials!