Build Interactive Machine Learning Apps with Gradio

As a developer working with machine learning models, you likely spend hours writing scripts and adjusting hyperparameters. But when it comes to sharing your work or letting others interact with your models, the gap between a Python script and a usable web app can feel enormous. Gradio is an open-source Python library that lets you turn your Python scripts into interactive web applications without requiring frontend expertise.

In this blog, we’ll take a fun, hands-on approach to learning the key Gradio components by building a text-to-speech (TTS) web application that you can run on an AI PC or Intel® Tiber™ AI Cloud and share with others. (Full disclosure: the author is affiliated with Intel.)

An Overview of Our Project: A TTS Python Script

We will develop a basic Python script that uses the Coqui TTS library and its xtts_v2 multilingual model. To follow along, create a requirements.txt file with the following content:

gradio
coqui-tts
torch

Then create a virtual environment and install these libraries with

pip install -r requirements.txt

Alternatively, if you’re using Intel Tiber AI Cloud, or if you have the uv package manager installed on your system, create a virtual environment and install the libraries with

uv init --bare
uv add -r requirements.txt

Then, you can run a script by passing its file name to

uv run

Gotcha Alert: For compatibility with recent dependency versions, we are using `coqui-tts`, which is a fork of the original Coqui `TTS` package. Do not attempt to install the original package with pip install TTS.

Next, we can make the necessary imports for our script:

import torch
from TTS.api import TTS

Currently, `TTS` gives you access to 94 models that you can list by running

print(TTS().list_models())

For this blog, we will use the XTTS-v2 model, which supports 17 languages and 58 speaker voices. You may load the model and view the speakers via

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

print(tts.speakers)
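The scripts in this post import torch without using it. One common use is choosing the device the model runs on. The following is a minimal sketch, not part of the original script: pick_device is a hypothetical helper name, and the lazy import with a CPU fallback is an assumption made so the snippet also runs where PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily; falls back to CPU if PyTorch is missing
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# With the model loaded as shown earlier, it could then be moved to that device:
# tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
```

On a machine without a CUDA GPU, this simply keeps the model on the CPU, which is also what happens if you never move it.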

Here is a minimal Python script that generates speech from text and saves it to a WAV file:

import torch
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

tts.tts_to_file(
    text="Every bug was once a brilliant idea--until reality kicked in.",
    speaker="Craig Gutsy",
    language="en",
    file_path="bug.wav",
)

This script works, but it’s not interactive. What if you want to let users enter their own text, choose a speaker, and get instant audio output? That’s where Gradio shines.

Anatomy of a Gradio App

A typical Gradio app comprises the following components:

Interface for defining inputs and outputs
Components such as Textbox, Dropdown, and Audio
Functions for linking the backend logic
.launch() to spin up the app and, optionally, share it by setting share=True

The Interface class has three core arguments: fn, inputs, and outputs. Set the fn argument to any Python function that you want to wrap with a user interface (UI). The inputs and outputs arguments each take one or more Gradio components. You can pass a component by name as a string, such as "textbox" or "text", or, for more customizability, as an instance of a class like Textbox().

import gradio as gr


# A simple Gradio app that multiplies two numbers using sliders
def multiply(x, y):
    return f"{x} x {y} = {x * y}"


demo = gr.Interface(
    fn=multiply,
    inputs=[
        gr.Slider(1, 20, step=1, label="Number 1"),
        gr.Slider(1, 20, step=1, label="Number 2"),
    ],
    outputs="textbox",  # Or outputs=gr.Textbox()
)

demo.launch()

Image by author

The Flag button appears by default in the Interface so the user can flag any “interesting” combination. In our example, if we press the Flag button, Gradio will generate a CSV log file under .gradio\flagged with the following content:

Number 1,Number 2,output,timestamp

12,9,12 x 9 = 108,2025-06-02 00:47:33.864511
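Because the flagged log is plain CSV, it can be inspected with Python's standard library. This is a minimal sketch that uses the sample row above as in-memory data. read_flagged_rows is a hypothetical helper name, and in practice you would pass it an open handle to the log file Gradio wrote instead of a string.

```python
import csv
import io

# Sample copied from the flagged log shown above.
SAMPLE_LOG = (
    "Number 1,Number 2,output,timestamp\n"
    "12,9,12 x 9 = 108,2025-06-02 00:47:33.864511\n"
)


def read_flagged_rows(handle):
    """Return each flagged sample as a dict keyed by the CSV header."""
    return list(csv.DictReader(handle))


rows = read_flagged_rows(io.StringIO(SAMPLE_LOG))
for row in rows:
    # Every value is read back as a string, including the slider numbers.
    print(row["Number 1"], row["Number 2"], row["output"])
```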

You may turn off this flagging option by setting flagging_mode="never" within the Interface.

Also note that we can remove the Submit button and trigger the multiply function automatically whenever an input changes by setting live=True in the Interface.
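Both options can be combined in the multiplication example. The sketch below assumes a Gradio version that accepts the flagging_mode argument mentioned above. build_demo is a hypothetical wrapper, added only so that gradio is imported when the app is actually built.

```python
def multiply(x, y):
    return f"{x} x {y} = {x * y}"


def build_demo():
    """Build the slider demo with live updates and flagging disabled."""
    import gradio as gr  # imported here so multiply() stays usable on its own

    return gr.Interface(
        fn=multiply,
        inputs=[
            gr.Slider(1, 20, step=1, label="Number 1"),
            gr.Slider(1, 20, step=1, label="Number 2"),
        ],
        outputs="textbox",
        live=True,              # rerun multiply whenever a slider changes
        flagging_mode="never",  # hide the Flag button
    )


# To start the app, call: build_demo().launch()
```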

Converting Our TTS Script to a Gradio App

As demonstrated, Gradio’s core concept is simple: you wrap your Python function with a UI using the Interface class. Here’s how you can turn the TTS script into a web app:

import gradio as gr
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")


def tts_fn(text, speaker):
    wav_path = "output.wav"
    tts.tts_to_file(text=text, speaker=speaker, language="en", file_path=wav_path)
    return wav_path


demo = gr.Interface(
    fn=tts_fn,
    inputs=[
        gr.Textbox(label="Text"),
        gr.Dropdown(choices=tts.speakers, label="Speaker"),
    ],
    outputs=gr.Audio(label="Generated Audio"),
    title="Text-to-Speech Demo",
    description="Enter text and select a speaker to generate speech.",
)
demo.launch()
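One limitation of this app is that every request writes to the same output.wav, so two users generating speech at the same time could overwrite each other's audio. The following is a minimal sketch of one possible workaround, not the only approach: make_wav_path and make_tts_fn are hypothetical helper names, and the tts argument is assumed to be the loaded model object from the script above.

```python
import os
import tempfile


def make_wav_path():
    """Create a unique, empty .wav file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # only the path is needed; the TTS call writes the audio
    return path


def make_tts_fn(tts):
    """Return a (text, speaker) function that writes each result to its own file."""

    def tts_fn(text, speaker):
        wav_path = make_wav_path()
        tts.tts_to_file(
            text=text, speaker=speaker, language="en", file_path=wav_path
        )
        return wav_path

    return tts_fn


# In the app above, the Interface would then receive fn=make_tts_fn(tts).
```

The temporary files are not deleted automatically in this sketch, so a long-running app would need its own cleanup step.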