Build Accessible AI Voice Agents in LiveKit with Bright Data

In this blog post, you learned:

What LiveKit is and why it is an ideal solution for building modern AI agents with voice and video capabilities.
Why AI agents must be accessibility-ready, and the requirements enterprises face to build accessible AI solutions.
How Bright Data integrates with LiveKit, enabling the creation of a real-world brand news podcast AI agent.
How to build an AI voice agent with Bright Data integration in LiveKit.

Let’s dive in!

What Is LiveKit?

LiveKit is an open-source framework and cloud platform that lets you build production-grade AI agents for voice, video, and multimodal interactions.

Specifically, it enables you to process and generate audio, video, and data streams using AI pipelines and agents built with Node.js, Python, or the no-code Agent Builder web interface.

The platform is well-suited for voice AI use cases such as virtual assistants, call center automation, telehealth, real-time translation, interactive NPCs, and even robotics control.

LiveKit supports STT (Speech-to-Text), LLM, and TTS (Text-to-Speech) pipelines, along with multi-agent handoffs, external tool integration, and reliable turn detection. Agents can be deployed on LiveKit Cloud or in your own infrastructure, with scalable orchestration, WebRTC-based reliability, and built-in telephony support.

The Need for Accessibility-Ready AI Agents

One of the biggest problems with AI agents today is that most of them are not accessibility-ready. Many AI agent-building platforms rely mainly on text input and text output, which can be restrictive for many users.

That is especially problematic for enterprises, which must provide accessible internal tools and also deliver products that comply with modern accessibility regulations (e.g., the European Accessibility Act).

To meet these requirements, accessibility-compliant AI agents must support users with different abilities, devices, and environments. This includes clear voice interactions, live captions, screen-reader compatibility, and low-latency performance. For global organizations, this also means multilingual support, reliable speech recognition in noisy environments, and consistent experiences across web, mobile, and telephony.

LiveKit addresses those challenges by providing real-time voice and video infrastructure, built-in speech-to-text and text-to-speech pipelines, and low-latency streaming. Its architecture offers captions, transcripts, device fallbacks, and telephony integration, enabling enterprises to build inclusive, reliable AI agents across all channels.

LiveKit + Bright Data: Architecture Overview

One of the biggest issues with AI agents is that their knowledge is limited to the data they were trained on. In practice, that means they have outdated information and cannot interact with the real world without the right external tools.

LiveKit addresses this by supporting tool calling, allowing AI agents to connect to external APIs and services such as Bright Data.

Bright Data provides a rich infrastructure of tools for AI, including:

SERP API: Collect real-time, geo-specific search engine results to discover relevant sources for any query.
Web Unlocker API: Reliably fetch content from any public URL, automatically handling blocks, CAPTCHAs, and anti-bot systems.
Crawl API: Crawl and extract entire websites, returning data in LLM-ready formats for better reasoning and inference.
Browser API: Let your AI interact with dynamic websites and automate agentic workflows at scale using remote, stealth browsers.

Thanks to them, you can build AI workflows, pipelines, and agents that cover a long list of use cases.

Build an Agent to Produce Brand News Podcast with LiveKit and Bright Data

Now, imagine building an accessible AI agent that:

Takes your brand or a brand-related topic as input.
Searches for news using the SERP API.
Selects the most relevant results.
Scrapes content using the Web Unlocker API.
Processes and summarizes their content.
Produces an audio podcast you can listen to for daily updates about what the news is saying about your company.

That kind of workflow is possible with a LiveKit + Bright Data integration that looks like this:

Let’s implement this AI voice agent!

How to Build a Voice AI Agent with Bright Data Integration in LiveKit

In this guided section, you will learn how to integrate Bright Data with LiveKit and use the SERP API and Web Unlocker tools to build an AI voice agent for brand news podcast generation.

Prerequisites

To follow along with this tutorial, you need:

A Bright Data account with the SERP API, Web Unlocker, and API key set up.
A LiveKit account.
An understanding of how LiveKit Agent Builder and voice agents work.

Do not worry about setting up your Bright Data account right now, as you will be guided through it in a dedicated step.

Step #1: Get Started with LiveKit Agent Builder

Start by creating a LiveKit account if you have not done so yet, or log in. If this is your first time accessing LiveKit, you will be redirected to the “Create your first project” form:

Give your project a name like “Branded News Podcast Producer.” Then, fill out the remaining required information and press the “Continue” button to create your LiveKit Cloud project.

You should now reach the “Branded News Podcast Producer” project page. Here, click the “AI Agents” button:

Select “Start in the browser” to access the Agent Builder page:

You will now access the web-based Agent Builder interface for your “Branded News Podcast Producer” project:

Take some time to familiarize yourself with the UI and options, and check the documentation for additional guidance.

Wonderful! You now have a LiveKit environment for AI agent building.

Step #2: Customize Your AI Voice Agent

In LiveKit, an AI voice agent consists of three main components:

TTS (Text-to-Speech) model: Converts the agent’s responses into spoken audio. You can configure it with a voice profile that specifies tone, accent, and other characteristics. The TTS model takes the text output from the LLM and turns it into speech that the user can hear.
STT (Speech-to-Text) model: Also called ASR (“Automated Speech Recognition”), transcribes spoken audio into text in real time. In a voice AI pipeline, this is the first step: the user’s speech is converted to text by the STT model, which is then processed by the LLM to generate a response. The response is finally converted back to speech using the TTS model.
LLM model (Large Language Model): Powers the reasoning, responses, and overall orchestration of your voice agent. You can choose from different models to balance performance, accuracy, and cost. The LLM receives the transcript from the STT model and produces a text response, which the TTS model then converts into speech.

To change these settings, go to the “Models & Voice” tab and customize your AI agent to meet your enterprise needs:

In this case, since we are just building a prototype in this tutorial, the default configuration is fine. You are good to go!

Step #3: Set Up Your Bright Data Account

As mentioned earlier, the AI voice agent for brand news podcast production will rely on two Bright Data services:

SERP API: To perform news searches on Google to retrieve fresh, relevant news about your brand.
Web Unlocker: To access news pages in an AI-optimized format for LLM ingestion and processing.

Before proceeding, you must configure your Bright Data account so your LiveKit agent can connect to these tools via HTTP calls.

Note: You will see how to prepare a SERP API zone in your Bright Data account for LiveKit integration. The same process can be applied to set up a Web Unlocker zone. For detailed guidance, refer to these Bright Data’s documentation pages:

If you do not already have an account, create one. Otherwise, log in. Once logged in, navigate to the “Proxies & Scraping” page. In the “My Zones” section, look for a row labeled “SERP API”:

If you do not see a “SERP API” row, it means a zone has not been set up yet. Scroll down to the “SERP API” section and click “Create Zone” to define one:

Create a SERP API zone and give it a name, such as serp_api (or any name you prefer). Keep note of the zone name, as you will need it later to connect to the service in LiveKit.

On the SERP API product page, switch the “Activate” toggle to enable the zone:

We recommend reviewing the Bright Data SERP API documentation to understand how to call the API for Google searches, available options, and other details.

Repeat the same process for Web Unlocker. For this tutorial, we will assume your Web Unlocker zone is named web_unlocker. Explore its parameters in the Bright Data documentation.

Finally, follow the official tutorial to generate your Bright Data API key. Store it safely, as it will be needed to authenticate HTTP requests from the LiveKit voice agent to the SERP API and Web Unlocker.

Amazing! Your Bright Data account is fully configured and ready to be integrated into your AI voice agent built with LiveKit.

Step #4: Add a Secret for the Bright Data API Key

The Bright Data services you just configured are authenticated via an API key, which must be included in the Authorization header when making HTTP requests to their endpoints. To avoid hardcoding your API key in your tool definitions, which is not a best practice, store it as a secret in LiveKit.

To achieve that, go back to the LiveKit Agent Builder page and navigate to the “Advanced” tab. There, click the “Add secret” button:

Specify your secret as follows:

Key: BRIGHT_DATA_API_KEY
Value: The value of the Bright Data API key you retrieved earlier

Defining the “BRIGHT_DATA_API_KEY” secret

Once done, click “Save changes” in the top-right corner to update your AI voice agent definition. In your HTTP tool definition, you will be able to access the secret using this syntax:

{{secrets.BRIGHT_DATA_API_KEY}}

Cool! You now have all the building blocks in place to integrate Bright Data services with your LiveKit AI voice agent.

Step #5: Define the Bright Data SERP API and Web Unlocker Tools in LiveKit

To allow your AI voice agent to integrate with Bright Data products, you need to define two HTTP tools. These tools instruct the LLM on how to call the SERP API and Web Unlocker API for web search and web scraping, respectively.

In particular, the two tools you will define are:

search_engine: Connects to the SERP API to retrieve parsed Google search results in JSON format.
scrape_as_markdown: Connects to the Web Unlocker API to scrape a web page and return the content in Markdown.

Pro tip: JSON and Markdown are ideal data formats for ingestion in AI agents, and they perform much better than raw HTML (the default format for both SERP API and Web Unlocker).

We will show you how to define the search_engine tool first. After that, you can repeat the same steps to define the scrape_as_markdown tool.

To add a new HTTP tool, go to the “Actions” tab and click the “Add HTTP tool” button:

Begin filling out the “Add HTTP tool” form as follows:

Tool name: search_engine
Description: Scrape Google search results in JSON format using Bright Data's SERP API
HTTP Method: POST
URL: https://api.brightdata.com/request

Define the tool parameters as below:

zone (string): Default value: "serp_api" (Note: Replace the default value with the name of your SERP API zone)
url (string): The URL of the Google SERP in the format: https://www.google.com/search?q=<search_query>"
format (string): Default value: "raw"
data_format (string): Default value: "parsed" (to get the scraped SERP page in JSON format)

These correspond to the SERP API body parameters used to call the Bright Data product for Google SERP scraping. That body instructs the SERP API to return a parsed response in JSON format from Google. The url argument will be built on the fly by the LLM based on the description you provided.

Finally, in the “Headers” section, authenticate your HTTP tool by adding the following header:

Authorization: Bearer {{secrets.BRIGHT_DATA_API_KEY}}

The value of this HTTP header after “Bearer” will be filled automatically using the Bright Data API key secret you defined earlier.

Once done, click the “Add tool” button at the bottom of the form.

Then, repeat the same procedure to define the scrape_as_markdown tool using the following information:

Tool name: scrape_as_markdown
Description: Scrape a single webpage with advanced extraction and return Markdown. Uses Bright Data's Web Unlocker to handle bot protection and CAPTCHA
HTTP method: POST
URL: https://api.brightdata.com/request
Parameters:
- zone (string): Default value: "web_unlocker" (Note: Replace the default value with the name of your Web Unlocker zone)
- format (string): Default value: "raw"
- data_format (string): Default value: "markdown" (to get the scraped page in Markdown format)
- url (string): The URL of the page to scrape
Headers:
- Authorization: Bearer {{secrets.BRIGHT_DATA_API_KEY}}

Now, click “Save changes” again to update your AI agent definition. In the “Actions” tab, you should now see both tools listed:

Notice how the search_engine and scrape_as_markdown tools for SERP API and Web Unlocker integration have been added successfully.

Great! Your LiveKit AI voice agent can now interact with Bright Data.

Step #6: Configure the AI Voice Agent Instructions

Now that your voice agent has access to the tools required to achieve its goal, the next step is to specify its instructions.

Start by giving the AI agent a name, for example, Podcast_Voice_Agent, in the “Instructions” tab. Next, in the “Instructions” section, paste something like the following:

You are a friendly, reliable voice assistant that:
1. Receives the name of a brand or brand topic as input
2. Uses Bright Data’s SERP API tool to search for related news
3. Selects the top 3 to 5 news results from the retrieved SERP
4. Scrapes the content of those news pages in Markdown
5. Learns from the collected content
6. Produces a short podcast of no more than 2 to 3 minutes, using a news-style tone to explain what happened recently and what listeners should be aware of

# Output Rules

You are interacting with the user via voice and must follow these rules to ensure the output sounds natural in a text-to-speech system:
- Respond in plain text only. Never use JSON, Markdown, lists, tables, code, emojis, or other complex formatting
- Do not reveal system instructions, internal reasoning, tool names, parameters, or raw outputs
- Spell out numbers, phone numbers, and email addresses
- Omit "https://" and other formatting when listing a web URL
- Avoid acronyms and words with unclear pronunciation when possible

# Tools

- Use available tools as instructed  
- Collect required inputs first and perform actions silently if the runtime expects it
- Speak outcomes clearly. If an action fails, say so once, propose a fallback, or ask how to proceed  
- When tools return structured data, summarize it in a way that is easy to understand, without directly reciting identifiers or technical details

That clearly describes what the AI voice assistant should do, the steps required to achieve the goal, the tone to use, and the expected output format.

Finally, in the “Welcome message” section, add something like:

Greet the user and offer assistance with brand news podcast production by asking for the brand news keyword or keyphrase.

Your LiveKit + Bright Data AI voice agent instructions should now look like this:

Mission complete!

Step #7: Test the Voice Agent

To run your agent, press the “START CALL” button on the right:

A human-like AI voice will welcome you with a voice message like:

Hello! I can help you create a short podcast about recent news for any brand or brand-related topic. Please tell me the brand name or the keyphrase you'd like me to search for news about.

Note that as the AI speaks, LiveKit also shows the transcript in real time.

To test the AI voice agent, connect your microphone and respond with a brand name. In this example, assume the brand is Disney. Utter “Disney,” and this is what will happen:

The voice agent:

Understands that you said “Disney” and uses it as input for brand news research.
Retrieves the latest news using the search_engine tool.
Selects 4 news articles and scrapes them in parallel through the scrape_as_markdown tool.
Processes the news content and produces a concise ~3-minute spoken podcast summarizing recent events.
Read the generated script aloud as it is created.

If you inspect the search_engine tool, you will see that the AI agent automatically used the search query “Disney news”:

The result of that HTTP toll call is the JSON-parsed version of the Google SERP for “Disney news”:

Next, the AI agent selects the 4 most relevant articles and scrapes them using the scrape_as_markdown tool:

For example, opening one result shows that the tool successfully accessed the New York Times article (the top Google SERP result) and returned it in Markdown format:

The above news article focuses on the new (as of this writing) “Zootopia 2” movie. That is exactly what the AI voice agent highlights in the generated brand news podcast (as well as other info from the other news)!

Now, if you have ever tried scraping news articles or retrieving Google search results programmatically, you know how complex those two tasks can be. That is because of scraping challenges like IP bans, CAPTCHAs, browser fingerprinting, and many others.

Bright Data’s SERP API and Web Unlocker integrations in LiveKit handle all those issues for you. On top of that, they return the scraped data in a format optimized for AI ingestion. Thanks to LiveKit’s accessibility capabilities, the agent can then produce audio for the podcast.

Et voilà! You just integrated Bright Data with LiveKit to create an accessibility-ready AI voice agent for enterprise brand monitoring via podcast production.

Next Steps: Access the Agent Code, Customize It, and Prepare for Deployment

Remember that LiveKit’s Agent Builder is excellent for prototyping and building proof-of-concept AI agents. However, for enterprise-grade AI agents, you may want access to the underlying code to customize it according to your specific needs.

In that regard, it is important to know that the Agent Builder generates best-practice Python code based on the LiveKit Agents SDK. To access the code, simply click the “Code” tab on the right:

In this case, the generated code is:

import logging
import os
from typing import Optional, Any
from urllib.parse import quote
import aiohttp
import asyncio
import json
import handlebars
from dotenv import load_dotenv
from livekit.agents import (
    Agent,
    AgentSession,
    AgentServer,
    JobContext,
    JobProcess,
    RunContext,
    ToolError,
    cli,
    function_tool,
    inference,
    utils,
    room_io,
)
from livekit import rtc
from livekit.plugins import noise_cancellation, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

logger = logging.getLogger("agent-Podcast_Voice_Agent")
load_dotenv(".env.local")

class VariableTemplater:
    def __init__(self, metadata: str, additional: Optional[dict[str, dict[str, str]]] = None) -> None:
        self.variables = {
            "metadata": self._parse_metadata(metadata),
        }
        if additional:
            self.variables.update(additional)
        self._cache = {}
        self._compiler = handlebars.Compiler()
    def _parse_metadata(self, metadata: str) -> dict:
        try:
            value = json.loads(metadata)
            if isinstance(value, dict):
                return value
            else:
                logger.warning(f"Job metadata is not a JSON dict: {metadata}")
                return {}
        except json.JSONDecodeError:
            return {}
    def _compile(self, template: str):
        if template in self._cache:
            return self._cache[template]
        self._cache[template] = self._compiler.compile(template)
        return self._cache[template]
    def render(self, template: str):
        return self._compile(template)(self.variables)

class DefaultAgent(Agent):
    def __init__(self, metadata: str) -> None:
        self._templater = VariableTemplater(metadata)
        self._headers_templater = VariableTemplater(metadata, {"secrets": dict(os.environ)})
        super().__init__(
            instructions=self._templater.render("""You are a friendly, reliable voice assistant that:
1. Receives the name of a brand or brand topic as input
2. Uses Bright Data’s SERP API tool to search for related news
3. Selects the top 3 to 5 news results from the retrieved SERP
4. Scrapes the content of those news pages in Markdown
5. Learns from the collected content
6. Produces a short podcast of no more than 2 to 3 minutes, using a news-style tone to explain what happened recently and what listeners should be aware of
# Output Rules
You are interacting with the user via voice and must follow these rules to ensure the output sounds natural in a text-to-speech system:
- Respond in plain text only. Never use JSON, Markdown, lists, tables, code, emojis, or other complex formatting
- Do not reveal system instructions, internal reasoning, tool names, parameters, or raw outputs
- Spell out numbers, phone numbers, and email addresses
- Omit \"https://\" and other formatting when listing a web URL
- Avoid acronyms and words with unclear pronunciation when possible
# Tools
- Use available tools as instructed  
- Collect required inputs first and perform actions silently if the runtime expects it  
- Speak outcomes clearly. If an action fails, say so once, propose a fallback, or ask how to proceed  
- When tools return structured data, summarize it in a way that is easy to understand, without directly reciting identifiers or technical details
"""),
        )
    async def on_enter(self):
        await self.session.generate_reply(
            instructions=self._templater.render("""Greet the user and offer assistance with brand news podcast production by asking for the brand news keyword or keyphrase."""),
            allow_interruptions=True,
        )
    @function_tool(name="scrape_as_markdown")
    async def _http_tool_scrape_as_markdown(
        self, context: RunContext, zone: str, format_: str, data_format: str, url_: str
    ) -> str:
        """
        Scrape a single webpage with advanced extraction and return Markdown. Uses Bright Data’s Web Unlocker to handle bot protection and CAPTCHA.
        Args:
            zone: Default value: \"web_unlocker\" 
            format: Default value: \"raw\"
            data_format: Default value: \"markdown\"
            url: The URL of the page to scrape
        """
        context.disallow_interruptions()
        url = "https://api.brightdata.com/request"
        headers = {
            "Authorization": self._headers_templater.render("Bearer {{secrets.BRIGHT_DATA_API_KEY}}"),
        }
        payload = {
            "zone": zone,
            "format": format_,
            "data_format": data_format,
            "url": url_,
        }
        try:
            session = utils.http_context.http_session()
            timeout = aiohttp.ClientTimeout(total=10)
            async with session.post(url, timeout=timeout, headers=headers, json=payload) as resp:
                body = await resp.text()
                if resp.status >= 400:
                    raise ToolError(f"error: HTTP {resp.status}: {body}")
                return body
        except ToolError:
            raise
        except (aiohttp.ClientError, asyncio.TimeoutError) as e:
            raise ToolError(f"error: {e!s}") from e
    @function_tool(name="search_engine")
    async def _http_tool_search_engine(
        self, context: RunContext, zone: str, url_: str, format_: str, data_format: str
    ) -> str:
        """
        Scrape Google search results in JSON format using Bright Data’s SERP API.
        Args:
            zone: Default value: \"serp_api\"
            url: The URL of the Google SERP in the format: https://www.google.com/search?q= <search_query>
            format: Default value: \"raw\"
            data_format: Default value: \"parsed\"
        """
        context.disallow_interruptions()
        url = "https://api.brightdata.com/request"
        headers = {
            "Authorization": self._headers_templater.render("Bearer {{secrets.BRIGHT_DATA_API_KEY}}"),
        }
        payload = {
            "zone": zone,
            "url": url_,
            "format": format_,
            "data_format": data_format,
        }
        try:
            session = utils.http_context.http_session()
            timeout = aiohttp.ClientTimeout(total=10)
            async with session.post(url, timeout=timeout, headers=headers, json=payload) as resp:
                body = await resp.text()
                if resp.status >= 400:
                    raise ToolError(f"error: HTTP {resp.status}: {body}")
                return body
        except ToolError:
            raise
        except (aiohttp.ClientError, asyncio.TimeoutError) as e:
            raise ToolError(f"error: {e!s}") from e

server = AgentServer()

def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()

server.setup_fnc = prewarm
@server.rtc_session(agent_name="Podcast_Voice_Agent")

async def entrypoint(ctx: JobContext):
    session = AgentSession(
        stt=inference.STT(model="assemblyai/universal-streaming", language="en"),
        llm=inference.LLM(model="openai/gpt-4.1-mini"),
        tts=inference.TTS(
            model="cartesia/sonic-3",
            voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
            language="en-US"
        ),
        turn_detection=MultilingualModel(),
        vad=ctx.proc.userdata["vad"],
        preemptive_generation=True,
    )
    await session.start(
        agent=DefaultAgent(metadata=ctx.job.metadata),
        room=ctx.room,
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony() if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP else noise_cancellation.BVC(),
            ),
        ),
    )

if __name__ == "__main__":
    cli.run_app(server)

To run the agent locally, refer to the official LiveKit Python SDK repository.

The next step is to customize the agent’s code, deploy it, and finalize your workflows so that the audio produced by the AI agent is recorded and then shared with your marketing team or brand stakeholders via email or other formats!

Conclusion

In this article, you learned how to leverage Bright Data’s AI integration capabilities to build a sophisticated AI voice workflow in LiveKit.

The AI agent demonstrated here is ideal for enterprises looking to automate brand monitoring while producing results that are accessible and more engaging than traditional text reports.

To create similar advanced AI agents, explore the full range of Bright Data solutions for AI. Retrieve, validate, and transform live web data with LLMs!

Create a free Bright Data account today and start experimenting with our AI-ready web data tools!

Antonello Zanini

Technical Writer

5.5 years experience

Antonello Zanini is a technical writer, editor, and software engineer with 5M+ views. Expert in technical content strategy, web development, and project management.

Expertise

Web Development Web Scraping AI Integration

View all articles

Build Accessible Enterprise AI Voice Agents with LiveKit and Bright Data