Building AI agents is getting easier by the day. In this piece, we’ll go through the process of using LangChain’s new BrightDataSERP tool. If you’re not familiar with the acronym, SERP stands for “Search Engine Results Page.”
This tutorial is beginner-friendly. All you need is a basic understanding of Python. By the time you finish this guide, you’ll have added the following skills to your toolbox.
- Perform a basic search using BrightDataSERP
- Customize your SERP output
- Clean output for LLM-friendly usage
- Create an AI agent with search capabilities
Intro: Knowledge Limitations of AI
If you’re familiar enough with LLMs, you already know that their knowledge is static. By the time they’re released to the public, they are done with training and fine-tuning — no more knowledge can be added.
Before OpenAI added search capabilities, ChatGPT had a knowledge cutoff date. LLMs still have cutoff dates, based on their last fine-tuning period. That said, models are capable of zero-shot inference. You can learn more about the overall training process here.
AI models get deployed with a static knowledge base. Through zero-shot inference, models can make sense of new data, but they will not retain the information permanently.
How LangChain Addresses Limitations
LangChain allows us to create tools and connect them to different LLMs. If you can write Python functions, you can let LLMs call those functions — at their own discretion. You give the LLM access to the tool. It does everything else. If you ask it a question it can answer with pretraining, it won’t use the tool. If you ask it a question it doesn’t know, it will use its tools to try and find the answer.
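To picture that decision, here is a toy sketch in plain Python. It is not how LangChain or an LLM works internally: the known_answers dictionary is a hypothetical stand-in for pretraining, fake_search is a hypothetical stand-in for a real tool, and in practice the model itself decides whether a tool call is needed.

```python
from typing import Callable, Dict

# Hypothetical stand-in for what a model learned during pretraining.
known_answers: Dict[str, str] = {
    "What does SERP stand for?": "Search Engine Results Page",
}


def fake_search(query: str) -> str:
    """Hypothetical tool; a real one would call a search API."""
    return f"[search results for: {query}]"


def answer(question: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Answer from 'pretraining' when possible, otherwise fall back to a tool."""
    if question in known_answers:
        return known_answers[question]
    return tools["search"](question)


tools = {"search": fake_search}
print(answer("What does SERP stand for?", tools))   # answered without the tool
print(answer("Who won yesterday's match?", tools))  # routed to the tool
```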
LangChain even offers prebuilt tools for all of the following needs.
- Search
- Code
- Productivity
- Web Browsing
- Databases
- Finance
You can view LangChain’s full list of integrated tools here. We’ve got even better news. Bright Data is one of them!
Using LangChain With Bright Data
Now that we’ve gone over what it does, let’s take a look at how to actually use LangChain with Bright Data. We’ll assume you’ve got a basic familiarity with Python. We’ll walk through what it takes to get your API keys from OpenAI and Bright Data. Before continuing, make sure to go over our web scraping with LangChain and Bright Data guide first.
Prerequisites
For starters, you need to install LangChain’s Bright Data tools. The pip install command below does exactly that.
pip install langchain-brightdata
Next, you need a Bright Data API key and a SERP zone named serp. You can sign up for a free trial of our SERP API here. Make sure your zone is named exactly serp; a name like serp1 will not work. When you’re ready, click the “Add” button and finish setting up the tool.
Next, you can get your API key from the dashboard of your new SERP zone.
To get your OpenAI key, open their API keys page and click the “Create new secret key” button.
A Basic Example
We’ll start with a simple example of how the tooling works. Swap the API key below with your own. The BrightDataSERP class does the heavy lifting here. We just set the configuration and print the results. You don’t normally need .encode("utf-8"), but we experienced some printing issues on Windows and this resolved them.
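A minimal sketch of such a search might look like the following. It assumes the BrightDataSERP constructor accepts the key as bright_data_api_key and that invoke() returns the response as a string. The helper names run_basic_search and to_printable, the sample query, and the placeholder key are ours for illustration, so check the package documentation before relying on them.

```python
def run_basic_search(query: str, api_key: str) -> str:
    """Run one search and return whatever the tool returns."""
    # Imported inside the function so this file loads even without the package.
    from langchain_brightdata import BrightDataSERP

    # Assumption: the constructor takes the key as `bright_data_api_key`.
    serp_tool = BrightDataSERP(bright_data_api_key=api_key)
    # Assumption: invoke() accepts a plain query string and returns a string.
    return serp_tool.invoke(query)


def to_printable(text: str) -> bytes:
    """Encode as UTF-8 bytes to sidestep console encoding errors on Windows."""
    return text.encode("utf-8")


# Usage (needs a real key, the `serp` zone, and network access):
# results = run_basic_search("latest AI news", "<your-bright-data-api-key>")
# print(to_printable(results))
```

Printing bytes shows a b'...' literal rather than formatted text, which is why the encode step is only worth keeping if your console chokes on the raw string.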
Here’s a snippet of sample output. If you see this (or something similar), you’re on the right track.
Advanced Usage
In the example below, we use kwargs to set a custom configuration with BrightDataSERP. You can view the full documentation on customization here. We set our search type to shop so we get more relevant shopping results.
You can customize any of the following to refine your search results.
- query
- country
- language
- search_type
- device_type
- results_count
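Here is a hedged sketch of a customized search built from the fields listed above. It assumes invoke() accepts a dictionary keyed by those field names. The helpers build_search_args and run_custom_search are hypothetical names of our own, and the validation step is our addition rather than something the library requires.

```python
from typing import Any, Dict

# Optional fields from the list above; `query` is passed separately.
OPTIONAL_FIELDS = {"country", "language", "search_type", "device_type", "results_count"}


def build_search_args(query: str, **overrides: Any) -> Dict[str, Any]:
    """Build the input dictionary, rejecting fields not in the list above."""
    unknown = set(overrides) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"Unsupported search fields: {sorted(unknown)}")
    return {"query": query, **overrides}


def run_custom_search(api_key: str, query: str, **overrides: Any) -> str:
    """Run a search with custom settings (needs a real key and network access)."""
    from langchain_brightdata import BrightDataSERP

    serp_tool = BrightDataSERP(bright_data_api_key=api_key)
    # Assumption: the tool accepts a dict whose keys match the fields above.
    return serp_tool.invoke(build_search_args(query, **overrides))


# Usage:
# run_custom_search("<your-bright-data-api-key>", "running shoes", search_type="shop")
```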
Creating an AI Agent With Bright Data And OpenAI
Now that you’ve got a basic understanding of how to use BrightDataSERP, let’s see how a real AI agent uses it. We’ll go through the code pieces and then show how it all works as a whole.
The Pieces
There are a few more things you’ll need to install before we get started.
Install LangChain itself.
pip install langchain
Install OpenAI support for LangChain.
pip install langchain-openai
Install LangGraph to create agents.
pip install langgraph
This one might be a bit of a shock in the age of AI, but we’ll install BeautifulSoup as well. You’ll see why soon enough.
pip install beautifulsoup4
Creating A Search Function
The function below retrieves our search results, much like the example from earlier. After receiving those results, we use BeautifulSoup to pull the text from them. This way, we use far fewer tokens when passing the results into our LLM. All it sees is the site text. We keep \n (newline) characters so the agent can better understand the page layout.
Once we’ve extracted the text, we return it.
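A sketch of such a function could look like this. It assumes the search tool returns HTML as a string. The names tidy_lines, html_to_text, and search_text are hypothetical, and dropping blank lines is our own token-saving choice rather than a requirement.

```python
def tidy_lines(text: str) -> str:
    """Strip each line and drop empty ones, keeping one newline between the rest."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def html_to_text(html: str) -> str:
    """Pull the visible text out of an HTML page, preserving line breaks."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    # separator="\n" puts each text fragment on its own line.
    return tidy_lines(soup.get_text(separator="\n"))


def search_text(query: str, api_key: str) -> str:
    """Search, then return only the page text (needs a real key and network access)."""
    from langchain_brightdata import BrightDataSERP

    # Assumption: the key is passed as `bright_data_api_key` and invoke() returns HTML.
    serp_tool = BrightDataSERP(bright_data_api_key=api_key)
    return html_to_text(serp_tool.invoke(query))
```

For a sense of what tidy_lines does, tidy_lines("  Title \n\n\n  Body text\n") returns "Title\nBody text".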
Turning The Function Into a Tool
Now, we’ll use LangChain’s Tool class to wrap the function. This allows our agent to call it as a tool. As you see below, it’s pretty simple. We give it a name and description. We also point the tool to a function with the func argument.
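The wrapping step might be sketched as follows. We import Tool from langchain_core.tools here, which is one place the class is exposed. The tool name, the description text, and the make_search_tool helper are illustrative choices of ours.

```python
from typing import Callable

# Illustrative values; the model reads these to decide when to call the tool.
TOOL_NAME = "web_search"
TOOL_DESCRIPTION = (
    "Search the web and return the text of the results page. "
    "Input should be a search query."
)


def make_search_tool(search_func: Callable[[str], str]):
    """Wrap a plain function so an agent can call it as a tool."""
    from langchain_core.tools import Tool

    return Tool(name=TOOL_NAME, func=search_func, description=TOOL_DESCRIPTION)


# Usage, with any function that maps a query string to result text:
# search_tool = make_search_tool(my_search_function)
```

A clear description matters more than it looks, since it is the only hint the model gets about when the tool is useful.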
Creating The Agent
The code below creates our agent. ChatOpenAI creates an LLM instance. We pass our LLM and our tool into create_react_agent() to create the actual agent.
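As a rough sketch, agent creation could be wrapped like this. It assumes create_react_agent is imported from langgraph.prebuilt and that ChatOpenAI accepts model and api_key arguments. The build_agent helper and the gpt-4o-mini default are our own placeholders, so swap in whichever model you have access to.

```python
def build_agent(openai_api_key: str, tools: list, model: str = "gpt-4o-mini"):
    """Create an agent that can call the given tools (needs a real OpenAI key)."""
    from langchain_openai import ChatOpenAI
    from langgraph.prebuilt import create_react_agent

    # Assumption: the model name is a placeholder, not a recommendation.
    llm = ChatOpenAI(model=model, api_key=openai_api_key)
    # The agent decides on each turn whether to answer directly or call a tool.
    return create_react_agent(llm, tools)


# Usage:
# agent = build_agent("<your-openai-api-key>", [search_tool])
```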
A Boring But Functional UI
Every program needs a runtime, no matter how primitive. Here, we just create a basic terminal setup for the user to interact with the agent. The user inputs a prompt. The prompt gets passed into messages, and then we stream the agent’s output.
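One way to sketch that loop is below. It assumes the agent exposes stream() and yields state dictionaries whose messages carry a content attribute, which is how LangGraph agents behave with stream_mode="values". The run_chat name, the injectable read and write arguments, and the exit keywords are our additions.

```python
def run_chat(agent, read=input, write=print):
    """Read prompts in a loop and stream the agent's messages until the user exits."""
    while True:
        prompt = read("You: ").strip()
        if prompt.lower() in {"exit", "quit"}:  # exit keywords are our assumption
            break
        if not prompt:
            continue
        inputs = {"messages": [("user", prompt)]}
        # Each step is the full state so far; we show only the newest message.
        for step in agent.stream(inputs, stream_mode="values"):
            write(step["messages"][-1].content)


# Usage:
# run_chat(agent)
```

Passing read and write as arguments keeps the loop easy to try out with a stand-in agent before spending any API credits.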
Putting It All Together
The Full Code
Here’s our full code example.
What Our Agent Sees
The snippet here is what the agent sees. This contains our prompt and the page it fetches for referencing.
Model Output
In the snippet below, our model has finished reviewing the results. As you can see, we have a clean summary of the search results.
Conclusion
AI development gets easier all the time. With LangChain and Bright Data, you can use some of the best search engines around — Google, Bing and more! Our example here was pretty minimal, an automated search assistant.
You can take this project to the next level by adding multiple tools to your agent. You now know how to create tools, trim SERP results, and feed them to an AI agent for enhanced output. Take your new skills and go build something.
Here at Bright Data, we offer products of every shape and size to fit your data collection needs. Sign up for a free trial and get started today!