AI

Building an AI News Research Assistant with Bright Data and Vercel AI SDK

This guide walks you through building an AI news research assistant that scrapes global news, bypasses paywalls, detects bias, and provides smart analysis using Bright Data and Vercel AI SDK.
8 min read
Bright Data with Vercel AI SDK blog image

This system searches news across global sources, extracts full article content (bypassing paywalls), detects bias in coverage, and generates intelligent analysis based on real news events.

You will learn:

  • How to build news research tools using Bright Data SDK for web scraping
  • How to leverage Vercel AI SDK for intelligent news analysis and conversation
  • How to bypass paywalls and anti-bot protection to access any news source
  • How to detect bias by comparing coverage across multiple outlets
  • How to create an automated pipeline from news discovery to fact-checking

Let’s Begin!

Prerequisites

To follow this tutorial, you need:

  • Basic knowledge of React and Next.js
  • Node.js 20.18.1+ installed on your local development environment
  • A Bright Data account with API access (free tier available)
  • An OpenAI API key with GPT-4 access
  • Familiarity with TypeScript and modern JavaScript
  • Basic understanding of working with APIs and environment variables

Challenges in Conventional News Consumption

Traditional ways of accessing information and news have several key limitations:

  • Information Overload: You encounter hundreds of headlines daily. This makes it hard to figure out what matters or relates to your interests.  
  • Bias and Perspective Gaps: Most people get their news from a small number of sources. You miss important viewpoints and often see one-sided coverage of complex issues.  
  • Paywall Barriers: Quality journalism is often behind paywalls. You find it difficult to access full articles from various sources for thorough research.  
  • Fact-Checking Burden: To verify claims, you need to search multiple sources, cross-check information, and assess source credibility. Most people don’t have time for this.  
  • Lack of Context: Breaking news often misses historical context or related events. You have trouble seeing the complete picture of developing stories.  

NewsIQ addresses these challenges. It combines AI-powered analysis with enterprise-grade web scraping. The system accesses any news source (bypassing anti-bot protection), analyzes coverage across multiple outlets, and provides intelligent insights with proper source attribution.

Building the News Research Assistant

We’ll build NewsIQ, a complete AI news research assistant using Bright Data and Vercel AI SDK. We’ll create a solution to process news from any source and provide intelligent analysis through a conversational interface.

Step 1: Project Setup

First, set up your Next.js development environment. Create a new directory for the project:

npx create-next-app@latest ai-news-assistant

When prompted, select the following option:

Selecting the usage of recommended settings

Navigate to your project directory and install the required packages:

cd ai-news-assistant && 
npm install @brightdata/sdk ai zod @ai-sdk/openai

These packages provide everything you need: Bright Data SDK for web scraping, Vercel AI SDK for intelligent analysis, Zod for type-safe schema validation and OpenAI for LLM text generation.

All packages added successfully and the project was created

Next, create a .env.local file to store your API credentials:

BRIGHTDATA_API_KEY=your_brightdata_api_key_here
OPENAI_API_KEY=your_openai_api_key_here

You need:

  • Bright Data API token: Generate from your Bright Data dashboard
  • OpenAI API key: For LLM text generation

Step 2: Define the News Research Tools

Create the core news research functionality by defining three tools to leverage Bright Data’s web scraping. In the project directory, create a new file called lib/brightdata-tools.ts:

import { tool, type Tool } from "ai";
import { z } from "zod";
import { bdclient } from "@brightdata/sdk";

type NewsTools = "searchNews" | "scrapeArticle" | "searchWeb";

interface NewsToolsConfig {
  apiKey: string;
  excludeTools?: NewsTools[];
}

export const newsTools = (
  config: NewsToolsConfig
): Partial<Record<NewsTools, Tool>> => {
  const client = new bdclient({
    apiKey: config.apiKey,
    autoCreateZones: true,
  });

  const tools: Partial<Record<NewsTools, Tool>> = {
    searchNews: tool({
      description:
        "Search for news articles on any topic using Google News. Returns recent news articles with titles, snippets, sources, and publication dates. Use this for finding current news coverage on specific topics.",
      inputSchema: z.object({
        query: z
          .string()
          .describe(
            'The news search query (e.g., "artificial intelligence", "climate change policy", "tech earnings")'
          ),
        country: z
          .string()
          .length(2)
          .optional()
          .describe(
            'Two-letter country code for localized news (e.g., "us", "gb", "de", "fr", "jp")'
          ),
      }),
      execute: async ({
        query,
        country,
      }: {
        query: string;
        country?: string;
      }) => {
        try {
          const newsQuery = `${query} news`;
          const result = await client.search(newsQuery, {
            searchEngine: "google",
            dataFormat: "markdown",
            format: "raw",
            country: country?.toLowerCase() || "us",
          });
          return result;
        } catch (error) {
          return `Error searching for news on "${query}": ${String(error)}`;
        }
      },
    }),

    scrapeArticle: tool({
      description:
        "Scrape the full content of a news article from any URL. Returns the complete article text in clean markdown format, bypassing paywalls and anti-bot protection. Use this to read full articles after finding them with searchNews.",
      inputSchema: z.object({
        url: z.string().url().describe("The URL of the news article to scrape"),
        country: z
          .string()
          .length(2)
          .optional()
          .describe("Two-letter country code for proxy location"),
      }),
      execute: async ({ url, country }: { url: string; country?: string }) => {
        try {
          const result = await client.scrape(url, {
            dataFormat: "markdown",
            format: "raw",
            country: country?.toLowerCase(),
          });
          return result;
        } catch (error) {
          return `Error scraping article at ${url}: ${String(error)}`;
        }
      },
    }),

    searchWeb: tool({
      description:
        "General web search using Google, Bing, or Yandex. Use this for background research, fact-checking, or finding additional context beyond news articles.",
      inputSchema: z.object({
        query: z
          .string()
          .describe(
            "The search query for background information or fact-checking"
          ),
        searchEngine: z
          .enum(["google", "bing", "yandex"])
          .optional()
          .default("google")
          .describe("Search engine to use"),
        country: z
          .string()
          .length(2)
          .optional()
          .describe("Two-letter country code for localized results"),
      }),
      execute: async ({
        query,
        searchEngine = "google",
        country,
      }: {
        query: string;
        searchEngine?: "google" | "bing" | "yandex";
        country?: string;
      }) => {
        try {
          const result = await client.search(query, {
            searchEngine,
            dataFormat: "markdown",
            format: "raw",
            country: country?.toLowerCase(),
          });
          return result;
        } catch (error) {
          return `Error searching web for "${query}": ${String(error)}`;
        }
      },
    }),
  };

  for (const toolName in tools) {
    if (config.excludeTools?.includes(toolName as NewsTools)) {
      delete tools[toolName as NewsTools];
    }
  }

  return tools;
};

This code defines three essential tools using the Vercel AI SDK’s tool interface. The searchNews tool queries Google News for recent articles. The scrapeArticle extracts full content from any news URL (bypassing paywalls). The searchWeb provides general web search for fact-checking. Each tool uses Zod schemas for type-safe input validation and returns structured data for the AI to analyze. The Bright Data client handles all the complexity of anti-bot protection and proxy management automatically.

Step 3: Create the AI Chat API Route

Build the API endpoint to power the conversational interface. Create app/api/chat/route.ts:

import { openai } from "@ai-sdk/openai";
import { streamText, convertToModelMessages, stepCountIs } from "ai";
import { newsTools } from "@/lib/brightdata-tools";

export const maxDuration = 60;

export async function POST(req: Request) {
  const { messages } = await req.json();
  const modelMessages = convertToModelMessages(messages);

  const tools = newsTools({
    apiKey: process.env.BRIGHTDATA_API_KEY!,
  });


  const result = streamText({
    model: openai("gpt-4o"),
    messages: modelMessages,
    tools,
    stopWhen: stepCountIs(5),
    system: `You are NewsIQ, an advanced AI news research assistant. Your role is to help users stay informed, analyze news coverage, and understand complex current events.

**Core Capabilities:**
1. **News Discovery**: Search for current news on any topic using searchNews
2. **Deep Reading**: Scrape full articles with scrapeArticle to provide complete context
3. **Fact Checking**: Use searchWeb to verify claims and find additional sources
4. **Bias Analysis**: Compare coverage across multiple sources and identify potential bias
5. **Trend Analysis**: Identify emerging stories and track how topics evolve

**Guidelines:**
- Always cite your sources with publication name and date
- When analyzing bias, be objective and provide evidence
- For controversial topics, present multiple perspectives
- Clearly distinguish between facts and analysis
- If information is outdated, note the publication date
- When scraping articles, summarize key points before analysis
- For fact-checking, use multiple independent sources

**Response Format:**
- Start with a clear, direct answer
- Provide source citations in context
- Use bullet points for multiple sources
- End with a brief analysis or insight
- Offer to explore specific aspects further

Remember: Your goal is to help users become better-informed, critical thinkers.`,
  });

  return result.toUIMessageStreamResponse();
}

This API route creates a streaming endpoint to connect your news research tools with OpenAI’s GPT-4. The comprehensive system prompt guides the AI to act as a professional news analyst. It emphasizes source citation, objectivity, and critical thinking. The streaming response shows users analysis in real-time as it generates, creating a responsive conversational experience.

Getting a response based on real-time data

Step 4: Build the Chat Interface

Create the user interface for interacting with NewsIQ. Replace the content of app/page.tsx with:

```typescript
"use client";

import { useChat } from "@ai-sdk/react";
import { useState } from "react";

export default function NewsResearchAssistant() {
  const { messages, sendMessage, status } = useChat();
  const [input, setInput] = useState("");

  const [exampleQueries] = useState([
    "🌍 What are the latest developments in climate change policy?",
    "💻 Search for news about artificial intelligence regulation",
    "📊 How are different sources covering the economy?",
    "⚡ What are the trending tech stories this week?",
    "🔍 Fact-check: Did [specific claim] really happen?",
  ]);

  return (
    <div className="flex flex-col h-screen bg-gradient-to-br from-slate-50 via-blue-50 to-indigo-50">
      {/* Header */}
      <header className="bg-white shadow-md border-b border-gray-200">
        <div className="max-w-5xl mx-auto px-6 py-5">
          <div className="flex items-center gap-3">
            <div className="bg-gradient-to-br from-blue-600 to-indigo-600 w-12 h-12 rounded-xl flex items-center justify-center shadow-lg">
              <span className="text-2xl">📰</span>
            </div>
            <div>
              <h1 className="text-2xl font-bold text-gray-900">NewsIQ</h1>
              <p className="text-sm text-gray-600">
                AI-Powered News Research & Analysis
              </p>
            </div>
          </div>
        </div>
      </header>

      {/* Main Chat Area */}
      <div className="flex-1 overflow-hidden max-w-5xl w-full mx-auto px-6 py-6">
        <div className="h-full flex flex-col bg-white rounded-2xl shadow-xl border border-gray-200">
          {/* Messages Container */}
          <div className="flex-1 overflow-y-auto p-6 space-y-6">
            {messages.length === 0 ? (
              <div className="h-full flex flex-col items-center justify-center text-center px-4">
                {/* Welcome Screen */}
                <div className="bg-gradient-to-br from-blue-500 to-indigo-600 w-20 h-20 rounded-2xl flex items-center justify-center mb-6 shadow-lg">
                  <span className="text-4xl">📰</span>
                </div>
                <h2 className="text-3xl font-bold text-gray-900 mb-3">
                  Welcome to NewsIQ
                </h2>
                <p className="text-gray-600 mb-8 max-w-2xl text-lg">
                  Your AI-powered research assistant for news analysis,
                  fact-checking, and staying informed. I can search across news
                  sources, analyze bias, and help you understand complex
                  stories.
                </p>

                {/* Feature Pills */}
                <div className="flex flex-wrap gap-3 justify-center mb-8">
                  <div className="px-4 py-2 bg-blue-100 text-blue-700 rounded-full text-sm font-medium">
                    🔍 Multi-Source Research
                  </div>
                  <div className="px-4 py-2 bg-purple-100 text-purple-700 rounded-full text-sm font-medium">
                    🎯 Bias Detection
                  </div>
                  <div className="px-4 py-2 bg-green-100 text-green-700 rounded-full text-sm font-medium">
                    ✓ Fact Checking
                  </div>
                  <div className="px-4 py-2 bg-orange-100 text-orange-700 rounded-full text-sm font-medium">
                    📊 Trend Analysis
                  </div>
                </div>

                {/* Example Queries */}
                <div className="w-full max-w-3xl">
                  <p className="text-sm font-semibold text-gray-700 mb-4">
                    Try asking:
                  </p>
                  <div className="grid grid-cols-1 md:grid-cols-2 gap-3">
                    {exampleQueries.map((query, i) => (
                      <button
                        key={i}
                        onClick={() => {
                          setInput(query);
                        }}
                        className="p-4 text-left bg-gradient-to-br from-gray-50 to-gray-100 hover:from-blue-50 hover:to-indigo-50 rounded-xl border border-gray-200 hover:border-blue-300 transition-all duration-200 text-sm text-gray-700 hover:text-gray-900 shadow-sm hover:shadow-md"
                      >
                        {query}
                      </button>
                    ))}
                  </div>
                </div>
              </div>
            ) : (
              // Messages Display
              messages.map((m: any) => (
                <div
                  key={m.id}
                  className={`flex ${
                    m.role === "user" ? "justify-end" : "justify-start"
                  }`}
                >
                  <div
                    className={`max-w-[85%] rounded-2xl px-5 py-4 ${
                      m.role === "user"
                        ? "bg-gradient-to-br from-blue-600 to-indigo-600 text-white shadow-lg"
                        : "bg-gray-100 text-gray-900 border border-gray-200"
                    }`}
                  >
                    <div className="flex items-center gap-2 mb-2">
                      <span className="text-lg">
                        {m.role === "user" ? "👤" : "📰"}
                      </span>
                      <span className="text-xs font-semibold opacity-90">
                        {m.role === "user" ? "You" : "NewsIQ"}
                      </span>
                    </div>
                    <div className="prose prose-sm max-w-none prose-headings:font-bold prose-h3:text-lg prose-h3:mt-4 prose-h3:mb-2 prose-p:my-2 prose-ul:my-2 prose-li:my-1 prose-a:text-blue-600 prose-a:underline prose-strong:font-semibold">
                      <div
                        className="whitespace-pre-wrap"
                        dangerouslySetInnerHTML={{
                          __html:
                            m.parts
                              ?.map((part: any) => {
                                if (part.type === "text") {
                                  let html = part.text
                                    // Headers
                                    .replace(/### (.*?)$/gm, "<h3>$1</h3>")
                                    // Bold
                                    .replace(
                                      /\*\*(.*?)\*\*/g,
                                      "<strong>$1</strong>"
                                    )
                                    // Links
                                    .replace(
                                      /\[(.*?)\]\((.*?)\)/g,
                                      '<a href="$2" target="_blank" rel="noopener noreferrer">$1</a>'
                                    );

                                  html = html.replace(
                                    /(^- .*$\n?)+/gm,
                                    (match: string) => {
                                      const items = match
                                        .split("\n")
                                        .filter((line: string) => line.trim())
                                        .map((line: string) =>
                                          line.replace(/^- /, "")
                                        )
                                        .map((item: any) => `<li>${item}</li>`)
                                        .join("");
                                      return `<ul>${items}</ul>`;
                                    }
                                  );

                                  // Paragraphs
                                  html = html
                                    .split("\n\n")
                                    .map((para: string) => {
                                      if (
                                        para.trim() &&
                                        !para.startsWith("<")
                                      ) {
                                        return `<p>${para}</p>`;
                                      }
                                      return para;
                                    })
                                    .join("");

                                  return html;
                                }
                                return "";
                              })
                              .join("") || "",
                        }}
                      />
                    </div>
                  </div>
                </div>
              ))
            )}

            {/* Loading Indicator */}
            {(status === "submitted" || status === "streaming") && (
              <div className="flex justify-start">
                <div className="bg-gray-100 rounded-2xl px-5 py-4 border border-gray-200">
                  <div className="flex items-center gap-3">
                    <div className="flex space-x-2">
                      <div className="w-2 h-2 bg-blue-500 rounded-full animate-bounce"></div>
                      <div className="w-2 h-2 bg-blue-500 rounded-full animate-bounce delay-100"></div>
                      <div className="w-2 h-2 bg-blue-500 rounded-full animate-bounce delay-200"></div>
                    </div>
                    <span className="text-sm text-gray-600">
                      Researching news sources...
                    </span>
                  </div>
                </div>
              </div>
            )}
          </div>

          {/* Input Area */}
          <div className="border-t border-gray-200 p-5 bg-gray-50">
            <form
              onSubmit={(e) => {
                e.preventDefault();
                if (input.trim()) {
                  sendMessage({ text: input });
                  setInput("");
                }
              }}
              className="flex gap-3"
            >
              <input
                value={input}
                onChange={(e) => setInput(e.target.value)}
                placeholder="Ask about any news topic, request analysis, or fact-check a claim..."
                className="flex-1 px-5 py-3 border border-gray-300 rounded-xl focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent bg-white shadow-sm text-gray-900 placeholder-gray-600"
                disabled={status === "submitted" || status === "streaming"}
              />

              <button
                type="submit"
                disabled={
                  status === "submitted" ||
                  status === "streaming" ||
                  !input.trim()
                }
                className="px-8 py-3 bg-gradient-to-r from-blue-600 to-indigo-600 text-white rounded-xl hover:from-blue-700 hover:to-indigo-700 disabled:opacity-50 disabled:cursor-not-allowed transition-all duration-200 font-semibold shadow-lg hover:shadow-xl"
              >
                {status === "submitted" || status === "streaming" ? (
                  <span className="flex items-center gap-2">
                    <svg className="animate-spin h-5 w-5" viewBox="0 0 24 24">
                      <circle
                        className="opacity-25"
                        cx="12"
                        cy="12"
                        r="10"
                        stroke="currentColor"
                        strokeWidth="4"
                        fill="none"
                      />
                      <path
                        className="opacity-75"
                        fill="currentColor"
                        d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"
                      />
                    </svg>
                    Analyzing
                  </span>
                ) : (
                  "Research"
                )}
              </button>
            </form>
            <div className="flex items-center justify-between mt-3">
              <p className="text-xs text-gray-500">
                Powered by Bright Data × Vercel AI SDK
              </p>
              <div className="flex gap-2">
                <span className="px-2 py-1 bg-green-100 text-green-700 rounded text-xs font-medium">
                  ✓ Real-time
                </span>
                <span className="px-2 py-1 bg-blue-100 text-blue-700 rounded text-xs font-medium">
                  🌐 Global Sources
                </span>
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  );
}

This interface creates an engaging conversational experience with the Vercel AI SDK’s useChat hook. The welcome screen features example queries to help you get started. The main chat area displays messages with streaming support. The design uses Tailwind CSS for a modern, professional appearance with gradient backgrounds and smooth animations. The component handles loading states gracefully and provides visual feedback during AI processing.

Screenshot of NewsIQ welcome screen with example queries

Step 5: Update the Root Layout

Complete the application setup by updating app/layout.tsx with proper metadata:

import type { Metadata } from 'next'
import { Inter } from 'next/font/google'
import './globals.css'

const inter = Inter({ subsets: ['latin'] })

export const metadata: Metadata = {
  title: 'NewsIQ - AI News Research Assistant',
  description:
    'AI-powered news research, analysis, and fact-checking tool. Search across sources, detect bias, and stay informed with intelligent insights.',
  keywords: [
    'news',
    'AI',
    'research',
    'fact-checking',
    'bias detection',
    'news analysis',
  ],
}

export default function RootLayout({
  children,
}: {
  children: React.ReactNode
}) {
  return (
    <html lang="en">
      <body className={inter.className}>{children}</body>
    </html>
  )
}

This layout configuration sets up proper SEO metadata and loads the Inter font for a clean, professional typography throughout the application.

Step 6: Running the Application

To run the application, use this command:

npm run dev

The application will start on http://localhost. To test NewsIQ’s capabilities, try this example query:

Fact-check: Did Apple announce a new product last week?

The AI will automatically use the appropriate tools based on your query. When you ask for news, it searches Google News. When you request full articles, it scrapes the content. For fact-checking, it cross-references multiple sources. You’ll see results streaming in real-time as the AI processes information.

NewsIQ in action - searching, scraping, and analyzing news

Step 7: Deploy to Vercel

To deploy your application to production, first push your code to GitHub:

git init
git add .
git commit -m "Initial commit: NewsIQ AI News Assistant"
git branch -M main
git remote add origin https://github.com/yourusername/ai-news-assistant.git
git push -u origin main

Then deploy to Vercel:

  1. Go to vercel.com and sign in with GitHub
  2. Click “Add New Project” and import your repository
  3. Configure environment variables:
  • Add BRIGHTDATA_API_KEY
  • Add OPENAI_API_KEY
  1. Click “Deploy”

Your application will be live in 2-3 minutes at a URL like https://ai-news-assistant.vercel.app.

Final Thoughts

This AI news research assistant shows how automation streamlines news gathering and analysis. If you’re interested in alternative approaches to integrating Bright Data with AI tools, explore our guide on web scraping with an MCP server. To further enhance your news monitoring workflows, consider Bright Data products like our Web Scraper API for accessing any news source, as well as other datasets and automation tools built for content aggregation and media monitoring teams.

Explore more solutions in the Bright Data documentation.

Create a free Bright Data account to start using your automated news research workflows.

Arindam Majumder

Technical Writer

Arindam Majumder is a developer advocate, YouTuber, and technical writer who simplifies LLMs, agent workflows, and AI content for 5,000+ followers.

Expertise
RAG AI Agents Python