Node.js Web Proxy Setup: Using node-fetch and https-proxy-agent - Bright Data

Node.js is a popular JavaScript runtime environment that lets you create server-side and network applications. For instance, if you need to fetch data from a remote API or website in Node.js, you can use a web proxy server that acts as an intermediary between your application and the internet. This allows you to bypass network restrictions, access geo-blocked content, and hide your IP address.

Most modern JavaScript apps use the Fetch API to make HTTP requests. It’s built into web browsers, and since Node.js v18, it’s available by default without any external dependencies. However, there’s a catch: the Node.js built-in fetch mirrors the browser API and doesn’t accept a Node.js HTTP agent, so you can’t route it through a proxy the way this tutorial does.

Thankfully, there’s a simple solution: the node-fetch library. It provides a Fetch API implementation for Node.js with Node.js-specific extras, including support for Node.js HTTP agents. An HTTP agent manages connection pooling so that connections can be reused across HTTP requests, and a proxy-aware agent opens those connections through a proxy. This means you can pass such an agent to fetch to send your requests through the proxy.

In this article, you’ll learn how to use node-fetch along with the https-proxy-agent library to create an HTTP agent that supports both HTTP and HTTPS proxies. You’ll also learn about the Bright Data proxy service, which offers a variety of proxy types and features for your web scraping needs.

How to Use a Proxy with node-fetch

Before you begin this tutorial, you need the following prerequisites:

A working installation of Node.js.
Access to a web proxy. Alternatively, for testing purposes, you can set up your own web proxy using Node.js.

Once you have these prerequisites, you’re ready to start!

Create a Node.js Project

The first thing you need to do is create a Node.js project and initialize it with npm. To do this, open a terminal and run the following commands:

mkdir node-fetch-proxy
cd node-fetch-proxy
npm init -y

These commands create a folder called node-fetch-proxy, navigate into it, and create a package.json file with some default values.

Install node-fetch and https-proxy-agent

Next, you need to install the node-fetch and https-proxy-agent libraries as dependencies for your project with the following command:

npm install --save node-fetch https-proxy-agent

This command installs the libraries to your project’s node_modules folder and updates your package.json file accordingly.

Use the HttpsProxyAgent and node-fetch to Make HTTP Calls through a Proxy

After you install the node-fetch and https-proxy-agent libraries, you need to use the HttpsProxyAgent class from the https-proxy-agent library along with the fetch function from the node-fetch library to make HTTP calls through a proxy. To do this, create a file called proxy.mjs in your project folder and add the following code:

import fetch from 'node-fetch';
import { HttpsProxyAgent } from 'https-proxy-agent';

// Replace <proxy_url> with your actual proxy URL
const agent = new HttpsProxyAgent('<proxy_url>');

// Use fetch with the agent option to make an HTTP request through the proxy
// Replace <target_url> with the URL you want to request
fetch('<target_url>', { agent })
  .then((response) => response.text())
  .then((text) => console.log(text))
  .catch((error) => console.error(error));

This code completes the following actions:

It imports the fetch function from the node-fetch library, which provides a browser-compatible Fetch API for Node.js.
It imports the HttpsProxyAgent class from the https-proxy-agent library, which creates an HTTP agent that supports HTTPS proxies.
It creates an HttpsProxyAgent instance with your proxy URL. You need to replace <proxy_url> with your actual proxy URL, which should have the following format: http://username:password@host:port.
It uses the fetch function with the agent option to make an HTTP request through the proxy. You need to replace <target_url> with the URL you want to request, which can be any valid HTTP or HTTPS URL.
It prints the response body as text and logs any errors that occur.
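Credentials that contain reserved characters such as @ or : would break the http://username:password@host:port format unless they are percent-encoded. The following is a minimal sketch of a helper that assembles the proxy URL safely. The function name buildProxyUrl and its option names are hypothetical and not part of https-proxy-agent, and the sketch assumes the agent decodes the credentials before sending them to the proxy:

```javascript
// Hypothetical helper: assembles a proxy URL in the form
// http://username:password@host:port, percent-encoding the credentials
// so that characters like "@" and ":" don't corrupt the URL.
function buildProxyUrl({ host, port, username, password, protocol = 'http' }) {
  const auth =
    username !== undefined && password !== undefined
      ? `${encodeURIComponent(username)}:${encodeURIComponent(password)}@`
      : '';
  return `${protocol}://${auth}${host}:${port}`;
}

// Example values are placeholders, not real credentials.
const proxyUrl = buildProxyUrl({
  host: 'proxy.example.com',
  port: 8080,
  username: 'user',
  password: 'p@ss:word',
});

console.log(proxyUrl); // http://user:p%40ss%3Aword@proxy.example.com:8080
```

The resulting string can be passed to new HttpsProxyAgent(proxyUrl) in place of <proxy_url>.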

Save the code and run the following command:

node proxy.mjs

This command executes your code and makes an HTTP request through the proxy. You should see the response text or any errors in your terminal.
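Keep in mind that fetch only rejects its promise on network-level failures. An HTTP error status such as 403 or 429 still resolves normally, so the code above would print the error page instead of reporting a failure. The following is a minimal sketch of a stricter variant using async/await. The helper name fetchTextViaProxy and its parameters are hypothetical; the fetch function and the agent are passed in so the helper doesn't depend on any particular library:

```javascript
// Sketch: fetch a URL through a proxy agent and fail loudly on HTTP errors.
// fetchImpl is the fetch function you use (for example, the node-fetch import).
// agent is an agent instance such as the HttpsProxyAgent created in proxy.mjs.
async function fetchTextViaProxy(fetchImpl, url, agent) {
  const response = await fetchImpl(url, { agent });

  // fetch resolves for any HTTP status, so error statuses
  // have to be checked explicitly.
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }

  return response.text();
}

// Usage inside proxy.mjs, where fetch and agent are already defined:
//   const text = await fetchTextViaProxy(fetch, '<target_url>', agent);
//   console.log(text);
```

To confirm that traffic really goes through the proxy, one option is to request an IP echo endpoint (for instance https://httpbin.org/ip, used here only as an assumed example) and check that the reported address belongs to the proxy rather than to your machine.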

Limitations to Using a Proxy with node-fetch

While this approach works well when using a simple proxy, you’ll find that many real-world use cases, such as web scraping, require a more sophisticated approach. Many websites and web APIs block the IP addresses of well-known proxies, making it difficult to gather the data you need.

Additionally, sites might have geographic restrictions that hide content from visitors based on their location. For instance, some US newspapers block European visitors from their websites due to difficulties in complying with the General Data Protection Regulation (GDPR) and the ePrivacy Directive.

Fortunately, more sophisticated proxies, like those offered by the Bright Data proxy service, provide several types of proxies that help users work around these limitations.

Bright Data Proxy Service

If you’re looking for a reliable and scalable proxy service for web scraping, look no further. Bright Data is a leading provider of proxy solutions, offering over 72 million IP addresses in 195 countries, with 99.9 percent uptime.

With Bright Data proxy servers, you can choose from different types of proxies, such as residential, datacenter, internet service provider (ISP), mobile, or super proxies, depending on your use case and budget. This range of choices is helpful when creating a web scraper that needs to obtain content from sites that block known proxy IP addresses.

Instead of relying on a generic proxy running in a location you can’t control, Bright Data proxies let you choose the type of proxy that best matches the type of user who typically accesses the content you’re trying to scrape. You can also access advanced features, such as geo-targeting, IP rotation, and session control, that ensure your application doesn’t get blocked.
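To make the idea of IP rotation concrete, the following sketch cycles through several agents on the client side and moves on to the next one when a request looks blocked. It is only an illustration under stated assumptions: the names createRotator and fetchWithRotation are hypothetical, the list of "blocked" status codes is a guess that varies by site, and this is not how Bright Data implements rotation, which a managed service handles for you:

```javascript
// Status codes treated as "probably blocked or throttled".
// This set is an assumption; real sites signal blocking in many ways.
const BLOCK_STATUSES = new Set([403, 429]);

// Returns a function that yields the items in round-robin order.
function createRotator(items) {
  if (items.length === 0) throw new Error('At least one item is required');
  let index = 0;
  return () => {
    const item = items[index];
    index = (index + 1) % items.length;
    return item;
  };
}

// Tries the request with each agent in turn until one is not blocked.
// fetchImpl is the fetch function in use (for example, node-fetch);
// agents is an array of agent instances, one per proxy.
async function fetchWithRotation(fetchImpl, url, agents, maxAttempts = agents.length) {
  const nextAgent = createRotator(agents);
  let lastError;

  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const agent = nextAgent();
    try {
      const response = await fetchImpl(url, { agent });
      if (!BLOCK_STATUSES.has(response.status)) return response;
      lastError = new Error(`Blocked with status ${response.status}`);
    } catch (error) {
      lastError = error; // network or proxy failure: try the next agent
    }
  }

  throw lastError ?? new Error('No attempts were made');
}
```

With node-fetch, the agents array could hold one HttpsProxyAgent per proxy URL. A do-it-yourself loop like this quickly runs into the limits described earlier, which is where a managed rotation feature becomes useful.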

Bright Data Proxy Types

Bright Data offers different types of proxies for different web scraping scenarios. The following are a few of the proxies it offers:

The residential proxy funnels your internet queries through genuine devices, like personal computers or laptops, that are linked to the web via a residential ISP. These proxies excel in web scraping activities because they blend in with ordinary users, which reduces the chance your proxied requests will be blocked.
The super proxy acts as a gateway and grants you entry to Bright Data’s extensive network of residential proxies. With a single super proxy URL, you can connect to any residential proxy without needing to designate a particular IP address. Moreover, the super proxy offers the flexibility to fine-tune multiple settings, including selecting the country, city, ISP, or ASN of the residential proxy you wish to employ.
The datacenter proxy uses an IP address allocated to a server situated within a datacenter facility. Datacenter proxies are prone to detection and subsequent blocking by websites and APIs, but they’re still useful for apps accessing data from URLs that don’t try to block proxy requests. They are most appropriate for web scraping projects where your scraping traffic doesn’t need to come from residential, business, or mobile IP addresses.
The ISP proxy is an IP address that’s assigned to a server hosted by an ISP. ISP proxies are similar to residential proxies, with the difference being that their IP addresses may be residential or business IPs. This makes them a good fit when scraping sites that expect a mix of residential and business traffic.
Mobile proxies are IP addresses tied to mobile hardware (e.g., smartphones or tablets) that connect to the internet via cellular networks. Bright Data mobile proxies are virtually undetectable, effectively simulating the activities of genuine mobile users. They are particularly useful when you need to scrape content served specifically by mobile websites and APIs, which might block requests coming from non-mobile IP addresses.

Proxy Manager and APIs

With so many choices, managing your proxy use can be complicated. Fortunately, the Bright Data Proxy Manager lets you easily configure and manage your proxies, monitor your usage and performance, and troubleshoot any issues.

Additionally, you can use Bright Data APIs and integrations to integrate proxies into your code seamlessly. The Scraping Browser API makes it easy to use proxies from tools you’re likely familiar with (e.g., Playwright and Puppeteer), and the Web Unlocker helps you overcome blocking that would normally prevent your scraper from accessing a site.

Conclusion

In this article, you learned how to use the node-fetch library with the https-proxy-agent library to create an HTTP agent that supports HTTPS proxies. You also learned about the Bright Data proxy service, which offers a variety of proxy types and features for your web scraping needs.

Using a web proxy with fetch requests in Node.js can help you overcome network restrictions, access geo-blocked content, or hide your IP address. However, simple proxying using node-fetch and https-proxy-agent may not be enough for some web scraping scenarios, especially if you need anonymity, geo-targeting, IP rotation, or other advanced features. In these cases, you should consider using a Bright Data proxy instead.

Bright Data’s proxy services offer a wide range of features that can help you access any website or API without getting blocked or throttled. You can also choose from different types of proxies depending on your use case and budget. To get started with the Bright Data proxy service, sign up for a free trial today.
