Guide to Using cURL With Proxies

Use this detailed guide complete with code snippets to help jump start your cURL with proxies journey.
Daniel Shashko | SEO Specialist
10-Nov-2022

What is curl

curl, short for ‘Client URL’, is a command-line tool for transferring data to and from servers over the internet using URL syntax. It supports proxies, HTTP POST requests, and user authentication, and is widely used for data collection. Companies typically use curl to download specific pages or, combined with scripting, entire websites.

Does curl support proxies?

Yes. To start using curl with a proxy, pass the proxy address with either of the following options:

-x
--proxy

Then supply the proxy’s credentials with either of the following options:

-U
--proxy-user

If you leave the protocol or the port number out of the proxy address, curl falls back to the following defaults:

Protocol: http://
Port number: 1080

Here is an example of what this should look like:

$ curl --proxy proxy_FQDN_OR_IPAddress:PortNo --proxy-user Username:Password "Website link"


For Example -

$ curl -x proxy.example.com:3128 -U testuser:test123 https://www.reddit.com

Or

$ curl --proxy proxy.example.com:3128 --proxy-user testuser:test123 https://www.reddit.com

Or

$ curl --proxy testuser:test123@proxy.example.com:3128 http://www.reddit.com
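If you keep the proxy details in separate variables, the single-argument form above can be assembled in a small script. The following is a minimal sketch, not a definitive setup: the variable names are our own, and the host, port, and credentials are the placeholder values used in the examples above. It prints the resulting command rather than running it, so you can check the result before sending any traffic.

```shell
#!/bin/sh
# Placeholder values taken from the examples above; replace with your own.
PROXY_HOST="proxy.example.com"
PROXY_PORT="3128"
PROXY_USER="testuser"
PROXY_PASS="test123"

# Combine the parts into the protocol://user:password@host:port form.
PROXY_URL="http://${PROXY_USER}:${PROXY_PASS}@${PROXY_HOST}:${PROXY_PORT}"

# Print the command instead of running it, so it can be reviewed first.
echo "curl --proxy \"${PROXY_URL}\" \"https://www.reddit.com\""
```

Note that a password containing characters such as @ or : would need to be URL-encoded in this form; passing it through --proxy-user avoids that.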

Installing curl

For those of you using macOS, here are step-by-step instructions:

Open the Terminal and run the following command to install Homebrew:

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

If you are prompted for a password, enter your Mac login password.

Now, run the following command to make the brew command available inside the Terminal (the path shown applies to Apple Silicon Macs, where Homebrew installs under /opt/homebrew):

$ echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile

Finally, run the following command to install curl:

$ brew install curl
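After installing, it can be worth confirming which curl binary your shell will actually run. Here is a small sketch of such a check; the CURL_STATUS variable is only an illustrative name. On macOS the reported path may still be the system copy rather than the Homebrew one, depending on your PATH.

```shell
#!/bin/sh
# Check whether a curl binary is on the PATH and report which one will run.
if command -v curl >/dev/null 2>&1; then
    CURL_STATUS="installed"
    command -v curl             # path of the binary the shell will use
    curl --version | head -n 1  # first line carries the version number
else
    CURL_STATUS="missing"
    echo "curl was not found on the PATH" >&2
fi
```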

For those of you looking for Windows or Linux installation guides, check out these instructions

What you will need to start using proxies

In order to get started using curl with proxy services, you will need to have the following information handy: IP address, port number, protocol, and in some cases the username and password. The most common internet protocols currently in use include HTTP and HTTPS – the next section will explain more about which proxies work best with which use cases and internet protocols. 

Which proxies and protocols work best

In order to answer this question, one needs to understand that curl can be used to send API requests which have many components. The most important ones to focus on for our purposes are:

  • ‘Endpoints’ – This is the URL or the website from which we are attempting to extract/download data.
  • ‘Headers’ – This consists of the request metadata such as the User Agent.

Deciding which proxies to use for web scraping with curl will very much depend on your use case and the nature of your request. In Bright Data’s ‘Ultimate Guide for Proxies’, you can really get into the nitty gritty. But for now here are some highlights:

  • Datacenter: Demands more R&D resources in order to build things such as ‘user emulation’ and may benefit use cases that require ‘static IPs’ e.g., collecting retail data for a single location.
  • Residential: Requires fewer in-house resources. This is a real network of devices belonging to individuals and will work best when looking to collect geo-specific, customer tailored data. For example, localized competitor marketing campaigns.
  • ISP proxies: Are a combination between Datacenter and Residential as they are routed through datacenters but are treated as Residential requests by target sites. This network works best with web data extraction use cases that have specific city or country targeting such as product pricing and consumer sentiment.
  • Mobile proxies: Consist of real 3G/4G devices. This type of proxy works best for cellular based activities such as ad-verification, and application User Experience and Interface monitoring.

HTTP vs. HTTPS

Regarding HTTP vs. HTTPS, the latter is the preferable option, both when using curl and for secure data collection in general. The ‘S’ stands for Secure: HTTPS encrypts the traffic between the client and the target server using the Transport Layer Security (TLS) protocol.

HTTPS aims to authenticate the target website and to protect the privacy and integrity of the data being transferred. It is therefore better suited to the discreet collection or transfer of sensitive data, whereas HTTP is better suited to collecting in-depth market research data or data at scale.

Each business can use the above to decide which protocol best suits their needs both when using curl and generally speaking.

How to define a proxy in curl

Once you have decided on a proxy protocol type, you can set up your proxy in curl. To see the available options, run this command (on recent curl versions, curl --help all prints the complete list):

curl --help

Then look for the following option in the output:

-x, --proxy [protocol://]host[:port]

Using environment variables

If you prefer to use environment variables, export the ones that apply to your traffic as follows:

export http_proxy="http://user:password@Proxy_IP_Address_or_FQDN:port"
export https_proxy="http://user:password@Proxy_IP_Address_or_FQDN:port"

Example –

$ export http_proxy="http://testuser:test123@proxy.example.com:3128"
$ export https_proxy="http://testuser:test123@proxy.example.com:3128"

Now, you can continue running curl normally using the following command:

$ curl -v https://www.reddit.com

The -v option is helpful for checking which proxy and port number are used to connect to the target URL.
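Exported variables stay in place for the rest of the shell session. If you only want them for one command, you can prefix that command with the assignments instead. The sketch below illustrates the idea with the placeholder proxy from the examples above; it uses env in place of curl so the effect can be inspected without a network connection, and the PROXY and SCOPED_VARS names are our own.

```shell
#!/bin/sh
# Placeholder proxy address, reusing the example values from this guide.
PROXY="http://testuser:test123@proxy.example.com:3128"

# Prefixing a command with VAR=value sets the variable for that command only.
# `env` stands in for curl here, so the result can be checked offline.
SCOPED_VARS=$(http_proxy="$PROXY" https_proxy="$PROXY" env | grep '^https*_proxy=')
echo "$SCOPED_VARS"

# With a reachable proxy, the same prefix would go in front of curl:
#   http_proxy="$PROXY" https_proxy="$PROXY" curl -v https://www.reddit.com
```

Keep in mind that curl reads http_proxy in lowercase only, while https_proxy is accepted in either case.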

Important tricks and tips

In this section, we are going to show you some interesting tricks and valuable tips to using proxies with curl in a way that benefits your specific use case the most.

How to always use proxies for curl

If you want a proxy to be used only for curl-based jobs, add it to curl’s configuration file as follows:

One - Open (or create) the .curlrc file in your home directory:

$ cd ~
$ nano .curlrc

Two - Add this line to the file:

proxy=http://user:password@Proxy_IP_address_or_FQDN:port

Example -

proxy=http://testuser:test123@proxy.example.com:3128

Three - Now run curl as usual:

$ curl "https://www.reddit.com"
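If you would rather not change ~/.curlrc for every curl run, the same setting can live in a separate file that you load on demand with curl’s --config (-K) option. The following is a sketch under assumptions: the file location and the CONFIG_FILE name are hypothetical, and the credentials are the placeholder values from the example above. It only writes the file and prints the command that would use it.

```shell
#!/bin/sh
# Write a standalone config file instead of editing ~/.curlrc directly.
# The path below is a hypothetical location chosen for this sketch.
CONFIG_FILE="${TMPDIR:-/tmp}/curl-proxy-example.rc"

cat > "$CONFIG_FILE" <<'EOF'
# Same syntax as ~/.curlrc: long option names without the leading dashes.
proxy = "http://testuser:test123@proxy.example.com:3128"
EOF

# The file holds credentials, so keep it readable by the current user only.
chmod 600 "$CONFIG_FILE"

# Show the command that would load this file for a single run.
echo "curl --config \"$CONFIG_FILE\" https://www.reddit.com"
```

In the other direction, passing -q (--disable) as the first argument tells curl to skip ~/.curlrc for that one invocation.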

Turning proxies on and off

You can do this by adding aliases to your .bashrc file. First, open the file in your editor:

$ cd ~
$ nano .bashrc

Then add the following lines:

alias proxyon="export http_proxy='http://user:password@Proxy_IP_Or_FQDN:Port';export https_proxy='http://user:password@Proxy_IP_Or_FQDN:Port'"

alias proxyoff="unset http_proxy;unset https_proxy"

Example –

alias proxyon="export http_proxy='http://testuser:test123@proxy.example.com:3128';export https_proxy='http://testuser:test123@proxy.example.com:3128'"

Now, save the .bashrc file and reload it in the current shell using:

$ source ~/.bashrc

Run the alias command in the terminal to quickly check that the aliases are set up.
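Aliases are only expanded in interactive shells, so if you want the same switch inside scripts, shell functions are one alternative. The sketch below is an illustration rather than a drop-in replacement: the PROXY_URL variable and the proxystatus helper are our own additions, and the proxy address is the placeholder from the example above.

```shell
#!/bin/sh
# Placeholder proxy address; replace with your own details.
PROXY_URL="http://testuser:test123@proxy.example.com:3128"

# Turn the proxy on for every command started from this shell.
proxyon() {
    export http_proxy="$PROXY_URL"
    export https_proxy="$PROXY_URL"
}

# Turn the proxy off again by removing both variables.
proxyoff() {
    unset http_proxy https_proxy
}

# Print the current state so it is easy to see which mode is active.
proxystatus() {
    echo "http_proxy=${http_proxy:-<unset>}"
    echo "https_proxy=${https_proxy:-<unset>}"
}
```

After sourcing the file, running proxyon followed by proxystatus should print the proxy address twice, and proxyoff should return both lines to unset.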

Bypass SSL certificate errors

When curl encounters an SSL certificate error, it blocks the request. When looking to debug, especially in a one-off scenario, you can skip certificate verification by adding -k or --insecure to the curl command line as follows:

curl -x "[protocol://][host][:port]" -k [URL]

Getting More information about the request

In some cases, your requests won’t work as you expected, and you will probably want to diagnose the request path, headers and different errors.

To investigate the request, add -v (--verbose) right after curl. This prints the request and response headers, along with details of each connection curl makes.
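Verbose output can get long. curl writes it to stderr and marks each line with a prefix: * for connection notes, > for headers it sends, and < for headers it receives. A small filter can keep just the parts you need when checking a proxy connection. This is a sketch, and verbose_summary is a name we made up for it.

```shell
#!/bin/sh
# Keep only connection notes ("*") and sent request headers (">")
# from curl's verbose output, dropping response headers and body.
verbose_summary() {
    grep -E '^[*>] '
}
```

With a reachable proxy, it could be used along these lines: curl -v -x "http://proxy.example.com:3128" https://www.reddit.com 2>&1 >/dev/null | verbose_summary. The redirection sends the verbose stream into the pipe and discards the page body.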

Ignore proxies for a single request

If you are looking to override a proxy for a specific request, go ahead and use the following command line:

curl --proxy "http://user:password@proxy_FQDN_Or_IPAddress" "https://reddit.com"

Or, if you want to bypass proxies altogether, use:

$ curl --noproxy "*" https://www.reddit.com

With the -v option, the output shows that the connection goes directly to Reddit without using any proxy, as shown in the image:

Not using curl with a proxy
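Besides the "*" wildcard, --noproxy also accepts a comma-separated list of hosts, and curl reads the same kind of list from the no_proxy environment variable. The sketch below shows both forms; the host list and the NO_PROXY_HOSTS name are hypothetical, and the curl command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical list of hosts that should be reached directly.
NO_PROXY_HOSTS="localhost,127.0.0.1,internal.example.com"

# Option 1: apply the list to every curl run started from this shell.
export no_proxy="$NO_PROXY_HOSTS"

# Option 2: apply the list to a single command (printed here, not run).
echo "curl --noproxy \"$NO_PROXY_HOSTS\" https://internal.example.com"
```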

Using SOCKS proxies

If you wish to use any kind of SOCKS proxy (4/4a/5/5h), the command structure remains the same as before, except that you swap the protocol prefix for the relevant SOCKS type as follows:

curl -x "socks5://user:password@Proxy_IP_or_FQDN:Port" https://www.reddit.com

For Example -

$ curl -x "socks5://testuser:test123@proxy.example.com:3128" https://www.reddit.com

Pro Tip – Always include the socks prefix! If no protocol is specified with -x, curl treats the proxy as an HTTP proxy, not a SOCKS proxy.
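Since the prefix decides how curl talks to the proxy, it can help to build the proxy string in one place and reject anything unexpected. The sketch below does that for the four SOCKS variants mentioned above; socks_proxy_url is a hypothetical helper name, and the host and port in the usage note are placeholders.

```shell
#!/bin/sh
# Build a SOCKS proxy string for curl's -x option.
# Usage: socks_proxy_url <scheme> <host:port>
socks_proxy_url() {
    scheme=$1
    hostport=$2
    case "$scheme" in
        socks4|socks4a|socks5|socks5h)
            printf '%s://%s\n' "$scheme" "$hostport"
            ;;
        *)
            # Anything else is refused rather than silently passed along.
            echo "unsupported SOCKS scheme: $scheme" >&2
            return 1
            ;;
    esac
}
```

For instance, socks_proxy_url socks5h proxy.example.com:1080 prints socks5h://proxy.example.com:1080. The socks5h and socks4a variants ask the proxy to resolve the target hostname, whereas socks5 and socks4 resolve it locally.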

The bottom line

When looking to use curl with proxies, there are many technical decisions to make, but the most important point to remember throughout this journey is to use a reputable proxy provider. Bright Data offers all of the above-mentioned proxy types, performs real-time network monitoring, and implements a zero IP address reselling policy.

Additionally, Bright Data has one of the largest residential peer networks, enabling data collection from a local user’s perspective. This is especially true for companies looking for US-based IPs, making Bright Data a popular choice among business professionals and developers alike.

Daniel Shashko | SEO Specialist

Daniel is an SEO specialist here at Bright Data with a B2C background. He is in charge of ensuring that businesses get exposed to articles that help them become more data-driven. He is fascinated by the intricate inner workings that the digital world is comprised of and how these can be navigated for hypergrowth.