Making HTTP requests is one of those skills that sounds intimidating but becomes second nature once you get the hang of it. Whether you're pulling data from an API, scraping a website, or building your own R package that talks to the web, you'll need to know how to send requests and handle responses.

In this guide, I'll show you the modern approach to HTTP requests in R using httr2, along with practical examples that solve real problems. By the end, you'll be able to hit any API, handle authentication, deal with rate limits, and even run parallel requests for better performance.

Table of contents

  • What are HTTP requests (and why should you care)?
  • Getting started with httr2
  • Your first GET request
  • Adding query parameters and headers
  • POST requests with body data
  • Authentication strategies
  • Error handling and retries
  • Rate limiting and throttling
  • Parallel requests for performance
  • Working with paginated APIs
  • Converting curl commands to R
  • Common pitfalls and how to avoid them

What are HTTP requests (and why should you care)?

Every time you visit a website, your browser sends an HTTP request to a server asking for data. The server processes that request and sends back an HTTP response containing the webpage, image, or whatever you asked for.

In R, you can do the same thing programmatically. Instead of clicking around in a browser, you write code that sends requests and processes responses. This is incredibly useful for:

  • Pulling data from APIs - Most modern web services offer APIs that return structured data (usually JSON)
  • Web scraping - When there's no API, you can request HTML pages and extract the data you need
  • Building integrations - Connect your R code to external services like Slack, GitHub, or Stripe
  • Automating workflows - Schedule scripts to fetch fresh data without manual intervention

The key difference from using a browser? You get to process the responses programmatically, which means you can extract exactly what you need and integrate it directly into your data analysis pipeline.

Getting started with httr2

R has several packages for making HTTP requests, but httr2 is the modern standard. It's a complete rewrite of the older httr package with a cleaner, pipe-friendly interface and better features for working with APIs.

If you've used httr before, httr2 will feel familiar, just better. Think of it as httr redesigned after years of real-world usage, with the annoying parts fixed.

Install it like any other package:

install.packages("httr2")
library(httr2)

The core workflow in httr2 is straightforward:

  1. Create a request object
  2. Modify it (add headers, authentication, etc.)
  3. Perform the request
  4. Extract data from the response

This is different from httr, where you'd do everything in one function call. Having an explicit request object makes it easier to build complex requests step by step and works beautifully with pipes.
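In practice, those four steps often read as a single pipeline. Here's a minimal sketch (the GitHub endpoint is just an illustration):

library(httr2)

result <- request("https://api.github.com") |>   # 1. create a request
  req_headers(Accept = "application/json") |>    # 2. modify it
  req_perform() |>                               # 3. perform the request
  resp_body_json()                               # 4. extract data from the response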

Your first GET request

Let's start simple. Here's how to fetch data from an API:

library(httr2)

# Create a request
req <- request("https://api.github.com/users/hadley")

# Perform it and get the response
resp <- req_perform(req)

# Check the status
resp

This returns something like:

<httr2_response>
GET https://api.github.com/users/hadley
Status: 200 OK
Content-Type: application/json
Body: In memory (1876 bytes)

A 200 status code means success. The response includes headers (like Content-Type) and a body containing the actual data.
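You can also pull those pieces out individually with httr2's accessor functions:

resp_status(resp)        # 200
resp_status_desc(resp)   # "OK"
resp_content_type(resp)  # "application/json"
resp_headers(resp)       # the full set of response headers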

To extract the JSON data from the response:

# Parse JSON response
user_data <- resp_body_json(resp)

# Access specific fields
user_data$name
user_data$public_repos

The resp_body_json() function automatically parses the JSON and returns an R list. For other response types, httr2 provides resp_body_html(), resp_body_xml(), and resp_body_string().
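For example, if you want the raw payload instead of parsed JSON, a quick sketch reusing the response above:

# Grab the body as a single string
raw_text <- resp_body_string(resp)
substr(raw_text, 1, 80)  # peek at the first 80 characters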

Adding query parameters and headers

Most APIs need additional information in your requests. Query parameters go in the URL (like ?page=1&limit=50), while headers provide metadata about your request.

Query parameters

Instead of manually building URL strings, use req_url_query():

req <- request("https://api.github.com/search/repositories") |>
  req_url_query(
    q = "language:R",
    sort = "stars",
    per_page = 10
  )

resp <- req_perform(req)

This automatically handles URL encoding, so you don't have to worry about spaces or special characters breaking your URLs.

Custom headers

Headers are metadata about your request. The most common use case is authentication, but they're also used to specify content types or user agents:

req <- request("https://api.example.com/data") |>
  req_headers(
    Accept = "application/json",
    `User-Agent` = "my-r-script/1.0"
  )

resp <- req_perform(req)

Note the backticks around User-Agent. Hyphens aren't allowed in ordinary R argument names, so the backticks mark User-Agent as a non-syntactic name rather than a subtraction of two variables.

Seeing what gets sent

Want to debug your request before sending it? Use req_dry_run():

req |> req_dry_run()

This prints out exactly what HTTP request will be sent, including all headers and the request body. It's incredibly useful when something isn't working and you need to see what's actually happening under the hood.
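For the request built above, the output looks something like this (the exact set of headers varies with your httr2 and curl versions):

GET /data HTTP/1.1
Host: api.example.com
User-Agent: my-r-script/1.0
Accept: application/json
Accept-Encoding: deflate, gzip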

POST requests with body data

GET requests fetch data. POST requests send data to a server. This is how you submit forms, create resources, or send data to an API.

Sending JSON

The most common format for API requests is JSON:

req <- request("https://api.example.com/users") |>
  req_body_json(
    list(
      name = "Alice",
      email = "alice@example.com",
      age = 30
    )
  )

resp <- req_perform(req)

The req_body_json() function automatically:

  • Converts your R list to JSON
  • Sets the Content-Type header to application/json
  • Changes the request method from GET to POST

Sending form data

Some APIs expect form-encoded data instead of JSON:

req <- request("https://api.example.com/login") |>
  req_body_form(
    username = "alice",
    password = "secret123"
  )

resp <- req_perform(req)

This mimics what happens when you submit an HTML form in a browser.

Uploading files

For file uploads, use req_body_multipart():

req <- request("https://api.example.com/upload") |>
  req_body_multipart(
    file = curl::form_file("data.csv"),
    description = "Monthly sales data"
  )

resp <- req_perform(req)

Authentication strategies

Most useful APIs require authentication. httr2 supports all the common methods.

Bearer tokens

The simplest authentication method - just include a token in the headers:

req <- request("https://api.github.com/user/repos") |>
  req_auth_bearer_token("ghp_your_token_here")

resp <- req_perform(req)

This adds an Authorization: Bearer <token> header to your request.

Pro tip: Never hardcode tokens in your scripts. Use environment variables instead:

token <- Sys.getenv("GITHUB_TOKEN")

req <- request("https://api.github.com/user/repos") |>
  req_auth_bearer_token(token)

Set the environment variable in your .Renviron file:

GITHUB_TOKEN=ghp_your_token_here
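If you have the usethis package installed, it can open that file for you:

# usethis is optional here; edit_r_environ() opens ~/.Renviron for editing
usethis::edit_r_environ()

Restart R after editing so the new variable gets picked up.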

Basic authentication

Some APIs use username and password:

req <- request("https://api.example.com/data") |>
  req_auth_basic("username", "password")

resp <- req_perform(req)

This sends your credentials base64-encoded (not encrypted) in the Authorization header. It's called "basic" because it's simple, but that simplicity means it's only safe over HTTPS.

OAuth 2.0

OAuth is more complex but provides better security for accessing user data. httr2 has built-in support for various OAuth flows:

client <- oauth_client(
  id = "your_client_id",
  secret = "your_client_secret",
  token_url = "https://api.example.com/oauth/token",
  name = "my_app"
)

req <- request("https://api.example.com/data") |>
  req_oauth_auth_code(
    client = client,
    auth_url = "https://api.example.com/oauth/authorize"
  )

resp <- req_perform(req)

The first time you run this, it'll open a browser window for you to authorize the app. After that, httr2 caches the token (in memory for the current R session by default) so you don't have to authorize again.
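If you want the token to survive across R sessions too, req_oauth_auth_code() takes a cache_disk argument. A sketch building on the example above:

req <- request("https://api.example.com/data") |>
  req_oauth_auth_code(
    client = client,
    auth_url = "https://api.example.com/oauth/authorize",
    cache_disk = TRUE  # persist the cached token on disk across sessions
  )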

Error handling and retries

Not every request succeeds. Servers go down, networks have hiccups, and rate limits kick in. httr2 makes it easy to handle these situations gracefully.

Basic error handling

By default, httr2 converts HTTP errors (4xx and 5xx status codes) into R errors:

req <- request("https://api.github.com/users/nonexistent")

tryCatch(
  resp <- req_perform(req),
  error = function(e) {
    message("Request failed: ", e$message)
  }
)
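Alternatively, you can tell httr2 not to treat any status as an error and inspect the response yourself. A minimal sketch:

req <- request("https://api.github.com/users/nonexistent") |>
  req_error(is_error = \(resp) FALSE)  # never convert HTTP errors to R errors

resp <- req_perform(req)
resp_status(resp)  # 404, returned as a normal response object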

Automatic retries

For transient errors (like rate limiting or temporary server issues), you can automatically retry:

req <- request("https://api.example.com/data") |>
  req_retry(
    max_tries = 3,
    is_transient = \(resp) resp_status(resp) %in% c(429, 500, 503)
  )

resp <- req_perform(req)

This will retry up to 3 times if the server returns a 429 (rate limit), 500, or 503 error. httr2 automatically uses exponential backoff, waiting longer between each retry.
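You can also tune how long httr2 waits between attempts. The backoff argument takes a function of the attempt number, and max_seconds caps the total time spent retrying; the values below are illustrative:

req <- request("https://api.example.com/data") |>
  req_retry(
    max_tries = 5,
    max_seconds = 60,               # give up after a minute in total
    backoff = \(attempt) 2^attempt  # wait 2, 4, 8, ... seconds between tries
  )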

Custom error messages

Want to provide better error messages to your users?

req <- request("https://api.example.com/data") |>
  req_error(
    is_error = \(resp) resp_status(resp) >= 400,
    body = function(resp) {
      json <- resp_body_json(resp)
      paste("API error:", json$error$message)
    }
  )

Now when an error occurs, your custom message gets displayed instead of the generic HTTP error.

Rate limiting and throttling

APIs often have rate limits - restrictions on how many requests you can make per minute or hour. Blast an API with too many requests and you'll get blocked.

httr2's req_throttle() helps you stay within limits:

req <- request("https://api.github.com/users/hadley") |>
  req_throttle(rate = 30 / 60)  # 30 requests per 60 seconds

resp <- req_perform(req)

This ensures you never exceed 30 requests per minute. If you make requests too quickly, httr2 automatically waits before sending the next one.

For more sophisticated rate limiting:

req <- request("https://api.example.com/data") |>
  req_throttle(
    capacity = 100,      # Maximum 100 requests
    fill_time_s = 3600   # Refills over 1 hour
  )

This implements a token bucket algorithm - you start with 100 requests, and the bucket refills at a rate of 1 request every 36 seconds.
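Throttle state is tracked per realm, which defaults to the request's hostname, so separate request objects to the same host already share one bucket. You can set the realm explicitly when several endpoints share a limit (the realm name below is just an illustration):

# Both requests draw from the same 100-request bucket
req_users <- request("https://api.example.com/users") |>
  req_throttle(capacity = 100, fill_time_s = 3600, realm = "example-api")

req_items <- request("https://api.example.com/items") |>
  req_throttle(capacity = 100, fill_time_s = 3600, realm = "example-api")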

Parallel requests for performance

Sequential requests are slow. If you need to fetch data for 100 users, doing it one at a time takes forever. Parallel requests let you make multiple requests simultaneously.

Simple parallel execution

library(httr2)

# Create a list of requests
users <- c("hadley", "jennybc", "jimhester")
reqs <- lapply(users, function(user) {
  request(paste0("https://api.github.com/users/", user)) |>
    req_throttle(rate = 30 / 60)  # Still respect rate limits!
})

# Perform them in parallel
resps <- req_perform_parallel(reqs, max_active = 3)

# Extract data from each response
user_data <- lapply(resps, resp_body_json)

The max_active parameter controls how many requests run simultaneously. Don't set this too high or you'll overwhelm the server (and get blocked).

Important: Always use req_throttle() with parallel requests. Without it, you'll fire off all requests at once and likely get rate limited.

Handling errors in parallel requests

By default, req_perform_parallel() stops on the first error. Use on_error = "continue" to keep going:

resps <- req_perform_parallel(
  reqs,
  on_error = "continue",
  max_active = 5
)

# Check which requests succeeded
successes <- resps_successes(resps)
failures <- resps_failures(resps)

cat("Succeeded:", length(successes), "\n")
cat("Failed:", length(failures), "\n")

Working with paginated APIs

Many APIs return data in pages. Instead of sending all 10,000 results at once, they send 50 at a time and provide a way to request the next page.

httr2 has built-in helpers for common pagination patterns:

Offset-based pagination

req <- request("https://api.example.com/items") |>
  req_url_query(limit = 50)

resps <- req_perform_iterative(
  req,
  next_req = iterate_with_offset(
    param_name = "offset",
    start = 0,
    offset = 50,  # step by the page size so offsets go 0, 50, 100, ...
    resp_pages = function(resp) {
      json <- resp_body_json(resp)
      json$total_pages
    }
  ),
  max_reqs = Inf
)

# Combine all results
all_items <- unlist(lapply(resps, function(r) {
  resp_body_json(r)$items
}), recursive = FALSE)

This automatically adds offset=0, offset=50, offset=100, etc. to subsequent requests until all pages are fetched.
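httr2 also provides resps_data() to extract and combine data across responses in one step. A sketch (the $items field matches the hypothetical API above):

# Applies the callback to each response and concatenates the results
all_items <- resps_data(resps, \(resp) resp_body_json(resp)$items)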

Cursor-based pagination

Some APIs use cursors instead of offsets:

req <- request("https://api.example.com/items")

resps <- req_perform_iterative(
  req,
  next_req = iterate_with_cursor(
    param_name = "cursor",
    resp_param_value = function(resp) {
      json <- resp_body_json(resp)
      json$next_cursor
    }
  ),
  max_reqs = Inf
)

The resp_param_value callback extracts next_cursor from each response and supplies it as the cursor parameter on the next request; iteration stops when the callback returns NULL (i.e., when the API sends no next_cursor).

Converting curl commands to R

Ever find a working curl command in API documentation and wish you could just use it in R? Good news: httr2 can translate curl commands for you.

library(httr2)

curl_cmd <- 'curl -X GET --header "Accept: application/json" --header "Authorization: Bearer token123" "https://api.example.com/data?limit=10"'

# Convert to httr2 code
curl_translate(curl_cmd)

This outputs R code you can copy and paste:

request("https://api.example.com/data") |>
  req_url_query(limit = "10") |>
  req_headers(
    Accept = "application/json",
    Authorization = "Bearer token123"
  )

It even handles complex curl commands with custom headers, authentication, and request bodies. This is a massive time-saver when you're trying to replicate an API call you found in documentation or copied from your browser's developer tools.
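If you call curl_translate() with no arguments, it reads the curl command from your clipboard instead (this needs the clipr package installed):

# Copy a curl command from the docs, then run:
curl_translate()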

Common pitfalls and how to avoid them

Forgetting to URL-encode parameters

If you build URLs manually, special characters will break them:

# Bad
url <- paste0("https://api.example.com/search?q=", "R programming")  # Space breaks the URL

# Good
req <- request("https://api.example.com/search") |>
  req_url_query(q = "R programming")  # Automatically encoded

Not handling rate limits

Hammering an API without rate limiting is a fast track to getting blocked:

# Bad - no rate limiting
for (i in 1:1000) {
  resp <- request("https://api.example.com/data") |> req_perform()
}

# Good - throttled
req <- request("https://api.example.com/data") |>
  req_throttle(rate = 10 / 60)  # 10 per minute

for (i in 1:1000) {
  resp <- req_perform(req)
  # httr2 automatically waits when needed
}

Hardcoding secrets in scripts

Never put API keys or passwords directly in your code:

# Bad - visible in code
req <- request("https://api.example.com/data") |>
  req_auth_bearer_token("super_secret_token_123")

# Good - from environment variable
token <- Sys.getenv("API_TOKEN")
req <- request("https://api.example.com/data") |>
  req_auth_bearer_token(token)

Set the environment variable in your .Renviron file, and it stays out of version control.

Not using req_dry_run() when debugging

When requests aren't working, looking at the actual HTTP being sent saves hours of frustration:

req <- request("https://api.example.com/data") |>
  req_headers(Authorization = "Bearer token") |>
  req_url_query(limit = 10) |>
  req_dry_run()  # See exactly what will be sent

Ignoring HTTP status codes

Just because a request doesn't error doesn't mean it worked:

resp <- req_perform(req)

# Check the status
if (resp_status(resp) == 200) {
  data <- resp_body_json(resp)
} else {
  warning("Request returned status: ", resp_status(resp))
}

A 404 means "not found," 401 means "unauthorized," 429 means "rate limited," and 500+ means server errors. Handle them appropriately.
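For a quick programmatic check, httr2 also provides resp_is_error():

# TRUE for any 4xx or 5xx status
if (resp_is_error(resp)) {
  warning("Request failed with status ", resp_status(resp))
}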

Making sequential requests when parallel would work

If requests are independent, run them in parallel:

# Slow - sequential
for (id in ids) {
  resp <- request(paste0("https://api.example.com/items/", id)) |>
    req_perform()
}

# Fast - parallel
reqs <- lapply(ids, function(id) {
  request(paste0("https://api.example.com/items/", id))
})
resps <- req_perform_parallel(reqs, max_active = 10)

For 100 requests, this can be 10x faster or more.

Wrapping up

Making HTTP requests in R with httr2 is straightforward once you understand the basics. Start with a simple request(), add what you need (headers, auth, query params), then req_perform() to execute it.

The key things to remember:

  • Use httr2, not httr - it's the modern standard with better features
  • Always use req_throttle() - especially with parallel requests
  • Keep secrets in environment variables - never hardcode them
  • Use req_dry_run() for debugging - see exactly what you're sending
  • Handle errors gracefully - use req_retry() for transient failures
  • Go parallel when possible - it's often 5-10x faster than sequential

Whether you're pulling data from APIs, building integrations, or automating workflows, these techniques will serve you well. The best way to learn is by doing - pick an API you're interested in and start experimenting.