Making HTTP requests is one of those skills that sounds intimidating but becomes second nature once you get the hang of it. Whether you're pulling data from an API, scraping a website, or building your own R package that talks to the web, you'll need to know how to send requests and handle responses.
In this guide, I'll show you the modern approach to HTTP requests in R using httr2, along with practical examples that solve real problems. By the end, you'll be able to hit any API, handle authentication, deal with rate limits, and even run parallel requests for better performance.
Table of contents
- What are HTTP requests (and why should you care)?
- Getting started with httr2
- Your first GET request
- Adding query parameters and headers
- POST requests with body data
- Authentication strategies
- Error handling and retries
- Rate limiting and throttling
- Parallel requests for performance
- Working with paginated APIs
- Converting curl commands to R
- Common pitfalls and how to avoid them
What are HTTP requests (and why should you care)?
Every time you visit a website, your browser sends an HTTP request to a server asking for data. The server processes that request and sends back an HTTP response containing the webpage, image, or whatever you asked for.
In R, you can do the same thing programmatically. Instead of clicking around in a browser, you write code that sends requests and processes responses. This is incredibly useful for:
- Pulling data from APIs - Most modern web services offer APIs that return structured data (usually JSON)
- Web scraping - When there's no API, you can request HTML pages and extract the data you need
- Building integrations - Connect your R code to external services like Slack, GitHub, or Stripe
- Automating workflows - Schedule scripts to fetch fresh data without manual intervention
The key difference from using a browser? You get to process the responses programmatically, which means you can extract exactly what you need and integrate it directly into your data analysis pipeline.
Getting started with httr2
R has several packages for making HTTP requests, but httr2 is the modern standard. It's a complete rewrite of the older httr package with a cleaner, pipe-friendly interface and better features for working with APIs.
If you've used httr before, httr2 might feel familiar but better. Think of it as httr that learned from five years of real-world usage and fixed all the annoying parts.
Install it like any other package:
install.packages("httr2")
library(httr2)
The core workflow in httr2 is straightforward:
- Create a request object
- Modify it (add headers, authentication, etc.)
- Perform the request
- Extract data from the response
This is different from httr, where you'd do everything in one function call. Having an explicit request object makes it easier to build complex requests step by step and works beautifully with pipes.
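As a preview, here's the whole workflow in a single pipe. The URL here is a placeholder, not a real endpoint:

```r
library(httr2)

# 1. create, 2. modify, 3. perform, 4. extract - all in one pipe
# (https://api.example.com/items is a stand-in for a real API)
data <- request("https://api.example.com/items") |>
  req_headers(Accept = "application/json") |>
  req_perform() |>
  resp_body_json()
```

In practice you'll usually build the request step by step, as the next sections show, but the shape of the pipeline stays the same.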
Your first GET request
Let's start simple. Here's how to fetch data from an API:
library(httr2)
# Create a request
req <- request("https://api.github.com/users/hadley")
# Perform it and get the response
resp <- req_perform(req)
# Check the status
resp
This returns something like:
<httr2_response>
GET https://api.github.com/users/hadley
Status: 200 OK
Content-Type: application/json
Body: In memory (1876 bytes)
A 200 status code means success. The response includes headers (like Content-Type) and a body containing the actual data.
To extract the JSON data from the response:
# Parse JSON response
user_data <- resp_body_json(resp)
# Access specific fields
user_data$name
user_data$public_repos
The resp_body_json() function automatically parses the JSON and returns an R list. For other response types, httr2 provides resp_body_html(), resp_body_xml(), and resp_body_string().
Adding query parameters and headers
Most APIs need additional information in your requests. Query parameters go in the URL (like ?page=1&limit=50), while headers provide metadata about your request.
Query parameters
Instead of manually building URL strings, use req_url_query():
req <- request("https://api.github.com/search/repositories") |>
  req_url_query(
    q = "language:R",
    sort = "stars",
    per_page = 10
  )
resp <- req_perform(req)
This automatically handles URL encoding, so you don't have to worry about spaces or special characters breaking your URLs.
Custom headers
Headers are metadata about your request. The most common use case is authentication, but they're also used to specify content types or user agents:
req <- request("https://api.example.com/data") |>
  req_headers(
    Accept = "application/json",
    `User-Agent` = "my-r-script/1.0"
  )
resp <- req_perform(req)
Note the backticks around User-Agent - hyphens aren't allowed in ordinary R names, so the backticks tell R to treat User-Agent as a single (non-syntactic) name.
Seeing what gets sent
Want to debug your request before sending it? Use req_dry_run():
req |> req_dry_run()
This prints out exactly what HTTP request will be sent, including all headers and the request body. It's incredibly useful when something isn't working and you need to see what's actually happening under the hood.
POST requests with body data
GET requests fetch data. POST requests send data to a server. This is how you submit forms, create resources, or send data to an API.
Sending JSON
The most common format for API requests is JSON:
req <- request("https://api.example.com/users") |>
  req_body_json(
    list(
      name = "Alice",
      email = "alice@example.com",
      age = 30
    )
  )
resp <- req_perform(req)
The req_body_json() function automatically:
- Converts your R list to JSON
- Sets the Content-Type header to application/json
- Changes the request method from GET to POST
Sending form data
Some APIs expect form-encoded data instead of JSON:
req <- request("https://api.example.com/login") |>
  req_body_form(
    username = "alice",
    password = "secret123"
  )
resp <- req_perform(req)
This mimics what happens when you submit an HTML form in a browser.
Uploading files
For file uploads, use req_body_multipart():
req <- request("https://api.example.com/upload") |>
  req_body_multipart(
    file = curl::form_file("data.csv"),
    description = "Monthly sales data"
  )
resp <- req_perform(req)
Authentication strategies
Most useful APIs require authentication. httr2 supports all the common methods.
Bearer tokens
The simplest authentication method - just include a token in the headers:
req <- request("https://api.github.com/user/repos") |>
req_auth_bearer_token("ghp_your_token_here")
resp <- req_perform(req)
This adds an Authorization: Bearer <token> header to your request.
Pro tip: Never hardcode tokens in your scripts. Use environment variables instead:
token <- Sys.getenv("GITHUB_TOKEN")
req <- request("https://api.github.com/user/repos") |>
req_auth_bearer_token(token)
Set the environment variable in your .Renviron file:
GITHUB_TOKEN=ghp_your_token_here
Basic authentication
Some APIs use username and password:
req <- request("https://api.example.com/data") |>
req_auth_basic("username", "password")
resp <- req_perform(req)
This sends your credentials base64-encoded (not encrypted) in the Authorization header. It's called "basic" because it's simple, but it's only safe over HTTPS.
OAuth 2.0
OAuth is more complex but provides better security for accessing user data. httr2 has built-in support for various OAuth flows:
client <- oauth_client(
  id = "your_client_id",
  secret = "your_client_secret",
  token_url = "https://api.example.com/oauth/token",
  name = "my_app"
)
req <- request("https://api.example.com/data") |>
  req_oauth_auth_code(
    client = client,
    auth_url = "https://api.example.com/oauth/authorize"
  )
resp <- req_perform(req)
The first time you run this, it'll open a browser window for you to authorize the app. After that, it caches the token so you don't have to authorize again.
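By default that cached token lives only in memory for the current session. If memory serves, req_oauth_auth_code() also accepts a cache_disk argument that persists the token across R sessions - treat the argument name as an assumption to verify against the httr2 docs, and only enable it on a machine you trust, since it writes the token to disk:

```r
req <- request("https://api.example.com/data") |>
  req_oauth_auth_code(
    client = client,
    auth_url = "https://api.example.com/oauth/authorize",
    cache_disk = TRUE  # assumed argument: caches the OAuth token on disk
  )
```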
Error handling and retries
Not every request succeeds. Servers go down, networks have hiccups, and rate limits kick in. httr2 makes it easy to handle these situations gracefully.
Basic error handling
By default, httr2 converts HTTP errors (4xx and 5xx status codes) into R errors:
req <- request("https://api.github.com/users/nonexistent")
tryCatch(
  resp <- req_perform(req),
  error = function(e) {
    message("Request failed: ", e$message)
  }
)
Automatic retries
For transient errors (like rate limiting or temporary server issues), you can automatically retry:
req <- request("https://api.example.com/data") |>
  req_retry(
    max_tries = 3,
    is_transient = \(resp) resp_status(resp) %in% c(429, 500, 503)
  )
resp <- req_perform(req)
This will retry up to 3 times if the server returns a 429 (rate limit), 500, or 503 error. httr2 automatically uses exponential backoff, waiting longer between each retry.
Custom error messages
Want to provide better error messages to your users?
req <- request("https://api.example.com/data") |>
  req_error(
    is_error = \(resp) resp_status(resp) >= 400,
    body = function(resp) {
      json <- resp_body_json(resp)
      paste("API error:", json$error$message)
    }
  )
Now when an error occurs, your custom message gets displayed instead of the generic HTTP error.
Rate limiting and throttling
APIs often have rate limits - restrictions on how many requests you can make per minute or hour. Blast an API with too many requests and you'll get blocked.
httr2's req_throttle() helps you stay within limits:
req <- request("https://api.github.com/users/hadley") |>
req_throttle(rate = 30 / 60) # 30 requests per 60 seconds
resp <- req_perform(req)
This ensures you never exceed 30 requests per minute. If you make requests too quickly, httr2 automatically waits before sending the next one.
For more sophisticated rate limiting:
req <- request("https://api.example.com/data") |>
  req_throttle(
    capacity = 100,     # Maximum 100 requests
    fill_time_s = 3600  # Refills over 1 hour
  )
This implements a token bucket algorithm - you start with 100 requests, and the bucket refills at a rate of 1 request every 36 seconds.
Parallel requests for performance
Sequential requests are slow. If you need to fetch data for 100 users, doing it one at a time takes forever. Parallel requests let you make multiple requests simultaneously.
Simple parallel execution
library(httr2)
# Create a list of requests
users <- c("hadley", "jennybc", "jimhester")
reqs <- lapply(users, function(user) {
  request(paste0("https://api.github.com/users/", user)) |>
    req_throttle(rate = 30 / 60) # Still respect rate limits!
})
# Perform them in parallel
resps <- req_perform_parallel(reqs, max_active = 3)
# Extract data from each response
user_data <- lapply(resps, resp_body_json)
The max_active parameter controls how many requests run simultaneously. Don't set this too high or you'll overwhelm the server (and get blocked).
Important: Always use req_throttle() with parallel requests. Without it, you'll fire off all requests at once and likely get rate limited.
Handling errors in parallel requests
By default, req_perform_parallel() stops on the first error. Use on_error = "continue" to keep going:
resps <- req_perform_parallel(
  reqs,
  on_error = "continue",
  max_active = 5
)
# Check which requests succeeded
successes <- resps_successes(resps)
failures <- resps_failures(resps)
cat("Succeeded:", length(successes), "\n")
cat("Failed:", length(failures), "\n")
Working with paginated APIs
Many APIs return data in pages. Instead of sending all 10,000 results at once, they send 50 at a time and provide a way to request the next page.
httr2 has built-in helpers for common pagination patterns:
Offset-based pagination
req <- request("https://api.example.com/items") |>
  req_url_query(limit = 50)
resps <- req_perform_iterative(
  req,
  next_req = iterate_with_offset(
    param_name = "offset",
    start = 0,
    offset = 50,
    resp_pages = function(resp) {
      json <- resp_body_json(resp)
      json$total_pages
    }
  ),
  max_reqs = Inf
)
# Combine all results
all_items <- unlist(lapply(resps, function(r) {
  resp_body_json(r)$items
}), recursive = FALSE)
This automatically adds offset=0, offset=50, offset=100, etc. to subsequent requests until all pages are fetched.
Cursor-based pagination
Some APIs use cursors instead of offsets:
req <- request("https://api.example.com/items")
resps <- req_perform_iterative(
  req,
  next_req = iterate_with_cursor(
    param_name = "cursor",
    resp_param_value = function(resp) {
      json <- resp_body_json(resp)
      json$next_cursor
    }
  ),
  max_reqs = Inf
)
The iterate_with_cursor() function extracts next_cursor from each response, passes it as the cursor parameter of the next request, and stops iterating once the callback returns NULL.
Converting curl commands to R
Ever find a working curl command in API documentation and wish you could just use it in R? Good news: httr2 can translate curl commands for you.
library(httr2)
curl_cmd <- 'curl -X GET --header "Accept: application/json" --header "Authorization: Bearer token123" "https://api.example.com/data?limit=10"'
# Convert to httr2 code
curl_translate(curl_cmd)
This outputs R code you can copy and paste:
request("https://api.example.com/data") |>
  req_url_query(limit = "10") |>
  req_headers(
    Accept = "application/json",
    Authorization = "Bearer token123"
  )
It even handles complex curl commands with custom headers, authentication, and request bodies. This is a massive time-saver when you're trying to replicate an API call you found in documentation or copied from your browser's developer tools.
Common pitfalls and how to avoid them
Forgetting to URL-encode parameters
If you build URLs manually, special characters will break them:
# Bad
url <- paste0("https://api.example.com/search?q=", "R programming") # Space breaks the URL
# Good
req <- request("https://api.example.com/search") |>
req_url_query(q = "R programming") # Automatically encoded
Not handling rate limits
Hammering an API without rate limiting is a fast track to getting blocked:
# Bad - no rate limiting
for (i in 1:1000) {
  resp <- request("https://api.example.com/data") |> req_perform()
}
# Good - throttled
req <- request("https://api.example.com/data") |>
  req_throttle(rate = 10 / 60) # 10 per minute
for (i in 1:1000) {
  resp <- req_perform(req)
  # httr2 automatically waits when needed
}
Hardcoding secrets in scripts
Never put API keys or passwords directly in your code:
# Bad - visible in code
req <- request("https://api.example.com/data") |>
req_auth_bearer_token("super_secret_token_123")
# Good - from environment variable
token <- Sys.getenv("API_TOKEN")
req <- request("https://api.example.com/data") |>
req_auth_bearer_token(token)
Set the environment variable in your .Renviron file, and it stays out of version control.
Not using req_dry_run() when debugging
When requests aren't working, looking at the actual HTTP being sent saves hours of frustration:
req <- request("https://api.example.com/data") |>
  req_headers(Authorization = "Bearer token") |>
  req_url_query(limit = 10) |>
  req_dry_run() # See exactly what will be sent
Ignoring HTTP status codes
Just because a request doesn't error doesn't mean it worked:
resp <- req_perform(req)
# Check the status
if (resp_status(resp) == 200) {
  data <- resp_body_json(resp)
} else {
  warning("Request returned status: ", resp_status(resp))
}
A 404 means "not found," 401 means "unauthorized," 429 means "rate limited," and 500+ means server errors. Handle them appropriately.
Making sequential requests when parallel would work
If requests are independent, run them in parallel:
# Slow - sequential
for (id in ids) {
  resp <- request(paste0("https://api.example.com/items/", id)) |>
    req_perform()
}
# Fast - parallel
reqs <- lapply(ids, function(id) {
  request(paste0("https://api.example.com/items/", id))
})
resps <- req_perform_parallel(reqs, max_active = 10)
For 100 requests, this can be 10x faster or more.
Wrapping up
Making HTTP requests in R with httr2 is straightforward once you understand the basics. Start with a simple request(), add what you need (headers, auth, query params), then req_perform() to execute it.
The key things to remember:
- Use httr2, not httr - it's the modern standard with better features
- Always use req_throttle() - especially with parallel requests
- Keep secrets in environment variables - never hardcode them
- Use req_dry_run() for debugging - see exactly what you're sending
- Handle errors gracefully - use req_retry() for transient failures
- Go parallel when possible - it's often 5-10x faster than sequential
Whether you're pulling data from APIs, building integrations, or automating workflows, these techniques will serve you well. The best way to learn is by doing - pick an API you're interested in and start experimenting.