Eventim is Europe's largest ticket platform, serving millions of concert-goers across Germany, Spain, Italy, and beyond. If you're tracking event data, monitoring ticket availability, or building price comparison tools, you'll need to know how to scrape it effectively.
In this guide, I'll show you multiple methods to extract event data from Eventim. You'll learn how to use the pyventim Python module, hit their public API directly, and handle their anti-bot protections when they kick in.
What is Eventim and Why Scrape It?
Eventim is a ticketing platform operated by CTS Eventim, one of the world's largest ticket distributors. The platform sells tickets for concerts, sports events, theater shows, and festivals primarily across European markets.
Scraping Eventim lets you:
- Track ticket prices for specific events over time
- Monitor when new events go on sale
- Build aggregators that compare prices across platforms
- Collect venue and artist data for analysis
- Get availability alerts for sold-out shows
The challenge? Eventim uses dynamic JavaScript rendering and anti-bot protections. Simple requests calls won't cut it for most use cases.
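You can see this for yourself with a quick check. This is a minimal sketch; the status codes are standard block signals, but the length heuristic is an assumption, so compare it against the response you actually receive:

```python
import requests

# Naive request with no browser headers, session, or JS execution
response = requests.get("https://www.eventim.de/search/?searchterm=coldplay")

print(response.status_code)  # 403 or 429 usually means you were blocked
# A very short body suggests an empty JS shell rendered client-side
print(len(response.text))
```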
Let's fix that.
Method 1: Scraping with pyventim
The easiest way to scrape Eventim is pyventim, a Python module that wraps a reverse-engineered version of Eventim's API.
This module handles pagination, parses HTML responses, and gives you clean Python objects to work with.
Installing pyventim
First, set up your environment:
# Create virtual environment
python -m venv eventim-scraper
source eventim-scraper/bin/activate # Windows: eventim-scraper\Scripts\activate
# Install pyventim
pip install pyventim
Requirements:
- Python 3.10 or higher
- requests >= 2.31.0
- lxml >= 5.2.2
- pydantic >= 2.7.0
Searching for Attractions (Artists/Events)
Start by finding attractions using the exploration endpoint:
import pyventim
# Initialize the Eventim client
eventim = pyventim.Eventim()
# Search for an artist or event
attractions = eventim.explore_attractions(search_term="Coldplay")
# Loop through results - pagination is handled automatically
for attraction in attractions:
    print(f"ID: {attraction['attractionId']}")
    print(f"Name: {attraction['name']}")
    print("-" * 40)
The module returns an iterator that fetches pages automatically.
You don't need to worry about pagination limits.
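If you only need a handful of results, you can cap the iterator yourself. Here's a small sketch using the standard library's itertools.islice, which stops the lazy pagination after ten items:

```python
from itertools import islice

import pyventim

eventim = pyventim.Eventim()

# Pages are fetched lazily, so islice stops after 10 results
# without downloading the remaining pages
for attraction in islice(eventim.explore_attractions(search_term="Coldplay"), 10):
    print(attraction["name"])
```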
Getting Events for an Attraction
Once you have an attraction ID, fetch its events:
import pyventim
import datetime
eventim = pyventim.Eventim()
# The attraction ID for your target artist
attraction_id = 473431 # Example: Disney's König der Löwen
# Set your date range
now = datetime.datetime.now()
date_from = now.date()
date_to = (now + datetime.timedelta(days=90)).date()
# Fetch events
events = eventim.get_attraction_events(
    attraction_id=attraction_id,
    date_from=date_from,
    date_to=date_to
)

for event in events:
    print(event)
This returns events in the schema.org/MusicEvent format.
Note: this endpoint returns at most the next 90 events.
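One way to work around that cap is to walk the calendar in slices. This is a sketch under the assumption that the endpoint honors arbitrary date windows as shown above; a single window that itself holds more than 90 events would need narrower slices:

```python
import datetime

import pyventim

eventim = pyventim.Eventim()
attraction_id = 473431

# Step through roughly a year in 90-day windows so no single
# request runs into the endpoint's 90-event ceiling
all_events = []
start = datetime.datetime.now().date()
for _ in range(4):
    end = start + datetime.timedelta(days=90)
    all_events.extend(
        eventim.get_attraction_events(
            attraction_id=attraction_id,
            date_from=start,
            date_to=end,
        )
    )
    start = end

print(f"Collected {len(all_events)} events")
```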
Fetching Events via Product Groups
For more comprehensive event data, use the product group endpoint:
import pyventim
eventim = pyventim.Eventim()
# Search and get events through product groups
for product_group in eventim.explore_product_groups(search_term="Rammstein"):
    product_group_id = product_group["productGroupId"]

    # Get events from the calendar endpoint
    events = eventim.get_product_group_events_from_calendar(product_group_id)

    for event in events:
        print(f"Title: {event['title']}")
        print(f"Date: {event['eventDate']}")
        print(f"Price: {event['price']}")
        print(f"Available: {event['ticketAvailable']}")
        print("-" * 50)
This method gives you pricing and availability data directly.
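For example, to keep only events that are still bookable, you can filter on the same fields the loop above prints. A short sketch:

```python
import pyventim

eventim = pyventim.Eventim()

# Collect only bookable events, with their date and entry price
available = [
    {"title": e["title"], "date": e["eventDate"], "price": e["price"]}
    for pg in eventim.explore_product_groups(search_term="Rammstein")
    for e in eventim.get_product_group_events_from_calendar(pg["productGroupId"])
    if e.get("ticketAvailable")
]

for event in available:
    print(f"{event['date']}: {event['title']} from {event['price']}")
```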
Searching for Venues
You can also search for specific locations:
import pyventim
eventim = pyventim.Eventim()
# Search for venues
locations = eventim.explore_locations("Stage Theater im Hafen Hamburg")
for location in locations:
    print(location)
Getting Seatmap Information
For seated events, you can extract seatmap data:
import pyventim
eventim = pyventim.Eventim()
# Event URL path (get this from the event listing)
event_url = "/event/disneys-der-koenig-der-loewen-stage-theater-im-hafen-hamburg-18500464/"
# Get seatmap configuration
seatmap_info = eventim.get_event_seatmap_information(event_url)
if seatmap_info:
    # Fetch the actual seatmap with available seats
    seatmap = eventim.get_event_seatmap(seatmap_info["seatmapOptions"])

    # Loop through blocks and seats
    for block in seatmap.get("blocks", []):
        print(f"Block: {block['block_name']}")
        for row in block.get("block_rows", []):
            for seat in row.get("row_seats", []):
                print(f"  Seat: {seat['seat_code']} - Price Category: {seat['seat_price_category_index']}")
The seatmap data includes coordinates, price categories, and availability.
Standing-only tickets aren't included in seatmap responses.
Method 2: Hitting Eventim's Public API Directly
If you need more control or the pyventim module breaks (APIs change), you can hit Eventim's public endpoints directly.
The Public Search API
Eventim exposes a public search API at:
https://public-api.eventim.com/websearch/search/api/exploration/v1/products
Here's how to query it:
import requests
def search_eventim_events(city=None, search_term=None, page=1):
    """
    Search Eventim events using their public API
    """
    base_url = "https://public-api.eventim.com/websearch/search/api/exploration/v1/products"

    params = {
        "webId": "web__eventim-de",
        "language": "de",
        "page": page,
        "retail_partner": "EVE",
        "sort": "DateAsc",
        "top": 50
    }

    if city:
        params["city_names"] = city
    if search_term:
        params["search_term"] = search_term

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Accept": "application/json",
        "Accept-Language": "de-DE,de;q=0.9,en;q=0.8"
    }

    response = requests.get(base_url, params=params, headers=headers)
    response.raise_for_status()
    return response.json()

# Example: Search events in Frankfurt
results = search_eventim_events(city="Frankfurt")

for product in results.get("products", []):
    print(f"Name: {product.get('name')}")
    print(f"Date: {product.get('startDate')}")
    print(f"Venue: {product.get('venue', {}).get('name')}")
    print("-" * 40)
API Parameters Reference
| Parameter | Description | Example |
|---|---|---|
| `webId` | Website identifier | `web__eventim-de` |
| `language` | Response language | `de`, `en` |
| `page` | Pagination | `1`, `2`, `3` |
| `top` | Results per page | `50` (max) |
| `city_names` | Filter by city | `Berlin`, `München` |
| `sort` | Sort order | `DateAsc`, `DateDesc` |
| `search_term` | Keyword search | `Coldplay` |
The Private API Endpoints
Some data requires the private API:
import requests
def get_seatmap_data(seatmap_key, timestamp):
    """
    Fetch seatmap from private Eventim API
    """
    base_url = "https://api.eventim.com/seatmap/api/SeatMapHandler"

    params = {
        "key": seatmap_key,
        "timestamp": timestamp
    }

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Referer": "https://www.eventim.de/",
        "Origin": "https://www.eventim.de"
    }

    response = requests.get(base_url, params=params, headers=headers)
    return response.json()
The seatmap key comes from parsing the event page HTML.
You'll need to extract it from embedded JavaScript.
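Here's a sketch of that extraction. The "seatmapKey" pattern is a hypothetical variable name, so inspect the actual page source and adjust the regex to whatever Eventim currently embeds:

```python
import re

import requests

def extract_seatmap_key(event_url):
    """Pull the seatmap key out of the event page's inline JavaScript."""
    html = requests.get(
        f"https://www.eventim.de{event_url}",
        headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"},
    ).text

    # Hypothetical pattern: verify the real variable name in the page source
    match = re.search(r'"seatmapKey"\s*:\s*"([^"]+)"', html)
    return match.group(1) if match else None
```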
Method 3: Browser Automation with Playwright
When API methods fail or you need JavaScript-rendered content, use browser automation.
Playwright handles this better than Selenium for modern sites.
Setting Up Playwright
pip install playwright
playwright install chromium
Basic Eventim Scraper with Playwright
from playwright.sync_api import sync_playwright
import json
def scrape_eventim_event(event_url):
    """
    Scrape event details using browser automation
    """
    with sync_playwright() as p:
        # Launch browser with stealth settings
        browser = p.chromium.launch(
            headless=True,
            args=[
                '--disable-blink-features=AutomationControlled',
                '--no-sandbox'
            ]
        )

        context = browser.new_context(
            viewport={'width': 1920, 'height': 1080},
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
        )

        page = context.new_page()

        # Navigate to the event page
        page.goto(f"https://www.eventim.de{event_url}", wait_until='networkidle')

        # Wait for content to load
        page.wait_for_selector('.event-header', timeout=10000)

        # Extract event data
        event_data = {
            'title': page.locator('.event-header h1').inner_text(),
            'date': page.locator('.event-date').inner_text(),
            'venue': page.locator('.event-venue').inner_text(),
        }

        # Try to get price info
        try:
            event_data['price'] = page.locator('.price-info').inner_text()
        except Exception:
            event_data['price'] = 'Price not available'

        # Extract schema.org JSON-LD data
        json_ld = page.locator('script[type="application/ld+json"]').all()
        for script in json_ld:
            try:
                data = json.loads(script.inner_text())
                if data.get('@type') == 'MusicEvent':
                    event_data['structured_data'] = data
                    break
            except json.JSONDecodeError:
                continue

        browser.close()
        return event_data
# Example usage
event = scrape_eventim_event("/event/coldplay-music-of-the-spheres-world-tour-18765432/")
print(json.dumps(event, indent=2))
Scraping Search Results
from playwright.sync_api import sync_playwright
def scrape_eventim_search(search_term, max_pages=3):
    """
    Scrape search results from Eventim
    """
    all_events = []

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        )
        page = context.new_page()

        for page_num in range(1, max_pages + 1):
            url = f"https://www.eventim.de/search/?affiliate=EVE&searchterm={search_term}&page={page_num}"
            page.goto(url, wait_until='networkidle')

            # Wait for results to load
            page.wait_for_selector('.product-list-item', timeout=15000)

            # Extract event cards
            events = page.locator('.product-list-item').all()

            for event in events:
                try:
                    event_data = {
                        'title': event.locator('.product-title').inner_text(),
                        'date': event.locator('.product-date').inner_text(),
                        'venue': event.locator('.product-venue').inner_text(),
                        'link': event.locator('a').get_attribute('href')
                    }
                    all_events.append(event_data)
                except Exception:
                    continue

            # Check for next page
            if not page.locator('.pagination-next:not(.disabled)').count():
                break

        browser.close()

    return all_events
# Example
events = scrape_eventim_search("Metallica", max_pages=2)
for event in events:
    print(f"{event['title']} - {event['date']}")
Handling Anti-Bot Protection
Eventim uses various anti-bot measures. Here's how to handle them.
Understanding Eventim's Protections
Eventim employs several defense mechanisms (a detection sketch follows this list):
- Rate limiting: Too many requests trigger temporary blocks
- JavaScript challenges: Some pages require JS execution
- Cookie validation: Sessions must maintain valid cookies
- Header inspection: Missing or suspicious headers get blocked
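Before tuning delays, it helps to recognize a block programmatically so your scraper backs off instead of ploughing on. Here's a minimal heuristic sketch; the status codes are standard signals, while the body markers are assumptions to verify against a real block page:

```python
import requests

def looks_blocked(response: requests.Response) -> bool:
    """Heuristic check for rate-limit or challenge responses."""
    if response.status_code in (403, 429, 503):
        return True
    # Challenge pages tend to be short and mention a captcha or challenge
    body = response.text.lower()
    return len(body) < 2000 and ("captcha" in body or "challenge" in body)
```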
Implementing Request Delays
Never hammer Eventim with rapid requests:
import time
import random
import requests
class EventimScraper:
    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'de-DE,de;q=0.9',
            'Accept-Encoding': 'gzip, deflate, br',
            'Connection': 'keep-alive'
        })

    def get_with_delay(self, url, min_delay=2, max_delay=5):
        """
        Make request with random delay
        """
        # Random delay between requests
        delay = random.uniform(min_delay, max_delay)
        time.sleep(delay)

        response = self.session.get(url)
        return response
Rotating User Agents
Change your user agent periodically:
import random
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Safari/605.1.15'
]

def get_random_headers():
    return {
        'User-Agent': random.choice(USER_AGENTS),
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Language': 'de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7',
        'DNT': '1',
        'Accept-Encoding': 'gzip, deflate, br',
        'Connection': 'keep-alive',
        'Upgrade-Insecure-Requests': '1'
    }
Using Proxy Rotation
For large-scale scraping, rotate your IP addresses:
import requests
from itertools import cycle
class ProxyRotator:
    def __init__(self, proxy_list):
        """
        Initialize with a list of proxy URLs
        Format: ['http://user:pass@ip:port', ...]
        """
        self.proxies = cycle(proxy_list)

    def get_next_proxy(self):
        proxy = next(self.proxies)
        return {
            'http': proxy,
            'https': proxy
        }

    def make_request(self, url, max_retries=3):
        """
        Make request with automatic proxy rotation on failure
        """
        for attempt in range(max_retries):
            proxy = self.get_next_proxy()
            try:
                response = requests.get(
                    url,
                    proxies=proxy,
                    headers=get_random_headers(),
                    timeout=30
                )
                if response.status_code == 200:
                    return response
            except requests.exceptions.RequestException:
                continue
        return None
# Usage
proxies = [
    'http://user:pass@proxy1.example.com:8080',
    'http://user:pass@proxy2.example.com:8080',
    # Add more proxies...
]

rotator = ProxyRotator(proxies)
response = rotator.make_request('https://www.eventim.de/search/')
Residential proxies work better than datacenter IPs for Eventim.
Datacenter IPs get flagged quickly.
Handling Cloudflare Challenges
If you encounter Cloudflare protection, use specialized tools:
import cloudscraper
def scrape_with_cloudscraper(url):
    """
    Bypass Cloudflare using cloudscraper
    """
    scraper = cloudscraper.create_scraper(
        browser={
            'browser': 'chrome',
            'platform': 'windows',
            'mobile': False
        }
    )

    response = scraper.get(url)
    return response.text
# Example
html = scrape_with_cloudscraper('https://www.eventim.de/artist/coldplay/')
For tougher challenges, combine Playwright with stealth plugins:
from playwright.sync_api import sync_playwright
def stealth_scrape(url):
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=False,  # Use headed mode for tough protections
            args=[
                '--disable-blink-features=AutomationControlled',
                '--disable-dev-shm-usage',
                '--no-first-run',
                '--no-default-browser-check'
            ]
        )

        context = browser.new_context(
            viewport={'width': 1920, 'height': 1080},
            locale='de-DE',
            timezone_id='Europe/Berlin'
        )

        # Add stealth scripts
        context.add_init_script("""
            Object.defineProperty(navigator, 'webdriver', {
                get: () => undefined
            });
        """)

        page = context.new_page()
        page.goto(url, wait_until='networkidle')

        content = page.content()
        browser.close()
        return content
Extracting Specific Data Types
Different use cases need different data. Here's how to get what you need.
Event Listings
import pyventim
def get_upcoming_events(artist_name):
    """
    Get all upcoming events for an artist
    """
    eventim = pyventim.Eventim()
    events = []

    for product_group in eventim.explore_product_groups(search_term=artist_name):
        pg_id = product_group["productGroupId"]

        for event in eventim.get_product_group_events_from_calendar(pg_id):
            events.append({
                'title': event.get('title'),
                'date': event.get('eventDate'),
                'venue': event.get('venue'),
                'city': event.get('city'),
                'price_from': event.get('price'),
                'available': event.get('ticketAvailable'),
                'url': event.get('url')
            })

    return events

# Example
coldplay_events = get_upcoming_events("Coldplay")
for event in coldplay_events:
    print(f"{event['date']} - {event['city']}: {event['title']}")
Price Monitoring
import pyventim
import json
from datetime import datetime
def monitor_prices(event_url, output_file='prices.json'):
    """
    Track price changes for a specific event
    """
    eventim = pyventim.Eventim()

    # Get current seatmap info
    seatmap_info = eventim.get_event_seatmap_information(event_url)
    if not seatmap_info:
        return None

    seatmap = eventim.get_event_seatmap(seatmap_info["seatmapOptions"])

    # Extract price categories
    price_data = {
        'timestamp': datetime.now().isoformat(),
        'event_url': event_url,
        'categories': []
    }

    for category in seatmap.get('price_categories', []):
        price_data['categories'].append({
            'id': category['price_category_id'],
            'name': category['price_category_name'],
            'color': category['price_category_color']
        })

    # Count available seats per category
    seat_counts = {}
    for block in seatmap.get('blocks', []):
        for row in block.get('block_rows', []):
            for seat in row.get('row_seats', []):
                cat_idx = seat['seat_price_category_index']
                seat_counts[cat_idx] = seat_counts.get(cat_idx, 0) + 1

    price_data['available_seats'] = seat_counts

    # Save to file
    try:
        with open(output_file, 'r') as f:
            history = json.load(f)
    except FileNotFoundError:
        history = []

    history.append(price_data)

    with open(output_file, 'w') as f:
        json.dump(history, f, indent=2)

    return price_data
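A minimal way to run this on a schedule is a plain loop with a generous sleep. Here's a sketch that samples the seatmap hourly for a day, reusing the event URL from earlier:

```python
import time

event_url = "/event/disneys-der-koenig-der-loewen-stage-theater-im-hafen-hamburg-18500464/"

# Sample the seatmap once per hour for 24 hours; each call appends
# a snapshot to prices.json via monitor_prices above
for _ in range(24):
    snapshot = monitor_prices(event_url)
    if snapshot:
        print(f"{snapshot['timestamp']}: {snapshot['available_seats']}")
    time.sleep(3600)  # an hour between samples keeps the load polite
```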
Venue Information
import pyventim
def get_venue_events(venue_name, limit=50):
    """
    Get all events at a specific venue
    """
    eventim = pyventim.Eventim()

    locations = list(eventim.explore_locations(venue_name))
    if not locations:
        return {'venue': None, 'events': []}

    venue = locations[0]
    events = []

    # Search for events at this venue
    for product in eventim.explore_product_groups(search_term=venue_name):
        if len(events) >= limit:
            break

        pg_id = product["productGroupId"]
        for event in eventim.get_product_group_events_from_calendar(pg_id):
            events.append(event)
            if len(events) >= limit:
                break

    return {
        'venue': venue,
        'events': events
    }
Common Errors and Fixes
"Connection refused" or Timeout Errors
Cause: Rate limiting or IP ban
Fix:
import random
import time

def retry_with_backoff(func, max_retries=5):
    """
    Retry with exponential backoff
    """
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if attempt == max_retries - 1:
                raise e
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Attempt {attempt + 1} failed. Waiting {wait_time:.2f}s...")
            time.sleep(wait_time)
Empty Responses
Cause: JavaScript not rendering or blocked request
Fix: Switch to browser automation:
# If pyventim returns empty, try Playwright
from playwright.sync_api import sync_playwright
def fallback_scrape(url):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until='networkidle')
        content = page.content()
        browser.close()
        return content
"403 Forbidden" Errors
Cause: Missing headers or detected as bot
Fix: Add complete browser headers:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'de-DE,de;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Referer': 'https://www.eventim.de/',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1'
}
pyventim Module Errors
Cause: Eventim changed their API
Fix: Check for pyventim updates or fall back to direct API:
try:
    import pyventim
    eventim = pyventim.Eventim()
    results = eventim.explore_attractions(search_term="artist")
except Exception as e:
    print(f"pyventim failed: {e}")
    # Fall back to direct API call
    results = search_eventim_events(search_term="artist")
Best Practices for Eventim Scraping
1. Respect Rate Limits
Keep your request rate reasonable.
Delays of 2 to 5 seconds between requests prevent most blocks.
2. Cache Responses
Don't re-scrape data you already have:
import hashlib
import json
import os
import time

import requests

def cached_request(url, cache_dir='cache', ttl_hours=1):
    """
    Cache responses to avoid redundant requests
    """
    os.makedirs(cache_dir, exist_ok=True)

    url_hash = hashlib.md5(url.encode()).hexdigest()
    cache_file = os.path.join(cache_dir, f"{url_hash}.json")

    # Check if cache exists and is fresh
    if os.path.exists(cache_file):
        mtime = os.path.getmtime(cache_file)
        age_hours = (time.time() - mtime) / 3600
        if age_hours < ttl_hours:
            with open(cache_file, 'r') as f:
                return json.load(f)

    # Fetch fresh data
    response = requests.get(url, headers=get_random_headers())
    data = response.json()

    # Save to cache
    with open(cache_file, 'w') as f:
        json.dump(data, f)

    return data
3. Handle Errors Gracefully
Wrap scraping logic in try-except blocks:
import json

import requests

def safe_scrape(scrape_func, *args, **kwargs):
    """
    Execute scraping function with error handling
    """
    try:
        return scrape_func(*args, **kwargs)
    except requests.exceptions.RequestException as e:
        print(f"Network error: {e}")
        return None
    except json.JSONDecodeError as e:
        print(f"JSON parsing error: {e}")
        return None
    except Exception as e:
        print(f"Unexpected error: {e}")
        return None
4. Use Session Objects
Maintain cookies across requests:
session = requests.Session()
session.headers.update(get_random_headers())
# First request establishes cookies
session.get('https://www.eventim.de/')
# Subsequent requests use the same session
response = session.get('https://www.eventim.de/search/?searchterm=test')
5. Monitor for Changes
Eventim updates their site regularly. Monitor for the following (a drift-check sketch follows this list):
- Changed CSS selectors
- New API endpoints
- Updated anti-bot measures
- Altered response formats
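A cheap way to catch that last item early is a shape check that fails loudly when a response loses fields you depend on. Here's a sketch built around the search response from Method 2; extend the key set with whatever fields your pipeline reads:

```python
# Fields the Method 2 examples read from each product
EXPECTED_PRODUCT_KEYS = {"name", "startDate"}

def check_response_shape(results: dict) -> None:
    """Raise early if the search response no longer looks as expected."""
    products = results.get("products")
    if not products:
        raise RuntimeError("No 'products' in response: the API may have changed")
    missing = EXPECTED_PRODUCT_KEYS - products[0].keys()
    if missing:
        raise RuntimeError(f"Product schema drifted, missing keys: {missing}")
```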
Eventim Scraping FAQs
Is scraping Eventim legal?
Scraping publicly available data is generally legal for personal use and research.
However, always review Eventim's Terms of Service and your local laws.
Commercial use may have additional restrictions.
Which method should I use?
- pyventim: Best for most use cases, handles pagination and parsing
- Direct API: When you need specific parameters or pyventim breaks
- Playwright: When dealing with heavy JavaScript or anti-bot protection
How do I scrape ticket prices?
Use the seatmap endpoints to get price categories and availability.
The pyventim module provides get_event_seatmap() for this.
Why am I getting blocked?
Common causes:
- Too many requests too fast
- Missing or suspicious headers
- Datacenter IP addresses
- No cookies/session management
Fix by adding delays, proper headers, and residential proxies.
Can I scrape real-time availability?
Yes, but you'll need to make frequent requests to the seatmap endpoint.
Consider the load this puts on Eventim's servers.
Does pyventim work with all Eventim regions?
The pyventim module targets eventim.de (Germany).
Other regional sites (eventim.es, eventim.it) may have different API structures.
You might need to adjust the base URLs and parameters.
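If you experiment with other regions, one approach is to parameterize the domain and webId. The non-German values below are unverified guesses, so confirm each one in the browser's network tab before relying on it:

```python
# Hypothetical region settings: only the "de" entry matches the
# parameters used earlier in this guide, the rest are unverified
REGIONS = {
    "de": {"domain": "www.eventim.de", "web_id": "web__eventim-de", "language": "de"},
    "es": {"domain": "www.eventim.es", "web_id": "web__eventim-es", "language": "es"},
}

def region_params(region, **extra):
    cfg = REGIONS[region]
    return {"webId": cfg["web_id"], "language": cfg["language"], **extra}
```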
Conclusion
Scraping Eventim requires understanding their dual API structure and handling their anti-bot measures.
The pyventim module handles most cases elegantly.
For tougher situations, direct API calls or Playwright automation will get you through.
Start with the simplest method that works. Only escalate complexity when needed.
Key takeaways:
- Use pyventim for most scraping tasks
- Add delays between requests (2-5 seconds minimum)
- Rotate user agents and use sessions
- Fall back to Playwright for JavaScript-heavy pages
- Cache responses to minimize redundant requests
Now go build something useful with that event data.