OmniScrape

Trustpilot Scraper: TrustScore, Star Distribution, and Review Monitoring

Trustpilot is the de facto review layer for ecommerce and SaaS: a TrustScore drop of 0.2 points can surface in board-level brand reports within days. Reputation teams scrape Trustpilot to track review velocity, star histogram shifts, unanswered 1-star reviews, and competitor benchmarks. The URL structure is globally consistent: every business profile lives at /review/domain.com, which makes bulk monitoring straightforward once you handle the protection layer.

This guide walks through the full scraping workflow: which fields matter, how to parse the DOM, how to handle Cloudflare and JS-rendered pagination, and how to structure delta alerts. For downstream NLP on review text, see the sentiment analysis web scraping guide. For local business reputation (Yelp, Google Maps), see the Yelp scraper guide.

On this page

  1. Trustpilot fields reputation teams actually track
  2. Trustpilot URL patterns and pagination
  3. Trustpilot profile page DOM structure
  4. Cloudflare protection and JS pagination challenges
  5. Scrape TrustScore and star histogram from a company profile
  6. Scrape recent reviews with JS pagination
  7. Handling Cloudflare on Trustpilot
  8. TrustScore delta alerting and time-series storage
  9. Trustpilot Terms of Use and data reuse constraints
  10. FAQ

1. Trustpilot fields reputation teams actually track

Not every field on a Trustpilot profile is equally actionable. Brand monitoring teams prioritize the composite TrustScore and the 1-star percentage because those are the leading indicators that precede PR escalations. Agencies running competitive benchmarks want the total review count and histogram across multiple domains so they can normalize score comparisons by review volume — a 4.8 from 40 reviews is not the same signal as a 4.8 from 40,000.

Individual review fields matter for qualitative pipelines: extracting recurring complaint themes, flagging reviews without a company reply, or identifying verified-buyer sentiment versus unverified. The reply date relative to review date is a useful proxy for customer service responsiveness — a metric some brands track as a KPI.

  • Business domain and Trustpilot company slug (e.g., www.shopify.com)
  • TrustScore (1.0–5.0 composite, one decimal place)
  • Total review count (used to weight score comparisons)
  • Star distribution histogram — 5★ through 1★ as percentages and raw counts
  • Individual review: title, body text, star rating (1–5)
  • Review date (ISO 8601 from datetime attribute)
  • Reviewer display name, country, and verified-buyer flag
  • Company reply text and reply timestamp
  • Claimed profile badge (indicates active business account)
  • Business categories and registered location
  • Recent review velocity (reviews per day derived from date sequence)
  • Flagged or reported review indicators
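As a rough illustration of how the per-review fields above could be held in a pipeline, the sketch below defines a small record type and the reply-latency metric mentioned earlier. The class and field names are hypothetical and are not a Trustpilot or OmniScrape schema; the only assumption about the data is that timestamps arrive as ISO 8601 strings.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Review:
    """One review card. Field names are illustrative only."""
    title: str
    body: str
    stars: int                                   # 1-5
    published: datetime                          # from the <time datetime="..."> attribute
    verified: bool = False
    reply_text: Optional[str] = None
    reply_published: Optional[datetime] = None


def parse_iso(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def reply_latency_hours(review: Review) -> Optional[float]:
    """Hours between the review and the company reply, or None if unanswered."""
    if review.reply_published is None:
        return None
    delta = review.reply_published - review.published
    return delta.total_seconds() / 3600
```

Returning None for unanswered reviews keeps them out of latency averages while still letting you count them separately, which is the figure behind an "unanswered 1-star reviews" report.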

2. Trustpilot URL patterns and pagination

Trustpilot's URL structure is one of the most consistent among major review platforms. Every company profile maps to /review/ followed by the business domain. Both www-prefixed and bare domain forms are accepted — Trustpilot redirects to the canonical form it has on record, so always follow the redirect and store the final URL as your canonical key.

Review pagination uses a simple ?page= query parameter. The first page is the default profile URL with no parameter. Pages beyond the first may render review cards via JavaScript, depending on how Cloudflare classifies the request — plan for js_rendering on page 2 and beyond. Country-specific Trustpilot TLDs (trustpilot.de, trustpilot.co.uk) host the same review data but may show localized UI and different cookie consent flows.

  • Company profile: https://www.trustpilot.com/review/www.shopify.com
  • Bare domain form: https://www.trustpilot.com/review/shopify.com
  • Reviews page 2: https://www.trustpilot.com/review/shopify.com?page=2
  • Filtered by star: https://www.trustpilot.com/review/shopify.com?stars=1
  • Sorted by recent: https://www.trustpilot.com/review/shopify.com?sort=recency
  • Category browse: https://www.trustpilot.com/categories/ecommerce
  • UK locale: https://www.trustpilot.co.uk/review/shopify.com
  • German locale: https://www.trustpilot.de/review/shopify.com
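The patterns above can be generated with a small helper. This is a sketch, with a hypothetical function name; it assumes the page, stars, and sort parameters can be combined in one query string, which the list above only shows individually.

```python
from typing import Optional
from urllib.parse import urlencode


def profile_url(domain: str, page: int = 1, stars: Optional[int] = None,
                sort: Optional[str] = None,
                host: str = "www.trustpilot.com") -> str:
    """Build a /review/ URL; page 1 is the bare profile URL with no ?page=."""
    params = {}
    if page > 1:
        params["page"] = page
    if stars is not None:
        params["stars"] = stars
    if sort is not None:
        params["sort"] = sort
    query = f"?{urlencode(params)}" if params else ""
    return f"https://{host}/review/{domain}{query}"
```

For example, `profile_url("shopify.com", page=2)` yields the "Reviews page 2" form listed above, and passing a different `host` covers the locale variants.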

3. Trustpilot profile page DOM structure

Trustpilot uses CSS Modules, so class names include a hash suffix that changes on deploys (e.g., styles_reviewCount__q8Q2r). The stable hooks are data-* attributes, which the Trustpilot team uses for analytics and which have remained consistent across redesigns. Prefer data-* selectors over class names wherever available.

The TrustScore number lives in a p element with data-rating-typography='true'. The total review count is in a span with class matching styles_reviewCount. The star histogram uses div elements with data-reviews-distribution-row set to '5', '4', '3', '2', or '1' — each row contains both the percentage text and a visual bar width you can ignore.

Individual review cards are wrapped in article elements with data-service-review-card-paper. Inside each card: the review title is in h2[data-service-review-title-typography], the body text in p[data-service-review-text-typography], the star rating in div[data-service-review-rating] (also carries a data attribute with the numeric value), and the review date in a time element with a datetime attribute containing the ISO 8601 timestamp. Company replies are in div[data-service-review-business-reply-text-typography]. The verified-buyer badge is a span with data-review-label containing text like 'Verified'.
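If you fetch raw HTML and parse it yourself, the data-* hooks described above can be matched without depending on hashed class names. The sketch below uses only Python's standard-library html.parser and pulls the title and the first datetime value from each review card. The class and function names are hypothetical, and the attribute names are taken from the description above, so verify them against a live page before relying on them.

```python
from html.parser import HTMLParser


class ReviewCardParser(HTMLParser):
    """Collects title text and the first <time datetime> value per review card."""

    def __init__(self):
        super().__init__()
        self.cards = []         # one dict per review card
        self._depth = 0         # <article> nesting depth inside the current card
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "article":
            if self._depth > 0:
                self._depth += 1
            elif "data-service-review-card-paper" in attrs:
                self._depth = 1
                self.cards.append({"title": "", "date": None})
            return
        if self._depth == 0:
            return              # ignore everything outside review cards
        card = self.cards[-1]
        if tag == "h2" and "data-service-review-title-typography" in attrs:
            self._in_title = True
        elif tag == "time" and card["date"] is None:
            card["date"] = attrs.get("datetime")

    def handle_endtag(self, tag):
        if tag == "article" and self._depth > 0:
            self._depth -= 1
        elif tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and self._depth > 0:
            self.cards[-1]["title"] += data


def parse_cards(html: str) -> list:
    parser = ReviewCardParser()
    parser.feed(html)
    parser.close()
    for card in parser.cards:
        card["title"] = " ".join(card["title"].split())  # collapse whitespace
    return parser.cards
```

Only the first time element per card is kept, on the assumption that the review date precedes any reply date in the markup.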

4. Cloudflare protection and JS pagination challenges

Trustpilot runs Cloudflare Bot Management across its profile pages — not just the basic Cloudflare challenge page, but behavioral fingerprinting that evaluates TLS fingerprint, HTTP/2 settings, and request cadence. Plain HTTP requests from a datacenter IP will receive a Cloudflare interstitial or a 403 before they reach Trustpilot's servers. The OmniScrape Web Unlocker handles this transparently when you use mode 'auto' with enable_solver: true and a residential proxy.

EU cookie consent is a secondary complication. Trustpilot serves a consent banner on first load for EU-geolocated requests. This banner can obscure DOM elements and, in some cases, block the main content from rendering until consent is registered. Using a non-EU proxy (residential:us) for initial profile scrapes avoids this. If you need EU-localized review data specifically, use a residential EU proxy and add a js_wait_selector that targets content below the banner.

Review pagination beyond page 1 is the most common failure point. The first page of reviews is typically included in the server-rendered HTML. Pages 2 and beyond are often injected by JavaScript after the initial HTML is delivered — a plain HTML fetch returns the page shell with no review cards. Use js_rendering with js_wait_selector targeting article[data-service-review-card-paper] to ensure review cards are present before extraction.

  • Cloudflare Bot Management on all profile pages — requires residential proxy and solver
  • JS-rendered review pagination on page 2 and beyond
  • EU GDPR consent banner blocking DOM on EU-geolocated requests
  • data-* attribute selectors are stable; hashed class names change on deploys
  • Aggressive domain enumeration (thousands of /review/ URLs) triggers rate limits
  • Trustpilot Terms of Service restrict bulk commercial reuse of review content

5. Scrape TrustScore and star histogram from a company profile

The company profile header — TrustScore, review count, and histogram — is server-rendered in the initial HTML response on most requests. Mode 'auto' with enable_solver: true is the right starting point: it tries the fast HTTP path first and escalates to a headless browser only if Cloudflare challenges the request. This keeps costs and latency low for the majority of profile fetches.

Use a residential US proxy to avoid EU consent banner interference. The css_extractor output format runs selector matching server-side and returns only the extracted values — no need to parse full HTML in your application. Results are returned in body.data.css_extracted. Check body.success and metadata.challenge_solved to confirm the request resolved cleanly.

Trustpilot profile — TrustScore and histogram
json
{
  "url": "https://www.trustpilot.com/review/www.shopify.com",
  "mode": "auto",
  "output_format": "css_extractor",
  "enable_solver": true,
  "proxy": "residential:us",
  "css_selectors": {
    "trustscore": "p[data-rating-typography=true]",
    "review_count": "span[class*='styles_reviewCount']",
    "histogram_5": "[data-reviews-distribution-row='5']",
    "histogram_4": "[data-reviews-distribution-row='4']",
    "histogram_3": "[data-reviews-distribution-row='3']",
    "histogram_2": "[data-reviews-distribution-row='2']",
    "histogram_1": "[data-reviews-distribution-row='1']",
    "company_name": "h1 span",
    "claimed_badge": "[data-claimed-badge]",
    "categories": "[data-business-unit-info] a[href*='/categories/']"
  }
}
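The extractor returns text, so the values still need converting to numbers before storage. The sketch below shows one way to do that for the keys used in the request above. It assumes each value in body.data.css_extracted is either a string or a list of strings, and that a histogram row's text contains a percentage such as "82%"; both are assumptions to check against a real response. All function names are hypothetical.

```python
import re
from typing import Optional

_NUMBER = re.compile(r"\d+(?:[.,]\d+)?")
_PERCENT = re.compile(r"(\d+(?:[.,]\d+)?)\s*%")


def first_text(value) -> Optional[str]:
    """Accept a string or a list of strings and return the first text, if any."""
    if isinstance(value, list):
        return value[0] if value else None
    return value


def parse_trustscore(value) -> Optional[float]:
    match = _NUMBER.search(first_text(value) or "")
    return float(match.group().replace(",", ".")) if match else None


def parse_percent(value) -> Optional[float]:
    """Pull the percentage out of a histogram row's text."""
    match = _PERCENT.search(first_text(value) or "")
    return float(match.group(1).replace(",", ".")) if match else None


def parse_profile(extracted: dict) -> dict:
    """Convert the css_extracted mapping into numeric profile fields."""
    return {
        "trustscore": parse_trustscore(extracted.get("trustscore")),
        "histogram": {
            stars: parse_percent(extracted.get(f"histogram_{stars}"))
            for stars in (5, 4, 3, 2, 1)
        },
    }
```

Missing or empty selectors come back as None rather than 0, so a broken selector is distinguishable from a genuine zero share.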

6. Scrape recent reviews with JS pagination

For review text extraction, use js_rendering with a js_wait_selector targeting the review card article elements. This ensures the JavaScript that injects review cards has completed before the CSS extractor runs. Set js_wait_timeout to at least 12 seconds (the request below passes 12000, which suggests the value is given in milliseconds). Trustpilot's JS bundle is large and review injection can be slow on first load.

Paginate by incrementing the ?page= parameter. Add a delay of at least 3 seconds between page requests per domain to stay below rate-limit thresholds. For monitoring pipelines, you typically only need the first 1–2 pages (20–40 reviews) to detect recent sentiment shifts — full historical scraping of thousands of reviews per domain is both slower and more legally sensitive.

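The response returns extracted values in body.data.css_extracted. Each selector that matches multiple elements returns an array of text values. Titles, bodies, ratings, and dates come back as parallel arrays that you can zip together by index to reconstruct individual review objects, provided every card contains each of those elements. Optional fields such as company replies and verified labels appear only on some cards, so their arrays can be shorter and cannot be matched to reviews by index alone.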

Trustpilot reviews — JS-rendered with pagination
json
{
  "url": "https://www.trustpilot.com/review/www.shopify.com?page=1&sort=recency",
  "mode": "js_rendering",
  "output_format": "css_extractor",
  "enable_solver": true,
  "proxy": "residential:us",
  "js_wait_selector": "article[data-service-review-card-paper]",
  "js_wait_timeout": 12000,
  "css_selectors": {
    "titles": "h2[data-service-review-title-typography]",
    "bodies": "p[data-service-review-text-typography]",
    "ratings": "div[data-service-review-rating]",
    "dates": "time[datetime]",
    "reviewer_countries": "[data-consumer-country-typography]",
    "verified_labels": "[data-review-label]",
    "reply_texts": "div[data-service-review-business-reply-text-typography]",
    "reply_dates": "div[data-service-review-business-reply-date-typography]"
  }
}
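A minimal sketch of the zip step, assuming the response uses the selector keys from the request above and that each maps to a list of strings. The function name is hypothetical. It checks that the four core arrays have equal length before zipping, because a card missing one element would silently shift every later review by one position.

```python
def zip_reviews(extracted: dict) -> list:
    """Rebuild per-review dicts from the parallel arrays in css_extracted."""
    keys = ("titles", "bodies", "ratings", "dates")
    columns = [extracted.get(key) or [] for key in keys]
    lengths = {key: len(col) for key, col in zip(keys, columns)}
    if len(set(lengths.values())) != 1:
        # Misaligned arrays cannot be matched by index safely.
        raise ValueError(f"core arrays are misaligned: {lengths}")
    fields = ("title", "body", "rating", "date")
    return [dict(zip(fields, row)) for row in zip(*columns)]
```

When the check fails, falling back to raw HTML output and parsing card by card is the safer route, since it keeps each field attached to its own review.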

7. Handling Cloudflare on Trustpilot

When Cloudflare challenges a request, the response body will contain challenge HTML rather than Trustpilot content — body.success will be false and the content will include strings like 'Just a moment' or 'Checking your browser'. With mode 'auto' and enable_solver: true, the OmniScrape Web Unlocker intercepts the challenge, solves it transparently, and retries the request. Check metadata.challenge_solved: true in the response to confirm the solve completed.

If you are seeing persistent Cloudflare blocks despite using the solver, the most common cause is a datacenter proxy leaking through. Ensure proxy is set to 'residential:us' or a residential EU pool if you need localized content. Datacenter IPs are trivially fingerprinted by Cloudflare Bot Management and will fail regardless of challenge solving. See the full Cloudflare bypass guide for advanced configurations including session persistence and TLS fingerprint matching.

Log metadata.method_used, metadata.solver_used, and metadata.challenge_solved for every Trustpilot request in your monitoring pipeline. A sudden drop in challenge_solved rate signals a Cloudflare rule change on Trustpilot's side — you will want to know before your alerting pipeline silently stops receiving data.
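The logging advice above could look roughly like the sketch below. It assumes the parsed response is a dict with a top-level success flag and a metadata object carrying method_used, solver_used, and challenge_solved; the exact response shape should be confirmed against the API reference. The helper names are hypothetical.

```python
from typing import Optional

CHALLENGE_MARKERS = ("Just a moment", "Checking your browser")


def looks_like_challenge(html: str) -> bool:
    """True if the content contains a known Cloudflare interstitial string."""
    return any(marker in html for marker in CHALLENGE_MARKERS)


def summarize(response: dict) -> dict:
    """Pick out the fields worth logging for every request."""
    meta = response.get("metadata", {})
    return {
        "success": bool(response.get("success")),
        "method_used": meta.get("method_used"),
        "solver_used": meta.get("solver_used"),
        "challenge_solved": meta.get("challenge_solved"),
    }


def solve_rate(log: list) -> Optional[float]:
    """Share of solver attempts that ended with challenge_solved set."""
    attempts = [entry for entry in log if entry["solver_used"]]
    if not attempts:
        return None
    solved = sum(1 for entry in attempts if entry["challenge_solved"])
    return solved / len(attempts)
```

Computing the rate over a rolling window (for example the last 24 hours of log entries) and alerting when it falls well below its usual level gives early warning of a rule change.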

8. TrustScore delta alerting and time-series storage

Store a daily snapshot per domain: TrustScore, total review count, and the five histogram percentages. The most actionable alert is a 1-star percentage spike — if the 1★ share rises more than 2 percentage points week-over-week, that is a signal worth routing to a human. A TrustScore drop below a defined threshold (e.g., below 4.0 for brands that market their score) is a secondary alert.
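The two alert rules above can be sketched as a comparison between today's snapshot and the one from a week earlier. The Snapshot type and function name are hypothetical; the 2-point and 4.0 defaults come from the thresholds mentioned above and should be tuned per brand.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    day: str             # ISO date of the snapshot, e.g. "2024-05-08"
    trustscore: float
    review_count: int
    histogram: dict      # stars (5..1) -> percentage share


def alerts(current: Snapshot, week_ago: Snapshot,
           score_floor: float = 4.0, one_star_jump: float = 2.0) -> list:
    """Return human-readable alerts for a 1-star spike or a low TrustScore."""
    found = []
    rise = current.histogram[1] - week_ago.histogram[1]
    if rise > one_star_jump:
        found.append(f"1-star share up {rise:.1f} pp week-over-week")
    if current.trustscore < score_floor:
        found.append(f"TrustScore {current.trustscore:.1f} below {score_floor:.1f}")
    return found
```

The 1-star rule is measured in percentage points rather than relative change, so a move from 7% to 10% triggers it while a move from 1% to 1.5% does not.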

Review velocity is a leading indicator. Compute reviews-per-day from the date sequence on the first page. A sudden velocity spike — even if the scores are not yet degraded — often precedes a score drop by 24–48 hours as the new reviews accumulate. Tracking velocity lets you alert earlier.
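One way to compute that velocity, as a sketch: take the ISO 8601 timestamps from the first page and divide the number of intervals between reviews by the time span they cover. The function name is hypothetical, and it assumes the timestamps parse with Python's datetime.fromisoformat once a trailing "Z" is rewritten as a UTC offset.

```python
from datetime import datetime
from typing import Optional


def reviews_per_day(iso_dates: list) -> Optional[float]:
    """Average reviews per day across the span covered by the given timestamps."""
    stamps = sorted(
        datetime.fromisoformat(d.replace("Z", "+00:00")) for d in iso_dates
    )
    if len(stamps) < 2:
        return None                      # not enough points to measure a rate
    span_days = (stamps[-1] - stamps[0]).total_seconds() / 86400
    if span_days == 0:
        return None                      # all reviews share one timestamp
    return (len(stamps) - 1) / span_days
```

Comparing today's value with a trailing average for the same domain is what turns this number into a spike signal; a fixed threshold rarely works across brands with very different review volumes.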

For your own brand profile, this monitoring pattern is operationally low-risk and commercially reasonable. For competitive intelligence across dozens of competitor domains, keep request volume moderate and do not store or redistribute the full review text corpus — read the compliance section below before scaling.

9. Trustpilot Terms of Use and data reuse constraints

Trustpilot's Terms of Use explicitly restrict scraping, automated access, and bulk download of review content. The practical boundary most legal teams draw is: monitoring your own business profile and a small number of direct competitors for internal brand intelligence is defensible; building a product that republishes Trustpilot review text or aggregates scores into a competing review directory is not.

If you display Trustpilot scores in a customer-facing application or marketing material, Trustpilot's brand guidelines require attribution — you cannot present a TrustScore as if it came from your own system. Their widget and API programs exist for licensed display use cases.

Public review text is technically accessible, but Trustpilot has pursued legal action against companies that scraped and redistributed review corpora at scale. Keep your use case scoped to monitoring and internal analytics. For any commercial product built on Trustpilot data, consult legal counsel and consider Trustpilot's official API program as an alternative.

Frequently asked questions

Should I use the www-prefixed or bare domain form in the URL?

Both work — Trustpilot redirects to whichever canonical form it has on record for that business. Follow the redirect on your first fetch, store the final URL, and use that consistently. Normalize your internal domain key to lowercase with www stripped for deduplication (e.g., 'shopify.com' as the key regardless of which URL form you request).
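A short sketch of that normalization, with a hypothetical function name. It accepts either a bare domain or a full /review/ URL and returns the lowercase, www-stripped key.

```python
from urllib.parse import urlparse


def domain_key(url_or_domain: str) -> str:
    """Lowercase the domain and strip a leading 'www.' for deduplication."""
    text = url_or_domain.strip().lower()
    if "://" in text:
        # Take the last path segment, e.g. /review/www.shopify.com
        path = urlparse(text).path
        text = path.rstrip("/").rsplit("/", 1)[-1]
    return text[4:] if text.startswith("www.") else text
```

Store the canonical URL returned after the redirect alongside this key, so requests use the form Trustpilot prefers while deduplication uses the normalized one.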

How do I paginate through all reviews for a domain?

Increment ?page=2, ?page=3, and so on. Stop when a page returns no article[data-service-review-card-paper] elements after JS rendering — that signals you have passed the last page. Use js_rendering with js_wait_selector for page 2 and beyond. Space requests at least 3 seconds apart per domain to avoid rate limiting. For monitoring pipelines, the first 1–2 pages (20–40 most recent reviews) are usually sufficient.
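The loop described above can be sketched as a generator. Here fetch_page is a hypothetical callable you supply: it should send the request for a given page number and return the list of review cards found, or an empty list when none are present. The sleep function is injectable so the loop can be exercised without real delays.

```python
import time
from typing import Callable, Iterator


def iter_review_pages(fetch_page: Callable[[int], list],
                      max_pages: int = 2,
                      delay_seconds: float = 3.0,
                      sleep: Callable[[float], None] = time.sleep) -> Iterator[list]:
    """Yield review cards page by page, stopping at the first empty page."""
    for page in range(1, max_pages + 1):
        if page > 1:
            sleep(delay_seconds)         # space out requests per domain
        cards = fetch_page(page)
        if not cards:
            break                        # past the last page
        yield cards
```

The default of two pages matches the monitoring case; raise max_pages only when you actually need deeper history.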

Why is the TrustScore selector returning empty?

Three common causes: (1) The company has fewer than the minimum number of reviews required for a TrustScore to be displayed; Trustpilot requires at least a few reviews before showing a composite score. (2) A Cloudflare block returned challenge HTML instead of the profile page; check body.success and look for 'Just a moment' in the content. (3) The profile markup changed; fetch the page with output_format: 'html' and inspect the raw DOM to verify that the data-rating-typography attribute is still present.

How do I handle the EU cookie consent banner?

Use a US residential proxy (proxy: 'residential:us') for profile scrapes — Trustpilot only serves the GDPR consent banner to EU-geolocated requests. If you specifically need EU-localized review data, use a residential EU proxy and set js_wait_selector to a content element below the banner fold, which forces the headless browser to wait until the page content is accessible.

How do I extract the numeric star rating from each review card?

The div[data-service-review-rating] element carries both a visual star display and a data attribute with the numeric value. When using css_extractor, the selector returns the element's text content. For the numeric value, inspect the raw HTML using output_format: 'html' to identify the exact data attribute name (it has been data-service-review-rating-value in past versions). Alternatively, parse the aria-label on the star SVG, which typically reads '5 stars' or similar.
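If you go the label route, a small parser is enough. The sketch below takes the first standalone digit from 1 to 5 in the label text; the function name is hypothetical and the label wording is an assumption, so confirm it against the raw HTML first.

```python
import re
from typing import Optional


def stars_from_label(label: str) -> Optional[int]:
    """Return the first standalone 1-5 digit in a label such as '5 stars'."""
    match = re.search(r"\b([1-5])\b", label)
    return int(match.group(1)) if match else None
```

Taking the first match matters for labels phrased like "2 out of 5 stars", where the rating comes before the scale maximum.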

Can I scrape Trustpilot for competitor benchmarking?

Technically yes with the right proxy and solver configuration. Legally, keep it to internal analytics — do not redistribute the review text or build a product that surfaces Trustpilot data to end users without a license. Monitoring a handful of competitor TrustScores and histograms for internal brand strategy is a different risk profile than bulk-downloading review corpora.

What is the difference between mode 'auto' and 'js_rendering' for Trustpilot?

Use mode 'auto' for the profile header (TrustScore, histogram, company info) — it tries the fast HTTP path first and only escalates to a headless browser if Cloudflare challenges the request, keeping latency low. Use mode 'js_rendering' explicitly for review pages beyond page 1, where review cards are reliably injected by JavaScript and the fast path will return an empty shell. Check metadata.method_used in the response to see which path was actually used.

Related guides

  • Sentiment Analysis Web Scraping: Build a Production Review Pipeline
  • How to Bypass Cloudflare When Web Scraping
  • E-commerce Web Scraping: Catalog Intelligence at Production Scale
  • Web Scraping with Python


© 2026 OmniScrape. All rights reserved.
