OmniScrape

Web Scraping with Rust

Rust shows up in scrape orchestration when teams want memory safety and predictable tail latency — security scanners, fintech monitors, ad verification pipelines. reqwest handles HTTPS; the scraper crate parses HTML with CSS selectors on top of html5ever.

What Rust will not spare you is Cloudflare. Building TLS fingerprint mimicry in native code is a full-time job. Instead, POST to the OmniScrape API, deserialize the JSON response with serde, and feed the returned HTML to scraper. The Python scraping guide uses the same endpoint if you are prototyping in two languages.

On this page

  1. Cargo.toml dependencies
  2. Blocking fetch for CLI tools
  3. Parse with scraper
  4. Model the API response
  5. Async OmniScrape with reqwest
  6. Concurrent fetches with Tokio
  7. Skip scraper with css_extractor
  8. js_rendering for client-rendered HTML
  9. Result types in production
  10. FAQ

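1. Cargo.toml dependencies

reqwest with the json and blocking features (blocking is needed for the synchronous client in section 2), tokio for async, scraper for DOM queries, and serde plus serde_json for API responses.

Cargo.toml
toml
[dependencies]
reqwest = { version = "0.12", features = ["json", "blocking"] }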
tokio = { version = "1", features = ["full"] }
scraper = "0.20"
serde = { version = "1", features = ["derive"] }
serde_json = "1"

2. Blocking fetch for CLI tools

A synchronous reqwest::blocking client is fine for one-shot binaries; it requires reqwest's blocking feature. Production services usually move to async reqwest on Tokio.

fetch.rs
rust
fn main() -> Result<(), reqwest::Error> {
    // Requires the `blocking` feature of reqwest.
    let client = reqwest::blocking::Client::builder()
        .timeout(std::time::Duration::from_secs(30))
        .build()?;

    // Send the GET and read the whole body as a String.
    let html = client
        .get("https://books.toscrape.com/catalogue/page-1.html")
        .send()?
        .text()?;

    println!("fetched {} bytes", html.len());
    Ok(())
}

3. Parse with scraper

Html::parse_document builds a DOM. Selector::parse compiles a CSS selector, and select walks the tree with it. Compile each selector once, outside hot loops.

parse.rs
rust
use scraper::{Html, Selector};

// `html` is the page body fetched in the previous step.
let document = Html::parse_document(&html);
let card_sel = Selector::parse("article.product_pod").unwrap();
let title_sel = Selector::parse("h3 a").unwrap();
let price_sel = Selector::parse(".price_color").unwrap();

let mut books = Vec::new();
for card in document.select(&card_sel) {
    let title = card.select(&title_sel).next()
        .and_then(|el| el.value().attr("title"))
        .unwrap_or("")
        .to_string();
    let price = card.select(&price_sel).next()
        .map(|el| el.text().collect::<String>().trim().to_string())
        .unwrap_or_default();
    books.push((title, price));
}

println!("found {} books", books.len());

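4. Model the API response

Typing fields as Option makes the optional parts of the response explicit, so a missing field becomes None instead of a deserialization error. serde also ignores unknown fields by default, so new metadata from OmniScrape will not break these structs. Never unwrap() on production JSON without a fallback.

types.rs
rust
use serde::Deserialize;

#[derive(Debug, Deserialize)]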
struct ScrapeResponse {
    success: bool,
    data: Option<ScrapeData>,
    metadata: Option<Metadata>,
    billing: Option<Billing>,
}

#[derive(Debug, Deserialize)]
struct ScrapeData {
    content: Option<String>,
    css_extracted: Option<serde_json::Value>,
}

#[derive(Debug, Deserialize)]
struct Metadata {
    method_used: String,
}

#[derive(Debug, Deserialize)]
struct Billing {
    charged: f64,
}

5. Async OmniScrape with reqwest

Tokio and reqwest scale well for concurrent API calls. Use a Semaphore from tokio::sync to cap in-flight js_rendering jobs.

When a direct fetch fails on a protected site, the POST to the OmniScrape API replaces your GET entirely. See the Cloudflare bypass guide for why.

omniscrape.rs
rust
use scraper::{Html, Selector};
// ScrapeResponse is the struct defined in types.rs (section 4).

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();
    let api_key = std::env::var("OMNISCRAPE_KEY")?;

    let res = client
        .post("https://api.omniscrape.io/v1/scrape")
        .header("X-API-Key", api_key)
        .json(&serde_json::json!({
            "url": "https://protected-shop.com/item/99",
            "mode": "auto",
            "output_format": "html",
        }))
        .timeout(std::time::Duration::from_secs(120))
        .send()
        .await?;

    let body: ScrapeResponse = res.json().await?;
    if !body.success {
        // Plain string error; converts into Box<dyn Error> without extra crates.
        return Err("scrape failed".into());
    }

    let html = body.data.and_then(|d| d.content).unwrap_or_default();
    let document = Html::parse_document(&html);
    let price_sel = Selector::parse(".product-price").unwrap();
    let price = document.select(&price_sel).next()
        .map(|el| el.text().collect::<String>());

    println!("price: {:?}", price);
    if let Some(m) = body.metadata {
        println!("via {}", m.method_used);
    }
    Ok(())
}

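6. Concurrent fetches with Tokio

futures::future::join_all or a stream with buffer_unordered (both from the futures crate, which is not in the Cargo.toml above) can process URL lists. The example here instead spawns one Tokio task per URL and caps concurrency with a Semaphore. Either way, handle per-URL Err without aborting the batch.

pool.rs
rust
// Assumes `client` (reqwest::Client), `api_key` (String), and an async
// `scrape_one` helper returning a Result are already in scope.
use tokio::sync::Semaphore;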
use std::sync::Arc;

let sem = Arc::new(Semaphore::new(5));
let urls = vec!["https://example.com/a", "https://example.com/b"];

let handles: Vec<_> = urls.into_iter().map(|url| {
    let client = client.clone();
    let sem = sem.clone();
    let key = api_key.clone();
    tokio::spawn(async move {
        let _permit = sem.acquire().await.unwrap();
        scrape_one(&client, &key, url).await
    })
}).collect();

for h in handles {
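    match h.await {
        Ok(Ok(data)) => println!("ok: {:?}", data),
        Ok(Err(e)) => eprintln!("scrape error: {}", e),
        // JoinError: the task panicked or was cancelled; keep processing the rest.
        Err(join_err) => eprintln!("task failed: {}", join_err),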
    }
}
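
One way to keep per-URL failures from aborting the batch is to fold every outcome into a report and decide about retries afterwards. The sketch below is dependency-free and purely illustrative: `Outcome`, `BatchReport`, and `collect_batch` are hypothetical names, and bodies and errors are plain strings rather than the real response types.

```rust
/// Per-URL outcome: the URL plus either the fetched body or an error message.
/// (Hypothetical shape; real code would carry the typed response and error.)
type Outcome = (String, Result<String, String>);

#[derive(Debug, Default)]
struct BatchReport {
    /// (url, body) for every URL that succeeded.
    ok: Vec<(String, String)>,
    /// (url, error) for every URL that failed; candidates for retry or a dead-letter store.
    failed: Vec<(String, String)>,
}

/// Consumes all outcomes. A failure is recorded and the loop continues,
/// so one bad URL never aborts the rest of the batch.
fn collect_batch(outcomes: impl IntoIterator<Item = Outcome>) -> BatchReport {
    let mut report = BatchReport::default();
    for (url, result) in outcomes {
        match result {
            Ok(body) => report.ok.push((url, body)),
            Err(e) => report.failed.push((url, e)),
        }
    }
    report
}

fn main() {
    let outcomes: Vec<Outcome> = vec![
        ("https://example.com/a".to_string(), Ok("<html>a</html>".to_string())),
        ("https://example.com/b".to_string(), Err("timeout".to_string())),
    ];
    let report = collect_batch(outcomes);
    println!("{} ok, {} failed", report.ok.len(), report.failed.len());
    for (url, err) in &report.failed {
        eprintln!("retry later: {url}: {err}");
    }
}
```

Keeping the failed URLs next to their error messages makes it straightforward to re-queue them or hand them to whatever dead-letter storage the pipeline uses.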

7. Skip scraper with css_extractor

When you only need a few fields, deserialize css_extracted into a struct and skip DOM walking entirely.

structured.json
rust
// Request body only: chain this onto client.post(...) as in section 5.
.json(&serde_json::json!({
    "url": target,
    "mode": "auto",
    "output_format": "css_extractor",
    "css_selectors": {
        "title": "h1",
        "price": ".price"
    }
}))

8. js_rendering for client-rendered HTML

scraper only parses the HTML it is given; it does not execute JavaScript. React SPAs need js_rendering with js_wait_selector. The guide on scraping JavaScript-rendered pages covers when to use it.

9. Result types in production

Map API failures to actionable variants instead of panicking:

  • 401 — config error, return early
  • 402 — budget exhausted, stop scheduler
  • 429 — sleep with jitter, retry
  • 502 — retry with cap
  • success:false — log URL to dead-letter store
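
The list above can be sketched as a small decision function. This is a minimal, dependency-free illustration, not OmniScrape's client: the `Action` enum, `classify`, `backoff`, the retry cap, and the delay constants are all assumptions, and the jitter fraction is passed in by the caller (for example from the rand crate) so the logic stays deterministic and testable.

```rust
use std::time::Duration;

/// What the scheduler should do after one API call (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// 2xx with success:true; the response is usable.
    Proceed,
    /// 401: configuration error; return early and fix the key.
    FixConfig,
    /// 402: budget exhausted; stop scheduling new work.
    StopScheduler,
    /// 429 or 502 with retries left: sleep for this long, then retry.
    RetryAfter(Duration),
    /// success:false, retries used up, or an unlisted status: park the URL.
    DeadLetter,
}

// Assumed tuning values; adjust to your own limits.
const MAX_RETRIES: u32 = 4;
const BASE_DELAY_MS: u64 = 500;
const MAX_DELAY_MS: u64 = 30_000;

/// Capped exponential backoff with jitter: half of the delay is fixed,
/// the other half is scaled by `jitter`, a fraction in [0.0, 1.0].
fn backoff(attempt: u32, jitter: f64) -> Duration {
    let exp = BASE_DELAY_MS.saturating_mul(1u64 << attempt.min(16));
    let ceiling = exp.min(MAX_DELAY_MS);
    let half = ceiling / 2;
    let extra = (half as f64 * jitter.clamp(0.0, 1.0)) as u64;
    Duration::from_millis(half + extra)
}

/// Maps an HTTP status plus the body's `success` flag to an action.
/// `attempt` is zero-based: 0 means the first try just failed.
fn classify(status: u16, success: bool, attempt: u32, jitter: f64) -> Action {
    match status {
        401 => Action::FixConfig,
        402 => Action::StopScheduler,
        429 | 502 if attempt < MAX_RETRIES => Action::RetryAfter(backoff(attempt, jitter)),
        429 | 502 => Action::DeadLetter,
        200..=299 if success => Action::Proceed,
        _ => Action::DeadLetter,
    }
}

fn main() {
    for (status, success) in [(200, true), (200, false), (401, false), (429, false)] {
        println!("{status} success={success} -> {:?}", classify(status, success, 0, 0.5));
    }
}
```

Returning an enum instead of sleeping or panicking inside the function keeps the policy in one place and lets the caller decide how to sleep, whether with tokio::time::sleep in async code or std::thread::sleep in a CLI.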

Frequently asked questions

reqwest blocking or async?

Blocking for CLIs and quick tools. Async Tokio for services fanning out hundreds of OmniScrape calls.

scraper or select.rs?

scraper is the common choice with familiar CSS selectors. select.rs is lighter if you only need a few queries.

Should I build anti-bot bypass in Rust?

Only if bypass engineering is your product. Otherwise OmniScrape keeps your Rust code focused on parsing and storage.

How do I avoid compiling selectors every iteration?

Call Selector::parse once at startup and clone the result into tasks, or store it in a static with lazy_static or std::sync::OnceLock.
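
A minimal sketch of the OnceLock variant follows. To keep it dependency-free, `CompiledSelector` is a hypothetical stand-in for scraper::Selector, and `COMPILE_COUNT` exists only to make the caching visible; with the real crate, the closure would call Selector::parse instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for `scraper::Selector`, so the sketch builds with std only.
#[derive(Debug)]
struct CompiledSelector {
    css: String,
}

/// Counts compilations; present only to show that parsing happens once.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

impl CompiledSelector {
    /// Stand-in for `Selector::parse(css).unwrap()`.
    fn parse(css: &str) -> Self {
        COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
        CompiledSelector { css: css.to_string() }
    }
}

/// Returns the shared selector, compiling it on the first call only.
fn price_selector() -> &'static CompiledSelector {
    static SEL: OnceLock<CompiledSelector> = OnceLock::new();
    SEL.get_or_init(|| CompiledSelector::parse(".price_color"))
}

fn main() {
    // Simulates a hot loop: every iteration reuses the same compiled value.
    for _ in 0..1000 {
        let sel = price_selector();
        assert_eq!(sel.css, ".price_color");
    }
    println!("compiled {} time(s)", COMPILE_COUNT.load(Ordering::SeqCst));
}
```

Because the static lives inside the accessor function, callers cannot reach the selector before it is initialized, and the returned `&'static` reference can be used from any task without cloning.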

hyper directly instead of reqwest?

hyper for maximal control. reqwest is ergonomic for JSON APIs like OmniScrape.

Related guides

  • Web Scraping with Python
  • How to Bypass Cloudflare When Web Scraping
  • Scrape JavaScript-Rendered Pages: SPAs, Hydration, and Hidden APIs
  • Web Scraping API: Endpoint, Modes, Output Formats & Integration Patterns

© 2026 OmniScrape. All rights reserved.
