1. When to use httpx over requests
httpx is the right choice when your scraping workload lives inside an async Python service (FastAPI, Starlette, or a raw asyncio pipeline) and you need to fan out many concurrent OmniScrape requests without spawning threads or processes. It lets you run asyncio.gather over a shared connection pool, supports optional HTTP/2 multiplexing, and offers a synchronous Client whose API mirrors AsyncClient, which keeps unit tests straightforward.
If you are writing a one-off script or running Celery prefork workers, the synchronous httpx.Client (or plain requests) is perfectly adequate. The async value shows up when you are managing hundreds of in-flight scrape jobs inside a single event loop and want to avoid the overhead of a full framework like Scrapy.
- FastAPI / Starlette scrape endpoints with lifespan-managed client
- asyncio.gather on large URL batches with semaphore-bounded concurrency
- Shared AsyncClient with connection pooling across request handlers
- HTTP/2 to OmniScrape API for multiplexed connections
- Sync/async parity (Client and AsyncClient): same code paths in tests and production
- Streaming large HTML responses without buffering the full body
2. Where httpx breaks on protected sites
httpx's async concurrency solves throughput — it does not solve anti-bot detection. When you send httpx directly to a Cloudflare-protected retailer, the TLS ClientHello fingerprint identifies your client as a Python process within milliseconds. HTTP/2 support does not change this; the fingerprint is still detectable at the TLS handshake layer before any HTTP frames are exchanged.
Beyond bot detection, there are several operational failure modes to design around. A missing semaphore lets asyncio.gather fire all requests simultaneously, which saturates your connection pool and triggers OmniScrape 429 rate limiting. js_rendering jobs can take 30–90 seconds each; awaiting one inside a FastAPI handler keeps the inbound request and a pool connection tied up for the whole duration. And httpx has no built-in HTML parser: you need selectolax or BeautifulSoup4 downstream, or you use OmniScrape's css_extractor output format to avoid parsing entirely.
- Direct GET to Cloudflare-protected retailers — TLS fingerprint blocked immediately
- HTTP/2 fingerprinting still detectable without OmniScrape proxy rotation
- Connection pool exhaustion when semaphore is missing from gather loops
- No built-in HTML parser — pair with selectolax or use css_extractor
- Long js_rendering jobs (30–90 s) held inside FastAPI handlers tie up pool connections and keep callers waiting
- Async concurrency improves throughput but does not bypass anti-bot middleware
3. Pattern A: async batch fetch with bounded concurrency
The core pattern: one shared AsyncClient for the process lifetime, one asyncio.Semaphore to cap in-flight requests, and asyncio.gather with return_exceptions=True so a single failed URL does not abort the entire batch. The semaphore value should match your OmniScrape plan's concurrency limit — start at 5 and raise it only after confirming you are not hitting 429 responses.
Using css_extractor as the output format means OmniScrape performs the field extraction server-side and returns a clean dictionary. This avoids shipping full HTML payloads over the wire and removes the need for a local parsing step. The response shape is body['data']['css_extracted'] — a dict keyed by your selector names.
import asyncio
import os

import httpx

API_KEY = os.environ["OMNISCRAPE_KEY"]

URLS = [
    "https://example.com/p/1",
    "https://example.com/p/2",
    "https://example.com/p/3",
]


async def scrape_one(client: httpx.AsyncClient, url: str) -> dict:
    r = await client.post(
        "https://api.omniscrape.io/v1/scrape",
        headers={"X-API-Key": API_KEY},
        json={
            "url": url,
            "mode": "auto",
            "output_format": "css_extractor",
            "css_selectors": {
                "title": "h1",
                "price": "[data-price]",
                "sku": ".product-sku",
            },
            "enable_solver": True,
        },
        timeout=120.0,
    )
    r.raise_for_status()
    body = r.json()
    if not body["success"]:
        raise RuntimeError(f"scrape failed for {url}: {body}")
    return {
        "url": url,
        "fields": body["data"].get("css_extracted", {}),
        "method_used": body["metadata"]["method_used"],
        "solver_used": body["metadata"].get("solver_used", False),
        "charged": body["billing"]["charged"],
        "balance_after": body["billing"]["balance_after"],
    }


async def main():
    sem = asyncio.Semaphore(5)  # match your OmniScrape concurrency limit

    async def bounded(url: str):
        async with sem:
            return await scrape_one(client, url)

    async with httpx.AsyncClient(
        limits=httpx.Limits(max_connections=20, max_keepalive_connections=10)
    ) as client:
        results = await asyncio.gather(
            *[bounded(u) for u in URLS], return_exceptions=True
        )

    for url, res in zip(URLS, results):
        if isinstance(res, Exception):
            print(f"FAILED {url}: {res}")
        else:
            print(f"OK {url}: {res['fields']} | cost={res['charged']}")


asyncio.run(main())
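To see what the semaphore does without spending credits, the following minimal sketch swaps the OmniScrape call for a stand-in coroutine that only sleeps, and records the peak number of in-flight calls. The names `run_batch` and `fake_scrape` and the `/bad` URL convention are illustrative assumptions, not part of httpx or the OmniScrape API.

```python
import asyncio


async def run_batch(urls: list[str], limit: int) -> tuple[list, int]:
    """Run a fake scrape for every URL; return (results, peak in-flight count)."""
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_scrape(url: str) -> dict:
        # Hypothetical stand-in for scrape_one(): no network, just a short sleep.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        try:
            await asyncio.sleep(0.01)
            if url.endswith("/bad"):
                raise RuntimeError(f"scrape failed for {url}")
            return {"url": url}
        finally:
            in_flight -= 1

    async def bounded(url: str):
        # Only `limit` coroutines can be inside this block at once.
        async with sem:
            return await fake_scrape(url)

    results = await asyncio.gather(
        *(bounded(u) for u in urls), return_exceptions=True
    )
    return results, peak


urls = [f"https://example.com/p/{i}" for i in range(20)] + ["https://example.com/bad"]
results, peak = asyncio.run(run_batch(urls, limit=5))
failed = [r for r in results if isinstance(r, Exception)]
print(f"peak in-flight: {peak}, failures: {len(failed)}")
```

The single failing URL comes back as an exception object in `results` rather than aborting the batch, which is the behaviour return_exceptions=True is there to provide.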
4. FastAPI service wrapper with lifespan client
The single most common mistake in FastAPI scrape services is creating a new httpx.AsyncClient per request. Each client instantiation opens a new connection pool, burns file descriptors, and bypasses HTTP keep-alive entirely. The correct pattern is to create one AsyncClient in the application lifespan and attach it to app.state — every request handler then reads the shared client from the request object.
The lifespan context manager ensures the client is properly closed on shutdown, draining in-flight requests and releasing connections. Without this, you will see ResourceWarning noise in logs and occasional connection leaks under load.
from contextlib import asynccontextmanager
import os

from fastapi import FastAPI, HTTPException, Request
import httpx

KEY = os.environ["OMNISCRAPE_KEY"]


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: create shared client once
    app.state.client = httpx.AsyncClient(
        timeout=120.0,
        limits=httpx.Limits(max_connections=20, max_keepalive_connections=10),
    )
    yield
    # Shutdown: close cleanly
    await app.state.client.aclose()


app = FastAPI(lifespan=lifespan)


@app.post("/internal/scrape")
async def scrape(request: Request, url: str, mode: str = "auto"):
    # Reuse the shared client created in lifespan; never build one per request
    client: httpx.AsyncClient = request.app.state.client
    r = await client.post(
        "https://api.omniscrape.io/v1/scrape",
        headers={"X-API-Key": KEY},
        json={
            "url": url,
            "mode": mode,
            "output_format": "html",
            "enable_solver": True,
        },
    )
    # Map upstream auth, billing, and rate-limit errors to matching responses
    if r.status_code == 401:
        raise HTTPException(401, "Invalid OmniScrape API key")
    if r.status_code == 402:
        raise HTTPException(402, "OmniScrape balance exhausted; top up required")
    if r.status_code == 429:
        raise HTTPException(429, "OmniScrape rate limit; reduce concurrency")
    r.raise_for_status()
    body = r.json()
    if not body.get("success"):
        raise HTTPException(502, detail=body)
    return {
        "content": body["data"]["content"],
        "method_used": body["metadata"]["method_used"],
        "charged": body["billing"]["charged"],
    }
5. Structured logging for cost attribution
Every OmniScrape response carries billing and metadata fields that are worth logging in structured form. billing.charged tells you the credit cost of each individual scrape; billing.balance_after lets you build a low-balance alert without a separate API call. metadata.method_used tells you whether the job ran via fast HTTP or js_rendering — a useful signal for tuning mode selection and understanding why costs are higher than expected.
Log these fields as structured JSON alongside the URL and a timestamp. In a FastAPI service, attach them to the request context so they appear in every log line for that request. This feeds cost attribution dashboards and is directly referenced in the price monitoring guide for per-SKU cost tracking.
import logging
import json

logger = logging.getLogger("omniscrape")


def log_scrape_result(url: str, body: dict) -> None:
    record = {
        "url": url,
        "success": body.get("success"),
        "method_used": body.get("metadata", {}).get("method_used"),
        "solver_used": body.get("metadata", {}).get("solver_used"),
        "challenge_solved": body.get("metadata", {}).get("challenge_solved"),
        "charged": body.get("billing", {}).get("charged"),
        "balance_after": body.get("billing", {}).get("balance_after"),
    }
    logger.info(json.dumps(record))
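The low-balance alert mentioned above can hang off the same response body. The sketch below is one possible shape, assuming billing.balance_after is numeric; the `check_balance` name and the threshold of 50 are hypothetical and should be replaced with values that fit your plan.

```python
import logging

logger = logging.getLogger("omniscrape.billing")

# Hypothetical threshold; choose one that matches your plan and burn rate.
LOW_BALANCE_THRESHOLD = 50.0


def check_balance(body: dict, threshold: float = LOW_BALANCE_THRESHOLD) -> bool:
    """Return True (and log a warning) when balance_after is below threshold."""
    balance = body.get("billing", {}).get("balance_after")
    if balance is None:
        # The response carried no billing block, so there is nothing to evaluate.
        return False
    if balance < threshold:
        logger.warning(
            "OmniScrape balance low: %s (threshold %s)", balance, threshold
        )
        return True
    return False
```

Calling it right after log_scrape_result keeps the check on the hot path with no extra API call; routing the warning to a pager or chat channel is left to your logging configuration.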
6. Pattern B: httpx + Playwright BaaS for interactive pages
Some workflows require genuine browser interaction: clicking through cookie banners, selecting dropdown filters, or triggering lazy-loaded content before extraction. For these cases, httpx handles the job queue and result storage while a Playwright async client connects to OmniScrape's Browser-as-a-Service endpoint over CDP. The browser session runs remotely — your process only sends commands and receives DOM snapshots.
Critical operational constraint: do not run BaaS sessions inside a FastAPI request handler. A single session can hold a CDP connection open for 30–120 seconds, and running it inside a handler keeps the caller waiting for the whole session and ties the browser's lifetime to one HTTP request. Dispatch BaaS jobs to a dedicated background worker instead (an asyncio.Queue consumer for in-process decoupling, or Celery or ARQ for a separate worker process) so the FastAPI response is decoupled from the browser session lifetime.
import os
import asyncio

from playwright.async_api import async_playwright

OMNISCRAPE_KEY = os.environ["OMNISCRAPE_KEY"]


async def interactive_scrape(url: str, wait_selector: str = ".price") -> str:
    """
    Connect to OmniScrape BaaS via CDP, navigate to url,
    wait for wait_selector, and return the full page HTML.
    Run this in a dedicated worker, not inside a FastAPI handler.
    """
    cdp_endpoint = (
        f"wss://browser.omniscrape.io"
        f"?apikey={OMNISCRAPE_KEY}"
        f"&render_media=false"
    )
    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp(cdp_endpoint)
        context = await browser.new_context()
        page = await context.new_page()
        try:
            await page.goto(url, wait_until="domcontentloaded", timeout=60_000)
            await page.wait_for_selector(wait_selector, timeout=30_000)
            html = await page.content()
        finally:
            await browser.close()
    return html


# Example: run from a background worker
if __name__ == "__main__":
    html = asyncio.run(
        interactive_scrape("https://example.com/product/42", wait_selector=".price")
    )
    print(html[:500])
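As a rough illustration of the decoupling, the sketch below feeds URLs through an asyncio.Queue to a small pool of consumer tasks. The `worker` and `fake_session` names are hypothetical, and `fake_session` stands in for interactive_scrape so the example runs without a browser. Note that an asyncio.Queue only decouples work inside one process; for a separate worker process you would reach for Celery or ARQ instead.

```python
import asyncio


async def worker(queue: asyncio.Queue, results: dict, handler) -> None:
    """Pull URLs off the queue forever and store each outcome in results."""
    while True:
        url = await queue.get()
        try:
            results[url] = await handler(url)
        except Exception as exc:
            # Record the failure and keep the worker alive for the next job.
            results[url] = exc
        finally:
            queue.task_done()


async def fake_session(url: str) -> str:
    # Hypothetical stand-in for interactive_scrape(); a real session takes far longer.
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


async def main() -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    results: dict = {}
    # Two consumers means at most two browser sessions open at once.
    workers = [
        asyncio.create_task(worker(queue, results, fake_session))
        for _ in range(2)
    ]

    for i in range(5):
        queue.put_nowait(f"https://example.com/product/{i}")

    await queue.join()  # wait until every queued job has been processed
    for w in workers:
        w.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return results


results = asyncio.run(main())
print(f"{len(results)} pages fetched")
```

The number of consumer tasks doubles as the cap on concurrent browser sessions, so no separate semaphore is needed in this arrangement.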
7. OmniScrape async endpoint for long renders
For js_rendering jobs that consistently take more than 30 seconds — complex SPAs, pages with heavy third-party scripts, or sites that require multiple navigation steps — holding an httpx connection open for the full duration is wasteful. Each open connection occupies a slot in your pool and a file descriptor on the OS. If your plan supports it, POST to /v1/scrape/async instead: OmniScrape queues the job, returns a job ID immediately, and you poll for completion.
The polling loop below uses exponential backoff with a cap. Do not poll faster than every 2 seconds — the job status endpoint is rate-limited and rapid polling wastes credits. Store the job ID in Redis or a database if you need durability across worker restarts.
import asyncio
import os

import httpx

API_KEY = os.environ["OMNISCRAPE_KEY"]
BASE = "https://api.omniscrape.io/v1"


async def submit_async_job(client: httpx.AsyncClient, url: str) -> str:
    r = await client.post(
        f"{BASE}/scrape/async",
        headers={"X-API-Key": API_KEY},
        json={
            "url": url,
            "mode": "js_rendering",
            "output_format": "html",
            "js_wait_selector": ".product-loaded",
            "js_wait_timeout": 30000,
        },
        timeout=30.0,
    )
    r.raise_for_status()
    return r.json()["job_id"]


async def poll_job(client: httpx.AsyncClient, job_id: str, max_wait: int = 120) -> dict:
    delay = 2.0
    elapsed = 0.0  # counts sleep time only, not time spent in the status requests
    while elapsed < max_wait:
        await asyncio.sleep(delay)
        elapsed += delay
        r = await client.get(
            f"{BASE}/scrape/async/{job_id}",
            headers={"X-API-Key": API_KEY},
            timeout=10.0,
        )
        r.raise_for_status()
        body = r.json()
        if body.get("status") == "complete":
            return body
        if body.get("status") == "failed":
            raise RuntimeError(f"async job {job_id} failed: {body}")
        delay = min(delay * 1.5, 10.0)  # cap at 10 s between polls
    raise TimeoutError(f"job {job_id} did not complete within {max_wait}s")


async def main():
    async with httpx.AsyncClient() as client:
        job_id = await submit_async_job(client, "https://example.com/heavy-spa")
        print(f"Job submitted: {job_id}")
        result = await poll_job(client, job_id)
        content = result["data"]["content"]
        # content is a str, so len() counts characters rather than bytes
        print(f"Done: {len(content)} characters")


asyncio.run(main())
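To make the backoff concrete, the sketch below reproduces only the arithmetic of the polling loop (2 s start, 1.5× growth, 10 s cap, 120 s budget) and lists the sleep intervals it would use. `poll_schedule` is a hypothetical helper written for illustration; it performs no network calls.

```python
def poll_schedule(max_wait: float = 120.0, start: float = 2.0,
                  factor: float = 1.5, cap: float = 10.0) -> list[float]:
    """Return the sleep durations the polling loop would use before giving up."""
    delays = []
    delay, elapsed = start, 0.0
    while elapsed < max_wait:
        delays.append(delay)
        elapsed += delay
        delay = min(delay * factor, cap)  # grow the interval, but never past the cap
    return delays


schedule = poll_schedule()
print(len(schedule), schedule[:5], sum(schedule))
```

With the defaults this gives 15 polls whose sleeps total 126.25 s, so the loop can overrun max_wait by up to one interval. If a hard deadline matters, compare against a monotonic clock instead of summing the delays.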
8. Error handling matrix
Centralise all OmniScrape error handling in one client wrapper function. Scattering status code checks across handlers makes it hard to add global retry logic or alerting later. The matrix below covers the HTTP statuses and the success:false case you are most likely to encounter in production.
The most important distinction: 429 is retryable with backoff; 402 is not retryable and requires human intervention (top up balance). success:false with a 200 HTTP status means OmniScrape reached the target but the scrape itself failed — dead-letter these for inspection rather than retrying blindly, as they often indicate a selector change or a new anti-bot challenge type.
- 401 — Invalid or missing API key. Fix the key. No retry.
- 402 — Account balance exhausted. Pause all workers, alert on-call, top up balance.
- 429 — Rate limit exceeded. Exponential backoff with jitter (2 s base, 60 s cap). Reduce semaphore value.
- 500/503 — OmniScrape internal error. Retry up to 3× with 5 s delay between attempts.
- 502 from your own service — upstream OmniScrape timeout. Retry once; if persistent, switch to async job endpoint.
- success: false, HTTP 200 — Scrape reached target but extraction failed. Dead-letter the URL for manual inspection. Do not retry blindly.
- httpx.TimeoutException — Your 120 s client timeout fired before OmniScrape responded. Retry once; if consistent, switch that URL to async job flow.
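One way to centralise the matrix is a single function that turns a status code and response body into an action, which the wrapper then acts on. The sketch below is an illustration under that assumption; the `Action` enum and `classify` function are hypothetical names, and the retry counts and delays from the list above are deliberately left to the caller.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    OK = "ok"
    RETRY = "retry"              # retry with backoff
    FATAL = "fatal"              # stop and alert; retrying will not help
    DEAD_LETTER = "dead_letter"  # park the URL for manual inspection


def classify(status_code: int, body: Optional[dict] = None) -> Action:
    """Map an OmniScrape HTTP status and JSON body to a single action."""
    if status_code in (401, 402):
        # Bad key or exhausted balance: needs a human, not a retry.
        return Action.FATAL
    if status_code == 429 or status_code >= 500:
        return Action.RETRY
    if status_code == 200:
        if body is not None and body.get("success") is False:
            return Action.DEAD_LETTER
        return Action.OK
    # Any other status is unexpected; park it rather than retry blindly.
    return Action.DEAD_LETTER
```

Keeping the decision separate from the HTTP call makes it easy to unit test every row of the matrix without touching the network.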
9. Connection pool limits and semaphore tuning
httpx.Limits controls the AsyncClient's connection pool. max_connections is the hard ceiling on simultaneous open connections; max_keepalive_connections controls how many idle connections are held open for reuse. For a scrape service posting to OmniScrape, a reasonable starting configuration is max_connections=20, max_keepalive_connections=10. Raise these only if profiling shows your semaphore is releasing slots faster than the pool can open new connections.
The semaphore value and the pool size are separate concerns. The semaphore caps how many OmniScrape jobs you have in flight at once (to respect rate limits); the pool size caps how many TCP connections httpx maintains. Set the semaphore to your OmniScrape plan's concurrency limit. Set max_connections to at least the semaphore value — if max_connections is lower than the semaphore, you will see connection acquisition delays that look like latency spikes but are actually pool contention.
import httpx

# Recommended starting configuration for a scrape microservice
client = httpx.AsyncClient(
    timeout=httpx.Timeout(
        connect=10.0,  # TCP connect timeout
        read=120.0,    # response read timeout; must cover js_rendering
        write=10.0,
        pool=5.0,      # time to wait for a connection from the pool
    ),
    limits=httpx.Limits(
        max_connections=20,
        max_keepalive_connections=10,
        keepalive_expiry=30.0,  # close idle connections after 30 s
    ),
    # Enable HTTP/2 for multiplexing; requires the optional extra:
    # pip install "httpx[http2]"
    http2=True,
)
10. Production readiness checklist
Run through this list before deploying a new httpx-based scrape service. Most production incidents with OmniScrape integrations trace back to missing items here — particularly the missing semaphore, the per-request client, and the lack of low-balance alerting.
- Semaphore value matches OmniScrape plan concurrency limit — not higher
- AsyncClient created once in lifespan, not per request or per URL
- timeout=120.0 (or Timeout object) on all scrape POSTs — covers js_rendering
- return_exceptions=True in asyncio.gather — partial batch success, not all-or-nothing
- Structured logging of billing.charged and metadata.method_used per URL
- Low-balance alert on billing.balance_after falling below threshold
- BaaS / Pattern B sessions run in a dedicated worker process, not inside FastAPI handlers
- Error handling centralised in one wrapper — 401/402/429/5xx/success:false all covered
- Load test at production concurrency before launch — verify 429 rate and pool behaviour
- Dead-letter queue for success:false responses — inspect before retrying
Frequently asked questions
When should I use httpx instead of requests for OmniScrape calls?
Use httpx when your scraping workload runs inside an async Python service — FastAPI, Starlette, or a raw asyncio pipeline. The async AsyncClient lets you fan out many concurrent OmniScrape POSTs over a shared connection pool without threads. For one-off scripts or Celery prefork workers, synchronous requests is simpler and equally capable — the JSON request body and response shape are identical.
Does enabling HTTP/2 on the httpx client improve scraping performance?
Marginally. HTTP/2 multiplexes multiple requests over a single TCP connection, which reduces connection overhead. But the dominant latency in OmniScrape workflows is render time — fast mode jobs take 1–3 s, js_rendering jobs take 10–90 s. The difference between HTTP/1.1 and HTTP/2 on the client-to-API leg is rarely more than a few hundred milliseconds. Enable it (http2=True) if you want, but do not expect it to meaningfully change throughput.
How do I share the AsyncClient across FastAPI request handlers safely?
Use the FastAPI lifespan context manager to create one AsyncClient on startup and attach it to app.state. Each handler reads the client from request.app.state.client. This gives you a single connection pool for the process lifetime. Never create a new AsyncClient inside a request handler — it opens a new pool, bypasses keep-alive, and leaks file descriptors under load. Always call await client.aclose() in the lifespan shutdown block.
Can I use synchronous httpx in Celery workers?
Yes. httpx.Client (the sync variant) works fine in Celery prefork workers. Use it much like a requests.Session: call client.post() with the same JSON body. Async is not required and adds complexity in a sync worker context. If you are using Celery with the gevent or eventlet pool, you can use AsyncClient, but you need to ensure the event loop is managed correctly per worker.
What is the right way to parse the OmniScrape response without BeautifulSoup?
Use output_format: css_extractor with a css_selectors map. OmniScrape performs the extraction server-side and returns body['data']['css_extracted'] — a plain Python dict keyed by your selector names. Validate it with a Pydantic model before inserting into your database. This avoids shipping full HTML payloads and eliminates the local parsing step entirely.
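The validation step could look like the sketch below. It uses a standard-library dataclass in place of Pydantic so it runs with no extra dependencies, and it assumes the three selector names from Pattern A (title, price, sku) and that price arrives as text such as "$1,299.00". Both the `Product` class and the `parse_product` function are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class Product:
    title: str
    price: Decimal
    sku: str


def parse_product(fields: dict) -> Product:
    """Validate a css_extracted dict; raise ValueError on missing or bad fields."""
    missing = [k for k in ("title", "price", "sku") if not fields.get(k)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Assumption: price is text with an optional "$" and thousands separators.
    raw = str(fields["price"]).replace("$", "").replace(",", "").strip()
    try:
        price = Decimal(raw)
    except InvalidOperation as exc:
        raise ValueError(f"unparseable price: {fields['price']!r}") from exc
    return Product(
        title=str(fields["title"]).strip(),
        price=price,
        sku=str(fields["sku"]).strip(),
    )


product = parse_product({"title": " Widget ", "price": "$1,299.00", "sku": "W-42"})
print(product)
```

A ValueError here is a good trigger for dead-lettering the URL, since an empty or unparseable field often means the page layout or selector has changed.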
How do I handle OmniScrape 429 rate limit errors in an async batch?
First, lower your semaphore value — if you are hitting 429, your concurrency exceeds the plan limit. For transient 429s, implement exponential backoff with jitter: start at 2 seconds, multiply by 1.5 on each retry, cap at 60 seconds, add ±20% random jitter to avoid thundering herd. Do not retry more than 5 times. If 429s persist after reducing concurrency, contact OmniScrape support to confirm your plan's concurrency limit.
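The schedule described in this answer (2 s start, 1.5× growth, 60 s cap, ±20% jitter, at most 5 retries) can be sketched as a small helper. `backoff_delays` is a hypothetical name, and the injectable `rng` parameter exists only so the output can be made repeatable in tests.

```python
import random
from typing import Optional


def backoff_delays(retries: int = 5, base: float = 2.0, factor: float = 1.5,
                   cap: float = 60.0, jitter: float = 0.2,
                   rng: Optional[random.Random] = None) -> list[float]:
    """Return one jittered sleep duration per retry attempt."""
    rng = rng or random.Random()
    delays = []
    delay = base
    for _ in range(retries):
        # Scale the nominal delay by a random factor in [1 - jitter, 1 + jitter].
        delays.append(delay * rng.uniform(1 - jitter, 1 + jitter))
        delay = min(delay * factor, cap)  # the cap applies to the nominal delay
    return delays


delays = backoff_delays(rng=random.Random(42))
print([round(d, 2) for d in delays])
```

The nominal delays are 2, 3, 4.5, 6.75, and 10.125 s, so five retries never reach the 60 s cap; the cap only matters if you raise the retry count. In an async batch, sleep with `await asyncio.sleep(d)` between attempts so other tasks keep running.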
When should I switch from the synchronous scrape endpoint to the async job endpoint?
Switch to the async job endpoint (/v1/scrape/async) when js_rendering jobs consistently take more than 30 seconds. Holding httpx connections open for 60–120 seconds occupies pool slots and event loop resources. The async endpoint returns a job ID immediately; you poll for completion with exponential backoff. This is especially important in FastAPI services where long-held connections starve other handlers. Store job IDs in Redis for durability across worker restarts.