ScraperAPI

General-purpose web scraping with the api_key injected for you

Proxies ScraperAPI’s synchronous endpoint, injecting the secret api_key query parameter.

GET|POST|PUT /scraperapi → https://api.scraperapi.com/?api_key=<secret>&<your query>
  • GET scrapes the target URL.
  • POST / PUT submit a form/body to the target; the body is forwarded unchanged.
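
As an illustration of the POST case, the sketch below builds (but does not send) a form submission with Python's standard library. The target still goes in the query string and the form goes in the body. The function name, the gateway base URL, and the form fields are hypothetical; the bearer token matches the Authorization header used in the curl example on this page.

```python
import urllib.request
from urllib.parse import urlencode


def make_post_request(gateway, token, target, form):
    """Build a POST to /scraperapi whose body is a URL-encoded form."""
    # The target URL is a query parameter, not part of the body.
    query = urlencode({"url": target})
    # The body is what the gateway forwards unchanged to ScraperAPI.
    body = urlencode(form).encode("ascii")
    return urllib.request.Request(
        f"{gateway}/scraperapi?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical values, for illustration only.
req = make_post_request(
    "https://gateway.example.com", "TOKEN",
    "https://example.com/login", {"user": "alice", "q": "a b"},
)
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would send it; a PUT differs only in the method argument.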

All controls travel in the query string: the target url and any ScraperAPI parameter (render, country_code, device_type, premium, …). Apart from the api_key handling described below, your query string is passed through byte-for-byte.
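
Because the query string is forwarded as-is, it helps to percent-encode the target URL yourself so that its own & and = characters are not read as separate parameters. A minimal Python sketch, assuming a hypothetical gateway base URL and a helper name (build_scrape_url) that is not part of any API:

```python
from urllib.parse import urlencode

GATEWAY = "https://gateway.example.com"  # hypothetical base URL


def build_scrape_url(target, **params):
    """Return a /scraperapi URL with controls first and the target url last."""
    # Insertion order is preserved, so ScraperAPI controls precede url.
    # urlencode percent-encodes every value, including ':', '/', '?', '&'
    # and '=' inside the target, so the target stays a single parameter.
    query = urlencode({**params, "url": target})
    return f"{GATEWAY}/scraperapi?{query}"


print(build_scrape_url(
    "https://example.com/search?q=a&page=2",
    render="true",
    country_code="us",
))
```

Placing url last mirrors the ordering preference noted for the injected api_key; whether that ordering matters for your own parameters is an assumption here, not something this page states.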

What the gateway injects and strips

  • api_key is injected first, ahead of your parameters — ScraperAPI wants its own parameters before url so a partially-encoded target URL can’t swallow them.
  • A client-supplied api_key query param is dropped (in any encoding), so the injected key can’t be overridden.
  • The x-sapi-api_key header (the header form of the key) is dropped as well.
  • The Authorization header is stripped. This matters here: with keep_headers=true, ScraperAPI forwards your headers to the scraped site, so leaving the gateway token on would leak it to a third party.

All other headers (including x-sapi-* controls like x-sapi-render) pass through untouched.
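
The rules above can be pictured with a small Python sketch. This is not the gateway's actual code, only an illustration of the described behaviour: the function name sanitize is hypothetical, and treating "any encoding" as percent-encoding of the parameter name is an assumption.

```python
from urllib.parse import quote, unquote_plus

# Headers the page says are removed before the request goes upstream.
DROPPED_HEADERS = {"authorization", "x-sapi-api_key"}


def sanitize(raw_query, headers, secret):
    """Return (upstream_query, upstream_headers) after inject/strip."""
    kept = []
    for pair in raw_query.split("&"):
        if not pair:
            continue
        name = pair.split("=", 1)[0]
        # Decode only for the comparison, so "api%5Fkey" is caught too.
        # Pairs that are kept stay exactly as the client sent them.
        if unquote_plus(name) == "api_key":
            continue
        kept.append(pair)
    # The injected key goes first, ahead of every client parameter.
    query = "&".join([f"api_key={quote(secret, safe='')}"] + kept)
    # Header names are case-insensitive, so compare in lower case.
    out_headers = {
        k: v for k, v in headers.items() if k.lower() not in DROPPED_HEADERS
    }
    return query, out_headers


query, hdrs = sanitize(
    "api%5Fkey=mine&render=true&url=https%3A%2F%2Fexample.com",
    {"Authorization": "Bearer t", "x-sapi-render": "true"},
    "SECRET",
)
print(query)
print(hdrs)
```

In this sketch the client's encoded api_key is removed, the remaining pairs are untouched, and only the x-sapi-render header survives.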

Examples

curl "$GATEWAY/scraperapi?url=https://example.com" \
  -H "Authorization: Bearer $TOKEN"

Scrapes can be slow: ScraperAPI recommends allowing about 70 seconds. The gateway simply waits for the upstream response, so your own client timeout is the one that decides when a request gives up. Responses can also be large (ScraperAPI allows up to about 50 MB); the gateway streams them back in full.
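
One way to respect both points on the client side is to set a generous timeout and stream the body to disk instead of holding it in memory. A minimal sketch using only Python's standard library; the function names, the 70-second default, and the 64 KiB chunk size are illustrative choices, and the gateway URL and token are whatever your deployment uses.

```python
import shutil
import urllib.request
from urllib.parse import urlencode


def make_request(gateway, token, target):
    """Build the GET request for /scraperapi with the bearer token."""
    query = urlencode({"url": target})
    return urllib.request.Request(
        f"{gateway}/scraperapi?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def scrape_to_file(gateway, token, target, path, timeout=70):
    """Fetch target through the gateway and stream the body to path."""
    req = make_request(gateway, token, target)
    # urlopen's timeout applies to each blocking socket operation
    # (connect, each read), not to the transfer as a whole.
    with urllib.request.urlopen(req, timeout=timeout) as resp, \
            open(path, "wb") as out:
        # Copy in 64 KiB chunks so a large response never sits in memory.
        shutil.copyfileobj(resp, out, length=64 * 1024)
    return path
```

Calling scrape_to_file("https://gateway.example.com", "TOKEN", "https://example.com", "page.html") would perform the request; the host and token in that call are placeholders.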

The response is ScraperAPI’s, returned unchanged. For the full parameter set, see ScraperAPI’s documentation.