PHP cURL: site is responding that JavaScript is not enabled. How does it know?
$curl = curl_init("http://example.com/");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Store cookies received from the server in this file
curl_setopt($curl, CURLOPT_COOKIEJAR, 'cookie.txt');
// Headers copied from the browser's request
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
    "Host: example.com",
    "Connection: keep-alive",
    "Upgrade-Insecure-Requests: 1",
    "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language: en-US,en;q=0.8"
));
curl_setopt($curl, CURLOPT_VERBOSE, true);
$result = curl_exec($curl);
echo $result;
The response:
<html><title>You are being redirected...</title> <noscript>Javascript is required. Please enable javascript before you are allowed to see this page.</noscript>
I'm reusing the same headers that the browser is sending to the site.
How can the site know that I'm not a real browser? The error occurs when loading the main page, so it's not as if there is any authentication going on.
In fact, JavaScript is not needed for the majority of the page's content. It can be loaded as standard HTML, but for some reason, if JavaScript is not enabled, the entire page doesn't load.
Any ideas? (Sorry, I can't share the real site name.)
To my knowledge, a minimum of 2 requests is needed to know whether a client has JavaScript enabled or not. Since you are using cURL, and can set up the "original" request yourself, the response does not make sense unless the website checks the request headers like a hound dog.
As @zerkms mentioned, Chrome sends more headers than your cURL request:
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8,nl;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Cookie:cookiedata
DNT:1
Host:example.com
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36
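There are a couple of mismatches. First, Host:example.com does not have a space after the colon. Secondly, cURL should take care of the Host header itself through the curl_init() function. Your request is also missing DNT, Cache-Control, and Accept-Encoding/languages.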
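To make the comparison above less error-prone, the two header lists can be diffed by name. The following is only a minimal sketch: headerNames() and missingHeaders() are hypothetical helpers written for this illustration, and the arrays simply repeat the headers quoted in the question and in the Chrome listing.

```php
<?php
// Hypothetical helper: extract lower-cased header names from "Name: value" lines.
function headerNames(array $lines): array
{
    $names = [];
    foreach ($lines as $line) {
        // Split on the first colon only, since values may contain colons.
        $parts = explode(':', $line, 2);
        $names[] = strtolower(trim($parts[0]));
    }
    return $names;
}

// Hypothetical helper: header names the browser sent that the cURL request did not.
function missingHeaders(array $curlHeaders, array $browserHeaders): array
{
    return array_values(array_diff(headerNames($browserHeaders), headerNames($curlHeaders)));
}

// Headers from the question's CURLOPT_HTTPHEADER array (values shortened).
$curlHeaders = [
    "Host: example.com",
    "Connection: keep-alive",
    "Upgrade-Insecure-Requests: 1",
    "User-Agent: Mozilla/5.0",
    "Accept: text/html",
    "Accept-Language: en-US,en;q=0.8",
];

// Headers from the Chrome listing (values shortened).
$browserHeaders = [
    "Accept:text/html",
    "Accept-Encoding:gzip, deflate, sdch",
    "Accept-Language:en-US,en;q=0.8,nl;q=0.6",
    "Cache-Control:max-age=0",
    "Connection:keep-alive",
    "Cookie:cookiedata",
    "DNT:1",
    "Host:example.com",
    "Upgrade-Insecure-Requests:1",
    "User-Agent:Mozilla/5.0",
];

print_r(missingHeaders($curlHeaders, $browserHeaders));
```

This compares header names only, not values, so a differing Accept-Language value would not show up. If Accept-Encoding is added to the cURL request, setting CURLOPT_ENCODING instead of a raw header lets cURL decompress the response.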
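In theory, the server cannot detect the client's settings, but it can inspect every header.
If, for example, I were to build such software, I would accumulate enough data to recognise the normal browser headers. If that data is missing from a request, I could detect whether it is a real user request or not.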
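As a rough illustration of that idea, a server could score each request by how many of the headers a typical browser sends are present. This is a guess at how such software might work, not how the site in question actually does it: browserHeaderScore() and its list of expected headers are assumptions made for this sketch.

```php
<?php
// Hypothetical server-side check: the fraction of "expected" browser headers
// present in a request. The expected list is an assumption for this sketch.
function browserHeaderScore(array $requestHeaders): float
{
    $expected = ['accept', 'accept-encoding', 'accept-language', 'connection', 'host', 'user-agent'];
    // Header names are case-insensitive, so normalise them before comparing.
    $present = array_map('strtolower', array_keys($requestHeaders));
    $hits = count(array_intersect($expected, $present));
    return $hits / count($expected);
}

// A sparse request, such as a default command-line client might send.
$sparse = ['Host' => 'example.com', 'User-Agent' => 'curl', 'Accept' => '*/*'];
echo browserHeaderScore($sparse), "\n";
```

In a real handler the array could come from getallheaders(). A low score would only be a hint, since a client can send any headers it likes.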