Short question:
I'm getting inconsistent 403 Forbidden errors when trying to read
wikipedia.org content via PHP's file_get_contents().
Getting the same result:
==================================
root@bling:~# php --run 'print(file_get_contents("http://en.wikipedia.org/wiki/Outer_Space"));'
Warning: file_get_contents(http://en.wikipedia.org/wiki/Outer_Space): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden
 in Command line code on line 1
Call Stack:
0.0002 69840 1. {main}() Command line code:0
0.0002 69912 2. file_get_contents() Command line code:1
root@bling:~#
==================================
Hmmmm.
My suspicion is that it just wants a User-Agent header. In which case a) check
that you're operating within Wikipedia's guidelines, and b) if you believe you
are, just give it one that accurately describes what you're doing. For example,
this works for me:
==================================
root@bling:~# php --run 'print(file_get_contents("http://en.wikipedia.org/wiki/Outer_Space",FALSE,stream_context_create(array("http" => array ("header"=> "User-Agent: Casey Bisson search API")))));'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
<head>
<head>
[...snip...]
</div>
<!-- Served by srv28 in 0.325 secs. --></body></html>
root@bling:~#
==================================
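If you're doing this from a script rather than a one-liner, something along these lines should do the same job. It's only a rough sketch: make_context() and fetch_with_user_agent() are names I've made up for illustration, the 10-second timeout is an arbitrary choice, and the User-Agent string in the usage comment is a placeholder you'd replace with something that honestly describes your tool. It uses the "user_agent" option of the http stream context instead of a raw "header" entry, which should have the same effect as the one-liner above.

```php
<?php
// Sketch only. The helper names, the timeout and the example
// User-Agent string are assumptions, not part of the original message.

// Build an HTTP stream context that sends a descriptive User-Agent.
function make_context(string $userAgent)
{
    if (trim($userAgent) === '') {
        throw new InvalidArgumentException('User-Agent must not be empty');
    }
    return stream_context_create(array(
        'http' => array(
            'user_agent' => $userAgent, // sent as the User-Agent: header
            'timeout'    => 10,         // seconds; arbitrary value
        ),
    ));
}

// Fetch a URL with that context. Returns the body, or null on failure
// (for example when the server answers 403 Forbidden).
function fetch_with_user_agent(string $url, string $userAgent): ?string
{
    // @ suppresses the PHP warning; failure is signalled by false.
    $body = @file_get_contents($url, false, make_context($userAgent));
    return $body === false ? null : $body;
}

// Usage (placeholder User-Agent, replace with your own):
//   $html = fetch_with_user_agent(
//       'http://en.wikipedia.org/wiki/Outer_Space',
//       'ExampleSearchTool/0.1 (contact: you@example.org)'
//   );
//   if ($html === null) { /* handle the failure */ }
```

Returning null rather than letting the warning through just makes the failure easier to check for in calling code; if you'd rather see the warning text, drop the @.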
All the best,
Nick.