Short question:
I'm getting inconsistent 403 Forbidden errors when trying to read wikipedia.org content via PHP's file_get_contents().
Getting the same result:
==================================
root@bling:~# php --run 'print(file_get_contents("http://en.wikipedia.org/wiki/Outer_Space"));'
Warning: file_get_contents(http://en.wikipedia.org/wiki/Outer_Space): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden in Command line code on line 1
Call Stack:
    0.0002      69840   1. {main}() Command line code:0
    0.0002      69912   2. file_get_contents() Command line code:1
root@bling:~#
==================================
Hmmmm.
My suspicion is that it's just wanting a User-Agent header. In which case a) check that you're operating within the guidelines, and b) if you believe you are, just give it one that accurately describes what you're doing. For example, this works for me:
==================================
root@bling:~# php --run 'print(file_get_contents("http://en.wikipedia.org/wiki/Outer_Space%22,FALSE,stream_context_create(arra..." => array ("header"=> "User-Agent: Casey Bisson search API")))));'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr"> <head> [...snip...] </div> <!-- Served by srv28 in 0.325 secs. --></body></html>
root@bling:~#
==================================
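In case the one-liner is hard to read, here is the same idea written out as a small script. Treat it as a rough sketch rather than anything definitive: the helper names build_user_agent_context() and fetch_with_user_agent() are made up for illustration, and it assumes PHP 7 or later (scalar type hints and short array syntax). The only real calls are stream_context_create() and file_get_contents().

```php
<?php
// Hypothetical helper (not part of PHP): builds a stream context whose
// HTTP requests carry the given User-Agent header.
function build_user_agent_context(string $userAgent)
{
    return stream_context_create([
        "http" => [
            "header" => "User-Agent: " . $userAgent,
        ],
    ]);
}

// Hypothetical helper (not part of PHP): fetches $url using that context.
// Returns the response body as a string, or false on failure
// (for example, if the server still answers 403).
function fetch_with_user_agent(string $url, string $userAgent)
{
    $context = build_user_agent_context($userAgent);
    return file_get_contents($url, false, $context);
}
```

You would then call something like fetch_with_user_agent("http://en.wikipedia.org/wiki/Outer_Space", "Casey Bisson search API") and check the result against false before printing it. Whatever string you pass should honestly describe your client.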
All the best, Nick.