No wonder my wiki is so popular when measured by bandwidth. It's all brand-name search engines sniffing up those links to non-existent pages. I'm not sure what ought to be done.
==> logs/radioscanningtw.jidanni.org/http.2880196/access.log <==
74.6.20.76 - - [05/Jul/2007:19:35:33 -0700] "GET /index.php?title=Talk:%E5%8F%B0%E4%B8%AD%E7%B8%A3%E8%AD%A6%E5%AF%9F%E5%B1%80%E6%9D%B1%E5%8B%A2%E5%88%86%E5%B1%80&action=edit HTTP/1.0" 200 4896 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
On 7/5/07, jidanni@jidanni.org <jidanni@jidanni.org> wrote:
> No wonder my wiki is so popular when measured by bandwidth. It's all brand-name search engines sniffing up those links to non-existent pages. I'm not sure what ought to be done.
> ==> logs/radioscanningtw.jidanni.org/http.2880196/access.log <==
> 74.6.20.76 - - [05/Jul/2007:19:35:33 -0700] "GET /index.php?title=Talk:%E5%8F%B0%E4%B8%AD%E7%B8%A3%E8%AD%A6%E5%AF%9F%E5%B1%80%E6%9D%B1%E5%8B%A2%E5%88%86%E5%B1%80&action=edit HTTP/1.0" 200 4896 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
robots.txt:
Disallow: /index.php
Assuming you're using URL rewriting, this will have the robots request only actual wiki pages, not edit pages/history pages/etc. You may also wish to blacklist Special:Random.
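A quick way to check what such a rule would block is Python's standard urllib.robotparser. This is only a sketch under assumptions: the /wiki/ article prefix, the "User-agent: *" line, and the Special:Random rule are illustrative and not taken from any real configuration, and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a wiki whose articles are rewritten to /wiki/
# (the /wiki/ prefix is an assumption; substitute your own rewrite target).
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php
Disallow: /wiki/Special:Random
"""


def make_parser(text):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def crawlable(parser, path, agent="Slurp"):
    """Return True if `agent` may request `path` under the parsed rules."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    rp = make_parser(ROBOTS_TXT)
    for path in (
        "/index.php?title=Talk:Example&action=edit",  # edit link
        "/wiki/Talk:Example",                          # rewritten article URL
        "/wiki/Special:Random",                        # explicitly disallowed
    ):
        print(path, crawlable(rp, path))
```

Under these rules the edit link and Special:Random are refused while the rewritten article URL stays crawlable. Note that the same prefix match would also block index.php/Title style URLs, so this only makes sense once articles live under a different prefix.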
"S" == Simetrical writes:
S> robots.txt:
S> Disallow: /index.php
S> Assuming you're using URL rewriting, this will have the robots
S> request only actual wiki pages, not edit pages/history pages/etc.
Nice solution. (But links that other sites made to your pages before rewriting, which use the old /index.php URLs, will no longer be indexed by search engines (minor issue). Also if rewriting was so cool, then it would be the default and one wouldn't need to follow instructions to implement it.) But mainly: those links should have rel="nofollow" in their <a> tags. OK, thanks.
On 7/9/07, jidanni@jidanni.org <jidanni@jidanni.org> wrote:
> Also if rewriting was so cool, then it would be the default and one wouldn't need to follow instructions to implement it.
It would be the default except it requires configuration that we can't necessarily do. We can't, for instance, predict what target directory you may want to use. If you install to /w/ and rewrite to /wiki/, great, but maybe you install to / or /mywiki/ or /wiki/ or want to rewrite to / or /mywiki/ or /w/. All we can do, which we *do* do if possible, is use index.php/ as the pseudo-directory to rewrite to, because we can guarantee that exists and should be unused by anything else.
rel="nofollow" in those links might be reasonable anyway, of course.
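To make the nofollow idea concrete, here is a minimal sketch of a link renderer that marks links to missing pages. It is not MediaWiki's code: render_wiki_link, its parameters, the /wiki/ prefix, and the "new" class are all hypothetical names chosen for illustration.

```python
from html import escape
from urllib.parse import quote


def render_wiki_link(title, exists, article_prefix="/wiki/", script="/index.php"):
    """Return an <a> tag for `title`; links to missing pages get rel="nofollow"."""
    label = escape(title)
    slug = quote(title.replace(" ", "_"), safe=":")
    if exists:
        # Existing page: plain link to the (assumed) rewritten article URL.
        href = article_prefix + slug
        return f'<a href="{escape(href)}">{label}</a>'
    # Missing page: link to the edit form, and hint crawlers not to follow it.
    href = f"{script}?title={slug}&action=edit"
    return f'<a href="{escape(href)}" class="new" rel="nofollow">{label}</a>'


if __name__ == "__main__":
    print(render_wiki_link("Main Page", exists=True))
    print(render_wiki_link("Talk:Example", exists=False))
```

rel="nofollow" is only a hint that well-behaved crawlers may honor, so it complements a robots.txt rule rather than replacing it.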
On 09/07/07, Simetrical <Simetrical+wikilist@gmail.com> wrote:
> On 7/9/07, jidanni@jidanni.org <jidanni@jidanni.org> wrote:
> > Also if rewriting was so cool, then it would be the default and one wouldn't need to follow instructions to implement it.
> It would be the default except it requires configuration that we can't necessarily do. We can't, for instance, predict what target directory you may want to use. If you install to /w/ and rewrite to /wiki/, great, but maybe you install to / or /mywiki/ or /wiki/ or want to rewrite to / or /mywiki/ or /w/. All we can do, which we *do* do if possible, is use index.php/ as the pseudo-directory to rewrite to, because we can guarantee that exists and should be unused by anything else.
Not to mention that the best approach, using an Alias, normally requires access to httpd.conf, and I would hope that MediaWiki couldn't write to *that*, nor perform the restart required to make it take effect.
Rob Church
wikitech-l@lists.wikimedia.org