[Foundation-l] Re: Research access to logs
Jerome Jamnicky
jeronimwp at yahoo.com.au
Tue Aug 16 04:55:59 UTC 2005
Tobias Denninger wrote:
> Hello..
>
> I think it could be useful to compute the probability that an article B is
> read given that the same reader read another article A within a short
> timeframe beforehand. Based on these probabilities, suggestions could be made
> to the reader of a specific article about which other articles might also be
> interesting (a kind of collaborative filtering, like Amazon's "Customers who
> bought this book also bought.."). Subscribed users could be offered
> personalized recommendations based on how probable it is that an article is
> of interest to that specific user, given the articles they have read. I'd be
> interested in implementing that idea, so as a first step I'd like a sample
> log file of a few megabytes.
>
> Tobias
>
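For concreteness, the statistic proposed above could be estimated roughly as
follows. This is only a sketch under stated assumptions: the function name
cooccurrence_probabilities, the (reader, timestamp, article) tuple layout, and
the 600-second window are all hypothetical, and the logs carry no reader ID
as such, so "reader" here stands in for whatever identifier one is allowed to
use.

```python
from collections import defaultdict

def cooccurrence_probabilities(views, window=600.0):
    """Estimate P(B read within `window` seconds after A | A read).

    `views` is an iterable of (reader, timestamp_seconds, article) tuples.
    This layout is an assumption made for illustration, not a log format.
    """
    by_reader = defaultdict(list)
    for reader, ts, article in views:
        by_reader[reader].append((ts, article))

    a_count = defaultdict(int)                        # reads of article A
    ab_count = defaultdict(lambda: defaultdict(int))  # reads of A followed by B

    for events in by_reader.values():
        events.sort()  # chronological order per reader
        for i, (t_a, a) in enumerate(events):
            a_count[a] += 1
            followers = set()  # count each B at most once per read of A
            for t_b, b in events[i + 1:]:
                if t_b - t_a > window:
                    break
                if b != a:
                    followers.add(b)
            for b in followers:
                ab_count[a][b] += 1

    return {a: {b: n / a_count[a] for b, n in bs.items()}
            for a, bs in ab_count.items()}

# Made-up example data: three readers, timestamps in seconds.
views = [
    ("r1", 0, "Potato"), ("r1", 60, "Esoterica"),
    ("r2", 0, "Potato"), ("r2", 5000, "Esoterica"),  # outside the window
    ("r3", 10, "Potato"), ("r3", 20, "Tuber"),
]
probs = cooccurrence_probabilities(views, window=600)
```

Suggestions for a reader of article A would then be the articles B with the
highest estimated values in probs[A].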
If you haven't already, please read the privacy policy carefully, and
also this thread, where somebody made a similar request for a similar
purpose:
http://mail.wikipedia.org/pipermail/wikitech-l/2005-July/thread.html#30917
A single line from one of the Squid server log files looks like this (it is
one line in the log and is only wrapped here by the mail client):
1124167686.523 210 12.34.56.78 TCP_MISS/200 2962 GET
http://en.wikipedia.org/wiki/Special:Search?search=Potato&go=Go -
PARENT_HIT/207.142.131.200 text/html [Host:
en.wikipedia.org\r\nUser-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1;
en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6\r\nAccept:
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nAccept-Language:
en-us,en;q=0.5\r\nAccept-Encoding: gzip,deflate\r\nAccept-Charset:
ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nKeep-Alive: 300\r\nConnection:
keep-alive\r\nReferer: http://en.wikipedia.org/wiki/Esoterica\r\n]
[HTTP/1.0 200 OK\r\nDate: Tue, 16 Aug 2005 04:48:06 GMT\r\nServer:
Apache\r\nX-Powered-By: PHP/4.3.11\r\nContent-language: en\r\nVary:
Accept-Encoding,Cookie\r\nExpires: -1\r\nCache-Control: private,
must-revalidate, max-age=0\r\nContent-Encoding: gzip\r\nConnection:
close\r\nContent-Type: text/html; charset=utf-8\r\n\r]
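As a rough illustration of how such a line could be picked apart, here is a
minimal sketch. It assumes the layout is exactly what the sample shows: ten
space-separated fields followed by two bracketed header dumps in which CR and
LF appear as the literal characters \r and \n. The names LINE_RE,
parse_headers and parse_line, and the field labels, are my own and not
anything Squid defines. SAMPLE is the line above with most headers trimmed
for brevity.

```python
import re
from datetime import datetime, timezone

# Field labels are illustrative; the layout is inferred from the sample line.
LINE_RE = re.compile(
    r"^(?P<time>\d+\.\d+)\s+"                     # Unix time with milliseconds
    r"(?P<elapsed_ms>\d+)\s+"
    r"(?P<client>\S+)\s+"
    r"(?P<result>[A-Z_]+)/(?P<status>\d{3})\s+"   # e.g. TCP_MISS/200
    r"(?P<size>\d+)\s+"
    r"(?P<method>\S+)\s+"
    r"(?P<url>\S+)\s+"
    r"(?P<ident>\S+)\s+"
    r"(?P<hierarchy>[A-Z_]+)/(?P<peer>\S+)\s+"    # e.g. PARENT_HIT/207.142.131.200
    r"(?P<mime>\S+)\s+"
    r"\[(?P<req_headers>.*?)\]\s+"                # request headers
    r"\[(?P<resp_headers>.*)\]\s*$"               # response headers
)

def parse_headers(blob):
    """Split a header dump on the literal four characters \\r\\n."""
    headers = {}
    for part in blob.split(r"\r\n"):
        name, sep, value = part.partition(": ")
        if sep:  # skips the status line and the trailing \r
            headers[name.lower()] = value
    return headers

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.fromtimestamp(float(rec["time"]), tz=timezone.utc)
    for key in ("elapsed_ms", "status", "size"):
        rec[key] = int(rec[key])
    rec["req_headers"] = parse_headers(rec["req_headers"])
    rec["resp_headers"] = parse_headers(rec["resp_headers"])
    return rec

# The sample line from this message, joined back together and abbreviated.
SAMPLE = (
    r"1124167686.523 210 12.34.56.78 TCP_MISS/200 2962 GET "
    r"http://en.wikipedia.org/wiki/Special:Search?search=Potato&go=Go - "
    r"PARENT_HIT/207.142.131.200 text/html "
    r"[Host: en.wikipedia.org\r\nKeep-Alive: 300\r\n"
    r"Referer: http://en.wikipedia.org/wiki/Esoterica\r\n] "
    r"[HTTP/1.0 200 OK\r\nServer: Apache\r\nContent-language: en\r\n\r]"
)

rec = parse_line(SAMPLE)
```

For the kind of analysis requested, the fields of interest would probably be
rec["time"], rec["url"] and the "referer" entry in rec["req_headers"].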
Note that this format may change in the future. Is there anything else
you need to know?
-Jerome