hello,
there will be some downtime Tuesday morning as changes are being made to
the knams network. the toolserver IPs will also change during this time.
- river.
Hello all,
all active accounts will expire on 7 December. If you are still interested
in your account, please send an email to ts-not-expire(a)daniel.baur4.info; NOT
to this mailing list! (If you send it here, you will lose your account
instantly ;)). Please include your account name in this email. If you like, you
can send the poll (see below) with it - but that is optional.
I will send a list of non-confirmed accounts every Monday.
So if you have a moment, please fill in this little poll. Of course your data
will only be used in anonymous form.
Sincerely,
DaB.
#START
*I'm
[ ] younger than 18
[ ] between 18 and 25
[ ] between 25 and 35
[ ] older than 35
*I'm
[ ] female
[ ] male
*I live in
(please put country-name here)
* I use my account for
(more than one "x" possible)
[ ] running a bot
[ ] making statistics
[ ] doing research
[ ] helping maintain Commons
[ ] other
* I can program in
(more than one "x" possible)
[ ] php
[ ] perl
[ ] python
[ ] c (or c++ or c#)
[ ] java
[ ] bash
[ ] html
[ ] javascript
*I use
(for my tools)
(more than one "x" possible)
[ ] subversion
[ ] jira
[ ] fisheye
[ ] mysql
*I work mainly in
[ ] wikipedia
[ ] wiktionary
[ ] wikinews
[ ] wikisource
[ ] wikiquote
[ ] wikiversity
[ ] commons
#END
--
This email should be digitally signed with the PGP key 0x2D3EE2D42B255885.
Please note that unsigned emails can be arbitrarily forged, so pay
attention to signatures.
--
wp-blog.de
Hello gurus,
we (Kolossos and I) are working on a new update for Wikipedia-World and
Vorlagenauswertung (Templatetiger):
http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Georeferenzierung/Wikipe…
http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Vorlagenauswertung/en
For the last few years we have used the dumps from the Wikimedia
Foundation. We download the dump for the main languages, unpack the big
files on the toolserver, and scan them with a Perl script for
geocoordinates and templates.
Problem: this takes a lot of time, generates a lot of traffic, and uses a
lot of disk space on the toolserver. Also, an update is only possible when
a new dump is available.
Now we want to create a new workflow: we use an up-to-date list of the
articles in a language, and the Perl script fetches the text of each
article from the toolserver.
It works very well, but at the moment reading the text of one article
takes 2 seconds. Fetching all 677,000 German articles at 2 seconds each
would take 15.6 days. That is too much! With the old workflow I needed no
more than 30 minutes to scan the complete DE dump.
I have tried different ways to get the text of an article with my Perl script:
1.)
http://localhost/~daniel/WikiSense/WikiProxy.php?wiki=de.wikipedia.org&titl…
2.)
http://tools.wikimedia.de/~kolossos/rss-test/fopen.php?pro=wikipedia&lang=d…
but each time the Perl script needs 2 seconds on average. With the second
PHP script I can see in a browser the time taken to fetch the text, and it
is mostly 0.2 seconds (a factor of 10 lower!). At that speed we would need
only 1.5 days for a scan, which would be fine.
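A side note on the measurements (my observation, not stated in the mail): the built-in time() used in the script below has only one-second resolution, so a fetch that really takes 0.2 s will be reported as 0 or 1 seconds. A minimal sketch using the core module Time::HiRes for sub-second timing; the 0.2 s sleep is just a stand-in for a request:

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my $t0 = [gettimeofday];            # wall-clock start, microsecond resolution
# ... fetch an article here ...
select( undef, undef, undef, 0.2 ); # stand-in for a 0.2 s HTTP request
my $elapsed = tv_interval($t0);     # elapsed time as fractional seconds
printf "second\t%.3f\n", $elapsed;
```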
Here is the Perl code of my script, which fetches the article text. I hope
a Perl guru can help us reduce the time.
Thanks for any help!
Stefan Kühn
sub get_article_text_from_web {
    use URI::Escape;
    use LWP::UserAgent;

    my $title         = $_[0];
    my $page_id       = 0;
    my $revision_id   = 0;
    my $revision_time = 0;
    my $text          = "";

    print "get\t" . $title . "\n";
    my $test1 = time();
    print localtime($test1) . "\n";

    # http://localhost/~daniel/WikiSense/WikiProxy.php?wiki=$lang.wikipedia.org&t…
    # escape only the title, not the whole URL; uri_escape() returns the
    # escaped string (its return value was previously discarded)
    my $url =
        'http://localhost/~daniel/WikiSense/WikiProxy.php?wiki=de.wikipedia.org&titl…'
        . uri_escape($title);
    # http://www.webkuehn.de

    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get($url);
    $response->is_success or die "$url: ", $response->status_line;
    my $result = $response->content;

    if ($result) {
        $text = $result;
        print "ok\t" . $title . "\n";
        my $test2 = time();
        my $test3 = $test2 - $test1;
        print localtime($test2) . "\n";
        print "second\t" . $test3 . "\n\n";
    } else {
        print "No result $title\n";
    }

    return ( $title, $page_id, $revision_id, $revision_time, $text );
}
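One likely cause of the 2-second overhead (an assumption on my part, not confirmed in the mail) is that every call creates a fresh LWP::UserAgent, so each article pays the full TCP connection-setup cost. A sketch that reuses one agent with HTTP keep-alive; `build_url` and `fetch_article` are hypothetical helper names, and the query parameter name `title` is assumed from the truncated URL above:

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);
use LWP::UserAgent;

# One shared agent; keep_alive => 1 enables a connection cache, so requests
# to the same host reuse the open TCP connection instead of reconnecting.
my $ua = LWP::UserAgent->new( keep_alive => 1 );

# Build the proxy URL for one article title ("title" parameter is assumed).
sub build_url {
    my ($title) = @_;
    return 'http://localhost/~daniel/WikiSense/WikiProxy.php'
         . '?wiki=de.wikipedia.org&title=' . uri_escape($title);
}

# Hypothetical helper: fetch one article's text through the shared agent.
sub fetch_article {
    my ($title) = @_;
    my $response = $ua->get( build_url($title) );
    return $response->is_success ? $response->content : undef;
}
```

With a shared keep-alive agent the per-article time should move closer to the 0.2 s the browser reports, since the browser also keeps its connection to the PHP script open between requests.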
> River Tarnell wrote:
> we have bought the domain toolserver.org with a view to using it for the
> toolserver, e.g. http://toolserver.org/~username/. if anyone has any
> comments (for/against), please say now.
Woohooo ^_^ this looks nice
> From: Martin Peeks <martinp23(a)googlemail.com>
> Hemlock is heavily overloaded
Would disabling use of interwiki.py on the TS help reduce the load?
It's known to be a memory hog.
(And don't tell me that script can't be run on a user's computer; I've
done that for months.)
--DarkoNeko
hello,
we have bought the domain toolserver.org with a view to using it for the
toolserver, e.g. http://toolserver.org/~username/. if anyone has any
comments (for/against), please say now.
- river.
so not many people seem interested in moving their projects to the stable
server. are the requirements too strict, or do people just not see the need?
- river.
> so not many people seem interested in moving their projects to the stable
> server. are the requirements too strict, or do people just not see the need?
>
> - river.
I do like the idea, especially of having maintainer teams.
Yet currently I have nothing to offer and no suggestions to make
... which may change in a few months ...
Greetings - Purodha