FYI, I am working on a blame tool for Wikipedia: http://fmtyewtk.blogspot.com/2009/10/mediawiki-git-word-level-blaming-one.ht... Thanks, mike
jamesmikedupont@googlemail.com wrote:
FYI, I am working on a blame tool for wikipedia http://fmtyewtk.blogspot.com/2009/10/mediawiki-git-word-level-blaming-one.ht... thanks, mike
Importing an article history into git for using git blame doesn't seem like a good method...
Also, are you aware of the WikiTrust extension http://wikitrust.soe.ucsc.edu/ ?
On Sat, Oct 17, 2009 at 7:39 PM, Platonides platonides@gmail.com wrote:
Importing an article history into git for using git blame doesn't seem like a good method...
Well, importing it just for blame is bad, I agree. I have read about wikiblame.
My purpose is to port Wikipedia over to git...
mike
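As a rough illustration of what porting an article history to git could involve, here is a minimal sketch that turns a list of revisions into a stream for `git fast-import`, one commit per revision. This is not necessarily how the tool in the blog post works. The function name, the `(author, timestamp, comment, text)` tuple layout, the `.wiki` file naming, and the placeholder email address are all assumptions made for the example.

```python
def fast_import_stream(title, revisions, branch="refs/heads/master"):
    """Build a git fast-import stream (bytes) with one commit per revision.

    revisions: list of (author, unix_timestamp, comment, text), oldest first.
    """
    # Hypothetical naming scheme: one file per article, spaces become underscores.
    path = title.replace(" ", "_") + ".wiki"
    out = []
    for mark, (author, timestamp, comment, text) in enumerate(revisions, start=1):
        # Placeholder email: wiki usernames carry no address.
        ident = f"{author} <unknown@example.invalid> {timestamp} +0000"
        message = comment.encode("utf-8")
        content = text.encode("utf-8")
        out.append(f"commit {branch}\n".encode("utf-8"))
        out.append(f"mark :{mark}\n".encode("utf-8"))
        out.append(f"author {ident}\n".encode("utf-8"))
        out.append(f"committer {ident}\n".encode("utf-8"))
        # "data" takes a byte count, so lengths are measured after encoding.
        out.append(b"data %d\n" % len(message) + message + b"\n")
        out.append(f"M 100644 inline {path}\n".encode("utf-8"))
        out.append(b"data %d\n" % len(content) + content + b"\n")
    return b"".join(out)
```

Writing the returned bytes to the standard input of `git fast-import` inside an initialized repository should create the commits, after which `git blame` can be run on the file. Titles or usernames containing characters that need quoting in the stream are not handled here.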
There is a student at UiO looking into alternate trust coloring schemes. John Erling /jeblad
Wikiquality-l mailing list Wikiquality-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikiquality-l
I was not able to find any examples. I think that such a blame and trust tool belongs in git, not in Wikipedia, because there are many other uses for it. mike
On Sat, Oct 17, 2009 at 8:33 PM, John Erling Blad john.erling.blad@jeb.no wrote:
There is a student at UiO looking into alternate trust coloring schemes. John Erling /jeblad
On Sat, Oct 17, 2009 at 12:40 PM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
I was not able to find any examples. I think that such a blame and trust tool belongs in git, not in wikipedia because there are many other usages for it. mike
So far, none of the blame tools implemented for the full history dump of a wiki has the features of an ideal blame tool.
Given an arbitrary string of text, an ideal blame tool can scan Wikipedia's entire history (and ideally the history of all WMF wikis) and tell you the authors of that text.
The design of such a system is essentially a search engine in which each revision is a page with an associated author. The engine works in two phases: it first finds all page blobs (a page blob contains all text across all revisions of an article) that contain all of the words being searched for, and then works forwards in time through the revisions of each matching article, revision by revision, to find the earliest authors. This isn't a complete spec, but it gives the general idea.
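The two-phase search described above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `Revision` class, the `earliest_authors` function, and the in-memory `pages` dictionary are hypothetical, matching is by plain case-insensitive substring, and only the first revision containing the whole query string is reported per article rather than an author for each word.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    timestamp: int  # hypothetical: any sortable value would do
    author: str
    text: str


def earliest_authors(pages, query):
    """Map each matching page title to (timestamp, author) of the earliest
    revision that contains the full query string.

    pages: dict of title -> list of Revision objects (any order).
    """
    needle = query.lower()
    words = needle.split()
    results = {}
    for title, revisions in pages.items():
        # Phase 1: the "page blob" is all text from all revisions of the article.
        blob = " ".join(rev.text for rev in revisions).lower()
        if not all(word in blob for word in words):
            continue  # some query word never appeared in this article
        # Phase 2: walk forwards in time and stop at the first match.
        for rev in sorted(revisions, key=lambda r: r.timestamp):
            if needle in rev.text.lower():
                results[title] = (rev.timestamp, rev.author)
                break
    return results
```

A real system would replace the linear scan in phase 1 with an inverted index over the page blobs, so that only candidate articles have their revisions examined in phase 2.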
On Sat, Oct 17, 2009 at 9:50 PM, Brian J Mingus Brian.Mingus@colorado.edu wrote:
So far all of the implementations of blame tools for the full history dump of a wiki do not have the features of an ideal blame tool.
Given an arbitrary string of text an ideal blame tool can scan Wikipedia's entire history - and ideally the history of all WMF wikis - and tell you the authors of that text.
The design of such a system is essentially a search engine where each revision is a page with an associated author. The engine works iteratively, first finding all page blobs (where a page blob contains all text across all revisions for an article) that contain all of the words being searched for, and then iteratively working forwards in time on the revisions of that article in an effort to find the earliest authors. This isn't a complete spec, but it gives the general idea.
Nice description.
Well, imagine the problem of finding and removing some copyrighted code from the Linux kernel, or some bug from a piece of software. We need to make sure that git has these features.
The implementation we have in WikiTrust is that when you click on a word of a version, you are sent to the diff where the word originates.
We have two demos up, and we are adding other Wikipedias to these demos as our servers permit...
To see the demos, go to https://addons.mozilla.org/en-US/firefox/addon/11087 and install the WikiTrust add-on. Then you can browse the Italian or Portuguese Wikipedia, and it should work (it.wikimedia.org and pt.wikimedia.org).
Some notes on this demo:
It is slow, because there is some back-and-forth between UCSC and Wikimedia Foundation servers. If this were running at the WMF, it would be way faster.
Again, as it is not running at the Foundation, we are not notified of new edits. So sometimes, we don't have an analysis for the revision you request. In that case, we fetch what we need and we analyze the revision. You should be able to see the results of the analysis after ten seconds or so.
Luca
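To make the idea of sending a clicked word to the revision where it originates more concrete, here is a minimal sketch of word-level origin tracking. It is not WikiTrust's actual algorithm: it simply diffs consecutive revisions word by word with Python's standard `difflib` and carries the originating revision id along with every unchanged word. The function name `word_origins` and the `(rev_id, text)` input format are assumptions for the example, and moved or reverted text is not treated specially.

```python
import difflib


def word_origins(revisions):
    """Return [(word, origin_rev_id), ...] for the latest revision.

    revisions: list of (rev_id, text) tuples, oldest first.
    """
    tracked = []  # (word, origin rev_id) pairs for the revision processed so far
    for rev_id, text in revisions:
        new_words = text.split()
        old_words = [word for word, _ in tracked]
        matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged words keep the revision that introduced them.
                updated.extend(tracked[i1:i2])
            elif tag in ("replace", "insert"):
                # New or rewritten words originate in the current revision.
                updated.extend((word, rev_id) for word in new_words[j1:j2])
            # "delete": the words were dropped, so nothing is carried over.
        tracked = updated
    return tracked
```

Given the origin revision id of a clicked word, a front end could then link to the diff between that revision and its predecessor.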
Thank you very much, this is quite interesting.
Now imagine if you had a faster way to get the realtime changes to the database: git.
mike
On Sun, Oct 18, 2009 at 12:37 AM, Luca de Alfaro luca@dealfaro.org wrote:
The implementation we have in WikiTrust is that when you click on a word of a version, you are sent to the diff where the word originates.
We have two demos up, and we are adding other Wikipedias to these demos as our servers permit...
To see the demos, go to https://addons.mozilla.org/en-US/firefox/addon/11087 and install the WikiTrust add-on. Then, you can browse the Italian or Portuguese Wikipedia, and it should work (it.wikimedia.org and pt.wikimedia.org).
Some notes on this demo:
It is slow, because there is some back-and-forth between UCSC and Wikimedia Foundation servers. If this were running at the WMF, it would be way faster.
Again, as it is not running at the Foundation, we are not notified of new edits. So sometimes, we don't have an analysis for the revision you request. In that case, we fetch what we need and we analyze the revision. You should be able to see the results of the analysis after ten seconds or so.
Luca
So are we! http://wikitrust.soe.ucsc.edu/
You can download it and use it on your own wiki, or in a couple of weeks we will have demos up for various Wikipedias.
Best,
Luca
On Sat, Oct 17, 2009 at 9:50 AM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
FYI, I am working on a blame tool for wikipedia
http://fmtyewtk.blogspot.com/2009/10/mediawiki-git-word-level-blaming-one.ht... thanks, mike