The message below was sent to the Board today. Would implementing some
sort of automatic copyvio checker be feasible?
The second part of the email suggests it is too difficult to contact
us about copyright violations. With the addition of the "contact us"
link in the sidebar, I thought this would stop being a problem. Is
there any other way of making it easier?
Angela.
---- Forwarded message ----
> In regards to the continuing copyright issues because some members do
> not respect copyrights, I might recommend implementing something like
> what http://copyscape.com uses. From what I can tell, they use a Google
> API to do a search of text found in one page to see what other pages
> have the same text. Using a similar methodology, you could flag new
> pages that are substantially like pages that exist on the Internet for
> further review. While this wouldn't tackle all of the copyright
> violations, it would go a long way towards making it easier to weed out
> blatant violations like the one I reported.
>
> The issue of some individuals having absolutely no respect for
> copyrights and plagiarism is a serious problem that Wikipedia needs to
> address. Some people seem to think that Wikipedia is their personal
> means of bringing down copyright laws and "freeing" content. This is a
> shame because these individuals threaten the long term possibilities
> for Wikipedia.
>
> On a related note, it should be easier to report copyright violations
> on the Wikipedia website. The current set up is tremendously burdensome
> to figure how to report copyright violations. There needs to be a
> simple link from all pages to a simple contact form that allows one to
> report a violation without having any knowledge of how Wikipedia works.
> Doing this would put members on notice that Wikipedia isn't a rogue
> operation where anything goes and that it takes copyright issues
> seriously.
>
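As a rough illustration of the "flag new pages that are substantially
like pages that exist on the Internet" idea, here is a minimal sketch in
Python. It assumes the candidate pages have already been fetched by some
web search step, which is not shown; the function names, the 5-word
shingle size and the 0.5 threshold are all hypothetical choices, not
anything Copyscape or MediaWiki is known to use.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(new_page, known_page, k=5):
    """Fraction of the new page's shingles that also occur in known_page."""
    new = shingles(new_page, k)
    if not new:
        return 0.0
    return len(new & shingles(known_page, k)) / len(new)


def needs_review(new_page, candidates, threshold=0.5, k=5):
    """Return URLs of candidate pages that the new page substantially overlaps.

    candidates maps URL -> page text (assumed to come from a search step).
    """
    return [url for url, text in candidates.items()
            if containment(new_page, text, k) >= threshold]
```

A page flagged this way would still need a human to look at it; the
score only says how much of the new text also appears elsewhere, not who
copied from whom.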
I'd like to request that selected parts of the user table be included
in the dumps for statistical purposes, namely the user_id, user_name
and user_options fields. It would be useful to have this data to
collect statistics on what parts of the preferences are actually being
used, and being able to compare this with the user_id (and user_name)
field would enable checking for what settings regular editors have and
how many change their default settings and so on.
And before people get all up in arms about this then no, this does not
include your email or the hash of your password (those are in
user_email and user_password respectively).
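To give an idea of the kind of statistic this would allow, here is a
small sketch in Python. It assumes that user_options is stored as a
newline-separated list of key=value pairs and that the dump has already
been read into (user_id, user_name, user_options) tuples; both the
storage format and the helper names are assumptions made for
illustration.

```python
from collections import Counter


def parse_user_options(blob):
    """Parse a newline-separated 'key=value' blob into a dict (assumed format)."""
    options = {}
    for line in blob.splitlines():
        key, sep, value = line.partition("=")
        if sep and key:
            options[key] = value
    return options


def non_default_counts(rows, defaults):
    """Count, per option, how many users hold a value different from the default.

    rows is an iterable of (user_id, user_name, user_options) tuples.
    """
    counts = Counter()
    for _user_id, _user_name, blob in rows:
        for key, value in parse_user_options(blob).items():
            if defaults.get(key) != value:
                counts[key] += 1
    return counts
```

Joining the result against edit counts per user_id would then show which
settings regular editors actually change.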
I think our MediaZilla configuration could use some reviewing. I'm
able to change it myself since Tim made me one of the admins of it
recently (yay!), but I'd like to get some input first before I do any
major re-arrangements.
First of all, I'd like to remove the patch keyword; instead we can use
advanced search to search for:
* Attachment is patch => is equal to => 1
* Attachment is obsolete => is not equal to => 1
This would of course mean that we would have to start marking patches
that don't fit our requirements for inclusion as obsolete (and
explaining why).
Second, I think our products/component setup is a bit of a mess. I
already split Images/Uploading into two components (Images and
Uploading), but a lot more could be done. The components are also a bit
confusing: for example, problems with search on the Wikimedia sites
should not go to MediaWiki => Search but to MediaWiki extensions =>
General/Unknown, since Lucene search is an extension. I don't really
have any ideas about how to sort the whole thing out, though.
Just a note; the 'validation' feature will most likely not be turned on
on en.wikipedia.org when we upgrade, since it's not currently in a
usable state.
Problems include, but are not limited to:
* In various places it tries to load metadata for *every* revision of
the page. This would be fatal on the actual Wikipedia, where there are
pages with tens of thousands of revisions. There are likely other severe
performance and scalability issues with it.
* The 'management' interface for defining survey options is not locked
off properly, and is very hard to use if you do get to it.
* Lack of HTML-safety on the UI interface: as a quick hack I added
htmlspecialchars() guards, but things really should be changed to use
wikitext where appropriate; several of the UI messages are currently
displaying raw HTML tags.
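For readers unfamiliar with the HTML-safety point, the idea behind the
htmlspecialchars() guards is to escape every user-supplied value before
it is placed into the page. The sketch below shows the same idea in
Python, with html.escape() standing in for PHP's htmlspecialchars();
render_message() and its template syntax are hypothetical and are not
part of the validation code.

```python
import html


def render_message(template, **params):
    """Fill a UI message template, escaping each parameter before insertion."""
    safe = {name: html.escape(str(value), quote=True)
            for name, value in params.items()}
    return template.format(**safe)


# A topic name containing markup is shown as text instead of being
# interpreted by the browser.
print(render_message("Topic: {topic}", topic="<script>alert(1)</script>"))
```

Note that this only escapes the parameters, not the template itself;
messages that are meant to carry formatting would still need to be
rendered from wikitext, as suggested above.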
If anybody would like to work on it further that would be spiffy;
otherwise it will remain in limbo indefinitely.
-- brion vibber (brion @ pobox.com)
Sorry, pressed shift-enter accidentally :)
> * I wanted to get it fixed up last week, but I had an important talk to give, and my hard drive crashed, and I got a PowerBook, which together turned out to be rather
I just wanted to congratulate you on joining our corporate policy with Apple :) Your Sony looked so dead in Berlin already ;-)
Domas
> * I wanted to get it fixed up last week, but I had an important talk to give, and my hard drive crashed, and I got a PowerBook, which together turned out to be rather distracting ;-)
> Problems include, but are not limited to:
>
> * In various places it tries to load metadata for *every* revision of
> the page. This would be fatal on the actual Wikipedia, where there are
> pages with tens of thousands of revisions. There are likely other
> severe performance and scalability issues with it.
Yup. It's still in my famous "OK-it-kinda-works-now-we-wait" stage.
> * The 'management' interface for defining survey options is not locked
> off properly, and is very hard to use if you do get to it.
I wasn't sure who should get access to it. Limiting access is a single line in the code. As for it being "hard to use", it is basically used *once* to set up topics, and then not at all (ideally) or very sparsely (to add/delete topics). I don't see that as a reason to keep it "in limbo".
> * Lack of HTML-safety on the UI interface: as a quick hack I added
> htmlspecialchars() guards, but things really should be changed to use
> wikitext where appropriate; several of the UI messages are currently
> displaying raw HTML tags.
Well, they didn't show that raw HTML when I checked 'em in. I'm pretty sure of that one. I'll see if I can fix that.
> If anybody would like to work on it further that would be spiffy;
> otherwise it will remain in limbo indefinitely.
I should have some time during this week, although I'll have to set up mysql/apache/yadayada on my new HDD.
Anyway, we can turn this on with a few days (weeks?) delay, just in case. No need to rush, at least not a technical one ;-)
Magnus
To whom it may concern:
I moved the Enotif helpdesk page from FiverAlpha Wiki to its new location
http://www.wikipage.de/en/index.php/Enotif
E-mail notification 3.x is working at its best on this wiki.
This means you can play with **all** features as mentioned on my
documentation page http://meta.wikipedia.org/wiki/Enotif
Tim (not Tim Starling):
thank you for setting up the software, solving the php mail() envelope
problem, and hosting the wikis.
Tom
I've removed the automatic substitution of – and — for
hyphen sequences from 1.5, as it seems to simply cause a never-ending
sequence of breakage of links and markup. The special cases that were
added to try to keep it from breaking (some) links and (some) markup of
course made it behave fairly inconsistently, and on the whole it seems
to have been causing trouble far outweighing the utility of making some
dashes slightly more attractive.
Perhaps some future parser that operates in a clean fashion instead of
layering regexes on top of each other will be able to do this in a
consistent, non-breaking manner. For now it doesn't seem worth it.
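To illustrate the kind of breakage described, here is a small Python
sketch. It is not the removed MediaWiki code: the mapping of "---" and
"--" to dashes, the function names and the link pattern are all
assumptions chosen to show how a blanket substitution corrupts link
targets, and how a special case only protects the patterns it happens to
know about.

```python
import re


def naive_dashes(text):
    """Replace '---' with an em dash and '--' with an en dash everywhere."""
    text = text.replace("---", "\u2014")
    return text.replace("--", "\u2013")


# Hypothetical guard: wiki links of the form [[...]] and bare http(s) URLs.
LINK = re.compile(r"(\[\[.*?\]\]|https?://\S+)")


def link_aware_dashes(text):
    """Apply the substitution only outside [[...]] links and bare URLs."""
    parts = LINK.split(text)
    # With one capturing group, odd-indexed parts are the matched links.
    return "".join(part if i % 2 else naive_dashes(part)
                   for i, part in enumerate(parts))


sample = "a -- b [[Foo--Bar]] http://x.org/a--b"
print(naive_dashes(sample))       # link target and URL are both altered
print(link_aware_dashes(sample))  # only the free-standing '--' is altered
```

Even the guarded version leaves other markup (templates, HTML comments,
table syntax) unprotected, which is the sort of inconsistency that piles
up when each case is patched with another regex.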
-- brion vibber (brion @ pobox.com)