Forwarded from Betsy Megas, who's subscribed under a different
address. Please read.
Austin
---------- Forwarded message ----------
From: Betsy Megas <Betsy(a)strideth.com>
Date: Tue, Aug 11, 2009 at 10:26 PM
Subject: Election vote strikes
To: "foundation-l-owner(a)lists.wikimedia.org"
<foundation-l-owner(a)lists.wikimedia.org>
Due to an error in a script that was used to generate the list of
authorized voters for this election, roughly 300 votes were cast by
users who were not qualified based on the posted election rules
(requiring that voters have made at least 600 edits before 01 June
2009 across Wikimedia wikis and have made at least 50 edits between 01
January and 01 July 2009). Those votes will be removed by the
election committee prior to the election being tallied by Software in
the Public Interest.
Once this is completed, the election results will be tallied and
announced shortly thereafter.
Questions regarding why a vote was struck can be addressed to
board-elections(a)lists.wikimedia.org.
For the committee,
Dvortygirl
On Wed, Aug 12, 2009 at 3:58 AM, Tim Starling<tstarling(a)wikimedia.org> wrote:
[snip]
> Brianna Laugher was receptive to the idea of having
> Wikimedia projects hotlink or cache images from galleries.
So there have been a number of statements against doing something like
this, but (unsurprisingly) I don't think they have been stated strongly
enough or hit all the arguments that I think are important. So please
humour me for a moment.
I think hotlinking images is something we ought not to do for several
independent reasons.
(1) There is no reason to do so.
The so far cited reasons for GLAM interest in this are Branding and Statistics.
Hotlinking or caching would do nothing to improve branding: most of
the time a hotlinked image looks just like a local one to users.
Whatever branding we'd find acceptable could be accomplished as well
or better locally.
Statistics gathering is something that is interesting to many of our
contributors; we can and should have good statistics for everything
(and caching would be useless for statistics), so hotlinking should
create no improvement.
GLAMs have spent money building their own databases, yes. But ours is
an additional copy, our problem, and not a significant cost.
The only other reason I can see for hotlinking would be collecting
resellable marketing data on Wikipedia viewers, and I do not believe
that this would be a use we'd wish to support. (I'm not making a value
judgement here: if that is indeed someone's goal, that's fine; only
that it's not one the WMF would intentionally support.) See below for more…
(2) Hotlinking has enormous privacy problems
When the rubber hits the road, NDAs are ineffective: people make
mistakes. Governments and ISPs snoop. Privacy policies are often bad
and allow things which would horrify people. Hotlinking would greatly
increase readers' exposure to information leaks.
Some random museum has no business knowing that I loaded the pederasty
article just because some art was placed in it.
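To make the leak concrete, here is a sketch of the kind of request a reader's browser would send directly to a third-party image host when a page hotlinks one of its images (the hostname and path are hypothetical); the host learns the reader's IP address and, via the Referer header, the exact article being read:

```http
GET /images/artwork.jpg HTTP/1.1
Host: media.example-museum.org
Referer: https://en.wikipedia.org/wiki/Pederasty
User-Agent: Mozilla/5.0 ...
```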
Wikimedia's handling of reader privacy ought to be leading-edge,
trend-setting stuff. That would be a nearly impossible goal if media
were inlined from many third-party sites.
(3) It significantly reduces the atomicity of the Wikimedia projects.
Today they are *things*: objects you can obtain (± temporary problems
with the dump system), archive, data-mine, etc. I have complete (though
not currently up to date) copies of Wikipedia in all languages along
with all images and other media, as well as the core software. Not just
partial bits and pieces, but the whole thing.
External links are a clear boundary between what is in Wikipedia and
what isn't... and the stuff *in* Wikipedia is all freely licensed
and available for download, all tracked with a common revision control
system, with common (if bad…) metadata.
External dependency would lower reliability and make the projects
generally less tractable. It would become more difficult to retain
backups and historical records.
Perhaps some day Wikipedia will be too big to maintain any singular
copy of for purely technical reasons, but we are a long long way away
from that now!
So basically I think there are a bunch of practical and principled
problems with hotlinking, and that hotlinking isn't actually needed.
Really good upload systems that preserve metadata and provide good
links to external resources? Statistics collection? These are good
and uncontroversial things. They don't require hotlinking.
Cheers—
As perhaps some people here will recall, I was always skeptical of Knol's
ability to enter the collaborative knowledge space. The reasons discussed
here, including SJ's mentions of the issues of structuring public
collaboration, are no doubt valid, but to me -- and of course it may be said
that this is my Lawyer Vision(tm) kicking in -- the primary problem for Knol
was lack of compatibility with the existing dominant free licenses used by
Wikimedia projects and others. In short, it was difficult for Knol to build
on the work of other collaborative freely licensed projects without, as a
practical matter, violating those licenses. (We saw countless examples of
people attempting to import Wikipedia content into Knol, for example, and
played a bit of whack-a-mole with those folks.)
But to me the takeaway from this error of Knol's licensing design is not
that Knol can't work -- it's that it actually could work, if properly
thought through. So my view right now is the Wikimedia community can't be
complacent about Knol's apparent failure -- properly adjusted and
redesigned, it could have quite an impact on us. We're going to have to
continue to give serious attention to all the issues, from quality to
community to legality, that give us an advantage in terms of fueling
creative collaboration, as we go forward.
The next Knol can't be relied upon to make the same mistakes.
--Mike
This paper is making the rounds:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1446862
"This is a pilot study of the use of “Flash cookies” by popular
websites. We find that more than 50% of the sites in our sample are
using flash cookies to store information about the user. Some are
using it to "respawn" or re-instantiate HTTP cookies deleted by the
user. Flash cookies often share the same values as HTTP cookies, and
are even used on government websites to assign unique values to users.
Privacy policies rarely disclose the presence of Flash cookies, and
user controls for effectuating privacy preferences are lacking. "
Inside it says:
"We encountered Flash cookies on 54 of the top 100 sites. […]
Ninety-eight of the top 100 sites set HTTP cookies (only wikipedia and
wikimedia.org lacked HTTP cookies in our tests). These 98 sites set a
total of 3,602 HTTP cookies."
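The "respawn" technique the paper describes can be illustrated with a minimal, self-contained JavaScript sketch. Plain objects stand in for the browser's cookie jar and the Flash local shared object (LSO); all names here are hypothetical, not from the paper:

```javascript
// Conceptual sketch of cookie "respawning": a site stores the same
// tracking ID in both an HTTP cookie and a Flash local shared object
// (LSO). Clearing HTTP cookies does not touch the LSO, so script on
// the next visit can copy the value back.
// Plain objects stand in for the cookie jar and the LSO (hypothetical).

const httpCookies = { uid: "abc123" }; // what the user can clear
const flashLSO = { uid: "abc123" };    // survives normal cookie deletion

function respawnCookies(cookies, lso) {
  // Re-create any tracking value that is present in the LSO
  // but missing from the HTTP cookie jar.
  for (const key of Object.keys(lso)) {
    if (!(key in cookies)) {
      cookies[key] = lso[key];
    }
  }
  return cookies;
}

// The user clears their HTTP cookies...
delete httpCookies.uid;

// ...but on the next page load the site "respawns" the ID.
respawnCookies(httpCookies, flashLSO);
console.log(httpCookies.uid); // "abc123" — the deleted cookie is back
```

This is why the paper notes that deleting HTTP cookies alone does not defeat this kind of tracking; the LSO has to be cleared separately through the Flash Player settings.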
Kudos to the WMF for avoiding gratuitous reader tracking. Other
people *are* paying attention to the privacy implications of this kind
of user-invisible behavior.
(Adding to what Michael said.)
Yes, we're trying to catch up. May will be posted tomorrow, and June is being worked on right now.
These reports started off as simple staff activity reports (when I joined the Foundation two years ago), and when the staff was small they were fairly easy to put together quickly. Over time, we've added in new structured info such as the comScore Media Metrix data, lists of media interviews, fundraising totals, etc. That takes a little longer to gather -- for example, we don't have finalized fundraising totals until 20 days following the close of month, and comScore data can take even longer. Plus, growth in staff means it takes that much longer to collect and synthesize everyone's input.
Meantime, we've been working towards a parallel data-driven monthly report -- it would include comScore data, financial information, and metrics aimed at assessing participation and quality. The financial information for that report is now regularly produced on a monthly basis, and we are pretty close to having good-enough reach, quality and participation metrics regularly produced as well, thanks to Erik Zachte and others. The goal of the data-driven report is to focus less on staff activity, and more on a high-level assessment of the overall health of the Foundation and its projects.
Once we have the data-driven report in regular production, we can rethink reporting overall. For example, we might decide to publish the monthly data report + a richer text-based staff activities report once a quarter. That would mean the activities report could be less focused on small incremental changes (the staff worked on X, the staff continued Y) and more focused on providing greater detail about a small number of high-priority initiatives, e.g., the strategy project, the usability project, the bookshelf project, etc. Or, we could publish the data report, plus a lightweight, simple monthly activities report focused purely on staff work -- new hires and that kind of thing.
I definitely sympathize with people wanting to be connected and aware of what's going on with the staff. I'd be curious to know what kinds of information people find most useful of what we publish today, and what you'd like to see more of -- and also what you think of the other channels we publish through, e.g., the tech blog, the Foundation blog, press releases, etc. And I do also appreciate your patience as we get caught up on this most recent backlog :-)
Thanks,
Sue
------Original Message------
From: Benjamin Lees
Sender: foundation-l-bounces(a)lists.wikimedia.org
To: Wikimedia Foundation Mailing List
ReplyTo: Wikimedia Foundation Mailing List
Sent: Aug 11, 2009 6:58 PM
Subject: Re: [Foundation-l] Report to the Board April 2009
On Tue, Aug 11, 2009 at 6:46 PM, Sue Gardner <sgardner(a)wikimedia.org> wrote:
> Report to the Wikimedia Foundation Board of Trustees
>
> Covering: April 2009
> Prepared by: Sue Gardner, Executive Director, Wikimedia
> Foundation
> Prepared for: Wikimedia Foundation Board of Trustees
>
I really like these reports, but they'd be more useful if they came sooner
after the events they describe. Will you be able to catch up to a <1-month
delay in the near future? (I wouldn't mind if the reports for May, June, and
July were condensed, if that's what it took.)
_______________________________________________
foundation-l mailing list
foundation-l(a)lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Hey All--
We've made some modifications to the process and time line for the
Donation Button Enhancement project.
You can find and comment on them here:
http://meta.wikimedia.org/wiki/Fundraising_2009/Donation_buttons_upgrade
I appreciate all the feedback so far.
-Rand
--
Rand Montoya
Head of Community Giving
Wikimedia Foundation
www.wikimedia.org
Email: rand(a)wikimedia.org
Phone: 415.839.6885 x615
Fax: 415.882.0495
Cell: 510.685.7030
“At some future time, I hope to have something witty,
intelligent, or funny in this space.”
Onion sourcing. That would be a nice improvement on simple cite styles.
On Tue, Aug 11, 2009 at 12:10 PM, Gregory Crane<gregory.crane(a)tufts.edu> wrote:
> There are various layers to this onion. The key element is that books and
> pages are artifacts in many cases. What we really want are the logical
> structures that splatter across pages.
And across and around works...
> First, we have added a bunch of content -- esp. editions of Greek and Latin
> sources -- to the Internet Archive holdings and we are cataloguing editions
> that are the overall collection, regardless of who put them there. This goes
> well beyond the standard book catalogue records -- we are interested in the
> content not in books per se. Thus, we may add hundreds of records for a
Is there a way to deep link to a specific page-image from one of these
works without removing it from the Internet Archive?
> We would like to have useable etexts from all of these editions -- many of
> which are not yet in our collections. Many of these are in Greek and need a
> lot of work because the OCR is not very good.
So bad OCR for them exists, but no usable etexts?
> To use canonical texts, you need book/chapter/verse markup and you need
> FRBR-like citations ... deep annotations... syntactic analyses, word sense,
> co-reference...
These are nice features, but perhaps you can develop a clean etext
first, and overlay this metadata in parallel or later on.
> My question is what environments can support contributions at various
> levels. Clearly, proofreading OCR output is standard enough.
>
> If you want to get a sense of what operations need ultimately to be
> supported, you could skim
> http://digitalhumanities.org/dhq/vol/3/1/000035.html.
That's a good question. What environments currently support OCR
proofreading and translation, and direct links to page-images of the
original source? This is doable, with no special software or tools,
via wikisource (in multiple languages, with interlanguage links and
crude paragraph alignment) and commons (for page images). The pages
could also be stored in other repositories such as the Archive, as
long as there is an easy way to link out to them or transclude
thumbnails. [maybe an InstaCommons plugin for the Internet Archive?]
That's quite an interesting monograph you link to. I see six main
sets of features/operations described there. Each of them deserves a
mention in Wikimedia's strategic planning. Aside from language
analysis, each has significant value for all of the Projects, not just
wikisource.
OCR TOOLS
* OCR optimization: statistical data, page layout hints
* Capturing page layout and logical structures
CROSS-REFERENCING
* Quote, source, and plagiarism identification
* Named entity identification (automatic for some entities? hints)
* Automatic linking (of URLs, abbreviated citations, &c.), markup projection
TEXT ALIGNMENT
* Canonical text services (chapter/verse equivalents)
* Version analysis between versions
* Translation alignment
TRANSLATION SUPPORT
* Automated translation (seed translations, hints for humans)
* Translation dictionaries (on mouseover?)
CROSS-LANGUAGE SEARCHING
* Cross-referencing across translations
* Quote identification across translations
LANGUAGE ANALYSIS
* Word analysis: word sense discovery, morphology
* Sentence analysis: syntactic, metrical (poetry)
> Greg
>
> John Vandenberg wrote:
>>
>> On Tue, Aug 11, 2009 at 3:00 PM, Samuel Klein<meta.sj(a)gmail.com> wrote:
>>
>>>
>>> ...
>>> Let's take a practical example. A classics professor I know (Greg
>>> Crane, copied here) has scans of primary source materials, some with
>>> approximate or hand-polished OCR, waiting to be uploaded and converted
>>> into a useful online resource for editors, translators, and
>>> classicists around the world.
>>>
>>> Where should he and his students post that material?
>>>
>>
>> I am a bit confused. Are these texts currently hosted at the Perseus
>> Digital Library?
>>
>> If so, they are already a useful online resource. ;-)
>>
>> If they would like to see these primary sources pushed into the
>> Wikimedia community, they would need to upload the images (or DjVu)
>> onto Commons, and the text onto Wikisource where the distributed
>> proofreading software resides.
>>
>> We can work with them to import a few texts in order to demonstrate
>> our technology and preferred methods, and then they can decide whether
>> they are happy with this technology, the community, and the potential
>> for translations and commentary.
>>
>> I made a start on creating a Perseus-to-Wikisource importer about a year
>> ago...!
>>
>> Or they can upload the djvu to Internet Archive.. or a similar
>> depositories... and see where it goes from there.
>>
>>
>>>
>>> Wherever they end up, the primary article about each article would
>>> surely link out to the OL and WS pages for each work (where one
>>> exists).
>>>
>>
>> Wikisource has been adding OCLC numbers to pages, and adding links to
>> archive.org when the djvu files came from there (these links contain
>> an archive.org identifier). There are also links to LibraryThing and
>> Open Library; we have very few rules ;-)
>>
>> --
>> John Vandenberg
>>
>
>
Mark W. wrote:
> It looks to me like Austin did exactly what he should've so I'm not
> sure why you're implying he made an incorrect decision. Exactly what
> did he do wrong in your opinion?
Austin may have done exactly right, but his lack of responsiveness - just
as with Arbcom - just as with Cary - made it an issue. As it currently
stands the list moderator has blocked three of my posts on different
threads, and is also ignoring my direct request to be unblocked.
Here's an idea: Arbcom - respond to case subject's questions and comments
and maybe organize some case-centered discussion. Here's an idea: Mailing
list creators - respond to requests for new list creation. Here's an idea:
Mailing list moderators - respond to requests for clarification about
blocks and state blocks openly.
Nathan wrote:
> Stevertigo is more interested in the debate, in my opinion, than any
> particular outcome.
I do love to argue, but this comment is not accurate. The truth is I just
like it better when people don't act like dicks. This includes angels,
supermodels, Presidents, founders, Arbcom members, foundation bureaucrats,
and myself (I'm admittedly feeling a bit forced into the concept).
> If you find that people don't take your side even after you have "utterly
> destroyed them, point by point" then perhaps you should pick a new approach.
I understand that people don't like having their pet concepts taken apart.
I mean nothing personal by it - simply separate from your defunct concept,
admit cordially that I might have a point, and there will be no issue.
Sources of bullshit will often think that the bull-fighter is evil. "What
of it? At least the [bullshit] is disposed of." (after Mencken)
-Stevertigo
Google has put a preview online of a new version of their search
engine, with a new infrastructure:
http://googlewebmastercentral.blogspot.com/2009/08/help-test-some-next-gene…
You can test it here:
http://www2.sandbox.google.com/
Things are a lot faster, and the results differ from the current
version. I'm wondering if this will have any impact on the number of
visitors on our projects, because so many of our visitors come through
Google links.
-- Hay
Many of us talk/think a lot about how to reduce conflict in our
projects. For those who're interested, I was recently pointed towards
two relevant videos -- I'm posting them here in hopes they might be
useful for others.
So, for whoever's interested, here is:
* Donnie Berkholz's recent talk at Open Source Bridge, titled
"Assholes are killing your project." Donnie, a council member at
Gentoo Linux, advocates establishment of a friendly culture including
a code of conduct, and maintenance of that culture via simple
mechanisms for problem reporting and resolution, plus a clear focus on
mission. Unfortunately the audio here isn't terrific, so it's not
super-easy to follow. http://blip.tv/file/2444432
* Two open source engineers at Google, Ben Collins-Sussman and Brian
Fitzpatrick, with a talk called "How Open Source Projects Survive
Poisonous People." Upshot: preserve your project's attention and
focus, build a healthy community and fortify it with good community
practices, be on the lookout for problems, and disinfect where
necessary, including marginalizing/ignoring difficult people, and
booting them out if you need to.
http://www.youtube.com/watch?v=ZSFDm3UYkeE
Personally, I have also gotten some good value out of Bill Eddy's book
High-Conflict People in Legal Disputes. Bill is a mediator, lawyer
and former social worker who found himself repeatedly encountering
destructive people in his work, and not knowing how to disarm them or
disengage from them. He wrote High Conflict People to help other
mental health and legal professionals recognize, understand and work
productively with various types of conflict-seeking personalities --
but IMO its usefulness extends way beyond the legal system; it's
relevant for our work too.
http://www.amazon.com/gp/product/0981509053/ref=cm_li_v_cr_self?tag=linkedi…
Thanks,
Sue
--
Sue Gardner
Executive Director
Wikimedia Foundation
415 839 6885 office
Imagine a world in which every single human being can freely share in
the sum of all knowledge. Help us make it a reality!
http://wikimediafoundation.org/wiki/Donate