Archiving a page should be fairly safe as long as the archived copy is only
for internal use, that is, something like OTRS. If the archived copy is
republished, it _might_ be viewed as a copyright infringement.
Still, note that the archived copy can be used for automatic verification,
i.e. extract a quote and check it against a stored value, without
infringing any copyright. If a publication is withdrawn, it might be an
indication that something is seriously wrong with the page, and no matter
what the archived copy at WebCitation says, the page can't be trusted.
It's really a very difficult problem.
Jeblad
On 1. apr. 2012 14.08, "Helder" <helder.wiki(a)gmail.com> wrote:
Hello Wikidata Folks,
My name is Brent Hecht, and I've done a great deal of research on the
differences and similarities between the language editions. I was really
excited to hear about the Wikidata project moving forward, and I think
some of my research might be of assistance. I'd enjoy being able to help
the community make this important transition.
In particular, my experience navigating interlanguage link conflicts
might be able to help in Phase 1. Please let me know if there's anything
I can do over the short term or long term!
Some of my relevant papers:
[1] Bao, P., Hecht, B., Carton, S., Quaderi, M., Horn, M. and Gergle, D.
2012. Omnipedia: Bridging the Wikipedia Language Gap. CHI '12: 30th
International Conference on Human Factors in Computing Systems (2012).
[2] Hecht, B. and Gergle, D. 2010. The Tower of Babel Meets Web 2.0:
User-Generated Content and Its Applications in a Multilingual Context.
CHI '10: 28th International Conference on Human Factors in Computing
Systems (Atlanta, GA, 2010), 291--300.
[3] Hecht, B. and Gergle, D. 2009. Measuring Self-Focus Bias in
Community-Maintained Knowledge Repositories. Communities and
Technologies 2009: 4th International Conference on Communities and
Technologies (State College, PA, 2009), 11--19.
- Brent
Brent Hecht
Ph.D. Candidate in Computer Science
CollabLab: The Collaborative Technology Laboratory
Northwestern University
w: http://www.brenthecht.com
e: brent@u.northwestern.edu
The problem is that you usually don't know what happened. Often the article
lives on behind a paywall, and then you should (must!) use that one. To
republish an archived copy from our own repo would be a copyvio in many
jurisdictions.
The interesting question is (I guess) whether we can quote the important
fact in such a way that we don't commit a copyvio, yet make it possible to
validate the value.
Usually you can quote a fragment and also republish that, but you can't
copy the whole article and make that available for others.
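The fragment idea can be sketched in a few lines of Python (function names, the sample quote, and the hashing scheme are all my own invention, not anything Wikidata has specified): store only a fingerprint of the supporting quote, then later check that the live or archived page still contains that fragment, without ever republishing the page itself.

```python
import hashlib

def normalize(text):
    """Collapse whitespace and lower-case so trivial edits don't break the check."""
    return " ".join(text.lower().split())

def quote_fingerprint(quote):
    """Store only a hash of the supporting quote, not the copyrighted page."""
    return hashlib.sha256(normalize(quote).encode("utf-8")).hexdigest()

def still_supports(page_text, quote, stored_fingerprint):
    """True if the page still contains the quoted fragment and the quote matches our record."""
    if quote_fingerprint(quote) != stored_fingerprint:
        return False  # our own stored record was altered or corrupted
    return normalize(quote) in normalize(page_text)

# At citation time we store the fingerprint; later we re-check the page.
fp = quote_fingerprint("The melting point of gallium is 29.76 degrees C.")
page = ("... The melting point of gallium is 29.76 degrees C. "
        "That is just above room temperature ...")
print(still_supports(page, "The melting point of gallium is 29.76 degrees C.", fp))
```

If the page disappears or is rewritten, the check fails and the statement can be flagged for review, all without distributing the archived copy.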
Sorry for being such a pest and saying this! ;)
Jeblad
On 1. apr. 2012 21.52, "Platonides" <platonides(a)gmail.com> wrote:
Dear Lydia,
>Hmmm I have to confess I don't understand this completely and therefore
>can't give you an answer. Could you give me an example of what you are
>talking about and how you see Wikidata fitting in?
Hmmmmmmm :-) I am somewhat at a loss here. Let's start from some basics, then.
1) Is there a concise, consensual definition of what Wikidata is
expected to achieve?
2) Does the Wikidata project have a charter or terms of reference (ToR),
or a dedicated web site whose URL you could give? http://wikidata.org is
redirected to http://en.wikipedia.org/wiki/Main_Page
3) Is there a list of the needs of Wikimedia projects that Wikidata is
to address?
4) What is/are going to be the language(s) of Wikidata?
5) What is Wikidata's architectural framework? The WDE (whole digital
ecosystem)? Human communications? Internet services? User
applications? Wikimedia?
6) What are the normative choices, strategic objectives and
constraints imposed by the sponsors and the WMF?
7) How is the Wikidata project to liaise with its potential users'
representatives (depending on the response to (5))?
Thank you !
jfc
JFC Morfin wrote:
> 2. Since we have a W3C expert: what is the best document/book to get
> a comprehensive and clear (not too massive) documentation on the
> semantic web?
You surely don't want to know all about the semantic web - especially the
ontology stuff with OWL dialects and entailment regimes is far too
academic and won't be part of Wikidata anyway, because of its computational
complexity. In short, you should be *very sceptical* and
cautious every time you stumble upon anything that requires inference
rules. Even trivial inference rules such as those based on owl:sameAs
and rdf:type can be problematic in practice! The less inference you
assume, the better.
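The owl:sameAs hazard is easy to demonstrate without any RDF tooling. In this toy sketch (the triples and entity names are invented for illustration), a single wrong sameAs assertion merges two distinct entities, and a naive closure then spreads every statement of one onto the other:

```python
# Toy triples: (subject, predicate, object). One sameAs link is wrong:
# "Paris, Texas" has been conflated with "Paris, France".
triples = {
    ("Paris_France", "capitalOf", "France"),
    ("Paris_Texas", "locatedIn", "Texas"),
    ("Paris_Texas", "sameAs", "Paris_France"),  # the bad assertion
}

def sameas_closure(triples):
    """Naive union-find over sameAs links, then copy every statement across merged nodes."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "sameAs":
            parent[find(s)] = find(o)

    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}
    inferred = set()
    # Every statement about one member of a merged group now holds for all members.
    for s, p, o in triples:
        if p == "sameAs":
            continue
        for e in entities:
            if find(e) == find(s):
                inferred.add((e, p, o))
    return inferred

closed = sameas_closure(triples)
# The closure now claims the Texan town is the capital of France:
print(("Paris_Texas", "capitalOf", "France") in closed)
```

One mistaken link is enough to poison every merged entity, which is exactly why minimal inference is the safer default for crowd-sourced data.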
I can recommend the "Linked Data Patterns" book by Dodds and Davis:
http://patterns.dataincubator.org/book/
Jakob
--
Verbundzentrale des GBV (VZG)
Digitale Bibliothek - Jakob Voß
Platz der Goettinger Sieben 1
37073 Goettingen - Germany
+49 (0)551 39-10242
http://www.gbv.de
jakob.voss(a)gbv.de
Since some claim that Wikidata's model of representing statements with
their references and the diversity of knowledge in the world outside is
just a way to bring Wikipedia's values to the world of data, Conservapedia
has launched an alternative approach to the challenges we are tackling. In
good open source manner we hope that we can share code, and we wish
Conservadata all the best.
<http://blog.tommorris.org/post/20277406012/conservadata>
Cheers,
Denny
--
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hi,
I was just about to suggest the same: there is no simple 1-to-1
relationship between articles in different languages. If Wikidata only
supports 1-to-1, this relation will end up fuzzier, because people
will also put "more-or-less" mappings into it. Gregor wrote:
> Would changing the model to:
> Wikipedialink = (Title, LanguageId, Relation, Badge?)
> where relation can be
> broader match
> close match
> exact match
> narrower match
>
> etc. help in expressing these situations? I believe this could prevent
> problems later on.
I'd rather limit relations to "exact", "broader" and "narrower", because
"close match" is very fuzzy. Even "exact match" won't be 100% in
practice, so where is the border between "exact" and "close" (or
additional "far match", "almost exact match", etc.)? The two directions
"broader" and "narrower" are easier to grasp, both in user interfaces
and for retrieval.
I took the liberty of modifying the current model, as it's a wiki ;-)
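Gregor's tuple plus the restricted relation set could be modeled along these lines (a sketch only; none of these type or field names come from actual Wikidata code, and the Bell pepper / Paprika example is my own):

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class Relation(Enum):
    EXACT = "exact match"
    BROADER = "broader match"    # target covers more than the source article
    NARROWER = "narrower match"  # target covers less than the source article

@dataclass(frozen=True)
class WikipediaLink:
    """Wikipedialink = (Title, LanguageId, Relation, Badge?)"""
    title: str
    language_id: str
    relation: Relation = Relation.EXACT
    badge: Optional[str] = None  # optional Badge field, e.g. "featured"

# A non-1-to-1 situation: the German "Paprika" covers both plant and fruit,
# so it is a broader match for the English "Bell pepper" article.
links: List[WikipediaLink] = [
    WikipediaLink("Bell pepper", "en"),
    WikipediaLink("Paprika", "de", Relation.BROADER),
]

# Retrieval can then filter by direction instead of guessing from fuzzy matches.
exact_only = [l for l in links if l.relation is Relation.EXACT]
print(len(exact_only))
```

The point of the enum is that the set of relations stays closed: user interfaces can offer exactly three choices instead of a free-text field that invites "almost exact" variants.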
Cheers
Jakob
Ivan,
Thanks for the reply. Yes, as many people as can be found should be brought
in on subject indexing. The UDC is a faceted classification; LCC (Library of
Congress Classification) and DDC (Dewey Decimal Classification) are not. UDC
is a multilingual taxonomy (facet=language) and these other two are not. To
me there's little to no competition with UDC.
But that's about which taxonomies are deployed -- I support any and all.
It's the implementing technology that's key here, and it happily doesn't
require expert panels and study groups. SKOS can handle multiple facets --
via its Collection objects. I view SKOS as a stepping stone to ISO Topic
Maps; I suspect SKOS was invented by the W3C as a response to Topic Map
functionality not in OWL, but perhaps you can enlighten me further.
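As a rough illustration of the facets-via-Collections idea (plain Python rather than real RDF, and the concept identifiers are merely illustrative UDC-style notation): each skos:Collection analogue groups concepts into a facet without implying any broader/narrower relation between its members, while broader/narrower links remain ordinary concept-to-concept edges.

```python
# Concepts: id -> (preferred label, broader concept id or None)
concepts = {
    "udc:53":     ("Physics", None),
    "udc:539":    ("Nuclear physics", "udc:53"),
    "udc:=111":   ("English", None),
    "udc:=112.2": ("German", None),
}

# skos:Collection analogue: named groupings that carry the facet.
# Membership says nothing about hierarchy among the members.
collections = {
    "facet:discipline": ["udc:53", "udc:539"],
    "facet:language":   ["udc:=111", "udc:=112.2"],
}

def facets_of(concept_id):
    """Return the facet collection(s) a concept belongs to."""
    return [name for name, members in collections.items() if concept_id in members]

def broader_chain(concept_id):
    """Walk skos:broader links from a concept up to a top concept."""
    chain = []
    while concept_id is not None:
        label, parent = concepts[concept_id]
        chain.append(label)
        concept_id = parent
    return chain

print(facets_of("udc:=111"))      # the language facet
print(broader_chain("udc:539"))   # Nuclear physics -> Physics
```

A multilingual, faceted scheme like UDC maps onto this quite directly: one collection per facet, one broader chain per discipline hierarchy.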
Basically I have these questions.
1. Is WP going to be empowered with subject indexes as a core feature? Or is
this better as an extension?
2. What's the relative priority of subject indexes over other Wikidata
requirements (i.e. after semantic infoboxes)?
3. Should SKOS or a very close equivalent be subsumed into the WOM?
The answer to 3 probably should echo the relationship between the WOM and
OWL, whatever that is intended to be.
Thanks,
John
----------------------------------------------------------------------------
Date: Sat, 31 Mar 2012 11:22:43 +0200
From: Ivan Herman <ivan(a)w3.org>
To: "Discussion list for the Wikidata project."
<wikidata-l(a)lists.wikimedia.org>
Subject: Re: [Wikidata-l] Wiki Subject Indexes
On Mar 31, 2012, at 02:00, John McClure wrote:
> Hello,
> Can/should wiki subject indexes be a functional requirement of the
> wikidata project, or of some other? I think navigating a wiki today
> without a subject index is difficult, to say the least. A subject index
> seems such a critical component for information libraries! WP's portals
> are a nice step but still, users seem at the mercy of topical links
> inserted by authors of the portal.
> How much better it would be to have a taxonomy of subjects that can be
> associated with a page by ITS author so that the page can be found
> independently of portals.
I see a major issue with the user interface of this (though I agree with
your assessment). It has to be very easy to set the right subject, otherwise
people will not do it.
The dbpedia people extract some rough classification of the articles. It is
not perfect, but it may be worth looking at, too.
> The semantics of SKOS, I suggest, should be baked in to wikis. I also
> suggest faceted UDC [1] or similar inter/national classification scheme be
> one of many that can be referenced by users when browsing any wiki. I
> envision that Subjects would be defined in a namespace as fundamental to a
> wiki as the Category namespace is. Basically, I can see requesting some
> software to correlate my own subject taxonomy with interwikis' (WP's)
> taxonomies, so that I as a user don't have to manually search each
> interwiki for content relative to my personal list of subjects.
I think the project should reach out to libraries, i.e., real experts. The
library world is currently looking at the issues of how to redefine
cataloging standards, how to make them Linked Data friendly, etc; because
that work is still in flux, it may be the ideal time to talk to them. In
view of the importance of WP, libraries cannot allow themselves to ignore
this (I believe...)
Ivan
> Is this an idea before its time, something already considered, or
something
> to consider now?
> Thanks for your thoughts. -- john
>
> [1] http://en.wikipedia.org/wiki/Universal_Decimal_Classification
Hey :)
On Sat, Mar 31, 2012 at 2:18 PM, JFC Morfin <jefsey(a)jefsey.com> wrote:
>
> Lydia,
>
> I have a question. I am interested in "multilinguistics". I define
> multilinguistics as the cybernetics of the mecalanguages, i.e. the
> operational, computer-application, and strategic-pragmatic coexistence of
> natural (and artificial) languages that can be used between
> man/man/machine/machine. I am, therefore, interested in stabilizing a table
> of all the (meca)languages names and specifics, and I have been working on
> the preparation of a WikiLinguae project for years in trying to
> converge/cooperate the teams of ISO, private, and Open Source tables, within
> the loose framework of the MAAYA network (Members:
> http://maaya.org/spip.php?article40, including UNESCO, ITU, ACALAN,
> LinguaMon, Union Latine, Francophonie, AUF, etc.) as well as to work on
> script homography, which is a DNS problem, etc.
>
> This problem is extended by the support of IDNA2008 consistency (IDNs),
> e-mail addresses, variants (i.e. the same term with different printings, or
> different Unicode points), polynymy issues (i.e. strict cross-language
> synonyms), and variances (i.e. the variation of the semantics of a term, or
> the appearance of a new term), both in data definitions and in data values.
> I would like to know if this area is part of the Wikidata project, or how
> it is planned to address it.
Hmmm I have to confess I don't understand this completely and therefore
can't give you an answer. Could you give me an example of what you are
talking about and how you see Wikidata fitting in?
> I would also like to know, from the very beginning, if this WikiLinguae
> project was to be kept separate, should/could ally with Wikidata, or if
> Wikidata had its own project regarding multilinguistics (*). I note here
> that it should be both multilinguistic (to document every language) and
> polylingual (to document them in every language). In addition, should there
> be a Wikimedia extension/version of the "locale" files that would be
> documented cooperatively (in cooperation or not with other "locale"
> directories)?
>
> Thank you.
> jfc
>
> (*) Multilinguistics by nature is a semiotic discipline with semantics,
> syntax, and pragmatics. It treats every language as equal to the others. It
> should not be confused with "globalization" (e.g. Unicode), which is:
>
> * the "internationalization" of the medium (support of ISO 10646) within an
> English framework
> + the localization of the ends (ISO 15897)
> + the filtering of linguistic quoted exchanges as per their language (ISO
> 639), scripts (15924), and administrative authority as a cultural referent
> (ISO 3166:1).
> * and results in langtags (RFC 5646).
>
> Globalization and an open, langtag-compatible format should be sufficient at
> this stage (due to the limited number of Wikipedia languages) if there is no
> specific need for Wikimedia formats or locale file extensions. Also, the
> WikiLinguae project seeks to be a wiki-based ISO 11179 conformant reference
> "spine" in the matter of languages and cultures: this is not a small task
> and may still call for time.
>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Eisenacher Straße 2
10777 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hi Everyone,
I recently graduated from university with a Bachelor's degree in Computer
Science. Unfortunately, I don't write any code at work. :( So, I am
interested in helping out with the project! I know how to code and can help
with testing/documentation as well. I am also interested in learning about
what kind of database you use and how you plan on storing the semantic
information.
Are these definitions correct?
Wikipedia - an encyclopedia that uses MediaWiki as its software
MediaWiki - the current software which runs Wikipedia
Wikidata - a new module for MediaWiki that will allow the community to add
semantic information to Wikipedia articles
Cheers!
--
Aniket Karmarkar