[Erik Moeller dropped from CC list; no response to any emails over past
week on these issues.]
On Sat, 2009-12-12 at 18:39 -0800, Fabrice Florin wrote:
I have added our web engineer Subbu Sastry to this thread, as he would know whether or
not it's feasible for us to give you this data.
Wikinews has a few techies who hack things together for the site.
containing creative spelling mistakes. I'm going to suggest one of his
widgets for NewsTrust (later), I think you and Fabrice will both quite
We've also Jon (ShakataGaNai) who helps with a number of other coding
things. Some less-noticeable people also help with other bits and
Personally I've little experience with developing server-side things
like a web API; but, does 20+ years as a systems analyst help? Mostly
working on Billing Systems and Enterprise Resource Planning, I did do
some XML-spewing designs for that. I'd probably have no problems with
looking at your database structures and identifying the information I
think most useful to expose in an API. Happy to drop Fabrice and some of
the other CC'd people from that discussion. If there's a need for a
non-disclosure agreement to get at such data it won't be the first time
I've signed one.
We don't yet have a full API, though our widgets
function a bit like an API.
One new widget we have been considering is a rating
widget which a third-party could put on their site, to show the NT story rating for a
particular story on that site. It might also be possible to show the source rating we have
for that source, if known. We hadn't planned on doing this right away, but it's in
the queue of things we would consider doing, if requested by one of our partners.
I'd say the advantage of an API is the scripts for your widgets should
be greatly simplified, and you can freely license examples of them.
The drawback is you'll want to do various logs and analyses on API usage
so you can block any particularly abusive sites; just like web spiders
that don't respect robots.txt end up blocked everywhere except SEO
Your request seems a bit different, if I understand it
correctly: you would want to display the ratings for sources cited by your articles, is
that right? If that's the case, it may be sufficient for us to get just the URL. We
would then look up that URL in our DB, and if we have it on file, that would allow us to
provide the story rating and number of reviews. We may also be able to provide the source
rating and number of reviews for the source associated with that story at the same time.
Lastly, it may be possible to provide the source rating and reviews for the source
typically associated with that domain name, though this is a risky proposition, because
often a story featured on a site is not really from that site. So it would be best to ask
for source ratings by specifying a source name, but you would need to request the exact
name we use for that source, which could be prone to human error.
Yes. At the moment a significant percentage of Wikinews articles
are what we call "synthesis" articles. They contain no original
research, but are constructed through using multiple independent sources
which must be listed at the foot of the article.
Within the Wikicode this looks as follows:
|title=Name of story, as given by publisher
|author=the article's author(s), if specified by the publisher
|pub=The name of the publisher. This *should* be as listed on Wikipedia
|date=Monthname daynumber, year - as specified by the publisher}}
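As a rough sketch of how a script might pull those fields out of a source template before asking NewsTrust about them, something like the following would do. The template name "source" and the "url" parameter are assumptions on my part here, the function name is made up for illustration, and the parsing is deliberately naive: it does not cope with pipes or nested templates inside a value.

```python
def parse_source_template(wikitext):
    """Split a single template call into (template name, {parameter: value}).

    Naive on purpose: assumes no '|' or nested '{{...}}' inside values.
    """
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call")
    parts = [p.strip() for p in text[2:-2].split("|")]
    name, fields = parts[0], {}
    for part in parts[1:]:
        # Split on the first '=' only, so URLs with query strings survive.
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return name, fields


# Hypothetical example values; only the parameter names matter.
example = """{{source
|url=http://example.org/story?id=1
|title=Example headline
|author=A. Reporter
|pub=Example Wire
|date=December 12, 2009}}"""

name, fields = parse_source_template(example)
print(name, fields["url"], fields["pub"])
```

The "url" and "pub" values are the two most likely to be useful as lookup keys on the NewsTrust side.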
This was one point another contributor raised off-list; we currently
list all used sources with no regard to their reliability or reputation.
That can see Fox News listed and "supposedly" on a par with the BBC,
Reuters, or PBS. Your own critique of Iain's article on the Garuda
pilot's conviction noted we'd not had contact with some key primary
sources; as independents, with zero financial backing for our reporting
activities, getting that can be challenging. International phone calls
can soon mount up if you're looking for comment from the other side of
the globe. Personally, I've sunk between €500 and €1,000 into setting up our
domain, mostly used so we're not emailing people with
addresses like "fluffykitteh1024(a)hotspace.com".
In any case, we should probably prioritize the tasks
you are considering, so we know which is most important to you from an editorial
I don't want to end up pushing NewsTrust to develop something that would
have limited use outside that of Wikinews. However, I do think that the
elimination of cross-server scripting vulnerabilities would be a big
selling point for a published API.
Is it more important for you to have your own articles
display a story rating? or to give a rating to the third-party articles cited as sources
for your own articles?
Both, I think. But, that's the beauty of doing it with an API; anyone
could do either.
If it's the latter, how often would you need this
information to be updated? If it's an old story, its story rating is unlikely to
change much a month or two after its release, so maybe you could settle for a
one-time rating -- the source rating is more likely to change over time, but not by much.
So maybe a once-a-month or less frequent update might be fine.
Bawolff's input on this suggests the volume of requests to NewsTrust
would naturally tail off as articles age. Thus:
* Someone requests a Wikinews article.
source template, and sends them to our back end (the ToolServer).
* If less than 10-15 minutes since NewsTrust last queried, back end
returns cached data.
* Else the back end submits a new request
* If NewsTrust returns updated data (instead of an "unchanged" response)
the back end updates its stats and sends that on to the reader via the
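The caching steps above could be sketched roughly as follows. This is only an illustration under assumptions: the class name, the fetch callable, and the convention that it returns None for an "unchanged" response are all invented here, and nothing in it reflects an actual NewsTrust or ToolServer interface.

```python
import time

CACHE_TTL_SECONDS = 10 * 60  # lower end of the 10-15 minute window


class RatingCache:
    """Back-end cache: only re-query upstream once the cached entry is stale."""

    def __init__(self, fetch, ttl=CACHE_TTL_SECONDS, clock=time.time):
        # fetch is a hypothetical callable: fetch(url, known) -> dict, or
        # None when upstream answers "unchanged".
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # url -> (time of last upstream query, data)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # queried recently: serve cached data
        known = entry[1] if entry is not None else None
        fresh = self._fetch(url, known)
        data = known if fresh is None else fresh  # None means "unchanged"
        self._entries[url] = (now, data)  # restart the timer either way
        return data
```

Because old articles are requested less and less often, the number of upstream queries per article should fall off on its own; the timer only caps the rate for whatever happens to be popular at the moment.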
Either way, we would need to figure out how important
all this is to you, and if we can squeeze in some simple technology that addresses most of
As you've mentioned, and one of the other headaches we have, something
from AP, Reuters, or AFP can end up on dozens - if not hundreds - of
newspaper sites. Wikinews tends to push for people to go back to the
wire site or, say, Google News' hosting of these. We also push for the
wire to be cited as the author (e.g. Reuters); that *might* help NewsTrust
consolidate the different URLs because the article title is generally
only changed if the site publishing it applies a house style for
If, perhaps as a more long-term goal for NewsTrust, you were getting
that data, you could tie up all the different URLs for a Reuters or AP
story, group them under a unique article identifier, and expose that in the
API so, once you've queried with a URL, the API asks for future requests
to use a much shorter identifier.
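To make the identifier idea a little more concrete, here is a minimal sketch, assuming copies of a wire story can be matched on the wire name plus a normalised title. The class, method names, and identifier format are all hypothetical, and real matching would need to be a good deal smarter about house-style title changes.

```python
class StoryIndex:
    """Groups the many URLs of one wire story under a single short identifier."""

    def __init__(self):
        self._id_by_key = {}  # (wire, normalised title) -> story id
        self._id_by_url = {}  # url -> story id
        self._ratings = {}    # story id -> rating

    def register(self, url, wire, title, rating=None):
        # Normalise case and whitespace so trivially different titles match.
        key = (wire.lower(), " ".join(title.lower().split()))
        new_id = "s%d" % (len(self._id_by_key) + 1)
        story_id = self._id_by_key.setdefault(key, new_id)
        self._id_by_url[url] = story_id
        if rating is not None:
            self._ratings[story_id] = rating
        return story_id

    def query(self, url=None, story_id=None):
        # First contact is by URL; the reply carries the short id so that
        # later requests can use it instead.
        if story_id is None:
            story_id = self._id_by_url.get(url)
        if story_id is None:
            return None
        return {"id": story_id, "rating": self._ratings.get(story_id)}
```

A first request would call query(url=...), keep the "id" from the reply, and use query(story_id=...) from then on.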
But this is a good conversation to have, and we
appreciate your thinking about these creative uses of NewsTrust for your site.
I did warn you Wikinewsies will steal anything that isn't nailed down
and watched by armed guards. :-P
Oh, and the widget I said I'd suggest:
Subbu will likely follow this most quickly; it's a freely licensed piece
dictionary lookup of any word a user double clicks on.
It is multilingual, so if NewsTrust account holders could set a "Mother
tongue" option they wouldn't get definitions in an English default, but
their chosen language. (The gadget looks up "example" in Wiktionary,
tracks down the link to the definition in "Mother tongue", and displays
it in a small pop-up window.)
If you'd like to try it out on Wikinews, sign up for an account, log in,
select your preferences at the top of the page, go to the gadgets tab,
and look for and enable Wiktionary Hover.
Brian McNeil <email@example.com>|http://en.wikinews.org/wiki/Brian_McNeil
Content of this message in no way represents the opinions or official position
of the Wikimedia Foundation or any of its projects.