I would recommend buffering the raw data in some external service like http://datahub.io/ ; that way the extracted data can be used for other purposes too and some version control can be kept. For the data extraction mechanism you could take some ideas from the DBpedia extraction framework: http://dbpedia.org/documentation . You could have a table listing the modules and their execution periodicity, plus a scheduler sitting on Labs: https://www.mediawiki.org/wiki/Wikimedia_Labs
I think that could become a cool project.
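To make the module table and scheduler idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the names ModuleEntry, due_modules and run_due are hypothetical, the "solar_irradiance" module returns placeholder text rather than real data, and an in-memory dict stands in for the external raw-data buffer. It does not use any actual Labs or datahub.io API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ModuleEntry:
    """One row of the (hypothetical) module table."""
    name: str
    period: timedelta                 # execution periodicity
    run: Callable[[], str]            # returns the raw data it fetched
    last_run: Optional[datetime] = None


def due_modules(table: List[ModuleEntry], now: datetime) -> List[ModuleEntry]:
    """Return the rows that have never run or whose period has elapsed."""
    return [m for m in table
            if m.last_run is None or now - m.last_run >= m.period]


def run_due(table: List[ModuleEntry], now: datetime,
            buffer: Dict[str, List[Tuple[datetime, str]]]) -> List[str]:
    """Run every due module and append its raw output to the buffer.

    The buffer keeps every timestamped result per module, a stand-in
    for versioned storage in an external service.
    """
    ran = []
    for m in due_modules(table, now):
        buffer.setdefault(m.name, []).append((now, m.run()))
        m.last_run = now
        ran.append(m.name)
    return ran


if __name__ == "__main__":
    # Placeholder module: a real one would download from the source database.
    table = [ModuleEntry("solar_irradiance", timedelta(days=365),
                         lambda: "placeholder raw data")]
    buffer: Dict[str, List[Tuple[datetime, str]]] = {}
    print(run_due(table, datetime(2013, 7, 24), buffer))  # ['solar_irradiance']
    print(run_due(table, datetime(2013, 8, 24), buffer))  # []
```

The scheduler itself would then just be a periodic job (cron or similar) that calls run_due with the current time; keeping the table separate from the runner means new modules can be added without touching the scheduling logic.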
On Tue, Jul 23, 2013 at 11:06 PM, Michael Hale hale.michael.jr@live.com wrote:
I've looked at Boost briefly before when I needed to train a machine learning algorithm on a C++ text corpus. I felt like the Mathematica standard library provided more functionality to start from, but of course it is not open source. I'm curious whether a coding project could work in a wiki manner as opposed to traditional peer review, but that's not something I'm committed to. That is an interesting point about making an "External data import task force". If we continue with the example of solar irradiance measurements, I'm unsure whether it meets the current notability requirements for Wikidata. We don't want to try to replicate all public external data in Wikidata. I think the ideal scenario for this example would be for a Lua or Python module to automatically update the chart on Commons, maybe annually, so that the article always shows the newest version from Commons. Then if someone is reading the "Solar cycle" article they can click on the chart, go to Commons, and see in the description that it is automatically updated by a script module. From there they can click through to the script module, find the source database, and reuse the code that imports from the database for their own purposes.
From: dacuetu@gmail.com
Date: Tue, 23 Jul 2013 22:49:40 -0400
To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode
What do you think about starting a project to import data from external websites into Wikidata? If you start an "External data import task force" I'm sure there will be quite a lot of interest in creating a collection of modules/bots to import data.
Looking at the project http://www.boost.org/ it also seems quite interesting.
Micru
On Tue, Jul 23, 2013 at 9:08 PM, Michael Hale hale.michael.jr@live.com wrote:
I've had some discussions with people on the Mathematica Stack Exchange site about the project. There is interest, but most people don't seem to have as much free time as me. So I've decided just to start the project as a way to organize and integrate my own code and code that I find. I'm just putting it all in subpages of my Wikipedia user page for now. If I ever run into problems I will retreat to a more constrained mechanism. I kicked things off last night by adding some code for the "Solar cycle" article. The article has a nice chart that shows the total solar irradiance measurements over the past few decades, although it hasn't been updated in a few years. So I added some code to grab the raw data from the World Radiation Center in Switzerland. http://en.wikipedia.org/wiki/User:Wakebrdkid/Wikicode
http://meta.mathematica.stackexchange.com/questions/1057/collaborative-packa...
On Sat, Jul 13, 2013 at 2:33 AM, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:
Hoi Michael,
The one thing that makes it easy for you is that you speak English. For other languages there is not the same amount and diversity of resources. While I have my reservations about the feasibility of what Scott proposes, his proposal is for all the Wikipedia languages and then some.
If he is able to achieve his thing "only" for the Wikipedia languages it will be a roaring success in my eyes.
Thanks, GerardM
On 13 July 2013 09:21, Michael Hale hale.michael.jr@live.com wrote:
Hi Scott,
I'm personally very interested in the future of online education, and I appreciate your enthusiasm about the subject. However, I wonder if your energy would be more productive if it was directed to an older project. Have you heard of Wikiversity? It is already multilingual and doesn't have advertisements from hosting on Wikia. However, even though I knew about Wikiversity when I was still in high school, I've actually been surprised at how little I've used it over the years. I think it is trying to solve a problem that I never encountered. I think learning is one of the easiest things to do on the internet, and it has been even easier in the post-Wikipedia era now that so much of the most important information has been well summarized, consistently formatted, and heavily linked. If I check my YouTube subscriptions right now, I get free, full-length lectures in my feed from Berkeley, Stanford, MIT, Carnegie Mellon, Harvard, Yale, UCLA, Technion, UPenn, IIT Bangalore, and Cornell. I remember when MIT OpenCourseWare first came out, and it's been incredible to see how e-learning has flourished since then. I have over a hundred YouTube channels that are primarily educational. My needs are met if I know what I'm looking for or if I just want to be surprised by some current, stimulating educational content. The software library initiative we have been discussing in this thread would be a hybrid of a wiki and a regular source control system typically used in open source projects. Like I said, I can still think of several reasons why it might not work, but I keep finding myself thinking a few times every week that maybe we should try.
Date: Fri, 12 Jul 2013 17:58:38 -0700
From: worlduniversityandschool@gmail.com
To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode
Hi Michael and Wikidatans,
I just created a beginning, wiki Software Library at World University and School - see Software Libraries: http://worlduniversity.wikia.com/wiki/Software_Libraries for the initial resources - and added links to this in the following WUaS, wiki subjects -
see the WUaS Computer Science wiki subject page for this and related links
http://worlduniversity.wikia.com/wiki/Computer_Science#World_University_and_...
Educational Software: http://worlduniversity.wikia.com/wiki/Educational_Software -
Library Resources: http://worlduniversity.wikia.com/wiki/Library_Resources -
Programming: http://worlduniversity.wikia.com/wiki/Programming .
WUaS, which is like Wikipedia with MIT OCW, plans to develop in all 7,105+ languages and 204+ countries, for open, wiki teaching and learning, in addition to free, C.C., MIT OCW-centric university degrees, beginning in the U.N. languages after English. So not only will these extensible WUaS Software Libraries find form in all languages and countries, but WUaS's plans to move to Wikidata will make this a database. I suspect MIT-centric WUaS students will eventually add to, and greatly develop, these libraries.
Best regards, Scott
On Tue, Jul 9, 2013 at 10:19 AM, Michael Hale hale.michael.jr@live.com wrote:
I completely agree that wiki-projects are exemplary organic growth models compared to the way plans are made by Congress. I certainly support using information technology to move governments toward more direct and efficient forms of democracy. I would love to see things like income tax levels determined in real time based on the average of everyone's e-government web preferences. Many people still don't have internet access though. I think when a person comes up with a plan they typically consider 2 or 3 factors in a qualitative manner in their mental model of the system and disregard other side effects as insignificant. That paper used a model with 10 or so factors in a quantitative manner. There are many things it leaves out, but such plans are still useful as counterweights in policy arguments against ideas that are extreme in other directions. Regardless, a person couldn't design by hand the circuit layout of the processors that are currently in our computers and phones, and the number of problems that are too big for our brains, and that computers are helping us with, is expanding. If we had a way to design computational models in a wiki manner then we could just add the irrigation and insect migration effects to the model to gauge its sustainability, then other people could make each part of the model more accurate, etc. I think it would help us find real solutions to many problems much faster than listening to political speeches or exchanging paragraphs of imprecise human language on social networking sites.
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
-- Etiamsi omnes, ego non