Are you trying to achieve this from within MediaWiki? Otherwise Google Docs is a good
tool for screen scraping, and it can be used to produce CSV files for your wiki from
sources without an API. I wrote about it here, in Swedish:
(assuming
you can read that, being Norwegian).
/Leo
/Leo
Leonard Wallentin
leo_wallentin(a)hotmail.com
+46 (0)735-933 543
Twitter: @leo_wallentin
Skype: leo_wallentin
Date: Sun, 18 Mar 2012 09:57:34 +0100
From: jeblad(a)gmail.com
To: wikidata-l(a)lists.wikimedia.org
Subject: [Wikidata-l] Import from external sources
sources, especially those that do not have a prepared and
well-defined API?
A rather simple example from the Statistics Norway website is an
article on a page like this
http://www.ssb.no/fobstud/
and a table like this
http://www.ssb.no/fobstud/tab-2002-11-21-02.html
In that example you must follow a link to a new page, which you then
must monitor for changes. Inside that page you can use XPath to
extract a field, and then optionally use something like a regexp to
identify and split fields. As an alternative solution you might use XSLT
to transform the whole page.
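The XPath-plus-regexp approach above can be sketched like this. The HTML snippet and its figures are invented stand-ins for a table such as the one on the linked Statistics Norway page, not its real contents:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified table standing in for a page like
# http://www.ssb.no/fobstud/tab-2002-11-21-02.html (fetching the page
# and monitoring it for changes would happen separately).
HTML = """<table>
  <tr><th>Municipality</th><th>Students</th></tr>
  <tr><td>Oslo</td><td>52 514</td></tr>
  <tr><td>Bergen</td><td>23 437</td></tr>
</table>"""

def extract_rows(html):
    """Select table rows with an XPath expression, then use a regexp
    to normalise the number field ("52 514" -> 52514)."""
    root = ET.fromstring(html)
    rows = []
    for tr in root.findall(".//tr")[1:]:          # skip the header row
        name, raw = (td.text for td in tr.findall("td"))
        rows.append((name, int(re.sub(r"\s+", "", raw))))
    return rows

print(extract_rows(HTML))
```

Note that the standard library's ElementTree only supports a limited XPath subset and needs well-formed markup; against real, messy HTML a dedicated HTML parser would be the sturdier choice.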
Anyhow, this can quite easily be formulated as both a parser function
and a tag function.
At the same site there is something called "Statistikkbanken"
(http://statbank.ssb.no/statistikkbanken/), where you can (must) log on
and then iterate through a sequence of pages.
Similar data as in the previous example can be found in
http://statbank.ssb.no/statistikkbanken/selectvarval/Define.asp?MainTable=F…
But it is very difficult to formulate a kind of click-sequence inside that page.
Any ideas? Some kind of click-sequence recording?
Statistics Norway publishes statistics about Norway for free reuse as
long as it is credited appropriately.
http://www.ssb.no/english/help/
John
_______________________________________________
Wikidata-l mailing list
Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l