You can get the data from here:
http://dumps.wikimedia.org/wikidatawiki/20130417/
The dump contains all items with all of their properties and values. The
question is whether, based on this data, we could make suggestions for:
* when I create a new statement, suggest a property, then suggest a value
* suggest qualifier properties, then suggest qualifier values (there is no
data yet on qualifiers, but this will change soon)
* suggest properties for references, and values
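
To make the first bullet concrete, here is a minimal sketch of one possible approach: rank candidate properties by how often they co-occur with the properties an item already has. This is only an illustration, not a proposed design. The function names and the toy item/property IDs are assumptions and are not taken from the dump.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(items):
    """For each property, count how often every other property
    appears on the same item."""
    cooc = defaultdict(Counter)
    for props in items.values():
        for a, b in permutations(set(props), 2):
            cooc[a][b] += 1
    return cooc


def suggest_properties(cooc, existing, k=3):
    """Rank properties not yet on the item by their summed
    co-occurrence with the properties already present."""
    scores = Counter()
    for p in existing:
        for q, n in cooc.get(p, {}).items():
            if q not in existing:
                scores[q] += n
    return [q for q, _ in scores.most_common(k)]


# Toy data: hypothetical item -> set of property IDs.
items = {
    "Q1": {"P19", "P21", "P27"},
    "Q2": {"P19", "P21", "P106"},
    "Q3": {"P19", "P21", "P27", "P106"},
    "Q4": {"P17", "P31"},
}
cooc = build_cooccurrence(items)
print(suggest_properties(cooc, {"P19"}))
```

With this toy data, the property that shares the most items with P19 is ranked first. A real suggester would presumably need normalisation so that very common properties do not dominate every list.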
Does this help?
Cheers,
Denny
2013/4/19 Nilesh Chakraborty <nilesh(a)nileshc.com>
Hi,
I am a 3rd year undergraduate student of computer science, pursuing my
B.Tech degree at RCC Institute of Information Technology. I am proficient
in Java, PHP and C#.
Among the project ideas on the GSoC 2013 ideas page, the one particular
idea that seemed really interesting to me is developing an Entity
Suggester for Wikidata. I want to work on it.
I am passionate about data mining, big data and recommendation engines,
therefore this idea naturally appeals to me a lot. I have experience with
building music and people recommendation systems, and have worked with
Myrrix and Apache Mahout. I recently designed and implemented such a
recommendation system and deployed it on a live production site, where I am
interning, to recommend Facebook users to each other based on their
interests.
The problem is that the documentation for Wikidata and the Wikibase extension
seems pretty daunting to me, since I have never configured a MediaWiki
instance or actually used one. (I am about to try it out, following the
instructions at
http://www.mediawiki.org/wiki/Summer_of_Code_2013#Where_to_start.) I can
easily build a recommendation system and expose it as a web service or
REST-based API through which the engine can be trained with existing data
and then queried. This seems to be a collaborative filtering problem (people
who bought x also bought y). It would be easier if I could get some help with
where and how I need to integrate it with Wikidata. Also, some sample
datasets (csv files?) or schemas (just the column names and data types?)
would help a lot, for me to figure this out.
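
As an illustration of the kind of schema meant here, the sketch below assumes a flat CSV with one statement per row and the columns item_id, property_id and value. These column names and the sample rows are hypothetical, not an actual Wikidata export format. It loads the rows and suggests the most frequently used values for a given property.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical export: one statement per row. Column names and IDs
# are assumptions made up for this example.
SAMPLE_CSV = """item_id,property_id,value
Q1,P21,Q100
Q2,P21,Q200
Q3,P21,Q100
Q1,P27,Q300
Q3,P27,Q300
"""


def load_statements(fileobj):
    """Read (item_id, property_id, value) rows into a list of tuples."""
    reader = csv.DictReader(fileobj)
    return [(r["item_id"], r["property_id"], r["value"]) for r in reader]


def value_counts(statements):
    """Count how often each value is used with each property."""
    counts = defaultdict(Counter)
    for _item, prop, value in statements:
        counts[prop][value] += 1
    return counts


def suggest_values(counts, prop, k=2):
    """Return up to k values for a property, most frequent first."""
    return [v for v, _ in counts.get(prop, Counter()).most_common(k)]


statements = load_statements(io.StringIO(SAMPLE_CSV))
counts = value_counts(statements)
print(suggest_values(counts, "P21"))  # ['Q100', 'Q200']
```

Even a small file in roughly this shape would be enough to prototype against.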
I have added this email as a comment on the bug report at
https://bugzilla.wikimedia.org/show_bug.cgi?id=46555#c1.
Please ask me if you have any questions. :-)
Thanks,
Nilesh
--
A quest eternal, a life so small! So don't just play the guitar, build one.
You can also email me at contact(a)nileshc.com or visit my
website: http://www.nileshc.com/
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 |
http://wikimedia.de
Wikimedia Deutschland - Society for the Promotion of Free Knowledge e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.