Hi,
What is the added value of yet another database? If this is about Wikipedia, it will only cover a subset. If this data is to be generated anyway, it makes sense not to duplicate work that is already happening in Wikidata. It makes more sense to include the data there and work on improving its quality.

The issue with quality in any database lies in the ability to update data as things happen. This is true for any Wikipedia, for Wikidata, and for any other external source. We can make Wikidata powerful enough to accommodate nearly every visualisation.

PS: if data is incorrect in Wikipedia, it ends up wrong in Wikidata as well. When we have conflicting data, we can improve it and seek sources to settle the point either way. Your approach will not help quality in the big picture.
Thanks,
     GerardM

On 11 October 2016 at 00:11, Ian Seyer <ian.seyer@gmail.com> wrote:
Hi there,

I would like this to be a place to discuss the grant proposed here:
https://meta.wikimedia.org/wiki/Grants:Project/Arc.heolo.gy

The goal of the project is to provide a powerful semantic library to analyze and visualize relationships that might otherwise be hidden within the immense amount of knowledge contained in Wikipedia.

Any feedback or critique on any aspect of the project, including the tech stack, methodology, community engagement practices, or goals, would be HUGELY appreciated.

We are also looking for volunteers who are interested in devops, graph technology, NLP (word2vec or otherwise), or data visualization!
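For anyone curious about the word2vec angle: the general idea is that words or article titles are mapped to dense vectors, and geometric closeness between vectors surfaces relationships that plain links might hide. A minimal, self-contained sketch of the similarity step, using tiny hand-made vectors purely for illustration (real word2vec embeddings would be learned from the Wikipedia corpus and have hundreds of dimensions):

```python
from math import sqrt

# Toy 3-dimensional "embeddings" -- purely illustrative values, not
# vectors from any trained model.
vectors = {
    "rome":    [0.9, 0.1, 0.2],
    "italy":   [0.8, 0.2, 0.1],
    "physics": [0.1, 0.9, 0.8],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# Related concepts sit closer together than unrelated ones.
print(cosine(vectors["rome"], vectors["italy"]) >
      cosine(vectors["rome"], vectors["physics"]))  # True
```

In the project itself this similarity measure would presumably run over embeddings trained on Wikipedia text, feeding the graph and visualization layers described above.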

Thank you for your time,
Ian Seyer
--
                   ╭╮
                 ╭╮┃┃
       ╭╮      ╭╮┃┃┃┃╭╮
       ┃┃  ╭╮  ┃╰╯╰╯┃┃╰
     ╭╮┃┃╭╮┃┃╭╮┃    ╰╯
 ╭╮  ┃┃┃┃┃╰╯┃┃╰╯
 ┃┃╭╮┃╰╯┃┃  ╰╯
╮┃╰╯┃┃  ╰╯
╰╯  ┃┃
    ╰╯


_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l