On 13.09.2016 11:39, Sebastian Burgstaller wrote:
Hi all,
I think this topic might have been discussed many months ago. For certain data types in the chemical compound space (P233 canonical SMILES, P2017 isomeric SMILES, and P234 InChI), a character limit higher than 400 would be really helpful: 1500 to 2000 characters, although I sense that this might cause problems with SPARQL. Are there any plans to implement this? In general, for quality assurance, many string property types would benefit from a fixed maximum string length.
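To make the idea of a per-property maximum length concrete, here is a minimal sketch in Python. It is only an illustration: the limits in the table, the names MAX_LENGTH, DEFAULT_MAX_LENGTH and exceeds_limit, and the choice of 2000 characters are assumptions taken from the numbers floated in this thread, not anything Wikidata actually implements.

```python
# Hypothetical per-property limits; the values are illustrative only,
# taken from the 1500-2000 character range suggested in this thread.
MAX_LENGTH = {
    "P233": 2000,   # canonical SMILES
    "P2017": 2000,  # isomeric SMILES
}

# The general string limit mentioned in this thread.
DEFAULT_MAX_LENGTH = 400


def exceeds_limit(property_id: str, value: str) -> bool:
    """Return True if value is longer than the limit assumed for property_id."""
    limit = MAX_LENGTH.get(property_id, DEFAULT_MAX_LENGTH)
    return len(value) > limit
```

A quality check along these lines would flag any string value longer than the limit assumed for its property, and fall back to the general 400-character limit for properties without their own entry.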
FWIW, I recall that the main reason for the char limit originally was to discourage the use of Wikidata for textual content. Simply put, we did not want Wikipedia articles in the data. Long texts could also make copyright/license issues more relevant (though, in theory, a copyrighted poem could be rather short).
However, given that we now have such a well-informed community with established practices and good quality checks, it seems unproblematic to lift the character limit. I don't think there are major technical reasons for having it. Surely, Blazegraph (the WMF SPARQL engine) should not expect texts to be short, and I would be surprised if it did. So I would not expect problems on this side.
Best, Markus
Best, Sebastian
Sebastian Burgstaller-Muehlbacher, PhD
Research Associate
Andrew Su Lab
MEM-216, Department of Molecular and Experimental Medicine
The Scripps Research Institute
10550 North Torrey Pines Road
La Jolla, CA 92037
@sebotic