Gustavo García wrote:
* Infoboxes could use some kind of hierarchy to avoid naming
the same concept differently. For example, president.birth_date
or Musical_Artist.Born could inherit from Person and reuse
Person.birth_date to avoid inconsistencies.
Nice thinking, but this is very unlikely to happen. Infoboxes and
similar templates are created spontaneously by people who are more
interested (and knowledgeable) in 19th century poets or vintage
cars than in data structures. Your kind of structuring would send
them to programming classes before they can start to document
vintage cars and that would kill off their enthusiasm.
One day, a template for vintage car infoboxes is created, having
parameters for make, model, year and picture. Someone starts to
add number of gears and top speed to the articles. In a classic
computer science setting this would be an error and these
"undefined parameters" should be removed. But in Wikipedia it is
a useful extension and the template should be adapted to display
these new values where available.
* What is the relation between Infoboxes and web microformats?
Infoboxes grew out of Wikipedia without any knowledge of web
microformats, because vintage car enthusiasts aren't programmers.
So, what can a computer scientist do to assist this messy process?
You can extract semi-structured parameter data from template calls
in the database dumps. You can compile statistics on which
parameter names are most commonly used in various templates (e.g.
"year" vs. "age", "name" vs. "title") and give advice on how
parameters should best be named in new templates. For each
template you can compile statistics on which parameter names
(defined or not) and values are actually used and provide feedback
on the "Template talk:" page. You can work together with
WikiProjects on the proper use of templates and infoboxes.
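As a rough illustration of the statistics idea, here is a minimal
Python sketch. It assumes the wikitext of each page has already been
pulled out of a dump as a plain string. The template name "Infobox
vintage car", the sample pages and all function names are made up for
this example. The regular expression only handles simple, non-nested
template calls, and a value that itself contains a pipe (as in
[[target|label]]) will be cut short, so real dumps would need a
proper wikitext parser.

```python
import re
from collections import Counter, defaultdict

# Matches a simple, non-nested call such as {{Name|a=1|b=2}}.
# Assumption: nested templates are out of scope for this sketch.
TEMPLATE_CALL = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")


def extract_calls(wikitext):
    """Yield (template_name, {parameter: value}) for each simple call."""
    for match in TEMPLATE_CALL.finditer(wikitext):
        name = match.group(1).strip()
        params = {}
        for part in match.group(2).split("|"):
            # Only named parameters (key=value) are counted here.
            if "=" in part:
                key, value = part.split("=", 1)
                params[key.strip()] = value.strip()
        yield name, params


def parameter_usage(pages):
    """Count how often each parameter name is used, per template."""
    usage = defaultdict(Counter)
    for text in pages:
        for name, params in extract_calls(text):
            usage[name].update(params.keys())
    return usage


def undefined_parameters(usage, template, defined):
    """Return used parameters that the template does not define."""
    return {p: n for p, n in usage[template].items() if p not in defined}


# Hypothetical sample pages, not taken from any real dump.
pages = [
    "{{Infobox vintage car|make=Ford|model=Model A|year=1928}}",
    "{{Infobox vintage car|make=Bugatti|model=Type 35|year=1924"
    "|top speed=190 km/h}}",
]
usage = parameter_usage(pages)
extras = undefined_parameters(
    usage, "Infobox vintage car", {"make", "model", "year", "picture"}
)
print(usage["Infobox vintage car"].most_common())
print(extras)
```

The "extras" dictionary is the kind of list one could post on a
"Template talk:" page: parameters that editors already use but that
the template does not yet display.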
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se