Hello.
After a few days of pondering the issues, I would like to explain what I suggested in my previous message, in more detail and (hopefully) more clearly.
What I'm about to say is pretty abstract, so it's difficult to convey the right meaning. Please forgive me if I say something you already know, or just nonsense :-)
Jesús Quiroga wrote:
> I believe a better solution is to design a domain-specific language, an idea not very different from your first one. This DSL would model the interaction between the application and the DB as it is now, and would be designed to evolve. That's it.
The problem I discuss is how to best access the data store from an application. I believe the right answer is different for each project, but it's not difficult to evaluate the alternatives, one by one, in a given context. I think it is worthwhile to do that in the context of MediaWiki.
I will refer to wiki modules and databases as if they were 'hosts' connected to a 'network', to highlight the role of languages in the operation of the system at runtime.
The first way to access the data store is the 'direct' one:
[polyglot wiki] <--- mysDataL ---> [mysql]
[polyglot wiki] <--- posDataL ---> [postgresql]
[polyglot wiki] <--- db2DataL ---> [db2]
Here, the polyglot wiki module talks to every database using the proper languages. 'mysDataL' means 'the data language understood by MySQL', 'posDataL' means 'the data language understood by PostgreSQL', etc.
The polyglot wiki promises to learn several languages and to speak them correctly forever, so if a new database comes along, or any of these data languages evolves, the polyglot wiki is forced to adapt at a potentially great cost. Besides, any change to the database schema can trigger lots of updates to the wiki code, which is also very costly.
The advantages of this way are well known: it is fast, requires no up-front design, and is easy to understand. The drawbacks are apparently few, but devastating: verbose and complex code in multiple places in the wiki module, very costly to maintain, and even more costly to evolve. Every change costs a lot of time and effort.
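To make that cost concrete, here is a minimal sketch of the 'direct' way in Python (MediaWiki itself is PHP, and the schema here is a simplification loosely modeled on its page and text tables; all names are illustrative only):

    # 'Direct' way: the wiki itself speaks every database's dialect.
    def get_page_text(conn, db_type, title):
        # Every dialect difference leaks into wiki code: LIMIT vs.
        # FETCH FIRST, identifier quoting, parameter markers, etc.
        if db_type == 'mysql':
            sql = ("SELECT old_text FROM text t JOIN page p "
                   "ON p.page_latest = t.old_id "
                   "WHERE p.page_title = %s LIMIT 1")
        elif db_type == 'postgresql':
            sql = ('SELECT old_text FROM "text" t JOIN page p '
                   'ON p.page_latest = t.old_id '
                   'WHERE p.page_title = %s LIMIT 1')
        elif db_type == 'db2':
            sql = ("SELECT old_text FROM text t JOIN page p "
                   "ON p.page_latest = t.old_id "
                   "WHERE p.page_title = ? FETCH FIRST 1 ROWS ONLY")
        else:
            raise ValueError('unsupported database: ' + db_type)
        cur = conn.cursor()
        cur.execute(sql, (title,))
        row = cur.fetchone()
        return row[0] if row else None

Every new query means another function like this one, and every schema or dialect change means touching all of them.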
The second way to access the data store, and the one usually considered, is the 'indirect' one:
[wiki] <--- wikiDataL ---> [polyglot translator]
[polyglot translator] <--- mysDataL ---> [mysql]
[polyglot translator] <--- posDataL ---> [postgresql]
[polyglot translator] <--- db2DataL ---> [db2]
Here, wikiDataL means 'some relational data definition and manipulation language suitable for use by the wiki'.
The polyglot translator promises to learn wikiDataL and the other dialects and to evolve with them, so it has all the problems the wiki had in the direct way, but now the cost is lower because a lot of complexity is 'hidden' inside the translator and can't reach the wiki. As a result, wiki code is not updated as much, and it's much cleaner and less verbose.
The advantages of this way are: wiki module code is simpler, and the cost of evolution is reduced. The drawbacks are apparently many: it's slower, design work is needed, it's harder to understand, a new language (wikiDataL) must be defined, and the translator can be very complex. However, the need to reduce the cost of change is usually so great that these inconveniences are minor in comparison.
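Here is a matching sketch of the translator, again in Python with invented names; the point is that dialect knowledge now lives in exactly one place:

    # 'Indirect' way: one generic relational language for the wiki,
    # translated to each backend's dialect in a single place.
    class Translator:
        def __init__(self, dialect):
            self.dialect = dialect

        def select(self, table, columns, where, limit=None):
            # Mostly a syntactic transformation: wikiDataL and the
            # dialects share the same relational semantics.
            conds = ' AND '.join(c + ' = ?' for c in where)
            sql = ('SELECT ' + ', '.join(columns) +
                   ' FROM ' + table + ' WHERE ' + conds)
            if limit is not None:
                if self.dialect == 'db2':
                    sql += ' FETCH FIRST %d ROWS ONLY' % limit
                else:  # mysql, postgresql
                    sql += ' LIMIT %d' % limit
            # (a real translator would also map parameter markers
            # and quoting rules per driver)
            return sql, list(where.values())

    # Wiki code is now dialect-free:
    # sql, args = translator.select('page', ['page_title'],
    #                               {'page_id': 42}, limit=1)

Note that the wiki still thinks in tables, columns and joins; the translator changes the syntax, not the model.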
Now the interesting bit begins. A third possible way to access the data store, the 'interpreted' one:
[wiki] <--- wikiNeedL ---> [polyglot interpreter]
[polyglot interpreter] <--- mysDataL ---> [mysql]
[polyglot interpreter] <--- posDataL ---> [postgresql]
[polyglot interpreter] <--- db2DataL ---> [db2]
Here, wikiNeedL means 'some language adequate for the wiki to express its data access needs and nothing else'.
wikiNeedL is the domain-specific language I wrote about in my previous message.
The differences between wikiDataL and wikiNeedL are mainly these:
- wikiNeedL would contain just enough wiki concepts to express the wiki's needs, so it's effectively confined to that domain. wikiDataL belongs to the relational data model domain, which is quite different.
- in general, wikiNeedL would have different semantics than the dialects understood by the databases, so the translation step becomes more like interpretation, rather than just syntactic transformation. wikiDataL usually has the same semantics as the dialects.
- wikiNeedL would contain just enough concepts to satisfy current needs, and would be open to extension. wikiDataL aims to be general-purpose and to fulfill current and future needs.
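And a sketch of the 'interpreted' way (the request name page_text is invented; a real wikiNeedL would be designed from the wiki's actual needs):

    # 'Interpreted' way: the wiki states a domain-level need in
    # wikiNeedL; each backend decides how to satisfy it.
    class Interpreter:
        def __init__(self, backend):
            self.backend = backend  # wraps one concrete database

        def run(self, request, **args):
            # Dispatch on the wiki-level concept, not on tables/columns.
            handler = getattr(self.backend, request, None)
            if handler is None:
                raise NotImplementedError(request)
            return handler(**args)

    class MySQLBackend:
        def page_text(self, title):
            # Free to use one query, several queries, or a cache:
            # the wiki only asked for "the text of this page".
            ...

    # Wiki code mentions no tables at all:
    # text = Interpreter(MySQLBackend()).run('page_text', title='Main_Page')

Because wikiNeedL names wiki concepts instead of relational operations, a schema change (or even a non-relational backend) touches only the interpreter, never the wiki.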
The main reason to consider the 'interpreted' way is, of course, that it reduces the cost of change even further.
So that's what I was talking about. I will say more about the differences between the indirect and the interpreted ways in a future message.
Thanks for your attention.