Hello.
After a few days of pondering the issues, I would like to explain what I
suggested in my previous message, in more detail and (hopefully) more
clearly.
What I'm about to say is pretty abstract, so it's difficult to convey
the right meaning. Please forgive me if I say something you already
know, or just nonsense :-)
Jesús Quiroga wrote:
> I believe a better solution is to design a domain-specific language, an
> idea not very different from your first one.
> This DSL would model the interaction between the application and the DB
> as it is now, and would be designed to evolve. That's it.
The problem I discuss is how to best access the data store from an
application. I believe the right answer is different for each project,
but it's not difficult to evaluate the alternatives, one by one, in a
given context. I think it is worthwhile to do that in the context of
MediaWiki.
I will refer to wiki modules and databases as if they were 'hosts'
connected to a 'network', to highlight the role of languages in the
operation of the system at runtime.
The first way to access the data store is the 'direct' one:
[polyglot wiki] <--- mysDataL ---> [mysql]
[polyglot wiki] <--- posDataL ---> [postgresql]
[polyglot wiki] <--- db2DataL ---> [db2]
Here, the polyglot wiki module talks to every database using the proper
languages. 'mysDataL' means 'the data language understood by MySQL',
'posDataL' means 'the data language understood by PostgreSQL', etc.
The polyglot wiki promises to learn several languages and to speak them
correctly forever, so if a new database comes along, or any of those
data languages evolves, the polyglot wiki is forced to adapt, at a
potentially great cost. Besides, any change to a database schema can
trigger lots of updates to the wiki code, which can be very costly too.
The advantages of this way are well known: it is fast, requires no
up-front design, and is easy to understand.
The drawbacks are apparently few, but devastating: verbose, complex
code in multiple places in the wiki module, very costly to maintain and
even more costly to evolve. Every change costs a lot of time and effort.
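To make the 'direct' way concrete, here is a minimal sketch in Python
(MediaWiki itself is PHP, and the table and column names below are
invented, not the real schema). Note how every dialect difference leaks
straight into application code:

```python
def page_text_query(db_type: str) -> str:
    """Build the dialect-specific SQL for fetching a page's text.

    Hypothetical example: 'page' and 'page_text' are invented names.
    The point is that the wiki itself must know every dialect.
    """
    if db_type == "mysql":
        # MySQL uses '?' placeholders (with some drivers) and LIMIT.
        return "SELECT page_text FROM page WHERE page_title = ? LIMIT 1"
    elif db_type == "postgresql":
        # PostgreSQL drivers commonly use '%s' placeholders; LIMIT works.
        return "SELECT page_text FROM page WHERE page_title = %s LIMIT 1"
    elif db_type == "db2":
        # DB2 spells row limiting differently.
        return ("SELECT page_text FROM page WHERE page_title = ? "
                "FETCH FIRST 1 ROWS ONLY")
    raise ValueError(f"unsupported database: {db_type}")
```

Multiply this branching by every query in the wiki, and the maintenance
cost described above becomes obvious.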
The second way to access the data store that is usually considered is
the 'indirect' one:
[wiki] <--- wikiDataL ---> [polyglot translator]
[polyglot translator] <--- mysDataL ---> [mysql]
[polyglot translator] <--- posDataL ---> [postgresql]
[polyglot translator] <--- db2DataL ---> [db2]
Here, wikiDataL means 'some relational data definition and manipulation
language suitable for use by the wiki'.
The polyglot translator promises to learn wikiDataL and the other
dialects and to evolve with them, so it has all the problems the wiki
had in the direct way, but now the cost is lower, because a lot of
complexity is 'hidden' inside the translator and never reaches the
wiki. As a result, wiki code needs far fewer updates, and it is much
cleaner and less verbose.
The advantages of this way are: wiki module code is simpler, cost of
evolution is reduced.
The drawbacks are apparently many: it is slower, design work is needed,
the system is harder to understand, there is a new language (wikiDataL)
to maintain, and the translator can be very complex. However, the need
to reduce the cost of change is usually so great that these
inconveniences are minor in comparison.
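A minimal Python sketch of the 'indirect' way (again with invented
names, and a wikiDataL reduced to a single SELECT shape): the wiki
builds one dialect-neutral query object, and only the translator knows
how each database spells it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Select:
    """One tiny fragment of a hypothetical wikiDataL: a relational
    SELECT, still in the relational domain, but dialect-neutral."""
    table: str
    columns: List[str]
    limit: Optional[int] = None


def translate(query: Select, dialect: str) -> str:
    """Render a wikiDataL query into one database's data language.

    This is mostly a syntactic transformation, which is exactly what
    distinguishes the translator from the interpreter discussed later.
    """
    sql = f"SELECT {', '.join(query.columns)} FROM {query.table}"
    if query.limit is not None:
        if dialect == "db2":
            sql += f" FETCH FIRST {query.limit} ROWS ONLY"
        elif dialect in ("mysql", "postgresql"):
            sql += f" LIMIT {query.limit}"
        else:
            raise ValueError(f"unknown dialect: {dialect}")
    return sql
```

The wiki now writes `Select("page", ["page_text"], limit=1)` once, and
adding a fourth database touches only `translate`, not the wiki.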
Now the interesting bit begins. A third possible way to access the data
store, the 'interpreted' one:
[wiki] <--- wikiNeedL ---> [polyglot interpreter]
[polyglot interpreter] <--- mysDataL ---> [mysql]
[polyglot interpreter] <--- posDataL ---> [postgresql]
[polyglot interpreter] <--- db2DataL ---> [db2]
Here, wikiNeedL means 'some language adequate for the wiki to express
its data access needs and nothing else'.
wikiNeedL is the domain-specific language I wrote about in my previous
message.
The differences between wikiDataL and wikiNeedL are mainly these:
- wikiNeedL would contain just enough wiki concepts to express the
wiki's needs, so it's effectively confined to that domain. wikiDataL
belongs to the relational data model domain, which is quite different.
- in general, wikiNeedL would have different semantics from the
dialects understood by the databases, so the translation step becomes
more like interpretation, rather than mere syntactic transformation.
wikiDataL usually has the same semantics as the dialects.
- wikiNeedL would contain just enough concepts to satisfy current
needs, and would be open to extension. wikiDataL aims to be
general-purpose and to fulfill current and future needs.
The main reason to consider the 'interpreted' way is, of course, that
it reduces the cost of change even further.
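A sketch of the 'interpreted' way, in the same hypothetical Python
setting: the wiki speaks only in wiki concepts ('the text of page X'),
and the interpreter owns the mapping to storage entirely. The in-memory
backend below stands in for a real one that would speak mysDataL,
posDataL, etc.

```python
class MemoryBackend:
    """Stand-in storage backend. A real backend would translate each
    request into some database's data language."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value


class PageStore:
    """Interpreter for a tiny, invented wikiNeedL: callers state needs
    in wiki terms and never mention tables, columns, or SQL. How needs
    map to storage is the interpreter's private business, so it can
    change freely without touching wiki code."""

    def __init__(self, backend):
        self._backend = backend

    def page_text(self, title: str) -> str:
        # 'Give me the text of this page' -- a need, not a query.
        return self._backend.get(("page_text", title))

    def save_page(self, title: str, text: str) -> None:
        self._backend.put(("page_text", title), text)
```

Because the wiki never states *how* data is stored, the interpreter is
free to reshape the schema, change databases, or add caching, and the
wiki code does not change at all.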
So that's what I was talking about. I will say more about the
differences between the indirect and the interpreted ways in a future
message.
Thanks for your attention.