There has been a lot of talk over the last year or so about how and when to move MediaWiki to a service oriented architecture [0]. So much so that it is actually one of the marquee topics at the upcoming Developer Summit.
I think the problem statement for the services RFC is dead on in describing the issues that MediaWiki and the Wikimedia Foundation face today with the current monolithic MediaWiki implementation. We have organic entanglements between subsystems that make reasoning about the code base difficult. We have a growing need for API access to data and computations that have historically been presented only via generated HTML. We have cross-team and cross-project communication issues that lead to siloed implementations and duplication of effort.
The solution to these issues proposed in the RFC is to create independent services (e.g. Parsoid, RESTBase) to implement features that were previously handled by the core MediaWiki application. Thus far Parsoid is only required if a wiki wants to use VisualEditor. There has been discussion, however, of it being required in some future version of MediaWiki where HTML is the canonical representation of articles {{citation needed}}. This particular future may or may not be far off on the calendar, but there are other services that have been proposed (storage service, REST content API) that are likely to appear in production use at least for the Foundation projects within the next year.
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki. What will happen to the numerous MediaWiki installs on shared hosting providers like 1and1, Dreamhost or GoDaddy when running MediaWiki requires multiple node.js/java/hack/python standalone processes to function? Is the MediaWiki community making a conscious decision to abandon these customers? If so, should we start looking for a suitable replacement that can be recommended, and possibly develop tools to ease the migration away from MediaWiki to another monolithic wiki application? If not, how are we going to ensure that pure PHP alternate implementations get equal testing and feature development if they are not actively used on the Foundation's project wikis?
[0]: https://www.mediawiki.org/wiki/Requests_for_comment/Services_and_narrow_inte...
Bryan
On Thu, Jan 15, 2015 at 10:38 PM, Bryan Davis bd808@wikimedia.org wrote:
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki. What will happen to the numerous MediaWiki installs on shared hosting providers like 1and1, Dreamhost or GoDaddy when running MediaWiki requires multiple node.js/java/hack/python standalone processes to function? Is the MediaWiki community making a conscious decision to abandon these customers? If so, should we start looking for a suitable replacement that can be recommended, and possibly develop tools to ease the migration away from MediaWiki to another monolithic wiki application? If not, how are we going to ensure that pure PHP alternate implementations get equal testing and feature development if they are not actively used on the Foundation's project wikis?
This is not even about shared hosting: it is pretty obvious that running a bunch of heterogeneous services is harder than just one PHP app, especially if you don't have dedicated ops like we do at WMF. Therefore, the question is: what will we gain and what will we lose by making MediaWiki unusable by 95% of its current user base?
95% is pretty extreme.
I have always questioned the balance being struck here, and would welcome an adjustment of the minimum requirements to run MediaWiki. In many cases, if we can just require shell access we can automate away the complexity for the typical use cases.
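For instance (purely as an illustrative sketch - the version numbers, requirements, and check are made up for the example, not what any real installer does), shell access is enough to script the kind of preflight checks an automated installer would need before setting up extra services:

```shell
# Hypothetical preflight check an installer could run with only shell access.
# Versions and requirements below are illustrative, not real requirements.
check_version() {
  # succeeds if dotted version $1 >= $2 (relies on GNU sort -V version sort)
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

node_ver="0.10.25"   # in a real script: obtained from `node --version`
if check_version "$node_ver" "0.10.0"; then
  echo "node ok"
else
  echo "node too old"
fi
```

With checks like this scripted, the "bunch of heterogeneous services" problem becomes something an installer can at least diagnose up front instead of failing mysteriously later.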
- Trevor
On Thu, Jan 15, 2015 at 10:49 PM, Max Semenik maxsem.wiki@gmail.com wrote:
On Thu, Jan 15, 2015 at 10:38 PM, Bryan Davis bd808@wikimedia.org wrote:
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki. What will happen to the numerous MediaWiki installs on shared hosting providers like 1and1, Dreamhost or GoDaddy when running MediaWiki requires multiple node.js/java/hack/python standalone processes to function? Is the MediaWiki community making a conscious decision to abandon these customers? If so, should we start looking for a suitable replacement that can be recommended, and possibly develop tools to ease the migration away from MediaWiki to another monolithic wiki application? If not, how are we going to ensure that pure PHP alternate implementations get equal testing and feature development if they are not actively used on the Foundation's project wikis?
This is not even about shared hosting: it is pretty obvious that running a bunch of heterogeneous services is harder than just one PHP app, especially if you don't have dedicated ops like we do at WMF. Therefore, the question is: what will we gain and what will we lose by making MediaWiki unusable by 95% of its current user base?
-- Best regards, Max Semenik ([[User:MaxSem]])
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Thu, Jan 15, 2015 at 11:27 PM, Trevor Parscal tparscal@wikimedia.org wrote:
95% is pretty extreme.
Can't agree on this, as the number covers:
- All current shared host users
- Those who use an entry-level VPS and would suddenly discover that with a few extra node services their RAM is insufficient
- People who aren't technical enough to run a bunch of services glued together with PHP - especially if these services are even a tiny bit more complex to install than `apt-get install`
- People who run MW in strict corporate environments where installing just another piece of software is a big deal
I initially considered writing 99%, but decided that some users might consider upgrading their plans, etc. Still, 99% is probably the accurate estimate of current installations potentially affected by servicization.
I have always questioned the balance being struck here, and would welcome an adjustment of the minimum requirements to run MediaWiki. In many cases, if we can just require shell access we can automate away the complexity for the typical use cases.
Yep, maintaining the ability to run MW in crappy environments has always been a losing battle, and especially since we're planning to ditch PHP 5.3 support soon (which would render MW incompatible with a lot of crappy hosts) it might be time to officially declare that we don't care about supporting environments without shell access. Still, shell access does not equate to root access or even to running Parsoid from userspace; that's a different story.
On 15-01-16 02:48 AM, Max Semenik wrote:
Can't agree on this, as the number covers [...]
An extra datapoint here: I think I can reasonably consider myself an atypical user at the tail end of the sysadmin curve, and yet the principal reason why I had delayed installing VisualEditor on a mediawiki I administer is Parsoid.
Not because it's unusually complicated to deploy and install - it's about as trivial as possible, under Ubuntu at least, given that there is a package - but because it changes a number of fundamental assumptions about what a mediawiki install /is/ and what maintaining it involves.
A service-based architecture has implications for reliability and security and - no matter how well we manage those implications - extra services add moving parts and complexities. They increase the monitoring burden, and increase the specialization and knowledge necessary to understand and debug issues.
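As a purely illustrative sketch (the ports are just the usual defaults for Parsoid and RESTBase; the loop is not any real monitoring tool), even the most minimal health checking grows with every service you add:

```shell
# Illustrative only: each standalone service is one more endpoint to watch.
# 8000 and 7231 are the usual default ports for Parsoid and RESTBase.
for svc in parsoid:8000 restbase:7231; do
  name=${svc%%:*}   # text before the colon
  port=${svc##*:}   # text after the colon
  echo "check: ${name} at http://localhost:${port}/"
done
```

A single monolithic PHP app needs one such check; every added service needs its own, plus someone who knows what a failure of that particular service means.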
I'm not saying any of those things are /bad/ or insurmountable, but that they modify the landscape in pretty drastic ways and that this /will/ impact the vast majority of users.
And I can honestly say that if the intent is to maintain a parallel working implementation in pure php then - with my reuser hat - I don't believe a word of it. :-) I happen to have been a long-time postgres user, and yet when 1.23 rolled around and I wanted to start playing around with VE I found that my only realistic alternative was to migrate to mysql: postgres support had always been sporadic and not-quite-there already, and the increased number of issues with the increased complexity meant that it moved squarely in not-maintained-enough territory.
So yeah; there *is* a very significant impact.
-- Marc
Trevor Parscal tparscal@wikimedia.org writes:
95% is pretty extreme.
Not according to WikiApiary. The largest host for wikis, even before Wikimedia, is DreamHost followed closely by GoDaddy and other shared hosters such as 1&1.[1]
Yes, Amazon and Linode are also near the top of the list, but from experience, I can tell you that most of those aren't using anything but PHP, and, if you're lucky, the Math and Scribunto extensions.
Even the most asked-for service of the SOA -- Parsoid for Visual Editor -- is almost (but not quite) exclusively on WMF wikis.[2]
Developers within an organisation like the WMF don't have to worry about what it takes to stand up and maintain a wiki -- they have Operations for that.
Most people running wikis based on MediaWiki don't have a budget or a staff exclusively for their wiki. They don't have the resources or knowledge to run a dedicated server.
There are things that can be done to address this -- make it easier to set up a SOA-using wiki on your chosen virtual machine provider -- but they need dedicated resources.
This probably means a really dedicated (group of) volunteer(s) or a new focus on non-WMF use of MediaWiki by the WMF. This seems to be the direct result of the decisions made by developers who've implemented some very compelling features without (as Robla seemed to say recently) any overall architectural direction.
Mark.
Footnotes: [1] https://wikiapiary.com/wiki/Host:Hosts
[2] https://wikiapiary.com/wiki/Extension:VisualEditor
On Jan 16, 2015 2:49 AM, "Max Semenik" maxsem.wiki@gmail.com wrote:
On Thu, Jan 15, 2015 at 10:38 PM, Bryan Davis bd808@wikimedia.org wrote:
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki. What will happen to the numerous MediaWiki installs on shared hosting providers like 1and1, Dreamhost or GoDaddy when running MediaWiki requires multiple node.js/java/hack/python standalone processes to function? Is the MediaWiki community making a conscious decision to abandon these customers? If so, should we start looking for a suitable replacement that can be recommended, and possibly develop tools to ease the migration away from MediaWiki to another monolithic wiki application? If not, how are we going to ensure that pure PHP alternate implementations get equal testing and feature development if they are not actively used on the Foundation's project wikis?
This is not even about shared hosting: it is pretty obvious that running a bunch of heterogeneous services is harder than just one PHP app, especially if you don't have dedicated ops like we do at WMF. Therefore, the question is: what will we gain and what will we lose by making MediaWiki unusable by 95% of its current user base?
-- Best regards, Max Semenik ([[User:MaxSem]])
I'm quite concerned by this. While shared hosting might be on the decline, I still feel like most third-party users don't have the competence to do much more than FTP some files to their server and run the install script.
--bawolff
On Thu Jan 15 2015 at 11:34:58 PM Brian Wolff bawolff@gmail.com wrote:
On Jan 16, 2015 2:49 AM, "Max Semenik" maxsem.wiki@gmail.com wrote:
On Thu, Jan 15, 2015 at 10:38 PM, Bryan Davis bd808@wikimedia.org wrote:
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki. What will happen to the numerous MediaWiki installs on shared hosting providers like 1and1, Dreamhost or GoDaddy when running MediaWiki requires multiple node.js/java/hack/python standalone processes to function? Is the MediaWiki community making a conscious decision to abandon these customers? If so, should we start looking for a suitable replacement that can be recommended, and possibly develop tools to ease the migration away from MediaWiki to another monolithic wiki application? If not, how are we going to ensure that pure PHP alternate implementations get equal testing and feature development if they are not actively used on the Foundation's project wikis?
This is not even about shared hosting: it is pretty obvious that running a bunch of heterogeneous services is harder than just one PHP app, especially if you don't have dedicated ops like we do at WMF. Therefore, the question is: what will we gain and what will we lose by making MediaWiki unusable by 95% of its current user base?
-- Best regards, Max Semenik ([[User:MaxSem]])
I'm quite concerned by this. While shared hosting might be on the decline, I still feel like most third-party users don't have the competence to do much more than FTP some files to their server and run the install script.
These days I'm not convinced it's our job to support every possible scale of wiki install. There are several simpler and smaller wiki solutions for people who can't do more than FTP a few files to a folder.
While I'm not entirely convinced about expanding the scope of MW's dependencies just yet I am pretty convinced these days that "you need shell access to something resembling a real server" is not an untenable place to be in for our minimum requirements, especially as VPS prices continue to fall.
-Chad
On 16 January 2015 at 07:38, Chad innocentkiller@gmail.com wrote:
These days I'm not convinced it's our job to support every possible scale of wiki install. There are several simpler and smaller wiki solutions for people who can't do more than FTP a few files to a folder.
In this case the problem is leaving users and their wiki content unsupported. Because they won't move while it "works", even as it becomes a Swiss cheese of security holes. Because their content is important to them.
This is the part of the mission that involves everyone else producing the sum of human knowledge. They used our software, if we're abandoning them then don't pretend there's a viable alternative for them. You know there isn't.
- d.
On Fri, Jan 16, 2015 at 4:29 AM, David Gerard dgerard@gmail.com wrote:
On 16 January 2015 at 07:38, Chad innocentkiller@gmail.com wrote:
These days I'm not convinced it's our job to support every possible scale of wiki install. There are several simpler and smaller wiki solutions for people who can't do more than FTP a few files to a folder.
In this case the problem is leaving users and their wiki content unsupported. Because they won't move while it "works", even as it becomes a Swiss cheese of security holes. Because their content is important to them.
This is the part of the mission that involves everyone else producing the sum of human knowledge. They used our software, if we're abandoning them then don't pretend there's a viable alternative for them. You know there isn't.
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago). There's a separate org that gets a grant from WMF to handle third party use, and it's funded just well enough to keep the lights on.
Take a look at the current state of MediaWiki on the internet. I'd be surprised if less than 99% of the MediaWiki wikis in existence are out of date. Most are probably running a version from years ago. The level of effort required to upgrade MediaWiki and its extensions that don't list compatibility with core versions is past the skill level of most people that use the software. Even places with a dedicated ops team find MediaWiki difficult to keep up to date. Hell, I find it difficult and I worked for WMF on the ops team and have been a MediaWiki dev since 1.3.
I don't think adding a couple more services is going to drastically alter the current situation.
- Ryan
On Jan 16, 2015, at 11:27 AM, Ryan Lane rlane32@gmail.com wrote:
On Fri, Jan 16, 2015 at 4:29 AM, David Gerard dgerard@gmail.com wrote:
On 16 January 2015 at 07:38, Chad innocentkiller@gmail.com wrote:
These days I'm not convinced it's our job to support every possible scale of wiki install. There's several simpler and smaller wiki solutions for people who can't do more than FTP a few files to a folder.
In this case the problem is leaving users and their wiki content unsupported. Because they won't move while it "works", even as it becomes a Swiss cheese of security holes. Because their content is important to them.
This is the part of the mission that involves everyone else producing the sum of human knowledge. They used our software, if we're abandoning them then don't pretend there's a viable alternative for them. You know there isn't.
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago). There's a separate org that gets a grant from WMF to handle third party use, and it's funded just well enough to keep the lights on.
Take a look at the current state of MediaWiki on the internet. I'd be surprised if less than 99% of the MediaWiki wikis in existence are out of date. Most are probably running a version from years ago. The level of effort required to upgrade MediaWiki and its extensions that don't list compatibility with core versions is past the skill level of most people that use the software. Even places with a dedicated ops team find MediaWiki difficult to keep up to date. Hell, I find it difficult and I worked for WMF on the ops team and have been a MediaWiki dev since 1.3.
I don't think adding a couple more services is going to drastically alter the current situation.
- Ryan
This sounds like a problem we need to fix, rather than making it worse. If most wikis are not up to date then we should work on making it easier to keep up to date, not making it harder. Any SOA approach is sadly DOA for any shared host; even if the user has shell access or other means of launching processes (so excluding almost every free host in existence here), there is no guarantee that a particular environment such as node.js or Python is installed or at the correct version (and similarly no guarantee for a host to install them for you). Such a move, while possibly ideal for the WMF, would indeed make running on shared hosting nearly if not entirely impossible. The net effect is that people will keep their existing MW installs at their current version or even install old versions of the software so they can look like Wikipedia or whatnot, rife with unpatched security vulnerabilities. I cannot fathom that the WMF would be ok with leaving third-party users in such a situation.
Moving advanced things into services may make sense, but it should not come at the expense of making the core worthless for third parties; sensible refactors could still be done on core while leaving it as pure PHP. Services, if implemented, should be entirely optional (as an example, Parsoid is not required to have a base working wiki. I would like to stick to that model for any future service as well).
Regards, Ryan Schmidt
On Fri, Jan 16, 2015 at 10:21 AM, Ryan Schmidt skizzerz@gmail.com wrote:
This sounds like a problem we need to fix, rather than making it worse. If most wikis are not up to date then we should work on making it easier to keep up to date, not making it harder. Any SOA approach is sadly DOA for any shared host; even if the user has shell access or other means of launching processes (so excluding almost every free host in existence here), there is no guarantee that a particular environment such as node.js or Python is installed or at the correct version (and similarly no guarantee for a host to install them for you). Such a move, while possibly ideal for the WMF, would indeed make running on shared hosting nearly if not entirely impossible. The net effect is that people will keep their existing MW installs at their current version or even install old versions of the software so they can look like Wikipedia or whatnot, rife with unpatched security vulnerabilities. I cannot fathom that the WMF would be ok with leaving third-party users in such a situation.
You'd be wrong. WMF has been ok with it since they dropped efforts for supporting third parties (and knew that most MW installs were insecure well before that). It would be great if the problem was solved, but people have been asking for it for over 10 years now and it hasn't happened. I don't expect it to happen any time soon. There's only a couple substantial MW users and neither of them care about third party usage.
Moving advanced things into services may make sense, but it should not come at the expense of making the core worthless for third parties; sensible refactors could still be done on core while leaving it as pure PHP. Services, if implemented, should be entirely optional (as an example, Parsoid is not required to have a base working wiki. I would like to stick to that model for any future service as well).
I agree with other people that MW as-is should be forked. Until this happens it'll never get proper third party support.
- Ryan
I challenge the very foundation of arguments based on this 95% statistic.
The 95% statistic is bogus, not because it's inaccurate, but because it's misleading.
The number of MediaWiki instances out there is meaningless. The value we get from 3rd party use correlates much more strongly with the number of users, not the number of instances.
Consider the experience of using a MediaWiki instance running on a shared host. MediaWiki doesn't actually perform well enough in this environment to support a significant amount of usage. It's simply not possible that shared host installs can achieve enough usage to be relevant.
- Trevor
On 16/01/15 19:05, Trevor Parscal wrote:
I challenge the very foundation of arguments based on this 95% statistic.
The 95% statistic is bogus, not because it's inaccurate, but because it's misleading.
The number of MediaWiki instances out there is meaningless. The value we get from 3rd party use correlates much more strongly with the number of users, not the number of instances.
Consider the experience of using a MediaWiki instance running on a shared host. MediaWiki doesn't actually perform well enough in this environment to support a significant amount of usage. It's simply not possible that shared host installs can achieve enough usage to be relevant.
The 95% statistic may not be meaningful, but neither is this. The number of users involved does not reflect the importance of the information presented any more than that a project exists at all.
The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.
This applies whether a project has several thousand active users building a general encyclopedia of common topics, or a team of ten working on a collection of bird names, or even one single individual putting together a chronicle of their family history. MediaWiki is a platform that empowers all of these, and to turn your back on them directly contradicts Wikimedia's overall mission.
-I
On Fri, Jan 16, 2015 at 12:09 PM, Isarra Yos zhorishna@gmail.com wrote:
The 95% statistic may not be meaningful, but neither is this. The number of users involved does not reflect the importance of the information presented any more than that a project exists at all.
The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.
This applies whether a project has several thousand active users building a general encyclopedia of common topics, or a team of ten working on a collection of bird names, or even one single individual putting together a chronicle of their family history. MediaWiki is a platform that empowers all of these, and to turn your back on them directly contradicts Wikimedia's overall mission.
I believe the claim is being made that many of those people with tiny one-off self-hosted projects are *not* being well-served right now by a MediaWiki instance that's out of date with security holes and no new features.
Here's where I tend to make the comparison to WordPress:
WordPress is open source and available for self-hosting (including on shared hosts with PHP and MySQL), but a lot of people find it's a pain to maintain -- even with the HUGE usability and upgrade improvements that WordPress folks have made in the last few years.
I personally maintain my own WordPress blog, and pretty much just occasionally make a security update and otherwise hardly touch it. Plenty of other people do this too, with varying degrees of success staying up to date and fixing occasional operational problems.
But lots of people also use WordPress's hosted service as well. No need to manually install security upgrades, maintain your own vhost, etc.
In our comparison, we have MediaWiki for self-hosters, and we have hosted projects for Wikimedia, but if you want a hosted wiki that's not covered in ads your choices are vague and poorly documented.
I would argue that many small projects would be better served by good wiki farm hosting than trying to maintain their own sites.
Sure, that's not for everyone -- but I think the people who are best served by running their own servers have a lot of overlap with the people who would be capable of maintaining multiple services on a virtual machine.
-- brion
On 16/01/15 20:28, Brion Vibber wrote:
On Fri, Jan 16, 2015 at 12:09 PM, Isarra Yos zhorishna@gmail.com wrote:
The 95% statistic may not be meaningful, but neither is this. The number of users involved does not reflect the importance of the information presented any more than that a project exists at all.
The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.
This applies whether a project has several thousand active users building a general encyclopedia of common topics, or a team of ten working on a collection of bird names, or even one single individual putting together a chronicle of their family history. MediaWiki is a platform that empowers all of these, and to turn your back on them directly contradicts Wikimedia's overall mission.
I believe the claim is being made that many of those people with tiny one-off self-hosted projects are *not* being well-served right now by a MediaWiki instance that's out of date with security holes and no new features.
Here's where I tend to make the comparison to WordPress:
WordPress is open source and available for self-hosting (including on shared hosts with PHP and MySQL), but a lot of people find it's a pain to maintain -- even with the HUGE usability and upgrade improvements that WordPress folks have made in the last few years.
I personally maintain my own WordPress blog, and pretty much just occasionally make a security update and otherwise hardly touch it. Plenty of other people do this too, with varying degrees of success staying up to date and fixing occasional operational problems.
But lots of people also use WordPress's hosted service as well. No need to manually install security upgrades, maintain your own vhost, etc.
In our comparison, we have MediaWiki for self-hosters, and we have hosted projects for Wikimedia, but if you want a hosted wiki that's not covered in ads your choices are vague and poorly documented.
I would argue that many small projects would be better served by good wiki farm hosting than trying to maintain their own sites.
Sure, that's not for everyone -- but I think the people who are best served by running their own servers have a lot of overlap with the people who would be capable of maintaining multiple services on a virtual machine.
-- brion
Comparing it to WordPress is a false equivalence, however. There, WordPress is both generic host and developer/maintainer, but here Wikimedia, the developer/maintainer, only hosts its own projects. Not others. No other host will be able to achieve the same scope and resources as Wikimedia, and even if what a host has is /good/, they still will not have anywhere near the same upstream clout.
And this is exactly the problem that I think many of us are objecting to. As third-party users, even when we are hosters ourselves, we have little clout, few resources, and yet somehow we are expected to stand on our own? How?
One of MediaWiki's strongest features is its flexibility, too, and that is something that wiki hosting services, even relatively good ones, often cannot effectively support due to limited scope. Even wanting fairly common things like SMW narrows options down considerably, and if you're building your own extensions in order to deal with a particular kind of data specific to your own research? Then what?
In many cases when folks run their own wikis, this is why - they either need something specific, or want to maintain some form of control over their own content/communities that isn't necessarily afforded otherwise. Otherwise, they wouldn't be bothering at all, because even as things stand, it is considerable extra work to do this.
-I
On Fri, Jan 16, 2015 at 12:27 PM, Ryan Lane rlane32@gmail.com wrote:
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago).
{{citation needed}}
On Fri, Jan 16, 2015 at 1:05 PM, Brad Jorsch (Anomie) <bjorsch@wikimedia.org> wrote:
On Fri, Jan 16, 2015 at 12:27 PM, Ryan Lane rlane32@gmail.com wrote:
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago).
{{citation needed}}
There was a WMF engineering meeting where it was announced internally. I was the only one that spoke against it. I can't give a citation to it because it was never announced outside of WMF, but soon after that third party support was moved to a grant funded org, which is the current status quo.
Don't want to air dirty laundry, but you asked ;).
- Ryan
On Fri, Jan 16, 2015 at 1:14 PM, Ryan Lane rlane32@gmail.com wrote:
On Fri, Jan 16, 2015 at 12:27 PM, Ryan Lane rlane32@gmail.com wrote:
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago).
{{citation needed}}
There was a WMF engineering meeting where it was announced internally. I was the only one that spoke against it. I can't give a citation to it because it was never announced outside of WMF, but soon after that third party support was moved to a grant funded org, which is the current status quo.
I think the confusion between "third party support" and "an open source project" is unhelpful. We're obviously an open source project with lots of contributors who aren't paid (and many of them are motivated by Wikimedia's mission), it's just that the project puts primary emphasis on the Wikimedia mission, and only secondary emphasis on third party needs. I don't think it's a dirty secret that we moved to a model of contracting out support to third parties -- there was even an RFP for it. ;-)
Whether this is the best model, or whether it's time to think about alternatives, is always up for debate. We just have a legitimate tension between needing to focus as an org on what donors are supporting us for, vs. potentially very tangentially related needs (PostgreSQL support for some enterprise wiki installation).
Erik
On Fri, Jan 16, 2015 at 1:20 PM, Erik Moeller erik@wikimedia.org wrote:
I think the confusion between "third party support" and "an open source project" is unhelpful. We're obviously an open source project with lots of contributors who aren't paid (and many of them are motivated by Wikimedia's mission), it's just that the project puts primary emphasis on the Wikimedia mission, and only secondary emphasis on third party needs. I don't think it's a dirty secret that we moved to a model of contracting out support to third parties -- there was even an RFP for it. ;-)
There's open source in technicality and open source in spirit. MediaWiki is obviously the former. It's hard to call MediaWiki open source in spirit when the code is dumped over the wall by its primary maintainer. Third parties get security fixes, but little to no emphasis is put into making it usable for them. Projects that involve making the software usable for third parties are either rejected or delegated to WMF projects that never get resourced because they aren't beneficial to WMF (effectively licking all those cookies). MediaWiki for third parties is effectively abandonware since no one is maintaining it for that purpose.
All of this is to say: Wikimedia Foundation shouldn't consider third party usage because it doesn't as-is. I don't understand why there's a conversation about it. It's putting constraints on its own architecture model because of a third party community it doesn't support.
- Ryan
On 16/01/15 21:20, Erik Moeller wrote:
On Fri, Jan 16, 2015 at 1:14 PM, Ryan Lane rlane32@gmail.com wrote:
On Fri, Jan 16, 2015 at 12:27 PM, Ryan Lane rlane32@gmail.com wrote:
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago).
{{citation needed}}
There was a WMF engineering meeting where it was announced internally. I was the only one that spoke against it. I can't give a citation to it because it was never announced outside of WMF, but soon after that third party support was moved to a grant funded org, which is the current status quo.
I think the confusion between "third party support" and "an open source project" is unhelpful. We're obviously an open source project with lots of contributors who aren't paid (and many of them are motivated by Wikimedia's mission), it's just that the project puts primary emphasis on the Wikimedia mission, and only secondary emphasis on third party needs. I don't think it's a dirty secret that we moved to a model of contracting out support to third parties -- there was even an RFP for it. ;-)
According to the text of the RfP, it was requesting proposals for managing the releases themselves. Those are nothing more than the very end result of the development process, and a part that only faces a subset of users. The process and content itself is a much larger problem that cannot so easily be shuffled away, which was, I thought, the entire reason why the WMF was specifically focussing on that instead. Unlike the releases themselves, this part is important to all of us, not just some.
Whether this is the best model, or whether it's time to think about alternatives, is always up for debate. We just have a legitimate tension between needing to focus as an org on what donors are supporting us for, vs. potentially very tangentially related needs (PostgreSQL support for some enterprise wiki installation).
Erik
This tension applies to all of the non-Wikipedia projects, not just MediaWiki. Would you say that Commons gathering up images for use on Wikisource is also something that shouldn't be done, as this is likewise not a part of the movement that donors tend to know about?
-I
On Fri, Jan 16, 2015 at 4:14 PM, Ryan Lane rlane32@gmail.com wrote:
On Fri, Jan 16, 2015 at 1:05 PM, Brad Jorsch (Anomie) <bjorsch@wikimedia.org> wrote:
On Fri, Jan 16, 2015 at 12:27 PM, Ryan Lane rlane32@gmail.com wrote:
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago).
{{citation needed}}
There was a WMF engineering meeting where it was announced internally. I was the only one that spoke against it. I can't give a citation to it because it was never announced outside of WMF, but soon after that third party support was moved to a grant funded org, which is the current status quo.
So this was never publicly announced and has had no visible effect, to the point that the latest version of all the code is still publicly available under a free license and volunteers are still encouraged to contribute?
Or is this conflating "having someone else release tarballs" with "abandoning MediaWiki as an open source project"? That's an extremely big stretch.
On Fri, Jan 16, 2015 at 1:23 PM, Brad Jorsch (Anomie) <bjorsch@wikimedia.org> wrote:
So this was never publicly announced and has had no visible effect, to the point that the latest version of all the code is still publicly available under a free license and volunteers are still encouraged to contribute?
At the time there were projects about to start for improving the installer, making extension management better, and a few other third-party things. They were dropped to "narrow scope". In practice, third-party support was the only thing dropped from scope that year. So, maybe not a visible effect because it maintained the status quo, but there was definitely an effect.
- Ryan
On Jan 16, 2015 5:14 PM, "Ryan Lane" rlane32@gmail.com wrote:
On Fri, Jan 16, 2015 at 1:05 PM, Brad Jorsch (Anomie) <bjorsch@wikimedia.org> wrote:
On Fri, Jan 16, 2015 at 12:27 PM, Ryan Lane rlane32@gmail.com wrote:
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago).
{{citation needed}}
There was a WMF engineering meeting where it was announced internally. I was the only one that spoke against it. I can't give a citation to it because it was never announced outside of WMF, but soon after that third party support was moved to a grant funded org, which is the current status quo.
Don't want to air dirty laundry, but you asked ;).
- Ryan
The MW release org, whatever it's called, is hardly at the centre of MediaWiki the open source project. They do regular releases - which is great, but that's essentially all they do (which is fine; that is their function). They aren't handling very much third party support (eg on IRC) or doing much in the way of third party dev work or even planning. But yet these things still happen. When they do happen it's usually volunteers, but also some WMF staff (perhaps in a volunteer capacity), who do them.
MediaWiki as an open source project may be in tough times in many ways, but it is not dead (yet).
--bawolff
+1 to everything Brian said.
-Chad
On Sat, Jan 17, 2015, 3:38 AM Brian Wolff bawolff@gmail.com wrote:
On Jan 16, 2015 5:14 PM, "Ryan Lane" rlane32@gmail.com wrote:
On Fri, Jan 16, 2015 at 1:05 PM, Brad Jorsch (Anomie) <bjorsch@wikimedia.org> wrote:
On Fri, Jan 16, 2015 at 12:27 PM, Ryan Lane rlane32@gmail.com wrote:
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago).
{{citation needed}}
There was a WMF engineering meeting where it was announced internally. I was the only one that spoke against it. I can't give a citation to it because it was never announced outside of WMF, but soon after that third party support was moved to a grant funded org, which is the current status quo.
Don't want to air dirty laundry, but you asked ;).
- Ryan
The MW release org, whatever it's called, is hardly at the centre of MediaWiki the open source project. They do regular releases - which is great, but that's essentially all they do (which is fine; that is their function). They aren't handling very much third party support (eg on IRC) or doing much in the way of third party dev work or even planning. But yet these things still happen. When they do happen it's usually volunteers, but also some WMF staff (perhaps in a volunteer capacity), who do them.
MediaWiki as an open source project may be in tough times in many ways, but it is not dead (yet).
--bawolff
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
I don't think adding a couple more services is going to drastically alter the current situation.
MediaWiki core and most of the extensions actually depend only on PHP and MySQL, both de facto standards included in most webhosting packages, which allows MediaWiki to run (in theory) on a wide range of web hosting packages, and updating isn't too difficult if you follow the steps on the upgrade page on mediawiki.org [1] (ok, more difficult than for other software packages). Adding new services, like Parsoid or other stuff, makes upgrading core and extensions more complicated:
-> you need to update the service as well (or hold it at a compatible version) (see my last point, too)
-> you need shell access (most web hosting packages don't offer shell access, which isn't good, yeah, but maybe for some wiki administrators it's better that they don't have shell access (even without root access))
-> maybe for some services it's more than just "run apt-get upgrade" or uploading your files via FTP to the server and running the update script, which makes the whole process much more complicated
[1] https://www.mediawiki.org/wiki/Manual:Upgrading
-----Original Message----- From: wikitech-l-bounces@lists.wikimedia.org [mailto:wikitech-l-bounces@lists.wikimedia.org] On behalf of Ryan Lane Sent: Friday, 16 January 2015 18:28 To: Wikimedia developers Subject: Re: [Wikitech-l] The future of shared hosting
On Fri, Jan 16, 2015 at 4:29 AM, David Gerard dgerard@gmail.com wrote:
On 16 January 2015 at 07:38, Chad innocentkiller@gmail.com wrote:
These days I'm not convinced it's our job to support every possible scale of wiki install. There's several simpler and smaller wiki solutions for people who can't do more than FTP a few files to a folder.
In this case the problem is leaving users and their wiki content unsupported. Because they won't move while it "works", even as it becomes a Swiss cheese of security holes. Because their content is important to them.
This is the part of the mission that involves everyone else producing the sum of human knowledge. They used our software, if we're abandoning them then don't pretend there's a viable alternative for them. You know there isn't.
What you're forgetting is that WMF abandoned MediaWiki as an Open Source project quite a while ago (at least 2 years ago). There's a separate org that gets a grant from WMF to handle third party use, and it's funded just well enough to keep the lights on.
Take a look at the current state of MediaWiki on the internet. I'd be surprised if less than 99% of the MediaWiki wikis in existence are out of date. Most are probably running a version from years ago. The level of effort required to upgrade MediaWiki and its extensions that don't list compatibility with core versions is past the skill level of most people that use the software. Even places with a dedicated ops team find MediaWiki difficult to keep up to date. Hell, I find it difficult and I worked for WMF on the ops team and have been a MediaWiki dev since 1.3.
I don't think adding a couple more services is going to drastically alter the current situation.
- Ryan
Hi!
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki. What will happen to the numerous MediaWiki installs on shared hosting providers like 1and1, Dreamhost or GoDaddy when running MediaWiki requires multiple node.js/java/hack/python stand alone processes to function? Is the MediaWiki community making a conscious decision to abandon these customers? If so, should we start looking for a suitable replacement that can be recommended, and possibly develop tools to ease the migration away from MediaWiki to another monolithic wiki application? If not, how are we going to ensure that pure PHP alternate implementations get equal testing and feature development if they are not actively used on the Foundation's project wikis?
I think we're trying to fulfill a bit of a contradictory requirement here - running the same software both for a site the size of *.wikipedia.org and for a 1-visit-a-week-maybe $2/month shared hosting install. I think it would be increasingly hard to be committed to both equally. However, with clear API architecture we could maybe have alternatives - i.e. be able to have the same service performed by a superpowered cluster or by a PHP implementation at the same URL on the same host. Of course, the latter would have much worse performance and maybe a reduced feature set. How large the delta is, we'd need to decide. But if we have a clear architecture I think it should be possible to define a minimum capability and have the ability to degrade from a full-power install to a reduced-capability install (e.g. blazingly fast fulltext search vs. just limited slow 'LIKE %word%' search), while still keeping the same architecture. Is this something we could do (or, having the API, let the willing people from the community do it)? Is the $2/month use case important enough to actually do it?
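Stas's minimum-capability idea - one service contract, with backends that can degrade gracefully - could be sketched roughly like this. This is an illustrative Python sketch; all class and method names are hypothetical, not actual MediaWiki code:

```python
# One minimal search contract, two implementations: a cluster-style
# indexed backend and a naive fallback analogous to SQL LIKE '%word%'.
from abc import ABC, abstractmethod

class SearchBackend(ABC):
    """Minimum capability every install must provide."""
    @abstractmethod
    def search(self, query: str) -> list[str]: ...

class ClusterSearch(SearchBackend):
    """Stand-in for a dedicated, pre-indexed search service."""
    def __init__(self, index: dict[str, set[str]]):
        self.index = index  # word -> set of page titles
    def search(self, query):
        return sorted(self.index.get(query.lower(), set()))

class NaiveSearch(SearchBackend):
    """Reduced-capability fallback: a slow substring scan of all pages."""
    def __init__(self, pages: dict[str, str]):
        self.pages = pages  # title -> page text
    def search(self, query):
        q = query.lower()
        return sorted(t for t, text in self.pages.items() if q in text.lower())

pages = {"Sparrow": "A small bird", "Lark": "A singing bird"}
index = {"bird": {"Sparrow", "Lark"}, "small": {"Sparrow"}}

# Callers depend only on the contract, not on which backend is wired in.
for backend in (ClusterSearch(index), NaiveSearch(pages)):
    assert backend.search("bird") == ["Lark", "Sparrow"]
```

The point of the sketch is that the degraded install differs only in performance characteristics, not in the interface that the rest of the code sees.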
On Fri, Jan 16, 2015 at 2:49 AM, Stas Malyshev smalyshev@wikimedia.org wrote:
However, with clear API architecture we could maybe have alternatives - i.e. be able to have the same service performed by a superpowered cluster or by a PHP implementation at the same URL on the same host.
The problem there is that the PHP implementation is likely to code-rot, unless we create really good unit tests that actually run for it.
It would be less bad if the focus were on librarization and daemonization (and improvement) of the existing PHP code so both methods could share a majority of the code base, rather than rewriting things from scratch in an entirely different language.
and maybe a reduced feature set.
That becomes a hazard when other stuff starts to depend on the non-reduced feature set.
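Brad's librarization-and-daemonization point - one shared code base used both by a direct in-process call and by a standalone service wrapper - might look roughly like this. A hypothetical sketch in Python; the daemon side is shown as a plain request handler rather than a real HTTP server, and none of these names come from MediaWiki:

```python
import json

# Shared library: a single implementation of the computation.
def render_title(wikitext: str) -> str:
    """Toy stand-in for a shared parsing routine."""
    return wikitext.strip("= ").strip()

# In-process use (monolithic install): call the library directly.
def render_inprocess(wikitext: str) -> str:
    return render_title(wikitext)

# Daemonized use: the same library behind a JSON request handler,
# the way a standalone service would expose it over the wire.
def handle_request(body: str) -> str:
    payload = json.loads(body)
    return json.dumps({"title": render_title(payload["wikitext"])})

assert render_inprocess("== Heading ==") == "Heading"
assert json.loads(handle_request('{"wikitext": "== Heading =="}'))["title"] == "Heading"
```

Because both entry points call the same `render_title`, there is no second implementation to rot: the fallback and the service can only drift apart at the thin wrapper layer.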
On Jan 16, 2015 11:07 AM, "Brad Jorsch (Anomie)" bjorsch@wikimedia.org wrote:
On Fri, Jan 16, 2015 at 2:49 AM, Stas Malyshev smalyshev@wikimedia.org wrote:
However, with clear API architecture we could maybe have alternatives - i.e. be able to have the same service performed by a superpowered cluster or by a PHP implementation at the same URL on the same host.
The problem there is that the PHP implementation is likely to code-rot, unless we create really good unit tests that actually run for it.
It would be less bad if the focus were on librarization and daemonization (and improvement) of the existing PHP code so both methods could share a majority of the code base, rather than rewriting things from scratch in an entirely different language.
and maybe a reduced feature set.
That becomes a hazard when other stuff starts to depend on the non-reduced feature set.
-- Brad Jorsch (Anomie) Software Engineer Wikimedia Foundation
I like this approach. After all it worked well enough for image thumbnailing and job queue.
Writing things in different languages just reminds me of what a mess math support used to be.
--bawolff
Hi!
The problem there is that the PHP implementation is likely to code-rot, unless we create really good unit tests that actually run for it.
With service architecture, there should be a test suite running against the service and verifying the compliance with the API definition. So if the PHP implementation passes that, it should be fine.
It would be less bad if the focus were on librarization and daemonization (and improvement) of the existing PHP code so both methods could share a majority of the code base, rather than rewriting things from scratch in an entirely different language.
I think language is secondary here to other concerns. I.e. if the scale requirements require implementing something that is best written in language X, then requiring it to be all in PHP may just not be practical for the scale.
That becomes a hazard when other stuff starts to depend on the non-reduced feature set.
True, but I think there could still be a minimal API and an extended API. E.g. a search engine could support just word-match search or advanced expressions, or a database may be transactional with high levels of reliability or a non-transactional storage with minimal guarantees. Of course, that means some functions would not be available everywhere - but I don't see a good way to avoid it when the requirements diverge.
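The compliance suite Stas describes could be sketched as a single set of assertions run against every implementation of a service's API definition. A hypothetical Python illustration; the store API here is invented for the example:

```python
# One contract, many implementations: any backend that passes
# compliance_suite() satisfies the API definition, regardless of
# whether it is an in-process fallback or a remote cluster service.
def compliance_suite(store):
    """Checks only observable contract behaviour, never internals."""
    store.put("Page", "text v1")
    assert store.get("Page") == "text v1"
    store.put("Page", "text v2")
    assert store.get("Page") == "text v2"   # latest write wins
    assert store.get("Missing") is None     # absent keys return None

class DictStore:
    """Trivial in-process reference implementation of the contract."""
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value
    def get(self, key):
        return self.data.get(key)

compliance_suite(DictStore())
```

Run against both the cluster service and the PHP fallback, a suite like this is what keeps the two from silently diverging.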
On 01/16/2015 01:49 AM, Stas Malyshev wrote:
I think we're trying to fulfill a bit of a contradictory requirement here - running the same software both for a site the size of *.wikipedia.org and for a 1-visit-a-week-maybe $2/month shared hosting install. I think it would be increasingly hard to be committed to both equally.
This has always been my question independent of the specific choices made or being made to address this. So, some answers could be:
* Yes, it is possible, and for that we need to continue with a monolithic install; if VisualEditor/Flow, etc. is needed, then things like Parsoid would have to become part of core to continue to support WMF's product offerings.
* It is difficult to support installs equally well at the extremities of scaling / performance requirements, and for WMF's scale and features (new ones being developed and conceived on an ongoing basis), we need a different architecture where there are independent services that can be developed and maintained independently, but, perhaps with packaging solutions, a large subset of existing wiki installs can still be supported and developed.
* It is not really possible to do both and there should be a fork / split of the mediawiki codebase, and each should go their own way.
Maybe I am setting up strawmen to articulate my point of view, which is the middle solution, simply because I don't believe that with continuing demand / development of new features, you can make the same codebase work for WMF's expanding feature set and for a lot of small wikis who are content with simple wikitext-based wikis without any of the additional complexity.
Anyway, independent of the specific outlines above, I think the question that should inform a discussion would be: is it really possible for the same codebase to support both WMF's requirements (I say WMF because it is leading the development of new features / products), with new and evolving products / features, as well as the countless small wikis out there who have exactly one or two requirements of the wiki?
I am happy to be disabused of the importance of this question.
Subbu.
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki.
Seems like an opportunity. "Deploy your MediaWiki-Vagrant instance to our cloud infrastructure cheap." It's not $2/month, but Digital Ocean can host a 1GB VM for $10/month. Maybe https://atlas.hashicorp.com/ will host vagrant boxes cheaper.
Docker allows many Linux containers on the same physical system, which should lead to lower pricing. Some MediaWiki dockerfiles install Parsoid and nodejs as well as MediaWiki and PHP, e.g. CannyComputing's https://github.com/CannyComputing/frontend-mediawiki , or you'd have a separate container for each service. I can't figure out the pricing.
-- =S Page WMF Tech writer
On 16/01/2015 07:38, Bryan Davis wrote:
There has been a lot of talk over the last year or so of how and when to move MediaWiki to a service oriented architecture [0]. So much so that it is actually one of the marquee topics at the upcoming Developer Summit.
<snip>
Hello,
Moving to a service oriented architecture will certainly ease Wikimedia maintenance, or at least delegate maintenance authority to independent sub teams which would be rather nice.
The monolithic approach of MediaWiki with all services / classes being tightly coupled makes the platform rather hard to build upon. It certainly lacks clear contracts between components.
Can the SOA model be supported on shared hosts or by third parties? Surely. But one has to handle the integration and release work to make it easy to install all the components and configure them. Is Wikimedia going to handle it? I doubt it.
The underlying choice is:
a) should Wikimedia keep being slowed down by claiming MediaWiki is suitable for others, or
b) reclaim MediaWiki as a wiki made to create and distribute free knowledge, geared toward supporting Wikipedia and other WMF projects?
I think WMF should reclaim it, take decisions and stop pretending we support third party installations. That would mean going as far as dropping PostgreSQL support, since we have no use for it.
So what we might end up with:
- Wikimedia using the SOA MediaWiki with split components maintained by staff and the Wikimedia volunteer devs. Code which is of no use for the cluster is dropped, which would surely ease maintainability. We can then reduce MediaWiki to a very thin middleware and eventually rewrite whatever little code is left.
- The old PHP-based MediaWiki is forked and given to the community to maintain. WMF is no longer involved in it and the PHP-only project lives its own life. That project could be made to remove a lot of the rather complicated code that suits mostly Wikimedia, making MediaWiki simpler and easier to adjust for small installations.
So, is it time to fork along both intents? I think so.
On 16 January 2015 at 16:09, Antoine Musso hashar+wmf@free.fr wrote:
So what we might end up with:
- Wikimedia using the SOA MediaWiki with split components maintained by staff and the Wikimedia volunteer devs. Code which is of no use for the cluster is dropped, which would surely ease maintainability. We can then reduce MediaWiki to a very thin middleware and eventually rewrite whatever little code is left.
- The old PHP-based MediaWiki is forked and given to the community to maintain. WMF is no longer involved in it and the PHP-only project lives its own life. That project could be made to remove a lot of the rather complicated code that suits mostly Wikimedia, making MediaWiki simpler and easier to adjust for small installations. So, is it time to fork along both intents? I think so.
This is not a great idea because it makes WMF wikis unforkable in practical terms. The data is worthless without being able to run an instance of the software. This will functionally proprietise all WMF wikis, whatever the licence statement.
- d.
<quote name="David Gerard" date="2015-01-16" time="16:27:22 +0000">
This is not a great idea because it makes WMF wikis unforkable in practical terms. The data is worthless without being able to run an instance of the software. This will functionally proprietise all WMF wikis, whatever the licence statement.
"Unforkable" as a binary is not true. "Not easy to fork without implementing all of the pieces carefully" is more accurate. But it's still possible, especially since all of the config/puppet code that is used to do it is open and Freely licensed.
It ain't easy hosting a site this large with this amount of traffic (thanks Ops!) so it's not surprising that "forking" it wouldn't be easy either.
On 16 January 2015 at 17:10, Greg Grossmeier greg@wikimedia.org wrote:
<quote name="David Gerard" date="2015-01-16" time="16:27:22 +0000">
This is not a great idea because it makes WMF wikis unforkable in practical terms. The data is worthless without being able to run an instance of the software. This will functionally proprietise all WMF wikis, whatever the licence statement.
"Unforkable" as a binary is not true.
That's why I said "functionally".
"Not easy to fork without implementing all of the pieces carefully" is more accurate. But it's still possible, especially since all of the config/puppet code that is used to do it is open and Freely licensed.
A licence that was the GPL with the additional requirement of attaching 1kg of gold to each copy would be technically free as well, but present difficulties in practice.
What I'm saying is that "technically" is insufficient.
It ain't easy hosting a site this large with this amount of traffic (thanks Ops!) so it's not surprising that "forking" it wouldn't be easy either.
However, at present low-traffic copies are pretty easy.
- d.
On Fri, Jan 16, 2015 at 8:27 AM, David Gerard dgerard@gmail.com wrote:
On 16 January 2015 at 16:09, Antoine Musso hashar+wmf@free.fr wrote:
So what we might end up with:
- Wikimedia using the SOA MediaWiki with split components maintained by staff and the Wikimedia volunteer devs. Code which is of no use for the cluster is dropped, which would surely ease maintainability. We can then reduce MediaWiki to a very thin middleware and eventually rewrite whatever little code is left.
- The old PHP-based MediaWiki is forked and given to the community to maintain. WMF is no longer involved in it and the PHP-only project lives its own life. That project could be made to remove a lot of the rather complicated code that suits mostly Wikimedia, making MediaWiki simpler and easier to adjust for small installations. So, is it time to fork along both intents? I think so.
This is not a great idea because it makes WMF wikis unforkable in practical terms. The data is worthless without being able to run an instance of the software. This will functionally proprietise all WMF wikis, whatever the licence statement.
So far all of the new services are open source. If you want to work on forking Wikimedia projects you can even start a Labs project to work on it. I don't think this is a concern here.
- Ryan
IMHO, the cases traditionally handled by "tiny one-off wiki on shared hosting" would be better served by migrating to a dedicated wiki hosting service which will actually update the software for security issues and new features.
I believe WMF should concentrate on the large-scale hosting case, with Vagrant and similar VM solutions for easy dev & one-off setups where people are willing to maintain them themselves.
This should include better support for creating and maintaining large farms of small independent wikis.
We should also either spin off an affordable for-pay no-ads wiki hosting company, or let people create one (or multiple) and direct people to them.
-- brion
On Thu, Jan 15, 2015 at 10:38 PM, Bryan Davis bd808@wikimedia.org wrote:
There has been a lot of talk over the last year or so of how and when to move MediaWiki to a service oriented architecture [0]. So much so that it is actually one of the marquee topics at the upcoming Developer Summit.
I think the problem statement for the services RFC is dead on in describing issues that MediaWiki and the Wikimedia Foundation face today with the current monolithic MediaWiki implementation. We have organic entanglements between subsystems that make reasoning about the code base difficult. We have a growing need for API access to data and computations that have historically been presented only via generated HTML. We have cross-team and cross-project communication issues that lead to siloed implementations and duplication of effort.
The solution to these issues proposed in the RFC is to create independent services (eg Parsoid, RESTBase) to implement features that were previously handled by the core MediaWiki application. Thus far Parsoid is only required if a wiki wants to use VisualEditor. There has been discussion however of it being required in some future version of MediaWiki where HTML is the canonical representation of articles {{citation needed}}. This particular future may or may not be far off on the calendar, but there are other services that have been proposed (storage service, REST content API) that are likely to appear in production use at least for the Foundation projects within the next year.
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki. What will happen to the numerous MediaWiki installs on shared hosting providers like 1and1, Dreamhost or GoDaddy when running MediaWiki requires multiple node.js/java/hack/python stand alone processes to function? Is the MediaWiki community making a conscious decision to abandon these customers? If so, should we start looking for a suitable replacement that can be recommended, and possibly develop tools to ease the migration away from MediaWiki to another monolithic wiki application? If not, how are we going to ensure that pure PHP alternate implementations get equal testing and feature development if they are not actively used on the Foundation's project wikis?
Bryan
Bryan Davis Wikimedia Foundation bd808@wikimedia.org [[m:User:BDavis_(WMF)]] Sr Software Engineer Boise, ID USA irc: bd808 v:415.839.6885 x6855
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 16/01/15 17:38, Bryan Davis wrote:
The solution to these issues proposed in the RFC is to create independent services (eg Parsoid, RESTBase) to implement features that were previously handled by the core MediaWiki application. Thus far Parsoid is only required if a wiki wants to use VisualEditor. There has been discussion however of it being required in some future version of MediaWiki where HTML is the canonical representation of articles {{citation needed}}.
Parsoid depends on the MediaWiki parser, it calls it via api.php. It's not a complete, standalone implementation of wikitext to HTML transformation.
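To make that division of labour concrete: rather than reimplementing template expansion itself, Parsoid asks MediaWiki's web API to do it. A minimal sketch of building such an api.php call (Python rather than PHP/node for brevity; the exact parameters Parsoid sends differ, this just shows the general shape):

```python
from urllib.parse import urlencode

def expandtemplates_url(api_base, wikitext, title="API"):
    """Build an api.php query that an external service could use to have
    MediaWiki itself expand templates and parser functions in a snippet
    of wikitext, before the service does its own HTML serialization."""
    params = {
        "action": "expandtemplates",
        "format": "json",
        "title": title,
        "text": wikitext,
    }
    return api_base + "?" + urlencode(params)

url = expandtemplates_url("https://en.wikipedia.org/w/api.php",
                          "{{convert|5|km|mi}}")
print(url)
```

The point is that the hard, extension-aware part of wikitext processing still happens inside MediaWiki; the service only consumes the result.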
HTML storage would be a pretty simple feature, and would allow third-party users to use VE without Parsoid. It's not so simple to use Parsoid without the MediaWiki parser, especially if you want to support all existing extensions.
So, as currently proposed, HTML storage is actually a way to reduce the dependency on services for non-WMF wikis, not to increase it.
Based on recent comments from Gabriel and Subbu, my understanding is that there are no plans to drop the MediaWiki parser at the moment.
This particular future may or may not be far off on the calendar, but there are other services that have been proposed (storage service, REST content API) that are likely to appear in production use at least for the Foundation projects within the next year.
There is a proposal to move revision storage to Cassandra, possibly with node.js middleware. I don't think that project requires dropping support for revision storage in MySQL. I think MediaWiki should be a client for multiple revision storage backends, like what we are already doing for file storage.
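For illustration, the kind of backend abstraction described here might look like the following sketch (the class names are hypothetical, not actual MediaWiki code, and it is in Python rather than PHP for brevity):

```python
from abc import ABC, abstractmethod

class RevisionStore(ABC):
    """Hypothetical abstract API for revision storage backends."""

    @abstractmethod
    def get(self, rev_id: int) -> str:
        """Fetch the text of a revision."""

    @abstractmethod
    def put(self, rev_id: int, text: str) -> None:
        """Store the text of a revision."""

class MemoryRevisionStore(RevisionStore):
    """In-memory stand-in for what a MySQL- or Cassandra-backed
    implementation would provide behind the same interface."""
    def __init__(self):
        self._revs = {}
    def get(self, rev_id):
        return self._revs[rev_id]
    def put(self, rev_id, text):
        self._revs[rev_id] = text

store: RevisionStore = MemoryRevisionStore()
store.put(42, "== Hello ==")
print(store.get(42))
```

Callers depend only on the abstract interface, so swapping MySQL for Cassandra (or whatever comes next) becomes a configuration choice rather than a fork in the codebase.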
There's no reason to think Cassandra is the best storage system that will ever be conceived; the end of history. There will be new technologies in the future, and an abstract backend API for revision storage will help us to utilise them when they become available.
One of the bigger questions I have about the potential shift to requiring services is the fate of shared hosting deployments of MediaWiki.
As long as there are no actual reasons for dropping pure-PHP core functionality, the idea of WMF versus shared hosting is a false dichotomy.
Note that feature parity with Wikipedia has not been possible in pure PHP since 2003, when texvc was introduced. And now that we have Scribunto, you can't even copy an infobox template from Wikipedia to a pure-PHP hosted MediaWiki instance. The shared hosting environment has never been preferred, and I'm not particularly attached to it. Support for it is an accidental consequence of MediaWiki's simplicity and flexibility, and those qualities should be valued for their own reasons.
-- Tim Starling
Tim Starling wrote:
Note that feature parity with Wikipedia has not been possible in pure PHP since 2003, when texvc was introduced. And now that we have Scribunto, you can't even copy an infobox template from Wikipedia to a pure-PHP hosted MediaWiki instance. The shared hosting environment has never been preferred, and I'm not particularly attached to it. Support for it is an accidental consequence of MediaWiki's simplicity and flexibility, and those qualities should be valued for their own reasons.
I tripped across http://www.glottopedia.org/index.php/Main_Page recently. This is one of many specialized wikis throughout the Internet. I don't know if Glottopedia is using shared hosting or not, but the larger point is that small ideas can turn into big ideas. You've been quite fond of referencing the Wikimedia Foundation's vision statement lately. I think you probably recognize that Wikimedia wikis can't house everything; they can house much, much more than they currently do, but some information will be too specialized or esoteric for an encyclopedia or a source repository or a quote collection or whatever. And so you have the actual "rest of the library" to build and maintain (as opposed to what Wikia has wrought), some of which is built upon small MediaWiki installations, which may or may not turn into large MediaWiki installations one day.
There are a lot of decisions that MediaWiki development could have made that would have made shared hosting support difficult or nearly impossible. Instead we have .php5 file suffixes, the ability to re-run the installer as an upgrade path, retained support for older versions of PHP, etc. We've added or supported functionality in order to serve users who, for example, may not have shell access or strong technical capabilities. I completely agree with supporting simplicity and flexibility, and I agree that complete feature parity hasn't been possible in quite some time. It's in thinking about users who are on more limited hosts that we often develop better (simpler and more flexible) solutions.
I strongly disagree with the seeming dismissal of smaller sites with more limited budgets that have already chosen to use MediaWiki. We should be proud of how popular MediaWiki has become across the Internet and we should support its growth because it ultimately supports our own. The more people engaged with and using MediaWiki, the better off we are. While you may not be attached to these smaller installations, the people who work on them are, and I don't think it's fair to these users, MediaWiki's customers and supporters, to casually wave them aside as flukes.
MZMcBride
MZMcBride z@mzmcbride.com writes:
We should be proud of how popular MediaWiki has become across the Internet and we should support its growth because it ultimately supports our own. The more people engaged with and using MediaWiki, the better off we are.
Thank you for this. I agree whole heartedly. I'd like to see what we can do to increase its usage.
Mark.
Tim Starling tstarling@wikimedia.org writes:
HTML storage would be a pretty simple feature, and would allow third-party users to use VE without Parsoid.
I've been thinking about how to implement an alternative to Parsoid so that users of MW could use VE without standing up a node.js server.
This gives me hope.
Could anyone elaborate on what VE needs from MW in order to operate without Parsoid? Maybe there is an architecture diagram I've been missing?
Mark.
On Tue, Jan 20, 2015 at 10:54 AM, Mark A. Hershberger mah@nichework.com wrote:
Tim Starling tstarling@wikimedia.org writes:
HTML storage would be a pretty simple feature, and would allow third-party users to use VE without Parsoid.
I've been thinking about how to implement an alternative to Parsoid so that users of MW could use VE without standing up a node.js server.
This gives me hope.
Could anyone elaborate on what VE needs from MW in order to operate without Parsoid? Maybe there is an architecture diagram I've been missing?
Funny, we were just talking about that in the office :-)
There are two things that are needed:
1. ContentHandler support for storing HTML. As Tim points out, this should be reasonably straightforward. I imagine there is a long tail of small implementation gaps, but a prototype-quality implementation of this would probably be easy.
2. HTML validation - our current security model relies on the HTML being generated server-side by a wikitext parser. If we cut wikitext out of the loop, we'll need some other way of ensuring that people can't inject arbitrary Javascript/Flash embedding/Java applet/ActionScript/iframe or any other security horrors.
#2 is a lot harder, and something we need to decide how we want to implement (e.g. is this something we do in PHP or Node?).
Rob
On 20 January 2015 at 11:08, Rob Lanphier robla@wikimedia.org wrote:
There are two things that are needed:
1. ContentHandler support for storing HTML. As Tim points out, this should be reasonably straightforward. I imagine there is a long tail of small implementation gaps, but a prototype-quality implementation of this would probably be easy.
2. HTML validation - our current security model relies on the HTML being generated server-side by a wikitext parser. If we cut wikitext out of the loop, we'll need some other way of ensuring that people can't inject arbitrary Javascript/Flash embedding/Java applet/ActionScript/iframe or any other security horrors.
#2 is a lot harder, and something we need to decide how we want to implement (e.g. is this something we do in PHP or Node?).
There's also #3 – doing all the dozens of things that the wikitext parser does on ingestion. All the redlinks and categories tables, building a user-land (not UI) HTML template system for transclusions, media updates, *etc.* Not a trivial task.
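To give a flavour of one such ingestion task, here is a toy sketch of rebuilding link tables by walking stored HTML (Python for brevity; the /wiki/ URL layout and the table names are illustrative assumptions, and the real job covers far more than links):

```python
from html.parser import HTMLParser

class LinkTableExtractor(HTMLParser):
    """Walk saved article HTML and collect the internal links and
    category memberships that the wikitext parser currently records
    in the pagelinks and categorylinks tables on save."""
    def __init__(self):
        super().__init__()
        self.pagelinks = []
        self.categorylinks = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href", "")
        if tag == "a" and href and href.startswith("/wiki/"):
            title = href[len("/wiki/"):]
            if title.startswith("Category:"):
                self.categorylinks.append(title[len("Category:"):])
            else:
                self.pagelinks.append(title)

ex = LinkTableExtractor()
ex.feed('<p><a href="/wiki/Idaho">Idaho</a></p>'
        '<a href="/wiki/Category:States">States</a>')
print(ex.pagelinks, ex.categorylinks)
```

Even this fragment hints at the long tail: redirects, red links, media usage and transclusion tracking all need equivalent treatment.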
J.
James Forrester jforrester@wikimedia.org writes:
There's also #3 – doing all the dozens of things that the wikitext parser does on ingestion. All the redlinks and categories tables, building a user-land (not UI) HTML template system for transclusions, media updates, *etc.* Not a trivial task.
If I'm not mistaken, the MW Parser already does those things, and Parsoid relies on MW for this.
So assuming we can go from Wikitext -> HTML (we already can) and HTML -> Wikitext (which is, AFAICT, what we would need a PHP implementation of) then we're golden!
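A deliberately tiny sketch of that HTML -> wikitext direction (Python rather than PHP for brevity), covering only bold text and internal links; the real serializer also has to preserve unmodified wikitext, handle templates, and much more:

```python
from html.parser import HTMLParser

class ToyHtmlToWikitext(HTMLParser):
    """Convert a small HTML subset back to wikitext: <b> becomes
    ''' and internal /wiki/ anchors become [[Title|label]] links.
    The /wiki/ URL layout is an assumption for illustration."""
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.out.append("'''")
        elif tag == "a":
            href = dict(attrs).get("href", "")
            if href.startswith("/wiki/"):
                title = href[len("/wiki/"):].replace("_", " ")
                self.out.append("[[" + title + "|")

    def handle_endtag(self, tag):
        if tag == "b":
            self.out.append("'''")
        elif tag == "a":
            self.out.append("]]")

    def handle_data(self, data):
        self.out.append(data)

conv = ToyHtmlToWikitext()
conv.feed("<b>Bold</b> and <a href=\"/wiki/Main_Page\">a link</a>")
print("".join(conv.out))
```

Scaling this toy up to round-trip fidelity on real articles is exactly where the difficulty lives.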
Mark.
Hi!
HTML validation - our current security model relies on the HTML being generated server-side by a wikitext parser. If we cut wikitext out of the loop, we'll need some other way of ensuring that people can't inject arbitrary Javascript/Flash embedding/Java applet/ActionScript/iframe or any other security horrors.
There are tools like HTML Purifier which are pretty good at this, though their performance is not stellar, especially on big texts. HTML Purifier pretty much disassembles the markup into a DOM, validates it, throws out what it doesn't like, and reassembles it. That is not very fast in PHP, but it is pretty strict. Still, there's a chance people could sneak something past it.
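A toy illustration of that parse -> validate -> reassemble approach (Python rather than PHP, with an invented allowlist; a real sanitizer like HTML Purifier is vastly more thorough):

```python
from html.parser import HTMLParser

ALLOWED = {"p", "b", "i", "a", "ul", "li"}   # illustrative allowlist
ALLOWED_ATTRS = {"a": {"href"}}

class AllowlistSanitizer(HTMLParser):
    """Disassemble markup, keep only allowlisted tags and attributes,
    and reassemble the result; everything unknown is dropped."""
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            keep = [(k, v) for k, v in attrs
                    if k in ALLOWED_ATTRS.get(tag, set())
                    and not (v or "").lower().startswith("javascript:")]
            attr_s = "".join(' {}="{}"'.format(k, v) for k, v in keep)
            self.out.append("<" + tag + attr_s + ">")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        self.out.append(data)

s = AllowlistSanitizer()
s.feed('<p onclick="alert(1)">ok</p><iframe src="//evil.example/"></iframe>')
print("".join(s.out))
```

Note that production-grade sanitizers also normalize nesting, entities and URL schemes; skipping any of those is where "sneaking something past it" happens.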
Hello everyone,
On 19/01/15 06:47, Tim Starling wrote:
As long as there are no actual reasons for dropping pure-PHP core functionality, the idea of WMF versus shared hosting is a false dichotomy.
I kind of agree. Instead of seeing things in black and white, i.e. shared hosting or not, we should take a look at the needs of users who run their MW on shared hosting. What exactly do they consider "core functionality"? I don't think we actually know the answer yet. To me, it seems very likely that MWs on shared hosts are an entry point into the MW world. Their admins are probably, for the most part, not technologically experienced. In the near future, most of them will want to see VE on their instances for improved user experience. But do they really care about wikicode? Or do they care about functionality that solves their problems? I could imagine templating is one of the main reasons to ask for wikicode. Can we, then, support templating in pure-HTML versions of Parsoid? Are there other demands and solutions? What I mean is: there are many shades of [any color you like] when it comes to making users outside the WMF happy.
I see shared hosts somewhere in a middle position in terms of the skills needed to run them and in terms of dedication to the site. On the "lower" end, there are test instances run on local computers. These can be supported, for example, with Vagrant setups, in order to make it very easy to get started. On the "upper" end, there are instances that run on servers with root access, VMs, in clouds, etc. They can be supported, for instance, with modular setup instructions, packages, predefined machines, Puppet and other install scripts in order to get a proper setup. So shared hosting is a special case, then, but it seems to have a significant base of users and supporters.
While the current SOA approach makes things more complex in terms of the technologies one needs to support in order to run a setup that matches one of the top-5 websites, it also makes things easier in terms of modularity, if we do it right. So, for example, we (tm) could provide a lightweight PHP implementation of Parsoid. Or someone (tm) runs a Parsoid service somewhere on the net.
The question is, then, who "someone" should be. Naturally, the WMF seems predestined to lay the groundwork, e.g. by publishing setup instructions, interfaces, Puppet rules, etc. But there's plenty of room for other initiatives (some could even make money out of this :)). An ecosystem around MediaWiki can help do the trick. But here's the crucial bit: we will only get a healthy ecosystem around MediaWiki if things are reliable, that is, if the developer community and the WMF commit to supporting such an environment. In the current situation, there's so much uncertainty that I doubt anyone will seriously consider putting a lot of effort into, say, a PHP Parsoid port (I'd be happy if someone proves me wrong).
So, to make a long story short: let's look forward and try to find solutions. MediaWiki is an amazing piece of software and we should never stop saluting and supporting the hundreds of thousands of people who are using it as a basis for furthering the cause of free knowledge.
Best, Markus
But there's plenty of room for other initiatives (some could even make money out of this :)).
For what it's worth, this brings to mind a couple interesting examples of this pattern:
* https://travis-ci.com/ (and its free counterpart, https://travis-ci.org/), the hosted version of https://github.com/travis-ci/travis-ci
* https://gitlab.com/, the hosted version of https://github.com/gitlabhq/gitlabhq