I'm trying to set up two Parsoid servers to play nicely with two MediaWiki application servers and am having some issues. I have no problem getting things working with Parsoid on a single app server, or with multiple Parsoid servers being used by a single app server, but I ran into issues when I increased to multiple app servers. To try to get this working, I started making the app and Parsoid servers communicate through my load balancer. An overview of my config:
Load balancer = 192.168.56.63
App1 = 192.168.56.80 App2 = 192.168.56.60
Parsoid1 = 192.168.56.80 Parsoid2 = 192.168.56.60
Note, App1 and Parsoid1 are the same server, and App2 and Parsoid2 are the same server. I can only spin up so many VMs on my laptop.
The load balancer (HAProxy) is configured as follows:
* 80 forwards to 443
* 443 forwards to App1 and App2 port 8080
* 8081 forwards to App1 and App2 port 8080 (this will be a private network connection later)
* 8001 forwards to Parsoid1 and Parsoid2 port 8000 (also will be private)
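The forwarding rules above can be modelled in a few lines of JavaScript to make clear which backends each load balancer port reaches. This is only an illustrative sketch: the `frontends` object and the `resolve()` helper are hypothetical names, and none of this is HAProxy configuration syntax.

```javascript
// Hypothetical model of the forwarding rules listed above.
// Not HAProxy syntax; the names here are made up for illustration.
const APP_SERVERS = ['192.168.56.80', '192.168.56.60'];

const frontends = {
  80: { redirectTo: 443 },
  443: { backends: APP_SERVERS.map((ip) => `${ip}:8080`) },
  8081: { backends: APP_SERVERS.map((ip) => `${ip}:8080`) },
  8001: { backends: APP_SERVERS.map((ip) => `${ip}:8000`) },
};

// Follow redirects until a frontend with real backends is reached.
// Unknown ports resolve to an empty list.
function resolve(port) {
  let rule = frontends[port];
  while (rule && rule.redirectTo !== undefined) {
    rule = frontends[rule.redirectTo];
  }
  return rule ? rule.backends : [];
}

console.log(resolve(8081)); // MediaWiki backends
console.log(resolve(8001)); // Parsoid backends
```

Under this model, a request to 192.168.56.63:8001 can land on either Parsoid instance, and a request to 192.168.56.63:8081 on either MediaWiki instance.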
On App1/Parsoid1 I can run `curl 192.168.56.63:8001` and get the appropriate response from Parsoid. I can run `curl 192.168.56.63:8081` and get the appropriate response from MediaWiki. The same is true for both on App2/Parsoid2. So the servers can get the info they need from the services.
Currently I'm getting the error "Error loading data from server: 500: docserver-http: HTTP 500. Would you like to retry?" when attempting to use Visual Editor. I've tried various settings and have not always gotten that specific error, but I am getting it with the settings I currently have in localsettings.js and LocalSettings.php (shown below in this email). Removing the proxy config lines from these settings gave slightly better results: I did not get the 500 error, and it would sometimes work after a very long time. It also may have been throwing errors in the Parsoid log (with debug on). I have those logs saved if they help. I'm hoping someone can just point out some misconfiguration, though.
Here are snippets of my config files:
On App1/Parsoid1, relevant localsettings.js:
parsoidConfig.setMwApi({
    uri: 'http://192.168.56.80:8081/demo/api.php',
    proxy: { uri: 'http://192.168.56.80:8081/' },
    domain: 'demo',
    prefix: 'demo'
});
parsoidConfig.serverInterface = '192.168.56.80';
On App2/Parsoid2, relevant localsettings.js:
parsoidConfig.setMwApi({
    uri: 'http://192.168.56.80:8081/demo/api.php',
    proxy: { uri: 'http://192.168.56.80:8081/' },
    domain: 'demo',
    prefix: 'demo'
});
parsoidConfig.serverInterface = '192.168.56.60';
On App1/Parsoid1, relevant LocalSettings.php:
$wgVirtualRestConfig['modules']['parsoid'] = array(
    'url' => '192.168.56.80:8001',
    'HTTPProxy' => 'http://192.168.56.80:8001',
    'domain' => $wikiId,
    'prefix' => $wikiId
);
On App2/Parsoid2, relevant LocalSettings.php:
$wgVirtualRestConfig['modules']['parsoid'] = array(
    'url' => '192.168.56.80:8001',
    'HTTPProxy' => 'http://192.168.56.80:8001',
    'domain' => $wikiId,
    'prefix' => $wikiId
);
Thanks!
--James
I think in general the first thing you should do for performance is set up restbase in front of parsoid? Caching the parsoid results will be faster than running multiple parsoids in parallel. That would also match the wmf configuration more closely, which would probably help us help you. I wrote up instructions for configuring restbase on the VE and Parsoid wiki pages. As it turns out I updated these today to use VRS configuration. Let me know if you run into trouble, perhaps some further minor updates are necessary. --scott
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Setting up RESTBase is very involved. I'd really prefer not to add that complexity at this time. Also I'm not sure at my scale RESTBase would provide much performance benefit (though I don't know much about it so that's just a hunch). The parsoid and VE configs have fields for proxy (as shown in my snippets), so it seems like running them this way is intended. Am I wrong?
Thanks, James
I had a mistake in the config written in my original email, but it was correct on my servers.
192.168.56.80 LocalSettings.php snippet:
```
$wgVirtualRestConfig['modules']['parsoid'] = array(
    'url' => 'http://192.168.56.63:8001',
    'HTTPProxy' => 'http://192.168.56.63:8001',
    'domain' => $wikiId,
    'prefix' => $wikiId
);
```
192.168.56.80 localsettings.js snippet:
```
parsoidConfig.setMwApi({
    uri: 'http://192.168.56.63:8081/demo/api.php',
    proxy: { uri: 'http://192.168.56.63:8081/' },
    domain: 'demo',
    prefix: 'demo'
});
parsoidConfig.serverInterface = '192.168.56.80';
```
192.168.56.60 LocalSettings.php snippet:
```
$wgVirtualRestConfig['modules']['parsoid'] = array(
    'url' => 'http://192.168.56.63:8001',
    'HTTPProxy' => 'http://192.168.56.63:8001',
    'domain' => $wikiId,
    'prefix' => $wikiId
);
```
192.168.56.60 localsettings.js snippet:
```
parsoidConfig.setMwApi({
    uri: 'http://192.168.56.63:8081/demo/api.php',
    proxy: { uri: 'http://192.168.56.63:8081/' },
    domain: 'demo',
    prefix: 'demo'
});
parsoidConfig.serverInterface = '192.168.56.60';
```
Sorry for the bad info, and thanks to Subramanya for pointing it out.
--James
So I'm getting the 500 error message from VE, and this is my parsoid log file on the parsoid server that attempted to handle the request. Not sure if this helps.
[info][master][7904] initializing 4 workers
[info][worker][7924] loading ...
[info][worker][7912] loading ...
[info][worker][7918] loading ...
[info][worker][7930] loading ...
[info][worker][7912] ready on 192.168.56.80:8000
[info][worker][7918] ready on 192.168.56.80:8000
[info][worker][7924] ready on 192.168.56.80:8000
[info][worker][7930] ready on 192.168.56.80:8000
{
  "0": "Starting HTTP request: ",
  "1": {
    "method": "GET",
    "followRedirect": true,
    "uri": "http://192.168.56.63:8081/demo/api.php",
    "qs": {
      "format": "json",
      "action": "query",
      "meta": "siteinfo",
      "siprop": "namespaces|namespacealiases|magicwords|functionhooks|extensiontags|general|interwikimap|languages|protocols|specialpagealiases",
      "rawcontinue": 1
    },
    "timeout": 40000,
    "agent": {
      "domain": null,
      "_events": {},
      "_eventsCount": 1,
      "defaultPort": 80,
      "protocol": "http:",
      "options": {
        "maxSockets": 15,
        "connectTimeout": 5000,
        "path": null
      },
      "requests": {},
      "sockets": {},
      "freeSockets": {},
      "keepAliveMsecs": 1000,
      "keepAlive": false,
      "maxSockets": 15,
      "maxFreeSockets": 256
    },
    "headers": {
      "X-Request-ID": null,
      "User-Agent": "Parsoid/0.5.1+git",
      "Connection": "close"
    },
    "strictSSL": true,
    "proxy": "http://192.168.56.63:8081/"
  }
}
(On a functioning installation there are normally several more entries after the JSON above.)
--James
RESTBase actually adds a lot of immediate performance, since it lets VE load the editable representation directly from cache, instead of requiring the editor to wait for Parsoid to parse the page before it can be edited. I documented the RESTBase install; it shouldn't actually be any more difficult than Parsoid. They both use the same service runner framework now.
At any rate: in your configurations you have URL and HTTPProxy set to the exact same string. This is almost certainly not right. I believe if you just omit the proxy lines entirely from the configuration you'll find things work as you expect. --scott
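As a rough sketch of what omitting the proxy lines could look like on the Parsoid side, the snippet below reuses the load balancer address and the `demo` values from James's corrected configuration for 192.168.56.80. The wrapping `setup` function follows the shape that localsettings.js used in Parsoid releases of that era, as far as I recall; the `recorded` object is a hypothetical stand-in for Parsoid's real config object, present only so the snippet runs on its own.

```javascript
// Sketch of localsettings.js for the Parsoid instance on 192.168.56.80,
// with the proxy entry left out. In a real localsettings.js this function
// would be assigned to exports.setup (assumption about the file layout).
function setup(parsoidConfig) {
  parsoidConfig.setMwApi({
    // MediaWiki API, reached through the load balancer's port 8081
    uri: 'http://192.168.56.63:8081/demo/api.php',
    domain: 'demo',
    prefix: 'demo',
    // no `proxy` key here
  });
  parsoidConfig.serverInterface = '192.168.56.80';
}

// Hypothetical stand-in for Parsoid's config object; it only records
// what setup() asks for so the result can be inspected.
const recorded = {
  apis: [],
  setMwApi(api) {
    this.apis.push(api);
  },
};

setup(recorded);
console.log(recorded.apis[0], recorded.serverInterface);
```

The matching change in LocalSettings.php would be to drop the `'HTTPProxy'` entry from `$wgVirtualRestConfig['modules']['parsoid']` while keeping `'url'` pointed at the load balancer.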
Hi James,
I don't know if you have noticed the following in C. Scott's response
At any rate: in your configurations you have URL and HTTPProxy set to the exact same string. This is almost certainly not right. I believe if you just omit the proxy lines entirely from the configuration you'll find things work as you expect. --scott
but I could not help but notice the error too. AFAIK setting these variables instructs both pieces of software to use http://192.168.56.63:8001/ as a forward proxy, which is NOT what you have there. HAProxy is reverse proxy software, not a forward proxy (although you can abuse it to achieve that functionality). In the setup you describe there is no need for forward proxies, so neither Parsoid nor MediaWiki needs a proxy configuration.
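To illustrate the forward versus reverse proxy distinction, here is a small sketch of how an HTTP/1.1 client's request changes once a forward proxy is configured. The `describeRequest` helper is hypothetical and not part of Parsoid or MediaWiki; it only encodes the general HTTP/1.1 rule that a direct request carries just the path, while a request sent to a forward proxy carries the full URL.

```javascript
// Illustrative helper (hypothetical, not from Parsoid or MediaWiki):
// reports where an HTTP/1.1 client connects and which request line it
// sends, with and without a forward proxy configured.
function describeRequest(targetUrl, proxyUrl) {
  const target = new URL(targetUrl);
  if (!proxyUrl) {
    // Direct request: connect to the target, send only the path.
    return {
      connectTo: target.host,
      requestLine: `GET ${target.pathname}${target.search} HTTP/1.1`,
    };
  }
  // Forward proxy: connect to the proxy, send the full target URL.
  return {
    connectTo: new URL(proxyUrl).host,
    requestLine: `GET ${target.href} HTTP/1.1`,
  };
}

const api = 'http://192.168.56.63:8081/demo/api.php';
console.log(describeRequest(api));
console.log(describeRequest(api, 'http://192.168.56.63:8081/'));
```

With the URL and the proxy set to the same address, both cases connect to the load balancer, but the second sends it a request that is phrased for a forward proxy, which per the explanation above is not the role HAProxy plays in this setup.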
I also don't think you need RESTBase as long as you are willing to wait for Parsoid to finish parsing and return the result. It should be fine for small articles, but as these grow larger, you will start having various performance-related problems (for example, you might have to adjust HAProxy timeouts). But from what I gather, you are not there yet.
On Fri, Jun 9, 2017 at 12:56 AM, Alexandros Kosiaris < akosiaris@wikimedia.org> wrote:
I also don't think you need RESTBase as long as you are willing to wait for parsoid to finish parsing and returning the result.
Apart from performance, there is also functionality that is missing without RESTBase:
- Diffs are going to contain a lot of extra changes (commonly called "dirty diffs"), as no original HTML or data-parsoid is available to Parsoid's selective serialization algorithm. This might make it difficult to review changes.
- Switching between wikitext and visual editing won't work.
- Visual editing in general will very likely stop working once we reduce the size of HTML by separating out metadata (see https://phabricator.wikimedia.org/T78676). We keep pushing this back due to a lack of resources, but it is still planned, and might happen within the next six months.
In short, using Parsoid directly for visual editing is an unsupported configuration, and is likely to stop working altogether in the foreseeable future.
On Thu, Jun 8, 2017 at 7:10 AM, James Montalvo jamesmontalvo3@gmail.com wrote:
I've read through the documentation I think you're talking about. It's kind of hard to determine where to start since the docs are spread out between multiple VE, Parsoid and RESTBase pages. Installing RESTBase is, as you say, straightforward (git clone, npm install, basically). Configuring is not clear to me, and without clear docs it's the kind of thing that takes hours of trial and error.
The RESTBase install instructions https://github.com/wikimedia/restbase#installation point to a fairly well-commented example configuration file: https://github.com/wikimedia/restbase/blob/master/config.example.yaml
For a basic install, all you should need to do is adjust the lines marked with XXX in there. The default backend will use SQLite. Cassandra offers better scalability and distribution at large scale, but this is not likely something you need. A single SQLite-backed RESTBase instance and a single Parsoid instance should be all you need.
We are aware of the complexity of setting up a fully featured MediaWiki system, and are working on a Kubernetes-based solution right now (see https://github.com/wikimedia/mediawiki-containers/blob/k8s/README.k8s.md for current work in progress). The early prototype already sets up MediaWiki, VisualEditor, RESTBase, Parsoid, Math, as well as other services like EventBus. The current work is primarily aimed at development and testing, but we expect it to also offer a quick way to spin up a complete & fully-featured containerized MediaWiki system for small installs.
Hope this helps,
Gabriel
Hi,
In short, using Parsoid directly for visual editing is an unsupported configuration, and is likely to stop working altogether in the foreseeable future.
for current work in progress). The early prototype already sets up MediaWiki, VisualEditor, RESTBase, Parsoid, Math, as well as other services like EventBus. The current work is primarily aimed at development and testing, but we expect it to also offer a quick way to spin up a complete & fully-featured containerized MediaWiki system for small installs.
This sounds like a lot of sublayers that can potentially disrupt a simple editing process, and I wonder who among the many non-WMF MediaWiki installations and administrators will be able to debug those once an issue arises.
MediaWiki core itself isn't a walk in the park: now and then you find a bug that is addressed six months later, left alone for a year, or worse, never gets a response. I can imagine that those added layers don't make this process any easier, and the current response times on Phabricator make me skeptical about the outlined requirements and the complexity they add.
Cheers
On Fri, Jun 9, 2017 at 8:25 AM, James HK jamesin.hongkong.1@gmail.com wrote:
This sounds like a lot of sublayers that can potentially disrupt a simple editing process, and I wonder who among the many non-WMF MediaWiki installations and administrators will be able to debug those once an issue arises.
This is a familiar pattern in the history of computers. Early computers were programmed in assembly, until complexity was added with compilers. Early wikis were simple Perl CGI scripts backed by files, until Wikipedia's scale (traffic and organizational), security and feature requirements made it necessary to add caching layers, isolated services, and distributed storage systems.
Each of these steps added layers of abstraction and complexity, and concerns about understanding all those layers were (rightfully) brought up at each step along the way. And yet the move towards higher levels of abstraction has been highly successful. Complex systems like web browsers or even entire distributed system clusters can now be deployed with a single click, on largely commoditized platforms.
We are not yet at the point where we can offer you this degree of automation for MediaWiki, but we are working on it.
On 06/09/2017 09:57 AM, Gabriel Wicke wrote:
On Fri, Jun 9, 2017 at 12:56 AM, Alexandros Kosiaris < akosiaris@wikimedia.org> wrote:
I also don't think you need RESTBase as long as you are willing to wait for parsoid to finish parsing and returning the result.
Apart from performance, there is also functionality that is missing without RESTBase:
- Diffs are going to contain a lot of extra changes (commonly called "dirty diffs"), as no original HTML or data-parsoid is available to Parsoid's selective serialization algorithm. This might make it difficult to review changes.
What Gabriel said there about dirty diffs. So, this depends on whether wikis are concerned about their wikitext getting normalized to "Parsoid-determined canonical" formats (wrt choice of whitespace and quotes, for example). This is extremely important for Wikimedia wikis, but may be less so for some smaller wikis, if they take a one-time normalization dirty diff and adopt identical norms in source editing.
- Switching between wikitext and visual editing won't work.
This is because of the dirty-diff requirement. As far as I understand, even if wikis are okay with dirty diffs, VE's source <-> html switching functionality requires restbase right now.
- Visual editing in general will very likely stop working once we reduce the size of HTML by separating out metadata (see https://phabricator.wikimedia.org/T78676). We keep pushing this back due to a lack of resources, but it is still planned, and might happen within the next six months.
There are some unresolved questions about how willing (Parsoid) clients are to work with this stripped-html format. That and the matter of us being resource-strapped means we keep kicking this down the road. But, when this happens, this will break VE-editing unless VE and Parsoid hide the data-mw stripping behind a config flag.
In short, using Parsoid directly for visual editing is an unsupported configuration, and is likely to stop working altogether in the foreseeable future.
Just to be clear, we haven't yet made any formal decision to go down this route, but Gabriel articulates the reasons why it might make sense to do this. There are some aspects to consider here: (a) whether we want to support this combination behind a config flag at all given that some functionality may not be available (unless Parsoid clients figure out ways to support some functionality without RESTBase) (b) the complexity (maintenance, testing, documentation, support) of supporting multiple combinations.
We don't have fully resolved answers to this yet. I don't know what VE's take on this is -- so there is also that to consider. But, when we have firm resolutions on all of this, we will make suitable announcements on lists, suggest upgrade options, and update wikis.
But, also, what Gabriel said earlier about RESTBase. If you are already installing Parsoid, adding RESTBase (since it is also node.js) with the default sqlite backend might not be a whole lot more complexity. So, if VE-editing wikis that use Parsoid start adopting this, that would also inform our decisions above.
Subbu.
Thanks to everyone for all the responses. I'm learning a lot.
In the short term we need to figure out how to make this work without RESTBase, but I've been convinced by this email chain that in the long term we'll need to incorporate RESTBase into our setup.
At this point I think I've determined that the problem we're having is not actually a Parsoid problem, but somehow related to MediaWiki Core (PHP) response times. Something about my multi-server setup is causing 25% of MW core response times to be 25x longer than normal. I didn't notice this in my dev setup, prior to testing Parsoid, probably because I just assumed my laptop was old and underpowered. In other words, normal page loads were slower but I just figured that having multiple VMs up on my laptop functioning as full app servers was the reason. Parsoid evidently has a default timeout short enough that when Parsoid makes MW core API requests I was getting failures, causing me to misinterpret it as a Parsoid issue.
To ensure it was not my underpowered laptop I moved my testing to a machine with 12 CPUs and 64 GB RAM.
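A rough way to put numbers on the "25% of responses are 25x slower" observation is to time a batch of identical requests and look at the share of outliers. The following is only a minimal sketch: the URL (load balancer address, port and `api.php` path) is an assumption based on the setup described at the start of this thread, and the helper names `time_requests` and `summarize` are made up for illustration.

```python
import http.client
import statistics
import time
import urllib.request


def time_requests(url, count=40, timeout=60):
    """Fetch `url` `count` times and return the elapsed seconds of each request."""
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                response.read()
        except (OSError, http.client.HTTPException):
            # Failed or timed-out requests still count with the time they took.
            pass
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings, slow_factor=10.0):
    """Return the median, the maximum, and the share of requests slower than
    `slow_factor` times the median."""
    median = statistics.median(timings)
    slow = [t for t in timings if t > slow_factor * median]
    return {
        "median_s": median,
        "max_s": max(timings),
        "slow_fraction": len(slow) / len(timings),
    }


if __name__ == "__main__":
    # Hypothetical endpoint: adjust host, port and script path to your setup.
    url = "http://192.168.56.63:8081/api.php?action=query&meta=siteinfo&format=json"
    print(summarize(time_requests(url)))
```

Running the same measurement once through the load balancer and once against each app server directly may help show whether the slow responses come from one backend, from both, or only appear when requests alternate between them.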
Our configuration script allows us to define our setup as follows:
load balancers = list, of, IP, addresses, ...
app servers = list, of, IP, addresses, ...
memcached servers = list, of, IP, addresses, ...
db master = a.single.ip.address
db replicas = list, of, IP, addresses, ...
parsoid servers = list, of, IP, addresses, ...
elasticsearch servers = list, of, IP, addresses, ...
I have not run it with that many servers yet, but it's theoretically possible. A single server does not need to fill a single role, so in testing thus far my configs look more like:
load balancers = server.3.ip.addr
app servers = server.1.ip.addr, server.2.ip.addr
memcached servers = server.1.ip.addr, server.2.ip.addr
db master = server.1.ip.addr
db replicas = server.2.ip.addr
parsoid servers = server.1.ip.addr, server.2.ip.addr
elasticsearch servers = server.1.ip.addr, server.2.ip.addr
In short: three servers, one exclusively a load balancer, two with everything installed albeit one acting as DB master and the other as DB replica.
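To make the role overlap in that layout explicit, the simplified "role = hosts" notation used in this message can be inverted into a per-host list of roles. This is only an illustrative sketch: the notation is the shorthand from this email, not necessarily the real inventory format, and `parse_layout` and `roles_by_host` are hypothetical helper names.

```python
from collections import defaultdict

# The test layout from this message, in the email's shorthand notation.
LAYOUT = """\
load balancers = server.3.ip.addr
app servers = server.1.ip.addr, server.2.ip.addr
memcached servers = server.1.ip.addr, server.2.ip.addr
db master = server.1.ip.addr
db replicas = server.2.ip.addr
parsoid servers = server.1.ip.addr, server.2.ip.addr
elasticsearch servers = server.1.ip.addr, server.2.ip.addr
"""


def parse_layout(text):
    """Parse 'role = host, host, ...' lines into a dict of role -> list of hosts."""
    roles = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        role, _, hosts = line.partition("=")
        roles[role.strip()] = [h.strip() for h in hosts.split(",") if h.strip()]
    return roles


def roles_by_host(roles):
    """Invert role -> hosts into host -> list of roles, keeping the role order."""
    hosts = defaultdict(list)
    for role, members in roles.items():
        for host in members:
            hosts[host].append(role)
    return dict(hosts)


if __name__ == "__main__":
    for host, host_roles in sorted(roles_by_host(parse_layout(LAYOUT)).items()):
        print(f"{host}: {', '.join(host_roles)}")
```

Seen per host, servers 1 and 2 each carry five roles while server 3 only balances load, which is worth keeping in mind when comparing timings between the single and dual app server runs.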
We're running this setup in production with all servers configured as "localhost", i.e. everything installed on one server.
I'm pretty sure I've narrowed down the 25x-longer-response-times to being a multiple app-server problem because I can take the dev config above (server.1.ip.addr, server.2.ip.addr, server.3.ip.addr) and comment out various servers and re-run deploy. This allows me to quickly switch from a single app server to two, two DBs to one, etc. I see the issue with multiple app servers. I don't see it with a single app server, regardless of whether the other services have 1 or 2 servers.
My LocalSettings.php files are at [1] and [2] for dual app servers. These reference Extensions.php which _shouldn't_ have any impact but can be found at [3]. The files are written by Ansible and I'm kind of bad at getting the indenting correct...so, sorry about that if it looks funny. All of this is created by our project called meza [4]. We weren't really planning on announcing meza yet, but basically its purpose is to simplify MediaWiki install with all the bells and whistles for "enterprise" (whatever that means :) ) use cases. We've been running it on a single server for about a year, but need to migrate to a high availability setup to support 24/7 mission critical operations.
Any ideas what may cause two load-balanced app servers to respond slowly 25% of the time?
Thanks!
--James
[1] https://gist.github.com/jamesmontalvo3/5adf207623454c9eff98e93152b43108#file-localsettings-app1-php
[2] https://gist.github.com/jamesmontalvo3/5adf207623454c9eff98e93152b43108#file-localsettings-app2-php
[3] https://gist.github.com/jamesmontalvo3/5adf207623454c9eff98e93152b43108#file-extensions-php
[4] https://github.com/enterprisemediawiki/meza
It may be a lock issue. IIRC MediaWiki can invoke Parsoid, which can then reinvoke the MediaWiki API. I remember there being some corner case with locking which caused this recursive invocation to deadlock in some (not anything like production) situations. If you can get a trace from the "slow" MediaWiki, perhaps you will find it waiting for a lock to time out.
--scott
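The pattern described here can be reproduced in miniature. The sketch below is purely illustrative and makes no claim about which lock, if any, is involved in the real setup: `shared_lock` is a hypothetical stand-in, and threads stand in for what would be separate processes. An outer request holds a lock while it waits on a service, and that service calls back into code needing the same lock, so the inner call can only finish once its lock timeout expires.

```python
import threading
import time

# Stand-in for whatever lock the outer MediaWiki request holds (hypothetical;
# the thread does not say which lock, if any, is involved).
shared_lock = threading.Lock()


def inner_api_request(lock_timeout, result):
    """Simulate the MediaWiki API call that Parsoid makes for the outer request."""
    start = time.perf_counter()
    acquired = shared_lock.acquire(timeout=lock_timeout)
    result["waited_s"] = time.perf_counter() - start
    result["acquired"] = acquired
    if acquired:
        shared_lock.release()


def parsoid_request(lock_timeout):
    """Simulate Parsoid: the callback runs in a separate worker and is waited on."""
    result = {}
    worker = threading.Thread(target=inner_api_request, args=(lock_timeout, result))
    worker.start()
    worker.join()
    return result


def outer_request(hold_lock, lock_timeout=0.5):
    """Simulate the outer MediaWiki request that invokes Parsoid and waits for it."""
    if hold_lock:
        # The outer request keeps the lock while it waits on Parsoid, so the
        # inner request can only give up once its timeout expires.
        with shared_lock:
            return parsoid_request(lock_timeout)
    return parsoid_request(lock_timeout)


if __name__ == "__main__":
    print("lock released first:", outer_request(hold_lock=False))
    print("lock held by caller:", outer_request(hold_lock=True))
```

In the second case the inner call waits for the full timeout and then fails, which from the outside looks like an occasional, very slow response rather than an outright error.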
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l