At this point I would recommend adding five or so
g2.cores8.ram36.disk20 flavor VPSs to WMCS, each with one to three
RTX A6000 GPUs and a 1TB SSD, which should cost under $60k. That
should allow for broadly multilingual models somewhere between
GPT-3.5 and GPT-4 in performance at current training rates.
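
As a rough sanity check on that cost figure (all prices below are my
assumptions, not quotes, and I am taking two GPUs per node as the
midpoint of one to three):

    # Back-of-the-envelope hardware cost estimate.
    # All prices are assumptions, not vendor quotes.
    nodes = 5
    gpus_per_node = 2      # midpoint of "one to three"
    gpu_usd = 4_500        # assumed street price per RTX A6000
    ssd_usd = 100          # assumed price per 1TB SSD
    total = nodes * (gpus_per_node * gpu_usd + ssd_usd)
    print(f"${total:,}")   # $45,500, leaving headroom under $60k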

Dedicating part of the cluster to this makes sense, even as what it is used for changes over time.
 
These models can be quantized to int4 weights that run on cell
phones: https://github.com/rupeshs/alpaca.cpp/tree/linux-android-build-support
It seems inevitable that we will someday include such LLMs with
Internet-in-a-Box, and why not also with the primary mobile apps?
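
For a sense of what that quantization means, here is a minimal sketch
of symmetric block-wise int4 quantization in the spirit of the Q4
formats used by alpaca.cpp/llama.cpp (an illustration of the idea,
not the exact on-disk format; the names and block size are my own):

    import numpy as np

    def quantize_int4(weights, block_size=32):
        """Symmetric 4-bit block quantization: each block is stored
        as one float scale plus integer values in [-8, 7]."""
        w = weights.reshape(-1, block_size)
        scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # per-block scale
        q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
        return q, scale

    def dequantize_int4(q, scale):
        """Reconstruct approximate float weights from int4 + scale."""
        return (q.astype(np.float32) * scale).reshape(-1)

    w = np.random.randn(4096).astype(np.float32)
    q, s = quantize_int4(w)
    err = np.abs(w - dequantize_int4(q, s)).mean()
    print(f"mean abs reconstruction error: {err:.4f}")

Packed two values per byte, int4 is roughly an 8x reduction versus
fp32, which is why a 7B-parameter model can fit in about 4GB of phone
RAM.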

Eventually, yes. A good reason to renew attention to mobile as a canonical wiki experience.