It turns out this is a little more complicated than it appeared at first: usercontribs
and list=users have different concepts of "invalid". If you ask for
usercontribs on "1.2.3.4", it's valid. If you pass in
"1.2.3.0/24", you get baduser. But for "1.2.3.4", list=users returns:
{
    "batchcomplete": "",
    "query": {
        "users": [
            {
                "name": "1.2.3.4",
                "invalid": ""
            }
        ]
    }
}
which I guess makes sense in that context, since it can't map an IP address to a userid.
I can work around this, but I'm mentioning it for the sake of some poor developer
searching the archives N years from now trying to figure it out :-)
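For that future developer, here is a minimal sketch of the workaround I have in mind. It assumes the response has already been parsed into a dict, and it treats an "invalid" entry whose name parses as a single IP address as acceptable, since usercontribs accepts those. The function name and the bucket names are my own, not anything the API defines.

```python
import ipaddress


def _flag_set(entry, key):
    # formatversion=1 sends flags as "" (presence means true);
    # formatversion=2 sends them as true. Absent or false means not set.
    return entry.get(key, False) is not False


def _is_single_ip(name):
    # True for a bare IPv4/IPv6 address, False for ranges and ordinary names.
    try:
        ipaddress.ip_address(name)
    except ValueError:
        return False
    return True


def classify_users(response):
    """Sort the entries of a parsed list=users response into buckets."""
    buckets = {"registered": [], "missing": [], "ip": [], "invalid": []}
    for entry in response.get("query", {}).get("users", []):
        name = entry.get("name", "")
        if _flag_set(entry, "invalid"):
            # list=users flags IP addresses as invalid; keep them separate
            # because usercontribs still accepts a single IP.
            buckets["ip" if _is_single_ip(name) else "invalid"].append(name)
        elif _flag_set(entry, "missing"):
            buckets["missing"].append(name)
        else:
            buckets["registered"].append(name)
    return buckets
```

Anything left in the "invalid" bucket is what I would report as a bad username; the "ip" bucket can be handled however the tool already handles IP editors.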
On Aug 19, 2021, at 6:21 PM, Bryan Davis <bd808(a)wikimedia.org> wrote:
On Thu, Aug 19, 2021 at 4:04 PM Roy Smith <roy(a)panix.com> wrote:
I've got a tool which parses sockpuppet investigation (SPI) pages and does some
analysis. One of the steps is I need to validate that all of the usernames found in the
SPI report are valid. I do that by sequentially calling usercontribs on each name with
uclimit=1 and seeing if I get a baduser error.
This works, but it's slow because I need to make 1 API call for each user. For a big
SPI case, the time to do this swamps everything else. Is there a more efficient way to do
this? Some API call where I can give it a bunch of usernames in a batch and have it tell
me which ones are invalid? Alternatively, is there a regex I could apply on the client
side to test if a username is valid?
The most common type of invalid name I see is when somebody puts down an IP range (e.g.
1.2.4.0/24) as a username. Testing for that client-side would be trivial, but it might
miss some others.
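The trivial client-side test for the IP range case could look something like the sketch below. It uses Python's standard ipaddress module; the function name is made up for illustration, and it only catches CIDR ranges, so other kinds of invalid names would still get through.

```python
import ipaddress


def looks_like_ip_range(name):
    """Return True if name parses as an IPv4 or IPv6 CIDR range."""
    name = name.strip()
    if "/" not in name:
        # A bare address such as 1.2.3.4 is not a range.
        return False
    try:
        # strict=False accepts ranges whose host bits are set, e.g. 1.2.4.7/24.
        ipaddress.ip_network(name, strict=False)
    except ValueError:
        return False
    return True
```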
You can do lookups in batches of 50 (500 if you have the "apihighlimits" right,
which is commonly granted by the "Bots" group on movement wikis) with list=users:
<https://en.wikipedia.org/w/api.php?action=help&modules=query%2Busers>.
Here's a quick example:
<https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=users&format=json&utf8=1&formatversion=2&ususers=Bryan%20Davis%7CBryanDavis%7CBDavis%20(WMF)%7Cbd808>
The results will look something like:
```
{
    "batchcomplete": true,
    "query": {
        "users": [
            {
                "name": "Bryan Davis",
                "missing": true
            },
            {
                "userid": 2619078,
                "name": "BryanDavis"
            },
            {
                "userid": 19474624,
                "name": "BDavis (WMF)"
            },
            {
                "userid": 24257381,
                "name": "Bd808"
            }
        ]
    }
}
```
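To batch a long list of names, something along these lines should work. This is only a sketch: it builds the query parameters for each batch (mirroring the sandbox example above) and leaves the actual HTTP request to whatever client you already use. The function name is hypothetical, and it assumes none of the names contains a "|" character, since that is the separator.

```python
def users_query_batches(names, batch_size=50):
    """Yield list=users query parameters for consecutive batches of names.

    batch_size defaults to 50; accounts with apihighlimits could pass 500.
    """
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        yield {
            "action": "query",
            "list": "users",
            "format": "json",
            "formatversion": "2",
            # Multiple values are joined with "|" in the ususers parameter.
            "ususers": "|".join(batch),
        }
```

Each yielded dict can be passed as the query parameters of one request to api.php, so 120 names would cost 3 calls instead of 120.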
Bryan
--
Bryan Davis Technical Engagement Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808
_______________________________________________
Cloud mailing list -- cloud(a)lists.wikimedia.org
List information:
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/