Roan Kattouw wrote:
Brion Vibber wrote:
Why do you use JSON here? Wouldn't it be better/faster/more intuitive/whatever to use format=php and unserialize()? Surely PHP can interpret its own format faster than JSON?
Unserializing PHP serialized data that comes from a foreign server is potentially unsafe.
[snip]
We're handling an *array* here, right? How the hell can unserialize()ing something, then using [] on it, cause anything other than a fatal error?
"If the variable being unserialized is an object, after successfully reconstructing the object PHP will automatically attempt to call the __wakeup() member function (if it exists)."
Arbitrary data input + execution of a function -> could cause problems, depending on what the __wakeup() function of the given class does.
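The hazard quoted from the manual above can be demonstrated with a short sketch. The Logger class and the payload here are entirely hypothetical -- nothing from MediaWiki -- they just show how a crafted string picks the class and fires __wakeup():

```php
<?php
// Hypothetical class to show the hazard: unserialize() reconstructs
// whatever class the payload names, then calls its __wakeup().
class Logger {
    public $logFile = '/tmp/app.log';
    public $woke = false;
    public function __wakeup() {
        // A real __wakeup() might reopen files or connections here,
        // now driven by attacker-controlled property values.
        $this->woke = true;
    }
}

// Crafted payload from a "foreign server": the caller expected an
// array, but the string names a Logger object instead.
$payload = 'O:6:"Logger":1:{s:7:"logFile";s:12:"/tmp/pwn.log";}';
$obj = unserialize($payload);
var_dump($obj->woke);    // __wakeup() already ran: bool(true)
var_dump($obj->logFile); // string(12) "/tmp/pwn.log" -- attacker-chosen
```

The caller never asked for a Logger; the payload alone decided what got instantiated and executed.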
At the moment that's probably reasonably safe given our codebase (ignoring that unserialize() itself has had a rash of low-level security problems such as buffer overflows), but it gives me the willies as a low-level problem.
Even if it does only cause a fatal error, now your site's broken because of a problem with a third-party provider. That's a pretty poor practice and should be avoided.
Or does PHP allow overloading of the [] operator? (I thought PHP didn't *do* operator overloading, but I could be wrong.) Anyway, is_array() should be safe enough here.
It sort of does -- classes can provide an array-like interface.
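That array-like interface is ArrayAccess. A minimal sketch (the class is invented for illustration, and uses modern PHP signatures):

```php
<?php
// PHP has no general operator overloading, but a class implementing
// ArrayAccess responds to [] -- so surviving a [] lookup doesn't
// prove the value is a plain array. is_array() does.
class FakeArray implements ArrayAccess {
    private $data = array();
    public function offsetExists(mixed $k): bool { return isset($this->data[$k]); }
    public function offsetGet(mixed $k): mixed   { return $this->data[$k]; }
    public function offsetSet(mixed $k, mixed $v): void {
        if ($k === null) { $this->data[] = $v; } else { $this->data[$k] = $v; }
    }
    public function offsetUnset(mixed $k): void  { unset($this->data[$k]); }
}

$fa = new FakeArray();
$fa['query'] = 'served via offsetGet()';
var_dump($fa['query']);   // string(22) "served via offsetGet()"
var_dump(is_array($fa));  // bool(false) -- is_array() catches the difference
```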
I'm not saying it makes a huge difference, and I know trying to optimize non-bottleneck code is discouraged, but if you can get better performance just by using a different function, it's too easy a win to pass up.
That depends on what other concerns are present. Using low-level serialization on unvalidated input data just isn't a good security practice.
It's *convenient* for quickie hacks, but it's not very reliable and not very safe. These concerns trump a hypothetical speed advantage which may not even exist.
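For contrast, a sketch of the JSON path the thread favors; the sample payload is made up but shaped like an API response:

```php
<?php
// json_decode() can only yield scalars, arrays, and stdClass -- it
// never instantiates an arbitrary class, so no __wakeup() can fire.
$json = '{"query":{"pages":{"1":{"title":"Image:Example.jpg"}}}}';
$data = json_decode($json, true); // true => plain associative arrays

var_dump(is_array($data));                        // bool(true)
var_dump($data['query']['pages'][1]['title']);    // the title string

// Malformed or hostile input degrades safely to null instead of
// executing anything:
var_dump(json_decode('O:6:"Logger":0:{}', true)); // NULL
```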
Also, I thought Http::get() was a faux pas, so why use it here?
On the contrary -- *not* using Http::get() is a faux pas. :)
We can control the timeout values etc. with Http::get(), and choose the best backend (PHP's URL wrappers or the cURL module), all nicely encapsulated. Duplicating all that by hand is a big no-no.
OK. I remember someone saying, in relation to bug 14024 (interwiki API), that calling Http::get() would take too long, etc.
That's an unrelated issue where you'd be trying to proxy third-party requests.
Right. If the local wiki's not in English, it may construct the output using something other than the canonical 'Image:', which won't be recognized by the foreign repo.
In this case, we need to canonicalize the title first (i.e. replace Bild: by Image: on a German wiki), then submit that title to the remote wiki, which will normalize Image: to whatever it uses.
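A hypothetical sketch of the local half of that dance -- the helper name and alias list are invented for illustration, not MediaWiki's real API:

```php
<?php
// Map a localized namespace prefix (e.g. "Bild:" on a German wiki)
// to the canonical "Image:" before asking the remote repo; the
// remote side then normalizes "Image:" to whatever it prefers.
function canonicalizeTitle($title, array $localAliases) {
    foreach ($localAliases as $alias) {
        if (stripos($title, $alias . ':') === 0) {
            return 'Image:' . substr($title, strlen($alias) + 1);
        }
    }
    return $title; // already canonical, or not a file title
}

echo canonicalizeTitle('Bild:Beispiel.jpg', array('Bild', 'Datei'));
// Image:Beispiel.jpg
```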
Exactly. :)
I don't really get what the foreach is for: it's only one file, right? You can use reset() to get the first element of an array if it doesn't have index 0.
IMHO, reset() and foreach() are equally ugly here. :)
Personal preference then, I guess. To me, foreach() implies that something runs more than once (although that's impossible in this case because of the return statement).
reset() followed by current()/next() is exactly equivalent to foreach(), but without the clean, visible structure around the iterator.
Sometimes it's necessary because you're doing something _weird_ with your loop, but I tend to think of it as very icky. :)
Anyway this can be fairly easily extended to implement the new multiple findFiles(), with the singular findFile() as a special case that passes only one title, so a foreach is gonna be just fine. ;)
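The findFile()/findFiles() shape described above can be sketched like this; only the two function names come from the thread, the bodies are stand-ins:

```php
<?php
// Plural lookup: returns info keyed by title. Stand-in for the real
// repo query; note the keys need not start at 0, which is why the
// caller can't just use [0].
function findFiles(array $titles) {
    $results = array();
    foreach ($titles as $t) {
        $results[$t] = "stub-info-for-$t";
    }
    return $results;
}

// Singular lookup as the special case: pass one title, return the
// first (and only) result. The foreach runs at most once because of
// the return, but reads more cleanly than reset()/current().
function findFile($title) {
    foreach (findFiles(array($title)) as $info) {
        return $info;
    }
    return false;
}

var_dump(findFile('Image:Example.jpg'));
```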
-- brion vibber (brion @ wikimedia.org)