Flagged revs has several surprising effects.
First, vandalism will stay observable in a larger window before it slips away unnoticed. In military projects the chance of catching something within such a window is often called the probability of detection (PoD). In the present UI on Wp this is a major problem: the window is very short, and once the vandalism passes out of it the chance of detection drops off rapidly to a level set by how often the article is read. The sighting process will (or at least can) hold the vandalism in an indefinitely larger window. This increases the PoD, but it will also be the prime reason why people observe that the system doesn't scale well. That is, users now see the edits that previously slipped away unnoticed.
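To make the window argument concrete, here is a rough sketch. It assumes, purely for illustration, that every look at an edit has the same independent chance of spotting the vandalism; the function name and the numbers are made up and are not measurements from Wp.

```python
def probability_of_detection(p_per_look: float, looks: int) -> float:
    """Chance that at least one of `looks` independent looks spots the vandalism.

    Assumption (not from any real data): each look succeeds with the same
    probability `p_per_look`, independently of the others.
    """
    if not 0.0 <= p_per_look <= 1.0:
        raise ValueError("p_per_look must be between 0 and 1")
    if looks < 0:
        raise ValueError("looks must be non-negative")
    return 1.0 - (1.0 - p_per_look) ** looks


# Short window: the edit scrolls out of view after a couple of looks.
short_window = probability_of_detection(0.2, 2)

# Long window: the edit stays in the unsighted queue and keeps getting looked at.
long_window = probability_of_detection(0.2, 10)

print(f"short window PoD: {short_window:.2f}")
print(f"long window PoD:  {long_window:.2f}")
```

Under these assumptions the PoD grows with every extra look the edit gets, which is the point of keeping it in the window longer.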
Another thing closely related to this is the ability to verify several previous versions in one operation, i.e. qualifying a new version as sighted without regard to the previous history. This makes the whole process much more effective than the present method of patrolling. As a guesstimate this could give a factor of 2-3, limited by how effectively the UI conveys the information from the previous edits.
Lastly, there is a very odd effect that will emerge if the users who do patrolling/sighting observe the ''unsighted'' versions in a cumulative way. By marking versions as sighted they will successively concentrate on those versions that contain vandalism. This increases the effectiveness several times over, increases the time an unsighted version is observable, and therefore increases the PoD.
I believe sighted versions will need fewer patrollers because they will be more effective, and the result will be articles with less vandalism. Unfortunately the system breaks down ungracefully when the patrollers can't keep up and the vandalism starts to pile up. On the good side, the heap of vandalism can be handled later, as long as the heap doesn't consistently build up over time.
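The pile-up can be sketched as a simple backlog count. This is only a toy model with assumed numbers: `arrivals` is a hypothetical count of new unsighted edits per day, and `capacity` is a hypothetical number of edits the patrollers can sight per day.

```python
def backlog_over_time(arrivals, capacity):
    """Return the unsighted backlog at the end of each day.

    Toy model: each day `capacity` edits can be sighted, and anything
    left over carries on to the next day. The backlog never goes below zero.
    """
    backlog = 0
    history = []
    for new_edits in arrivals:
        backlog = max(0, backlog + new_edits - capacity)
        history.append(backlog)
    return history


# A one-day burst is absorbed again once the load drops below capacity.
print(backlog_over_time([5, 20, 5, 5, 5], capacity=10))

# A load that stays above capacity builds up day after day.
print(backlog_over_time([12, 12, 12, 12], capacity=10))
```

In the first run the heap is worked off a few days after the burst; in the second it only grows, which is the case where the system breaks down.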
John E