Are you good at swearing? WE NEED YOU
Huggle 3 comes with vandalism prediction: it precaches diffs, including their contents, even before they are enqueued. Each edit has a so-called "score", a numerical value; the higher it is, the more likely the edit is vandalism.
If you want to help us improve this feature, we need a "score words" list defined for every wiki where Huggle is going to be used, for example the English Wikipedia.
Each list has the following syntax:
(see https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=...)
score-words(score): list of words separated by commas; it can contain newlines, but the commas must be present
Example:
score-words(200): these, are, some, words, which, presence, of, increases, the, score, each, word, by, 200,
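To make the syntax above concrete, here is a minimal Python sketch of how such a list could be read and applied. It is not Huggle's actual parser: the function names `parse_score_words` and `score_text` are hypothetical, and the case-insensitive whole-word matching and per-occurrence counting are assumptions made only for illustration.

```python
import re


def parse_score_words(config_text):
    """Return {word: score} from 'score-words(N): a, b, c,' entries.

    Hypothetical helper, not Huggle's real config parser.
    """
    scores = {}
    # An entry runs until the next 'score-words(' header or the end of the
    # text, so a word list may span several lines.
    pattern = re.compile(
        r"score-words\((\d+)\):(.*?)(?=score-words\(|\Z)", re.DOTALL
    )
    for match in pattern.finditer(config_text):
        score = int(match.group(1))
        for word in match.group(2).split(","):
            word = word.strip().lower()
            if word:  # skip the empty piece left by a trailing comma
                scores[word] = score
    return scores


def score_text(text, scores):
    """Sum the score of every listed word that occurs in the text.

    Assumption: each occurrence counts, and matching ignores case.
    """
    tokens = re.findall(r"\w+", text.lower())
    return sum(scores.get(token, 0) for token in tokens)


if __name__ == "__main__":
    config = "score-words(200): these, are, some,\n words,\n"
    words = parse_score_words(config)
    print(score_text("These are ordinary words", words))
```

Splitting on commas rather than on newlines mirrors the rule that newlines are allowed but the commas are mandatory.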
So, if you know English better than me, which you likely do, go ahead and improve the configuration file there. No worries, Huggle's config parser is very tolerant of syntax errors.
If you have any other suggestions for improving Huggle's prediction, go ahead and tell us!
On 19/09/13 10:35, Petr Bena wrote:
> [...]
[[en:User:/DeltaQuad/UAA/Blacklist]] contains a fairly comprehensive overview of English-language profanity and general trash-talk formatted as regexps, mixed in with other non-sweary blocking patterns that are specific to that blacklist's needs.
Neil
As for swear words in English, sorry, I can't help, but I'm very good at Persian :D We have an abuse filter for Persian swear words, which is hidden from the public: https://fa.wikipedia.org/wiki/%D9%88%DB%8C%DA%98%D9%87:%D9%BE%D8%A7%D9%84%D8...
It works pretty well, so if you need to i18n Huggle, that page will be a good help.
Best
On Thu, Sep 19, 2013 at 8:59 PM, Neil Harris neil@tonal.clara.co.uk wrote:
> On 19/09/13 10:35, Petr Bena wrote:
>> [...]
>> (see https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)
>> [...]
>
> [...]