==Reposting with better title to get included in proper thread==
Brian:
> ps: Does anyone know of a script that can strip out wiki syntax? This
> is pertinent. It will also be necessary to leave only paragraphs of
> text in the articles... the below data is noticeably skewed in some (but
> not all) of the measures.
>
Brian, here is an initial response:
Some Perl code from the WikiCounts job that strips a lot of markup, used to
get cleaner text for the word count and the article size in chars.
It is not 100% accurate, and not all markup is removed, but these regexps
already slow down the whole job big time.
The result is at least far closer to a decent word count than wc on the raw
data would give.
$article =~ s/\'\'+//go ;                          # strip bold/italic formatting
$article =~ s/\<[^\>]+\>//go ;                     # strip <...> html

# these are valid UTF-8 chars, but it takes way too long to process, so
# I combine those in one set
# $article =~ s/[\xc0-\xdf][\x80-\xbf]|
#               [\xe0-\xef][\x80-\xbf]{2}|
#               [\xf0-\xf7][\x80-\xbf]{3}/x/gxo ;
# this one set selects UTF-8 faster (with 99.9% accuracy I would say)
$article =~ s/[\xc0-\xf7][\x80-\xbf]+/x/gxo ;      # count unicode chars as one char

$article =~ s/\&\w+\;/x/go ;                       # count html entities as one char
$article =~ s/\&\#\d+\;/x/go ;                     # count numeric html entities as one char

$article =~ s/\[\[ [^\:\]]+ \: [^\]]* \]\]//gxoi ; # strip image/category/interwiki links
                                                   # (a few internal links with a colon
                                                   #  in the title will get lost too)
$article =~ s/http \: [\w\.\/]+//gxoi ;            # strip external links
$article =~ s/\=\=+ [^\=]* \=\=+//gxo ;            # strip headers
$article =~ s/\n\**//go ;                          # strip linebreaks + unordered list tags
                                                   # (other lists are relatively scarce)
$article =~ s/\s+/ /go ;                           # remove extra spaces
Actually the code in WikiCountsInput.pl is a bit more complicated, as it
tries to find a decent solution for ja/zh/ko.
Also, numbers are counted as one word (including embedded points and commas).
if ($language eq "ja")
{ $words = int ($unicodes * 0.37) ; }
etc.
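For convenience, here is the same set of regexps wrapped into one routine,
with a rough word/character count at the end. This is a sketch of my own,
not a verbatim excerpt from WikiCountsInput.pl, and the sub/variable names
are made up:

sub strip_markup {
    my ($article) = @_ ;
    $article =~ s/\'\'+//go ;                          # strip bold/italic formatting
    $article =~ s/\<[^\>]+\>//go ;                     # strip <...> html
    $article =~ s/[\xc0-\xf7][\x80-\xbf]+/x/gxo ;      # count unicode chars as one char
    $article =~ s/\&\w+\;/x/go ;                       # count html entities as one char
    $article =~ s/\&\#\d+\;/x/go ;                     # count numeric entities as one char
    $article =~ s/\[\[ [^\:\]]+ \: [^\]]* \]\]//gxoi ; # strip image/category/interwiki links
    $article =~ s/http \: [\w\.\/]+//gxoi ;            # strip external links
    $article =~ s/\=\=+ [^\=]* \=\=+//gxo ;            # strip headers
    $article =~ s/\n\**//go ;                          # strip linebreaks + list bullets
    $article =~ s/\s+/ /go ;                           # remove extra spaces
    return $article ;
}

my $clean      = strip_markup ($article) ;
my @words      = split (' ', $clean) ;
my $word_count = scalar (@words) ;                     # rough word count after stripping
my $char_count = length ($clean) ;                     # article size in chars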
> pss: I recall from the Wikimania meeting that someone had a script to
> convert a dump to tab-delimited data. That would be useful to me...
> could someone provide a link?
>
http://karma.med.harvard.edu/mailman/private/freelogy-discuss/2006-July/000047.html
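Not the script behind that link, but for what it is worth, here is a rough
sketch of my own that streams a pages-articles XML dump and writes one
tab-delimited line (title, text size) per page. It assumes the usual <title>
and <text> layout of the export format and does no real XML parsing:

my ($title, $text, $intext) ;
while (<STDIN>) {
    $title = $1 if m!<title>(.*?)</title>! ;
    if (!$intext && m!<text[^>]*>!) { $intext = 1 ; $text = '' ; s!.*<text[^>]*>!! }
    if ($intext) {
        if (s!</text>.*!!s) {                          # closing tag: flush one record
            $text .= $_ ; $intext = 0 ;
            $title =~ s/\t/ /g ; $text =~ s/[\t\n]/ /g ;
            print $title, "\t", length ($text), "\n" ;
        }
        else { $text .= $_ }                           # text continues on next line
    }
}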
> Erik: The largest of articles takes approx. 1/10 of a second running
> the binary produced by this C code. Using Inline::C in perl, I could
> fairly easily embed the code (style.c from GNU Diction) into your
> script. It would take and return strings. "Simple!" =) Otherwise I can
> just produce the data in csv etc.. and provide it to you.
>
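For reference, the Inline::C mechanism Brian mentions would look roughly
like this; the C function below is a trivial stand-in of my own, not the
actual style.c code:

use Inline C => <<'END_C' ;
/* stand-in for a style.c routine: counts the words in a string */
int count_words (char* text) {
    int n = 0, inword = 0 ;
    for ( ; *text ; text++) {
        if (*text == ' ' || *text == '\n' || *text == '\t') { inword = 0 ; }
        else if (!inword) { n++ ; inword = 1 ; }
    }
    return n ;
}
END_C

print count_words ("the quick brown fox"), "\n" ;      # prints 4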
Questions and caveats:
1/10 sec x 2 million articles early in 2007 is 55 hours. Plus German is 80
hours. Of course you say 1/10 is for the largest articles only.
Still it adds up big time when all months are processed, and running
WikiCounts incrementally, only adding data for the last month, has its
drawbacks, as explained in our meeting at Wikimania. Is it 1/10 sec for all
tests combined? Could we limit ourselves to the better researched tests, or
the tests which are supported in more languages or deemed more sensible anyway?
I would prefer tests that work in all alphabet-based languages. When wiki
syntax is introduced that is not stripped by the regexps above or some other
tool, it would produce artificial drift in the results over the months.
> This data is very easy to reproduce. I provide a unix command for each
> that assumes you have installed the lynx text browser, which has a
> dump command to strip out html and leave text, and the GNU Diction
> package, which provides style. Style supports English/German.
Stripping html is already done, see above.
I could imagine we run these tests on a yet to be determined sample of all
articles to save processing costs.
Tracking 10,000 or 50,000 articles from month to month, if chosen properly
(randomly?), should give decent results.
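For the record, one way to keep such a sample stable from month to month is
to hash the article id and keep a fixed fraction; a small sketch of my own
(the 2% rate and the sub name are just for illustration):

use Digest::MD5 qw(md5_hex) ;

sub in_sample {
    my ($page_id, $rate) = @_ ;                        # e.g. $rate = 0.02 for ~2%
    my $bucket = hex (substr (md5_hex ($page_id), 0, 8)) / 0xffffffff ;
    return $bucket < $rate ;                           # same articles selected every month
}

print "sampled\n" if in_sample (12345, 0.02) ;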
Cheers, Erik Zachte
Here are a few readability measure examples. Just a side-by-side comparison
of the text of the GWB article from en.wp, simple.wp, and de.wp. I plan on
parsing en, de and simple in full and exploring how these measures might be
correlated with quality.
ps: Does anyone know of a script that can strip out wiki syntax? This is
pertinent. It will also be necessary to leave only paragraphs of text in the
articles... the below data is noticeably skewed in some (but not all) of the
measures.
pss: I recall from the Wikimania meeting that someone had a script to
convert a dump to tab-delimited data. That would be useful to me...
could someone provide a link?
Erik: The largest of articles takes approx. 1/10 of a second running
the binary produced by this C code. Using Inline::C in perl, I could
fairly easily embed the code (style.c from GNU Diction) into your
script. It would take and return strings. "Simple!" =) Otherwise I can
just produce the data in csv etc.. and provide it to you.
See [[Readability]] and Google to get an idea of what these
readability grades mean. Briefly:
All of these explained quite simply: http://www.readability.info/info.shtml
Kincaid: http://en.wikipedia.org/wiki/Flesch-Kincaid_Readability_Test#Flesch-Kincaid…
ARI: http://en.wikipedia.org/wiki/Automated_Readability_Index
Coleman-Liau: http://en.wikipedia.org/wiki/Coleman-Liau_Index
Flesch Index: http://en.wikipedia.org/wiki/Flesch-Kincaid_Readability_Test#Flesch_Reading…
Fog Index: http://en.wikipedia.org/wiki/Gunning-Fog_Index
Lix: http://www.readability.info/info.shtml
SMOG-Grading: http://en.wikipedia.org/wiki/SMOG_Index
This data is very easy to reproduce. I provide a unix command for each
that assumes you have installed the lynx text browser, which has a
dump command to strip out html and leave text, and the GNU Diction
package, which provides style. Style supports English/German.
----------------------------------------------------------------
[[George W. Bush]] on en.wp:
lynx -dump http://en.wikipedia.org/wiki/"George W. Bush" | style
YMMV: I removed all the hyperlinks in this article before running style
----------------------------------------------------------------
readability grades:
Kincaid: 11.7
ARI: 13.5
Coleman-Liau: 12.8
Flesch Index: 54.0
Fog Index: 15.3
Lix: 51.3 = school year 10
SMOG-Grading: 13.1
sentence info:
60081 characters
12376 words, average length 4.85 characters = 1.52 syllables
513 sentences, average length 24.1 words
58% (299) short sentences (at most 19 words)
18% (97) long sentences (at least 34 words)
65 paragraphs, average length 7.9 sentences
0% (3) questions
22% (114) passive sentences
longest sent 294 wds at sent 507; shortest sent 1 wds at sent 5
word usage:
verb types:
to be (155) auxiliary (49)
types as % of total:
conjunctions 4% (544) pronouns 3% (336) prepositions 11% (1311)
nominalizations 3% (311)
sentence beginnings:
pronoun (47) interrogative pronoun (3) article (40)
subordinating conjunction (23) conjunction (5) preposition (40)
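As a sanity check on the grades above (my own snippet, not part of style):
re-applying the published Flesch-Kincaid grade formula to the rounded
averages reported for this article, 24.1 words per sentence and 1.52
syllables per word, lands on the same grade:

my $kincaid = 0.39 * 24.1 + 11.8 * 1.52 - 15.59 ;      # Flesch-Kincaid grade level formula
printf "Kincaid: %.1f\n", $kincaid ;                   # ~11.7, matching the value style reports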
----------------------------------------------------------------
[[George W. Bush]] on simple.wp:
lynx -dump http://simple.wikipedia.org/wiki/"George W. Bush" | style
----------------------------------------------------------------
readability grades:
Kincaid: 3.3
ARI: 0.7
Coleman-Liau: 6.0
Flesch Index: 88.6
Fog Index: 6.5
Lix: 23.6 = below school year 5
SMOG-Grading: 7.4
sentence info:
8659 characters
2344 words, average length 3.69 characters = 1.28 syllables
248 sentences, average length 9.5 words
65% (163) short sentences (at most 4 words)
10% (26) long sentences (at least 19 words)
14 paragraphs, average length 17.7 sentences
0% (0) questions
10% (27) passive sentences
longest sent 253 wds at sent 39; shortest sent 1 wds at sent 4
word usage:
verb types:
to be (40) auxiliary (1)
types as % of total:
conjunctions 1% (24) pronouns 1% (33) prepositions 4% (95)
nominalizations 1% (24)
sentence beginnings:
pronoun (10) interrogative pronoun (0) article (3)
subordinating conjunction (3) conjunction (1) preposition (2)
----------------------------------------------------------------
[[George W. Bush]] on de.wp:
lynx -dump http://de.wikipedia.org/wiki/"George W. Bush" | style -L de
----------------------------------------------------------------
readability grades:
Kincaid: 8.0
ARI: 6.7
Coleman-Liau: 12.3
Flesch Index: 57.7
Fog Index: 10.8
Lix: 34.4 = school year 5
SMOG-Grading: 5.3
sentence info:
37740 characters
7909 words, average length 4.77 characters = 1.63 syllables
694 sentences, average length 11.4 words
63% (441) short sentences (at most 6 words)
16% (116) long sentences (at least 21 words)
56 paragraphs, average length 12.4 sentences
0% (2) questions
6% (44) passive sentences
longest sent 274 wds at sent 256; shortest sent 1 wds at sent 191
sentence beginnings:
pronoun (14) interrogative pronoun (3) article (37)
----------------------------------------------------------------
Cheers,
Brian Mingus
This mail (including pictures) was sent to attendees of Wikimania 2006 and
some others who recently showed active interest in quantitative research.
Crossposting here. I hope you will find at least something in this mail that
is to your liking.
Wikimania 2006 was, like its predecessor in Frankfurt, a source of
inspiration.
Several official and impromptu meetings were held that were related to
research and quantitative analysis.
At a conference with 6 parallel sessions one has to make difficult choices,
and for me it was impossible to attend several highly interesting research
meetings.
----------------------------------------------------------------------
Wikimedia Research
I am very much looking forward to a transcript, or at least speaker notes
and/or personal observations, of several presentations.
Foremost among them is James' "Research about Wikimedia: A workshop" [1].
I also hope that James as Chief Research Officer could give us a sense of
direction and timing: the mission of the Wikimedia Research Network [2] is
lofty, the number of Wikimedians that subscribed is large, but the current
status for most activities seems to be 'idle' [3] [4]? Also, is there any
coordination with external research groups, like those mentioned on [5] and
elsewhere [6]?
Would it be useful to divide Wikimedia Research Network activities into
A Quantitative Analysis
B Social Research Collaborations [7]
C Other Activities
and coordinate these separately?
C would still cover 50%+ of the WRN mission statement, like: identify the
needs of the individual Wikimedia projects, make recommendations for
targeted development, guide and motivate outside developers, assist in the
study of new project proposals.
I expect that at Wikimania most social science sessions [8] presented
relevant material and either used or added to quantitative research. So
there is synergy between A and B.
[1] http://wikimania2006.wikimedia.org/wiki/Proceedings:JF1
[2] http://meta.wikimedia.org/wiki/Wikimedia_Research_Network
[3] http://meta.wikimedia.org/wiki/Category:Research_Team
[4] http://meta.wikimedia.org/wiki/Research/Research_Projects
[5] http://meta.wikimedia.org/wiki/Research
[6] http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wikidemia
[7] http://meta.wikimedia.org/wiki/Research/Social_Research_Collaborations
[8] http://wikimania2006.wikimedia.org/wiki/Category:Wiki_Social_Science
----------------------------------------------------------------------
Communication
There was no IRC meeting of the Research Team after December 2005. There are
pretty active Wikimedia researchers outside the team though. For me
Wikimania 2006 confirmed that more exchange of ideas would be helpful.
I'm not sure more IRC discussions are a panacea. Personally I prefer
discussion via wiki and mailing list: it is less spontaneous, but one can
more easily formulate a coherent proposal or comment on it in a thoughtful
manner, and, no less important, it is much easier to follow for others who
read the discussion later.
Part of the information flow is now on meta, some of it on the research
mailing list [8] (which is largely dormant [9], though recent posts are very
useful), and some of it on the freelogy list [10] and probably elsewhere.
What about making the Wikimedia research list the central forum for all
broad and conceptual discussions and linking from there to meta for detailed
discussions? I will post this mail there anyway, of course without the
images.
[8] http://mail.wikipedia.org/pipermail/wiki-research-l/
[9] http://www.infodisiac.com/Wikipedia/ScanMail/Wiki-research-l.html
[10] http://karma.med.harvard.edu/mailman/listinfo/freelogy-discuss
----------------------------------------------------------------------
Visualisation
I personally enjoyed very much session Can Visualization Help? [11]
= IBM researcher Fernanda Viégas [12] talked about the famous Wikipedia
History Flow tool [13], which was recently extended, announced a free
edition, and said that Tim Starling had pledged to reinstate the relevant
export function so that we can use the tool on our projects.
= IBM researcher Martin Wattenberg [14] showed his newest toy, where one can
see all contributions of one single Wikimedia editor, presented as an
association cloud (titles grouped per namespace and sorted by number of
edits, font size varied per title to express relative number of edits). It
is somewhat scary though: I feel a quantitative improvement (exposing data
that are already online in a much more efficient manner) can lead to a
qualitative setback (exposing one's character and interests in a way that
was never expected). People may after all regret that they edited under
their real name. Although personally I will happily continue to do so, it is
a matter of responsibility towards the community to at least discuss whether
we should actively promote such a tool. I know I'm partially guilty in this
respect myself with the mailing list stats, but feel that did not cross the
line.
= Visualization guru Ben Shneiderman [15] made a case for more advanced
data visualisation tools to spice up wikistats. I am a long-time admirer of
several of his UI inventions and happy to take up the challenge.
[11] http://wikimania2006.wikimedia.org/wiki/Proceedings:FV1
[12] http://alumni.media.mit.edu/~fviegas/
[13] http://www.research.ibm.com/visual/projects/history_flow/
[14] http://www.bewitched.com/
[15] http://www.cs.umd.edu/~ben/
---------------------------------------------------------
General User Survey
One promising but sleeping WRN project, which I initiated myself, is the
'General User Survey' [21]. A few Wikimania participants interested in
wikistats gathered ad hoc at lunch time on Saturday (others interested in
the project, Cormaggio and Piotrus, were at the conference, but not in the
vicinity at that moment). Kevin Gamble, associate director of 75 Land-Grant
Universities, expressed his continued interest and said he might be able to
offer programming support.
A project definition plus rationale [21] and a mockup questionnaire form
[22] have been created and discussed for more than a year. I started the
transition towards technical design [23], and with Kevin's support and
resources coding might follow later this year. Once we have a proof of
concept in e.g. English and German (at least two languages, to show the
multilingual aspects) I'm sure more people will start to take notice, and
help to discuss and fine-tune the questionnaire. At a later stage, before
going live with a multilingual golden edition, we will probably have to
discuss matters with the board (Anthere already stated her support) in order
to make this an official survey, hopefully with coverage on the project
pages themselves (banner announcement?). Mind you, the implementation is
not exactly trivial: lots of issues are involved that require critical
discussion, code and coordination. I invite everyone to comment on the tech
notes, especially of course Kevin, and hope to learn from him whether coding
this project fits within his budget.
[21] http://meta.wikimedia.org/wiki/General_User_Survey
[22] http://meta.wikimedia.org/wiki/General_User_Survey/Questionnaire
[23] http://meta.wikimedia.org/wiki/General_User_Survey/Implementation_Issues
--------------------------------------------------------
Quantitative Analysis
Saturday I met Jeremy Tobacman. We had a long and very interesting
discussion, mainly on new initiatives centered around the freelogy servers.
Jeremy proposed to hold an impromptu lunch meeting on Sunday and gathered a
room full of people.
[pictures removed]
Several mails have already been written about this, but to a smaller
audience. So here are a few highlights.
Issues that were discussed:
1 Hardware
The two tool servers [32] are very crowded and insufficient for all the
stats jobs we might want to run. The tool servers run a mirror of the live
database, so well-behaved SQL queries are possible. Well behaved meaning
they should not try to emulate the xml dump process, where extracting the
English Wikipedia (all revisions) already takes a full week.
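As an aside, a well-behaved query in this sense is a bounded aggregate that
never touches revision text; a Perl/DBI sketch, where the host, database
name and credentials are placeholders and not real toolserver settings:

use DBI ;

# connect to the replicated database (placeholder DSN/credentials)
my $dbh = DBI->connect ('DBI:mysql:database=enwiki_p;host=sql.toolserver',
                        'username', 'password', { RaiseError => 1 }) ;

# one aggregate over the page table, no revision text involved
my ($articles) = $dbh->selectrow_array (
    'SELECT COUNT(*) FROM page WHERE page_namespace = 0 AND page_is_redirect = 0') ;
print "content pages in main namespace: $articles\n" ;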
Alexander Wait (Sasha) has access to huge hardware resources, enough to
calculate how many parallel universes it takes to find at least one zebra
couple where a black-and-white mother and a white-and-black father have
exactly mirrored patterns and thus produce offspring that is either all
black or all white (mind you, albinos are false positives).
Since in reality Sasha is merely interested in unraveling the secrets of DNA
he has some cpu cycles to spare. Upon request virtual machines can be
catered for. The freelogy-discuss mailing list archives have information
about hardware availability [33].
By the way, Jeremy and Erik Tobacman have a server at The National Bureau of
Economic Research (NBER) for quantitative research on Wikipedia.
Also, I am urged by the Communications Subcommittee to spend more of my time
on publishable stats (in time spent, the TomeRaider offline edition of
Wikipedia easily dominated, but the time for offline browsing is nearly
over), and they want me to have a dedicated server. I would like it to be
well utilised, but of course it should produce timely wikistats in the first
place, as that is what it is offered for. To be discussed.
2 Real time data collection / Performance / Storage
It would be useful to learn when a page is being slashdotted or otherwise in
the news, at the moment of the actual event, so that vandal patrols can be
summoned in time and article improvement can commence right away.
Major performance issues need to be addressed.
Do we gather and keep every page hit? Hardly practicable. Wikimedia visitor
stats were not disabled for no reason. It seems we are getting switches that
can log accesses stochastically (e.g. every 100th access, plus all hits for
a selected subset of IP addresses to monitor navigation patterns).
There might be a need to store data in aggregated (condensed) form, as
volumes will be huge. At least tapping from the switches directly puts no
burden on the squids (= web proxies/caches).
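A toy sketch of what such stochastic logging plus aggregation could look
like on the receiving end; the log line layout, the 1-in-100 rate and the
watch list are assumptions of mine, not the actual switch/squid setup:

my %watched_ip = map { $_ => 1 } ('10.0.0.1') ;        # hypothetical IPs followed in full
my %hits ;
while (my $line = <STDIN>) {
    my ($ip, $url) = (split ' ', $line)[0, 1] ;        # assumed: ip and url are the first two fields
    next unless ($. % 100 == 0) || $watched_ip{$ip} ;  # keep every 100th hit, plus watched IPs
    $hits{$url}++ ;                                    # store aggregated counts, not raw hits
}
print "$hits{$_}\t$_\n" for sort { $hits{$b} <=> $hits{$a} } keys %hits ;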
Brion will be asked to drop bz2 compression for the xml dump job, as it is
so much slower and compresses so much less than 7zip. Brion had to develop a
distributed version of bzip to get it working at all on the 800 GB enwiki
dump file. Format bz2 is however supported on more platforms, so Brion may
not comply.
Specifically about wikistats: I explained why I always process the full
historic dump instead of doing incremental steps: new functionality in
wikistats means processing it all anyway, and data for older months are not
really static due to frequent deletions and moves. Could I speed up the
counts section of wikistats by splitting the job over several servers? I'll
have to look into it.
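Splitting the counts section would roughly mean partitioning the work and
merging the results afterwards; whether the slices run as processes on one
box or via ssh on several servers, the shape is the same. A sketch of mine,
where process_slice() is a hypothetical worker that writes one counts file
per slice:

my @slices = ('2001-2003', '2004-2005', '2006') ;      # illustrative partition by period
my @pids ;
for my $slice (@slices) {
    my $pid = fork () ;
    die "fork failed: $!" unless defined $pid ;
    if ($pid == 0) {                                   # child: process one slice of the dump
        process_slice ($slice) ;                       # hypothetical worker, writes counts_$slice.csv
        exit 0 ;
    }
    push @pids, $pid ;
}
waitpid ($_, 0) for @pids ;                            # wait for all slices, then merge the csv files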
3 Data publishing
We should be careful not to publish very granular data for outside
inspection. It is a well known fact that China wants complete control over
its citizens. Less known is that they have the latest technology (mainly
bought in the US) and lots of it, and about 30,000 IT professionals
(estimate by Reporters without Borders/Reporters sans Frontières) working on
concealment of internet resources, redirection of internet requests and
spying on internet usage patterns in general. They would love to see our raw
access logs. Cathy, will you attend the Chinese Wikimania? [34] If you
happen to hear about these things, I hope you will blog about it. See also
[35].
See also the well-timed scoop [36] about the AOL privacy disaster.
4 Measuring quality quantitatively
It may be impossible to define quality, let alone measure it, but it will be
fun to zoom in on it and see how far we can get. Spurred by Jimbo's
excellent Wikimania kick-off speech, where he stressed that we will need
more attention to quality, I started a project to extend wikistats. Brian
offered lots of ideas and will hopefully prove me wrong in my belief that
adding spelling, grammar and readability assessments is not to be taken too
lightly in a multilingual environment [37] [38].
[31] http://wikimania2006.wikimedia.org/wiki/Proceedings:CM1 (mp3 audio
available)
[32] http://meta.wikimedia.org/wiki/Toolserver
[33] http://karma.med.harvard.edu/mailman/private/freelogy-discuss/2006-May/000002.html
(registration needed: http://karma.med.harvard.edu/mailman/listinfo/freelogy-discuss)
[34] http://en.wikinews.org/wiki/Chinese_Wikimania_2006_to_be_held_in_Hong_Kong
[35] http://wikimania2006.wikimedia.org/wiki/User:Roadrunner (I wonder if he
is the person who gave a smashing full hour speech on this at 20c3 Berlin)
[36] http://www.siliconbeat.com/entries/2006/08/06/aol_research_exposes_data_weve_got_a_little_sick_feeling.html
(data were anonymized, but some users had searched for their own name
several times and were easily recognized; lots of very embarrassing stuff
was uncovered)
[37] http://meta.wikimedia.org/wiki/Wikistats/Measuring_Article_Quality
(conceptual overview)
[38] http://meta.wikimedia.org/wiki/Wikistats/Measuring_Article_Quality/Operationalisation_for_wikistats
---------------------------------------------------------
Ongoing
By the way, Angela Beesley and Jakob Voss will give a workshop on Wikipedia
research at WikiSym 2006 [41] [42].
[41] http://ws2006.wikisym.org/space/Workshop%3E%3EWikipedia+Research
[42] http://meta.wikimedia.org/wiki/Workshop_on_Wikipedia_Research%2C_WikiSym_2006
Regards, Erik Zachte
Hi,
On August 6, 2006, at lunch on the last day of Wikimania, eight doctoral
students in the humanities and social sciences, from around the world
(Italy, Ireland, Germany, Greece, Taiwan, U.S.--Georgia, Boston, Chicago),
discussed various ways to collaborate.
This web page on Wikimedia is one place we may continue discussing
collaborations:
Social Research Collaborations
http://meta.wikimedia.org/wiki/Research/Social_Research_Collaborations
If you know any graduate students who are actively engaged in social
research of aspects of the free culture movement or of wikimedia projects,
please let them know about this webpage.
Thanks,
Doug
WikiSym 2006 is upon us in two weeks. To get an
idea of what the conference will be like, please view the WikiSym program:
http://www.wikisym.org/ws2006/program.html
You may also enjoy the "How and Why Wikipedia
Works" interview with our keynoter:
http://www.riehle.org/computer-science/research/2006/wikisym-2006-interview…
================
CALL FOR PARTICIPATION
WIKISYM 2006: THE 2006 INTERNATIONAL SYMPOSIUM ON WIKIS
August 21-23, 2006, Odense, Denmark
CO-LOCATED WITH ACM HYPERTEXT 2006
See http://www.wikisym.org/ws2006
Archival - Peer Reviewed - ACM Sponsored
EARLY REGISTRATION DEADLINE APPROACHING: June 19, 2006
GENERAL INFORMATION
This year's Wiki Symposium brings together wiki
researchers and practitioners in the historic and
beautiful city of Odense, Denmark, on August
21-23, 2006. Participants will present, discuss,
and move forward the latest advances in wiki
contents, sociology, and technology. The
symposium program offers invited talks by Angela
Beesley ("How and Why Wikipedia Works"), Doug
Engelbart and Eugene E. Kim ("The Augmented
Wiki"), Mark Bernstein ("Intimate Information")
and Ward Cunningham ("Design Principles of
Wikis"). The research paper track presents and
discusses breaking wiki research, the panels let
you listen to and contribute to topics like
"Wikis in Education" and "The Future of Wikis",
and the workshops let you get active and
contribute to on-going research and practitioner
work with your peers. (Many workshops accept
walk-ins, so it is not too late!) What's more,
for the first time, we will have an on-going
openspace track (to replace BOFs) so you can get
active and involved in an organized fashion on
any wiki topic you like. We believe this is how
to get the most out of your experience at WikiSym 2006!
And, of course, if you can't wait, please join
our conversation on wiki research and practice on
the symposium wiki at http://ws2006.wikisym.org
PROGRAM OVERVIEW
See http://www.wikisym.org/ws2006/program.html
Keynotes and invited talks:
* Angela Beesley: How and Why Wikipedia Works
* Doug Engelbart and Eugene E. Kim: The Augmented Wiki
* Mark Bernstein: Intimate Information
* Ward Cunningham: Design Principles of Wiki
Panels on:
* Wikis in Education
* The Future of Wikis
Research papers and practitioner reports on:
* wiki technology
* wiki sociology and philosophy
* wiki uses, for example, in software, education, and politics
and many more, see http://www.wikisym.org/ws2006/program.html#Papers
Workshops on:
* wikis in education
* wikipedia research
* wiki markup standards
* wikis and the semantic web
And, of course: Demos! We have pre-set demos, but
please feel free to bring your own notebook! We
will provide space for you to demo on-the-spot in
our Monday night demo session, a favorite from WikiSym 2005.
SYMPOSIUM LOGISTICS
Handled through the Hypertext 2006 website:
* Conference registration:
http://hypertext.expositus.com/information.asp?Page=76&menu=13
* Conference hotel:
http://hypertext.expositus.com/information.asp?Page=93&menu=13
* Travel information:
http://hypertext.expositus.com/information.asp?Page=91&menu=13
SYMPOSIUM COMMITTEE
Dirk Riehle, Bayave Software GmbH, Germany (Symposium Chair)
Ward Cunningham, Eclipse Foundation, U.S.A.
Kouichirou Eto, AIST, Japan (Publicity Co-Chair)
Richard P. Gabriel, Sun Microsystems, U.S.A.
Beat Doebeli Honegger, UAS Northwestern Switzerland (Workshop Chair)
Matthias L. Jugel, Fraunhofer FIRST, Germany (Panel Chair)
Samuel J. Klein, Harvard University, U.S.A.
Helmut Leitner, HLS Software, Austria (Publicity Co-Chair)
James Noble, Victoria University of Wellington, New Zealand (Program Chair)
Sebastien Paquet, Socialtext, U.S.A. (Demonstrations Chair)
Sunir Shah, University of Toronto, Canada (Publicity Co-Chair)
PROGRAM COMMITTEE
James Noble, Victoria University of Wellington, New Zealand (Program Chair)
Ademar Aguiar, Universidade do Porto, Portugal
Robert Biddle, Carleton University, Canada
Amy Bruckman, Georgia Institute of Technology, U.S.A.
Alain Désilet, NRC, CNRC, Canada
Ann Majchrzak, University of Southern California, U.S.A.
Frank Fuchs-Kittowski, Fraunhofer ISST, Germany
Mark Guzdial, Georgia Institute of Technology, U.S.A.
Samuel J. Klein, Harvard University, U.S.A.
Dirk Riehle, Bayave Software GmbH, Germany
Robert Tolksdorf, Freie Universität Berlin, Germany