Hi Travis,
I thought that was you when I read your post; yes, we did indeed talk. Actually, it was
after our talk that I searched extensively to find what is considered top-tier in
computer science. Here are brief comments I should have included earlier, explaining how
I came up with the three sources of computer science "high quality"
conferences:
* Top Tier and 2nd tier conferences from
http://webdocs.cs.ualberta.ca/~zaiane/htmldocs/ConfRanking.html: In extensive searching
for computer science conference rankings, this is the absolute best I could find, and most
other rankings I found have either referred to or copied from this list.
* A-ranked conferences in Information and Computing Sciences from
http://lamp.infosys.deakin.edu.au/era/?page=cforsel10: This is the most exhaustive journal
ranking exercise I have ever found anywhere. Unfortunately, like you, I have serious
questions about the face validity of these rankings; I think they heavily overrate many
conferences in my own field of information systems; I assume the same is true with other
fields that I don't know so well. (My primary reservation with conference or journal
rankings by professors is that I strongly suspect that one of the main criteria for their
rankings is whether or not they have published in that outlet before.) Unfortunately, I
don't know of anything that approaches this ranking in comprehensiveness.
* We also considered including all WikiSym articles on Wikipedia: This is not a statement
about WikiSym's quality, but simply because WikiSym is probably the closest
thing that exists to an academic conference specifically for Wikipedia-related research.
Is there no widely-accepted listing of computer science conference rankings? You say,
"Everyone in my field (HCI) pretty much knows what the first tier conferences are
where wikipedia research is published." The problem is that I could say the same
thing about my field, but another researcher would have a different list. There is
generally consensus about the top two or three in any field, but the huge grey zone comes
when you try to draw a line. Even your idea of getting small groups of experts to validate
a number of conferences is pretty shaky, since another small group of experts would
almost certainly give different results.
Citation counts are always a sticky issue; they depend heavily on which databases index
an article and on how recently it was published. However, I do consider them one of the most
objective (not necessarily one of the best, but one of the most objective) criteria for
paper quality. Based on your suggestion, I just now discovered that ACM Digital Library
includes citation counts for conference papers. By way of brainstorming, I'm thinking
of this possible inclusion rule:
* Calculate (a) the average citation count for Wikipedia articles (either only journal
average, only conference average, or average of both), and (b) average citation count for
each journal and/or conference that publishes Wikipedia research. (b) is basically (a)
grouped by journal/conference.
* Rather than doing raw citation counts, we could try to calculate citations per year or
some other weighting that recognizes that more recent articles would have fewer citations
than older ones.
* Include all conference papers whose citation counts are greater than the average
(whichever average we choose), and/or include papers from all conferences whose average
citation counts exceed the overall average. Or we
could include all conference papers whose average citations per year are greater than the
average for journal articles. Or just include the top 100 ranked conference papers, or
however many we can handle.
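To make the idea concrete, here is a minimal Python sketch of that inclusion rule. The venues and citation counts are entirely made up for illustration; a real run would pull the numbers from the ACM Digital Library or Google Scholar, and the choice of which average to use is still open:

```python
# Hypothetical sample records: (venue, year published, raw citation count).
papers = [
    ("CHI", 2007, 120),
    ("CSCW", 2009, 45),
    ("WikiSym", 2010, 8),
    ("JASIST", 2006, 60),  # journal
    ("CACM", 2008, 30),    # journal
]
journals = {"JASIST", "CACM"}
current_year = 2011  # year of this discussion

def citations_per_year(year, count):
    """Weight raw counts by article age so recent papers aren't penalized."""
    age = max(current_year - year, 1)
    return count / age

# (a) Average citations-per-year for journal articles only.
journal_rates = [citations_per_year(y, c) for v, y, c in papers if v in journals]
journal_avg = sum(journal_rates) / len(journal_rates)

# Inclusion rule: keep conference papers whose citations-per-year
# exceed the journal-article average.
included = [
    (v, y, c) for v, y, c in papers
    if v not in journals and citations_per_year(y, c) > journal_avg
]
print(included)
```

The same skeleton works for the other variants (raw counts instead of per-year rates, or a per-conference average computed by grouping step (b) by venue); only the threshold line changes.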
Although still somewhat artificial, this could possibly give us a fairly objective
basis for identifying the "higher quality" conference papers based on citation
analysis.
I don't know how far I want to go with this citation count possibility, but
what do you all think?
Thanks again,
Chitu
-------- Original Message --------
Subject: Re: [Wiki-research-l] Wikipedia literature review - include or exclude conference
articles (was Request to verify articles for Wikipedia literature review)
From: Travis Kriplean <travis(a)cs.washington.edu>
To: Research into Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
Cc: Chitu Okoli <Chitu.Okoli(a)concordia.ca>
Date: 15/03/2011 5:26 PM
Hey there,
I sympathize with your dilemma...and I think we might have actually talked about this at
Wikimania 2009. Unfortunately, while you may be satisfied that 600 journal articles +
theses is enough (I certainly would be too), you should be equipped to recognize that if
you keep it that way you are systematically excluding large, significant bodies of
research deriving from computer science and HCI. As you make this choice, read through one
or two of these conference papers and measure them against the quality of a randomly
selected set of journal articles in your set:
- http://dub.washington.edu/djangosite/media/papers/tmpZ77p1r.pdf
- http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/download/1485/1841
- http://www.cs.cornell.edu/~danco/research/papers/suggestbot-iui07.pdf
- http://users.soe.ucsc.edu/~luca/papers/07/wikiwww2007.pdf
- http://portal.acm.org/citation.cfm?id18928
I bet that these conference papers are, on balance, of higher quality than a random
journal article in your set.
Unfortunately, there isn't a good answer for the best methods to follow. Everyone in
my field (HCI) pretty much knows what the first tier conferences are where wikipedia
research is published: CHI, CSCW, and UIST; and second tier at GROUP. These are all under
the ACM SIGCHI banner (
http://www.sigchi.org/). Another way to put this is that there are
no objective measures; it's a question of what the researchers themselves see as high
quality. Ultimately, this is the same as with journals, although they tend to have impact
factors. If I were to estimate how many high quality conference papers from the HCI angle
there are, I would put it at about 20-30.
Of course, this is only for HCI research, not all CS research. Conferences such as WWW
have published excellent research on Wikipedia, such as the initial paper out of the
WikiTrust group, which, if you've been around the wiki community, you know has had a big
impact. WWW is considered to be a high-quality CS conference. Likewise, there
has been Wiki research published at database and AI conferences. For example, the
Intelligence in Wikipedia project (summarized here
http://portal.acm.org/citation.cfm?id20344).
Unfortunately, your two links to top conferences are pretty much inaccurate pictures of
the CS conference field (for example, the deakin link puts GECCO as the top conference in
one of the major categories, which is basically laughable). And while we might all love
WikiSym, from an academic standpoint, it is definitely not a tier one venue.
I cringe to suggest this, but one possible methodology you might follow is to do citation
count filtering using, e.g., Google Scholar. Citations give you an indicator of whether
other researchers have found it useful to draw on. Look at the average citation count of
the journal papers, then filter your list of 1500 conference papers down to those papers
that have, say, twice the citations as the average citation count of a journal article.
Honestly though, your best methodology would be to have a small group of HCI researchers,
a small group of AI researchers, and a small group of database researchers who have worked
on wikipedia compile a list of the conference papers that they believe are best
representative of the research that that community has done on wikipedia.
Hope that helps, and sorry to hear you're still struggling with this issue.
Best,
Travis
On 3/15/11 11:56 AM, Chitu Okoli wrote:
> James and Travis, you bring up a point that we have struggled back and
> forth with for several months. We really, really would like to include
> conference articles, but we just can't see how we could handle many more
> articles than what we've got now. We've been working on and off on this
> project for over two years now. (You can find works in progress at the
> link at the bottom to my website.) We'd like to get it done eventually,
> and we can only handle so many articles.