Study the effects of author diversity and work distribution on Wikipedia articles

Project description: 
  • Data basis: Article content with metadata on each word, consisting of the original author and all subsequent editors with their changes, plus additional metadata as needed about edits, editors, talk pages, etc. If desired, the data can also be used to build profiles of how successful editors are at making changes that persist (i.e. changes that are not deleted afterwards).
  • Goal: Gain empirical insight into (a) how the diversity of an article (how many words were written by different editors) is distributed across articles, (b) how invested work (in terms of changes to words) is distributed within articles (by position in the text), and (c) how both affect article quality (e.g. in terms of “featured” or “disputed” tags on articles or sections). Alternative: find out which editors are especially “successful” at getting their changes to persist, or carry out other analyses of your own design on this data.
  • Method: Analyse the given data with statistical methods and, where useful, compute additional metrics.
  • Team: At least 50% of the members should have good statistical analysis skills and be able to handle larger data sets (several thousand Wikipedia articles), e.g. with R or Python.
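
Illustrative sketch: to make goals (a) and (b) more concrete, the following Python snippet shows one possible way to compute a per-article diversity score and a positional distribution of edits. Everything in it is an assumption for illustration only: the project data format is not specified here, so the input is assumed to be a list with one original-author name per word and a list with one edit count per word. Shannon entropy is just one candidate diversity measure, and the function names (author_shares, diversity_entropy, edits_by_position) and the toy data are hypothetical.

```python
import math
from collections import Counter


def author_shares(word_authors):
    """Return {author: share of words} for one article.

    word_authors: assumed input, one author name per word in the article.
    """
    counts = Counter(word_authors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {author: n / total for author, n in counts.items()}


def diversity_entropy(word_authors):
    """Shannon entropy (in bits) of the per-author word shares.

    0 means a single author wrote every word; higher values mean
    the words are spread more evenly over more authors.
    """
    shares = author_shares(word_authors)
    return sum(-p * math.log2(p) for p in shares.values())


def edits_by_position(edit_counts, n_bins=4):
    """Sum per-word edit counts into n_bins equal-width position bins.

    edit_counts: assumed input, number of changes recorded for each
    word, in text order. Bin 0 is the start of the article.
    """
    bins = [0] * n_bins
    n = len(edit_counts)
    for i, count in enumerate(edit_counts):
        bins[min(i * n_bins // n, n_bins - 1)] += count
    return bins


# Hypothetical toy data, not taken from the project data set.
articles = {
    "Article A": ["alice", "alice", "bob", "carol"],
    "Article B": ["dave", "dave", "dave", "dave"],
}

for title, authors in articles.items():
    print(title, len(set(authors)), round(diversity_entropy(authors), 3))

print(edits_by_position([3, 1, 0, 0, 2, 2, 0, 4], n_bins=4))
```

Scores like these, computed for every article, could then be compared against quality indicators such as the “featured” or “disputed” tags for goal (c); which metrics and tests are appropriate is left to the team.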