Hi,
I am Sudeep, a final-year student in the computer science department at the
Indian Institute of Technology, Kharagpur.
I am interested in applying to the following projects for GSoC 2012:
1. Lucene automatic query expansion from Wikipedia text
2. Backwards compatibility extension
3. Semantic form rules
4. Index transcluded text in search
I have a strong background in information retrieval and machine learning. I
have worked previously with Yahoo! Research Labs in the area of information
retrieval, where we extracted association rules and attribute-value pairs
from webpages using an unsupervised approach.
I have also worked on another project with Yahoo!, which involved emotion
detection for YouTube videos based on user comments. We used various machine
learning, statistics, and IR techniques to achieve our goal.
Last year I successfully completed GSoC 2011 with OSGeo, and I have good
experience in open source development.
Kindly let me know how I should proceed with my application.
Thanks and regards,
Sudeep
Hey everyone,
I've had an awful lot of interest in the Who's Been Awesome / get merchandise
to reward the community extension I proposed. We can only really take one
student in the end, so I wanted to make sure that everyone knew the score and
that mentors still looking for help could chime in and let us know.
Eight or nine people have asked about the project, and we have one
almost-full proposal so far. Part of that has been me being slow to respond,
but if you're interested I encourage you to either submit a proposal soon
or look at other options (or both!) so that we can get in as many people as
possible! If you are still waiting for answers from me or you have other
questions, feel free to shoot me an email; I'll be setting time aside
tomorrow (bed soon) to go through them all.
Other mentors: if you're still looking for help, please let us know so that
we can place as many of these great candidates as possible!
James
--
James Alexander
Manager, Merchandise
Wikimedia Foundation
(415) 839-6885 x6716 @jamesofur
Hi everyone,
This email is going to briefly describe the old SVN workflow, and then
use that as a baseline to describe what we should do for Git. I
haven't had a chance to coordinate this mail with Chad (or anyone
else), so I'll reserve the right for him to completely contradict me
here. This is meant to provoke a discussion about how we're really
going to use Git, and to establish a plan for taking advantage of the
new workflow to move to much more frequent deployments.
In the old world, we had this:
trunk
├── REL1_17
│   └── 1.17wmf1 (branched from REL1_17)
├── REL1_18
│   └── 1.18wmf1 (branched from REL1_18)
└── REL1_19
    └── 1.19wmf1 (branched from REL1_19)
Tarball releases would come out of the respective REL1_xx branches,
and deployments would come out of the 1.xxwmf1 branches. REL1_xx
branches have all extensions, and 1.xxwmf1 branches have only
Wikimedia production code. Each would be a relatively long-lived
branch (6-18 months) into which critical fixes and priority features
would be merged from trunk.
Looking ahead to deployments, there are a couple of different ways to go
about this:
One plan would be to have a "wmf" branch that does not trail far
behind the master. The extensions we deploy to the cluster can be
included as submodules for that given branch. The process for
deployment at that point will be "merge from master" or "update
submodule reference" on the wmf branch. Then on fenari, you will git
pull and git submodule update before scapping like you're currently
used to. The downside of this approach is that there's no obvious
way to have multiple production branches in play (heterogeneous
deploy). It seems solvable (e.g. wmf1, wmf2, etc.), but that also seems
messy.
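The first option can be sketched with plain git commands. This is a minimal
local toy (temporary repo, made-up file and commit names), not the actual
Wikimedia configuration; the real setup would additionally pin the deployed
extensions as submodules on the wmf branch.

```shell
# Toy repo standing in for mediawiki/core; all names are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name to "master"
git config user.email dev@example.org
git config user.name dev

echo 'initial' > core.php
git add core.php
git commit -qm 'initial commit on master'

git branch wmf                            # the long-lived deploy branch

echo 'new feature' >> core.php
git commit -qam 'development continues on master'

# Deploy step: bring the wmf branch up to (or near) master.
git checkout -q wmf
git merge -q master
git log --oneline -1
```

On the deploy host, the corresponding step would just be a `git pull` and
`git submodule update` on the checked-out wmf branch before scapping.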
Another possible plan would be to have something *somewhat* closer to
what we have today, with new branches off of trunk for each
deployment, and deployments happening as frequently as weekly.
master
├── 1.20wmf01
├── 1.20wmf02
├── 1.20wmf03
...
├── 1.20wmf11
├── 1.20wmf12
├── REL1_20
├── 1.21wmf01
├── 1.21wmf02
├── 1.21wmf03
...
This is how I was envisioning the process working; I just didn't get
a chance to sync up with Chad to find out what the issues with this
approach would be.
Since we don't have an imminent deployment coming from Git, we have a
little time to figure this situation out.
Regardless of the branching strategy, the goal would be to start as
early as April with much more frequent deployments to production. The
deployment plan would look something like this:
* Deploy 1.20wmf01 to test2 real soon now (say, no later than April 16).
* Deploy 1.20wmf01 to mediawiki.org a couple deploy days after that
("deploy day" meaning Monday through Thursday)
* Let simmer for some short-ish amount of time (TBD)
* Roll out 1.20wmf01 to more wikis, eventually making it to all of them
Given the way APC caching and other caching works, I suspect we can't
get away with having more than two simultaneous versions out on the
production cluster, but we could conceivably have a situation where,
for example, a deploy day or two after rolling 1.20wmf01 out to
the last of the wikis, we then roll 1.20wmf02 out to test2.
This topic is partially covered here:
https://www.mediawiki.org/wiki/Git/Workflow#Who_can_review.3F_Gerrit_projec…
...but I imagine we'll probably need to revise that based on this
conversation and perhaps break this out into a separate page.
A few of us plan to meet in a couple of weeks to formalize something
here, but perhaps we can get this all hammered out on-list before then.
Thoughts on this process?
Rob
Hello,
I'm Gautham Shankar from India, pursuing the 4th year of my bachelor's in
computer science and engineering. I find the project proposal "Lucene
Automatic Query Expansion from Wikipedia Text" in GSoC 2012 very interesting
and would love to work on it.
I have created a proposal for the idea:
https://www.mediawiki.org/wiki/User:Gautham_shankar/Gsoc
I have experience in data mining and have built a recommendation framework
using the heat diffusion principle. It has been tested on the AOL search
dataset to recommend better queries for a given input query, and it is
implemented in Java. Since it is a framework, it can be used to recommend
different types of data; for example, the same framework can recommend
movies as well as music. I'm currently working on an extension of this
project that adds social network graphs, so as to recommend people who have
the same interests in movies, music, etc. when a query is typed.
I have also built a web-based product, "hive", which is a networking
platform for members of the power generation industry. Users can share their
experiences, and it is an open forum where members interact with one another
to effectively run their machines and solve common problems. The product is
implemented using PHP, MySQL, and JavaScript (including Ajax); Lucene is the
search engine, and phpBB is used for the forums.
It would be very helpful if anyone could give feedback and guide me in
improving the proposal.
Eagerly awaiting a response.
Regards,
Gautham Shankar
Dear Sirs,
I am grateful for your valuable feedback and suggestions.
I have updated my proposal based on your inputs. The breakdown of the
deliverables on the ideas page indeed helped me understand the
requirements more clearly.
The link to my updated proposal is
https://www.mediawiki.org/wiki/User:Karthikprasad/gsoc2012proposal
I request you and everyone else to kindly look through my proposal once
again and suggest changes/additions.
I am very excited about this project and about working with you; and truth
be told, 23rd April seems ages away.
Thanking you,
Yours sincerely,
Karthik
> Date: Wed, 4 Apr 2012 11:49:41 +0200
> From: "Oren Bochman" <orenbochman(a)gmail.com>
> To: "'Wikimedia developers'" <wikitech-l(a)lists.wikimedia.org>
> Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
> Message-ID: <007f01cd1248$42ee6f40$c8cb4dc0$@com>
> Content-Type: text/plain; charset="utf-8"
>
> You do understand correctly!
>
> The main idea for the NLP components, with a POS tagger as an example:
>
> 1. a fall back system that does unsupervised POS tagging.
> 2. the ability to plug in an existing POS tagger as these become
> available for specific languages.
>
> As supervisor, I would recommend working with 3 languages:
> English, Hebrew, and the GSoC student's native language.
>
> If we could get QA from other native speakers we would incorporate them
> into the workflow.
>
> I think that by using a deletion/reversion-based heuristic we may also be
> able to build a spam corpus to boost the accuracy of the corpora.
>
>
> Operations Manager
> E-mail: oren(a)romai-horizon.com
> Mobile: +36 30 866 6706
>
> Római Horizon Kft.
> H-1039 Budapest
> Királyok útja 291. D. ép. fszt. 2.
> Tel: +36 1 492 1492
> Fax: +36 1 266 5529
>
> -----Original Message-----
> From: wikitech-l-bounces(a)lists.wikimedia.org [mailto:
> wikitech-l-bounces(a)lists.wikimedia.org] On Behalf Of Amir E. Aharoni
> Sent: Tuesday, April 03, 2012 10:19 PM
> To: Wikimedia developers
> Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
>
> 2012/4/3 karthik prasad <karthikprasad008(a)gmail.com>:
> > Hello,
> > I am a GSoC aspirant and have compiled a proposal for one of the
> > project ideas - Wikipedia Corpus Tools. [Mentor : Oren Bochman] I
> > would sincerely appreciate if you could kindly go through it and
> > suggest corrections/additions so that I can settle with a coherent
> proposal.
> >
> > Link to my proposal :
> > https://www.mediawiki.org/wiki/User:Karthikprasad/gsoc2012proposal
>
> Nice, but why only English?
>
> If i understand the proposal correctly, this project is supposed to be
> able to work with almost any language with very little effort.
>
> --
> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> http://aharoni.wordpress.com "We're living in pieces, I want to live in
> peace." – T. Moore
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>
>
>
> ------------------------------
>
>
> Date: Wed, 4 Apr 2012 12:58:11 +0300
> From: "Amir E. Aharoni" <amir.aharoni(a)mail.huji.ac.il>
> To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
> Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
> Message-ID:
> <CACtNa8tS-PifzJS1JsF02k3qW_-7=UK-wDQnVSfLGLufhxnmNw(a)mail.gmail.com
> >
> Content-Type: text/plain; charset=UTF-8
>
> 2012/4/4 Oren Bochman <orenbochman(a)gmail.com>:
> > You do understand correctly!
> >
> > The main idea for the NLP components, with a POS tagger as an example:
>
> Just to make sure, POS = part of speech, isn't it?
>
> It's one of the most confusing TLAs in computing :)
>
> > If we could get QA from other native speakers we would incorporate them
> into the workflow.
>
> Good. As long as there is a way to plug other languages and a way for
> speakers of other languages to contribute QA, i'm very happy.
>
> --
> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> http://aharoni.wordpress.com
> "We're living in pieces,
> I want to live in peace." – T. Moore
>
Date: Wed, 4 Apr 2012 00:28:29 -0400
From: Gregory Varnum <gregory.varnum(a)gmail.com>
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
Message-ID: <AC4C429F-A839-4911-BE9B-C8928AA2DD8C(a)gmail.com>
Content-Type: text/plain; charset=utf-8
Whoops - I meant that email to be directed to Karthik - although Amir
you're welcome to read it as well. :)
-greg
On Apr 3, 2012, at 11:24 PM, Gregory Varnum <gregory.varnum(a)gmail.com>
wrote:
> Amir,
>
> Thank you for your GSOC proposal! :)
>
> Between now and Google's submission deadline on April 6th, you are
invited to further modify your proposal. Good places for feedback are the
GSoC page on MW.org - https://www.mediawiki.org/wiki/GSOC - and our IRC
rooms - https://www.mediawiki.org/wiki/MediaWiki_on_IRC
>
> Looking over your proposal, I think you've got good background
information on yourself. However, I think you should flesh out more
details on the proposed project. Without more familiarity with corpus tools
(and with no links to find that info) it's hard for everyone to weigh in
equally or to make sure your project gets the full consideration you'd like.
>
> -greg aka varnent
>
>
> On Apr 3, 2012, at 4:18 PM, Amir E. Aharoni <amir.aharoni(a)mail.huji.ac.il>
wrote:
>
>> 2012/4/3 karthik prasad <karthikprasad008(a)gmail.com>:
>>> Hello,
>>> I am a GSoC aspirant and have compiled a proposal for one of the project
>>> ideas - Wikipedia Corpus Tools. [Mentor : Oren Bochman]
>>> I would sincerely appreciate if you could kindly go through it and
suggest
>>> corrections/additions so that I can settle with a coherent proposal.
>>>
>>> Link to my proposal :
>>> https://www.mediawiki.org/wiki/User:Karthikprasad/gsoc2012proposal
>>
>> Nice, but why only English?
>>
>> If i understand the proposal correctly, this project is supposed to be
>> able to work with almost any language with very little effort.
>>
>> --
>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>> http://aharoni.wordpress.com
>> "We're living in pieces,
>> I want to live in peace." – T. Moore
>>
>
Hi everyone,
I would like to work on building a Convention extension as part of the
Google Summer of Code project. I have set up a proposal for it; here is the
link: http://www.mediawiki.org/wiki/User:Chughakshay16/GSOCProposal(2012).
I haven't found a mentor to work with me on this project yet, so if anyone
feels the need for this extension just the way I do, please feel free to
add feedback to the proposal page, or reply here.
More information regarding this extension can be found here :-
http://www.mediawiki.org/wiki/User:Chughakshay16/ConventionExtension
Thanks,
Akshay Chugh
(irc - chughakshay16)