Re: [Wikitech-l] OPW intern looking for feedback!

26 Mar 2013

On Mar 27, 2013 1:31 AM, "Teresa Cho" tcho708@gmail.com wrote:
...
Hi everyone,
My name is Teresa (or terrrydactyl if you've seen me on IRC) and I've
been interning at Wikimedia for the last few months through the
Outreach Program for Women[1]. My project, Git2Pages[2], is an
extension to pull snippets of code/text from a git repository. I've
been working hard on learning PHP and the MediaWiki
framework/development cycle. My internship is ending soon and I wanted
to reach out to the community and ask for feedback.
Cool stuff!
...
Here's what the program currently does:

* User supplies (git) url, filename, branch, startline, endline using
  the #snippet tag
* Git2Pages.body.php will validate the information and then pass on
  the inputs into my library, GitRepository.php
* GitRepository will do a sparse checkout on the information, that is,
  it will clone the repository but only keep the specified file (this
  was implemented to save space)
* The repositories will be cloned into a folder that is a md5 hash of
  the url + branch to make sure that the program isn't cloning a ton of
  copies of the same repository
Why hash it, and not just keep the url + branch encoded to some charset
that is a valid path, saving rare yet hairy collisions?
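
To make the trade-off in this question concrete, here is a minimal sketch of both naming schemes. It is an illustration only, not the Git2Pages code: the function names, the "@" separator and the example URL are assumptions made for this sketch.

```python
import hashlib
from urllib.parse import quote, unquote


def hashed_dir(url, branch):
    """Directory name as the MD5 hex digest of url + branch."""
    return hashlib.md5((url + branch).encode("utf-8")).hexdigest()


def encoded_dir(url, branch):
    """Directory name as a reversible percent-encoding of url and branch.

    safe="" also encodes "/" so the result is a single path component.
    "@" is percent-encoded inside both parts, so it is an unambiguous
    separator (the separator itself is an assumption of this sketch).
    """
    return quote(url, safe="") + "@" + quote(branch, safe="")


def decode_dir(name):
    """Recover (url, branch) from a name produced by encoded_dir."""
    url_part, branch_part = name.split("@")
    return unquote(url_part), unquote(branch_part)


url = "https://example.org/repo.git"  # hypothetical repository URL
print(hashed_dir(url, "master"))   # always 32 hex characters, not reversible
print(encoded_dir(url, "master"))  # readable and reversible, grows with the URL
```

The encoded name is collision-free and can be mapped back to its repository, but it grows with the URL and may exceed the per-component file name limit on some filesystems. The hash has a fixed length; note that hashing a plain concatenation maps ("a", "bc") and ("ab", "c") to the same folder, independent of any MD5 collision.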
...

* If the repository already exists, the file will be added to the
  sparse-checkout file and the program will update the working tree
Will there be a re checkout for a duplicate request? Will the cache of
files ever be cleaned?
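
One possible answer to the duplicate-request question, as a sketch under assumptions rather than a description of what GitRepository.php does: only touch the working tree when the requested path is not yet listed in the sparse-checkout file. The helper names are hypothetical; the git commands are the usual core.sparseCheckout recipe and are only built as argument lists here, not executed.

```python
from typing import List, Tuple


def first_checkout_commands(url: str, branch: str) -> List[List[str]]:
    """Commands for an initial sparse checkout, run inside an empty directory.

    The caller writes the wanted path to .git/info/sparse-checkout after
    the `git config` step and before the pull.
    """
    return [
        ["git", "init"],
        ["git", "remote", "add", "origin", url],
        ["git", "config", "core.sparseCheckout", "true"],
        # --depth=1 limits the history that is fetched; sparse checkout
        # itself only limits which files appear in the working tree.
        ["git", "pull", "--depth=1", "origin", branch],
    ]


def add_sparse_path(existing: str, path: str) -> Tuple[str, bool]:
    """Return (new sparse-checkout content, whether it changed)."""
    lines = [line for line in existing.splitlines() if line.strip()]
    if path in lines:
        return existing, False  # duplicate request: nothing to re-checkout
    lines.append(path)
    return "\n".join(lines) + "\n", True


# Command that refreshes the working tree after the file list changed.
REFRESH_COMMAND = ["git", "read-tree", "-m", "-u", "HEAD"]

content, changed = add_sparse_path("README\n", "src/main.php")
print(changed)   # True: run REFRESH_COMMAND
content, changed = add_sparse_path(content, "src/main.php")
print(changed)   # False: the file is already checked out
```

Passing each command as an argument list (instead of a shell string) keeps user-supplied values from being interpreted by a shell. Cache cleanup is not covered by this sketch.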
...

* Once the repo is cloned, the program will go and yank the lines that
  the user requested and it'll return the text encased in a <pre> tag.
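
The last step can be sketched as follows. This is a minimal illustration, assuming that startline and endline are 1-based and inclusive and that the text is HTML-escaped before it is wrapped; the function name is hypothetical and the real extension may treat both points differently.

```python
import html


def snippet_as_pre(text, startline, endline):
    """Return lines startline..endline (1-based, inclusive) wrapped in <pre>."""
    if startline < 1 or endline < startline:
        raise ValueError("expected 1 <= startline <= endline")
    lines = text.splitlines()
    # Slicing clamps endline to the length of the file.
    selected = lines[startline - 1:endline]
    # Escape the text so that markup inside the file is shown, not interpreted.
    return "<pre>" + html.escape("\n".join(selected)) + "</pre>"


source = "first\n<b>second</b>\nthird\nfourth"
print(snippet_as_pre(source, 2, 3))
```

An endline past the end of the file simply returns everything from startline onward; whether that should instead be reported as an error is a design choice.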
This is my baseline program. It works (for me at least). I have a few
ideas of what to work on next, but I would really like to know if I'm
going in the right direction. Is this something you would use? How
does my code look, is the implementation up to the MediaWiki coding
standard? You can find the progression of the code on
gerrit[3].
Here are some ideas of what I might want to implement while still on
the internship:

* Instead of a <pre> tag, encase it in a <syntaxhighlight lang> tag if
  it's code, maybe add a flag for user to supply the language
* Keep a database of all the repositories that a wiki has (though not
  sure how to handle deletions)
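
The first idea could look roughly like this. It is a sketch under assumptions: the extension-to-language table, the function name and the optional lang flag are hypothetical, and the fallback to <pre> for unknown file types is one possible choice.

```python
import html
import os

# Hypothetical mapping from file extension to a syntaxhighlight language.
EXTENSION_TO_LANG = {
    ".php": "php",
    ".py": "python",
    ".js": "javascript",
    ".c": "c",
}


def wrap_snippet(text, filename, lang=None):
    """Wrap text in <syntaxhighlight> when a language is known, else <pre>."""
    if lang is None:
        extension = os.path.splitext(filename)[1].lower()
        lang = EXTENSION_TO_LANG.get(extension)
    if lang is None or not lang.isalnum():
        # Unknown or suspicious language: fall back to escaped plain text.
        return "<pre>" + html.escape(text) + "</pre>"
    # Caveat: a snippet that itself contains "</syntaxhighlight>" would
    # close the tag early; this sketch does not handle that case.
    return '<syntaxhighlight lang="%s">\n%s\n</syntaxhighlight>' % (lang, text)


print(wrap_snippet("x = 1", "example.py"))
print(wrap_snippet("plain text", "README"))
```

A user-supplied lang overrides the guess from the file name, which covers files whose extension is missing or misleading.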
Here are some problems I might face:

* If I update the working tree each time a file from the same
  repository is added, then the line numbers may not match the old file
* Should I be periodically updating the repositories, or perhaps
  keeping multiple snapshots of the same repository?
* Cloning an entire repository and keeping only one file does not seem
  ideal, but I've yet to find a better solution. The more repositories
  being used concurrently, the bigger an issue this might be
* I'm also worried about security implications of my program. Security
  isn't my area of expertise, and I would definitely appreciate some
  input from people with a security background
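
On the security point, one common starting place is strict validation of the user-supplied url and filename before they reach git. The following is a minimal sketch, not a complete review: the allowed-scheme list and the function names are assumptions, and the checks shown here (option injection, unexpected transports, path traversal) are only some of the cases worth covering.

```python
import posixpath
from urllib.parse import urlsplit

# Assumption: only network transports are accepted. Local paths, file://
# URLs and helper transports such as "ext::" are rejected.
ALLOWED_SCHEMES = {"https", "http", "git"}


def is_safe_url(url):
    """Reject values that git could read as an option or an odd transport."""
    if url.startswith("-"):
        return False  # would be parsed as a command-line option
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)


def is_safe_filename(name):
    """Accept only relative paths that stay inside the repository."""
    if not name or name.startswith(("-", "/")) or "\\" in name or "\0" in name:
        return False
    normalized = posixpath.normpath(name)
    return not (normalized == ".." or normalized.startswith("../"))


print(is_safe_url("https://example.org/repo.git"))  # hypothetical URL
print(is_safe_url("--upload-pack=touch /tmp/x"))
print(is_safe_filename("includes/Setup.php"))
print(is_safe_filename("../../etc/passwd"))
```

Validation like this complements, rather than replaces, running git with an argument list instead of a shell string and escaping the returned text before output.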
Thanks for taking the time to read this and thanks in advance for any
feedback, bug reports, etc.
Have a great day,
Teresa
http://www.mediawiki.org/wiki/User:Chot
[1] https://www.mediawiki.org/wiki/Outreach_Program_for_Women
[2] http://www.mediawiki.org/wiki/Extension:Git2Pages
[3] https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/Git2Pages,n,...
...

Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
