On Sat, Sep 7, 2013 at 5:37 PM, Diederik van Liere <dvanliere@wikimedia.org> wrote:



On Sat, Sep 7, 2013 at 11:33 AM, Laura Hale <laura@fanhistory.com> wrote:



On Sat, Sep 7, 2013 at 5:13 PM, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
Diederik van Liere, 07/09/2013 16:51:


You think people won't mess up XML?

I don't dare suggest which format would be better. Past experience just makes my internal alarm bells ring at the word "CSV", and I wondered whether the correct format is documented. Thanks for the clarification.


It would depend on who the audience is. If it is for people with little research experience and little experience in computer programming, then yes, CSV is a must. It is compatible with OpenOffice and Microsoft Excel, which lets that type of user interact with the data. I do not believe XML renders as nicely in either program as CSV does. Using a more complex file format makes the data difficult for that audience to use. I would assume that other users would be less likely to need a tool that gets this data, because they could build their own and customize the output to their specific needs. Thus, their concerns seem like they should be secondary.

My problem now is that the output file names are complete garbage, and I cannot tell what the heck a file is from the random string generated.
Can you describe your problem in more detail? Perhaps email me the cohort that you are trying to upload as well.


I have no specific problem other than the output file names. If you could give meaning to 1113d4e4-ca39-457d-af97-52338e7388e4.csv, so that just by looking at the output I have some idea what it contains, that would be great. What does it contain compared to 8e59ef55-826f-4d87-830c-629bd67c83cb.csv? Or 0eb786c7-ff4e-46bc-8489-0ca1c1c383cf.csv? I have four cohorts uploaded, and I have no idea what those files are based on the file names.

But I would prefer CSV as the output, because I think it is the best file format for the intended audience who would derive the most benefit from this. That is, people with little experience in doing research and little experience in computer programming who suddenly need to produce metric data for reports to justify funding from the FDC, IEG and other WMF grant programs. If there is a different intended audience, then this needs to be made explicit, so that people working with that particular cohort of users can steer them away from this tool and towards one more suitable for their needs.

-- 
twitter: purplepopple
blog: ozziesport.com