On 01.08.2013 14:02, info@gno.de wrote:
As we have discussed several times, some bot operators do not like having .exe files executed without their explicit permission.
I understand this! That's the reason why you are asked for permission before the install starts. What else should be done? A value in user-config.py? Another prompt just for the patch? This particular .exe file is open source, so you can look at the source code if you suspect it of doing harm. What else can I do? What would you like to see?
In my opinion it is not a good solution to block the bot now if the permission is denied, because the bot has worked for 10 years without this permission and without patching beautifulsoup. That library has been included in our code since the first steps of pwb. What are the problems with including it, or what are the license problems? Maybe you are right and we did it wrong in the past. And what is the special necessity for that patch? Externals can be included without running patch.exe. But I do not understand them. I understood that patch as a sample of that technology, which should be used when needed. Now I only see the exception raised when starting any script of the framework (which was a few days ago and not some months back). Sorry, I am not up to date with testing ;) OK, this behaviour seems a good way to let the trunk/compat branch die with honour. In this respect I am with you :D
You are right: for the specific case of beautifulsoup.py the patch is not NEEDED. It just re-establishes the state we had, with some minor changes made to "our" beautifulsoup.py, but as you said they are not important, so this patch can be removed. Is downloading (everything else except patching) fine for you?
Including code from another source or repo is simple but dangerous and not that clever (and may be illegal, depending on the license). We could try to write an explicit install manual for all external code, as was done in the past, but it was not done strictly (e.g. spelling and beautifulsoup.py are included, but irclib, for example, is not). Install instructions are spread over several files such as the docs, the scripts and maybe elsewhere. I wrote such instructions in core for lua, but it is somewhat of a mess.
Regarding the patches (except beautifulsoup.py): as already mentioned, I tried to minimize them and keep them as simple as possible, therefore they are short but essential. If you can tell me another way to solve this, please step forward. I just tried to implement the most efficient method. Consider e.g. 'music21': this code is 10-100 MB, and the patch/changes needed in order to run it (without a full install of music21, since that is not needed) amount to about 500 bytes. That's all we have to change; shall we now fork the whole of music21 just because of 500 bytes? If we do so, who is in charge of keeping "our" music21 fork up to date with the one upstream? (This just creates workload. Believe me, I tried it ;)
(Another thing is that some of the externals used for catimages.py need to be compiled since they are C/C++ source for python libraries.)
This kind of code patching is a very well-known, very old (and well-tested) technology in the Linux world. It's a pity that Windows does not know it at all, but nevertheless I (and others) consider it a good technology. It is, for example, very transparent: you can easily understand what a patch is doing, since the patch files are in human-readable unified diff format (not binary like setup.exe or the like).
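To illustrate what "human-readable unified diff format" means, here is a minimal sketch using Python's standard difflib module. The file name example_module.py, the flag USE_FULL_INSTALL and the helper make_patch are made up for this illustration; they are not taken from the actual pywikipedia patches.

```python
import difflib


def make_patch(original, modified, name):
    """Return a unified diff (as one string) between two versions of a file."""
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile="a/" + name,
        tofile="b/" + name,
    ))


# Hypothetical upstream file and the small local change it needs.
original = "import os\nUSE_FULL_INSTALL = True\nprint('ready')\n"
modified = "import os\nUSE_FULL_INSTALL = False\nprint('ready')\n"

patch_text = make_patch(original, modified, "example_module.py")
print(patch_text)
```

The printed patch names the file, shows the removed line prefixed with "-" and the added line prefixed with "+", and can be reviewed in any text editor before it is applied.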
If you (and others) now just see an exception raised when starting a script, this is a serious issue, and no, I do not want to let compat die this way! Please help me to improve the situation, since I do not have a Windows machine to test it on. I tried several times (by mails to THIS list) to raise this issue BEFORE merging all the code in. I then activated it for beautifulsoup.py, since after some time I was not getting any feedback (anymore). The feedback regarding beautifulsoup.py and patch.exe was most of the time "what about that damn shit?!" or the like ;) and so it was hard for me to really improve the situation. I WANT to work this out, since I still consider this the optimal (though not perfect) solution to several problems with externals, and I want to make it work. (In fact it is THE ONLY solution I see for including the heterogeneous code of catimages.py, including C/C++ code.)
Aaah yes, another issue is how svn handles externals (single files in a tracked dir are not possible) and also how git handles submodules. I also conclude that we might need our own system to satisfy all needs here; at least for svn that was the case.
I think WE ALL want to improve the code we have. I need a way to include ANY kind of external code in a SIMPLE and TRACKABLE way. This includes HUGE libraries needing SMALL changes to their code. Please help me to solve this! Thanks!
Have a nice day and weekend! All the best! DrTrigon
Best xqt
----- Original Message ----- From: "Dr. Trigon" dr.trigon@surfeu.ch To: pywikipedia-l@lists.wikimedia.org Date: 01.08.2013 13:29 Subject: Re: [Pywikipedia-l] Question about externals and patches
I don't like any installation process, including in the rewrite branch/core, and I am using pwb.py to have a directory-based environment. Maybe an installation is necessary for some reasons, but it should be very rare! There is no reason for patching beautifulsoup, because we could either use the external library as it is (as something like an external) or include it in the framework as it was previously. This is a central library for the compat branch and you cannot run compat without it. If some users do not want to run any .exe file, the missing library stops the whole framework from running. Other patches only affect a few scripts, or no script at all.
I don't like installation either!! This is not about installing the framework itself, but the dependencies it needs! That's a big difference! Nobody forces you to use externals/__init__.py; you CAN do all of this manually. But I for my part am very unhappy if I have to install several packages, search for them on the web, find the correct version, try to compile them and all that stuff, just to TEST whether this pywikipedia framework could be useful to me. It is a good thing if it is able to do this by itself. Still, if you do it yourself in advance, there is no need for externals to jump in. That's supposed to be more beginner-friendly (it is a one-time help to start using the framework).
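The idea that externals only needs to jump in when a dependency is not already installed can be sketched as follows. This is only an illustration and not the actual code of externals/__init__.py; the helper missing_modules and the module name hypothetical_missing_dependency_xyz are invented for the example.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is a made-up placeholder
# standing in for a dependency the user has not installed yet.
needed = ["json", "hypothetical_missing_dependency_xyz"]
missing = missing_modules(needed)
print("still to be set up:", missing)
```

Anything the user installed manually in advance (or got from the distro repo) is found by the import system and left alone; only the remaining names would need any download or setup step.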
It is a bad idea to include code like beautifulsoup.py from another source. It creates confusion (who originally wrote the code?), is problematic with licenses, and causes issues when merging new versions. (Cosmetic changes were applied to our version, and so it was no longer clear how to merge it with the one upstream.) So we should NOT include other code in our repos! AND on Linux systems beautifulsoup is already included in the distro repo, so it makes no sense to include it with pywikipedia. (Similarly for simplejson, setuptools, httplib2, ...)
And please check whether a patch is really necessary for the other parts too, and keep the code simple. I guess there could be other ways to achieve the same effect without running any .exe file, which some people don't like. E.g. for music21, pydmtx, zbar I don't understand why a patch is required (or which scripts need them at all).
In fact we have already discussed all of this in the past [1] (May 2013 till now; search for 'externals' and 'patch'), where I explained all that in detail. (I kept it as simple as possible; I already explained several options; the patches are reduced to a minimum; forking those packages just creates additional workload; some of those packages are huge; ...)
[1] http://lists.wikimedia.org/pipermail/pywikipedia-l/
I have already mentioned several times: if you have a better solution (I have been searching for one for months now!) then PLEASE tell me! I do not know any other solution to include all my code and scripts here! If you want me to remove them again, fine, I will do this, but it would have been nice to tell me this before I invested weeks of coding in that stuff! And as you should have seen, e.g. in the migration to git submodules, externals/__init__.py was in fact an easy, simple and fast way to work around that stuff. Although we migrated to git, it is still as bad as svn: regarding externals or submodules, neither svn nor git has a good solution! But PLEASE tell me a better one if you know one!!
One last way out that I can think of is to make it non-automatic: we could do it together with the creation of user-config.py, as one explicit step the user has to perform manually at the beginning (instead of it jumping in automatically). What about that?
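Such an explicit opt-in could look roughly like the sketch below. It assumes a hypothetical setting named allow_externals that would be stored during the creation of user-config.py; neither that key nor the function externals_allowed exists in the framework, they are placeholders for the idea.

```python
def externals_allowed(config, ask=input):
    """Decide whether external dependencies may be downloaded and set up.

    An explicit setting (hypothetical key 'allow_externals') always wins.
    Only if it is absent is the user asked, and the default answer is no.
    """
    setting = config.get("allow_externals")
    if setting is not None:
        return bool(setting)
    answer = ask("Download and set up external dependencies? [y/N] ")
    return answer.strip().lower() in ("y", "yes")


# A stored "no" is respected without prompting again.
print(externals_allowed({"allow_externals": False}, ask=lambda prompt: "y"))
# Without a stored setting the answer to the prompt decides.
print(externals_allowed({}, ask=lambda prompt: "yes"))
```

With this shape nothing is downloaded or executed unless the user said yes once, either in the config step or at the prompt, and a denied permission does not have to stop scripts that do not need the externals.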
Greetings and all the best! DrTrigon
_______________________________________________ Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l