Hi
Do we have a way to know how many articles we have in a specific namespace in a ZIM file... but without the redirects?
Regards Emmanuel
Hi Emmanuel,
On 01/22/2012 04:09 PM, Emmanuel Engelhart wrote:
Do we have a way to know how many articles we have in a specific namespace in a ZIM file... but without the redirects?
does this help you?
http://openzim.org/Special:Statistics http://openzim.org/Special:AllPages
/Manuel
On 22/01/2012 16:17, Manuel Schneider wrote:
Hi Emmanuel,
On 01/22/2012 04:09 PM, Emmanuel Engelhart wrote:
Do we have a way to know how many articles we have in a specific namespace in a ZIM file... but without the redirects?
does this help you?
http://openzim.org/Special:Statistics http://openzim.org/Special:AllPages
Unfortunately not ;( I meant "in a ZIM file".
Emmanuel
Hi,
I think it's currently not possible to get the number of articles without redirects (except by counting all articles which are not redirects, but this would be pretty slow). However, I agree that it would be a useful feature, so we should consider adding it (perhaps as metadata?).
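For illustration, the slow fallback mentioned above (walking every entry and skipping redirects) could look roughly like this. It is only a sketch: the `Entry` record and `count_non_redirects` are hypothetical names invented for this example and are not part of the zimlib API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Entry:
    """Hypothetical stand-in for a ZIM directory entry (not a zimlib type)."""
    namespace: str     # single-character namespace, e.g. "A" for articles
    url: str
    is_redirect: bool


def count_non_redirects(entries: Iterable[Entry], namespace: str) -> int:
    """Linear scan over all entries: cost grows with the size of the file."""
    return sum(
        1 for e in entries
        if e.namespace == namespace and not e.is_redirect
    )


entries = [
    Entry("A", "Paris", False),
    Entry("A", "Paname", True),    # redirect, must not be counted
    Entry("A", "Lyon", False),
    Entry("I", "paris.jpg", False),
]
print(count_non_redirects(entries, "A"))  # prints 2
```

Every entry has to be visited once, which is why this gets slow on large files and why a precomputed value would help.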
Best regards, Christian
dev-l mailing list dev-l@openzim.org https://intern.openzim.org/mailman/listinfo/dev-l
I also think it would be valuable to store this information somewhere in the ZIM files. We would need to save many new metadata entries, one for each type of file (picture, video, audio, text)... and why not other types (presentation, ...). IMO the best we can do is save how many entries we have per mime-type, so that we are sure to always have the information we need on the format side (with the finest granularity). The reader or the zimlib should do the computation and provide the necessary code to offer something like getAudioArticleCount().
What do you think?
Emmanuel
A new metadata entry is certainly the easiest solution, but I'm not sure it is the best. This could potentially add a lot of new metadata
Hi Emmanuel,
I am not sure whether I fully understand your proposal: is your idea to save the information at mime-type level instead of at namespace level? (Redirects have a special mime-type, so, as desired, they would not be included in the numbers.) Is there a potential issue with a mime-type-based count, in that non-article entries may have the same mime-type as article entries? (e.g. can image text be html?) If this is not a real issue, storing the count per mime-type is fine for me, but it would also be fine if the count is stored at namespace level (that is, the entries in one namespace which are not redirects).
The benefit of storing on metadata-level is that articles which are not text or image can be handled; the disadvantage is that it is more complex for the application. (Therefore I'd prefer it to be implemented in zimlib.)
Where do you want to store the mime-type count information? As metadata or something else?
Best regards, Christian
On 23/01/2012 19:44, Christian Pühringer wrote:
I am not sure whether I fully understand your proposal: Is your idea to save the information on mime-type level instead of on namespace level?
Yes. Because we do not have namespaces for every type of content, we cannot offer this guarantee at namespace level, and apps may need this information at mime-type level rather than at namespace level.
(Redirects have a special mime-type, therefore - as desired - they would not be included in the numbers). Is there a potential issue for mime-type based, that non-article entries may have the same mime-type as article entries? (e.g. can image text be html?)
Anything can happen if the ZIM editor/software is not well coded. With my script, I decide myself which mime-type each piece of content (article, image, ...) gets.
I do not see any issue. The only point is that if you want the number of all image entries (for example), some code somewhere has to know that image/jpeg, image/gif and image/png are all image mime-types, and sum their counts... and so on. I think this could be done in the zimlib.
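As a rough sketch of that summing step, assuming the per-mime-type counts are already available as a mapping: `count_by_family` is a hypothetical helper made up for this example, not an existing zimlib function, and it simply treats everything before the "/" as the family.

```python
from typing import Mapping


def count_by_family(counts: Mapping[str, int], family: str) -> int:
    """Sum the counts of every mime-type in one family, e.g. "image"."""
    prefix = family + "/"
    return sum(n for mime, n in counts.items() if mime.startswith(prefix))


# Example per-mime-type counts; the text/html figure is invented.
counts = {"image/jpeg": 5, "image/gif": 3, "image/png": 2, "text/html": 10}
print(count_by_family(counts, "image"))  # prints 10
```

A getAudioArticleCount()-style accessor could then be a thin wrapper around `count_by_family(counts, "audio")`.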
If this is not a real issue, storing mime-type fine for me, but it would be also fine if count is stored on namespace level (that is, entries in one namespace which are not redirects).
Benefit of storing on metadata-level is that articles which are not text or image can be handled, disadvantage is that it is more complex for the application. (Therefore I'd prefer if it is implemented in zimlib)
Yes.
Where do you want to store the mime-type count information? As metadata or something else?
I would propose a new Metadata entry called for example "Counter" http://openzim.org/Metadata
The value would be a string looking like that: image/jpeg=5;image/gif=3;image/png=2...
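A minimal sketch of how a reader might parse such a value, assuming the separators are exactly ";" and "=" and that mime-types carry no parameters (so they contain neither character). `parse_counter` is a hypothetical name used only for this example.

```python
from typing import Dict


def parse_counter(value: str) -> Dict[str, int]:
    """Parse "mime=count;mime=count" into a dict of integer counts."""
    counts: Dict[str, int] = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ";" or an empty value
        mime, sep, number = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in counter item: {item!r}")
        counts[mime.strip()] = int(number)
    return counts


print(parse_counter("image/jpeg=5;image/gif=3;image/png=2"))
# prints {'image/jpeg': 5, 'image/gif': 3, 'image/png': 2}
```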
Emmanuel
Hi,
On 24.01.2012 18:42, Emmanuel Engelhart wrote:
I would propose a new Metadata entry called for example "Counter" http://openzim.org/Metadata
The value would be a string looking like that: image/jpeg=5;image/gif=3;image/png=2...
Sounds good, I'd support this proposal.
Christian
On 25/01/2012 22:49, Christian Pühringer wrote:
On 24.01.2012 18:42, Emmanuel Engelhart wrote:
I would propose a new Metadata entry called for example "Counter" http://openzim.org/Metadata
The value would be a string looking like that: image/jpeg=5;image/gif=3;image/png=2...
Sounds good, I'd support this proposal.
Nobody seems to be against, so I have added this to the format in the wiki: https://openzim.org/index.php?title=Metadata&action=historysubmit&di...
Emmanuel
On 01/28/2012 01:57 PM, Emmanuel Engelhart wrote:
Nobody seems to be against, so I have added this to the format in the wiki: https://openzim.org/index.php?title=Metadata&action=historysubmit&di...
great thanks!
/Manuel
On 28/01/2012 13:57, Emmanuel Engelhart wrote:
On 25/01/2012 22:49, Christian Pühringer wrote:
On 24.01.2012 18:42, Emmanuel Engelhart wrote:
I would propose a new Metadata entry called for example "Counter" http://openzim.org/Metadata
The value would be a string looking like that: image/jpeg=5;image/gif=3;image/png=2...
Sounds good, I'd support this proposal.
Nobody seems to be against, so I have added this to the format in the wiki: https://openzim.org/index.php?title=Metadata&action=historysubmit&di...
This feature is now fully implemented on the Kiwix side, in both the ZIM build script and the software. So new ZIM files I build will from now on include the new "Counter" metadata.
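For anyone writing their own build script, the writer side could be sketched as below. This is not the actual Kiwix code: `build_counter` is a hypothetical helper, the caller is assumed to pass only the mime-types of non-redirect entries, and sorting the items is just a choice made here to get a stable output.

```python
from collections import Counter
from typing import Iterable


def build_counter(mime_types: Iterable[str]) -> str:
    """Tally mime-types and serialise them as "mime=count;mime=count"."""
    tally = Counter(mime_types)
    # Sorted by mime-type so the same content always yields the same string.
    return ";".join(f"{mime}={n}" for mime, n in sorted(tally.items()))


# One mime-type per non-redirect entry added to the file (invented sample).
added = ["text/html", "image/png", "text/html", "image/jpeg"]
print(build_counter(added))  # prints image/jpeg=1;image/png=1;text/html=2
```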
Emmanuel