Re: [Wikitech-l] Parallel computing project

26 Oct 2010

Ariel T. Glenn wrote:
...
  If one were clever (and I have some code that would enable one to be
 clever), one could seek to some point in the (bzip2-compressed) file and
 uncompress from there before processing.  Running a bunch of jobs each
 decompressing only their small piece then becomes feasible.  I don't
 have code that does this for gz or 7z; afaik these do not do compression
 in discrete blocks.

 Ariel 

The bzip2recover approach?
I am not sure how much would be gained after moving so many bits around.
Also, I was unable to continue from a flushed point, so it may not be so easy.
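
For reference, a minimal sketch of the scanning step that a bzip2recover-style approach relies on: walk the file bit by bit and look for the 48-bit marker that opens each bzip2 block. The function name find_block_starts and the sample data are made up for illustration, and this is not Ariel's code. Blocks are not byte-aligned, so to decompress one, its bits would still have to be shifted into a fresh stream with its own header. The marker can also turn up by chance inside compressed data, so hits should be treated as candidates.

```python
import bz2

BLOCK_MAGIC = 0x314159265359  # 48-bit marker that opens every bzip2 block
MASK48 = (1 << 48) - 1


def find_block_starts(data):
    """Return the bit offsets at which the block marker appears in data."""
    offsets = []
    window = 0  # sliding window holding the last 48 bits read
    for i, byte in enumerate(data):
        for shift in range(7, -1, -1):
            window = ((window << 1) | ((byte >> shift) & 1)) & MASK48
            bits_read = i * 8 + (8 - shift)
            if bits_read >= 48 and window == BLOCK_MAGIC:
                offsets.append(bits_read - 48)
    return offsets


# Hypothetical sample: a small input compresses to a single block.
sample = bz2.compress(b"<page>example</page>\n" * 1000, 9)
starts = find_block_starts(sample)
print(starts[0])  # the first block follows the 4-byte "BZh9" stream header
```
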
OTOH, if you already have an index and the blocks end at page boundaries
(which is what I was doing), it becomes trivial.
Remember that the reader of the previous block MUST continue into the
next block, up to the point where the next reader started processing.
And contrary to what ttsiod said, you do encounter tags split between
blocks in a normal compression.
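
A rough sketch of the indexed case, under the assumption that each group of pages is compressed as its own bzip2 stream and the byte offset of every stream is recorded. The names write_indexed, read_stream and pages_per_stream are hypothetical, not taken from any existing dump tool. Because every stream ends at a page boundary, a reader never sees a split tag and never needs to read past its own piece.

```python
import bz2


def write_indexed(pages, pages_per_stream=2):
    """Compress pages into concatenated bzip2 streams; return (blob, index)."""
    blob = bytearray()
    index = []  # byte offset at which each stream starts
    for i in range(0, len(pages), pages_per_stream):
        index.append(len(blob))
        chunk = "".join(pages[i:i + pages_per_stream]).encode("utf-8")
        blob += bz2.compress(chunk)
    return bytes(blob), index


def read_stream(blob, index, n):
    """Decompress only stream n, starting at its indexed byte offset."""
    start = index[n]
    end = index[n + 1] if n + 1 < len(index) else len(blob)
    return bz2.BZ2Decompressor().decompress(blob[start:end]).decode("utf-8")


# Hypothetical pages standing in for entries of an XML dump.
pages = [f"<page><title>P{i}</title></page>\n" for i in range(5)]
blob, index = write_indexed(pages)
print(read_stream(blob, index, 1))  # only the second stream is decompressed
```

The concatenated result is still a valid .bz2 file for ordinary tools, since bzip2 accepts several streams back to back; the index only adds the ability to start in the middle.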
