I downloaded:
http://dumps.wikimedia.your.org/other/static_html_dumps/2008-06/en/wikipedi…
using wget and it seems to be fine:
$ _IFL="wikipedia-en-html.tar.7z"
$ ls -l "${_IFL}"
-rw-r--r-- 1 niggahme niggahme 15363543213 Jun 21 2008 wikipedia-en-html.tar.7z
$ file "${_IFL}"
wikipedia-en-html.tar.7z: 7-zip archive data, version 0.2
$ md5sum -b "${_IFL}"
03ce695cbf32a3f8636fa8d3f9c7d12e *wikipedia-en-html.tar.7z
$ sha256sum -b "${_IFL}"
c2794b6371a05017f03e2a345730fd763b1052872290b5c78763978a0b43c747 *wikipedia-en-html.tar.7z
$ sha512sum -b "${_IFL}"
d52a737ceca25ef18272ba70a4a56000a7a0bff92653fb462674333a0855f397c892b8aeb2e11206d391ba4cca48d46f5814d92db4d2096467519de38c5a189c *wikipedia-en-html.tar.7z
$ 7z l "${_IFL}"
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Pentium(R) CPU B940 @ 2.00GHz (206A7),ASM)
Scanning the drive for archives:
1 file, 15363543213 bytes (15 GiB)
Listing archive: wikipedia-en-html.tar.7z
--
Path = wikipedia-en-html.tar.7z
Type = 7z
Physical Size = 15363543213
Headers Size = 100
Method = LZMA:22
Solid = -
Blocks = 1
   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2008-06-18 13:02:15 ..... 223674511360  15363543113  wikipedia-en-html.tar
------------------- ----- ------------ ------------  ------------------------
2008-06-18 13:02:15       223674511360  15363543113  1 files
$
But I can't get the name of the compressed (contained) file from Java,
even though both ark and 7z show it. Here is my simple piece of code:
import java.io.File;
import java.io.IOException;
import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZFile;

String aIFl = "wikipedia-en-html.tar.7z";
File I7ZKFl = new File(aIFl);
if (I7ZKFl.exists()) {
    // try-with-resources closes the archive even if reading fails
    try (SevenZFile SvnZFl = new SevenZFile(I7ZKFl)) {
        SevenZArchiveEntry entry;
        int iIx = 0;
        // walk every entry in the archive and print its metadata
        while ((entry = SvnZFl.getNextEntry()) != null) {
            System.out.println("// __ [" + iIx + "]: |" + entry + "|");
            System.out.println("// __ .getName() |" + entry.getName() + "|");
            System.out.println("// __ .getSize() |" + entry.getSize() + "|");
            System.out.println("// __ .getLastModifiedDate() |" + entry.getLastModifiedDate() + "|");
            ++iIx;
        }
    } catch (IOException IOX) { IOX.printStackTrace(System.err); }
}
Its output, which is what I expected except for the name, was:
// __ [0]: |org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry@179d3b25|
// __ .getName() |null|
// __ .getSize() |223674511360|
// __ .getLastModifiedDate() |Wed Jun 18 14:02:15 EDT 2008|
Why is it that I can't get the file name?
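My working guess, which I have not confirmed, is that the archive simply stores no name for its single entry, and that 7z and ark display a name derived from the archive's own file name. Here is a minimal sketch of that fallback, assuming the heuristic is "drop the last extension of the archive name". The class DefaultEntryName and the method defaultName are my own hypothetical names, not part of Commons Compress. If I read the Javadoc correctly, newer Commons Compress releases offer SevenZFile.getDefaultName() for the same purpose.

```java
import java.io.File;

public class DefaultEntryName {

    // Hypothetical helper: derive a display name for an unnamed entry
    // from the file name of the archive that contains it.
    static String defaultName(String archivePath) {
        // Keep only the last path segment, e.g. "wikipedia-en-html.tar.7z".
        String last = new File(archivePath).getName();
        int dot = last.lastIndexOf('.');
        // A dot at index 0 marks a hidden file, not an extension.
        if (dot > 0) {
            return last.substring(0, dot);
        }
        // No extension to drop: append "~" so the name differs from the archive's.
        return last + "~";
    }

    public static void main(String[] args) {
        // prints "wikipedia-en-html.tar"
        System.out.println(defaultName("wikipedia-en-html.tar.7z"));
    }
}
```

With this, a null from entry.getName() could be replaced by defaultName(aIFl) before printing, which would match the name that 7z lists.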
Also, if OO works, I should be able to access and process the contained
file by addressing it with an exclamation mark, like this:
wikipedia-en-html.tar.7z!wikipedia-en-html.tar
So, at this point I should be able to write:
String aIFl = "wikipedia-en-html.tar.7z!wikipedia-en-html.tar";
FileInputStream FISTarK = new FileInputStream(new File(aIFl));
TarArchiveInputStream tarInput = new TarArchiveInputStream(FISTarK);
TarArchiveEntry tArKEnt;
while((tArKEnt=tarInput.getNextTarEntry()) != null){
...
}
right?
lbrtchx