Thread: converting .zip to .tar.gz or to .tar.bz2 (Linux General forums, part of the Linux Forums category)
I demand that Keith Keller may or may not have written...
[snip]
> If you want to extract only certain files from a tar file, it's easy enough
> to do. You still have a possibly long read of the file, but you can
> certainly avoid the long write and disk use if you know what files you're
> looking for. And I don't see how zip gets around that issue, so I don't
> see how zip is any more or less useful than tar/gzip.

The fact that zip maintains an index at the end of the archive is useful in
avoiding the long read. Well, unless the archive is on tape...

-- 
| Darren Salt    | linux or ds at              | nr. Ashington, | Toon
| RISC OS, Linux | youmustbejoking,demon,co,uk | Northumberland | Army
| Kill all extremists!
Do not clog intellect's sluices with bits of knowledge of questionable uses.
Keith Keller wrote:
> On 2007-08-08, CBFalconer <cbfalconer@yahoo.com> wrote:
>> Bill Marcum wrote:
>>>
>>> If the zip archive contains more than one file, you must extract
>>> it to a directory and then tar the directory. If the archive
>>> contains only one file, you can extract it to standard output and
>>> pipe it to gzip or bzip2 without tar.
>>
>> Which illustrates nicely why I consider zip to be more useful than
>> tarring and gzipping. Zip keeps multiple files easily accessible,
>> without going through a (possibly) long extraction and write, and
>> consumption of untold disk space.
>
> If you want to extract only certain files from a tar file, it's easy
> enough to do. You still have a possibly long read of the file, but you
> can certainly avoid the long write and disk use if you know what files
> you're looking for. And I don't see how zip gets around that issue,
> so I don't see how zip is any more or less useful than tar/gzip.

For a tar file, yes. For a gzipped tar file (the OP talked about
tar.gz or tar.bz2) any access first requires decompressing the
entire large file, and then searching the result. Just because you
have adequate disk space doesn't mean it always exists.

-- 
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
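For anyone who wants to script the conversion Bill Marcum describes in the quote above, here is a minimal Python sketch. It is an alternative, not what anyone in the thread actually ran: rather than unpacking to a directory and re-tarring, it copies each zip member straight into a gzip-compressed tar with the standard zipfile and tarfile modules. The function name zip_to_tar_gz and the fixed 0644/0755 permissions are hypothetical choices for illustration; zip permission bits, symlinks and other metadata are not carried over.

```python
import tarfile
import time
import zipfile


def zip_to_tar_gz(zip_path, tar_path, mode="w:gz"):
    """Copy every member of zip_path into a new compressed tar at tar_path."""
    with zipfile.ZipFile(zip_path) as zf, tarfile.open(tar_path, mode) as tf:
        for info in zf.infolist():
            member = tarfile.TarInfo(name=info.filename.rstrip("/"))
            # Zip stores a local-time tuple; pad it to a 9-tuple for mktime.
            member.mtime = time.mktime(info.date_time + (0, 0, -1))
            if info.is_dir():
                member.type = tarfile.DIRTYPE
                member.mode = 0o755  # assumed permission, not read from the zip
                tf.addfile(member)
            else:
                member.size = info.file_size
                member.mode = 0o644  # assumed permission, not read from the zip
                # Stream the decompressed bytes directly into the tar.
                with zf.open(info) as src:
                    tf.addfile(member, src)
```

Calling it with mode="w:bz2" should produce a .tar.bz2 instead of a .tar.gz.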
On 2007-08-09, CBFalconer <cbfalconer@yahoo.com> wrote:
> Keith Keller wrote:
>>
>> If you want to extract only certain files from a tar file, it's easy
>> enough to do. You still have a possibly long read of the file, but you
>> can certainly avoid the long write and disk use if you know what files
>> you're looking for. And I don't see how zip gets around that issue,
>> so I don't see how zip is any more or less useful than tar/gzip.
>
> For a tar file, yes. For a gzipped tar file (the OP talked about
> tar.gz or tar.bz2) any access first requires decompressing the
> entire large file, and then searching the result. Just because you
> have adequate disk space doesn't mean it always exists.

For a gzip'd tar file, any access requires on-the-fly decompression,
but not necessarily to disk. It's the moral equivalent of piping
gzip -dc (which uses no disk, except perhaps to /tmp) to tar (which
can use only the disk you tell it to).

For example:

    tar xzf foo.tar.gz foo/bar/baz.c foo/baz/bar.c

extracts at most two files from foo.tar.gz. If foo.tar.gz is large, the
tar will take a long time, but no matter how large it won't eat any more
space than baz.c and bar.c take.

If for some reason your tar doesn't support gzip compression (or bzip2),
you can always do

    gzip -dc foo.tar.gz | tar xf - foo/bar/baz.c foo/baz/bar.c

which is the same as above except with two processes instead of one.

(And one of my pet peeves is to find my users with foo.tar in their
directory after downloading foo.tar.gz. Grr!)

--keith

-- 
kkeller-usenet@wombat.san-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://www.therockgarden.ca/aolsfaq.txt
see X- headers for PGP signature information
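The same idea can be sketched in Python, assuming the standard tarfile module. The "r|gz" mode reads the archive as a forward-only stream, much like the gzip -dc pipe above: the whole archive is decompressed in memory as it is read, but only the requested members are written to disk. The function name extract_some and its parameters are hypothetical, not anything from the thread.

```python
import tarfile


def extract_some(tar_gz_path, wanted, dest="."):
    """Extract only the members named in `wanted`; return the names found."""
    wanted = set(wanted)
    found = []
    # "r|gz" decompresses on the fly and never seeks backwards.
    with tarfile.open(tar_gz_path, "r|gz") as tf:
        for member in tf:
            # Every member header is visited, but only matches hit the disk.
            if member.name in wanted:
                tf.extract(member, path=dest)
                found.append(member.name)
    return found
```

Something like extract_some("foo.tar.gz", ["foo/bar/baz.c", "foo/baz/bar.c"]) would then correspond to the tar command line above. The loop deliberately reads to the end of the archive, as tar does by default, since a later member may carry the same name.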
>>>>> "Keith" == Keith Keller <kkeller-usenet@wombat.san-francisco.ca.us> writes:
    Keith> For a gzip'd tar file, any access requires on-the-fly
    Keith> decompression, but not necessarily to disk. It's the moral
    Keith> equivalent of piping gzip -dc (which uses no disk, except
    Keith> perhaps to /tmp) to tar (which can use only the disk you
    Keith> tell it to).

    Keith> For example:

    Keith> tar xzf foo.tar.gz foo/bar/baz.c foo/baz/bar.c

    Keith> extracts at most two files from foo.tar.gz. If foo.tar.gz
    Keith> is large, the tar will take a long time, but no matter how
    Keith> large it won't eat any more space than baz.c and bar.c
    Keith> take.

Yes. That's why, to extract a tiny file from a huge .tar.gz file, you
still need to scan through the whole .tar.gz file (unless you're lucky
enough to have that tiny file occurring at the very beginning of the
archive).

For .zip files, there is an index at the end of the file. So, to
extract a tiny file from a huge .zip file, the unzip program just
needs to read the index to get the offset of that compressed file in
the .zip, and then seek directly to that offset to extract the file.

Indeed, the index also contains metadata about the files. So, an
"unzip -v" is very quick because it only needs to scan through the
index, which is relatively small. OTOH, "tar ztvf" or "tar jtvf" needs
to scan through the whole archive to get the file metadata to display
it. So, it's very slow.

(I think this is why Sun decided to base .jar files on the .zip
format. They don't want to scan through the whole .jar file just to
load a certain .class file. So, .zip is a good choice.)

-- 
Lee Sau Dan
E-mail: danlee@informatik.uni-freiburg.de
Home page: http://www.informatik.uni-freiburg.de/~danlee
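To make the "index at the end" concrete, here is a rough Python sketch that reads the 22-byte end-of-central-directory record from the tail of a zip file and reports where the index lives. It assumes a plain single-disk archive and ignores the ZIP64 extensions; the helper names read_eocd and read_one are made up for this example. The second helper leans on the standard zipfile module, which reads that same index and then seeks to the one member requested.

```python
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"   # end-of-central-directory signature
EOCD_FMT = "<4s4H2LH"      # fixed part of the record, 22 bytes


def read_eocd(path):
    """Return (entry_count, index_size, index_offset) from the file's tail."""
    with open(path, "rb") as f:
        f.seek(0, 2)
        size = f.tell()
        # The record is 22 bytes plus at most 65535 bytes of comment.
        tail_len = min(size, 22 + 65535)
        f.seek(size - tail_len)
        tail = f.read(tail_len)
    pos = tail.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    _, _, _, _, total, cd_size, cd_offset, _ = struct.unpack(
        EOCD_FMT, tail[pos:pos + 22])
    return total, cd_size, cd_offset


def read_one(path, name):
    """Read a single member; zipfile consults the index, then seeks to it."""
    with zipfile.ZipFile(path) as zf:
        return zf.read(name)
```

Only the last few kilobytes of the file are touched by read_eocd, however large the archive is, which is the property Lee describes. A tar.gz has no equivalent record to jump to.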