converting .zip to .tar.gz or to .tar.bz2

#11, 08-08-2007
Darren Salt
Re: converting .zip to .tar.gz or to .tar.bz2

I demand that Keith Keller may or may not have written...

[snip]
> If you want to extract only certain files from a tar file, it's easy enough
> to do. You still have a possibly long read of the file, but you can
> certainly avoid the long write and disk use if you know what files you're
> looking for. And I don't see how zip gets around that issue, so I don't
> see how zip is any more or less useful than tar/gzip.


The fact that zip maintains an index at the end of the archive is useful in
avoiding the long read.

Well, unless the archive is on tape...

--
| Darren Salt | linux or ds at | nr. Ashington, | Toon
| RISC OS, Linux | youmustbejoking,demon,co,uk | Northumberland | Army
| Kill all extremists!

Do not clog intellect's sluices with bits of knowledge of questionable uses.
#12, 08-09-2007
CBFalconer
Re: converting .zip to .tar.gz or to .tar.bz2

Keith Keller wrote:
> On 2007-08-08, CBFalconer <cbfalconer@yahoo.com> wrote:
>> Bill Marcum wrote:
>>>
>>> If the zip archive contains more than one file, you must extract
>>> it to a directory and then tar the directory. If the archive
>>> contains only one file, you can extract it to standard output and
>>> pipe it to gzip or bzip2 without tar.

>>
>> Which illustrates nicely why I consider zip to be more useful than
>> tarring and gzipping. Zip keeps multiple files easily accessible,
>> without going through a (possibly) long extraction and write, and
>> consumption of untold disk space.

>
> If you want to extract only certain files from a tar file, it's easy
> enough to do. You still have a possibly long read of the file, but you
> can certainly avoid the long write and disk use if you know what files
> you're looking for. And I don't see how zip gets around that issue,
> so I don't see how zip is any more or less useful than tar/gzip.


For a tar file, yes. For a gzipped tar file (the OP talked about
tar.gz or tar.bz2) any access first requires decompressing the
entire large file, and then searching the result. Just because you
have adequate disk space doesn't mean it always exists.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>



#13, 08-09-2007
Keith Keller
Re: converting .zip to .tar.gz or to .tar.bz2

On 2007-08-09, CBFalconer <cbfalconer@yahoo.com> wrote:
> Keith Keller wrote:
>>
>> If you want to extract only certain files from a tar file, it's easy
>> enough to do. You still have a possibly long read of the file, but you
>> can certainly avoid the long write and disk use if you know what files
>> you're looking for. And I don't see how zip gets around that issue,
>> so I don't see how zip is any more or less useful than tar/gzip.

>
> For a tar file, yes. For a gzipped tar file (the OP talked about
> tar.gz or tar.bz2) any access first requires decompressing the
> entire large file, and then searching the result. Just because you
> have adequate disk space doesn't mean it always exists.


For a gzip'd tar file, any access requires on-the-fly decompression, but
not necessarily to disk. It's the moral equivalent of piping gzip -dc
(which uses no disk, except perhaps to /tmp) to tar (which can use only
the disk you tell it to).

For example:

tar xzf foo.tar.gz foo/bar/baz.c foo/baz/bar.c

extracts at most two files from foo.tar.gz. If foo.tar.gz is large, the
tar will take a long time, but no matter how large it won't eat any more
space than baz.c and bar.c take.

If for some reason your tar doesn't support gzip compression (or bzip2),
you can always do

gzip -dc foo.tar.gz | tar xf - foo/bar/baz.c foo/baz/bar.c

which is the same as above except with two processes instead of one.
(And one of my pet peeves is to find my users with foo.tar in their
directory after downloading foo.tar.gz. Grr!)
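
To try the selective extraction without a real archive at hand, here is a minimal sketch. The scratch directory and the file contents are invented for the demo; only the member names come from the example above, and it assumes nothing beyond tar, gzip, find, and mktemp.

```shell
#!/bin/sh
# Sketch: pull one member out of a .tar.gz without unpacking the rest.
# Member names mirror the example above; contents are made up.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Build a small archive with two members, then remove the originals.
mkdir -p foo/bar foo/baz
printf 'int baz;\n' > foo/bar/baz.c
printf 'int bar;\n' > foo/baz/bar.c
tar cf - foo | gzip > foo.tar.gz
rm -rf foo

# Decompress on the fly and extract only the named member.
gzip -dc foo.tar.gz | tar xf - foo/bar/baz.c

# Only the requested file is on disk; no intermediate foo.tar was written.
extracted=$(find foo -type f)
echo "$extracted"

# To read a member without writing it at all, send it to stdout
# (-O is accepted by GNU tar and bsdtar).
peek=$(gzip -dc foo.tar.gz | tar xOf - foo/baz/bar.c)
echo "$peek"
```

The whole compressed stream is still read in both cases; what is saved is the write and the disk space.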


--keith


--
kkeller-usenet@wombat.san-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://www.therockgarden.ca/aolsfaq.txt
see X- headers for PGP signature information

#14, 08-11-2007
LEE Sau Dan
Re: converting .zip to .tar.gz or to .tar.bz2

>>>>> "Keith" == Keith Keller <kkeller-usenet@wombat.san-francisco.ca.us> writes:

Keith> For a gzip'd tar file, any access requires on-the-fly
Keith> decompression, but not necessarily to disk. It's the moral
Keith> equivalent of piping gzip -dc (which uses no disk, except
Keith> perhaps to /tmp) to tar (which can use only the disk you
Keith> tell it to).

Keith> For example:

Keith> tar xzf foo.tar.gz foo/bar/baz.c foo/baz/bar.c

Keith> extracts at most two files from foo.tar.gz. If foo.tar.gz
Keith> is large, the tar will take a long time, but no matter how
Keith> large it won't eat any more space than baz.c and bar.c
Keith> take.

Yes. That's why to extract a tiny file from a huge .tar.gz file, you
still need to scan through the whole .tar.gz file (unless you're lucky
enough to have that tiny file occurring at the very beginning of the
archive).
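
A note on the "lucky" case: as far as I know, GNU tar by default keeps reading after it has found the named member, because a later entry with the same name would replace the earlier one. The sketch below assumes GNU tar and its --occurrence option, and falls back to a plain extraction with other tars. The file names are invented for the demo.

```shell
#!/bin/sh
# Sketch: let tar stop at the first match instead of reading to the end.
# Assumes GNU tar for --occurrence; file names are made up.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Put the small file first in the archive, followed by a larger one.
mkdir demo
printf 'tiny\n' > demo/first.txt
dd if=/dev/zero of=demo/padding.bin bs=1024 count=1024 2>/dev/null
tar czf demo.tar.gz demo/first.txt demo/padding.bin
rm -rf demo

if tar --version 2>/dev/null | grep -q 'GNU tar'; then
    # Stop once the first occurrence of each named member has been handled.
    tar --occurrence=1 -xzf demo.tar.gz demo/first.txt
else
    # Other tars: plain extraction, which may read the whole archive.
    tar -xzf demo.tar.gz demo/first.txt
fi

result=$(cat demo/first.txt)
echo "$result"
```

This only helps when the wanted member sits near the front; a member at the end still costs a full read.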

For .zip files, there is an index at the end of the file. So, to
extract a tiny file from a huge .zip file, the unzip program just needs
to read the index to get the offset of that compressed file in the
.zip, and then seek directly to that offset to extract the file.
Indeed, the index also contains meta-data about the files. So, an
"unzip -v" is very quick because it only needs to scan through the
index, which is relatively small. OTOH, "tar ztvf" or "tar jtvf"
needs to scan through the whole archive to get the file meta-data to
display it. So, it's much slower on a large archive.
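
The difference can be seen side by side with a small sketch. It assumes the Info-ZIP zip and unzip programs for the zip half and skips that half if they are not installed; the file names are invented for the demo.

```shell
#!/bin/sh
# Sketch: listing a .tar.gz walks the whole stream, while listing a .zip
# only consults the index at the end. File names are made up.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir demo
printf 'small\n' > demo/tiny.txt
dd if=/dev/zero of=demo/big.bin bs=1024 count=1024 2>/dev/null

# tar.gz: tar must decompress and step over big.bin to reach tiny.txt.
tar czf demo.tar.gz demo/big.bin demo/tiny.txt
tar_listing=$(tar tzf demo.tar.gz)
echo "$tar_listing"

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    zip -q -r demo.zip demo
    # The listing comes from the index alone.
    unzip -l demo.zip
    # One member goes to stdout; unzip seeks to it using the index.
    zip_member=$(unzip -p demo.zip demo/tiny.txt)
else
    zip_member='(zip/unzip not installed)'
fi
echo "$zip_member"
```

On an archive this small both listings are instant; prefixing the two listing commands with "time" on a multi-gigabyte archive is where the gap should show.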

(I think this is why Sun decided to base .jar files on the .zip
format. They don't want to scan through the whole .jar file just to
load a certain .class file. So, .zip is a good choice.)



--
Lee Sau Dan 李守敦 ~{@nJX6X~}

E-mail: danlee@informatik.uni-freiburg.de
Home page: http://www.informatik.uni-freiburg.de/~danlee