Rsync compression problem - sometimes ineffective?

06-12-2008
Bodle, Donald E
 


Running rsync 2.6.9-1.el4.rf on CentOS 4.4 on both the client and the
remote server. Backing up user data from 2 different clients using the
following:

su - $HOSTID -c 'rsync -azr --timeout=600 --log-file=$DEBUGFILE
--log-file-format="%o %f %b %l %i" --stats --delete --bwlimit=$BANDWDT
--rsh="ssh -P ____" $STAGE $TARGET:$TARGETDIR'

Using the ratio of "bytes sent" to "literal data" from the --stats output
as a rough estimate of the effectiveness of compression (I know bytes
sent includes overhead), most days I see reasonable compression. For
example, from our summary (X MBytes compressed = bytes sent; X MBytes
uncompressed = literal data):

rsync $HOSTID transferred 46.20 MBytes compressed (210.45 MBytes
uncompressed)
52 minutes and 6 seconds
45.50 kBps
6,896 files changed out of 81,720 total files (8.44%)

or

rsync $HOSTID transferred 543.53 MBytes compressed (3.66 GBytes
uncompressed)
2 hours, 16 minutes and 38 seconds
89.12 kBps
7,343 files changed out of 79,944 total files (9.19%)

Some days, I see no evidence of compression, such as this:

rsync $HOSTID transferred 52.10 MBytes compressed (50.06 MBytes
uncompressed)
59 minutes and 48 seconds
53.98 kBps
5,350 files changed out of 80,257 total files (6.67%)

or similarly this:

rsync $HOSTID transferred 1007.55 MBytes compressed (1004.59 MBytes
uncompressed)
3 hours, 38 minutes and 47 seconds
92.27 kBps
9,888 files changed out of 79,306 total files (12.47%)
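
The sent-to-literal ratio described above can be worked out directly from the four summaries. A minimal sketch in Python, assuming 1 GByte = 1024 MBytes (the summary script's convention is not stated in the post); the day labels are made up for illustration:

```python
# Bytes sent ("compressed") and literal data ("uncompressed"), both in MBytes,
# copied from the four summaries above. The labels are made up for this sketch.
# Assumption: 1 GByte = 1024 MBytes; the summary script's convention is unknown.
SUMMARIES = {
    "good day 1": (46.20, 210.45),
    "good day 2": (543.53, 3.66 * 1024),
    "bad day 1": (52.10, 50.06),
    "bad day 2": (1007.55, 1004.59),
}


def sent_to_literal_ratio(sent_mb, literal_mb):
    """Return bytes sent divided by literal data; below 1.0 suggests compression helped."""
    return sent_mb / literal_mb


RATIOS = {label: sent_to_literal_ratio(*pair) for label, pair in SUMMARIES.items()}

if __name__ == "__main__":
    for label, ratio in RATIOS.items():
        print(f"{label}: sent/literal = {ratio:.2f}")
```

On the two good days the ratio comes out at roughly 0.15 to 0.22. On the two bad days it is slightly above 1.0, which means bytes sent exceeded the literal data once protocol overhead is counted.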


My initial thought was that the days of no apparent compression were
days when most of the changed files were either small files (as when
gzipping a small ASCII file doubles its size) or already compressed
files. But so far I haven't been able to confirm this. I'm not sure this
logic applies, since rsync compresses data blocks (at least as I
understand it), and those blocks would be fairly consistent in size (I
think). Is this general understanding of rsync's compression correct?
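
The two effects mentioned in this hypothesis (tiny inputs growing, already compressed data not shrinking) are general properties of deflate-style compression and can be reproduced with Python's zlib module. This is only a sketch of that general behavior: rsync's -z is zlib-based, but the sketch does not reproduce how rsync frames or batches its data, and the pseudo-random bytes are an assumed stand-in for already compressed files.

```python
import random
import zlib


def deflate_ratio(data):
    """Return compressed size divided by original size for one zlib pass."""
    return len(zlib.compress(data)) / len(data)


# Repetitive ASCII text: the kind of data that compresses well.
text = b"user=alice action=login status=ok\n" * 2000

# Seeded pseudo-random bytes, a stand-in for already compressed files
# (archives, JPEGs, and so on). This substitution is an assumption of the sketch.
rng = random.Random(0)
incompressible = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

# A very short input, where the fixed zlib header and checksum dominate.
tiny = b"hi"

if __name__ == "__main__":
    samples = [("text", text), ("incompressible", incompressible), ("tiny", tiny)]
    for name, data in samples:
        print(f"{name}: {len(data)} bytes, ratio {deflate_ratio(data):.3f}")
```

A ratio near or above 1.0 for the last two samples matches the "no evidence of compression" days, but it does not by itself show which of the two causes applies to the real data.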

I searched the samba.org list archives first, and then the Internet at
large, using +rsync +compression +problem, but didn't find any similar
posts. Less restrictive searches didn't help either. I also didn't see
anything in the FAQ or in the current issues and debugging areas.

Has anyone seen this sort of behaviour before? Can you offer
suggestions of additional diagnostics to attempt? What additional
information might be useful to support my contention that this is
related to the data being changed on those "uncompressed" days?

Thanks

Donald E. Bodle, Jr.
Sr. Systems Developer
The Reynolds and Reynolds Co.
(937) 485-1954

Are you okay with today, if tomorrow is the end?
- Superchick (So Bright)



--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

06-12-2008
Matt McCutchen
 

On Thu, 2008-06-12 at 13:35 -0400, Bodle, Donald E wrote:
> Using "bytes sent"/"literal data" from statistics as a rough estimation
> (I know there is overhead in the bytes sent) of the effectiveness of
> compression, most days I see reasonable compression


> My initial thought was that days of no apparent compression were when
> the majority of the changed files were small files (like when gzipping a
> small ASCII file doubles its size) or already compressed files. But so
> far I haven't been able to confirm this. I'm not sure this logic
> applies since rsync compresses data blocks (at least as I understand
> it), and those blocks would be fairly consistent in size (I think). Is
> this general understanding of rsync's compression correct?


My guess is that the files are already compressed.

To see the actual size (compressed, if applicable) of the delta rsync
sends for each file, use the %b log escape, e.g.,
--out-format='%b %i %n%L'. Comparing those numbers between a run with
compression and a run without it shows which deltas aren't compressing
as well as you expect. Unfortunately, %b only seems to work on a run
that really updates a destination, so you'll have to use a throwaway
destination (perhaps with --compare-dest pointing at the real one) for
the tests; alternatively, %b ought to work in --only-write-batch mode.
To investigate why a particular delta isn't compressing, you could use
rdiff to write the delta to a file and then look at the data inside.
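
The comparison suggested here could be automated along the following lines. This is a hedged sketch in Python, not part of rsync: it assumes the output of two runs (one with compression, one without) was captured using --out-format='%b %i %n%L', and that each line starts with the byte count followed by the itemized string. The function names, the 0.9 threshold, and the sample lines are all made up for illustration.

```python
def parse_out_format(lines):
    """Parse lines written with --out-format='%b %i %n%L' into {name: bytes}."""
    sizes = {}
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip blank lines, --stats output, and anything else
        sent, _itemized, name = parts
        sizes[name.rstrip("\n")] = int(sent)
    return sizes


def poorly_compressed(plain, compressed, threshold=0.9):
    """Return (name, ratio) pairs where the compressed run sent at least
    threshold times as many bytes as the uncompressed run."""
    result = []
    for name, plain_bytes in plain.items():
        if plain_bytes == 0 or name not in compressed:
            continue
        ratio = compressed[name] / plain_bytes
        if ratio >= threshold:
            result.append((name, ratio))
    return sorted(result, key=lambda item: item[1], reverse=True)


# Made-up sample lines; real runs would read the two captured outputs instead.
PLAIN_RUN = [
    "120000 >f.st.... docs/report.txt",
    "500000 >f.st.... photos/img 001.jpg",
]
Z_RUN = [
    "30000 >f.st.... docs/report.txt",
    "499000 >f.st.... photos/img 001.jpg",
]

if __name__ == "__main__":
    flagged = poorly_compressed(parse_out_format(PLAIN_RUN), parse_out_format(Z_RUN))
    for name, ratio in flagged:
        print(f"{ratio:.2f} {name}")
```

Files that show up in the flagged list are the candidates to inspect further, for example with rdiff as described above.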

Matt
