Rsync compression problem - sometimes ineffective? (Rsync forums, Networking and Network Related category)
I am running rsync 2.6.9-1.el4.rf on CentOS 4.4 on both the client and the remote server, backing up user data from 2 different clients with the following:

    su - $HOSTID -c 'rsync -azr --timeout=600 --log-file=$DEBUGFILE --log-file-format="%o %f %b %l %i" --stats --delete --bwlimit=$BANDWDT --rsh="ssh -P ____" $STAGE $TARGET:$TARGETDIR'

I use "bytes sent"/"literal data" from the statistics as a rough estimate of the effectiveness of compression (I know there is overhead in the bytes sent). Most days I see reasonable compression, such as these entries from our summary (X MBytes compressed = bytes sent; X MBytes uncompressed = literal data):

    rsync $HOSTID transferred 46.20 MBytes compressed (210.45 MBytes uncompressed)
    52 minutes and 6 seconds 45.50 kBps
    6,896 files changed out of 81,720 total files (8.44%)

or

    rsync $HOSTID transferred 543.53 MBytes compressed (3.66 GBytes uncompressed)
    2 hours, 16 minutes and 38 seconds 89.12 kBps
    7,343 files changed out of 79,944 total files (9.19%)

Some days I see no evidence of compression, such as this:

    rsync $HOSTID transferred 52.10 MBytes compressed (50.06 MBytes uncompressed)
    59 minutes and 48 seconds 53.98 kBps
    5,350 files changed out of 80,257 total files (6.67%)

or similarly this:

    rsync $HOSTID transferred 1007.55 MBytes compressed (1004.59 MBytes uncompressed)
    3 hours, 38 minutes and 47 seconds 92.27 kBps
    9,888 files changed out of 79,306 total files (12.47%)

My initial thought was that the days with no apparent compression were days when most of the changed files were small files (as when gzipping a small ASCII file doubles its size) or already compressed files. So far I haven't been able to confirm this. I'm also not sure the logic applies, since rsync compresses data blocks (at least as I understand it), and those blocks would be fairly consistent in size (I think). Is this general understanding of rsync's compression correct?
I searched the samba.org local archives first, and then Internet-wide, using +rsync +compression +problem, but didn't find any similar posts. Less restrictive searches didn't help either. I also didn't see anything in the FAQ or in the current issues and debugging areas.

Has anyone seen this sort of behaviour before? Can you suggest additional diagnostics to attempt? What additional information might be useful to support my contention that this is related to the data being changed on those "uncompressed" days?

Thanks

Donald E. Bodle, Jr.
Sr. Systems Developer
The Reynolds and Reynolds Co.

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
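
The hypothesis in the post above (small or already compressed files explain the days without compression) can be checked in isolation. What follows is only a minimal sketch, assuming that rsync's -z behaves broadly like ordinary zlib deflate; the function name deflate_ratio, the compression level, and the sample data are illustrative and do not come from rsync itself.

```python
import os
import zlib


def deflate_ratio(data, level=6):
    """Return original size / zlib-compressed size (above 1.0 means it shrank)."""
    return len(data) / len(zlib.compress(data, level))


# Repetitive text stands in for typical compressible user data.
text = b"The quick brown fox jumps over the lazy dog.\n" * 2000
# Random bytes stand in for data that is already compressed (zip, jpeg, gz).
random_like = os.urandom(len(text))
# A very small input, where the fixed zlib framing overhead dominates.
tiny = b"hi\n"

for label, data in (("text", text), ("random", random_like), ("tiny", tiny)):
    print(f"{label:>6}: {len(data):>6} bytes, ratio {deflate_ratio(data):.2f}")
```

A ratio at or just below 1.0 for the random and tiny inputs would match the pattern on the "uncompressed" days, where bytes sent slightly exceed literal data (for example 50.06 / 52.10, about 0.96). It does not prove that this is what happened in those runs.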
On Thu, 2008-06-12 at 13:35 -0400, Bodle, Donald E wrote:
> Using "bytes sent"/"literal data" from statistics as a rough estimation
> (I know there is overhead in the bytes sent) of the effectiveness of
> compression, most days I see reasonable compression

> My initial thought was that days of no apparent compression were when
> the majority of the changed files were small files (like when gzipping a
> small ASCII file doubles its size) or already compressed files. But so
> far I haven't been able to confirm this. I'm not sure this logic
> applies since rsync compresses data blocks (at least as I understand
> it), and those blocks would be fairly consistent in size (I think). Is
> this general understanding of rsync's compression correct?

My guess is that the files are already compressed.

To see the actual size (compressed, if applicable) of the delta rsync is sending for each file, use the %b log option, e.g., --out-format='%b %i %n%L'. You can compare those numbers with and without compression to see which deltas aren't compressing as well as you expect.

Unfortunately, %b only seems to work on a run that really updates a destination, so you'll have to use a throwaway destination (perhaps with --compare-dest pointing to the real one) for the tests. %b ought to work in --only-write-batch mode, though.

To investigate why a particular delta isn't compressing, you could use rdiff to write the delta to a file and then look at the data inside.

Matt
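
The comparison suggested in the reply (per-file %b values with and without compression) could be automated along the following lines. This is a sketch under stated assumptions: it assumes two captured outputs, one from a run with -z and one without, both produced with --out-format='%b %i %n%L'. The function names, the 1.2 threshold, and the sample log lines are hypothetical and serve only as illustration.

```python
def parse_out_format(log_text):
    """Map file name -> bytes sent, from lines shaped like '%b %i %n%L'."""
    sizes = {}
    for line in log_text.splitlines():
        parts = line.split(None, 2)  # bytes, itemize string, rest of line
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip anything that is not a per-file line
        sizes[parts[2]] = int(parts[0])
    return sizes


def poorly_compressed(with_z, without_z, threshold=1.2):
    """List files whose uncompressed/compressed byte ratio is below threshold."""
    rows = []
    for name, plain in without_z.items():
        packed = with_z.get(name)
        if not packed or not plain:
            continue  # missing from one run, or nothing was sent
        ratio = plain / packed
        if ratio < threshold:
            rows.append((name, plain, packed, round(ratio, 2)))
    return sorted(rows, key=lambda row: row[3])


# Hypothetical sample output; real lines depend on the rsync version in use.
with_z_log = """\
1200 >f.st.... docs/report.txt
50210 >f.st.... photos/img_0001.jpg
310 >f+++++++ notes/todo.txt
"""
without_z_log = """\
5400 >f.st.... docs/report.txt
50150 >f.st.... photos/img_0001.jpg
900 >f+++++++ notes/todo.txt
"""

for row in poorly_compressed(parse_out_format(with_z_log),
                             parse_out_format(without_z_log)):
    print(row)
```

With the sample data only the JPEG is reported, since its delta is no smaller with compression than without. Files flagged this way are candidates for closer inspection with rdiff, as described in the reply.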