public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] Fast checksumming of whole partitions
@ 2010-06-05  6:39 meino.cramer
  2010-06-05  7:19 ` [gentoo-user] " Nikos Chantziaras
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: meino.cramer @ 2010-06-05  6:39 UTC (permalink / raw
  To: Gentoo


Hi,

 tonight dd is copying the contents of my first
 1TB disk to my second 1TB disk (same model).

 (dd if=/dev/sda of=/dev/sdb bs=4096)

 I want to verify that the copy is identical.

 I tried (or: I am still trying) to checksum 
 the first disk with

 whirlpooldeep /dev/sda

 which seems to work but is DAMN slow (given that
 it has to checksum the whole 1TB).

 Is there any faster and reliable way to checksum
 whole partitions (not on a "per file" basis)?

 Thank you very much in advance for any help!

 Best regards,
 mcc

-- 
Please don't send me any Word- or Powerpoint-Attachments
unless it's absolutely neccessary. - Send simply Text.
See http://www.gnu.org/philosophy/no-word-attachments.html
In a world without fences and walls nobody needs gates and windows.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05  6:39 [gentoo-user] Fast checksumming of whole partitions meino.cramer
@ 2010-06-05  7:19 ` Nikos Chantziaras
  2010-06-07 15:48   ` meino.cramer
  2010-06-05  7:32 ` [gentoo-user] " Andrea Conti
  2010-06-05 17:39 ` [gentoo-user] " 7v5w7go9ub0o
  2 siblings, 1 reply; 25+ messages in thread
From: Nikos Chantziaras @ 2010-06-05  7:19 UTC (permalink / raw
  To: gentoo-user

On 06/05/2010 09:39 AM, meino.cramer@gmx.de wrote:
>
> Hi,
>
>   tonight dd is copying the contents of my first
>   1TB disk to my second 1TB disk (same model).
>
>   (dd if=/dev/sda of=/dev/sdb bs=4096)
>
>   I want to verify that the copy is identical.
>
>   I tried (or: I am still trying) to checksum
>   the first disk with
>
>   whirlpooldeep /dev/sda
>
>   which seems to work but is DAMN slow (given that
>   it has to checksum the whole 1TB).
>
>   Is there any faster and reliable way to checksum
>   whole partitions (not on a "per file" basis)?
>
>   Thank you very much in advance for any help!
>
>   Best regards,
>   mcc

Constructing a checksum means reading every byte off the partition.  So
it can at best be as fast as a copy to /dev/null, never faster (because
the checksum calculation also needs time).

So in order to determine whether it's really slow, compare the time 
needed to dd the whole partition to /dev/null to the time needed for 
checksumming it.  Then post the times here and an expert might then tell 
whether this can be improved at all or not.
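
A minimal sketch of that comparison, assuming bash and GNU coreutils. The helper names time_read and time_hash are made up for illustration, and /dev/sda in the usage comment stands in for whatever device you are checking:

```shell
#!/bin/bash
# Time a raw read versus a checksum over the same device or file.
# time_read and time_hash are hypothetical helper names.
# Run as root when pointing them at a block device.

# Read everything and throw it away: the I/O-only baseline.
time_read() {
    local start end
    start=$(date +%s)
    dd if="$1" of=/dev/null bs=1M 2>/dev/null
    end=$(date +%s)
    echo "read: $((end - start)) s"
}

# Read everything and hash it with the given tool (default: md5sum).
time_hash() {
    local tool=${2:-md5sum} start end
    start=$(date +%s)
    "$tool" "$1" >/dev/null
    end=$(date +%s)
    echo "$tool: $((end - start)) s"
}

# Usage (as root):
#   time_read /dev/sda
#   time_hash /dev/sda md5sum
```

If the two numbers come out close, the disk is the limit and a faster hash will not buy much. On small test files the second run is served from the page cache, so only timings over a large device are meaningful.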




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Fast checksumming of whole partitions
  2010-06-05  6:39 [gentoo-user] Fast checksumming of whole partitions meino.cramer
  2010-06-05  7:19 ` [gentoo-user] " Nikos Chantziaras
@ 2010-06-05  7:32 ` Andrea Conti
  2010-06-05 17:39 ` [gentoo-user] " 7v5w7go9ub0o
  2 siblings, 0 replies; 25+ messages in thread
From: Andrea Conti @ 2010-06-05  7:32 UTC (permalink / raw
  To: gentoo-user

>  Is there any faster and reliable way to checksum
>  whole partitions (not on a "per file" basis)?

It depends on where your bottleneck is...

If you're cpu-bound you can try with a faster hash: md5sum or even
md4sum would be a good choice (collision resistance is irrelevant in
this application).

On the other hand, if you're limited by disk throughput (which is most
likely) there is not much you can do. After all, you have to read the
data in order to hash it, and that takes time.
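
One way to tell the two cases apart is to measure how fast a hash runs when no disk is involved at all. The following is only a sketch, assuming bash and GNU date (for %N); hash_speed is a made-up helper name and the 256 MiB default is an arbitrary sample size:

```shell
#!/bin/bash
# Estimate raw hash throughput by feeding zeros from memory to a hash tool.
# hash_speed is a hypothetical helper; no disk I/O is involved.
hash_speed() {
    local tool=$1 mib=${2:-256} start end ms
    start=$(date +%s%N)                       # nanoseconds (GNU date)
    dd if=/dev/zero bs=1M count="$mib" 2>/dev/null | "$tool" >/dev/null
    end=$(date +%s%N)
    ms=$(( (end - start) / 1000000 ))
    [ "$ms" -gt 0 ] || ms=1                   # avoid division by zero
    echo "$tool: $(( mib * 1000 / ms )) MiB/s"
}

# Usage:
#   hash_speed md5sum
#   hash_speed sha512sum 1024
```

If the figure printed here is well above what the disk can deliver on a plain sequential read, the job is disk-bound and switching hashes changes little.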

andrea



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05  6:39 [gentoo-user] Fast checksumming of whole partitions meino.cramer
  2010-06-05  7:19 ` [gentoo-user] " Nikos Chantziaras
  2010-06-05  7:32 ` [gentoo-user] " Andrea Conti
@ 2010-06-05 17:39 ` 7v5w7go9ub0o
  2010-06-05 19:23   ` meino.cramer
  2 siblings, 1 reply; 25+ messages in thread
From: 7v5w7go9ub0o @ 2010-06-05 17:39 UTC (permalink / raw
  To: for list

On 06/05/10 02:39, meino.cramer@gmx.de wrote:
[]
>
> Is there any faster and reliable way to checksum whole partitions (not
> on a "per file" basis)?


FWIW, portage has a tool called "dcfldd" that works well for me. It is
dd with the addition of:

      *   Hashing on-the-fly - dcfldd can hash the input data as it is
being transferred, helping to ensure data integrity.
      * Status output - dcfldd can update the user of its progress in
terms of the amount of data transferred and how much longer operation
will take.
      * Flexible disk wipes - dcfldd can be used to wipe disks quickly and
with a known pattern if desired.
      * Image/wipe Verify - dcfldd can verify that a target drive is a
bit-for-bit match of the specified input file or pattern.
      * Multiple outputs - dcfldd can output to multiple files or disks at
the same time.
      * Split output - dcfldd can split output to multiple files with more
configurability than the split command.
      * Piped output and logs - dcfldd can send all its log data and
output to commands as well as files natively.


e.g. when I copy my HD, I get a copy status report and hash by using the
following commands:

#!/bin/bash
dcfldd if=/dev/sda bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/sdb
dcfldd if=/dev/sdb bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/null

When they've completed, I'll visually compare the two hashes (you can
automate this.) You can get fancier and do the Verify instead of the hashes.
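
As a rough sketch of automating that comparison: the function below uses plain md5sum instead of dcfldd's own hash output (a swapped-in technique, not what the commands above print), and same_hash is a made-up name:

```shell
#!/bin/bash
# Hash two devices (or files) and compare the digests.
# same_hash is a hypothetical helper; it uses md5sum, not dcfldd.
same_hash() {
    local a b
    a=$(md5sum < "$1" | cut -d' ' -f1)
    b=$(md5sum < "$2" | cut -d' ' -f1)
    echo "source:      $a"
    echo "destination: $b"
    [ "$a" = "$b" ]          # exit status 0 only when the digests match
}

# Usage (as root):
#   same_hash /dev/sda /dev/sdb && echo "copy verified"
```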

HTH

(p.s.  Part of your answer is setting the best blocksize for dd or
dcfldd.

I'd presume it is the smaller of your available memory or the buffer size
on your HD? Someone please correct me on this!)






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05 17:39 ` [gentoo-user] " 7v5w7go9ub0o
@ 2010-06-05 19:23   ` meino.cramer
  2010-06-05 20:11     ` Manuel Klemenz
  2010-06-05 23:44     ` 7v5w7go9ub0o
  0 siblings, 2 replies; 25+ messages in thread
From: meino.cramer @ 2010-06-05 19:23 UTC (permalink / raw
  To: gentoo-user

7v5w7go9ub0o <7v5w7go9ub0o@gmail.com> [10-06-05 20:22]:
> On 06/05/10 02:39, meino.cramer@gmx.de wrote:
> []
> >
> > Is there any faster and reliable way to checksum whole partitions (not
> > on a "per file" basis)?
> 
> 
> FWIW, portage has a tool called "dcfldd" that works well for me. It is
> dd with the addition of:
> 
>       *   Hashing on-the-fly - dcfldd can hash the input data as it is
> being transferred, helping to ensure data integrity.
>       * Status output - dcfldd can update the user of its progress in
> terms of the amount of data transferred and how much longer operation
> will take.
>       * Flexible disk wipes - dcfldd can be used to wipe disks quickly and
> with a known pattern if desired.
>       * Image/wipe Verify - dcfldd can verify that a target drive is a
> bit-for-bit match of the specified input file or pattern.
>       * Multiple outputs - dcfldd can output to multiple files or disks at
> the same time.
>       * Split output - dcfldd can split output to multiple files with more
> configurability than the split command.
>       * Piped output and logs - dcfldd can send all its log data and
> output to commands as well as files natively.
> 
> 
> e.g. when I copy my HD, I get a copy status report and hash by using the
> following commands:
> 
> #!/bin/bash
> dcfldd if=/dev/sda bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/sdb
> dcfldd if=/dev/sdb bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/null
> 
> When they've completed, I'll visually compare the two hashes (you can
> automate this.) You can get fancier and do the Verify instead of the hashes.
> 
> HTH
> 
> (p.s.  Part of your answer is setting the best blocksize for dd or
> dcfldd.
> 
> I'd presume it the smaller of your available memory, or the buffer size
> on your HD?...... someone please correct me on this!?)
> 
> 

That looks really interesting. The only problem I have with this is
that I need both /dev/sda and /dev/sdb idle (not mounted), and
because of that I boot knoppix as a temporary system. And I
don't think that knoppix has this tool "on board".

Or is there a way to do such copies from one disk to another
while one of the disks is booted?

Best regards,
mcc

-- 
Please don't send me any Word- or Powerpoint-Attachments
unless it's absolutely neccessary. - Send simply Text.
See http://www.gnu.org/philosophy/no-word-attachments.html
In a world without fences and walls nobody needs gates and windows.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05 19:23   ` meino.cramer
@ 2010-06-05 20:11     ` Manuel Klemenz
  2010-06-06 19:02       ` 7v5w7go9ub0o
  2010-06-05 23:44     ` 7v5w7go9ub0o
  1 sibling, 1 reply; 25+ messages in thread
From: Manuel Klemenz @ 2010-06-05 20:11 UTC (permalink / raw
  To: gentoo-user

[-- Attachment #1: Type: Text/Plain, Size: 2791 bytes --]

I'm calculating checksums over partitions just by calling
# md5sum /dev/sda1
or for the complete disk (incl. partition table + all partitions)
# md5sum /dev/sda

that's it :) - works with any distro/liveDVD

-- 
Cheers,
Manuel Klemenz

On Saturday 05 June 2010 21:23:31 meino.cramer@gmx.de wrote:
> 7v5w7go9ub0o <7v5w7go9ub0o@gmail.com> [10-06-05 20:22]:
> > On 06/05/10 02:39, meino.cramer@gmx.de wrote:
> > []
> > 
> > > > Is there any faster and reliable way to checksum whole partitions (not
> > > > on a "per file" basis)?
> > 
> > FWIW, portage has a tool called "dcfldd" that works well for me. It is
> > 
> > dd with the addition of:
> >       *   Hashing on-the-fly - dcfldd can hash the input data as it is
> > 
> > being transferred, helping to ensure data integrity.
> > 
> >       * Status output - dcfldd can update the user of its progress in
> > 
> > terms of the amount of data transferred and how much longer operation
> > will take.
> > 
> >       * Flexible disk wipes - dcfldd can be used to wipe disks quickly
> >       and
> > 
> > with a known pattern if desired.
> > 
> >       * Image/wipe Verify - dcfldd can verify that a target drive is a
> > 
> > bit-for-bit match of the specified input file or pattern.
> > 
> >       * Multiple outputs - dcfldd can output to multiple files or disks
> >       at
> > 
> > the same time.
> > 
> >       * Split output - dcfldd can split output to multiple files with
> >       more
> > 
> > configurability than the split command.
> > 
> >       * Piped output and logs - dcfldd can send all its log data and
> > 
> > output to commands as well as files natively.
> > 
> > 
> > e.g. when I copy my HD, I get a copy status report and hash by using the
> > following commands:
> > 
> > #!/bin/bash
> > > dcfldd if=/dev/sda bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/sdb
> > > dcfldd if=/dev/sdb bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/null
> > 
> > When they've completed, I'll visually compare the two hashes (you can
> > automate this.) You can get fancier and do the Verify instead of the
> > hashes.
> > 
> > HTH
> > 
> > (p.s.  Part of your answer is setting the best blocksize for dd or
> > dcfldd.
> > 
> > I'd presume it the smaller of your available memory, or the buffer size
> > on your HD?...... someone please correct me on this!?)
> 
> That looks really interesting. The only problem I have with this is
> that I have to have /dev/sda as /dev/sdb idle (not mounted) and
> because of that I use knoppix as temporary system to boot. And I
> dont think that knoppix has this tool "on board".
> 
> Or is there a way to do such copies from a one disk to another
> while one disk is booted???
> 
> Best regards,
> mcc


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05 19:23   ` meino.cramer
  2010-06-05 20:11     ` Manuel Klemenz
@ 2010-06-05 23:44     ` 7v5w7go9ub0o
  2010-06-06 10:19       ` Andrea Conti
  1 sibling, 1 reply; 25+ messages in thread
From: 7v5w7go9ub0o @ 2010-06-05 23:44 UTC (permalink / raw
  To: for list

On 06/05/10 15:23, meino.cramer@gmx.de wrote:

[]
> That looks really interesting. The only problem I have with this is
> that I have to have /dev/sda as /dev/sdb idle (not mounted) and
> because of that I use knoppix as temporary system to boot. And I dont
> think that knoppix has this tool "on board".

Just boot up knoppix, mount the root partition that contains dcfldd, and
go to wherever the executable is located (e.g. /usr/bin/dcfldd):

1. boot up knoppix
2. create a mount point: mkdir /work
3. mount the root partition on /work: mount /dev/sdc /work
4. cd /work/usr/bin
5. run dcfldd: ./dcfldd

If your root partition is encrypted (e.g. mine is), then place a copy of
dcfldd on the boot partition; if there is no boot partition, put a copy
on its own dedicated little partition.

Of course, you can always put a copy on a USB jumpdrive. As a last
alternative, download and compile a copy while in knoppix.

> Or is there a way to do such copies from a one disk to another while
>  one disk is booted???

Sure, but the running disk/sector would have temporary files that would
not consistently hash when you did the hash check. If you do this, try
it in linux without bringing up X. This might avoid copying some software
"locks" that could block startup on the copied disk/sector.

HTH



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05 23:44     ` 7v5w7go9ub0o
@ 2010-06-06 10:19       ` Andrea Conti
  2010-06-06 16:55         ` Mick
  2010-06-06 18:55         ` 7v5w7go9ub0o
  0 siblings, 2 replies; 25+ messages in thread
From: Andrea Conti @ 2010-06-06 10:19 UTC (permalink / raw
  To: gentoo-user

> 1. boot up knoppix
> 2. create a partition: mkdir /work
> 3. mount /work to the root partition: mount /dev/sdc /work
> 4. cd /work/usr/bin
> 5. run dcfldd: ./dcfldd

This is fine, provided that

1- if the root partition is [part of] what you're copying, you *must*
mount it read-only (mount -o ro /dev/sdc /work)

2- the dcfldd executable is linked statically. If it uses dynamic
linking, your "live" system -- knoppix in this case -- must have exactly
the same library versions (especially glibc) as the gentoo system.

>> Or is there a way to do such copies from a one disk to another while
>>  one disk is booted???

The point is not with being "booted" (i.e., part of the running system)
or not: you *cannot* reliably perform a sector-by-sector copy of any
write-mounted partition without special support either at the FS or
block device level (i.e. snapshots).

> Sure, but the running disk/sector would have temporary files that would
> not consistently hash when you did the hash check.

That is only a minor part of the problem. The real issue is that if
*anything* writes to the source partition while you are halfway through
copy, you risk ending up with inconsistencies in the filesystem
metadata. Doing a fsck on the copy will probably fix that, but you risk
losing or corrupting data.

And no, hashing as described in the previous post will *not* catch any
differences in this case, as the "source" hash is computed from what is
read during the copy (which, barring hardware problems, is what gets
written on the target disk) and not from the whole contents of the
source partition after the copy (or at any single point in time).

> If you do this, try it in linux without bringing up X.

That's definitely not enough: at the very least, boot up in single-user
mode and remount all your partitions read-only (mount -o remount,ro).
This will break things on a running system (e.g anything that writes to
/var and /tmp will throw errors or stop working), but it will allow you
to produce consistent partition images.
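
A small guard along these lines can refuse to start a copy while the source is still mounted read-write. This is only a sketch, assuming the usual /proc/mounts layout (device, mount point, type, options); is_mounted_rw is a made-up name, and it matches by device-name prefix, so filesystems mounted via device-mapper names or other aliases will not be caught:

```shell
#!/bin/bash
# Succeed (exit 0) if any partition of the given disk is mounted read-write.
# is_mounted_rw is a hypothetical helper. The second argument lets you
# point it at a different mount table; it defaults to /proc/mounts.
is_mounted_rw() {
    local dev=$1 table=${2:-/proc/mounts}
    awk -v dev="$dev" '
        # $1 = device, $4 = comma-separated mount options
        index($1, dev) == 1 && $4 ~ /(^|,)rw(,|$)/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$table"
}

# Usage:
#   if is_mounted_rw /dev/sda; then
#       echo "refusing to copy: /dev/sda is mounted read-write" >&2
#   fi
```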

andrea



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 10:19       ` Andrea Conti
@ 2010-06-06 16:55         ` Mick
  2010-06-06 18:55         ` 7v5w7go9ub0o
  1 sibling, 0 replies; 25+ messages in thread
From: Mick @ 2010-06-06 16:55 UTC (permalink / raw
  To: gentoo-user

[-- Attachment #1: Type: Text/Plain, Size: 2490 bytes --]

On Sunday 06 June 2010 11:19:57 Andrea Conti wrote:
> > 1. boot up knoppix
> > 2. create a partition: mkdir /work
> > 3. mount /work to the root partition: mount /dev/sdc /work
> > 4. cd /work/usr/bin
> > 5. run dcfldd: ./dcfldd
> 
> This is fine, provided that
> 
> 1- if the root partition is [part of] what you're copying, you *must*
> mount it read-only (mount -o ro /dev/sdc /work)
> 
> 2- the dcfldd executable is linked statically. If it uses dynamic
> linking, your "live" system -- knoppix in this case -- must have exactly
> the same library versions (especially glibc) as the gentoo system.
> 
> >> Or is there a way to do such copies from a one disk to another while
> >>  one disk is booted???
> 
> The point is not with being "booted" (i.e., part of the running system)
> or not: you *cannot* reliably perform a sector-by-sector copy of any
> write-mounted partition without special support either at the FS or
> block device level (i.e. snapshots).
> 
> > Sure, but the running disk/sector would have temporary files that would
> > not consistently hash when you did the hash check.
> 
> That is only a minor part of the problem. The real issue is that if
> *anything* writes to the source partition while you are halfway through
> copy, you risk ending up with inconsistencies in the filesystem
> metadata. Doing a fsck on the copy will probably fix that, but you risk
> losing or corrupting data.
> 
> And no, hashing as described in the previous post will *not* catch any
> differences in this case, as the "source" hash is computed from what is
> read during the copy (which, barring hardware problems, is what gets
> written on the target disk) and not from the whole contents of the
> source partition after the copy (or at any single point in time).
> 
> > If you do this, try it in linux without bringing up X.
> 
> That's definitely not enough: at the very least, boot up in single-user
> mode and remount all your partitions read-only (mount -o remount,ro).
> This will break things on a running system (e.g anything that writes to
> /var and /tmp will throw errors or stop working), but it will allow you
> to produce consistent partition images.

It may be worth trying 'apt-get install dcfldd' after you su to root with 
Knoppix.  As long as Knoppix does not need a lorry load of dependencies you 
may be able to quickly install the .deb binary you need and move on with the 
task in hand.
-- 
Regards,
Mick


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 10:19       ` Andrea Conti
  2010-06-06 16:55         ` Mick
@ 2010-06-06 18:55         ` 7v5w7go9ub0o
  2010-06-06 20:00           ` Mick
  2010-06-06 20:45           ` Andrea Conti
  1 sibling, 2 replies; 25+ messages in thread
From: 7v5w7go9ub0o @ 2010-06-06 18:55 UTC (permalink / raw
  To: for list

On 06/06/10 06:19, Andrea Conti wrote:
>> 1. boot up knoppix 2. create a partition: mkdir /work 3. mount
>> /work to the root partition: mount /dev/sdc /work 4. cd
>> /work/usr/bin 5. run dcfldd: ./dcfldd
>
> This is fine, provided that
>
> 1- if the root partition is [part of] what you're copying, you
> *must* mount it read-only (mount -o ro /dev/sdc /work)

Not from my experience; I simply mount, exec, and go - Works fine, be it
a partition or a disk copy (though it seems likely that the last access
dates would be changed if forensics is an issue).
>
> 2- the dcfldd executable is linked statically. If it uses dynamic
> linking, your "live" system -- knoppix in this case -- must have
> exactly the same library versions (especially glibc) as the gentoo
> system.

Good point. I've been using a contemporary Gentoo live disk and the
libraries happen to be compatible.................

# ldd /usr/bin/dcfldd
         linux-vdso.so.1 =>  (0x00006cdd998b6000)
         libc.so.6 => /lib/libc.so.6 (0x00006cdd99341000)
         /lib64/ld-linux-x86-64.so.2 (0x00006cdd9969b000)



Based on this thread, I'll be running my backups from a
statically-linked copy of dcfldd on a "jumpdisk" (backup copy on the
boot sector).

- Any advice on the dd "blocksize" parameter?



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05 20:11     ` Manuel Klemenz
@ 2010-06-06 19:02       ` 7v5w7go9ub0o
  2010-06-06 19:47         ` Joerg Schilling
  2010-06-06 22:46         ` Neil Bothwick
  0 siblings, 2 replies; 25+ messages in thread
From: 7v5w7go9ub0o @ 2010-06-06 19:02 UTC (permalink / raw
  To: for list

On 06/05/10 16:11, Manuel Klemenz wrote:
> I'm calculating checksums over partitions just by calling # md5sum
> /dev/sda1 or for the complete disk (incl. partition table + all
> partitions) # md5sum /dev/sda
>
> that's it :) - works with any distro/liveDVD
>

Yep...... don't have to fool with an oddball program (dcfldd). So if
you're dd'ing a disk, you need to:

1. dd the source to the destination.
2. md5sum the source
3. md5sum the destination.

(3 passes on a big disk(s) takes a long time.)

But if you use dcfldd instead of dd for the copy, then you'll get both
the copy and the md5 on the first pass.

1. dcfldd the source to the destination; get the md5.
2. md5sum the destination.

And if you use dcfldd instead of md5sum to run the destination hash, you
can specify a large (e.g. 4 gig) blocksize - cutting back on disk I/O,
wear and tear, and time required to hash the destination.
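
If the only goal is to confirm that the destination matches the source, a byte-wise comparison is another option that has not come up in this thread: cmp reads both devices in one run and stops at the first difference, with no hashing involved. A minimal sketch, with verify_copy as a made-up wrapper name:

```shell
#!/bin/bash
# Compare two devices (or files) byte by byte with cmp.
# verify_copy is a hypothetical wrapper name.
verify_copy() {
    if cmp "$1" "$2"; then
        echo "identical"
    else
        echo "DIFFERENT"
        return 1
    fi
}

# Usage (as root):
#   verify_copy /dev/sda /dev/sdb
```

Unlike a stored hash, this leaves nothing to check the copy against later, so it complements the md5 approach rather than replacing it.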






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 19:02       ` 7v5w7go9ub0o
@ 2010-06-06 19:47         ` Joerg Schilling
  2010-06-06 22:43           ` 7v5w7go9ub0o
  2010-06-06 22:46         ` Neil Bothwick
  1 sibling, 1 reply; 25+ messages in thread
From: Joerg Schilling @ 2010-06-06 19:47 UTC (permalink / raw
  To: gentoo-user

7v5w7go9ub0o <7v5w7go9ub0o@gmail.com> wrote:

> On 06/05/10 16:11, Manuel Klemenz wrote:
> > I'm calculating checksums over partitions just by calling # md5sum
> > /dev/sda1 or for the complete disk (incl. partition table + all
> > partitions) # md5sum /dev/sda
> >
> > that's it :) - works with any distro/liveDVD
> >
>
> Yep...... don't have to fool with an oddball program (dcfldd). So if
> you're dd'ing a disk, you need to:
>
> 1. dd the source to the destination.
> 2. md5sum the source
> 3. md5sum the destination.

Why not just call:

	sdd if=/dev/something bs=1m -md5 -onull

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de                (uni)  
       joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 18:55         ` 7v5w7go9ub0o
@ 2010-06-06 20:00           ` Mick
  2010-06-06 20:45           ` Andrea Conti
  1 sibling, 0 replies; 25+ messages in thread
From: Mick @ 2010-06-06 20:00 UTC (permalink / raw
  To: gentoo-user

[-- Attachment #1: Type: Text/Plain, Size: 491 bytes --]

On Sunday 06 June 2010 19:55:57 7v5w7go9ub0o wrote:

> Based on this thread, I'll be running my backups from a
> statically-linked copy of dcfldd on a "jumpdisk" (backup copy on the
> boot sector).
> 
> - Any advice on the dd "blocksize" parameter?

Only to say that 4096 gets things done sooo much faster than 512.  I don't 
really know what is appropriate and if the buffer of the drive should be 
brought into consideration and the bs adjusted to match.
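
Rather than guess, the block size can be measured. The loop below is only a sketch, assuming bash, GNU dd and GNU date; bench_bs is a made-up name and the list of sizes is an arbitrary pick:

```shell
#!/bin/bash
# Read the same amount of data with several block sizes and print the time.
# bench_bs is a hypothetical helper. Caveat: after the first run the data
# may sit in the page cache, which flatters the later runs; on a real disk,
# drop caches between runs (as root: echo 3 > /proc/sys/vm/drop_caches).
bench_bs() {
    local src=$1 total=${2:-268435456}   # bytes per run, default 256 MiB
    local bs count start end
    for bs in 512 4096 65536 1048576 4194304; do
        count=$(( total / bs ))
        start=$(date +%s%N)              # nanoseconds (GNU date)
        dd if="$src" of=/dev/null bs="$bs" count="$count" 2>/dev/null
        end=$(date +%s%N)
        echo "bs=$bs: $(( (end - start) / 1000000 )) ms"
    done
}

# Usage (as root):
#   bench_bs /dev/sda
```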

-- 
Regards,
Mick


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 18:55         ` 7v5w7go9ub0o
  2010-06-06 20:00           ` Mick
@ 2010-06-06 20:45           ` Andrea Conti
  2010-06-06 23:06             ` 7v5w7go9ub0o
  1 sibling, 1 reply; 25+ messages in thread
From: Andrea Conti @ 2010-06-06 20:45 UTC (permalink / raw
  To: gentoo-user

>> 1- if the root partition is [part of] what you're copying, you
>> *must* mount it read-only (mount -o ro /dev/sdc /work)
> 
> Not from my experience; I simply mount, exec, and go - Works fine

Let's say you are 50% done copying a partition, when something writes to
it. If the write only affects the first half, which has already been
copied, the target will consistently reflect the "old" state; if on the
other hand the write only affects the second half, which has not been
copied yet, the target will consistently reflect the "new" state.
The problem is that with any write affecting both halves your copy will
contain a mix of the two states and thus will be inconsistent.

I am not saying that copying a rw-mounted partition will always fail:
you might be able to get away with it if nothing is actively writing to
the source partition. In any case you will not see any errors during the
copy.
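
The effect can be reproduced in miniature with two ordinary files standing in for the partitions. This is a toy sketch (torn_copy_demo is a made-up name, and the 8-byte "partition" is purely illustrative):

```shell
#!/bin/bash
# Simulate a write that hits both halves of a source while it is being copied.
# torn_copy_demo is a hypothetical name; it works on temp files only.
torn_copy_demo() {
    local dir
    dir=$(mktemp -d)
    printf 'AAAABBBB' > "$dir/src"       # "old" state: two 4-byte halves
    # Copy the first half.
    dd if="$dir/src" of="$dir/dst" bs=4 count=1 2>/dev/null
    # Something rewrites both halves while the copy is in progress.
    printf 'CCCCDDDD' > "$dir/src"       # "new" state
    # Copy the second half.
    dd if="$dir/src" of="$dir/dst" bs=4 skip=1 seek=1 count=1 \
       conv=notrunc 2>/dev/null
    cat "$dir/dst"; echo
    rm -rf "$dir"
}

torn_copy_demo    # prints AAAADDDD: neither the old nor the new state
```

Neither dd invocation reports an error, which matches the point above that the copy itself gives no hint of the inconsistency.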

andrea



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 19:47         ` Joerg Schilling
@ 2010-06-06 22:43           ` 7v5w7go9ub0o
  2010-06-06 23:12             ` Joerg Schilling
  0 siblings, 1 reply; 25+ messages in thread
From: 7v5w7go9ub0o @ 2010-06-06 22:43 UTC (permalink / raw
  To: for list

On 06/06/10 15:47, Joerg Schilling wrote:
> 7v5w7go9ub0o<7v5w7go9ub0o@gmail.com>  wrote:
>
>> On 06/05/10 16:11, Manuel Klemenz wrote:
>>> I'm calculating checksums over partitions just by calling #
>>> md5sum /dev/sda1 or for the complete disk (incl. partition table
>>>  + all partitions) # md5sum /dev/sda
>>>
>>> that's it :) - works with any distro/liveDVD
>>>
>>
>> Yep...... don't have to fool with an oddball program (dcfldd). So
>> if you're dd'ing a disk, you need to:
>>
>> 1. dd the source to the destination. 2. md5sum the source 3. md5sum
>> the destination.
>
> Why not just call:
>
> sdd if=/dev/something bs=1m -md5 -onull

err...... what is sdd?

If it is significantly faster than dd/dcfldd, then sdd may be the magic
bullet! E.G. one would:

1. sdd if=/dev/something bs=xx -md5 -o /dev/somethingout
2. sdd if=/dev/somethingout bs=xx  -md5 -o null

Of course, one might ask, "is it on Knoppix?"




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 19:02       ` 7v5w7go9ub0o
  2010-06-06 19:47         ` Joerg Schilling
@ 2010-06-06 22:46         ` Neil Bothwick
  2010-06-07  1:04           ` 7v5w7go9ub0o
  1 sibling, 1 reply; 25+ messages in thread
From: Neil Bothwick @ 2010-06-06 22:46 UTC (permalink / raw
  To: gentoo-user

[-- Attachment #1: Type: text/plain, Size: 689 bytes --]

On Sun, 06 Jun 2010 15:02:10 -0400, 7v5w7go9ub0o wrote:

> Yep...... don't have to fool with an oddball program (dcfldd). So if
> you're dd'ing a disk, you need to:
> 
> 1. dd the source to the destination.
> 2. md5sum the source
> 3. md5sum the destination.
> 
> (3 passes on a big disk(s) takes a long time.)
> 
> But if you use dcfldd instead of dd for the copy, then you'll get both
> the copy and the md5 on the first pass.
> 
> 1. dcfldd the source to the destination; get the md5.
> 2. md5sum the destination.

You can use tee to send the dd output to both the destination and md5sum.
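
A minimal sketch of that pipeline, assuming GNU coreutils; copy_and_hash is a made-up function name and the device names in the usage comment are placeholders:

```shell
#!/bin/bash
# Copy source to destination and print the md5 of the data that was read,
# all in one pass over the source. copy_and_hash is a hypothetical helper.
copy_and_hash() {
    # dd reads the source, tee writes it to the destination and passes
    # the same bytes on to md5sum.
    dd if="$1" bs=1M 2>/dev/null | tee "$2" | md5sum | cut -d' ' -f1
}

# Usage (as root):
#   src_md5=$(copy_and_hash /dev/sda /dev/sdb)
#   dst_md5=$(md5sum < /dev/sdb | cut -d' ' -f1)
#   [ "$src_md5" = "$dst_md5" ] && echo "copy verified"
```

The hash printed here covers what was read from the source, not what ended up on the destination, so the second pass over the destination is still needed. Note also that the pipeline's exit status is that of the last command, so a dd read error would not show up in it.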


-- 
Neil Bothwick

Math and alcohol don't mix. Don't drink and derive.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 20:45           ` Andrea Conti
@ 2010-06-06 23:06             ` 7v5w7go9ub0o
  0 siblings, 0 replies; 25+ messages in thread
From: 7v5w7go9ub0o @ 2010-06-06 23:06 UTC (permalink / raw
  To: for list

On 06/06/10 16:45, Andrea Conti wrote:
>>> 1- if the root partition is [part of] what you're copying, you
>>> *must* mount it read-only (mount -o ro /dev/sdc /work)
>>
>> Not from my experience; I simply mount, exec, and go - Works fine
>
> Let's say you are 50% done copying a partition, when something
> writes to it. If the write only affects the first half, which has
> already been copied, the target will consistently reflect the "old"
> state; if on the other hand the write only affects the second half,
> which has not been copied yet, the target will consistently reflect
> the "new" state. The problem is that with any write affecting both
> halves your copy will contain a mix of the two states and thus will
> be inconsistent.

Should that happen, I certainly agree that the copies would be
inconsistent... but I don't know what would cause the live OS to write
anything to it (other than update the last access date/time - which
occurs early on).

At any rate, should that happen, the hashes would disagree and I'd
reject the copy. Thus far the whole-disk hashes have always agreed.

Now, if this were a forensic investigation, then you're absolutely right
- even updating an access time would be unacceptable; regardless that
the changed source and copied destination hash the same.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 22:43           ` 7v5w7go9ub0o
@ 2010-06-06 23:12             ` Joerg Schilling
  0 siblings, 0 replies; 25+ messages in thread
From: Joerg Schilling @ 2010-06-06 23:12 UTC (permalink / raw
  To: gentoo-user

7v5w7go9ub0o <7v5w7go9ub0o@gmail.com> wrote:

> > sdd if=/dev/something bs=1m -md5 -onull
>
> err...... what is sdd?

sdd is e.g. in ftp://ftp.berlios.de/pub/schily/

sdd is the oldest free "dd" reimplementation and it introduced intermediate
statistics with signals from TTY in 1984 already (this is what *BSD now offers 
with ^T).

> If it is significantly faster than dd/dcfldd, then sdd may be the magic
> bullet! E.G. one would:
>
> 1. sdd if=/dev/something bs=xx -md5 -o /dev/somethingout
> 2. sdd if=/dev/somethingout bs=xx  -md5 -o null

See the man page
and try it out.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de                (uni)  
       joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-06 22:46         ` Neil Bothwick
@ 2010-06-07  1:04           ` 7v5w7go9ub0o
  0 siblings, 0 replies; 25+ messages in thread
From: 7v5w7go9ub0o @ 2010-06-07  1:04 UTC (permalink / raw
  To: for list

On 06/06/10 18:46, Neil Bothwick wrote:
[]
>
> You can use tee to send the dd output to both the destination and md5sum.


Dang!  I'd forgotten about tee; a core utility!

Yep....... using tee as you described would be the classic way to md5sum 
the source; use ubiquitous dd and md5sum to quickly hash the copy.

By any chance have you played with sdd?

<quote>


sdd is a replacement for a program called 'dd'.

sdd is much faster than dd in cases where input block size (ibs) is not equal
to the output block size (obs).

sdd does not have some of the design bugs of dd that cause the following
command to fail:

dd if=/dev/rdsk/c0t0d0s2 bs=126k | rsh otherhost 'dd ibs=4k obs=126k of=/dev/rdsk/c0t0d0s2'

The output disk will not be equal to the input disk because the dd command
on 'otherhost' will read fragments of 'ibs' and fill them up to 4kB.

Other features not found on 'dd':

          - Statistics are much more readable than those from 'dd'.
            Output is not # of full blocks + # of partial blocks (not useful)
            But: # of full blocks + # of bytes from partial blocks = full amount

          sparky joerg > sdd if=/dev/rdsk/c0t0d0s0 bs=126k of=/dev/null -t
          sdd: Read  268 records + 22528 bytes (total of 34600960 bytes = 33790.00k).
          sdd: Wrote 268 records + 22528 bytes (total of 34600960 bytes = 33790.00k).
          sdd: Total time 10.601sec (3187 kBytes/sec)

          - rmt support for if= & of=

            call sdd -inull bs=63k count=1000 of=ntape@somehost:/dev/null
            to do TCP speed tests.

          - output file is sync'd before doing statistic report.
            This enforces NFS data integrity and helps to get right numbers
            when doing filesystem write performance tests.

          - Timing available, -time option will print transfer speed

          - Timing & Statistics available at any time with SIGQUIT (^\)

          - Can seek on input and output

          - Fast null input
          - Fast null output

          - Reblocking on pipes does not fill small input blocks to
            input block size

          - Debug printing
          - Progress printing

<>




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-05  7:19 ` [gentoo-user] " Nikos Chantziaras
@ 2010-06-07 15:48   ` meino.cramer
  2010-06-07 17:10     ` walt
  2010-06-07 17:31     ` Andrea Conti
  0 siblings, 2 replies; 25+ messages in thread
From: meino.cramer @ 2010-06-07 15:48 UTC (permalink / raw
  To: gentoo-user

Nikos Chantziaras <realnc@arcor.de> [10-06-05 10:08]:
> On 06/05/2010 09:39 AM, meino.cramer@gmx.de wrote:
> >
> >Hi,
> >
> >  this night dd copies the contents of my first
> >  1TB disk to my second 1TB disk (same Model).
> >
> >  (dd if=/devsda of=/dev/sdb bs=4096)
> >
> >  I want to verify, that the copy is identical.
> >
> >  I tried (or: I am still trying) to checksum
> >  the first disk with
> >
> >  whirlpooldeep /dev/sda
> >
> >  whch seems to work but is DAMN slow (in relation
> >  to checksumming 1TB in whole).
> >
> >  Is there any faster and reliable way to checksum
> >  whole paritions (not on "per file" base)???
> >
> >  Thank you very much in advance for any help!
> >
> >  Best regards,
> >  mcc
> 
> Constructing a checksum means reading every byte off the partition.  So 
> it's slower as a copy to /dev/null, never faster (because the checksum 
> calculation also needs time.)
> 
> So in order to determine whether it's really slow, compare the time 
> needed to dd the whole partition to /dev/null to the time needed for 
> checksumming it.  Then post the times here and an expert might then 
> tell whether this can be improved at all or not.
> 
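The comparison suggested in the quoted text can be sketched roughly as follows. This is only an illustration: a small temporary image stands in for the partition, md5sum stands in for whatever hashing tool is used, and the `elapsed` helper is a made-up name.

```shell
#!/bin/sh
# Stand-in image file (assumption); on real hardware this would be e.g. /dev/sda.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=8192 2>/dev/null

# Run a command, discard its output, and print the elapsed whole seconds.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

read_secs=$(elapsed dd if="$img" of=/dev/null bs=64k)   # plain read
hash_secs=$(elapsed md5sum "$img")                      # read plus checksum

echo "read only: ${read_secs}s, read + md5: ${hash_secs}s"
rm -f "$img"
```

On a small file both numbers will be close to zero and the second read comes from the cache; the comparison only becomes meaningful on something much larger than RAM, such as the 1TB disk discussed here.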

Since the following is only some interim info, I am not replying to one
of the latest postings; this post is related to this thread, but it is
not a direct reply to any particular message.

I downloaded the current "Parted Magic", which
- copies itself completely to RAM if wanted and runs from there,
  which gives you back the DVD/CD-ROM drive for other things,
- contains a lot of get-out-of-a-disaster tools combined
  with tools to play low-level games with hard disks,
- contains dcfldd (!).

With this and dcfldd I copied one 1TB WD10EARS disk on a SATA1
controller to another hard disk of the same model with these
timings (sda ==> sdb):

real: 293m17.265s
user: 113m59.072s
sys : 64m6.605s

Checksumming the second disk while copying its contents to /dev/null
gives these timings (sdb ==> null):

real: 253m57.517s
user: 113m51.988s
sys : 32m21.381s
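
For orientation, the two "real" times above can be turned into rough throughput figures. This is only a sketch: it assumes the disk holds exactly 10^12 bytes (the nominal "1TB", not a figure stated in this post), and the `throughput` helper is a made-up name.

```shell
#!/bin/sh
# Nominal disk size: assumed to be exactly 10^12 bytes (not stated in the post).
disk_bytes=1000000000000

# Print MB/s (10^6 bytes per second) for a byte count and a "real" time
# given as minutes and seconds.
throughput() {
    LC_ALL=C awk -v b="$1" -v m="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", b / (m * 60 + s) / 1000000 }'
}

copy_rate=$(throughput "$disk_bytes" 293 17.265)   # sda ==> sdb
hash_rate=$(throughput "$disk_bytes" 253 57.517)   # sdb ==> null, with checksum

echo "copy:     $copy_rate MB/s"
echo "checksum: $hash_rate MB/s"
```

Under that size assumption the copy works out to roughly 57 MB/s and the checksum pass to roughly 66 MB/s.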

Again: The transfer was via SATA1 and the disks were jumpered to use 
SATA1 speeds only.

Despite a lot of CRC errors reported via dmesg/kernel logs, the copy
was identical to the original. As mentioned, I think VIA PATA
conflicts with VIA SATA in earlier kernels, since I do not see
these messages with 2.6.34.00.

I will check for sdd on the Parted Magic ISO and post later what I
find.

Does anyone have experience with gparted?
Is it recommended (I need support for ext4)?

Best regards,
mcc





^ permalink raw reply	[flat|nested] 25+ messages in thread

* [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-07 15:48   ` meino.cramer
@ 2010-06-07 17:10     ` walt
  2010-06-07 18:47       ` meino.cramer
  2010-06-07 17:31     ` Andrea Conti
  1 sibling, 1 reply; 25+ messages in thread
From: walt @ 2010-06-07 17:10 UTC (permalink / raw
  To: gentoo-user

On 06/07/2010 08:48 AM, meino.cramer@gmx.de wrote:

> Does anyone has experiences with gparted?
> Is it recommended (I need support for ext4)?

I've used it many times for DOS and ext3.  The program says it supports ext4
but I've never tried ext4 so I can't comment.

Gparted is just a gui front-end for parted, which copies individual partitions
but not whole disks.  If you have only one partition then it should be equivalent,
I think, to copying the disk.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-07 15:48   ` meino.cramer
  2010-06-07 17:10     ` walt
@ 2010-06-07 17:31     ` Andrea Conti
  2010-06-07 18:54       ` meino.cramer
  1 sibling, 1 reply; 25+ messages in thread
From: Andrea Conti @ 2010-06-07 17:31 UTC (permalink / raw
  To: gentoo-user

> Does anyone has experiences with gparted?

I have no experience with Parted Magic, but I have used the Gparted
live CD (http://gparted.sourceforge.net/livecd.php) a lot. No idea
how the two compare.

As for gparted (which is a lot more than a gui for parted), I have used
it on ext4 a couple of times and it managed to complete the copy without
destroying anything, so I'd say it works. On the other hand, I copied
*lots* of ext3 and ntfs partitions, and it never failed me.

BTW, by default gparted does *not* do sector-by-sector partition copies.
In my opinion this is a much better approach unless you need a
bit-exact copy of the original (e.g. if you're doing forensics or
debugging filesystems), but in the end it's up to you.

andrea



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-07 17:10     ` walt
@ 2010-06-07 18:47       ` meino.cramer
  0 siblings, 0 replies; 25+ messages in thread
From: meino.cramer @ 2010-06-07 18:47 UTC (permalink / raw
  To: gentoo-user

walt <w41ter@gmail.com> [10-06-07 20:04]:
> On 06/07/2010 08:48 AM, meino.cramer@gmx.de wrote:
> 
> >Does anyone has experiences with gparted?
> >Is it recommended (I need support for ext4)?
> 
> I've used it many times for DOS and ext3.  The program says it supports 
> ext4
> but I've never tried ext4 so I can't comment.
> 
> Gparted is just a gui front-end for parted, which copies individual 
> partitions
> but not whole disks.  If you have only one partition then it should be 
> equivalent,
> I think, to copying the disk.
> 

There are three additional questions for me. Since I have now
successfully copied one disk to another, I have to decide
whether I simply repartition the first and copy the contents
from the second to the first, or use gparted on the first one.

I tend towards the first option, since it seems difficult to me
to check whether the result of processing the disk with gparted
is correct.

Therefore the three questions for me are:
1) Gparted or repartitioning?
2) What is the most efficient way of copying the contents of one
   partition to another on a per-file basis, preserving as many
   file attributes as possible, including the file times?
3) Excluding typos and other fatal errors: will it always
   preserve and leave intact the contents of (for example)
   the first four partitions (1,2,3,6) when repartitioning as follows:

   Old         ==>       New
   1   100M              100M
   2   100G              100G
   3   100G              100G
   5   extended          extended
   6   100G              100G
   7   200G              100G
   8   300G              200G
   9   400G              350G
  10   ----              250G


  ???

   Lacking any experience with this last case,
   I tend to say "Yes" ... but ...

   Thank you very much in advance for any help!

   Best regards,
   mcc
   





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-07 17:31     ` Andrea Conti
@ 2010-06-07 18:54       ` meino.cramer
  2010-06-11 22:41         ` Mick
  0 siblings, 1 reply; 25+ messages in thread
From: meino.cramer @ 2010-06-07 18:54 UTC (permalink / raw
  To: gentoo-user

Andrea Conti <alyf@alyf.net> [10-06-07 20:28]:
> > Does anyone has experiences with gparted?
> 
> I have no experience with Parted Magic, but I have used a lot the
> Gparted live CD (http://gparted.sourceforge.net/livecd.php). No idea on
> how the two compare.
> 
> As for gparted (which is a lot more than a gui for parted), I have used
> it on ext4 a couple of times and it managed to complete the copy without
> destroying anything, so I'd say it works. On the other hand, I copied
> *lots* of ext3 and ntfs partitions, and it never failed me.
> 
> BTW, by default gparted does *not* do sector-by-sector partition copies.
> In my opinion this is a much better approach if you do not need a
> bit-exact copy of the original (e.g. if you're doing forensics or
> debugging filesystems), but in the end it's up to you.
> 
> andrea
> 

Hi Andrea,

do I understand this correctly:
Instead of using dcfldd or simply dd to copy one disk to
another for backup reasons, it is much better/faster to use gparted
to simply copy all partitions from sda to sdb using sda1=>sdb1,
sda2=>sdb2,... and so forth?
Is there something like a "batch job" or scripting interface
for gparted so that I can give gparted the complete copy job
once, go to sleep, and find the copy done the next morning?
Is gparted able to shut down the system after finishing its
work, or to exit so that a little script can detect that the
job is done and then halt the system?

Best regards,
mcc






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [gentoo-user] Re: Fast checksumming of whole partitions
  2010-06-07 18:54       ` meino.cramer
@ 2010-06-11 22:41         ` Mick
  0 siblings, 0 replies; 25+ messages in thread
From: Mick @ 2010-06-11 22:41 UTC (permalink / raw
  To: gentoo-user


On Monday 07 June 2010 19:54:04 meino.cramer@gmx.de wrote:
> Andrea Conti <alyf@alyf.net> [10-06-07 20:28]:
> > > Does anyone has experiences with gparted?
> >
> > I have no experience with Parted Magic, but I have used a lot the
> > Gparted live CD (http://gparted.sourceforge.net/livecd.php). No idea on
> > how the two compare.
> >
> > As for gparted (which is a lot more than a gui for parted), I have used
> > it on ext4 a couple of times and it managed to complete the copy without
> > destroying anything, so I'd say it works. On the other hand, I copied
> > *lots* of ext3 and ntfs partitions, and it never failed me.
> >
> > BTW, by default gparted does *not* do sector-by-sector partition copies.
> > In my opinion this is a much better approach if you do not need a
> > bit-exact copy of the original (e.g. if you're doing forensics or
> > debugging filesystems), but in the end it's up to you.
> >
> > andrea
> 
> Hi Andreas,
> 
> do I understand right here:
> Instead of using dcfldd or simply dd to copy one disk to
> another for backup reasons it is much better/faster to use gparted
> to simply copy all partitions from sda to sdb using sda1=>sdb1,
> sda2=>sdb2,... and so forth?
> Is there something like a "batch job" or scripting interface
> for gparted so that I can give gparted the complete copy job
> once, go to sleep and next moring the copy is done?
> Is gparted able to shut down the system after finishing its
> work or to end itsself so a little script can detect the
> job is done and halt the system then?

Am I being old-fashioned, or is there anything wrong with rsync (if not 
tar/star) for this purpose?  Make sure to add the relevant option for sparse 
files and only bits and bytes with data will be copied over.  Therefore it 
will be faster than dd at any rate.
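
A hedged sketch of such a per-file copy. With rsync the usual options would be -a (archive: times, permissions, ownership), -H (hard links) and -S (sparse files). The example below swaps in a tar pipe instead, simply because tar is more likely to be installed everywhere; the temporary directories stand in for the two mounted partitions and are assumptions for illustration only.

```shell
#!/bin/sh
# Temporary directories stand in for the two mounted partitions (assumption).
src_dir=$(mktemp -d)
dst_dir=$(mktemp -d)
echo "hello" > "$src_dir/a.txt"
touch -t 200001011200 "$src_dir/a.txt"   # give the file an old mtime
chmod 640 "$src_dir/a.txt"

# -S stores sparse files efficiently (GNU tar); -p restores permissions on
# extraction. Modification times are restored by default.
tar -C "$src_dir" -cSf - . | tar -C "$dst_dir" -xpf -

# Both listings should show the same mode and timestamp.
ls -l "$src_dir/a.txt" "$dst_dir/a.txt"
```

Either way only the files are copied, not the free space, which is where the time saving over a raw dd copy of a partly empty partition comes from.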
-- 
Regards,
Mick


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2010-06-11 22:42 UTC | newest]

Thread overview: 25+ messages
2010-06-05  6:39 [gentoo-user] Fast checksumming of whole partitions meino.cramer
2010-06-05  7:19 ` [gentoo-user] " Nikos Chantziaras
2010-06-07 15:48   ` meino.cramer
2010-06-07 17:10     ` walt
2010-06-07 18:47       ` meino.cramer
2010-06-07 17:31     ` Andrea Conti
2010-06-07 18:54       ` meino.cramer
2010-06-11 22:41         ` Mick
2010-06-05  7:32 ` [gentoo-user] " Andrea Conti
2010-06-05 17:39 ` [gentoo-user] " 7v5w7go9ub0o
2010-06-05 19:23   ` meino.cramer
2010-06-05 20:11     ` Manuel Klemenz
2010-06-06 19:02       ` 7v5w7go9ub0o
2010-06-06 19:47         ` Joerg Schilling
2010-06-06 22:43           ` 7v5w7go9ub0o
2010-06-06 23:12             ` Joerg Schilling
2010-06-06 22:46         ` Neil Bothwick
2010-06-07  1:04           ` 7v5w7go9ub0o
2010-06-05 23:44     ` 7v5w7go9ub0o
2010-06-06 10:19       ` Andrea Conti
2010-06-06 16:55         ` Mick
2010-06-06 18:55         ` 7v5w7go9ub0o
2010-06-06 20:00           ` Mick
2010-06-06 20:45           ` Andrea Conti
2010-06-06 23:06             ` 7v5w7go9ub0o
