On Saturday, 14 October 2023 04:21:23 BST Dale wrote:
> Frank Steinmetzger wrote:
> > Am Thu, Oct 12, 2023 at 08:35:21PM -0500 schrieb Dale:
> >> Frank Steinmetzger wrote:
> >>> Am Thu, Oct 12, 2023 at 10:44:39PM +0100 schrieb Michael:
> >>>> Why don't you test throughput without encryption to confirm your
> >>>> assumption?
> >>> 
> >>> What does `cryptsetup benchmark` say? I used to use a Celeron G1840 in my
> >>> NAS, which is Intel Haswell without AES_NI. It was able to do ~ 150 MB/s
> >>> raw encryption throughput when transferring to or from a LUKS’ed image
> >>> in a ramdisk, so almost 150 % of gigabit ethernet speed.
> >> 
> >> […]
> >> I've never used that benchmark.  Didn't know it exists.  These are the
> >> results.  Keep in mind, fireball is my main rig.  The FX-8350 thingy. 
> >> The NAS is currently the old 770T system.  Sometimes it is an old Dell
> >> Inspiron but not this time.  ;-)
> >> 
> >> root@fireball / # cryptsetup benchmark
> >> […]
> >> #     Algorithm |       Key |      Encryption |      Decryption
> >>         aes-cbc        128b        63.8 MiB/s        51.4 MiB/s
> >>     serpent-cbc        128b        90.9 MiB/s       307.6 MiB/s
> >>     twofish-cbc        128b       200.4 MiB/s       218.4 MiB/s
> >>         aes-cbc        256b        54.6 MiB/s        37.5 MiB/s
> >>     serpent-cbc        256b        90.4 MiB/s       302.6 MiB/s
> >>     twofish-cbc        256b       198.2 MiB/s       216.7 MiB/s
> >>         aes-xts        256b        68.0 MiB/s        45.0 MiB/s
> >>     serpent-xts        256b       231.9 MiB/s       227.6 MiB/s
> >>     twofish-xts        256b       191.8 MiB/s       163.1 MiB/s
> >>         aes-xts        512b        42.4 MiB/s        18.9 MiB/s
> >>     serpent-xts        512b       100.9 MiB/s       124.6 MiB/s
> >>     twofish-xts        512b       154.8 MiB/s       173.3 MiB/s
> >> root@fireball / #
> > 
> > Phew, this looks veeeery slow. As you can clearly see, this is not enough
> > to even saturate Gbit ethernet. Unfortunately, I don’t have any benchmark
> > data left over from the mentioned celeron.
> > (Perhaps that’s why the industry chose to implement AES in hardware,
> > because it was the slowest of the bunch.)
> > 
> > It looks like there is no hardware acceleration involved. But according to
> > https://en.wikipedia.org/wiki/List_of_AMD_FX_processors#Piledriver-based
> > and https://www.cpu-world.com/CPUs/Bulldozer/AMD-FX-Series%20FX-8350.html
> > it has the extension. I’d say something is amiss in your kernel.

Yes, I also think AES_NI has not been enabled in Dale's kernel config.
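
A quick way to check both halves of this is sketched below. It is only a sketch: the function names are made up for illustration, and it assumes the usual Linux locations /proc/cpuinfo and /proc/crypto. The first check asks whether the CPU advertises the "aes" flag, the second whether the running kernel has registered any AES-NI backed cipher driver.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical. The file to inspect can be
# passed as an argument; the defaults are the usual Linux locations.

# Does the CPU advertise the AES instructions ("aes" in the flags line)?
cpu_has_aes() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -qw aes
}

# Has the running kernel registered an AES-NI backed cipher?
# Such drivers carry "aesni" in their driver name, e.g. xts-aes-aesni.
kernel_has_aesni() {
    grep -q '^driver.*aesni' "${1:-/proc/crypto}" 2>/dev/null
}

if cpu_has_aes; then
    echo "CPU: aes flag present"
else
    echo "CPU: no aes flag"
fi

if kernel_has_aesni; then
    echo "kernel: aesni driver registered"
else
    echo "kernel: no aesni driver registered"
fi
```

If the CPU has the flag but no aesni driver shows up, the kernel option to look at is CONFIG_CRYPTO_AES_NI_INTEL (built in, or as a module which then needs to be loaded).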

I just ran 'cryptsetup benchmark' on an A10-7850K APU (Kaveri with Steamroller 
cores, as opposed to the two years older FX-8350 Vishera with Piledriver cores) 
and aes-xts fares much better:

# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1      1028015 iterations per second for 256-bit key
PBKDF2-sha256    1464491 iterations per second for 256-bit key
PBKDF2-sha512    1123875 iterations per second for 256-bit key
PBKDF2-ripemd160  708497 iterations per second for 256-bit key
PBKDF2-whirlpool  389515 iterations per second for 256-bit key
argon2i       5 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      5 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b       586.6 MiB/s      2169.8 MiB/s
    serpent-cbc        128b        88.6 MiB/s       330.6 MiB/s
    twofish-cbc        128b       203.1 MiB/s       277.6 MiB/s
        aes-cbc        256b       443.0 MiB/s      1712.5 MiB/s
    serpent-cbc        256b        89.7 MiB/s       329.8 MiB/s
    twofish-cbc        256b       204.0 MiB/s       277.3 MiB/s
        aes-xts        256b      1840.9 MiB/s      1857.6 MiB/s <==
    serpent-xts        256b       288.4 MiB/s       299.6 MiB/s
    twofish-xts        256b       240.6 MiB/s       252.9 MiB/s
        aes-xts        512b      1459.3 MiB/s      1474.2 MiB/s <==
    serpent-xts        512b       291.6 MiB/s       299.0 MiB/s
    twofish-xts        512b       242.8 MiB/s       252.7 MiB/s


Either the 256 or the 512 bit aes-xts result would comfortably fill up a 1Gbps 
pipe, which tops out at roughly 119 MiB/s before protocol overhead.  Without 
AES_NI the performance on this CPU is ~10 times slower.  I expect the FX-8350 
would produce comparable results once the kernel crypto options are sorted.
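
To put the benchmark figures above next to the network speed, here is a small arithmetic sketch. The helper name is made up, and it compares against the raw 1 Gbit/s line rate (125,000,000 bytes/s), ignoring Ethernet, TCP and NFS overhead, so a real link carries somewhat less.

```shell
#!/bin/sh
# Express a throughput given in MiB/s as a percentage of the raw 1 Gbit/s
# line rate.  1 Gbit/s = 125,000,000 bytes/s; 1 MiB = 1,048,576 bytes.
# Hypothetical helper name; protocol overhead is deliberately ignored.
pct_of_gbit() {
    LC_ALL=C awk -v mibs="$1" \
        'BEGIN { printf "%.0f\n", mibs * 1048576 / 125000000 * 100 }'
}

pct_of_gbit 68.0      # aes-xts 256b encryption, FX-8350 as posted  -> 57
pct_of_gbit 1840.9    # aes-xts 256b encryption, A10-7850K          -> 1544
```

Anything well below 100 means the cipher, not the wire, is the limit.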


> >>> Ah right, you use NFS. If not, I’d have suggested not to use rsync over
> >>> ssh, because that would indeed introduce a lot of encryption overhead.
> >> 
> >> I thought nfs was the proper way.  I use ssh and I use rsync,
> >> separately.  Didn't know they can be used together tho. 
> > 
> > When you do `rsync -ai source host:/path/to/destination/`, you use ssh for
> > transport.
> 
> Well, I may be doing this all wrong.  First, I ssh into the NAS box.  I
> do the decrypt stuff and mount the LV in its proper place.  Then I
> switch back to a Konsole tab for Fireball.  I mount the NAS box to a
> mount point on Fireball with nfs thingy.  From there I use rsync to copy
> from one point to the other.  I mostly use this command and options for
> restore. 
> 
> 
> rsync -uivr --progress  /mnt/1/ /mnt/2/
> 
> 
> Sometimes that varies a bit depending on exactly what I am copying and
> from where to where.  Example, when updating my backups, I include the
> --delete option because if I delete a file, I almost always want it gone
> on the backup too.  I also shortened the source and target.  That should
> give you a good idea how wrong I'm doing this tho.  ROFL  :/ 

Perhaps you have configured the rsync options to suit your backup needs, but 
why have you chosen '-u'?  Do you expect to have files in your NAS which are 
*newer* than your Fireball fs?

Wouldn't '-a' be more appropriate, since it implies '-r' and also preserves 
permissions, ownership and timestamps?  You could add '-A' and '-X' to include 
any ACLs and extended attributes, and '-H' to preserve hard links rather than 
making separate copies.
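
As an illustration of the options just mentioned, a minimal wrapper might look like the following. The function name is hypothetical and not from this thread, and the paths in the usage line are the shortened ones quoted above; adjust to taste.

```shell
#!/bin/sh
# Hypothetical helper: mirror the contents of SRC into DST.
#   -a  archive mode: recursive, keeps permissions, times, symlinks and,
#       where permitted, owners and groups
#   -H  preserve hard links
#   -i  itemize what was changed
# Any further rsync options (e.g. -A -X, --delete, --progress, -n) can be
# given after DST and are passed through unchanged.
mirror_tree() {
    src=$1
    dst=$2
    shift 2
    rsync -aHi "$@" "$src"/ "$dst"/
}
```

For example, `mirror_tree /mnt/1 /mnt/2 -A -X --progress` for a restore; adding `-n` first shows what would be transferred without copying anything.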


> >>>>> I still think encryption is slowing it down some.  As you say tho,
> >>>>> ethernet isn't helping which is why I may look into other options
> >>>>> later,
> >>>>> faster ethernet or fiber if I can find something cheap enough.
> > 
> > What do you mean with “ethernet is not helping”? As we could see above,
> > your AES throughput cannot keep up with Gbit.
> 
> Well, I was thinking the ethernet might be slowing things at times.  I'm
> not sure on that tho.  I do know the CPU fan ramps up to a good speed
> for a while then goes back to basically what it is when idle.  After a
> short time, it speeds up again and repeats.  It has done this throughout
> the whole restore process. 

Your original hunch was correct - you need to enable hardware acceleration for 
crypto in your kernel.  The 1Gbps network link is not saturated as things 
stand.  The CPU fan speeding up and slowing down could be caused by thermal 
hysteresis in the fan control circuit, or by the CPU idling while it waits for 
the receiving end to process data which has already been sent and is buffered, 
waiting to be written to disk. 


> >> That may explain why I don't see as much load on my main rig then.  It
> >> has the extra instructions.  I'm not sure if the 770T does or not.

With hardware acceleration the A10-7850K APU shows between 10-25% CPU load in 
GkrellM during the cryptsetup benchmark.


> > The mobo should have no influence on crypto performance.
> > 
> >>   It
> >> has Ubuntu so I can't run the Gentoo CPU flag thingy.  So, I checked
> >> /proc/cpuinfo and it doesn't show it on the 770T but my main rig
> >> Fireball does.  So, it seems Fireball has it, older 770T NAS box does
> >> not.  That could be a bottleneck.  Maybe. 
> > 
> > But interestingly, the NAS box shows higher AES throughput than fireball,
> > probably through raw performance. (What processor does it have?)
> 
> That's interesting.  I thought that too but thought maybe I was reading
> the results wrong.  It has a Phenom II X4 955.  Keep in mind, the NAS
> box has Ubuntu on it.  It's not a kernel I built or configured.  If you
> think it is missing something, it just may be. Building a new kernel
> could get interesting tho.  May need a hammer.  o_O 
> 
> So to make sure I get this, you're saying the old 770T NAS box is
> performing better on encryption than my slightly newer rig with aes
> support on the CPU?  That would be interesting.  If so, that 770T may be
> a dedicated NAS box thingy.  Once I get done building a new rig and all.

If you build a new box, you can retire the Phenom and use the FX-8350 box as a 
NAS server for your backups, *after* you have enabled hardware crypto 
acceleration in its kernel.


> Just a FYI.  My restore from backup has finished.  To test anything, I
> may have to get a bag of tricks.  I guess I could find a large file,
> several GBs in size, and copy, delete, copy, delete etc to get some
> results.  I'm about to connect some external hard drives to restore some
> smaller directories, my smaller directories are still quite large tho. 
> Those drives attach directly to my system, no ethernet.  I'm curious to
> see if the data throughput behaves the same way.  I seem to recall in
> another thread that it does.

In the first instance fix your kernel and reboot before you test anything 
else.  You should see a considerable improvement, as far as the receiving end 
allows.