On Saturday, 14 October 2023 04:21:23 BST Dale wrote:
> Frank Steinmetzger wrote:
> > On Thu, Oct 12, 2023 at 08:35:21PM -0500, Dale wrote:
> >> Frank Steinmetzger wrote:
> >>> On Thu, Oct 12, 2023 at 10:44:39PM +0100, Michael wrote:
> >>>> Why don't you test throughput without encryption to confirm your
> >>>> assumption?
> >>>
> >>> What does `cryptsetup benchmark` say? I used to use a Celeron G1840 in
> >>> my NAS, which is Intel Haswell without AES_NI. It was able to do
> >>> ~ 150 MB/s raw encryption throughput when transferring to or from a
> >>> LUKS’ed image in a ramdisk, so almost 150 % of gigabit ethernet speed.
> >>
> >> […]
> >> I've never used that benchmark. Didn't know it exists. This is the
> >> results. Keep in mind, fireball is my main rig. The FX-8350 thingy.
> >> The NAS is currently the old 770T system. Sometimes it is a old Dell
> >> Inspiron but not this time. ;-)
> >>
> >> root@fireball / # cryptsetup benchmark
> >> […]
> >> #  Algorithm |  Key |   Encryption |   Decryption
> >>      aes-cbc   128b     63.8 MiB/s     51.4 MiB/s
> >>  serpent-cbc   128b     90.9 MiB/s    307.6 MiB/s
> >>  twofish-cbc   128b    200.4 MiB/s    218.4 MiB/s
> >>      aes-cbc   256b     54.6 MiB/s     37.5 MiB/s
> >>  serpent-cbc   256b     90.4 MiB/s    302.6 MiB/s
> >>  twofish-cbc   256b    198.2 MiB/s    216.7 MiB/s
> >>      aes-xts   256b     68.0 MiB/s     45.0 MiB/s
> >>  serpent-xts   256b    231.9 MiB/s    227.6 MiB/s
> >>  twofish-xts   256b    191.8 MiB/s    163.1 MiB/s
> >>      aes-xts   512b     42.4 MiB/s     18.9 MiB/s
> >>  serpent-xts   512b    100.9 MiB/s    124.6 MiB/s
> >>  twofish-xts   512b    154.8 MiB/s    173.3 MiB/s
> >> root@fireball / #
> >
> > Phew, this looks veeeery slow. As you can clearly see, this is not enough
> > to even saturate Gbit ethernet. Unfortunately, I don’t have any benchmark
> > data left over from the mentioned celeron.
> > (Perhaps that’s why the industry chose to implement AES in hardware,
> > because it was the slowest of the bunch.)
> >
> > It looks like there is no hardware acceleration involved.
> > But according to
> > https://en.wikipedia.org/wiki/List_of_AMD_FX_processors#Piledriver-based
> > and https://www.cpu-world.com/CPUs/Bulldozer/AMD-FX-Series%20FX-8350.html
> > it has the extension. I’d say something is amiss in your kernel.

Yes, I also think AES_NI has not been enabled in Dale's kernel config. I just
ran 'cryptsetup benchmark' on an A10-7850K APU (Kaveri Steamroller core, as
opposed to the 2 year older FX-8350 Vishera Piledriver core) and aes-xts fares
much better:

# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1      1028015 iterations per second for 256-bit key
PBKDF2-sha256    1464491 iterations per second for 256-bit key
PBKDF2-sha512    1123875 iterations per second for 256-bit key
PBKDF2-ripemd160  708497 iterations per second for 256-bit key
PBKDF2-whirlpool  389515 iterations per second for 256-bit key
argon2i       5 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      5 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#  Algorithm |  Key |   Encryption |   Decryption
     aes-cbc   128b    586.6 MiB/s   2169.8 MiB/s
 serpent-cbc   128b     88.6 MiB/s    330.6 MiB/s
 twofish-cbc   128b    203.1 MiB/s    277.6 MiB/s
     aes-cbc   256b    443.0 MiB/s   1712.5 MiB/s
 serpent-cbc   256b     89.7 MiB/s    329.8 MiB/s
 twofish-cbc   256b    204.0 MiB/s    277.3 MiB/s
     aes-xts   256b   1840.9 MiB/s   1857.6 MiB/s  <==
 serpent-xts   256b    288.4 MiB/s    299.6 MiB/s
 twofish-xts   256b    240.6 MiB/s    252.9 MiB/s
     aes-xts   512b   1459.3 MiB/s   1474.2 MiB/s  <==
 serpent-xts   512b    291.6 MiB/s    299.0 MiB/s
 twofish-xts   512b    242.8 MiB/s    252.7 MiB/s

Either the 256 or the 512 bit aes-xts performance would fill up a 1Gbps pipe.
Without AES_NI the performance on this CPU is ~10 times slower. I expect the
FX-8350 would produce comparable results once the kernel crypto options are
sorted.

> >>> Ah right, you use NFS. If not, I’d have suggested not to use rsync over
> >>> ssh, because that would indeed introduce a lot of encryption overhead.
> >>
> >> I thought nfs was the proper way. I use ssh and I use rsync,
> >> separately. Didn't know they can be used together tho.
> >
> > When you do `rsync -ai source host:/path/to/destination/`, you use ssh for
> > transport.
>
> Well, I may be doing this all wrong. First, I ssh into the NAS box. I
> do the decrypt stuff and mount the LV in it's proper place. Then I
> switch back to a Konsole tab for Fireball. I mount the NAS box to a
> mount point on Fireball with nfs thingy. From there I use rsync to copy
> from one point to the other. I mostly use this command and options for
> restore.
>
>
> rsync -uivr --progress /mnt/1/ /mnt/2/
>
>
> Sometimes that varies a bit depending on exactly what I am copying and
> from where to where. Example, when updating my backups, I include the
> --delete option because if I delete a file, I almost always want it gone
> on the backup too. I also shortened the source and target. That should
> give you a good idea how wrong I'm doing this tho. ROFL :/

Perhaps you have configured the rsync options to suit your backup needs, but
why have you chosen '-u'? Do you expect to have files in your NAS which are
*newer* than your Fireball fs? Wouldn't '-a' be more appropriate? You could
add '-A' and '-X' to include any ACLs and extended attributes, and '-H' to
copy hard links as hard links rather than making separate copies.

> >>>>> I still think encryption is slowing it down some. As you say tho,
> >>>>> ethernet isn't helping which is why I may look into other options
> >>>>> later,
> >>>>> faster ethernet or fiber if I can find something cheap enough.
> >
> > What do you mean with “ethernet is not helping”? As we could see above,
> > your AES throughput cannot keep up with Gbit.
>
> Well, I was thinking the ethernet might be slowing things at times. I'm
> not sure on that tho. I do know the CPU fan ramps up to a good speed
> for a while then goes back to basically what it is when idle. After a
> short time, it speeds up again and repeats.
> It has done this throughout
> the whole restore process.

Your original hunch was correct - you need to enable hardware acceleration for
crypto in your kernel. The 1Gbps network link is not saturated as things
stand. The speeding up and slowing down of your CPU fan could be caused by
thermal hysteresis on the fan control circuit, or by the CPU waiting for the
receiving end to process data which has already been sent and is buffered,
waiting to be written to disk.

> >> That may explain why I don't see as much load on my main rig then. It
> >> has the extra instructions. I'm not sure if the 770T does or not.

With hardware acceleration the A10-7850K APU shows between 10-25% CPU load in
GkrellM during the cryptsetup benchmark.

> > The mobo should have no influence on crypto performance.
> >
> >> It
> >> has Ubuntu so I can't run the Gentoo CPU flag thingy. So, I checked
> >> /proc/cpuinfo and it doesn't show it on the 770T but my main rig
> >> Fireball does. So, it seems Fireball has it, older 770T NAS box does
> >> not. That could be a bottleneck. Maybe.
> >
> > But interestingly, the NAS box shows higher AES throughput than fireball,
> > probably through raw performance. (What processor does it have?)
>
> That's interesting. I thought that to but thought maybe I was reading
> the results wrong. It has a Phenom II X4 955. Keep in mind, the NAS
> box has Ubuntu on it. It's not a kernel I built or configured. If you
> think it is missing something, it just may be. Building a new kernel
> could get interesting tho. May need a hammer. o_O
>
> So to make sure I get this, you're saying the old 770T NAS box is
> performing better on encryption than my slightly newer rig with aes
> support on the CPU? That would be interesting. If so, that 770T may be
> a dedicated NAS box thingy. Once I get done building a new rig and all.

If you build a new box, you can retire the Phenom and use the FX-8350 box as a
NAS server for your backups, *after* you have configured encryption in its
kernel.

> Just a FYI.
> My restore from backup has finished. To test anything, I
> may have to get a bag of tricks. I guess I could find a large file,
> several GBs in size, and copy, delete, copy, delete etc to get some
> results. I'm about to connect some external hard drives to restore some
> smaller directories, my smaller directories are still quite large tho.
> Those drives attach directly to my system, no ethernet. I'm curious to
> see if the data throughput behaves the same way. I seem to recall in
> another thread that it does.

In the first instance fix your kernel and reboot before you test anything
else. You should see a considerable improvement, as far as the receiving end
allows.
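
After the reboot, something along these lines might help confirm whether the kernel is actually using the hardware. It is only a rough sketch: the function names (has_cpu_flag, has_aesni_driver, report) are made up for illustration, and it assumes the usual Linux layout of /proc/cpuinfo (a "flags" line listing "aes") and /proc/crypto ("driver" lines whose names contain "aesni", e.g. xts-aes-aesni). As far as I know the relevant kernel option is CONFIG_CRYPTO_AES_NI_INTEL, which is used on AMD CPUs too despite the name.

```shell
#!/bin/sh
# Sketch only: check for the CPU "aes" flag and for an AES-NI backed driver
# registered with the running kernel. All function names are hypothetical.

# has_cpu_flag FILE FLAG: succeeds if FLAG is listed on a "flags" line of FILE.
has_cpu_flag() {
    grep -Eq "^flags[[:space:]]*:.*[[:space:]]$2([[:space:]]|\$)" "$1" 2>/dev/null
}

# has_aesni_driver FILE: succeeds if FILE lists a driver whose name contains "aesni".
has_aesni_driver() {
    grep -Eq '^driver[[:space:]]*:.*aesni' "$1" 2>/dev/null
}

# report [CPUINFO] [CRYPTO]: print one line per check; defaults to the /proc files.
report() {
    cpuinfo=${1:-/proc/cpuinfo}
    crypto=${2:-/proc/crypto}
    if has_cpu_flag "$cpuinfo" aes; then
        echo "cpu: aes flag present"
    else
        echo "cpu: aes flag missing"
    fi
    if has_aesni_driver "$crypto"; then
        echo "kernel: aesni driver registered"
    else
        echo "kernel: no aesni driver registered"
    fi
}

report "$@"
```

If the first line reports the flag as present but the second finds no driver, that would point at the kernel config rather than the CPU. If the driver was built as a module it may only show up once the module is loaded, so treat a negative result as a hint and rerun 'cryptsetup benchmark' to be sure.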