crypto_buffer_check() is in the hot path of /dev/crypto: it runs on every ioctl that does useful work. With the check removed, throughput with 32 threads increased by about 15% (312734 to 359896 ops/sec), and scalability from 1 to 32 threads improved from 18.8X to 21.2X. Single-threaded throughput improved by only about 2%, so the cost shows up mainly under concurrency. We need to make this check less costly and/or get it out of the hot path.
With default /dev/crypto driver:
------------------------------------
# ./pk11aesperf -n10000 -s8192 -S0
Overall Throughput:
===================
Finished 10000 ops in 602654745 nanosecs (602.65 ms)
Data Rate: 16593.25 ops/sec 135931897.19 bytes/second
Using: 1 children (each 10000 operations)
# ./pk11aesperf -c32 -n10000 -s8192 -S0
Overall Throughput:
===================
Finished 320000 ops in 1023233722 nanosecs (1023.23 ms)
Data Rate: 312734.03 ops/sec 2561917175.66 bytes/second
Using: 32 children (each 10000 operations)
With modified /dev/crypto driver (no crypto_buffer_check()):
------------------------------------------------------------
# ./pk11aesperf -n10000 -s8192 -S0
Overall Throughput:
===================
Finished 10000 ops in 590102111 nanosecs (590.10 ms)
Data Rate: 16946.22 ops/sec 138823431.33 bytes/second
Using: 1 children (each 10000 operations)
# ./pk11aesperf -c32 -n10000 -s8192 -S0
Overall Throughput:
===================
Finished 320000 ops in 889146488 nanosecs (889.15 ms)
Data Rate: 359895.70 ops/sec 2948265607.60 bytes/second
Using: 32 children (each 10000 operations)
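
The percentages and scalability factors quoted in the description can be re-derived from the "Data Rate" lines above. The short Python sketch below is only a convenience for checking that arithmetic; the variable and function names are illustrative and not part of pk11aesperf or the driver. Scalability is assumed here to mean the 32-child ops/sec divided by the 1-child ops/sec for the same driver.

```python
# Ops/sec figures copied from the pk11aesperf "Data Rate" lines above.
default_1 = 16593.25     # default driver, 1 child
default_32 = 312734.03   # default driver, 32 children
nocheck_1 = 16946.22     # no crypto_buffer_check(), 1 child
nocheck_32 = 359895.70   # no crypto_buffer_check(), 32 children


def pct_gain(before, after):
    """Percentage throughput increase going from 'before' to 'after'."""
    return (after - before) / before * 100.0


def scalability(single, multi):
    """Multi-child throughput as a multiple of single-child throughput."""
    return multi / single


gain_1 = pct_gain(default_1, nocheck_1)            # 1-child gain
gain_32 = pct_gain(default_32, nocheck_32)         # 32-child gain
scale_default = scalability(default_1, default_32)
scale_nocheck = scalability(nocheck_1, nocheck_32)

print(f"1-child gain:   {gain_1:.1f}%")
print(f"32-child gain:  {gain_32:.1f}%")
print(f"scalability:    {scale_default:.1f}X -> {scale_nocheck:.1f}X")
```

The 32-child gain comes out at roughly 15% and the scalability at roughly 18.8X versus 21.2X, matching the figures in the description.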