Hardware cryptographic acceleration performance on SAMA5D27-SOM1

CleberPeter
Location: Brazil
Posts: 19
Joined: Tue May 14, 2019 7:57 pm

Wed Sep 16, 2020 2:06 pm

kernel version: 4.14.88
yocto branch: thud
openssl: 1.1.1d
cryptodev: 1.10

I'm following this guide: https://www.linux4sam.org/bin/view/Linu ... yptoConfig.

Below is the output from the openssl benchmark before enabling hardware acceleration:

time -v openssl speed -evp aes-128-cbc -elapsed -mr
	Command being timed: "openssl speed -evp aes-128-cbc -elapsed -mr"
	User time (seconds): 17.98
	System time (seconds): 0.03
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 18.06s

The output below, captured after running the command above, shows that hardware acceleration was not used.

cat /proc/interrupts

 27:          0  atmel-aic5  12 Level     atmel-sha
 28:          0  atmel-aic5   9 Level     atmel-aes
 49:          0  atmel-aic5  11 Level     atmel-tdes

Enabling acceleration:

modprobe cryptodev

Re-running the benchmark:

time -v openssl speed -evp aes-128-cbc -elapsed -mr
	Command being timed: "openssl speed -evp aes-128-cbc -elapsed -mr"
	User time (seconds): 0.30
	System time (seconds): 10.03
	Percent of CPU this job got: 56%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 18.28s

The proof that hardware acceleration was used is shown below:

cat /proc/interrupts

27:          0  atmel-aic5  12 Level     atmel-sha
28:        273  atmel-aic5   9 Level     atmel-aes
49:          0  atmel-aic5  11 Level     atmel-tdes
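
For anyone repeating this check, a minimal sketch of reading the counter before and after a run, so that the difference is attributable to the benchmark itself. The function name irq_count is made up for illustration, and the field positions assume the single-CPU /proc/interrupts layout shown above.

```shell
# Print the interrupt count for one named IRQ line (default: atmel-aes).
# Assumption: single-CPU layout, so field 2 is the count and the last
# field is the device name, as in the output quoted above.
irq_count() {
    name="${1:-atmel-aes}"
    file="${2:-/proc/interrupts}"
    awk -v n="$name" '$NF == n { print $2 }' "$file"
}

# Hypothetical usage around a benchmark run:
#   before=$(irq_count)
#   openssl speed -evp aes-128-cbc -elapsed
#   after=$(irq_count)
#   echo "atmel-aes interrupts during run: $((after - before))"
```

A non-zero difference only shows that the AES engine was exercised; it says nothing about throughput.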

There was a reduction in CPU usage from 100% to 56%.

Are these results consistent? Shouldn't cryptographic acceleration mean lower CPU usage?

Does anyone have a similar test for comparison?

I appreciate any comments.

blue_z
Location: USA
Posts: 2117
Joined: Thu Apr 19, 2007 10:15 pm

Re: Hardware cryptographic acceleration performance on SAMA5D27-SOM1

Thu Sep 17, 2020 9:44 pm

CleberPeter wrote: Are these results consistent?
Are you just taking one measurement?
I was taught that three samples are the minimum needed to conduct an experiment.

CleberPeter wrote: There was a reduction in CPU usage from 100% to 56%.
....
Shouldn't cryptographic acceleration mean lower CPU usage?
I can read your question in either of two ways, but neither makes sense (i.e. "lower" than what reference point?).
(A) You misunderstand the results, since clearly less CPU time is spent in user mode in the HW case versus the SW case.
(B) You have a preconceived notion, and expect a larger difference than what you got.

The times for the SW case confirm that the "benchmark" is a computationally intensive process: essentially 100% of CPU time is spent in user mode.
The times for the HW case seem to indicate that the "benchmark" has been transformed from a computationally intensive process into an I/O-intensive process: gross CPU time in user mode shrinks to about 2%, while gross CPU time in system mode rises to about 55% (presumably servicing the I/O requests). But there is also now about 43% of gross CPU time free and available for other processes.
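
Those percentages follow directly from the 'time' figures posted earlier. A small sketch of the arithmetic, using a made-up helper name (cpu_breakdown) and treating whatever is neither user nor system time as idle:

```shell
# Split elapsed time into user, system and idle shares.
# $1 = user seconds, $2 = system seconds, $3 = elapsed (wall clock) seconds
# Assumption: time not accounted as user or system is counted as idle.
cpu_breakdown() {
    awk -v u="$1" -v s="$2" -v e="$3" 'BEGIN {
        printf "user %.1f%%  system %.1f%%  idle %.1f%%\n", 100 * u / e, 100 * s / e, 100 * (e - u - s) / e
    }'
}

cpu_breakdown 17.98 0.03 18.06   # SW case from the first post
cpu_breakdown 0.30 10.03 18.28   # HW case → user 1.6%  system 54.9%  idle 43.5%
```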

So which numbers (and modes) are you choosing to compare for "lower CPU usage"?

CleberPeter wrote: I appreciate any comments.
Are you familiar with the old Indian fable of six blind men and the elephant?
The results you obtained from that command should not be used for quantitative comparisons.
If you do not understand what the 'openssl speed' command actually does, then comparing the 'time' results could be like the evaluation of one blind man.

Even the command options used are dubious choices.
Why is the verbose response of the 'time' command used?
Why is the '-mr' option used to produce "machine readable output"?

If you used the 'openssl speed' command to produce human-readable output, then you might realize that six (6) tests are actually performed.
Then you might realize that the software case is actually faster than the hardware case when encrypting small blocks.
Then you might realize that the software case is only slower than the hardware case when encrypting large blocks.

Bottom line:
The results you obtained so far confirm that you can utilize the crypto hardware from userspace.
Use the 'time' data with discernment since they are only gross numbers from one perspective.
Instead of

# time -v openssl speed -evp aes-128-cbc -elapsed -mr
try

# openssl speed -evp aes-128-cbc
several times for each SW/HW case.
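
A rough sketch of one way to collect those repeated samples. The helper name repeat_runs is hypothetical, and three runs is simply the minimum mentioned earlier in this post:

```shell
# Run a command N times, printing a label before each run.
# $1 = number of runs, remaining arguments = command to execute
repeat_runs() {
    runs="$1"; shift
    i=1
    while [ "$i" -le "$runs" ]; do
        echo "=== run $i of $runs ==="
        "$@"
        i=$((i + 1))
    done
}

# Usage, once without and once with the cryptodev module loaded:
#   repeat_runs 3 openssl speed -evp aes-128-cbc
```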


Regards

CleberPeter
Location: Brazil
Posts: 19
Joined: Tue May 14, 2019 7:57 pm

Re: Hardware cryptographic acceleration performance on SAMA5D27-SOM1

Mon Sep 21, 2020 2:45 pm

Thanks for your answer.

Indeed, I had not paid attention to the parameters used in the openssl speed command.

The command

openssl speed -evp aes-128-cbc

provides more suitable output for the intended analysis.

Without hardware acceleration:

openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 1581030 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 64 size blocks: 516529 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 140162 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 35774 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 8192 size blocks: 4502 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 16384 size blocks: 2251 aes-128-cbc's in 2.99s

With hardware acceleration:

modprobe cryptodev
openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 66981 aes-128-cbc's in 0.08s
Doing aes-128-cbc for 3s on 64 size blocks: 62641 aes-128-cbc's in 0.05s
Doing aes-128-cbc for 3s on 256 size blocks: 33821 aes-128-cbc's in 0.06s
Doing aes-128-cbc for 3s on 1024 size blocks: 26659 aes-128-cbc's in 0.02s
Doing aes-128-cbc for 3s on 8192 size blocks: 12776 aes-128-cbc's in 0.00s
Doing aes-128-cbc for 3s on 16384 size blocks: 8134 aes-128-cbc's in 0.00s

Conclusion:

The speed command will always produce high CPU usage, whether or not acceleration is enabled, so the CPU figures from 'time' are not a good basis for comparison. A more adequate approach to performance analysis is to compare, for each block size, how many times the algorithm can be executed in 3 seconds.
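
To make that comparison concrete, here is a rough sketch that turns the operation counts posted above into throughput and a HW/SW ratio. The helper name hw_sw_ratio is made up. It assumes every test really ran for the nominal 3 seconds of wall-clock time; without -elapsed, the "in 0.08s" style figures appear to be user CPU time rather than elapsed time, so they are ignored here.

```shell
# Read lines of "block_size sw_count hw_count" and print throughput and ratio.
# Assumption: each count was collected over a 3-second wall-clock window.
hw_sw_ratio() {
    awk '{ printf "%5d B: sw %.1f kB/s, hw %.1f kB/s, hw/sw=%.2f\n", $1, $1 * $2 / 3000, $1 * $3 / 3000, $3 / $2 }'
}

# Counts copied from the two runs above.
hw_sw_ratio <<'EOF'
16 1581030 66981
64 516529 62641
256 140162 33821
1024 35774 26659
8192 4502 12776
16384 2251 8134
EOF
```

With these numbers the ratio stays below 1 up to 1024-byte blocks and rises above 1 for 8192 and 16384 bytes, which matches blue_z's remark that software wins on small blocks and hardware only on large ones.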
