Cache configuration on SAMA5D2

pbugalski
Posts: 17
Joined: Thu Nov 29, 2018 4:28 pm

Cache configuration on SAMA5D2

Fri Oct 11, 2019 5:51 pm

Hi,

I'm trying to optimize code execution speed a little, and I've found some differences between the cache configuration used in the Atmel SDK and the one used in Linux.
The first difference is exclusive versus non-exclusive mode. As far as I understand, the Atmel SDK uses exclusive mode (https://github.com/atmelcorp/atmel-soft ... che_l2cc.c line 416):

	/* Set exclusive mode */
	l2cache_set_exclusive();
Default setting for L1 cache is non-exclusive, so it seems to not fit well.

It looks like linux4sam uses non-exclusive mode for both the L1 and L2 caches, and this configuration gives better efficiency.
So my question is: how should L1 and L2 be configured? Why does the code in the Atmel SDK use exclusive mode? Is it correct to have different modes in L1 and L2 cache?

Best Regards,
Piotr Bugalski
blue_z
Location: USA
Posts: 2007
Joined: Thu Apr 19, 2007 10:15 pm

Re: Cache configuration on SAMA5D2

Sat Oct 12, 2019 1:25 am

pbugalski wrote: I'm trying to optimize code execution speed a little ...
You know that algorithm selection and good coding would make the most significant performance contributions, right?

To be clear you are referring to CPU cache.

pbugalski wrote: First difference is in exclusive/non-exclusive mode.
...
Is it correct to have different modes in L1 and L2 cache?
Exclusion/inclusion policy is not a mode assigned to individual levels of CPU cache but rather applies to the entire cache.
The policy describes how each level of the cache maintains data with respect to the other cache level(s).
Study "CPU cache: Exclusive versus inclusive" and "Cache inclusion policy".

For the SAMA5D2 this cache policy is programmable.
Note that the L2 Cache Controller is optional to the ARM Cortex-A5 processor at the IP level as well as at run-time.
Hence the L1 cache is described in the Cortex-A5 TRM while the L2 Cache Controller is described in the Microchip SoC datasheet.
The (same) exclusion/inclusion policy must be configured in both the Cortex-A5 processor and in the L2 Cache Controller.
The exclusion/inclusion policy only has meaning when the L2 cache has been enabled (along with L1).
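
As a rough illustration of the matching requirement, the sketch below compares the exclusive-cache bit on each side. The bit positions (ACTLR bit 7 on the Cortex-A5, Auxiliary Control Register bit 12 in the L2 cache controller) and all macro and function names are assumptions made for illustration, not copied from the Softpack or the kernel; verify them against the TRM and the datasheet. The function only decodes values that have already been read, so it compiles on a host as well as on the target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; check the Cortex-A5 TRM and the SoC datasheet. */
#define ACTLR_EXCL     (1u << 7)   /* CPU side: exclusive L1/L2 caches */
#define L2CC_ACR_EXCC  (1u << 12)  /* L2CC side: exclusive cache configuration */

/* Hypothetical helper: true when both sides agree on the policy. */
bool cache_policy_matches(uint32_t actlr, uint32_t l2cc_acr)
{
    bool l1_exclusive = (actlr & ACTLR_EXCL) != 0;
    bool l2_exclusive = (l2cc_acr & L2CC_ACR_EXCC) != 0;

    return l1_exclusive == l2_exclusive;
}
```

A mismatch reported by such a check would only matter when the L2 cache is actually enabled.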

pbugalski wrote: Default setting for L1 cache is non-exclusive, so it seems to not fit well.
That last half makes no sense.

I'm quite sure that, with full knowledge of CPU cache sizes and replacement policies, programs can be written to demonstrate the superior or inferior performance of either the exclusive or the inclusive cache policy.

Regards
pbugalski
Posts: 17
Joined: Thu Nov 29, 2018 4:28 pm

Re: Cache configuration on SAMA5D2

Sat Oct 12, 2019 2:08 am

blue_z wrote:
Sat Oct 12, 2019 1:25 am
pbugalski wrote: I'm trying to optimize code execution speed a little ...
You know that algorithm selection and good coding would make the most significant performance contributions, right?

I'm comparing efficiency using the same application with different cache settings, so the choice of algorithm is not relevant here.

blue_z wrote:
Sat Oct 12, 2019 1:25 am
pbugalski wrote: First difference is in exclusive/non-exclusive mode.
...
Is it correct to have different modes in L1 and L2 cache?
Exclusion/inclusion policy is not a mode assigned to individual levels of CPU cache but rather applies to the entire cache.
The policy describes how each level of the cache maintains data with respect to the other cache level(s).
Study "CPU cache: Exclusive versus inclusive" and "Cache inclusion policy".

For the SAMA5D2 this cache policy is programmable.
Note that the L2 Cache Controller is optional to the ARM Cortex-A5 processor at the IP level as well as at run-time.
Hence the L1 cache is described in the Cortex-A5 TRM while the L2 Cache Controller is described in the Microchip SoC datasheet.
The (same) exclusion/inclusion policy must be defined in both the Cortex-A5 processor and in the L2 Cache Controller.
The exclusion/inclusion policy only has meaning when the L2 cache has been enabled (along with L1).

It was also my guess that policy has to be set the same way in both caches. It also means atmel-software-package has an issue in its cache configuration.

blue_z wrote:
Sat Oct 12, 2019 1:25 am

pbugalski wrote: Default setting for L1 cache is non-exclusive, so it seems to not fit well.
That last half makes no sense.

I'm quite sure, that with full knowledge of cache sizes and replacement policies, programs can be written to demonstrate the superior or inferior performance of either exclusion or inclusion cache policy.

Regards

I tried to say exactly what you have commented above: L1, with its default non-exclusive setup, doesn't fit L2, which atmel-software-package sets to exclusive.

I'm not trying to find the best or worst application to show how the cache works. The final application runs on Linux, and I see much worse efficiency when the cache setup from atmel-software-package is used.
It looks like Linux runs much faster when the non-exclusive policy is used, but I'm wondering why exclusive mode was chosen in atmel-software-package. What was the reason to do so?
Please have a look at l2cache_l2cc.c file (https://github.com/atmelcorp/atmel-soft ... che_l2cc.c), l2cc_configure function (line 388-420) doesn't even have an option to choose non-exclusive mode.
blue_z
Location: USA
Posts: 2007
Joined: Thu Apr 19, 2007 10:15 pm

Re: Cache configuration on SAMA5D2

Tue Oct 15, 2019 2:06 am

pbugalski wrote: It was also my guess that policy has to be set the same way in both caches.
There's no need to guess. It is explicitly stated in the ARM Cortex-A5 TRM, section 8.1.7.

pbugalski wrote: It also means atmel-software-package has an issue in its cache configuration.
...
I tried to say exactly what you have commented above: L1, with its default non-exclusive setup, doesn't fit L2, which atmel-software-package sets to exclusive.
Rather than "fit", a more appropriate word is "match".

Have you actually inspected the registers to confirm that L1 and L2 caches are enabled and have mismatched modes at runtime?
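
A minimal sketch of what such a runtime check could decode is shown below. It assumes the raw values have already been read on the target: SCTLR and ACTLR through CP15 (`mrc p15, 0, <Rt>, c1, c0, 0` and `mrc p15, 0, <Rt>, c1, c0, 1`), and the L2 cache controller's Control and Auxiliary Control registers through its memory-mapped interface. The struct, the function and the macro names are hypothetical, and the EXCL/EXCC bit positions are assumptions to be checked against the TRM and the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_C        (1u << 2)   /* L1 data cache enable */
#define SCTLR_I        (1u << 12)  /* L1 instruction cache enable */
#define ACTLR_EXCL     (1u << 7)   /* assumed: exclusive L1/L2 caches */
#define L2CC_CR_L2CEN  (1u << 0)   /* L2 cache controller enable */
#define L2CC_ACR_EXCC  (1u << 12)  /* assumed: exclusive cache configuration */

/* Hypothetical summary of the decoded cache configuration. */
struct cache_state {
    bool l1_dcache_on;
    bool l1_icache_on;
    bool l2_on;
    bool policy_mismatch; /* only set when L1 data cache and L2 are both on */
};

struct cache_state decode_cache_state(uint32_t sctlr, uint32_t actlr,
                                      uint32_t l2cc_cr, uint32_t l2cc_acr)
{
    struct cache_state s;
    bool l1_excl = (actlr & ACTLR_EXCL) != 0;
    bool l2_excl = (l2cc_acr & L2CC_ACR_EXCC) != 0;

    s.l1_dcache_on = (sctlr & SCTLR_C) != 0;
    s.l1_icache_on = (sctlr & SCTLR_I) != 0;
    s.l2_on = (l2cc_cr & L2CC_CR_L2CEN) != 0;
    /* The policy has no meaning unless both cache levels are enabled. */
    s.policy_mismatch = s.l1_dcache_on && s.l2_on && (l1_excl != l2_excl);

    return s;
}
```

Printing the four fields from the bootloader just before the jump to the kernel would separate the "L2 not enabled" case from the "modes conflict" case.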

pbugalski wrote: What was the reason to do so?
You'll have to ask Microchip/Atmel for that type of question.

pbugalski wrote: Please have a look at l2cache_l2cc.c file (https://github.com/atmelcorp/atmel-soft ... che_l2cc.c), l2cc_configure function (line 388-420) doesn't even have an option to choose non-exclusive mode.
Just because a routine exists does not mean that it is executed.

Your questions seem to be based on code inspection rather than actual runtime verification of the HW configuration.
IOW the performance difference you observe could also be explained by the L2 cache not being enabled by the Softpack (which I vaguely recall being the case at one point in time) rather than by mismatched/conflicting modes.

My code inspection (today) indicates that L2 cache is only enabled in the Softpack by the low_power_mode example.

Regards
pbugalski
Posts: 17
Joined: Thu Nov 29, 2018 4:28 pm

Re: Cache configuration on SAMA5D2

Tue Oct 15, 2019 7:35 am

blue_z wrote:
Tue Oct 15, 2019 2:06 am
There's no need to guess. It is explicitly stated in the ARM Cortex-A5 TRM, section 8.1.7.
Rather than "fit", a more appropriate word is "match".
Have you actually inspected the registers to confirm that L1 and L2 caches are enabled and have mismatched modes at runtime?
I've inspected the register values, read the code and run some benchmarks. But code I'm using is not pure softpack.
I've been working on a project for about three years, and we used the Atmel SDK as the basis for our application. During that time there were a lot of API changes in the Atmel SDK / softpack, so we don't use the softpack as it is available now, but we have kept a lot of code concepts from it.
In particular, the L2 cache configuration we are using was copied from softpack. Unfortunately it has the disadvantage I've mentioned.
Thank you for the language improvements. Unfortunately I'm not a native English speaker, and I apologize for the mistakes I make. I hope you can understand my question despite the imperfect language.

blue_z wrote:
Tue Oct 15, 2019 2:06 am
You'll have to ask Microchip/Atmel for that type of question.

Just because a routine exists does not mean that it is executed.
Your questions seem to be based on code inspection rather than actual runtime verification of the HW configuration.
IOW the performance difference you observe could also be explained by the L2 cache not being enabled by the Softpack (which I vaguely recall being the case at one point in time) rather than by mismatched/conflicting modes.

My code inspection (today) indicates that L2 cache is only enabled in the Softpack by the low_power_mode example.

Regards
You are perfectly right that the sama5d2 code uses l2cc only in low power mode, but as I mentioned earlier, I'm using softpack as a part of a bit bigger project. We need to use the L2 cache to run Linux at a reasonable speed.

I'm not asking about the L2 and L1 cache configuration in order to point out issues in dead code. At the moment I'm trying to find why Linux started with bootloader written using softpack code is less effective than the same Linux used with at91bootstrap. I've identified L1-L2 cache misconfiguration and also different mode (exclusive vs non-exclusive). I thought it better to ask why, for example, exclusive mode was used. Maybe there was some important concept behind it that I'm just not aware of?

I don't have access to the benchmarks at the moment, but I can send results later if you are interested. Generally when Linux is running with L2 cache from softpack, standard C memcpy bandwidth is two times lower than when configuration from at91bootstrap is used. In my case it's an important factor, especially because the customer application is written in Java and uses a huge amount of memory.
And just to clarify the benchmark method: I'm using tinymembench (https://github.com/ssvb/tinymembench) and some other standard Linux tools available in Yocto.
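
For readers who want a quick cross-check without tinymembench, a rough memcpy bandwidth measurement can be sketched as below. This is not how tinymembench measures; it is a simplified illustration using only the standard C library, and the function name and parameters are hypothetical. The buffer should be much larger than the L2 cache for the result to reflect memory bandwidth rather than cache bandwidth.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Copies buf_bytes from one heap buffer to another, repeats times, and
 * returns the approximate bandwidth in MiB/s based on CPU time.
 * Returns a negative value on bad arguments, allocation failure, or when
 * the run was too short for clock() to measure.
 */
double memcpy_bandwidth_mib_s(size_t buf_bytes, int repeats)
{
    if (buf_bytes == 0 || repeats <= 0)
        return -1.0;

    unsigned char *src = malloc(buf_bytes);
    unsigned char *dst = malloc(buf_bytes);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }

    /* Touch both buffers first so page faults stay out of the timed loop. */
    memset(src, 0xA5, buf_bytes);
    memset(dst, 0x00, buf_bytes);

    /* Call through a volatile pointer so the copies cannot be optimized away. */
    void *(*volatile copy)(void *, const void *, size_t) = memcpy;

    clock_t start = clock();
    for (int i = 0; i < repeats; i++)
        copy(dst, src, buf_bytes);
    clock_t end = clock();

    free(src);
    free(dst);

    double seconds = (double)(end - start) / (double)CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return -1.0;

    return ((double)buf_bytes * (double)repeats) / (1024.0 * 1024.0) / seconds;
}
```

Running the same binary under both bootloaders, for example with a 16 MiB buffer and 16 repeats, gives a coarse comparison of the two cache configurations.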

Best Regards
blue_z
Location: USA
Posts: 2007
Joined: Thu Apr 19, 2007 10:15 pm

Re: Cache configuration on SAMA5D2

Wed Oct 16, 2019 2:44 am

pbugalski wrote: But code I'm using is not pure softpack.
...
..., but as I mentioned earlier, I'm using softpack as a part of a bit bigger project.
...
At the moment I'm trying to find why ...
Bad grammar and spelling do not bother me as much as misleading/extraneous questions (such as your two prior posts).

pbugalski wrote: I've identified L1-L2 cache misconfiguration and also different mode (exclusive vs non-exclusive).
Out of many Softpack examples, AFAIK there is only one example that actually does enable L2 cache and use "Exclusive mode".
It's an outlier example.
Despite the cache requirement stated in the TRM, you hold up that one example as if it were a standard and suitable for reuse in a boot program.
So be careful which configuration difference you call a misconfiguration.

pbugalski wrote: At the moment I'm trying to find why Linux started with bootloader written using softpack code is less effective than the same Linux used with at91bootstrap.
...
Generally when Linux is running with L2 cache from softpack, standard C memcpy bandwidth is two times lower than when configuration from at91bootstrap is used.
So finally we get something resembling the actual problem statement.

The L2 cache controller is treated as a peripheral by the Linux kernel, e.g. it is described in the Device Tree and has its own device driver.
On kernel startup peripherals are expected to be quiescent, i.e. not enabled.

Does your implementation of "softpack boot" conform to this requirement?

FYI the Linux kernel driver will not be able to properly configure the ARM L2 cache controller if that controller is already enabled.
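
The ordering constraint can be sketched as follows. My understanding of the PL310 documentation is that the Auxiliary Control Register may only be changed while the controller is disabled, so the sketch refuses to touch the policy bit when the enable bit is set. It works on a hypothetical in-memory copy of the two registers rather than on real hardware; the struct, the function and the macro names are made up for illustration, and the EXCC bit position is an assumption to be checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

#define L2CC_CR_L2CEN  (1u << 0)   /* L2 cache controller enable */
#define L2CC_ACR_EXCC  (1u << 12)  /* assumed: exclusive cache configuration */

/* Hypothetical in-memory copy of the two registers of interest. */
struct l2cc_shadow {
    uint32_t cr;   /* Control Register */
    uint32_t acr;  /* Auxiliary Control Register */
};

/* Changes the policy bit only while the controller is disabled. */
bool l2cc_set_exclusive(struct l2cc_shadow *regs, bool exclusive)
{
    if (regs->cr & L2CC_CR_L2CEN)
        return false; /* controller is live: leave the configuration alone */

    if (exclusive)
        regs->acr |= L2CC_ACR_EXCC;
    else
        regs->acr &= ~L2CC_ACR_EXCC;

    return true;
}
```

A boot program that leaves the controller disabled keeps this path open for whatever code configures the cache next.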

Regards
pbugalski
Posts: 17
Joined: Thu Nov 29, 2018 4:28 pm

Re: Cache configuration on SAMA5D2

Wed Oct 16, 2019 3:17 am

I don't want the Linux kernel to configure the L2 cache, and this is the reason why I need to configure it manually. When Linux runs in the non-secure world it has no direct access to the L2CC-PL310, so the bootloader, or whatever starts the kernel, has to handle that.
It looks like the Linux driver for the L2 cache can work correctly with a pre-configured L2 cache, so the only problem is how to configure it to get efficiency similar to what at91bootstrap achieves.

Regards
blue_z
Location: USA
Posts: 2007
Joined: Thu Apr 19, 2007 10:15 pm

Re: Cache configuration on SAMA5D2

Thu Oct 17, 2019 12:08 am

Seems like you continue to provide salient information in a piecemeal manner.
Is your system using ARM TrustZone?
pbugalski
Posts: 17
Joined: Thu Nov 29, 2018 4:28 pm

Re: Cache configuration on SAMA5D2

Thu Oct 17, 2019 7:49 am

I didn't consider TrustZone usage to be important information. The Linux kernel will ultimately run in the non-secure world, but I don't see any connection between TrustZone and the cache configuration issues.
Unfortunately I'm not authorized to publish all the details of this project. It's not highly secret, but it is still a commercial product :(
