Understanding the RAM requirements

Hello.

The documentation gives a list of supported RAM modules, yet @mcbridematt says this in the “Where and what ECC RAM to buy” topic:

All ‘modern’ RAM modules (speed 2100MT/s or greater) should work, if in doubt, post a model number and/or link.

You can use non-ECC modules as well.

Moreover, he says this in the “Updated list of SSDs and DDR4s” topic:

Generally all SODIMMs and NVMe SSDs currently on the market should work without issue.
(There are some early/legacy SODIMM designs not supported, and one vendor who has put out a couple of ‘broken’ cards).

Kingston KSM32SED8/32HC is hard to get where I am, so I went with Samsung M471A4G43MB1-CTD (non-ECC) for now.

At the same time, I want to understand the technical aspect mentioned in the documentation.

Let’s start here:

When BL2/TF is compiled in debug mode, the raw card parameter will be printed out at boot:

Can I expect to see this output over USB-UART for most, or any, RAM stick of the same form factor?

If an unimplemented SO-DIMM fails to boot despite support for the reference design in BL2, we can usually determine the cause of the problem from the full (debug) boot log.

Let’s say I buy some ECC memory in a local shop. What is the workflow? Do I post about it if there are issues, and do we figure things out from that point on?

The clock adjust and write level values are found via NXP’s CodeWarrior QCVS DDR tool - which determines the most stable values by iteration on the actual hardware.

I am curious, although I have not used CodeWarrior yet. Could you explain the principle a bit more? Does it run some proprietary code that can’t be implemented on the device? Could something similar be implemented as a debug / test mode, with the firmware updated from the results? Iterating over one or two values doesn’t look too complex, and neither does capturing the results. If I am correct, the interaction happens over UART. (Correct me if it’s JTAG or some proprietary debug interface, though.)

My curiosity comes from a friend who was initially interested but then got very confused by the RAM part of the documentation. I need a bit more clarity on that, as I really like the device and want to recommend it to others if it is true that most RAM modules are supported!

Yes, the memory initialization happens after the UART is brought up, so you will always see something on the console.

Here is an example of a module which didn’t work: Fail boot using both Flash and recovery SD

To troubleshoot, I will compare the boot output to another card of the same physical layout (G1). In this case, the manufacturer incorrectly programmed the bit mappings (dq_mapping) in the module’s SPD, causing the memory initialization to fail.

(DQ mapping allows individual bits of the DDR4 memory bus to be swapped to improve physical routing. These are all set out in the original designs uploaded to JEDEC, so there is no reason to mess with them!)
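As a rough illustration of why a wrong DQ mapping matters, the sketch below treats one 8-bit lane as a bit permutation. The mapping values and function names are made up for illustration, and the real SPD byte encoding is not reproduced here. It also assumes the controller needs the mapping to interpret a known pattern produced on the module side during initialization.

```python
def apply_dq_mapping(value, mapping):
    """Permute the bits of one 8-bit lane.

    mapping[i] is the destination bit that source bit i is wired to.
    """
    out = 0
    for src_bit, dst_bit in enumerate(mapping):
        if (value >> src_bit) & 1:
            out |= 1 << dst_bit
    return out


def invert_mapping(mapping):
    """Return the permutation that undoes `mapping`."""
    inverse = [0] * len(mapping)
    for src_bit, dst_bit in enumerate(mapping):
        inverse[dst_bit] = src_bit
    return inverse


# Hypothetical wiring of one lane on the module: neighbouring bits swapped.
actual_mapping = [1, 0, 3, 2, 5, 4, 7, 6]
# Hypothetical mis-programmed SPD that claims the lane is wired straight through.
spd_mapping = [0, 1, 2, 3, 4, 5, 6, 7]

module_pattern = 0b00000001  # known pattern on the module side
seen_by_controller = apply_dq_mapping(module_pattern, actual_mapping)

# Decoding with the true wiring recovers the pattern; decoding with the
# wiring claimed by the bad SPD does not.
decoded_ok = apply_dq_mapping(seen_by_controller, invert_mapping(actual_mapping))
decoded_bad = apply_dq_mapping(seen_by_controller, invert_mapping(spd_mapping))

print(f"{decoded_ok:08b} {decoded_bad:08b}")
```

With a correct SPD the decoded value matches the pattern that was sent; with the wrong one the bits land in the wrong positions, which is the kind of mismatch that can make initialization fail.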

All the issues I have seen have been related to incorrect SPD data, either incorrect DQ mappings or even modules claiming to be entirely different physical designs! There are a couple of workarounds already in the firmware (ignoring the SPD / forcing specific settings) for some known ‘broken’ modules.

What CodeWarrior is doing is finding the optimal parameters (Clock Adjust + Write Levelling) for the specific memory layout, which is a function of how the DDR traces (wires) are routed on the mainboard and memory module. There is a ‘window’ of parameters that do work, so CodeWarrior will select the middle of the window as the optimal setting.
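The “middle of the window” idea can be sketched in a few lines. This is not CodeWarrior’s actual algorithm, only an illustration: the list of pass/fail results is invented, and each index stands for one candidate value of a single parameter.

```python
def widest_passing_window(results):
    """Return (start, end) indexes of the longest run of True, or None."""
    best = None
    start = None
    # A trailing False closes a run that reaches the end of the list.
    for i, ok in enumerate(list(results) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if best is None or (i - 1 - start) > (best[1] - best[0]):
                best = (start, i - 1)
            start = None
    return best


def pick_centre(results):
    """Pick the setting in the middle of the widest passing window."""
    window = widest_passing_window(results)
    if window is None:
        return None
    return (window[0] + window[1]) // 2


# Hypothetical pass/fail result for each candidate setting 0..9.
sweep = [False, False, True, True, True, True, True, False, True, False]
print(widest_passing_window(sweep), pick_centre(sweep))
```

Choosing the centre rather than the first working value leaves the most margin on both sides, so small variations between boards or modules are less likely to push the setting out of the working range.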

If you look at slide 38 in this NXP presentation, that basically shows the view in CodeWarrior when doing the calibration runs.

This process runs over JTAG. AFAIK, it basically tries each possible setting combination (with some ‘educated guessing’ derived from the physical trace lengths we give it) and sees which ones operate successfully.
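In very simplified form, a brute-force sweep over two parameters looks like the sketch below. Everything here is an assumption for illustration: the value ranges, the `fake_memory_test` stand-in and its passing region are invented, and the real tool drives actual hardware over JTAG and narrows the search using trace lengths, which this does not model.

```python
from itertools import product


def sweep_settings(clk_values, wrlvl_values, works):
    """Try every (clock adjust, write level) pair and keep those that pass."""
    return [(c, w) for c, w in product(clk_values, wrlvl_values) if works(c, w)]


def fake_memory_test(clk_adj, wrlvl):
    # Stand-in for running a memory test on real hardware.
    # The passing region below is purely hypothetical.
    return 3 <= clk_adj <= 7 and 4 <= wrlvl <= 8


passing = sweep_settings(range(12), range(12), fake_memory_test)

# Summarise the working region as (min, max) per parameter.
clk_range = (min(c for c, _ in passing), max(c for c, _ in passing))
wrlvl_range = (min(w for _, w in passing), max(w for _, w in passing))
print(len(passing), clk_range, wrlvl_range)
```

The loop itself is simple; the hard part in practice is that each trial needs a way to reconfigure the DDR controller and recover when a setting does not work at all.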

Thanks for a quick response!

you will always see something on the console.

That explains a lot. At the very least, I can now look around for some ECC modules, try them, and report what worked and what didn’t.

Here is an example of a module which didn’t work: Fail boot using both Flash and recovery SD

Great, so I can expect “something” to be readable and provide some info.

These are all set out in the original designs uploaded to JEDEC, so there is no reason to mess with them!

Oh, that’s quite an unusual RAM stick, then.

All the issues I have seen have been related to incorrect SPD data, either incorrect DQ mappings or even modules claiming to be entirely different physical designs!

I presume I may have luck with more well-known brands, so I’ll look for those. Thanks.

This process runs over JTAG. AFAIK, it basically tries each possible setting combination (with some ‘educated guessing’ derived from the physical trace lengths we give it) and sees which ones operate successfully.

Noted. I guess setting that up may be quite expensive. :)