3.7 KiB
Flashloaders
What do flashloaders do
The on-chip FLASH of STM32 needs to be written once a byte/half word/word/double word, which would lead to a unbearably long flashing time if the process is solely done by stlink
from the host side. Flashloaders are introduced to cooperate with stlink
so that the flashing process is divided into two stages. In the first stage, stlink
loads flashloaders and flash data to SRAM where busy check is not applied. In the second stage, flashloaders are kick-started, writing data from SRAM to FLASH, where a busy check is applied. Thus the write-check_if_busy cycle of flashing is done solely by STM32 chip, which saves considerable time in communications between stlink
and STM32.
As SRAM is usually less in size than FLASH, stlink
only flashes one page (may be less if SRAM is insufficient) at a time. The whole flashing process may consist of server launches of flashloaders.
The flashing process
st-flash
loads compiled binary of corresponding flashloader to SRAM by callingstlink_flash_loader_init
insrc/flash_loader.c
st-flash
erases corresponding flash page by callingstlink_erase_flash_page
incommon.c
.st-flash
callsstlink_flash_loader_run
inflash_loader.c
. In this function- buffer of one flash page is written to SRAM following the flashloader
- the buffer start address (in SRAM) is written to register
r0
- the target start address (in FLASH, page aligned) is written to register
r1
- the buffer size is written to register
r2
- the start address (for now 0x20000000) of flash loader is written to
r15
(pc
) - After that, launching the flashloader and waiting for a halted core (triggered by our flashloader) and confirming that flashing is completed with a zeroed
r2
- flashloader part: much like a
memcpy
with busy check- copy a single unit of data from SRAM to FLASH
- (for most devices) wait until flash is not busy
- trigger a breakpoint which halts the core when finished
Constraints
Thus for developers who want to modify flashloaders, the following constraints should be satisfied.
- only thumb-1 (for stm32f0 etc) or (thumb-1 and thumb-2) (for stm32f1 etc) instructions can be used, no ARM instructions.
- no stack, since it may overwrite buffer data.
- for most devices, after writing a single unit data, wait until FLASH is not busy.
- for some devices, check if there are any errors during flashing process.
- respect unit size of a single copy.
- after flashing, trigger a breakpint to halt the core.
- a sucessful run ends with
r2
set to zero when halted. - be sure that flashloaders are at least be capable of running at 0x20000000 (the base address of SRAM)
For devices that need to wait until the flash is not busy, check FLASH_SR_BUSY bit. For devices that need to check if there is any errors during flash, check FLASH_SR_(X)ERR where X
can be any error state
FLASH_SR related offset and copy unit size may be found in ST official reference manuals and/or some header files in other open source projects. Clean room document provides some of them.
Debug tricks
If you find some flashloaders to be broken or you need to write a new flashloader for new devices, the following tricks may help.
- Modify
WAIT_ROUNDS
marco to a bigger value so that you will have time to kill st-flash when it is waiting for a halted core. - run
st-flash
and kill it after the flashloader is loaded to SRAM - launch
st-util
andgdb
/lldb
- set a breakpoint at the base address of SRAM
- jump to the base address and start your debug
The tricks work because by this means, most work (flash unlock, flash erase, load flashloader to SRAM) would have been done automatically, saving time to construct a debug environment.