Merge pull request #932 from chenguokai/develop

[refactoring] BSD-License-compliant flashloader rewrite
pull/1031/head
nightwalker-87 2020-05-06 22:56:22 +02:00 committed by GitHub
commit 9b207de880
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
13 changed files with 722 additions and 421 deletions


@ -0,0 +1,54 @@
# Flashloaders
## What do flashloaders do
The on-chip FLASH of an STM32 must be written one unit at a time (a byte, half word, word or double word, depending on the device), which would make flashing unbearably slow if the whole process were driven by `stlink` from the host side. Flashloaders cooperate with `stlink` so that flashing is split into two stages. In the first stage, `stlink` loads the flashloader and the flash data into SRAM, where no busy check is needed. In the second stage, the flashloader is kick-started and writes the data from SRAM to FLASH, checking the busy flag as it goes. The write/check-if-busy cycle is thus carried out entirely by the STM32 chip, which saves considerable communication time between `stlink` and the STM32.
As SRAM is usually smaller than FLASH, `stlink` flashes only one page at a time (possibly less if SRAM is insufficient). The whole flashing process may therefore consist of several launches of the flashloader.
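As a rough illustration of the paragraph above, the number of flashloader launches can be estimated as the image size divided by the per-launch chunk size, rounded up. The sketch below is not taken from `stlink`; the names `chunk_size` and `loader_launches` and their parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of stlink: the per-launch chunk is one flash
 * page, or less when the SRAM buffer cannot hold a whole page. */
static size_t chunk_size(size_t page_size, size_t sram_buffer_size)
{
    return page_size < sram_buffer_size ? page_size : sram_buffer_size;
}

/* Number of flashloader launches needed for image_size bytes (rounded up). */
static size_t loader_launches(size_t image_size, size_t page_size,
                              size_t sram_buffer_size)
{
    size_t chunk = chunk_size(page_size, sram_buffer_size);
    assert(chunk > 0); /* a zero-sized buffer cannot make progress */
    return (image_size + chunk - 1) / chunk;
}
```

For example, under these assumptions a 4 KiB image with 1 KiB pages and ample SRAM needs four launches, while a 2 KiB page with only 512 bytes of buffer space also needs four.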
## The flashing process
1. `st-flash` loads the compiled binary of the corresponding flashloader into SRAM by calling `stlink_flash_loader_init` in `src/flash_loader.c`.
2. `st-flash` erases the corresponding flash page by calling `stlink_erase_flash_page` in `common.c`.
3. `st-flash` calls `stlink_flash_loader_run` in `flash_loader.c`. In this function:
+ buffer of one flash page is written to SRAM following the flashloader
+ the buffer start address (in SRAM) is written to register `r0`
+ the target start address (in FLASH, page aligned) is written to register `r1`
+ the buffer size is written to register `r2`
+ the start address (for now 0x20000000) of flash loader is written to `r15` (`pc`)
+ after that, the flashloader is launched; `st-flash` waits for the core to halt (triggered by the flashloader) and confirms that flashing has completed by checking that `r2` is zero
4. flashloader part: much like a `memcpy` with busy check
+ copy a single unit of data from SRAM to FLASH
+ (for most devices) wait until flash is not busy
+ trigger a breakpoint which halts the core when finished
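The flashloader part of the steps above can be modelled on the host as a copy loop with a busy poll. This is only a sketch: the `sim_flash` structure, the fixed three busy polls per write and the function names are assumptions made for illustration, and a real loader reads the busy flag from `FLASH_SR` and ends with `bkpt` instead of returning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated flash controller: every write keeps the controller busy for a
 * few polls. Purely illustrative; real hardware exposes this via FLASH_SR. */
struct sim_flash {
    uint32_t *cells;      /* destination memory standing in for FLASH */
    unsigned busy_polls;  /* remaining polls that still report busy */
    unsigned total_polls; /* statistics: how often the loader polled */
};

static int sim_is_busy(struct sim_flash *f)
{
    f->total_polls++;
    if (f->busy_polls > 0) {
        f->busy_polls--;
        return 1;
    }
    return 0;
}

/* Mirrors the loader contract: src ~ r0, dst_index ~ r1, count ~ r2.
 * Returns the remaining count, which is 0 after a successful run. */
static uint32_t sim_flashloader(struct sim_flash *f, const uint32_t *src,
                                size_t dst_index, uint32_t count)
{
    while (count != 0) {
        f->cells[dst_index++] = *src++; /* copy a single unit */
        f->busy_polls = 3;              /* the write makes the flash busy */
        while (sim_is_busy(f)) {
            /* wait until flash is not busy */
        }
        count--;
    }
    return count; /* the real loader would now execute bkpt */
}
```

The returned value plays the role of `r2`: the host treats zero as "flashing completed".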
## Constraints
Developers who want to modify flashloaders must therefore satisfy the following constraints.
* Use only Thumb-1 instructions (for stm32f0 etc.) or Thumb-1 and Thumb-2 instructions (for stm32f1 etc.); no ARM instructions.
* Do not use the stack, since it may overwrite buffer data.
* For most devices, wait until FLASH is not busy after writing each unit of data.
* For some devices, check whether any error occurred during the flashing process.
* Respect the unit size of a single copy.
* After flashing, trigger a breakpoint to halt the core.
* A successful run ends with `r2` set to zero when the core halts.
* Make sure that the flashloader is at least capable of running at 0x20000000 (the base address of SRAM).
For devices that need to wait until the flash is not busy, check the FLASH_SR_BUSY bit. For devices that need to check for errors during flashing, check FLASH\_SR\_(X)ERR, where `X` can be any error state.
The FLASH_SR related offsets and the copy unit sizes can be found in the official ST reference manuals and/or in header files of other open source projects. The clean room document provides some of them.
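The busy and error checks described above amount to testing two device-specific masks against the status register. The sketch below is a minimal host-side illustration; the enum, the function name `classify_sr` and the idea of passing the masks as parameters are assumptions, and the actual mask values must come from the reference manual of the target device.

```c
#include <assert.h>
#include <stdint.h>

enum flash_state { FLASH_READY, FLASH_BUSY, FLASH_ERROR };

/* busy_mask and err_mask are device specific and must be taken from the
 * reference manual; this helper is hypothetical and not part of stlink. */
static enum flash_state classify_sr(uint32_t sr, uint32_t busy_mask,
                                    uint32_t err_mask)
{
    if (sr & busy_mask)
        return FLASH_BUSY;  /* keep polling */
    if (sr & err_mask)
        return FLASH_ERROR; /* stop and halt with r2 != 0 */
    return FLASH_READY;     /* safe to copy the next unit */
}
```

As a usage example, the previous `stm32vl` loader tested `0x01` for busy and `0x14` for errors, so `classify_sr(sr, 0x01, 0x14)` would reproduce that decision for a value read from `FLASH_SR`.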
## Debug tricks
If you find a flashloader to be broken, or you need to write a flashloader for a new device, the following tricks may help.
1. Change the `WAIT_ROUNDS` macro to a bigger value so that you have time to kill `st-flash` while it is waiting for a halted core.
2. Run `st-flash` and kill it after the flashloader has been loaded to SRAM.
3. Launch `st-util` and `gdb`/`lldb`.
4. Set a breakpoint at the base address of SRAM.
5. Jump to the base address and start debugging.
These tricks work because most of the work (flash unlock, flash erase, loading the flashloader to SRAM) has already been done automatically, which saves the time of constructing a debug environment by hand.


@ -0,0 +1,38 @@
# Note: the original GPLed code states that compiling is as simple as gcc -c.
# That failed in my tests: the resulting program reads from a wrong address.
# This Makefile saves you the time of dealing with such compile errors.
# Adjust CC if needed.
CC = /opt/local/gcc-arm-none-eabi-8-2018-q4-major/bin/arm-none-eabi-gcc
CFLAGS_thumb1 = -mcpu=Cortex-M0 -Tlinker.ld -ffreestanding -nostdlib
CFLAGS_thumb2 = -mcpu=Cortex-M3 -Tlinker.ld -ffreestanding -nostdlib
all: stm32vl.o stm32f0.o stm32l.o stm32f4.o stm32f4_lv.o stm32l4.o stm32f7.o stm32f7_lv.o
stm32vl.o: stm32f0.s
	$(CC) stm32f0.s $(CFLAGS_thumb2) -o stm32vl.o
stm32f0.o: stm32f0.s
	$(CC) stm32f0.s $(CFLAGS_thumb1) -o stm32f0.o
stm32l.o: stm32lx.s
	$(CC) stm32lx.s $(CFLAGS_thumb2) -o stm32l.o
stm32f4.o: stm32f4.s
	$(CC) stm32f4.s $(CFLAGS_thumb2) -o stm32f4.o
stm32f4_lv.o: stm32f4lv.s
	$(CC) stm32f4lv.s $(CFLAGS_thumb2) -o stm32f4_lv.o
stm32l4.o: stm32l4.s
	$(CC) stm32l4.s $(CFLAGS_thumb2) -o stm32l4.o
stm32f7.o: stm32f7.s
	$(CC) stm32f7.s $(CFLAGS_thumb2) -o stm32f7.o
stm32f7_lv.o: stm32f7lv.s
	$(CC) stm32f7lv.s $(CFLAGS_thumb2) -o stm32f7_lv.o
clean:
	rm *.o


@ -0,0 +1,233 @@
Original Chinese version can be found below.
# Clean Room Documentation English Version
The code is placed in the `.text` section.
Add the assembler directive `.syntax unified` at the head of the file.
**Calling convention**:
All parameters would be passed over registers
`r0`: the base address of the copy source
`r1`: the base address of the copy destination
`r2`: the total word (4 bytes) count to be copied (with exceptions)
**What the program is expected to do**:
Copy data from the source to the destination, then trigger a breakpoint to exit. Before exiting, `r2` must be cleared to zero to indicate that the copy is done.
**Limitation**: No stack operations are permitted. Registers `r3` to `r12` are free to use. Note that `r13` is `sp` (stack pointer), `r14` is `lr` (commonly used to store a jump address) and `r15` is `pc` (program counter).
**Requirement**: After every single copy, wait until the flash finishes writing. The length of a single copy and the way to check for completion are given per file below. The address at which `flash_base` is stored shall be 2-byte aligned.
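Because `r2` counts copy units rather than bytes, and the unit differs per loader (word by default, half word for `stm32f0.s`, double word for `stm32l4.s`), the host has to convert a buffer length into a unit count. The following sketch shows one way to do that; the enum, the function name `loader_count` and the choice to round up a trailing partial unit are assumptions for illustration, not the behaviour of `stlink`.

```c
#include <assert.h>
#include <stdint.h>

/* Copy unit in bytes that r2 counts, following the per-file sections. */
enum loader_unit { UNIT_HALF_WORD = 2, UNIT_WORD = 4, UNIT_DOUBLE_WORD = 8 };

/* Hypothetical helper: value to place in r2 for a buffer of `bytes` bytes,
 * rounding up so that a trailing partial unit is still counted. */
static uint32_t loader_count(uint32_t bytes, enum loader_unit unit)
{
    assert(unit == UNIT_HALF_WORD || unit == UNIT_WORD ||
           unit == UNIT_DOUBLE_WORD);
    return (bytes + (uint32_t)unit - 1u) / (uint32_t)unit;
}
```

Note that the byte-wise loaders (`stm32f4lv.s`, `stm32f7lv.s`) still receive a word count and convert it to a byte count themselves by shifting `r2` left by two.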
## stm32f0.s
**Exception**: `r2` stores the total half word (2 bytes) count to be copied
`flash_base`: 0x40022000
`FLASH_CR`: offset from `flash_base` is 16
`FLASH_SR`: offset from `flash_base` is 12
**Reference**: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f0.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f0.h)
[https://www.st.com/resource/en/reference_manual/dm00031936-stm32f0x1stm32f0x2stm32f0x8-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00031936-stm32f0x1stm32f0x2stm32f0x8-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
**Special requirements**:
Before every copy, read a word from FLASH_CR, set the lowest bit to 1 and write back. Copy one half word each time.
How to wait for the write process: read a word from FLASH_SR, loop until the content is not 1. After that, check FLASH_SR, proceed if the content is 4, otherwise exit.
Exit: after the copying process and before triggering the breakpoint, clear the lowest bit in FLASH_CR.
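The special requirements for `stm32f0.s` can be read as the loop sketched below, written as a host-side C model rather than Thumb assembly. Everything around the loop is an assumption made for illustration: the `sim_f0` structure, the scripted `FLASH_SR` values and the function names are hypothetical, and the real loader ends with `bkpt` instead of returning.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated STM32F0 flash interface; sr_script supplies the values that
 * successive FLASH_SR reads return. Entirely hypothetical scaffolding. */
struct sim_f0 {
    uint32_t cr;
    const uint32_t *sr_script;
    unsigned sr_pos;
    uint16_t flash[8];
};

static uint32_t read_sr(struct sim_f0 *s)
{
    return s->sr_script[s->sr_pos++];
}

/* Returns the remaining half-word count: 0 on success, non-zero on error. */
static uint32_t f0_copy(struct sim_f0 *s, const uint16_t *src, unsigned dst,
                        uint32_t count)
{
    while (count != 0) {
        uint32_t sr;
        s->cr |= 1u;              /* set the lowest bit of FLASH_CR */
        s->flash[dst++] = *src++; /* copy one half word */
        do {
            sr = read_sr(s);      /* loop until FLASH_SR is not 1 */
        } while (sr == 1u);
        if (sr != 4u)             /* proceed only if FLASH_SR is 4 */
            break;
        count--;
    }
    s->cr &= ~1u;                 /* clear the lowest bit before the bkpt */
    return count;
}
```

A run in which every status read eventually yields 4 returns zero; any other final status leaves a non-zero count, which the host can interpret as a failed flash.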
## stm32f4.s
`flash_base`: 0x40023c00
`FLASH_SR`: offset from `flash_base` is 0xe (14)
**Reference**: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h)
[https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf](https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf)
**Special requirements**:
Copy one word each time.
How to wait for the write process: read a half word from FLASH_SR, loop until the content is not 1.
## stm32f4lv.s
`flash_base`: 0x40023c00
`FLASH_SR`: offset from `flash_base` is 0xe (14)
**Reference**: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h)
[https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf](https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf)
**Special Requirements**:
Copy one byte each time.
How to wait for the write process: read a half word from FLASH_SR, loop until the content is not 1.
## stm32f7.s
**Reference**: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h)
[https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
Mostly the same as `stm32f4.s`. Additionally, a memory barrier (`dsb sy`) is required after every copy and before checking whether the write has finished.
## stm32f7lv.s
**Reference**: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h)
[https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
**Special Requirements**:
Mostly the same as `stm32f7.s`. Copy one byte each time.
## stm32l0x.s
**Special Requirements**:
Copy one word each time. No wait for write.
## stm32l4.s
**Exception**: `r2` stores the double word (8 bytes) count to be copied.
`flash_base`: 0x40022000
`FLASH_BSY`: offset from `flash_base` is 0x12
**Reference**: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32l4.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32l4.h)
[https://www.st.com/resource/en/reference_manual/dm00310109-stm32l4-series-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00310109-stm32l4-series-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
**Special Requirements**:
Copy one double word each time (using more than one register is allowed).
How to wait for the write process: read a half word from `FLASH_BSY`, loop until the lowest bit is no longer 1.
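The offset 0x12 works because the core is little-endian: a half word read at `flash_base + 0x12` is the upper half of the 32-bit status register at offset 0x10, so its bit 0 is bit 16 (BSY) of that register. The sketch below demonstrates the equivalence on a byte image of the register block; the helper names and the byte-array model are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, as the Cortex-M core
 * sees the FLASH registers. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Read a little-endian half word, the equivalent of `ldrh`. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical model: regs is a byte image of the register block starting
 * at flash_base, with the 32-bit status register at offset 0x10. */
static int l4_busy(const uint8_t *regs)
{
    return get_le16(regs + 0x12) & 1; /* bit 0 of upper half = bit 16 */
}
```

The same reasoning explains the 0x0e offset used by the F4 and F7 loaders, assuming their status register sits at offset 0x0c with BSY in bit 16.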
## stm32lx.s
Same as `stm32l0x.s`.
# Clean Room Documentation (translated from the original Chinese version)
The code is placed in the `.text` section.
Add the assembler directive `.syntax unified`.
Parameter passing convention:
All parameters are passed in registers.
`r0`: start address of the copy source
`r1`: start address of the copy destination
`r2`: number of words (4 bytes) to copy (with exceptions)
Program function: copy data from the source to the destination, then trigger a breakpoint to end execution. On exit, `r2` must be cleared to zero to indicate that the transfer is complete.
Limitation: the stack must not be used. The temporary registers that may be used freely are `R3` to `R12`. `R13` is `sp` (stack pointer), `R14` is `lr` (generally used to store a jump address) and `R15` is `pc` (program counter).
Requirement: after every copy, wait for the flash to finish writing. The width of a single copy and the way to check for write completion are given in the specific requirements for each file.
The address at which the special address `flash_base` is stored must be 2-byte aligned.
## stm32f0.s
Exception: `r2`: number of half words (2 bytes) to copy
Special address definitions: `flash_base`: defined as 0x40022000
`FLASH_CR`: offset from `flash_base` is 16
`FLASH_SR`: offset from `flash_base` is 12
Reference: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f0.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f0.h)
[https://www.st.com/resource/en/reference_manual/dm00031936-stm32f0x1stm32f0x2stm32f0x8-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00031936-stm32f0x1stm32f0x2stm32f0x8-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
Special requirements:
Before each copy, read the 4 bytes at FLASH_CR, set the lowest bit to 1 and write the result back to FLASH_CR.
Each write is 2 bytes (a half word) wide.
After each write, wait for the flash to finish writing. To check, read the 4 bytes at FLASH_SR: a value of 1 means the write has not finished and polling must continue; otherwise check whether the value at FLASH_SR is 4, and if it is not 4, prepare to exit immediately.
Exit: after all copying is done and before triggering the breakpoint, clear the lowest bit of the 4 bytes at FLASH_CR to 0 and write the result back to FLASH_CR.
## stm32f4.s
Special address definitions: `flash_base`: defined as 0x40023c00
`FLASH_SR`: offset from `flash_base` is 0xe (14)
Reference: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h)
[https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf](https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf)
Special requirements:
Each write is 4 bytes wide.
After each write, wait for the flash to finish writing. To check, read the 2 bytes at FLASH_SR: a value of 1 means the write has not finished and polling must continue.
## stm32f4lv.s
Special address definitions: `flash_base`: defined as 0x40023c00
`FLASH_SR`: offset from `flash_base` is 0xe (14)
Reference: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f4.h)
[https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf](https://www.st.com/content/ccc/resource/technical/document/reference_manual/3d/6d/5a/66/b4/99/40/d4/DM00031020.pdf/files/DM00031020.pdf/jcr:content/translations/en.DM00031020.pdf)
Special requirements:
Each write is 1 byte (1/4 word) wide.
After each write, wait for the flash to finish writing. To check, read the 2 bytes at FLASH_SR: a value of 1 means the write has not finished and polling must continue.
## stm32f7.s
Reference: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h)
[https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
Same requirements as stm32f4.s, with the additional requirement that the `dsb sy` instruction is executed after each copy and before checking that the flash write succeeded, to establish a memory barrier.
## stm32f7lv.s
Reference: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32f7.h)
[https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00124865-stm32f75xxx-and-stm32f74xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
Requirements are basically the same as stm32f7.s; the difference is that each write is 1 byte (1/4 word) wide.
## stm32l0x.s
Special requirements:
Each write is 4 bytes wide.
There is no need to implement the check for flash write completion.
## stm32l4.s
Exception: `r2`: number of double words (8 bytes) to copy
Special address definitions: `flash_base`: 0x40022000
`FLASH_BSY`: offset from `flash_base` is 0x12
Reference: [https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32l4.h](https://chromium.googlesource.com/chromiumos/platform/ec/+/master/chip/stm32/registers-stm32l4.h)
[https://www.st.com/resource/en/reference_manual/dm00310109-stm32l4-series-advanced-armbased-32bit-mcus-stmicroelectronics.pdf](https://www.st.com/resource/en/reference_manual/dm00310109-stm32l4-series-advanced-armbased-32bit-mcus-stmicroelectronics.pdf)
Copy method: copy 8 consecutive bytes at a time, using two consecutive registers as intermediate storage, and write them.
After each write, wait for the flash to finish writing. To check, read the half word (2 bytes) at FLASH_BSY; if its lowest bit is not 1, copying may continue.
## stm32lx.s
Requirements are the same as stm32l0x.s.


@ -0,0 +1,9 @@
/* Entry Point */
ENTRY( copy )
/* Specify the memory areas */
MEMORY
{
RAM ( xrw) : ORIGIN = 0x20000000 , LENGTH = 64K
}


@ -1,32 +1,63 @@
/* Adopted from STM AN4065 stm32f0xx_flash.c:FLASH_ProgramWord */
.syntax unified
.text
write:
ldr r4, STM32_FLASH_BASE
mov r5, #1 /* FLASH_CR_PG, FLASH_SR_BUSY */
mov r6, #4 /* PGERR */
write_half_word:
ldr r3, [r4, #16] /* FLASH->CR */
orr r3, r5
str r3, [r4, #16] /* FLASH->CR |= FLASH_CR_PG */
ldrh r3, [r0] /* r3 = *sram */
strh r3, [r1] /* *flash = r3 */
busy:
ldr r3, [r4, #12] /* FLASH->SR */
tst r3, r5 /* FLASH_SR_BUSY */
beq busy
.global copy
copy:
ldr r7, =flash_base
ldr r4, [r7]
ldr r7, =flash_off_cr
ldr r6, [r7]
adds r6, r6, r4
ldr r7, =flash_off_sr
ldr r5, [r7]
adds r5, r5, r4
tst r3, r6 /* PGERR */
bne exit
loop:
# FLASH_CR |= 1
ldr r7, =0x1
ldr r3, [r6]
orrs r3, r3, r7
str r3, [r6]
# copy 2 bytes
ldrh r3, [r0]
strh r3, [r1]
ldr r7, =2
adds r0, r0, r7
adds r1, r1, r7
# wait if FLASH_SR == 1
mywait:
ldr r7, =0x1
ldr r3, [r5]
tst r3, r7
beq mywait
# exit if FLASH_SR != 4
ldr r7, =0x4
tst r3, r7
bne exit
# loop if r2 != 0
ldr r7, =0x1
subs r2, r2, r7
cmp r2, #0
bne loop
add r0, r0, #2 /* sram += 2 */
add r1, r1, #2 /* flash += 2 */
sub r2, r2, #0x01 /* count-- */
cmp r2, #0
bne write_half_word
exit:
ldr r3, [r4, #16] /* FLASH->CR */
bic r3, r5
str r3, [r4, #16] /* FLASH->CR &= ~FLASH_CR_PG */
bkpt #0x00
# FLASH_CR &= ~1
ldr r7, =0x1
ldr r3, [r6]
bics r3, r3, r7
str r3, [r6]
STM32_FLASH_BASE: .word 0x40022000
bkpt
.align 2
flash_base:
.word 0x40022000
flash_off_cr:
.word 0x10
flash_off_sr:
.word 0x0c


@ -1,32 +1,36 @@
.global start
.syntax unified
.syntax unified
.text
@ r0 = source
@ r1 = target
@ r2 = wordcount
@ r3 = flash_base
@ r4 = temp
.global copy
copy:
ldr r12, flash_base
ldr r10, flash_off_sr
add r10, r10, r12
start:
ldr r3, flash_base
next:
cbz r2, done
ldr r4, [r0]
str r4, [r1]
loop:
# copy 4 bytes
ldr r3, [r0]
str r3, [r1]
wait:
ldrh r4, [r3, #0x0e]
tst.w r4, #1
bne wait
add r0, r0, #4
add r1, r1, #4
add r0, #4
add r1, #4
sub r2, #1
b next
done:
# wait if FLASH_SR == 1
mywait:
ldrh r3, [r10]
tst r3, #0x1
beq mywait
# loop if r2 != 0
sub r2, r2, #1
cmp r2, #0
bne loop
exit:
bkpt
.align 2
.align 2
flash_base:
.word 0x40023c00
.word 0x40023c00
flash_off_sr:
.word 0x0e


@ -1,33 +1,42 @@
.global start
.syntax unified
.syntax unified
.text
@ r0 = source
@ r1 = target
@ r2 = wordcount
@ r3 = flash_base
@ r4 = temp
.global copy
copy:
ldr r12, flash_base
ldr r10, flash_off_sr
add r10, r10, r12
start:
lsls r2, r2, #2
ldr r3, flash_base
next:
cbz r2, done
ldrb r4, [r0]
strb r4, [r1]
# tip 1: original r2 indicates the count of 4 bytes need to copy,
# but we can only copy one byte each time.
# as we have no flash larger than 1GB, we do a little trick here.
# tip 2: r2 is always a power of 2
mov r2, r2, lsl#2
wait:
ldrh r4, [r3, #0x0e]
tst.w r4, #1
bne wait
loop:
# copy 1 byte
ldrb r3, [r0]
strb r3, [r1]
add r0, #1
add r1, #1
sub r2, #1
b next
done:
add r0, r0, #1
add r1, r1, #1
# wait if FLASH_SR == 1
mywait:
ldrh r3, [r10]
tst r3, #0x1
beq mywait
# loop if r2 != 0
sub r2, r2, #1
cmp r2, #0
bne loop
exit:
bkpt
.align 2
.align 2
flash_base:
.word 0x40023c00
.word 0x40023c00
flash_off_sr:
.word 0x0e


@ -1,33 +1,39 @@
.global start
.syntax unified
.syntax unified
.text
@ r0 = source
@ r1 = target
@ r2 = wordcount
@ r3 = flash_base
@ r4 = temp
.global copy
copy:
ldr r12, flash_base
ldr r10, flash_off_sr
add r10, r10, r12
start:
ldr r3, flash_base
next:
cbz r2, done
ldr r4, [r0]
str r4, [r1]
dsb sy
loop:
# copy 4 bytes
ldr r3, [r0]
str r3, [r1]
wait:
ldrh r4, [r3, #0x0e]
tst.w r4, #1
bne wait
add r0, r0, #4
add r1, r1, #4
add r0, #4
add r1, #4
sub r2, #1
b next
done:
# memory barrier
dsb sy
# wait if FLASH_SR == 1
mywait:
ldrh r3, [r10]
tst r3, #0x1
beq mywait
# loop if r2 != 0
sub r2, r2, #1
cmp r2, #0
bne loop
exit:
bkpt
.align 2
.align 2
flash_base:
.word 0x40023c00
.word 0x40023c00
flash_off_sr:
.word 0x0e


@ -1,34 +1,45 @@
.global start
.syntax unified
.syntax unified
.text
@ r0 = source
@ r1 = target
@ r2 = wordcount
@ r3 = flash_base
@ r4 = temp
.global copy
copy:
ldr r12, flash_base
ldr r10, flash_off_sr
add r10, r10, r12
start:
lsls r2, r2, #2
ldr r3, flash_base
next:
cbz r2, done
ldrb r4, [r0]
strb r4, [r1]
dsb sy
# tip 1: original r2 indicates the count in 4 bytes need to copy,
# but we can only copy one byte each time.
# as we have no flash larger than 1GB, we do a little trick here.
# tip 2: r2 is always a power of 2
mov r2, r2, lsl#2
wait:
ldrh r4, [r3, #0x0e]
tst.w r4, #1
bne wait
loop:
# copy 1 byte
ldrb r3, [r0]
strb r3, [r1]
add r0, #1
add r1, #1
sub r2, #1
b next
done:
add r0, r0, #1
add r1, r1, #1
# memory barrier
dsb sy
# wait if FLASH_SR == 1
mywait:
ldrh r3, [r10]
tst r3, #0x1
beq mywait
# loop if r2 != 0
sub r2, r2, #1
cmp r2, #0
bne loop
exit:
bkpt
.align 2
.align 2
flash_base:
.word 0x40023c00
.word 0x40023c00
flash_off_sr:
.word 0x0e


@ -1,64 +1,22 @@
/***************************************************************************
* Copyright (C) 2010 by Spencer Oliver *
* spen@spen-soft.co.uk *
* *
* Copyright (C) 2011 Øyvind Harboe *
* oyvind.harboe@zylin.com *
* *
* Copyright (C) 2011 Clement Burin des Roziers *
* clement.burin-des-roziers@hikob.com *
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU General Public License as published by *
* the Free Software Foundation; either version 2 of the License, or *
* (at your option) any later version. *
* *
* This program is distributed in the hope that it will be useful, *
* but WITHOUT ANY WARRANTY; without even the implied warranty of *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
* GNU General Public License for more details. *
* *
* You should have received a copy of the GNU General Public License *
* along with this program; if not, write to the *
* Free Software Foundation, Inc., *
* 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. *
***************************************************************************/
// Build : arm-eabi-gcc -c stm32lx.S
.text
.syntax unified
.cpu cortex-m0plus
.thumb
.thumb_func
.global write
.text
/*
r0 - source address
r1 - destination address
r2 - count
*/
.global copy
copy:
loop:
# copy 4 bytes
ldr r3, [r0]
str r3, [r1]
// Go to compare
b test_done
ldr r7, =4
add r0, r0, r7
add r1, r1, r7
write_word:
// Load one word from address in r0, increment by 4
ldr r4, [r0]
// Store the word to address in r1, increment by 4
str r4, [r1]
// Decrement r2
subs r2, #1
adds r1, #4
// does not matter, only first addr is important
// next 15 bytes are in sequnce RM0367 page 66
adds r0, #4
# loop if r2 != 0
ldr r7, =1
subs r2, r2, r7
cmp r2, #0
bne loop
test_done:
// Test r2
cmp r2, #0
// Loop if not zero
bcc.n write_word
// Set breakpoint to exit
bkpt #0x00
exit:
bkpt


@ -1,39 +1,38 @@
.global start
.syntax unified
.syntax unified
.text
@ Adapted from stm32f4.s
@ STM32L4's flash controller expects double-word writes, has the flash
@ controller mapped in a different location with the registers we care about
@ moved down from the base address, and has BSY moved to bit 16 of SR.
@ r0 = source
@ r1 = target
@ r2 = wordcount
@ r3 = flash_base
@ r4 = temp
@ r5 = temp
.global copy
copy:
ldr r12, flash_base
ldr r10, flash_off_bsy
add r10, r10, r12
start:
ldr r3, flash_base
next:
cbz r2, done
ldr r4, [r0] /* copy doubleword from source to target */
ldr r5, [r0, #4]
str r4, [r1]
str r5, [r1, #4]
loop:
# copy 8 bytes
ldr r3, [r0]
ldr r4, [r0, #4]
str r3, [r1]
str r4, [r1, #4]
wait:
ldrh r4, [r3, #0x12] /* high half of status register */
tst r4, #1 /* BSY = bit 16 */
bne wait
add r0, r0, #8
add r1, r1, #8
add r0, #8
add r1, #8
sub r2, #1
b next
done:
# wait if FLASH_BSY[0b] == 1
mywait:
ldrh r3, [r10]
tst r3, #0x1
beq mywait
# loop if r2 != 0
sub r2, r2, #1
cmp r2, #0
bne loop
exit:
bkpt
.align 2
.align 2
flash_base:
.word 0x40022000
flash_off_bsy:
.word 0x12


@ -1,60 +1,22 @@
/***************************************************************************
* Copyright (C) 2010 by Spencer Oliver *
* spen@spen-soft.co.uk *
* *
* Copyright (C) 2011 Øyvind Harboe *
* oyvind.harboe@zylin.com *
* *
* Copyright (C) 2011 Clement Burin des Roziers *
* clement.burin-des-roziers@hikob.com *
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU General Public License as published by *
* the Free Software Foundation; either version 2 of the License, or *
* (at your option) any later version. *
* *
* This program is distributed in the hope that it will be useful, *
* but WITHOUT ANY WARRANTY; without even the implied warranty of *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
* GNU General Public License for more details. *
* *
* You should have received a copy of the GNU General Public License *
* along with this program; if not, write to the *
* Free Software Foundation, Inc., *
* 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. *
***************************************************************************/
// Build : arm-eabi-gcc -c stm32lx.s
.text
.syntax unified
.cpu cortex-m3
.thumb
.thumb_func
.global write
.text
/*
r0 - source address
r1 - destination address
r2 - output, remaining word count
*/
.global copy
copy:
loop:
# copy 4 bytes
ldr r3, [r0]
str r3, [r1]
// Go to compare
b test_done
ldr r7, =4
add r0, r0, r7
add r1, r1, r7
write_word:
// Load one word from address in r0, increment by 4
ldr.w ip, [r0], #4
// Store the word to address in r1, increment by 4
str.w ip, [r1], #4
// Decrement r2
subs r2, #1
# loop if r2 != 0
ldr r7, =1
subs r2, r2, r7
cmp r2, #0
bne loop
test_done:
// Test r2
cmp r2, #0
// Loop if not zero
bhi write_word
// Set breakpoint to exit
bkpt #0x00
exit:
bkpt


@ -7,26 +7,36 @@
#define FLASH_REGS_BANK2_OFS 0x40
#define FLASH_BANK2_START_ADDR 0x08080000
/* from openocd, contrib/loaders/flash/stm32.s */
static const uint8_t loader_code_stm32vl[] = {
0x08, 0x4c, /* ldr r4, STM32_FLASH_BASE */
0x1c, 0x44, /* add r4, r3 */
/* write_half_word: */
0x01, 0x23, /* movs r3, #0x01 */
0x23, 0x61, /* str r3, [r4, #STM32_FLASH_CR_OFFSET] */
0x30, 0xf8, 0x02, 0x3b, /* ldrh r3, [r0], #0x02 */
0x21, 0xf8, 0x02, 0x3b, /* strh r3, [r1], #0x02 */
/* busy: */
0xe3, 0x68, /* ldr r3, [r4, #STM32_FLASH_SR_OFFSET] */
0x13, 0xf0, 0x01, 0x0f, /* tst r3, #0x01 */
0xfb, 0xd0, /* beq busy */
0x13, 0xf0, 0x14, 0x0f, /* tst r3, #0x14 */
0x01, 0xd1, /* bne exit */
0x01, 0x3a, /* subs r2, r2, #0x01 */
0xf0, 0xd1, /* bne write_half_word */
/* exit: */
0x00, 0xbe, /* bkpt #0x00 */
0x00, 0x20, 0x02, 0x40, /* STM32_FLASH_BASE: .word 0x40022000 */
/* DO NOT MODIFY SOURCECODE DIRECTLY, EDIT ASSEMBLY FILES INSTEAD */
/* flashloaders/stm32f0.s -- compiled with thumb2 */
static const uint8_t loader_code_stm32vl[] = {
0x16, 0x4f, 0x3c, 0x68,
0x16, 0x4f, 0x3e, 0x68,
0x36, 0x19, 0x16, 0x4f,
0x3d, 0x68, 0x2d, 0x19,
0x4f, 0xf0, 0x01, 0x07,
0x33, 0x68, 0x3b, 0x43,
0x33, 0x60, 0x03, 0x88,
0x0b, 0x80, 0x4f, 0xf0,
0x02, 0x07, 0xc0, 0x19,
0xc9, 0x19, 0x4f, 0xf0,
0x01, 0x07, 0x2b, 0x68,
0x3b, 0x42, 0xfa, 0xd0,
0x4f, 0xf0, 0x04, 0x07,
0x3b, 0x42, 0x04, 0xd1,
0x4f, 0xf0, 0x01, 0x07,
0xd2, 0x1b, 0x00, 0x2a,
0xe6, 0xd1, 0x4f, 0xf0,
0x01, 0x07, 0x33, 0x68,
0xbb, 0x43, 0x33, 0x60,
0x00, 0xbe, 0x00, 0xbf,
0x00, 0x20, 0x02, 0x40,
0x10, 0x00, 0x00, 0x00,
0x0c, 0x00, 0x00, 0x00,
0x50, 0x00, 0x00, 0x20,
0x54, 0x00, 0x00, 0x20,
0x58, 0x00, 0x00, 0x20
};
/* flashloaders/stm32f0.s -- thumb1 only, same sequence as for STM32VL, bank ignored */
@ -47,157 +57,134 @@ static const uint8_t loader_code_stm32vl[] = {
0x00, 0x30, // nop /* add r0,#0 */
0x00, 0x30, // nop /* add r0,#0 */
#endif
0x0A, 0x4C, // ldr r4, STM32_FLASH_BASE
0x01, 0x25, // mov r5, #1 /* FLASH_CR_PG, FLASH_SR_BUSY */
0x04, 0x26, // mov r6, #4 /* PGERR */
// write_half_word:
0x23, 0x69, // ldr r3, [r4, #16] /* FLASH->CR */
0x2B, 0x43, // orr r3, r5
0x23, 0x61, // str r3, [r4, #16] /* FLASH->CR |= FLASH_CR_PG */
0x03, 0x88, // ldrh r3, [r0] /* r3 = *sram */
0x0B, 0x80, // strh r3, [r1] /* *flash = r3 */
// busy:
0xE3, 0x68, // ldr r3, [r4, #12] /* FLASH->SR */
0x2B, 0x42, // tst r3, r5 /* FLASH_SR_BUSY */
0xFC, 0xD0, // beq busy
0x33, 0x42, // tst r3, r6 /* PGERR */
0x04, 0xD1, // bne exit
0x02, 0x30, // add r0, r0, #2 /* sram += 2 */
0x02, 0x31, // add r1, r1, #2 /* flash += 2 */
0x01, 0x3A, // sub r2, r2, #0x01 /* count-- */
0x00, 0x2A, // cmp r2, #0
0xF0, 0xD1, // bne write_half_word
// exit:
0x23, 0x69, // ldr r3, [r4, #16] /* FLASH->CR */
0xAB, 0x43, // bic r3, r5
0x23, 0x61, // str r3, [r4, #16] /* FLASH->CR &= ~FLASH_CR_PG */
0x00, 0xBE, // bkpt #0x00
0x00, 0x20, 0x02, 0x40, /* STM32_FLASH_BASE: .word 0x40022000 */
0x13, 0x4f, 0x3c, 0x68,
0x13, 0x4f, 0x3e, 0x68,
0x36, 0x19, 0x13, 0x4f,
0x3d, 0x68, 0x2d, 0x19,
0x12, 0x4f, 0x33, 0x68,
0x3b, 0x43, 0x33, 0x60,
0x03, 0x88, 0x0b, 0x80,
0x10, 0x4f, 0xc0, 0x19,
0xc9, 0x19, 0x0e, 0x4f,
0x2b, 0x68, 0x3b, 0x42,
0xfb, 0xd0, 0x0e, 0x4f,
0x3b, 0x42, 0x03, 0xd1,
0x0a, 0x4f, 0xd2, 0x1b,
0x00, 0x2a, 0xeb, 0xd1,
0x08, 0x4f, 0x33, 0x68,
0xbb, 0x43, 0x33, 0x60,
0x00, 0xbe, 0xc0, 0x46,
0x00, 0x20, 0x02, 0x40,
0x10, 0x00, 0x00, 0x00,
0x0c, 0x00, 0x00, 0x00,
0x44, 0x00, 0x00, 0x20,
0x48, 0x00, 0x00, 0x20,
0x4c, 0x00, 0x00, 0x20,
0x01, 0x00, 0x00, 0x00,
0x02, 0x00, 0x00, 0x00,
0x04, 0x00, 0x00, 0x00
};
static const uint8_t loader_code_stm32l[] = {
// flashloaders/stm32lx.s
0x04, 0xe0, // b test_done ; Go to compare
// write_word:
0x04, 0x68, // ldr r4, [r0] ; Load one word from address in r0
0x0c, 0x60, // str r4, [r1] ; Store the word to address in r1
0x04, 0x30, // adds r0, #4 ; Increment r0
0x04, 0x31, // adds r1, #4 ; Increment r1
0x01, 0x3a, // subs r2, #1 ; Decrement r2
// test_done:
0x00, 0x2a, // cmp r2, #0 ; Compare r2 to 0
0xf8, 0xd8, // bhi write_word ; Loop if above 0
0x00, 0xbe, // bkpt #0x00 ; Set breakpoint to exit
0x00, 0x00
0x03, 0x68, 0x0b, 0x60,
0x4f, 0xf0, 0x04, 0x07,
0x38, 0x44, 0x39, 0x44,
0x4f, 0xf0, 0x01, 0x07,
0xd2, 0x1b, 0x00, 0x2a,
0xf4, 0xd1, 0x00, 0xbe,
};
static const uint8_t loader_code_stm32f4[] = {
// flashloaders/stm32f4.s
0x07, 0x4b,
0x62, 0xb1,
0x04, 0x68,
0x0c, 0x60,
0xdc, 0x89,
0x14, 0xf0, 0x01, 0x0f,
0xfb, 0xd1,
0x00, 0xf1, 0x04, 0x00,
0x01, 0xf1, 0x04, 0x01,
0xa2, 0xf1, 0x01, 0x02,
0xf1, 0xe7,
0x00, 0xbe,
0x00, 0x3c, 0x02, 0x40,
0xdf, 0xf8, 0x28, 0xc0,
0xdf, 0xf8, 0x28, 0xa0,
0xe2, 0x44, 0x03, 0x68,
0x0b, 0x60, 0x00, 0xf1,
0x04, 0x00, 0x01, 0xf1,
0x04, 0x01, 0xba, 0xf8,
0x00, 0x30, 0x13, 0xf0,
0x01, 0x0f, 0xfa, 0xd0,
0xa2, 0xf1, 0x01, 0x02,
0x00, 0x2a, 0xf0, 0xd1,
0x00, 0xbe, 0x00, 0xbf,
0x00, 0x3c, 0x02, 0x40,
0x0e, 0x00, 0x00, 0x00
};
static const uint8_t loader_code_stm32f4_lv[] = {
// flashloaders/stm32f4lv.s
0x92, 0x00,
0x08, 0x4b,
0x62, 0xb1,
0x04, 0x78,
0x0c, 0x70,
0xdc, 0x89,
0x14, 0xf0, 0x01, 0x0f,
0xfb, 0xd1,
0x00, 0xf1, 0x01, 0x00,
0x01, 0xf1, 0x01, 0x01,
0xa2, 0xf1, 0x01, 0x02,
0xf1, 0xe7,
0x00, 0xbe,
0x00, 0xbf,
0x00, 0x3c, 0x02, 0x40,
0xdf, 0xf8, 0x2c, 0xc0,
0xdf, 0xf8, 0x2c, 0xa0,
0xe2, 0x44, 0x4f, 0xea,
0x82, 0x02, 0x03, 0x78,
0x0b, 0x70, 0x00, 0xf1,
0x01, 0x00, 0x01, 0xf1,
0x01, 0x01, 0xba, 0xf8,
0x00, 0x30, 0x13, 0xf0,
0x01, 0x0f, 0xfa, 0xd0,
0xa2, 0xf1, 0x01, 0x02,
0x00, 0x2a, 0xf0, 0xd1,
0x00, 0xbe, 0x00, 0xbf,
0x00, 0x3c, 0x02, 0x40,
0x0e, 0x00, 0x00, 0x00
};
static const uint8_t loader_code_stm32l4[] = {
// flashloaders/stm32l4.s
0x08, 0x4b, // start: ldr r3, [pc, #32] ; <flash_base>
0x72, 0xb1, // next: cbz r2, <done>
0x04, 0x68, // ldr r4, [r0, #0]
0x45, 0x68, // ldr r5, [r0, #4]
0x0c, 0x60, // str r4, [r1, #0]
0x4d, 0x60, // str r5, [r1, #4]
0x5c, 0x8a, // wait: ldrh r4, [r3, #18]
0x14, 0xf0, 0x01, 0x0f, // tst.w r4, #1
0xfb, 0xd1, // bne.n <wait>
0x00, 0xf1, 0x08, 0x00, // add.w r0, r0, #8
0x01, 0xf1, 0x08, 0x01, // add.w r1, r1, #8
0xa2, 0xf1, 0x01, 0x02, // sub.w r2, r2, #1
0xef, 0xe7, // b.n <next>
0x00, 0xbe, // done: bkpt 0x0000
0x00, 0x20, 0x02, 0x40 // flash_base: .word 0x40022000
0xdf, 0xf8, 0x2c, 0xc0,
0xdf, 0xf8, 0x2c, 0xa0,
0xe2, 0x44, 0x03, 0x68,
0x44, 0x68, 0x0b, 0x60,
0x4c, 0x60, 0x00, 0xf1,
0x08, 0x00, 0x01, 0xf1,
0x08, 0x01, 0xba, 0xf8,
0x00, 0x30, 0x13, 0xf0,
0x01, 0x0f, 0xfa, 0xd0,
0xa2, 0xf1, 0x01, 0x02,
0x00, 0x2a, 0xee, 0xd1,
0x00, 0xbe, 0x00, 0xbf,
0x00, 0x20, 0x02, 0x40,
0x12, 0x00, 0x00, 0x00
};
static const uint8_t loader_code_stm32f7[] = {
// flashloaders/stm32f7.s
0x08, 0x4b,
0x72, 0xb1,
0x04, 0x68,
0x0c, 0x60,
0xbf, 0xf3, 0x4f, 0x8f, // DSB Memory barrier for in order flash write
0xdc, 0x89,
0x14, 0xf0, 0x01, 0x0f,
0xfb, 0xd1,
0x00, 0xf1, 0x04, 0x00,
0x01, 0xf1, 0x04, 0x01,
0xa2, 0xf1, 0x01, 0x02,
0xef, 0xe7,
0x00, 0xbe, // bkpt #0x00
0x00, 0x3c, 0x02, 0x40,
0xdf, 0xf8, 0x2c, 0xc0,
0xdf, 0xf8, 0x2c, 0xa0,
0xe2, 0x44, 0x03, 0x68,
0x0b, 0x60, 0x00, 0xf1,
0x04, 0x00, 0x01, 0xf1,
0x04, 0x01, 0xbf, 0xf3,
0x4f, 0x8f, 0xba, 0xf8,
0x00, 0x30, 0x13, 0xf0,
0x01, 0x0f, 0xfa, 0xd0,
0xa2, 0xf1, 0x01, 0x02,
0x00, 0x2a, 0xee, 0xd1,
0x00, 0xbe, 0x00, 0xbf,
0x00, 0x3c, 0x02, 0x40,
0x0e, 0x00, 0x00, 0x00
};
static const uint8_t loader_code_stm32f7_lv[] = {
// flashloaders/stm32f7lv.s
0x92, 0x00, // lsls r2, r2, #2
0x09, 0x4b, // ldr r3, [pc, #36] ; (0x20000028 <flash_base>)
// next:
0x72, 0xb1, // cbz r2, 24 <done>
0x04, 0x78, // ldrb r4, [r0, #0]
0x0c, 0x70, // strb r4, [r1, #0]
0xbf, 0xf3, 0x4f, 0x8f, // dsb sy
// wait:
0xdc, 0x89, // ldrh r4, [r3, #14]
0x14, 0xf0, 0x01, 0x0f, // tst.w r4, #1
0xfb, 0xd1, // bne.n e <wait>
0x00, 0xf1, 0x01, 0x00, // add r0, r0, #1
0x01, 0xf1, 0x01, 0x01, // add r1, r1, #1
0xa2, 0xf1, 0x01, 0x02, // sub r2, r2, #1
0xef, 0xe7, // b next
// done:
0x00, 0xbe, // bkpt
0x00, 0xbf, // nop
// flash_base:
0x00, 0x3c, 0x02, 0x40 // .word 0x40023c00
0xdf, 0xf8, 0x30, 0xc0,
0xdf, 0xf8, 0x30, 0xa0,
0xe2, 0x44, 0x4f, 0xea,
0x82, 0x02, 0x03, 0x78,
0x0b, 0x70, 0x00, 0xf1,
0x01, 0x00, 0x01, 0xf1,
0x01, 0x01, 0xbf, 0xf3,
0x4f, 0x8f, 0xba, 0xf8,
0x00, 0x30, 0x13, 0xf0,
0x01, 0x0f, 0xfa, 0xd0,
0xa2, 0xf1, 0x01, 0x02,
0x00, 0x2a, 0xee, 0xd1,
0x00, 0xbe, 0x00, 0xbf,
0x00, 0x3c, 0x02, 0x40,
0x0e, 0x00, 0x00, 0x00
};