**Zach Fredin** (e2dab52e) *at 19 Oct 13:25*

Update README.md

**Zach Fredin** (0ed013ba) *at 19 Oct 13:12*

fixed math error :-/

Sounds good. Post your results as a section in the README.md when you can break down the pi loop.

Ok I'll keep digging into the assembly in that case to see where that's coming from.

You're right -- we're still seeing high-30s clock cycles per pi calculation iteration. I swapped MFLOPS and pi loop iterations in my calculation, so what I actually calculated was clock cycles per FLOP, which doesn't really make sense here.

STM32F412, FPU on: ~4.3 seconds for 10,000,000 loops of pi. I got a similar ~11.6 MFLOPS with a stopwatch, but I still got ~36 cycles per iteration using an 84 MHz clock. I'm using this calculation to get cycles per iteration: 8.4e7 * 4.3 / 10,000,000.
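The cycles-per-iteration formula quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper name `cycles_per_iteration` is made up for illustration, and the inputs are the stopwatch figures from the comment.

```python
def cycles_per_iteration(clock_hz: float, seconds: float, loops: int) -> float:
    """Total clock cycles elapsed divided by the number of loop iterations."""
    return clock_hz * seconds / loops


CLOCK_HZ = 84e6      # core clock quoted in the thread
SECONDS = 4.3        # stopwatch time with the FPU on
LOOPS = 10_000_000   # pi loop iterations

per_iter = cycles_per_iteration(CLOCK_HZ, SECONDS, LOOPS)
print(f"{per_iter:.1f} cycles per iteration")  # about 36
```

The units work out as (cycles/second) * seconds / iterations, so the result is cycles per iteration with no FLOP count involved.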

Did I make a units mistake somewhere? What calculation are you using to find cycles per iteration?

The pi benchmark readme is not yet updated so you can see (verify?) my mistake: https://gitlab.cba.mit.edu/zfredin/stm32f412_core/tree/master/nucleo-f412zg/pi

STM32F412, no FPU: 12.91 seconds for 1,000,000 loops of pi. At 5 FLOPs per loop, that gives us (5E6)/(12.91) = 0.387 MFLOPS.

STM32F412, FPU on: 0.391 seconds for 1,000,000 loops of pi. I calculated that this resulted in 1.95 MFLOPS because I multiplied when I should have divided. (5E6)/(0.391) = 12.79 MFLOPS; at 84 MHz, that is ~6.5 clock cycles per loop. Now we're on the happier side of what I would expect given the Cortex-M4's 14-cycle divide spec.
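The MFLOPS arithmetic for both runs can be reproduced with a short script. This is a sketch under the thread's own assumptions (5 FLOPs per loop, 84 MHz clock); the helper name `mflops` is hypothetical. Note that dividing the clock rate by the FLOP rate yields cycles per FLOP, and multiplying that by the FLOPs per loop gives cycles per loop.

```python
FLOPS_PER_LOOP = 5    # figure used in the thread
LOOPS = 1_000_000
CLOCK_HZ = 84e6       # STM32F412 core clock quoted in the thread


def mflops(flops_per_loop: int, loops: int, seconds: float) -> float:
    """Millions of floating-point operations per second."""
    return flops_per_loop * loops / seconds / 1e6


no_fpu = mflops(FLOPS_PER_LOOP, LOOPS, 12.91)   # about 0.387
fpu = mflops(FLOPS_PER_LOOP, LOOPS, 0.391)      # about 12.79

# Multiplying by the time instead of dividing reproduces the earlier slip.
slip = FLOPS_PER_LOOP * LOOPS * 0.391 / 1e6     # about 1.95

cycles_per_flop = CLOCK_HZ / (fpu * 1e6)            # about 6.6
cycles_per_loop = cycles_per_flop * FLOPS_PER_LOOP  # about 32.8

print(f"no FPU: {no_fpu:.3f} MFLOPS, FPU: {fpu:.2f} MFLOPS")
print(f"{cycles_per_flop:.2f} cycles/FLOP, {cycles_per_loop:.1f} cycles/loop")
```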

I'm getting up to speed on the SAMD51; I've got it running sans external crystal at ~160 MHz and saw an NPTS=1,000,000 pi loop time of 0.297 s, or 16.84 MFLOPS. I haven't measured the PLL speed directly so the 160 MHz number could be off. If anything, this result seems low vs the STM32 so more investigation is clearly needed.
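To put a number on "seems low", the two timings can be compared in cycles per loop rather than MFLOPS. The sketch below uses only figures from the thread; the 160 MHz value is the unverified PLL setting, and the "implied clock" line assumes, purely for illustration, that both parts would need the same cycles per loop. Flash wait states and cache settings could easily break that assumption.

```python
LOOPS = 1_000_000

STM32_CLOCK_HZ = 84e6
STM32_SECONDS = 0.391      # FPU on

SAMD51_CLOCK_HZ = 160e6    # nominal, not measured directly
SAMD51_SECONDS = 0.297

stm32_cycles = STM32_CLOCK_HZ * STM32_SECONDS / LOOPS      # about 32.8
samd51_cycles = SAMD51_CLOCK_HZ * SAMD51_SECONDS / LOOPS   # about 47.5

# Clock the SAMD51 would be running at if it matched the STM32's
# cycles per loop (an illustrative assumption, not a measurement).
implied_clock_hz = stm32_cycles * LOOPS / SAMD51_SECONDS   # about 110.6 MHz

print(f"STM32: {stm32_cycles:.1f} cycles/loop")
print(f"SAMD51 at nominal clock: {samd51_cycles:.1f} cycles/loop")
print(f"SAMD51 implied clock: {implied_clock_hz / 1e6:.1f} MHz")
```

Either the SAMD51 is spending noticeably more cycles per loop, or the actual clock is well below the nominal 160 MHz; measuring the PLL output would separate the two.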

If you want to tinker with the SAMD51 using OpenOCD, arm-none-eabi-gdb, Microchip's libraries (handily shared via Adafruit), and a Makefile, I posted some instructions here: https://gitlab.cba.mit.edu/pub/hello-world/atsamd51

**Zach Fredin** (42436dce) *at 09 Oct 14:03*

added -ffast-math comments

**Zach Fredin** (58a479d6) *at 09 Oct 13:53*

added single-precision calculation, updated readme.md

**Zach Fredin** (780a7a33) *at 09 Oct 13:38*

updated README.md

**Zach Fredin** (bdacf57b) *at 09 Oct 13:36*

added pi benchmark. updated rules.mk for -O3 optimization flag.

**Zach Fredin** (0d5053d9) *at 02 Oct 16:15*

added usart roundtrip benchmark

**Zach Fredin** (34849529) *at 02 Oct 15:57*

added usart benchmark

**Zach Fredin** (e48fba3a) *at 02 Oct 14:07*

added ringtest

**Zach Fredin** (d21990f3) *at 02 Oct 00:58*

added memory map for f412 board; updated blink example; documentation

**Zach Fredin** (047a75b2) *at 24 Sep 20:16*

added manuals and tools