Using gcc's profiling approach on this tiny target incurs a performance hit. Moreover, the JTAG hardware facility does not perform any profiling functions. However, we have gdb's built-in simulator, where we can do anything.
We define a new section .profiler which holds all profiling information. We also define a new pseudo operation .profiler which instructs the assembler to add a new profile entry to the object file. The profile entry is recorded for the current address.
Pseudo operation format:
.profiler flags,function_to_profile [, cycle_corrector, extra]
where:
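flags
    a combination of the following characters:
    s    function entry
    x    function exit
    i    function is in init section
    f    function is in fini section
    l    library call
    c    libc standard call
    d    stack value demand
    I    interrupt service routine
    P    prologue start
    p    prologue end
    E    epilogue start
    e    epilogue end
    j    long jump / sjlj unwind
    a    an arbitrary code fragment
    t    extra parameter saved (a constant value like frame size)
function_to_profile
    a function address
cycle_corrector
    a value which should be added to the cycle counter
extra
    any extra parameter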
For example:
    .global fxx
    .type fxx,@function
    fxx:
    .LFrameOffset_fxx=0x08
    .profiler "scdP", fxx     ; function entry.
                              ; we also demand stack value to be saved
      push r11
      push r10
      push r9
      push r8
    .profiler "cdpt",fxx,0, .LFrameOffset_fxx  ; check stack value at this point
                                               ; (this is a prologue end)
                                               ; note that the spare var is filled
                                               ; with the frame size
      mov r15,r8
    ...
    .profiler cdE,fxx         ; check stack
      pop r8
      pop r9
      pop r10
      pop r11
    .profiler xcde,fxx,3      ; exit adds 3 to the cycle counter
      ret                     ; because 'ret' insn takes 3 cycles
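
The flag strings in the example can be hard to read at a glance. The following is a minimal Python sketch of a helper that expands a flag string into its per-character meanings. It is illustrative only and not part of the assembler or gdb. The names PROFILER_FLAGS and describe_flags are hypothetical, and the flag meanings are taken from the .profiler flag list as given in the GNU as MSP430 documentation.

```python
# Hypothetical helper for reading .profiler flag strings.
# This is not how the assembler parses them; it only mirrors the flag table.
PROFILER_FLAGS = {
    "s": "function entry",
    "x": "function exit",
    "i": "function is in init section",
    "f": "function is in fini section",
    "l": "library call",
    "c": "libc standard call",
    "d": "stack value demand",
    "I": "interrupt service routine",
    "P": "prologue start",
    "p": "prologue end",
    "E": "epilogue start",
    "e": "epilogue end",
    "j": "long jump / sjlj unwind",
    "a": "an arbitrary code fragment",
    "t": "extra parameter saved (a constant value like frame size)",
}


def describe_flags(flags):
    """Return the meaning of each character in a .profiler flag string.

    Flags are case-sensitive ('p' and 'P' differ). Raises ValueError if
    any character is not in the flag table.
    """
    unknown = [ch for ch in flags if ch not in PROFILER_FLAGS]
    if unknown:
        raise ValueError("unknown .profiler flag(s): " + "".join(unknown))
    return [PROFILER_FLAGS[ch] for ch in flags]


if __name__ == "__main__":
    # The four flag strings used in the fxx example.
    for flags in ("scdP", "cdpt", "cdE", "xcde"):
        print(flags, "->", ", ".join(describe_flags(flags)))
```

Running the script prints one line per flag string used in the fxx example, which makes it easier to check that each .profiler entry marks the intended point (entry, prologue end, epilogue start, exit).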