Smorg

How to determine the instruction's encoding? [T1, T2, T3, T4]


Hi all,

 

I have an instruction:

 

B.N, with the instruction word being 0xe000. When I look up the branch instruction in the ARM Architecture manual, it says there are 4 encodings, T1 to T4. The encodings describe what each group of bits means. So I want to know specifically: where in the instruction is the address of the next jump encoded?

 

I know the width specifier is forcing a 16-bit encoding. Can anyone help me figure out which encoding it is?

 

Thanks


So basically that Branch instruction modifies the PC register so that the next instruction address it goes to is non-linear and specific. I want to know how I could determine what this address will be just like the PC register does. The clue must be in the instruction 0xe000 but in order for me to dissect it I need to know what encoding type I am looking at in the reference manual.


What is the processor you are using?

 

Are you saying that the instruction is 'B.N 0xe000'? If that is the case, the PC will contain address 0xe000 after the branch.

 

The encoding of an instruction depends on several factors. For a branch, the assembler calculates the offset from the current PC address to the target and then selects an encoding that can hold that offset. So if your label is at 0xE000, it determines how much offset it needs and then selects the encoding. (I think there are some other factors as well.)

 

As you can see, the ARM® v7-M Architecture Reference Manual contains 4 encoding patterns with different offset fields (e.g. imm8, imm11). Each encoding can only hold a limited offset, which is why more than one encoding is needed.
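To make the "limited offset" point concrete, here is a rough Python sketch of the byte range each encoding can reach. It assumes the effective immediate widths from the manual (8, 11, 20 and 24 bits; T3 and T4 assemble theirs from several fields) and that the immediate is shifted left by one bit and sign-extended. The names `branch_range` and `ENCODING_IMM_BITS` are made up for this example.

```python
def branch_range(imm_bits):
    """Return (min, max) byte offsets reachable with an imm_bits-wide field.

    The field is shifted left by one (halfword alignment) and then
    sign-extended, so the offset has imm_bits + 1 significant bits.
    """
    span = 1 << imm_bits  # 2 ** imm_bits
    return (-span, span - 2)


# Effective immediate width per encoding (assumed from the ARMv7-M manual).
ENCODING_IMM_BITS = {"T1": 8, "T2": 11, "T3": 20, "T4": 24}

if __name__ == "__main__":
    for name, bits in ENCODING_IMM_BITS.items():
        low, high = branch_range(bits)
        print(f"{name}: {low:+} .. {high:+} bytes")
```

The offsets are relative to the PC value of the branch (its address plus 4), not to the address of the branch itself.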

 

I hope this helped.

 

-Pradeepa


So basically that Branch instruction modifies the PC register so that the next instruction address it goes to is non-linear and specific. I want to know how I could determine what this address will be just like the PC register does. The clue must be in the instruction 0xe000 but in order for me to dissect it I need to know what encoding type I am looking at in the reference manual.

The bit value of the instruction tells you which encoding it is. For example, if it were T1 the first four bits would be 1101 and the instruction value would be 0xd???. Since it's 0xe000 you know it's encoding T2. It can't be T3 or T4 since they'd be 32-bit instructions beginning with 0xf???????.
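As a rough illustration of that check, a small Python sketch that classifies a 16-bit halfword by its top bits. The function name `classify_thumb_branch` is hypothetical, and it only looks at the first halfword, so it cannot tell T3 from T4 (or from other 32-bit instructions that share the same prefix).

```python
def classify_thumb_branch(halfword):
    """Guess which B encoding a Thumb halfword belongs to (sketch only)."""
    halfword &= 0xFFFF
    top4 = halfword >> 12
    top5 = halfword >> 11

    if top4 == 0b1101:
        # T1 is 1101 cond imm8; cond values 1110 and 1111 are taken
        # by other instructions (UDF and SVC), so they are not branches.
        cond = (halfword >> 8) & 0xF
        return "other" if cond in (0b1110, 0b1111) else "T1"
    if top5 == 0b11100:
        # T2 is 11100 imm11: unconditional, 16-bit.
        return "T2"
    if top5 == 0b11110:
        # First half of a 32-bit instruction; the second halfword
        # is needed to decide whether this is T3, T4 or something else.
        return "32-bit (T3/T4 candidate)"
    return "other"


if __name__ == "__main__":
    print(classify_thumb_branch(0xE000))
```

For 0xe000 this reports T2, matching the reasoning above.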

 

So it appears to be an unconditional branch to an 11-bit immediate offset from PC, where the value of the offset is zero.

 

I'm guessing it's an extremely optimized idle loop in an interrupt-driven application.


So it appears to be an unconditional branch to an 11-bit immediate offset from PC, where the value of the offset is zero.

 

I'm guessing it's an extremely optimized idle loop in an interrupt-driven application.

 

@@pabigot

 

Just for a clarification,

 

Since the instruction is 0xE000, the top 5 bits (MSBs) are 11100 (i.e. T2 encoding) and the other 11 bits are zero (imm11 = 0). So that means it is jumping to the (PC + 0) location. In other words, it is looping.

 

I hope that's how you came to the conclusion that it is an optimized idle loop?


I hope that's how you came to the conclusion that it is an optimized idle loop?

That was my guess within the analysis time I wanted to spend on the question, yes.

 

In fact, though, it's not that simple:

1. Calculate the PC or Align(PC,4) value of the instruction. The PC value of an instruction is its address plus 4 for a Thumb instruction. The Align(PC,4) value of an instruction is its PC value ANDed with 0xFFFFFFFC to force it to be word-aligned.

Experimentation confirms that it is not a loop:

 

	.syntax	unified
	.arch	armv7-m
	.text
	.thumb
	.thumb_func
	.align	2
	.globl	Test
	.type	Test, %function
Test:
	b	.L_loop1
	nop
.L_loop1:
	nop

	.end

llc[19]$ arm-none-eabi-gcc -c -mthumb -mcpu=cortex-m3  x.S
llc[20]$ arm-none-eabi-objdump -d x.o 
Disassembly of section .text:

00000000 <Test>:
   0:   e000            b.n     4 <Test+0x4>
   2:   bf00            nop
   4:   bf00            nop
   6:   bf00            nop
The decoding process was correct; the guess as to the effect of the instruction was wrong. Trust, but verify (Доверяй, но проверяй), as they say.
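The objdump result can also be reproduced by hand. Below is a minimal Python sketch, assuming the T2 rule that the target is the PC value (instruction address plus 4) plus the sign-extended imm11 shifted left by one. The helper names `sign_extend` and `t2_branch_target` are made up for this example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def t2_branch_target(address, halfword):
    """Return the target of a 16-bit unconditional Thumb branch (encoding T2)."""
    if (halfword >> 11) != 0b11100:
        raise ValueError("not a T2 branch encoding")
    imm11 = halfword & 0x7FF
    offset = sign_extend(imm11 << 1, 12)  # imm11:'0', 12 significant bits
    pc = address + 4                      # PC value seen by a Thumb instruction
    return (pc + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(t2_branch_target(0x0, 0xE000)))    # the instruction from the listing
    print(hex(t2_branch_target(0x100, 0xE7FE)))  # offset -4: branch to itself
```

With address 0 and 0xe000 this gives 4, the same target objdump prints. A real branch-to-self needs an offset of -4, which encodes as 0xe7fe rather than 0xe000.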


Thanks for your input guys. Can someone please explain what (imm*) means in the encoding?

 

