Research On Instruction Set Coding Computer Science Essay


This report presents a study of instruction set coding, which essentially determines how a microprocessor operates. After the fundamental terms are explained, two aspects, orthogonality and code density, are discussed through examples and comparisons. SPARC is taken as an example of a standard RISC machine, while the 80x86 represents CISC machines. In addition, the less conventional ARM Thumb is included as an extreme example of high code density. The conclusions reached at the end feed into the processor design that follows.

Basic definitions

Instruction set: "A collection of instructions that present as machine code for program."[]

Orthogonality: "Any instructions can use any register by addressing mode."[]

Code density:

"The amount of space that an executable program takes up in memory. Code density is a major requirement in embedded system design since it not only reduces the need for the scarce resource memory but also implicitly improves further important design parameters like power consumption and performance."[g]

An instruction set can be regarded as orthogonal when any instruction can use data of any type via any addressing mode, with no restrictions on the type of register or addressing mode used. An orthogonal instruction set is easier to program because the programmer simply selects a combination of opcode, register type, data type, and addressing mode instead of remembering a fixed format for every individual instruction. By contrast, to serve special applications such as floating-point calculation, irregular instruction sets, called non-orthogonal sets, accommodate a variety of unusual addressing modes, which results in register usage that differs from one instruction to another.

Usually, an instruction is one word (8 bits on a byte-wide machine) or several words long, because the word is the basic unit of data transfer between the CPU and memory. An instruction, considered as a bit vector, can be divided into several bit fields. A survey of instruction formats in different processors shows that an instruction typically carries four kinds of information: the operation to perform, the address of the source operand, the address of the destination operand, and an indication of the next instruction. The last is normally implicit in the program counter (PC), except when the next instruction is out of sequence, as with subroutine calls, interrupts, and conditional or unconditional branches.

In SPARC, programmers are provided with 32 general-purpose registers divided into four sets: in, out, local, and global, as shown in Fig.. []

Fig.. register division in SPARC V9[]

All instructions are encoded in 32 bits. Only two addressing modes are supported: register indirect with index and register indirect with immediate. The first 2 bits (the opcode) identify the basic instruction group, and the full function of the instruction is specified in Op2 and Op3. The i bit selects the addressing mode: i=0 means the second operand is in a register, and i=1 means it is a constant. With i=0 (register indirect with index), source operand 2 occupies bits 0-4 and the effective address is the content of register 1 plus the content of register 2. With i=1 (register indirect with immediate), the immediate operand occupies all of bits 0-12 and the effective address is the content of register 1 plus the immediate.
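The field layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a full decoder. It assumes the standard SPARC format-3 bit positions (op in bits 31-30, rd in 29-25, op3 in 24-19, rs1 in 18-14, i in bit 13), treats the 13-bit immediate as signed, and wraps addresses to 64 bits for SPARC V9. The function names and the register list are invented for this example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_format3(word):
    """Split a 32-bit format-3 instruction word into its bit fields."""
    fields = {
        "op": (word >> 30) & 0x3,    # bits 31-30: basic instruction group
        "rd": (word >> 25) & 0x1F,   # bits 29-25: destination register
        "op3": (word >> 19) & 0x3F,  # bits 24-19: operation within the group
        "rs1": (word >> 14) & 0x1F,  # bits 18-14: first source register
        "i": (word >> 13) & 0x1,     # bit 13: selects register or immediate
    }
    if fields["i"] == 0:
        fields["rs2"] = word & 0x1F  # bits 4-0: second source register
    else:
        fields["simm13"] = sign_extend(word & 0x1FFF, 13)  # bits 12-0
    return fields


def effective_address(fields, regs):
    """Compute rs1 + rs2 (i=0) or rs1 + immediate (i=1), wrapped to 64 bits."""
    if fields["i"] == 0:
        second = regs[fields["rs2"]]
    else:
        second = fields["simm13"]
    return (regs[fields["rs1"]] + second) & 0xFFFFFFFFFFFFFFFF
```

For instance, the word 0x86004002 decodes to op=2, rd=3, rs1=1, i=0, rs2=2, so its effective address is the sum of registers 1 and 2. The word 0xD0007FFC has i=1 and an immediate of -4, so its effective address is register 1 minus 4.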

Fig.. general instruction format in SPARC[]

This kind of instruction set is non-orthogonal because the registers are divided into register files dedicated to different functional units. The system uses register windows to manage register allocation.

CISC uses more complex instructions to implement a function, which gives shorter programs that occupy less memory. RISC instructions are simple, but more of them are needed to write the same program, which results in higher memory consumption. For portable devices, memory space is precious; moreover, power dissipation comes mainly from memory. Most mobile devices rely on battery power, so power must be taken into consideration. What is more, a more efficient instruction set allows the system to be clocked at a lower speed, which reduces dynamic power dissipation significantly.

In CISC, every instruction can conveniently access registers or memory. In RISC with a load/store architecture, by contrast, only load and store instructions can move data between memory and registers, and instructions can operate only on data held in registers. Since operating on data in registers is much faster than operating on data in memory, RISC has a speed advantage. Therefore, although the RISC approach appears to consume more memory, its execution speed is often higher than that of CISC.

c. ARM Thumb

In Thumb, the datapath is 32 bits wide but the instruction length is halved to a dense 16 bits. The compressed instructions are decoded to perform the same functions as 32-bit ARM instructions. With Thumb technology, code density improves by 30% compared with the standard ARM architecture.[] From table.., for code of approximately the same length, Thumb takes half the memory space of RISC.
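The two figures above (half the space for the same instruction count, and a 30% overall improvement) can be reconciled with a small calculation. The sketch below is illustrative only: the instruction counts are hypothetical, and the assumption that a Thumb program needs about 40% more instructions than its ARM equivalent is chosen purely so that the arithmetic reproduces the 30% figure quoted in the text. It is not measured data.

```python
def code_size_bytes(instruction_count, bits_per_instruction):
    """Memory footprint of a program with fixed-length instructions."""
    return instruction_count * bits_per_instruction // 8


def size_reduction_percent(baseline_bytes, compact_bytes):
    """Percentage of memory saved relative to the baseline encoding."""
    return (baseline_bytes - compact_bytes) * 100 / baseline_bytes


# Hypothetical program: 1000 instructions in 32-bit ARM encoding.
arm_bytes = code_size_bytes(1000, 32)

# Same instruction count in 16-bit encoding: the footprint is halved.
same_count_thumb_bytes = code_size_bytes(1000, 16)

# Assumption: the 16-bit version needs 40% more instructions to do
# the same work, because each instruction can express less.
realistic_thumb_bytes = code_size_bytes(1400, 16)

print(size_reduction_percent(arm_bytes, same_count_thumb_bytes))
print(size_reduction_percent(arm_bytes, realistic_thumb_bytes))
```

Under these assumptions the first case saves half the memory, while the second saves 30%, which shows how extra instructions eat into the gain from the shorter encoding.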

However, the instruction decompressor, which expands the compressed instructions into the original instructions before execution, makes the CPU design more complicated[b]. The decompressor includes a plurality of instruction group decoding tables, each storing the original instructions of a predetermined type.[b] On the other hand, Thumb enables more of the most frequently used code to be stored in on-chip memory[].

Furthermore, a 16-bit architecture has a critical disadvantage for embedded processors: it leaves no encoding space for special instructions devised for particular applications.[i]


Generally, deciding which instruction format to adopt is a trade-off between performance, power and cost. The processor to be designed has a 16-bit address space and a 16-bit data bus. Some conclusions are reached to aid that design:

->The instruction set must be orthogonal, which means every instruction can use any register in any addressing mode.

->To implement all the required functions, the opcodes are encoded uniquely.

->Three kinds of addressing modes are supported: immediate operand addressing, direct addressing and indirect addressing. The addressing mode takes up several bits in the instruction format.

->To simplify the design, the instruction length is constrained to one word, since an instruction longer than one word needs a special register to indicate which part of the instruction is currently being handled.
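The conclusions above can be illustrated with a sketch of a one-word, orthogonal 16-bit encoding. The field widths used here (4-bit opcode, 2-bit addressing mode, 3-bit register, 7-bit operand) and the mode numbering are assumptions made for this example only; the essay does not fix a layout. The point is that each field is independent, so any opcode can be combined with any register and any addressing mode.

```python
# Hypothetical field layout, most significant field first (total 16 bits).
FIELDS = (("opcode", 4), ("mode", 2), ("reg", 3), ("operand", 7))

# Hypothetical numbering of the three addressing modes.
MODES = {"immediate": 0, "direct": 1, "indirect": 2}


def encode(opcode, mode, reg, operand):
    """Pack the four fields into a single 16-bit instruction word."""
    values = {"opcode": opcode, "mode": MODES[mode],
              "reg": reg, "operand": operand}
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word):
    """Unpack a 16-bit instruction word into its numeric fields."""
    fields = {}
    shift = 16
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

With this layout, `encode(3, "direct", 5, 42)` yields 0x36AA, and `decode(0x36AA)` recovers opcode 3, mode 1, register 5 and operand 42. Because no field constrains another, every combination of opcode, mode and register produces a valid one-word instruction, which is the orthogonality property required above.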