Study On How A Microassembler Works Computer Science Essay


An assembler is a program that takes basic computer instructions and converts them into a pattern of bits that the computer's processor can use to perform its basic operations. Some people call these instructions assembler language and others use the term assembly language.

The output of the assembler program is called the object code or object program relative to the input source program. The sequence of 0's and 1's that constitute the object program is sometimes called machine code.

In the earliest computers, programmers actually wrote programs in machine code, but assembler languages or instruction sets were soon developed to speed up programming. Today, assembler programming is used only where very efficient control over processor operations is needed. It requires knowledge of a particular computer's instruction set, however. Historically, most programs have been written in "higher-level" languages such as COBOL, FORTRAN, PL/I, and C. These languages are easier to learn and faster to write programs with than assembler language. The program that processes the source code written in these languages is called a compiler. Like the assembler, a compiler takes higher-level language statements and reduces them to machine code.


An assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other integrated circuits. It implements a symbolic representation of the binary machine codes and other constants needed to program a given CPU architecture. This representation is usually defined by the hardware manufacturer, and is based on mnemonics that symbolize processing steps (instructions), processor registers, memory locations, and other language features. An assembly language is thus specific to a certain physical (or virtual) computer architecture. This is in contrast to most high-level programming languages, which are ideally portable.

A utility program called an assembler is used to translate assembly language statements into the target computer's machine code. The assembler performs a more or less isomorphic translation (a one-to-one mapping) from mnemonic statements into machine instructions and data. This is in contrast with high-level languages, in which a single statement generally results in many machine instructions.
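The one-to-one mapping can be illustrated with a minimal sketch. The instruction set below (a 4-bit opcode followed by a 4-bit operand) is entirely hypothetical and is not taken from any real processor; it only shows that each mnemonic statement yields exactly one machine word.

```python
# Hypothetical 8-bit instruction format: 4-bit opcode, 4-bit operand.
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "HALT": 0xF}


def assemble(source: str) -> list[int]:
    """Translate each non-empty source line into exactly one machine word."""
    words = []
    for line in source.splitlines():
        line = line.split(";", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        operand = int(parts[1], 0) if len(parts) > 1 else 0
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        if not 0 <= operand <= 0xF:
            raise ValueError(f"operand out of range: {operand}")
        words.append((OPCODES[mnemonic] << 4) | operand)
    return words


program = """
LOAD 2      ; load from address 2
ADD 3
STORE 4
HALT
"""
machine_code = assemble(program)
print([f"{word:08b}" for word in machine_code])
```

Four source statements produce four machine words, which is the isomorphic translation described above.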

Many sophisticated assemblers offer additional mechanisms to facilitate program development, control the assembly process, and aid debugging. In particular, most modern assemblers include a macro facility and are called macro assemblers.







A microassembler (sometimes called a meta-assembler) is a computer program that helps prepare a microprogram to control the low level operation of a computer in much the same way an assembler helps prepare higher level code for a processor. The difference is that the microprogram is usually only developed by the processor manufacturer and works intimately with the hardware. The microprogram defines the instruction set any normal program (including both application programs and operating systems) is written in. The use of a microprogram allows the manufacturer to fix certain mistakes, including working around hardware design errors, without modifying the hardware. Another means of employing microassembler-generated microprograms is in allowing the same hardware to run different instruction sets. After it is assembled, the microprogram is then loaded to a control store to become part of the logic of a CPU's control unit.
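The idea that the microprogram, rather than the hardware, defines the instruction set can be sketched with a heavily simplified model. Everything here is an illustrative assumption: the micro-operations, the opcode numbering, and the dictionary used as a "control store" do not correspond to any real machine.

```python
# Micro-operations: the fixed "hardware" primitives (all hypothetical).
def load_operand(state, operand):
    state["tmp"] = operand


def alu_negate_tmp(state, operand):
    state["tmp"] = -state["tmp"]


def alu_add(state, operand):
    state["acc"] = state["acc"] + state["tmp"]


# Two microprograms for the same hardware, keyed by opcode.
CONTROL_STORE_A = {0x01: [load_operand, alu_add]}                  # opcode 1 adds
CONTROL_STORE_B = {0x01: [load_operand, alu_negate_tmp, alu_add]}  # opcode 1 subtracts


def run(control_store, program):
    """Execute each machine instruction as the micro-op sequence stored for its opcode."""
    state = {"acc": 0, "tmp": 0}
    for opcode, operand in program:
        for micro_op in control_store[opcode]:
            micro_op(state, operand)
    return state["acc"]


program = [(0x01, 5), (0x01, 3)]
result_a = run(CONTROL_STORE_A, program)
result_b = run(CONTROL_STORE_B, program)
print(result_a, result_b)
```

The same program and the same primitives give different results depending only on which microprogram is loaded, which is also why a manufacturer can correct a faulty instruction by replacing its microcode.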

Some microassemblers are more generalized and are not targeted at a single computer architecture. For example, through the use of macro-assembler-like capabilities, Digital Equipment Corporation used their MICRO2 microassembler for a very wide range of computer architectures and implementations.


If a given computer implementation supports a writeable control store, the microassembler is usually provided to customers as a means of writing customized microcode.

When assembling microcode, it is helpful to verify the microprogram with emulation tools before distribution. Microcoding is currently experiencing a revival, since it is possible to correct and optimize the firmware (i.e. the microcode) of processing units already sold, either to adapt them to operating systems or to fix bugs. However, no generally usable microassembler is available for manipulating the microcode of today's CPUs. Unfortunately, open knowledge about changing the microcode is difficult to obtain for intellectual property reasons.

How microcode can be assembled with a microassembler to control a CPU with self-defined machine codes on a microprogramming basis can be understood and simulated, from a didactic point of view, with e-learning tools such as Mikrocodesimulator MikroSim.


The microassembler realizes the microassembly language described by Data General [i] with a few extensions deemed desirable for our system. Most of the microassembler was written in Data General extended ALGOL-60. The Data General ALGOL is well suited to the purpose since it provides extensive features for string handling and bit manipulation. (The DG ALGOL is not, however, without fault: its integer arithmetic is extremely suspect.) The only portion of the microassembler that is not written in ALGOL is the set of procedures that actually load the writable control store (WCS) and the decode RAM. These load procedures are written in assembly language.

There are two main modules in the microassembler. The first assembles user microprograms, formats output and creates a microload module. The second loads WCS and the decode RAM with the output produced by the first. Thus it is not necessary to reassemble debugged and tested microroutines.

Input to the microassembler consists, naturally enough, of microprograms. Each microinstruction is preceded by a label and all address references in the program are symbolic. (Non-symbolic references are allowed, however, if the user wishes to jump to a microroutine in the standard firmware.)

Microinstructions to be assembled are submitted to the microassembler in free format; that is, each field is separated from its predecessor by one or more blanks. There is only one microinstruction per line of input. Microprograms are terminated with a line whose label is END. Any line containing a "*" is a comment. The Memphis State system has only a teletype for hard copy output. Since this device is slow, the special commands -LIST and -UNLIST can override global listing commands to start and stop listing. Following the microprogram proper, the user specifies the contents of the decode RAM by providing entry points and two decode addresses. (The addresses are labels which appeared in the microprogram.)
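The input conventions just described (blank-separated fields, one microinstruction per line, "*" comments, -LIST and -UNLIST, a terminating END label) can be sketched as a small reader. This is not the original ALGOL code; the field values and the layout of the decode RAM line in the sample are invented for illustration, and the sketch assumes that -LIST and -UNLIST occupy lines of their own.

```python
def parse_microprogram(text):
    """Return (instructions, trailing_lines).

    Each instruction is a tuple (label, fields, listed), where `listed`
    records whether listing was switched on when the line was read.
    """
    instructions = []
    listing = True
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if "*" in line or not line.strip():
            continue  # any line containing "*" is a comment
        tokens = line.split()  # free format: one or more blanks between fields
        if tokens[0] == "-LIST":
            listing = True
        elif tokens[0] == "-UNLIST":
            listing = False
        elif tokens[0] == "END":
            # Everything after END is the decode RAM specification.
            return instructions, lines[index + 1:]
        else:
            instructions.append((tokens[0], tokens[1:], listing))
    raise ValueError("microprogram is not terminated by an END line")


# Hypothetical input: the field values and the decode RAM line are made up.
sample = """\
* floating subtract entry point
FAMB   CON A SW L GR2 NC FAPB
-UNLIST
FAPB   PC  START16   FSAVPC
END
ENTRY1 FAMB FAPB
"""
instructions, decode_lines = parse_microprogram(sample)
print(instructions)
print(decode_lines)
```

Using `str.split()` with no argument treats any run of blanks as a single separator, which matches the free-format rule.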

The microassembler was written during the summer of 1976 and has been extensively tested by a graduate class in microprogramming. It has made it easier to use the WCS, but does not provide help in the debug or testing phases. In fact, the most common indication of a logical error in a microprogram is for the computer to "crash". We hope to write an Eclipse simulator, similar to the one described by Larry Wear for the HP2100, to make the test phase of microprogram development less traumatic.

Example

This example is included mainly to show the output of the microassembler and an example program. It is not intended that the reader understand the example without reference to the pertinent DG user's guide. The example provides a few floating point operations in microcode. In the example, floating point numbers are represented as two 16-bit Eclipse words as follows:

The four visible general registers of the Eclipse, AC0-AC3, are paired to form two floating point registers, Reg A (AC0-AC1) and Reg B (AC2-AC3). The microprogram provides floating store of either register (FSTA, FSTB), floating load of either register (FLDA, FLDB), negation of either register (FNEGA, FNEGB), absolute value of either register (FABSA, FABSB) and floating point add and subtract. Floating point add (FAPB) places the sum of registers A and B into register A; subtract (FAMB) places the difference in register A. One reason for choosing this subset of floating point firmware is that it fits (rather nicely) on two pages of teletype output.

The execution time required for a floating point ADD or SUBTRACT depends on whether the signs of the operands are different and on how many alignment loops are necessary. If, however, the signs are the same and the exponents differ by 1, a floating point ADD takes 8.4 microseconds. The add and subtract microroutines share code since A-B = A+(-B). This particular example program was written by a student in the microprogramming course, Timothy McCain.

While we don't intend to explain the entire program, a description of a few microinstructions is in order. The microinstruction at FAMB, for example, has the effect of loading into GR2 <0-15> the value 10...0. It does this by selecting a constant 128 as the A input (CON in AC field); the constant 128 appears in the TR ADD field. The constant is sent straight through the ALU (A in ALU field), the left and right bytes of the ALU output are swapped (SW in SH field) and the result is loaded (L in L field) into the A register (GR2 in AR field). It does not use memory and makes an unconditional branch (NC in ST CNG field) to the instruction at FAPB. The subsequent microinstruction (FAPB) starts memory on location 16 and branches to FSAVPC, where the program counter is saved in location 16; this is done by writing the register specified in the BR field (PC) into the location which has just been started. It also starts memory on location 17.
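The arithmetic behind the FAMB microinstruction can be checked with a short sketch: swapping the two bytes of the constant 128 yields a 16-bit word in which only bit 0 (the leftmost bit) is set. The `negate` helper is an assumption; the text does not say how the mask is used, but if bit 0 is the sign bit of the first word, an exclusive-or with the mask would turn B into -B so that subtraction can fall through into the add routine.

```python
def swap_bytes(word):
    """Swap the left and right bytes of a 16-bit word."""
    return ((word & 0x00FF) << 8) | ((word >> 8) & 0x00FF)


constant = 128                     # 0000 0000 1000 0000
SIGN_MASK = swap_bytes(constant)   # 1000 0000 0000 0000
print(f"{constant:016b} -> {SIGN_MASK:016b}")


def negate(first_word):
    """Flip bit 0, ASSUMED here to be the sign bit of the floating point number."""
    return first_word ^ SIGN_MASK


# A - B = A + (-B): under the assumption above, negating B only toggles one bit.
example_word = 0x4100              # arbitrary illustrative bit pattern
print(f"{negate(example_word):016b}")
```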

The microassembler has been very helpful at Memphis State. The fact that it was written in ALGOL made modification and correction easy. The speed of the microassembler is acceptable; no microprogram can contain more than 256 instructions, and no more than two minutes are required to assemble a microprogram of this size. The microassembler has been used to implement an emulator for an artificial machine which is used to teach compiler design. The Eclipse is an interesting machine to use to teach microprogramming; however, the paucity of entry points to WCS makes it difficult to do nice emulation examples without swapping decode addresses in and out of the decode RAM.



This document is a brief introduction to Micro-Assembly Language (MAL), the language accepted by the mic1 micro-assembler. It describes the lexical, syntactic, and semantic elements of the language, and gives a few pointers on microprogramming with the mic1 micro-assembler.

Lexical Elements

Like most assembly languages, the Micro-Assembly Language is a line-oriented language. Each micro-instruction is generally defined on a single line of the program file. Blank lines and lines containing only a comment are ignored. The end-of-line is generally significant.

Also, MAL is case-sensitive. For example, "AND" is a reserved word corresponding to a bitwise operation of the mic1 ALU, while "and" is not reserved and may be used as a label.


All comments begin with two slash characters ("//") and continue to the end of the line. Blank lines, and lines consisting only of white space and comments are ignored by the micro-assembler.
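The comment and blank-line rules can be sketched as a small preprocessing step. This is only an illustration of the rules stated above, not the mic1 micro-assembler's own code; the sample line is written for illustration using constructs from the grammar.

```python
def significant_lines(source):
    """Strip '//' comments and drop lines that are blank once comments are removed."""
    result = []
    for line in source.splitlines():
        code = line.split("//", 1)[0].strip()  # a comment runs to the end of the line
        if code:
            result.append(code)
    return result


sample = """\
// main interpreter loop

Main1 PC = PC + 1; fetch; goto (MBR)   // dispatch on the opcode
    // an indented comment-only line
"""
lines = significant_lines(sample)
print(lines)
```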


Directives for the micro-assembler begin with a period character (".") and may contain alphabetic characters.

There are two micro-assembler directives: ".default" and ".label". Directives are used to guide the behavior of the micro-assembler, and do not correspond with words in the control store. These are defined more fully below.

Reserved Words

The names of registers and control lines are reserved, as are the words "if", "else", "goto", "nop", "AND", "OR", and "NOT". For the mic1 architecture, the following words are reserved and may not be used as statement labels:
























Decimal Literals

Decimal literals are one of the following numeric strings: "0", "1", "8". These are used as numeric constants in MAL.

Hexadecimal Literals

Hexadecimal literals are strings beginning with "0x" and followed by one or more hexadecimal digits ("0"-"9") or letters ("a"-"f" or "A"-"F"). These are used as addresses or address masks in MAL.
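The two literal forms can be recognized with a short classifier. This is a sketch of the rules as stated here, assuming the "0x" prefix must be lower case because MAL is case-sensitive; the function name is made up for the example.

```python
import re

DECIMAL_LITERALS = {"0", "1", "8"}
HEX_LITERAL = re.compile(r"0x[0-9a-fA-F]+")


def classify_literal(token):
    """Return 'decimal', 'hexadecimal', or None if the token is not a MAL literal."""
    if token in DECIMAL_LITERALS:
        return "decimal"
    if HEX_LITERAL.fullmatch(token):
        return "hexadecimal"
    return None


for token in ["8", "0x1F", "9", "0x", "0X10"]:
    print(token, classify_literal(token))
```

Note that "9" is rejected: only the three decimal strings that appear in the grammar ("0", "1" and "8") are accepted as decimal literals.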

Special Characters

The following characters have special meaning in micro-assembly language:









All characters and tokens which are not specifically described above are disallowed in MAL.

Syntactic Elements

The following grammar describes the language accepted by the mic1 micro-assembler. eol, label, and address (a hexadecimal numeric literal) are terminal symbols, as are all strings enclosed in double quotes ("). All other symbols below are non-terminals. "program" is the start symbol.

program ::= line_sequence

line_sequence ::= line line_sequence

line ::= instruction eol
       | directive eol
       | eol

instruction ::= label statement_sequence
              | statement_sequence
              | label

directive ::= ".label" label address
            | ".default" statement_sequence

statement_sequence ::= statement ";" statement_sequence
                     | statement ";"
                     | statement

statement ::= io_statement
            | control_statement
            | assignment_statement
            | nop_statement

io_statement ::= "rd"
               | "wr"
               | "fetch"

control_statement ::= if_statement
                    | multiway_branch_statement
                    | goto_statement

if_statement ::= "if" "(" condition ")" "goto" label ";" "else" "goto" label

condition ::= "N"
            | "Z"

multiway_branch_statement ::= "goto" "(" mb_expr ")"

mb_expr ::= "MBR" "OR" address
          | "MBR"

goto_statement ::= "goto" label

assignment_statement ::= target "=" assignment_statement
                       | expr

target ::= c_register
         | "N"
         | "Z"

c_register ::= "MAR"
             | "MDR"
             | "PC"
             | "SP"
             | "LV"
             | "CPP"
             | "TOS"
             | "OPC"
             | "H"

expr ::= operation
       | operation "<" "<" "8"
       | operation ">" ">" "1"

operation ::= a_term "AND" b_term
            | b_term "AND" a_term
            | a_term "OR" b_term
            | b_term "OR" a_term
            | "NOT" b_term
            | "NOT" a_term
            | b_term "+" a_term
            | a_term "+" b_term
            | b_term "+" "1"
            | a_term "+" "1"
            | b_term "-" a_term
            | "-" a_term
            | b_term "-" "1"
            | b_term "+" a_term "+" "1"
            | a_term "+" b_term "+" "1"
            | b_term
            | a_term
            | "-" "1"
            | "0"
            | "1"

b_term ::= "MDR"
         | "PC"
         | "MBR"
         | "MBRU"
         | "SP"
         | "LV"
         | "CPP"
         | "TOS"
         | "OPC"

a_term ::= "H"

nop_statement ::= "nop"
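As an illustration of how part of this grammar might be recognized, the sketch below handles only the assignment_statement, target and expr rules. It assumes tokens are separated by white space (so a shift is written "<< 8"), it does not check the remaining tokens against the operation rule, and the shift names "SLL8" and "SRA1" are labels chosen for the example.

```python
C_REGISTERS = {"MAR", "MDR", "PC", "SP", "LV", "CPP", "TOS", "OPC", "H"}
TARGETS = C_REGISTERS | {"N", "Z"}


def parse_assignment(statement):
    """Split 'T1 = T2 = expr' into (targets, operation_tokens, shift)."""
    # Everything left of the right-most "=" is a target; the rest is the expr.
    *targets, expr = [part.strip() for part in statement.split("=")]
    for target in targets:
        if target not in TARGETS:
            raise ValueError(f"invalid assignment target: {target!r}")
    tokens = expr.split()
    shift = None
    if tokens[-2:] == ["<<", "8"]:
        shift, tokens = "SLL8", tokens[:-2]
    elif tokens[-2:] == [">>", "1"]:
        shift, tokens = "SRA1", tokens[:-2]
    if not tokens:
        raise ValueError("missing operation")
    return targets, tokens, shift


print(parse_assignment("MAR = SP = SP - 1"))
print(parse_assignment("H = MBRU << 8"))
```

Because MBR appears only in b_term and not in c_register, a statement such as "MBR = H" is rejected, which reflects the grammar's rule that MBR can be read by the ALU but not written from the C bus.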



Directive Semantics


The .default directive allows us to specify a default instruction to place in any unused addresses of the control store. For example:

.default goto err1

This would help "catch" any unintended multiway branches which are not explicitly accounted for in the microcode.


Labeled statements are "anchored" at the specified control store address. Any statement having the label will be located at that specific location in the control store. This directive allows multiway branch statements such as "goto (MBR)" or "goto (MBR OR 0x100)" to dispatch to a known location.

.label nop1 0x00

.label bipush1 0x10
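The combined effect of the two directives can be sketched as a simple layout pass: anchored labels are placed first, the remaining labels take free slots, and every unused word receives the default instruction. The 512-word control store size and the first-free-slot placement are assumptions made for this sketch; the actual mic1 micro-assembler may place unanchored instructions differently.

```python
CONTROL_STORE_SIZE = 512  # assumption: not stated in this document


def layout(anchors, labels, default="goto err1"):
    """Assign a control store address to every label.

    anchors: {label: address} from .label directives
    labels:  all statement labels in program order
    Returns (addresses, store), where unused store words hold `default`.
    """
    store = [None] * CONTROL_STORE_SIZE
    addresses = {}
    for label, address in anchors.items():
        if store[address] is not None:
            raise ValueError(f"address {address:#x} anchored twice")
        store[address] = label
        addresses[label] = address
    # Lazily walk the store in address order, yielding slots that are still empty.
    free_slots = (a for a in range(CONTROL_STORE_SIZE) if store[a] is None)
    for label in labels:
        if label not in addresses:
            address = next(free_slots)
            store[address] = label
            addresses[label] = address
    filled = [slot if slot is not None else default for slot in store]
    return addresses, filled


addresses, store = layout(
    {"nop1": 0x00, "bipush1": 0x10},
    ["Main1", "nop1", "bipush1", "bipush2"],
)
print(addresses)
```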

Statement Semantics

Lines which contain a label, a statement_sequence, or both a label and a statement_sequence are considered to be specifiers for a micro-instruction, that is, for a word in the control store.

If a line begins with a label, its statement may be the target of an explicit goto.

If a statement contains an explicit goto label, there must be a statement having that label as its statement label.

Register names which appear to the left of an equal sign ("=") correspond to control lines which are enabled to load the register from the C bus.

Register names which appear without an equal sign ("=") or to the right of the right-most equal sign correspond to control lines which are enabled to put register values on the A or B buses as inputs to the ALU.

The tokens "+", "-", "<", ">", "AND", "OR", "NOT", "0", "1", and "8" which appear without an equal sign ("=") or to the right of the right-most equal sign are used to determine which control lines are asserted as inputs to the ALU and the SHIFT register.

"rd", "wr", and "fetch", cause the appropriate bits to be set in the control store for enabling the corresponding control lines.

"if", "else", and "goto" are used to set the JAMN, JAMZ, and JMPC, bits of the micro-instruction, along with the NEXT_ADDRESS field.

"nop" is a place-holder which allows us to have a do-nothing instruction without a label.
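How the JAM bits and NEXT_ADDRESS might combine to select the next micro-instruction can be sketched as follows. This document does not spell out the rule; the sketch follows the usual textbook description of the Mic-1 (a 9-bit address whose high bit can be forced by N or Z, and whose low eight bits can be ORed with MBR), so it should be read as an assumption rather than as a specification of the mic1 simulator.

```python
def next_mpc(next_address, jamn=False, jamz=False, jmpc=False,
             n=False, z=False, mbr=0):
    """Compute the address of the next micro-instruction (9 bits)."""
    high_bit = (next_address >> 8) & 1
    # JAMN / JAMZ: a set ALU flag forces the high bit to 1 ("if ... goto ...").
    if (jamn and n) or (jamz and z):
        high_bit = 1
    low_bits = next_address & 0xFF
    # JMPC: OR the low eight bits with MBR ("goto (MBR)" or "goto (MBR OR 0x100)").
    if jmpc:
        low_bits |= mbr & 0xFF
    return (high_bit << 8) | low_bits


# if (Z) goto T; else goto F, with F assumed at 0x042 and T at 0x142
print(hex(next_mpc(0x042, jamz=True, z=True)))
print(hex(next_mpc(0x042, jamz=True, z=False)))
# goto (MBR) with the bipush opcode 0x10 in MBR
print(hex(next_mpc(0x000, jmpc=True, mbr=0x10)))
```

Under this model the two targets of an if statement must sit exactly 0x100 apart, and "goto (MBR)" lands on the address equal to the opcode, which is why labels such as bipush1 are anchored with ".label".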

For a complete example of a mic1 micro-program see mic1ijvm.mal which implements an interpreter for a simplified (integer) subset of a Java Virtual Machine.