Large Parallel Processing Systems Architecture Essay



Today the transputer would be seen as a parallel processing tile from which to build large parallel processing systems. Transputer-like architectures are now the mainstream of parallel computing.

It was seen in many different ways, depending on the standpoint and knowledge of the person viewing it.

Inmos's starting point in creating the transputer is embodied in the name, derived from trans, meaning across, with the suffix 'puter, from computer. The thinking was that applications increasingly involved flows of data rather than the more structured activities on predefined sets of data that are characteristic of a "normal" computer. This was the same thinking that was creating the digital signal processor (DSP). But where a DSP takes data in from a source, processes it, and passes it on, the transputer had four channels of bi-directional communication, or links. That made it simple to build a two-dimensional array, with each transputer linking to four neighbors.
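
The two-dimensional array can be pictured with a minimal Python sketch. It assumes a rectangular, non-wrapping grid in which the four links of each transputer point north, east, south and west; the function name `neighbours` and the coordinate scheme are illustrative and do not come from any Inmos documentation.

```python
def neighbours(row, col, rows, cols):
    """Return the grid positions reachable over the four links of node (row, col).

    Assumption: a non-wrapping rows x cols array, so a node on an edge simply
    leaves the corresponding link unconnected.
    """
    candidates = [
        (row - 1, col),  # north link
        (row, col + 1),  # east link
        (row + 1, col),  # south link
        (row, col - 1),  # west link
    ]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

An interior node uses all four links, while a corner node uses only two of them.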


The transputer was an innovative computer design of the 1980s from INMOS, a British semiconductor company based in Bristol. It was the first single-chip computer designed for message-passing multiprocessor systems. When the transputer was first revealed, many thought this exceptional concept would be the next revolution in microprocessor technology. As you may already have guessed, things did not happen as expected: today this interesting chip has been largely forgotten, but it is still well worth examining in this paper.


The first generation comprises three families. The 16-bit transputers are the T212, T222 and T225 (all three ran at 20 MHz). The 32-bit transputers without a floating-point unit are the T400, T414, T425 and T426 (the T414 was available in 15 and 20 MHz varieties, the T425 in 20, 25 and 30 MHz varieties). The 32-bit transputers with a floating-point unit are the T800, T801 and T805 (the T805 was also later available as a 30 MHz part). All have the same instruction set, the same architecture and fully compatible communication links. The second generation consists of a single 32-bit transputer with a 64-bit floating-point unit, the T9000. Although the architecture is the same, it is a new design and a much more complex chip than its predecessors.

All the transputers except the T9000 have an identical architecture. The internal bus connects the processor to local memory and to an external memory interface. The communication links are connected to the bus by an interface, which makes it possible for the processor to work independently of the links. Depending on the type of transputer, the floating-point unit and other system services are also connected to this bus. The T805, shown in Figure 1, is the best-known model. It consists of a conventional RISC processor, a communication subsystem, 4 Kbytes of on-chip RAM, four high-speed inter-processor links, a memory interface, system services and a floating-point unit. These functional units are briefly explained in the following sections.

The process:

A process on the transputer is described by several pieces of information, such as workspace, registers, program and priority. Such a process does not have to be a sequential process but can also consist of several sub processes.

The processes on the transputer can be separated into two categories:

Active processes: processes that are being executed or are waiting their turn to be executed.

Inactive processes: processes that are suspended until a specific time or are waiting for interprocess communication.
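
The two categories can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the class `Scheduler`, its method names and the use of a single queue are hypothetical, priorities are ignored, and nothing here is meant to reproduce the real hardware scheduler.

```python
from collections import deque


class Scheduler:
    """Toy model of active and inactive processes (not the real hardware)."""

    def __init__(self):
        self.active = deque()  # front = currently executing, rest = waiting to run
        self.inactive = set()  # waiting for a timer or for communication

    def start(self, name):
        """A new process joins the back of the active queue."""
        self.active.append(name)

    def current(self):
        """Return the executing process, or None if nothing is active."""
        return self.active[0] if self.active else None

    def suspend_current(self):
        """The executing process starts waiting, so it becomes inactive."""
        name = self.active.popleft()
        self.inactive.add(name)
        return name

    def wake(self, name):
        """The awaited event occurred: the process becomes active again."""
        self.inactive.remove(name)
        self.active.append(name)
```

In this model a process that waits for a message stops occupying the processor until `wake` is called for it.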

Registers:

"The transputer has a small number of registers: a workspace register (Wreg), an instruction pointer (Iptr), an operand register (Oreg) and a three-register evaluation stack (Areg, Breg and Creg)."

The registers Areg, Breg and Creg form a stack that holds intermediate results, rather like early calculators. Every instruction notionally pops off the stack the items that it is going to work on, then pushes its result back onto the stack. This stack arrangement is what allows most of the instructions to have no operands. The arrangement is like some programmable calculator languages (though such languages are much more limited). There is no protection against pushing so many values onto the stack that it overflows; avoiding this is left to compilers and assembly code writers. These features lead to simplified register connections, compact instructions and faster register access.
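
A minimal Python sketch of such a three-register stack is given here to make the behaviour concrete. The class name `EvaluationStack` and the `add` method are illustrative; what the real hardware leaves in Creg after a pop is not modelled.

```python
class EvaluationStack:
    """Three-register evaluation stack (Areg, Breg, Creg), as a rough sketch."""

    def __init__(self):
        self.a = self.b = self.c = 0

    def push(self, value):
        # The old contents of Creg are silently lost: there is no overflow check.
        self.c = self.b
        self.b = self.a
        self.a = value

    def pop(self):
        value = self.a
        self.a = self.b
        self.b = self.c
        # Creg is left unchanged in this sketch (hardware behaviour not modelled).
        return value

    def add(self):
        # An operand-free instruction: pops two items and pushes their sum.
        right = self.pop()
        left = self.pop()
        self.push(left + right)
```

Pushing 2 and 3 and then calling `add` leaves 5 in Areg. Pushing a fourth value without popping discards the oldest one, which is exactly the overflow that compilers have to avoid.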

Iptr, Wreg and Oreg are called the sequential control registers. The instruction pointer (Iptr) holds the address of the next instruction. The workspace register (Wreg) holds the workspace pointer (Wptr), which is the address of an area of memory called the local workspace. The operand register (Oreg) holds the operand for the current instruction; it cannot be directly loaded from (or stored in) the data part of the memory.

Instruction Set:

All the transputers have the same instruction format.

Instruction Fetch State

In order to fetch the instruction to be executed next:

  1. Iptr must be selected as the input to the address bus, because Iptr contains the address of the next instruction,
  2. memory must be selected as the source for the data bus, since the address of the instruction to be executed next, which is kept in Iptr, is loaded onto the address bus,
  3. Ireg must be set as the output destination for the data bus, and
  4. the next address of the micro-code ROM must be set to 0x001 to go to the instruction decode state.

The specification of this state is described in the micro-code ROM at address 0x000.

Instruction Decode State

The four higher-order bits of Ireg, or the contents of the 32-bit Oreg, are used to specify the instruction to be executed next. The next address of the micro-code ROM is then determined conditionally according to the decoded instruction.

Instruction Execution State

If the instruction being executed finishes in one state transition, the next state is the Instruction Fetch state again. If instead the instruction needs further states to complete, the next address for the micro-code ROM is the one appropriate for its next state.
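
The three states can be traced with a short Python sketch. The ROM addresses 0x000 and 0x001 are taken from the text; everything else is an assumption made for illustration: the function name `run_cycle`, the constant `EXECUTE_BASE`, the treatment of each instruction as one byte whose lower four bits are data, and the simplification that every instruction completes in a single execute state.

```python
FETCH_ADDR = 0x000    # micro-code ROM address of the fetch state (from the text)
DECODE_ADDR = 0x001   # micro-code ROM address of the decode state (from the text)
EXECUTE_BASE = 0x010  # hypothetical: where per-instruction execute states begin


def run_cycle(memory, iptr):
    """Walk one instruction through fetch, decode and a one-state execute.

    Returns (function_code, data, next_iptr, trace), where trace lists the
    micro-code ROM addresses that were visited.
    """
    trace = []

    # Fetch: Iptr drives the address bus, memory drives the data bus,
    # and Ireg latches whatever is on the data bus.
    trace.append(FETCH_ADDR)
    address_bus = iptr
    data_bus = memory[address_bus]
    ireg = data_bus

    # Decode: the four higher bits of Ireg select the instruction.
    trace.append(DECODE_ADDR)
    function_code = (ireg >> 4) & 0xF
    data = ireg & 0xF  # assumption: the lower four bits carry data

    # Execute: assumed to finish in one state, so control returns to fetch.
    trace.append(EXECUTE_BASE + function_code)
    next_iptr = iptr + 1

    return function_code, data, next_iptr, trace
```

An instruction that needs several states would append further ROM addresses to the trace before returning to the fetch state.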

Floating Point Unit:

"It is almost independent of the rest of the chip. It has its own internal registers, separate from the registers used by integer operations. It executes instructions to perform floating-point arithmetic operations, including commonplace operations such as addition or multiplication, and more complicated operations such as evaluation of some transcendental functions like sine or logarithm." It has its own evaluation stack registers: FAreg, FBreg and FCreg. There are 53 floating-point instructions. Programming it in a high-level language rather than in assembly is strongly advised. It follows the IEEE standard for floating-point formats, operations and results. A 32-bit number has 1 bit for the sign, 8 bits for the exponent and 23 bits for the mantissa. A 64-bit number has 1 bit for the sign, 11 bits for the exponent and 52 bits for the mantissa. It also supports the special results Inf (infinity) and NaN (not a number, for undefined results).

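The bit layouts described for the floating-point unit can be checked with a short Python sketch that splits a value into its sign, exponent and mantissa fields. It runs on an ordinary computer, not on a transputer, and the function names are illustrative. The exponent biases of 127 and 1023 mentioned in the comments are part of the IEEE 754 standard and are not stated in the essay.

```python
import struct


def split_float32(x):
    """Return the (sign, exponent, mantissa) bit fields of x as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 bits
    return sign, exponent, mantissa


def split_float64(x):
    """Return the (sign, exponent, mantissa) bit fields of x as a 64-bit float."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits, stored with a bias of 1023
    mantissa = bits & ((1 << 52) - 1)   # 52 bits
    return sign, exponent, mantissa
```

Infinity shows up as an all-ones exponent with a zero mantissa, and NaN as an all-ones exponent with a non-zero mantissa.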

"The transputer has two timers, one that gives a tick every microsecond and one that gives a tick every 64 microseconds (for the 20 MHz T414). This can be considered another inconvenience because the two timers are associated with a level of priority. Low-priority processes cannot use the high-resolution timer.

This means it can happen that processes run needlessly in high-priority, all because of the fact they have to use the high-resolution timer" (Transputer, Jacco de Leeuw and Arjan de Mes, October 1992).
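
The effect of the two tick periods can be shown with a small Python calculation. The periods of 1 and 64 microseconds come from the quotation above; pairing the 1 microsecond timer with high priority follows its remark that low-priority processes cannot use the high-resolution timer. The constant and function names are illustrative.

```python
HIGH_PRIORITY_TICK_US = 1   # high-resolution timer: one tick per microsecond
LOW_PRIORITY_TICK_US = 64   # low-resolution timer: one tick per 64 microseconds


def ticks_for_delay(delay_us, high_priority):
    """Number of whole timer ticks needed to wait at least delay_us microseconds."""
    tick = HIGH_PRIORITY_TICK_US if high_priority else LOW_PRIORITY_TICK_US
    return -(-delay_us // tick)  # ceiling division


def actual_delay_us(delay_us, high_priority):
    """Delay actually obtained once the request is rounded up to whole ticks."""
    tick = HIGH_PRIORITY_TICK_US if high_priority else LOW_PRIORITY_TICK_US
    return ticks_for_delay(delay_us, high_priority) * tick
```

A requested delay of 100 microseconds is met exactly at high priority but becomes 128 microseconds at low priority, which is why a process needing fine timing may have to run at high priority.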

System Services:

"On all INMOS board products the term 'system services' refers to the collection of the reset, analyse, and error signals.

On the IMS B008 the system services for the TRAM in slot 0 can be connected to either the UP system services from another board or the system services controlled by the PC bus interface. System services for the other TRAMs can be connected to the same source as TRAM 0 or to the subsystem port of TRAM 0. As shown in the block diagram the Down and Subsystem services are brought out to the 37 way D-type connector allowing this hierarchy to be extended to multi board systems."


Communication between processes on the transputer is performed by two instructions, input message and output message. The communication supported is a point-to-point, unbuffered message-passing scheme. It therefore requires a handshake between processes, which synchronises them. Communications over the links are controlled by autonomous controllers, which have DMA access to the transputer's memory. The links are extremely flexible and can be used in several ways: for interfacing with peripherals using a link adaptor; by an ASIC (application-specific integrated circuit) chip, which can use a link to read and write directly into a transputer's memory at high speed; and, most commonly, to talk to another processor, usually another transputer.
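
The unbuffered, synchronising nature of this scheme can be imitated in Python with threads. This is only an analogy under stated assumptions: the class `Channel`, its methods `output` and `input`, and the helper `demo` are hypothetical names, and the semaphores stand in for the hardware handshake rather than reproducing it.

```python
import threading


class Channel:
    """Point-to-point, unbuffered channel: output blocks until input has run."""

    def __init__(self):
        self._item = None
        self._offered = threading.Semaphore(0)  # a message is waiting to be taken
        self._taken = threading.Semaphore(0)    # the receiver has copied the message
        self._send_lock = threading.Lock()      # only one output in flight at a time

    def output(self, message):
        with self._send_lock:
            self._item = message
            self._offered.release()
            self._taken.acquire()  # handshake: wait until the receiver has it

    def input(self):
        self._offered.acquire()    # wait until a sender offers a message
        message = self._item
        self._taken.release()      # handshake: let the sender continue
        return message


def demo(values):
    """Send values from one thread to the caller over a single channel."""
    channel = Channel()
    sender = threading.Thread(target=lambda: [channel.output(v) for v in values])
    sender.start()
    received = [channel.input() for _ in values]
    sender.join()
    return received
```

Because nothing is buffered, the sender cannot run ahead of the receiver: each `output` returns only after the matching `input` has taken the message.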

Link Communication

The hardware connection of links is simple over short distances. Links are serial ports; as the figure shows, each link connection requires only two tracks. In the transputer, the processor and the four links have independent access to the memory. The processor sets up a link and is then free to execute other code while dedicated link logic handles the communication. All four links can be outputting and inputting while the processor is running code. Of course, there may be a problem with bandwidth when the processor and all the links access memory at the same time.

Because of the way the links are designed, transputers do not need to be synchronized in order to talk to each other.

T9000 Second Generation:

"The T9000 is the latest generation of Transputers from INMOS. It represents an improvement on the existing generation of transputer products in both capability and performance. The T9000 extends the transputer architecture in a number of ways. The most important of these is that the T9000 transputer decouples the physical connectivity of a system from its logical connectivity. Between any two directly connected T9000 transputers there may be established an almost unlimited number of [virtual channels].

The T9000 link system also enables transputers to be connected via a network of C104 packet routers which allows virtual channels to be established from any transputer to any number of other transputers. Other extensions of the architecture include the enhancement of the process model to provide per-process error handling facilities and the ability to run programs under memory management. The T9000 has about ten times the performance of a T805. This improvement derives from a variety of sources including the use of caching, improvements in semiconductor technology, and a highly pipelined, superscalar processor." (The T9000 Transputer)

"It has a 32-bit pipelined processor with a 64-bit FPU and 16 Kbytes of cache. There are four bi-directional serial data links and a Virtual Channel Processor (VCP) allowing efficient T9000-to-T9000 communications. These components are combined onto a single integrated circuit." (09 Nov 95, The Application of the T9000 Transputer to the CPLEAR experiment at CERN)

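The idea of carrying many logical channels over one physical link can be sketched in Python. This is an illustration only: the function names, the round-robin ordering and the packet layout of a (channel number, payload) pair are assumptions and are not taken from the T9000 documentation.

```python
from collections import defaultdict


def multiplex(messages):
    """Interleave payloads from several virtual channels onto one physical link.

    messages maps a virtual channel number to a list of payloads. Each packet
    on the link is a (channel, payload) pair (a hypothetical header format).
    """
    pending = {channel: list(payloads) for channel, payloads in messages.items()}
    link = []
    while any(pending.values()):
        for channel in sorted(pending):  # simple round-robin between channels
            if pending[channel]:
                link.append((channel, pending[channel].pop(0)))
    return link


def demultiplex(link):
    """Sort the packets arriving on the link back into their virtual channels."""
    received = defaultdict(list)
    for channel, payload in link:
        received[channel].append(payload)
    return dict(received)
```

The receiving side recovers each channel's payloads in their original order, so the processes at either end need not know that they share a wire.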


  • Transputer Application, M. Jane et al., Eds., IOS Press, 1992
  • Do You Know What a Transputer Is?, Jos Kirps, Jan 15th, 2008