There is no denying that we have entered the age of the computer, in which almost every task is done by computer. It is no exaggeration to say that computers reduce human effort in nearly every kind of work and make that work easier and faster. Computers have paved the way for the rapid development of countries as well as individuals, and that development continues at a rapid pace today. It is therefore important to know and learn the structure of a computer system.
Computer architecture is a vibrant and ever-changing field. This assignment aimed to investigate different types of memory and their performance levels. We were asked to select a computer system, and the system we chose and based our assignment on is x86. The x86 family has seen many CPU models and advancements; our research concentrates on the x86-based CPUs of the period 1985-1996.
The assignment comprised an in-depth analysis of the following in the selected computer system (x86 in our case):
- The use of registers
- The use of cache memory
- The use of RAM
- The use of the memory hierarchy
We have tried to produce the best possible report on the prescribed topic while staying within the given word limit.
A register is a single, permanent storage location within the CPU used for a particular, defined purpose. A register holds a binary value temporarily for storage, for manipulation, and/or for simple calculations. Each register is wired within the CPU to perform its specific role. Registers are basic working components of the CPU.
Reasons for using registers:
In the x86 architecture, a processor register is a small amount of storage available on the CPU whose contents can be accessed more quickly than storage available elsewhere. Many architectures operate on the principle of moving data from main memory into registers, operating on it, and then moving the result back into main memory (a so-called load-store architecture). x86 largely follows this pattern but is not a strict load-store design, since many of its instructions can also take an operand directly from memory.
A CPU must have some working space (temporary storage) that provides very fast access when reading and writing data. Registers serve this purpose. They are used in a CPU to temporarily store the data being processed, the instructions being executed and memory addresses. They also store information about the status of the CPU, as well as programs and data while they are being fetched and executed.
The ALU performs addition, subtraction, multiplication, division and conditional checks on values held in registers rather than directly on main memory. Registers give the CPU a place to store temporary values such as the intermediate results of a calculation. Dedicated registers also help a computer distinguish between data and instructions so that it can treat them differently.
To achieve maximum processing efficiency, registers are organised in a specific manner. This register organisation differs between processor architectures. In an x86 computer there are two types of registers:
- User Visible Registers (UVR)
- User Invisible Registers (UIR)
User Visible Registers are visible to programmers; in other words, they can be accessed directly by instructions or programs. User Invisible Registers may be accessed only indirectly by instructions, or exclusively internally by the CPU.
Types of Registers
User Visible Registers
Having defined UVRs, we turn straight to the User Visible Registers found in an x86, since they differ from one architecture to another. An x86 contains 12 User Visible Registers, which we can classify into 3 groups:
- Data Registers
- Pointer and Index Registers/ Address Registers
- Segment Registers
Each of these 3 groups contains 4 registers.
General Purpose Registers: As the name suggests, general purpose registers are the ones used most of the time, and most instructions operate on them. They can be used for operations such as storing data, addresses, indexes and conditions. Some general purpose registers are optimised for specific operations.
- Data Registers: They are used by the CPU for data manipulation. These registers include the Accumulator, Base, Counter and Data registers.
  - Accumulator Register (AX): Stores the result of the last processing step of the ALU.
  - Base Register (BX): Used as a base pointer for memory access. It also receives some interrupt return values.
  - Counter Register (CX): Used as a loop counter and for shifts. It too receives some interrupt return values.
  - Data Register (DX): Used by certain instructions for storing data.
- Pointer and Index Registers / Address Registers: Index and pointer registers hold the offset part of an address. They have various uses, but each register has a specific function. Combined with a segment register, they can be used to point to a far address (within a 1 MB range).
  - Base pointer register (BP): Holds the base address of the current stack frame.
  - Stack pointer register (SP): Holds the address of the top of the stack.
  - Source index register (SI): Used for string and memory array copying.
  - Destination index register (DI): Used for string and memory array copying and setting, and for far pointer addressing.
- Segment Registers: Segment registers contain the segment addresses of various items. They are only available as 16-bit registers. They are set either from a general register or by special instructions. Some of them are vital for correct program execution in multi-segment programming. They are the Code Segment (CS), Data Segment (DS), Stack Segment (SS) and Extra Segment (ES) registers.
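To illustrate how a segment register and an offset register combine into a far address within the 1 MB range, here is a minimal Python sketch of 8086-style real-mode address calculation. The function name `physical_address` is our own, and the 20-bit wrap-around is the original 8086 behaviour, assumed here for simplicity.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address.

    Assumes 8086 real mode: the segment is shifted left by 4 bits
    (multiplied by 16), the offset is added, and the result wraps
    at 1 MB (20 address lines).
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# Example: DS = 0x1234 and SI = 0x0010 address byte 0x12350.
print(hex(physical_address(0x1234, 0x0010)))
```

Because the segment is only shifted by 4 bits, many different segment:offset pairs refer to the same physical byte.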
User Invisible Registers
In x86 there are two types of User Invisible Registers. They are:
- Control and Status Registers: These registers cannot be referenced directly by our program and are manipulated by the control unit. In an x86 there are three: the program counter, the instruction register and the program status word (PSW).
  - Program Counter (PC) or Instruction Pointer (IP): The IP, or program counter, holds the address of the next instruction, telling the processor which instruction to execute.
  - Instruction Register (IR): A dedicated register that stores the current instruction while it is decoded. It cannot be accessed by the programmer.
  - Program Status Word (PSW) / Status / Flag register: A collection of several 1-bit flags (Boolean variables) that track conditions such as arithmetic carry and overflow, power failure and internal computer error. The flags reflect the conditions of the latest ALU operation. A few flags are specific to x86 machines, such as I/O Privilege Level, Nested Task, Resume and Virtual 8086 Mode. The figure below shows some of the general flags.
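As a rough illustration of how the flags reflect the latest ALU operation, the sketch below adds two 16-bit values and derives four common flags. The function name `add16_with_flags` and the dictionary layout are our own choices for illustration; the flag names CF, ZF, SF and OF follow the usual x86 abbreviations.

```python
def add16_with_flags(a, b):
    """Add two 16-bit values and return (result, flags).

    CF: carry out of bit 15 (unsigned overflow)
    ZF: result is zero
    SF: result is negative when read as a signed value (bit 15 set)
    OF: signed overflow (operands share a sign that the result lacks)
    """
    a &= 0xFFFF
    b &= 0xFFFF
    full = a + b
    result = full & 0xFFFF
    flags = {
        "CF": full > 0xFFFF,
        "ZF": result == 0,
        "SF": bool(result & 0x8000),
        "OF": bool(~(a ^ b) & (a ^ result) & 0x8000),
    }
    return result, flags


# 0x7FFF + 1 overflows the signed range but produces no carry.
print(add16_with_flags(0x7FFF, 0x0001))
```

Adding 0xFFFF and 1 shows the opposite case: the result wraps to zero, so the carry and zero flags are set while the overflow flag stays clear.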
- Storage Registers: These are responsible for fetching information from RAM. Unlike most of the above, they sit in separate circuitry within the CPU and hence are not architectural registers. They are:
  - Memory Address Register (MAR): Contains the address of the memory location currently being operated upon.
  - Memory Buffer Register (MBR) or Memory Data Register (MDR): Holds the data read from, or to be written to, the memory location pointed to by the MAR.
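The way these invisible registers cooperate during an instruction fetch can be sketched as follows. This is a simplified textbook model rather than a description of real x86 hardware: memory is a Python list, every instruction is assumed to occupy one memory location (x86 instructions are actually variable length), and the function name `fetch` is hypothetical.

```python
def fetch(memory, pc):
    """Model one instruction fetch using the PC, MAR, MBR and IR.

    Returns the new state as a dictionary of register values.
    """
    mar = pc            # the address of the next instruction goes to the MAR
    mbr = memory[mar]   # memory delivers the contents of that location to the MBR
    ir = mbr            # the instruction is copied to the IR for decoding
    pc = pc + 1         # the PC now points at the following instruction
    return {"PC": pc, "MAR": mar, "MBR": mbr, "IR": ir}


program = ["LOAD", "ADD", "STORE"]
state = fetch(program, 0)
print(state)
```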
Early x86 machines such as the 8086 and 8088 had eight 16-bit general purpose registers labelled AX, BX, CX, DX, BP, SP, SI and DI. The first four could also be accessed as bytes, so an instruction could separately access the high-order and low-order bytes of AX, called AH and AL. The x86 architectures (1989-1996) that we are concentrating on expanded the general purpose registers to 32 bits, making them EAX, EBX, ECX, EDX, EBP, ESP, ESI and EDI, where E stands for extended. The lower halves of these registers can still be accessed as before, ensuring compatibility with earlier CPUs.
x86 CPUs also have eight 80-bit floating point data registers. Later CPUs introduced multimedia (MMX) instructions that use these same registers under the name MM registers, with 64 of the 80 bits serving as each MM register. Because the storage is shared, the MM/FP registers cannot be used for both purposes simultaneously. The Pentium 3 brought with it eight 128-bit registers for SIMD instructions, along with an additional SIMD FP control/status register.
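The overlap between a 32-bit register and its 16-bit and 8-bit parts can be sketched in Python as below. The class name `Register32` and the attribute names `e`, `x`, `h` and `l` are our own shorthand for EAX, AX, AH and AL; this is only a model of the aliasing, not of any real instruction.

```python
class Register32:
    """A 32-bit register with 16-bit and 8-bit views (e.g. EAX/AX/AH/AL)."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF  # full 32-bit value (EAX)

    @property
    def x(self):  # low 16 bits (AX)
        return self.e & 0xFFFF

    @x.setter
    def x(self, value):
        self.e = (self.e & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def h(self):  # bits 8-15 (AH)
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, value):
        self.e = (self.e & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def l(self):  # bits 0-7 (AL)
        return self.e & 0xFF

    @l.setter
    def l(self, value):
        self.e = (self.e & 0xFFFFFF00) | (value & 0xFF)


eax = Register32(0x12345678)
eax.h = 0xAB  # writing AH changes only bits 8-15 of EAX
print(hex(eax.e), hex(eax.x), hex(eax.l))
```

Writing to the 8-bit or 16-bit view leaves the remaining bits of the 32-bit register untouched, which is what lets older 16-bit code keep working on the extended registers.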
Cache memory is very fast memory that is built into a computer's central processing unit (CPU), or situated next to it on a separate chip. It was in 1989 that an x86 first got an on-chip cache, and in 1995 that x86 got its first integrated L2 cache (Pentium Pro).
(Minimum Cache Configuration)
Reasons for using cache memory:
Without cache memory, every time the CPU requested data it would have to send a request to main memory, and the data would then be sent back across the memory bus to the CPU. This is a slow process in computing terms. While a program is running, it tends to use only a small set of instructions and data at any one time. When these are kept in a cache (high-speed built-in memory), programs can operate more quickly and efficiently. Cache is so effective that, all else being equal, a computer with a faster CPU but less cache memory can score lower on performance benchmarks than a computer with a relatively slower CPU but more cache. In some x86 processors, such as the Pentium 4, an Execution Trace Cache improves performance by storing decoded micro-operations.
How the Cache Works
Cache memory works because of a principle known as locality of reference. This principle says that at any particular time most memory references will be confined to one or a few small regions of memory, called localities. Locality is of two types, temporal and spatial. Temporal locality says that an item that has been referenced is likely to be referenced again soon, while spatial locality says that items close to a referenced item are likely to be referenced soon. A cache uses these two principles to predict what the CPU is about to address and brings that data into cache memory beforehand, so that when the CPU needs it, it can be obtained much faster than from main memory. Results written by the processor are saved back to main memory afterwards.
Whenever the byte at a given memory address needs to be read, the processor first attempts to get the data from cache memory. If the cache does not have that data, the processor is halted while the data is loaded from main memory into the cache. A hit ratio of 90% is common with the amounts of cache in current use, which in turn can increase overall processing speed by up to 50%.
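The effect of the hit ratio on memory speed can be estimated with a simple weighted average. The sketch below is a simplified single-level model in which a hit costs the cache access time and a miss costs the main memory access time; the function name and the 10 ns and 70 ns timings are hypothetical values chosen only for illustration.

```python
def average_access_time(hit_ratio, cache_ns, memory_ns):
    """Average time per memory access for a single-level cache.

    Simplified model: a hit costs cache_ns and a miss costs memory_ns.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * memory_ns


# Assumed timings: 10 ns for the cache, 70 ns for main memory.
for h in (0.0, 0.5, 0.9, 0.99):
    print(f"hit ratio {h:.2f}: {average_access_time(h, 10, 70):.1f} ns")
```

With these assumed timings, a 90% hit ratio gives an average of about 16 ns per access, compared with 70 ns when every access goes to main memory.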
Levels of Cache:
The x86 computers that we are concentrating on had two levels of memory caching, built in or integrated with the CPU to speed up memory access. Both caches are made of SRAM, which is faster than the DRAM used for main memory but takes more chip area per bit, so the caches are much smaller than main memory.
Level 1, or primary, cache is the fastest memory (8-64 KB) on the PC and is referred to as the internal cache. It is built directly into the processor itself. L1 cache is split into separate data and instruction caches; the Pentium 3 had 16 KB each of L1 data cache and L1 instruction cache. It is small in size but extremely fast, running at the same speed as the processor.
Level 2 cache is a slightly slower but larger cache memory. In earlier days (1989-1995), x86 computers had an external L2 cache. With the advent of the Pentium Pro in 1995 the integrated L2 cache came into existence, although it was still not located on the same die as the processor and the L1 cache. Recently used information that cannot find a place in the L1 cache is placed in the L2 cache.
Modern CPUs have a dedicated bus between the L1 and L2 caches rather than connecting both caches to the memory bus.
Cache memory is organised very differently from regular memory. It is organised into blocks, each of which can store either 8 or 16 bytes of data. A block holds an exact copy of the same amount of storage from somewhere in main memory. Each block also holds a tag, which identifies the main memory address of the data it is storing a copy of. A typical 64 KB cache memory may consist of about 8000 8-byte blocks (8192, to be exact), each with a tag. A simplified step-by-step illustration of the use of cache memory is shown in the figure below.
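The block-and-tag organisation described above can be sketched with a small simulation of a 64 KB cache made of 8192 blocks of 8 bytes. The text does not say how blocks are placed, so the sketch assumes a direct-mapped cache, in which each memory block can live in exactly one cache slot. The names `split_address` and `DirectMappedCache` are our own, and only tags are modelled, not the stored data.

```python
BLOCK_SIZE = 8      # bytes per block
NUM_BLOCKS = 8192   # 8192 blocks * 8 bytes = 64 KB


def split_address(addr):
    """Split an address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr % BLOCK_SIZE          # byte position inside the block
    block_number = addr // BLOCK_SIZE   # which memory block the byte is in
    index = block_number % NUM_BLOCKS   # cache slot that block maps to
    tag = block_number // NUM_BLOCKS    # identifies which block occupies the slot
    return tag, index, offset


class DirectMappedCache:
    """Tracks only tags, which is enough to count hits and misses."""

    def __init__(self):
        self.tags = [None] * NUM_BLOCKS
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # load the block, replacing the old one
        self.misses += 1
        return False


cache = DirectMappedCache()
for addr in range(64):  # read 64 consecutive bytes
    cache.access(addr)
print(cache.hits, cache.misses)
```

Reading 64 consecutive bytes touches 8 blocks, so only the first byte of each block misses. This is spatial locality at work.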
Write strategies for cache memory:
Before a cache line can be replaced, it is important to determine whether the line has been modified. If the cached copy differs from the corresponding block in main memory, main memory must be updated before the line is replaced. The write strategy used depends on the situation:
- On a cache hit
1. Write-through: When a cache hit (a write hit) occurs, the information is written to both the cache and main memory. This is easier to implement, and a read miss never results in writes to main memory, but it uses more memory bandwidth. In spite of this, most x86 computers use it.
2. Write-back (posted write): On a cache hit only the cache is updated; main memory is updated only when the entry is evicted. A dirty bit tracks whether each block is dirty or clean (modified or not modified). This gives very fast writes (at the speed of cache memory) and uses less memory bandwidth, but main memory can be seriously out of date with respect to the cache until the block is written back.
- On a cache miss
1. No write allocate: The change is written only to main memory.
2. Write allocate (fetch on write): The block containing the address is loaded into the cache, and the write then proceeds as on a cache hit.
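The difference in memory traffic between the hit policies can be illustrated by counting writes to main memory. The sketch below is a simplified model under our own assumptions: a tiny direct-mapped cache of 4 blocks of 8 bytes, write-back paired with write allocate, and write-through paired with no write allocate. The class name `WritePolicyCache` is hypothetical, and only tags and dirty bits are tracked, not data.

```python
class WritePolicyCache:
    """Counts main memory writes under write-through or write-back."""

    def __init__(self, write_back, num_blocks=4, block_size=8):
        self.write_back = write_back
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.tags = [None] * num_blocks
        self.dirty = [False] * num_blocks
        self.memory_writes = 0

    def write(self, addr):
        block = addr // self.block_size
        index = block % self.num_blocks
        tag = block // self.num_blocks
        if not self.write_back:
            # Write-through with no write allocate: every write reaches
            # main memory, and a miss does not load the block.
            self.memory_writes += 1
            return
        if self.tags[index] != tag:
            # Write allocate: the old block is evicted, and written to
            # main memory first if it is dirty.
            if self.tags[index] is not None and self.dirty[index]:
                self.memory_writes += 1
            self.tags[index] = tag
        self.dirty[index] = True  # modified in cache only

    def flush(self):
        """Write every dirty block back to main memory."""
        for i in range(self.num_blocks):
            if self.dirty[i]:
                self.memory_writes += 1
                self.dirty[i] = False


writes = [0, 1, 2, 3, 0, 1, 2, 3]  # eight writes to the same block
wt = WritePolicyCache(write_back=False)
wb = WritePolicyCache(write_back=True)
for addr in writes:
    wt.write(addr)
    wb.write(addr)
wb.flush()
print(wt.memory_writes, wb.memory_writes)
```

For eight writes to the same block, write-through sends all eight to main memory, while write-back sends a single write when the dirty block is finally flushed.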
Generally, to handle both of these cases, computers use a combination of these policies, such as:
- Write through with no write allocate