Chapter One


In this chapter, we introduce the types of communication needed for UART.

During the past decade, the data communications industry has grown at an astronomical rate. Consequently, the need to provide communications between dissimilar computer systems has also increased. Thus, to ensure an orderly transfer of information between two or more data communication systems, a wide range of equipment serving different needs has come into existence.

A data communications network (Fig 1.1) can be as simple as two personal computers connected through a public telecommunications network, or it can comprise a complex network of one or more mainframe computers and hundreds (or even thousands) of remote terminals, personal computers and workstations. Today, data communications networks are used to interconnect virtually all kinds of digital computing equipment: Automatic Teller Machines (ATMs) to bank computers, personal computers to information highways such as the Internet, and workstations to mainframe computers. Data communications networks are also used for airline and hotel reservation systems and for mass media and news networks such as Associated Press (AP) or United Press International (UPI). The list of applications for data communications networks goes on almost indefinitely.

1.1 Serial and Parallel Data Transmission

Binary information can be transmitted either in parallel or in serial. Figure 1.2 shows the binary code 0110 being transmitted from location A to location B in parallel. As the figure shows, each bit position (A0 to A3) has its own transmission line. Consequently, all four bits can be transmitted simultaneously during the time of a single clock pulse (T). This type of transmission is called parallel-by-bit or serial-by-character. In serial transmission, by contrast, the bits are sent one after another over a single line, so the same four-bit code takes four clock pulses (4T) to transmit.

1.2 Advantages of Serial over Parallel transmission

  • In serial transmission only one bit is transmitted at a time, so the wiring is minimal. In parallel transmission the bits are sent simultaneously, which requires a separate wire for each bit; this is costly and impractical over long distances.
  • All bits in a parallel transmission are expected to reach the data terminal equipment at the same instant. In practice the delay differs from line to line (skew), which limits the speed at which high-speed devices can operate.
  • Parallel lines running side by side are also more susceptible to electromagnetic interference.

1.3 A typical example of UART application

1.4 Types of Serial Transmission

There are two primary forms of serial transmission:

  • Synchronous
  • Asynchronous

Depending on the modes supported by the hardware, the name of the communication subsystem will usually include an A if it supports asynchronous communication and an S if it supports synchronous communication (for example, UART and USART).

Synchronous Serial Transmission

Synchronous serial transmission requires that the sender and receiver share a common clock, or that the sender provide a strobe or other timing signal so that the receiver knows when to read the next bit of data. In most forms of synchronous serial communication, if there is no data available to transmit at a given instant, a fill character must be sent instead so that data is always being transmitted. Synchronous communication is usually more efficient because only data bits are transmitted between sender and receiver. It is also more costly, however, because it requires additional wiring and circuitry to share a clock signal between sender and receiver.

A form of synchronous transmission is used with printers and fixed disk devices. In these systems data is sent on one set of wires while a clock or strobe is sent on a different wire. Printer and fixed disk devices are not normally serial devices because most fixed disk interface standards send an entire word of data for each clock or strobe signal by using a separate wire for each bit of the word. In the PC industry, these are known as parallel devices.

The standard serial communication hardware in the PC does not support synchronous operation. This mode is described here for comparison purposes only.

Asynchronous Serial Transmission

Asynchronous transmission allows data to be transmitted without the sender having to send a clock signal to the receiver. Instead, the sender and receiver must agree on timing parameters in advance, and special bits are added to each word to synchronize the sending and receiving units.

1.5 Asynchronous data format

With asynchronous data, each byte is framed between a start bit and a stop bit. Figure 1.5 shows the format used to frame a byte for asynchronous data transmission. The first bit transmitted is the start bit, which is always logic 0. The data bits are transmitted next, beginning with the LSB and continuing through the MSB. The parity bit (if used) is transmitted directly after the MSB of the byte. The last bit transmitted is the stop bit, which is always logic 1. There can be 1, 1.5 or 2 stop bits.

Standard serial data format

Logic 0 is used for the start bit because an idle condition (no data transmission) on a data communication circuit is identified by the transmission of continuous 1s (these are often called idle line 1s). Therefore, the start bit of the first byte is identified by a high to low transition in the received data and the bit that immediately follows the start bit is the LSB of the character code. All stop bits are logic 1s, which guarantees a high to low transition at the beginning of each character. After the start bit is detected, the data and parity bits are clocked into the receiver. If the data are transmitted in real time (i.e., as an operator types data into the computer terminal), the number of idle line 1s between each byte will vary. During this dead time, the receiver will simply wait for the occurrence of another start bit before clocking in the next character.
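
The framing rules above can be sketched in VHDL as a small package. This is only an illustration under stated assumptions: the names uart_frame_pkg, frame_t and make_frame are hypothetical, the word is 8 bits wide, even parity is used, and there is exactly one stop bit.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package: builds one asynchronous frame
-- (8 data bits, even parity, 1 stop bit are assumptions).
package uart_frame_pkg is
    -- Frame in transmission order: index 0 is sent first.
    subtype frame_t is std_logic_vector(0 to 10);
    function make_frame(data : std_logic_vector(7 downto 0)) return frame_t;
end package uart_frame_pkg;

package body uart_frame_pkg is
    function make_frame(data : std_logic_vector(7 downto 0)) return frame_t is
        variable frame  : frame_t;
        variable parity : std_logic := '0';
    begin
        frame(0) := '0';                   -- start bit, always logic 0
        for i in 0 to 7 loop
            frame(i + 1) := data(i);       -- data bits, LSB first
            parity := parity xor data(i);  -- running even parity
        end loop;
        frame(9)  := parity;               -- parity bit follows the MSB
        frame(10) := '1';                  -- stop bit, always logic 1
        return frame;
    end function make_frame;
end package body uart_frame_pkg;
```

For the byte x"41" (binary 01000001) the function returns, in transmission order, the start bit 0, the data bits 10000010 (LSB first), the parity bit 0 and the stop bit 1.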

1.6 Baud

Baud is a measurement of transmission speed in asynchronous communication. Because of advances in modem communication technology, this term is frequently misused when describing the data rates of newer devices.

Traditionally, a baud rate represents the number of bits that are actually sent over the medium, not the amount of data that is actually moved from one DTE device to the other. The baud count includes the overhead bits (start, stop and parity) that are generated by the sending UART and removed by the receiving UART. This means that a seven-bit word of data actually takes 10 bits to be completely transmitted. Therefore, a modem capable of moving 300 bits per second from one place to another can normally move only 30 seven-bit words per second if parity is used and one start bit and one stop bit are present.
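
The arithmetic behind this figure can be written out as follows. The efficiency in the last line is not stated in the text; it is derived here from the same numbers.

```latex
\[
\begin{aligned}
N_{\text{frame}} &= 1\ (\text{start}) + 7\ (\text{data}) + 1\ (\text{parity}) + 1\ (\text{stop}) = 10\ \text{bits} \\
\text{words per second} &= \frac{300\ \text{bits/s}}{10\ \text{bits/word}} = 30 \\
\text{efficiency} &= \frac{7\ \text{data bits}}{10\ \text{transmitted bits}} = 70\%
\end{aligned}
\]
```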

1.7 Why UART?

UART (Universal Asynchronous Receiver Transmitter) is a serial communication interface that receives and transmits serial data. Most computers and microcontrollers have one or more serial data ports used to communicate with serial input/output devices.

Computers and microcontrollers process data in parallel, but to interact with distant data terminal equipment they must rely on a transmission line that supports serial communication. Hence there is a great need for conversion of parallel data to serial data and vice versa. The UART serves this purpose well.

When transmitting, the UART takes 8 bits of parallel data and converts them into a serial bit stream that consists of a start bit (0), 8 data bits (least significant bit first) and a stop bit (1). When receiving, the UART detects the start bit, receives the 8 data bits and converts the data into parallel form when the stop bit is received. Since no clock is transmitted, the UART must synchronize the incoming bit stream with its local clock.
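
The transmit side described above might be sketched as a 10-bit shift register. This is a minimal illustration, not the design developed in this report: the entity name uart_tx_sketch and the ports baud_tick, load and busy are assumptions, and baud_tick is assumed to be a one-clock-wide pulse supplied once per bit period by a separate baud rate generator.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical transmitter sketch; all names are illustrative.
entity uart_tx_sketch is
    port (
        clk       : in  std_logic;
        baud_tick : in  std_logic;                     -- one pulse per bit period (assumed)
        load      : in  std_logic;                     -- request to send data_in
        data_in   : in  std_logic_vector(7 downto 0);  -- parallel data
        txd       : out std_logic;                     -- serial output
        busy      : out std_logic                      -- high while a frame is being sent
    );
end entity uart_tx_sketch;

architecture rtl of uart_tx_sketch is
    -- Holds stop bit, 8 data bits and start bit; bit 0 is sent first.
    signal shreg : std_logic_vector(9 downto 0) := (others => '1');
    signal count : integer range 0 to 10 := 0;         -- bits still to send
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if count = 0 then
                if load = '1' then
                    shreg <= '1' & data_in & '0';      -- stop, data, start
                    count <= 10;
                end if;
            elsif baud_tick = '1' then
                shreg <= '1' & shreg(9 downto 1);      -- shift right, fill with idle 1s
                count <= count - 1;
            end if;
        end if;
    end process;

    txd  <= shreg(0) when count /= 0 else '1';         -- idle line is logic 1
    busy <= '1' when count /= 0 else '0';
end architecture rtl;
```

In this sketch the start bit lasts only until the next baud_tick, so it is a full bit period long only if load is asserted in step with baud_tick; a practical design would restart the bit timer when a byte is loaded.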

Chapter Two


In this chapter we present an introduction to the hardware design process using hardware description languages, and VHDL in particular.

As the size and complexity of digital systems increase, more computer-aided design tools are introduced into the hardware design process. The early paper-and-pencil design methods have given way to sophisticated design entry, verification and automatic hardware generation tools. The newest addition to this design methodology is the hardware description language (HDL), and a great deal of effort is being expended in the development of such languages. Actually, the use of these languages is not new. Languages such as CDL, ISP and AHPL have been in use for some years. However, their primary application has been the verification of a design's architecture. They do not have the capability to model designs with a high degree of accuracy; that is, their timing model is not precise and/or their language constructs imply a certain hardware structure. Newer languages such as HHDL, ISP and VHDL have more universal timing models and imply no particular hardware structure. A general view of the design process using HDLs is shown in Fig 2.1.

Hardware description languages have two main applications: documenting a design and modeling it. Good documentation of a design helps to ensure design accuracy and design portability. Since HDLs are supported by simulators, the model inherent in an HDL description can be used to validate a design. Prototyping a complicated system is extremely expensive, and the goal of those concerned with the development of hardware languages is to replace this prototyping process with validation through simulation. Other uses of HDL models are test generation and silicon compilation.

2.1 Use of VHDL tools in VLSI design:

IC designers are always looking for ways to increase their productivity without degrading the quality of their designs, so it is no wonder that they have embraced logic synthesis tools. In the last few years these tools have become capable of producing designs as good as those of a human designer. Logic synthesis is now helping to bring about a switch to design using a hardware description language to describe the structure and behavior of circuits, as evidenced by the recent availability of logic synthesis tools that use the Very High Speed Integrated Circuit Hardware Description Language (VHDL). Logic synthesis tools can now automatically produce a gate-level netlist, allowing designers to formulate their designs in a high-level description such as VHDL.

Logic synthesis provides two fundamental capabilities: automatic translation of high-level descriptions into logic designs, and optimization to decrease the circuit area and increase its speed. Many designs created with logic synthesis tools are as good as or better than those created manually, in terms of chip area occupied and IC signal speed.

The ability to translate a high-level description into a netlist automatically can improve design efficiency markedly. It quickly gives designers an accurate estimate of their logic's potential speed and chip real estate needs. In addition, designers can quickly implement a variety of architectural choices and compare their area and speed characteristics.

In a design methodology based on synthesis, the designer begins by describing a design's behavior in high-level code, capturing its intended functionality rather than its implementation. Once the functionality has been thoroughly verified through simulation, the designer reformulates the design in terms of large structural blocks such as registers, arithmetic units, storage registers and combinational logic. Combinational logic typically constitutes only about 20% of a chip's area, yet its creation can easily absorb 80% of the time in gate-level design. The resulting description is called Register Transfer Level (RTL), since its equations describe how data is transferred from one register to another.
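
As a hedged illustration of what an RTL description looks like, the sketch below moves data from an input register through an adder into an accumulator register on every clock edge. The entity name accumulate_rtl, its ports and the 8-bit width are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical RTL example; names and widths are illustrative.
entity accumulate_rtl is
    port (
        clk   : in  std_logic;
        rst   : in  std_logic;               -- synchronous reset
        din   : in  unsigned(7 downto 0);
        total : out unsigned(7 downto 0)
    );
end entity accumulate_rtl;

architecture rtl of accumulate_rtl is
    signal in_reg  : unsigned(7 downto 0);   -- input register
    signal acc_reg : unsigned(7 downto 0);   -- accumulator register
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if rst = '1' then
                in_reg  <= (others => '0');
                acc_reg <= (others => '0');
            else
                in_reg  <= din;                -- transfer: din -> in_reg
                acc_reg <= acc_reg + in_reg;   -- transfer: in_reg -> adder -> acc_reg
            end if;
        end if;
    end process;

    total <= acc_reg;
end architecture rtl;
```

The two registers are explicit, while the adder between them is the combinational logic that a synthesis tool would minimize and map to gates.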

In a logic synthesis process, the tool's first step is to minimize the complexity of the logic equations, and hence their size, by finding common terms that can be used repeatedly. In a translation step called technology mapping, the minimized equations are mapped into a set of gates. The non-synthesized portions of the logic are also mapped into a technology-specific implementation at this point. Here the designer must choose the application specific integrated circuit (ASIC) vendor library in which to implement the chip, so that the logic synthesis tool can efficiently apply the gates available in that library.

The primary consideration in the entire synthesis process is the quality of the resulting circuit. Quality in logic synthesis is measured by how close the circuit comes to meeting the designer's speed, chip area and power goals. These goals can apply to the entire IC or to portions of the logic.

Logic synthesis has achieved its greatest success on synchronous designs that have significant amounts of combinational logic. Asynchronous designs require that designers formulate timing constraints explicitly. Unlike the behavior of asynchronous designs, the behavior of synchronous designs is not affected by events such as the arrival times of signals. By devising a set of constraints that the synthesis tool has to meet, the designer directs the process towards the most desirable solution.

Although it might be desirable to build a circuit that is both small and fast, area typically trades off against speed. Thus designers must choose the trade-off point that is best for a specific application.

When a designer starts a synthesis process by translating an RTL description into a netlist, the synthesis tool must first be able to understand the RTL description. A number of languages known as Hardware Description Languages (HDLs) have been developed for this purpose. HDL statements can describe circuits in terms of the structure of the system, its behavior, or both. One reason HDLs are so powerful, in fact, is that they support a wide variety of design descriptions.

An HDL simulator handles all of these descriptions, applying the same simulation and test vectors from the design's behavioral level all the way down to the gate level. This integrated approach reduces the problems that arise when a design is moved from one level of description to the next.

As logic synthesis matures, it will allow designers to concentrate more on the actual function and behavior rather than the details of the circuit. Logic synthesis tools are becoming capable of more behavior level tasks such as synthesizing sequential logic and deciding if and where the storage elements are needed in a design. Existing logic synthesis tools are moving up the design ladder while behavioral research is extending down to the RTL level. Eventually they will merge, giving designers a complete set of tools to automate designs from concept to layout.

2.2 Scope of VHDL:

VHDL satisfies all the requirements for the hierarchical description of electronic circuits from system level down to switch level. It can support all levels of timing specifications and constraints and is capable of detecting and signaling timing variations. The language models the concurrency present in digital systems and supports the recursive nature of finite state machines. The concepts of packages and configurations allow design libraries to be built for the reuse of previously designed parts.

2.3 Why VHDL?

A design engineer in the electronics industry uses a hardware description language to keep pace with the productivity of competitors. With VHDL we can quickly describe and synthesize circuits of several thousand gates. In addition, VHDL provides the capabilities listed below.

  • Power and flexibility
  • Device-independent design
  • Portability
  • Benchmarking capabilities
  • ASIC migration
  • Quick time-to-market and low cost

    2.3.1 Power and Flexibility

    VHDL has powerful language constructs with which we can write descriptions of complex control logic very easily. It also has multiple levels of design descriptions for controlling the design implementation. It supports design libraries and creation of reusable components. It provides design hierarchies to create modular designs. It is one language for design and simulation.

    2.3.2 Device-Independent Design

    VHDL permits us to create a design without having to first choose a device for implementation. With one design description, we can target many device architectures. Without being familiar with a device's architecture, we can optimize our design for resource utilization or performance. VHDL also permits multiple styles of design description.

    2.3.3 Portability

    VHDL's portability permits us to simulate the same design description that we have synthesized. Simulating a large design description before synthesizing it can save considerable time. As VHDL is a standard, a design description can be taken from one simulator to another, from one synthesis tool to another, and from one platform to another, which means that a design description can be used in multiple projects. Fig 2.2 illustrates that the source code for a design can be used with any synthesis tool and that the design can be implemented in any device that is supported by a synthesis tool.

    2.3.4 Benchmarking Capabilities

    Device-independent design and portability allow a design to be benchmarked using different device architectures and different synthesis tools. We can take a completed design description and synthesize it, create logic for it, evaluate the results and finally choose the device (a CPLD or an FPGA) that best suits our design requirements.

    2.3.5 ASIC Migration

    The efficiency of VHDL allows our product to reach the market quickly when it is first implemented in a CPLD or FPGA. When production volume reaches appropriate levels, VHDL facilitates the development of an Application Specific Integrated Circuit (ASIC). Sometimes the exact code used with the CPLD can be used with the ASIC and, because VHDL is a well-defined language, we can be assured that our ASIC vendor will deliver a device with the expected functionality.

    2.3.6 Quick time-to-market and low cost

    VHDL and programmable logic together facilitate a speedy design process. VHDL permits designs to be described quickly. Programmable logic eliminates non-recurring engineering (NRE) expenses and facilitates quick design iterations. Synthesis makes it all possible. VHDL and programmable logic combine as a powerful vehicle to bring products to market in record time.

    2.4 Design Synthesis

         The design process can be explained in six steps.

    1. Define the design requirements
    2. Describe the design in VHDL
    3. Simulate the source code
    4. Synthesize, optimize and fit (place and route) the design
    5. Simulate the post layout (fit) design model
    6. Program the device.

    2.4.1 Define the Design Requirements

    Before launching into writing code for our design, we must have a clear idea of the design objectives and requirements: the function of the design, the required setup and clock-to-output times, the maximum frequency of operation, and the critical paths.

    2.4.2 Describe the design in VHDL

    Formulate the design: Having an idea of the design requirements, we have to write efficient code that is realized, through synthesis, as the logic implementation we intended.

    Code the design: After deciding upon the design methodology, we should code the design with reference to the block, data flow and state diagrams, such that the code is syntactically and semantically correct.

    2.4.3 Simulate the source code

    With source code simulation, flaws can be detected early in the design cycle, allowing us to make corrections with the least possible impact on the schedule. This is especially valuable for larger designs, for which synthesis and place and route can take a couple of hours.

    2.4.4 Synthesize, optimize and fit the design

    Synthesis: Synthesis is the process by which netlists or equations are created from design descriptions, which may be abstract. VHDL synthesis software tools convert VHDL descriptions to technology-specific netlists or sets of equations.

    Optimization: The optimization process depends on three things: the form of the Boolean expressions, the type of resources available, and automatic or user-applied synthesis directives (sometimes called constraints). Optimization for CPLDs involves reducing the logic to a minimal sum-of-products, which is then further optimized for a minimal literal count. This reduces the product term utilization and the number of logic block inputs required for any given expression. Fig 2.3 illustrates the synthesis and optimization processes.
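
As a small worked illustration of reduction to a minimal sum-of-products (the function F is a made-up example, not one taken from this design):

```latex
\[
\begin{aligned}
F &= A\bar{B}C + ABC + AB\bar{C}
    && \text{3 product terms, 9 literals} \\
  &= AC(\bar{B} + B) + AB(C + \bar{C})
    && \text{the term } ABC \text{ is used twice} \\
  &= AC + AB
    && \text{2 product terms, 4 literals}
\end{aligned}
\]
```

The minimized form needs one product term fewer and draws on the same three logic block inputs.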

    Fitting: Fitting is the process of taking the logic produced by the synthesis and optimization process and placing it into a logic device, transforming the logic (if necessary) to obtain the best fit. It is a term typically used to describe the process of allocating resources for CPLD-type architectures. Placing and routing is the process of taking the logic produced by synthesis and optimization, transforming it if necessary, packing it into the FPGA logic structures (cells), placing the logic cells in optimal locations and routing the signals from logic cell to logic cell or I/O. Place and route tools have a large impact on the performance of FPGA designs, because propagation delays can depend significantly on routing delays. Fitting a design into a CPLD can be a complicated process because of the numerous ways in which logic can be placed in the device. Before any placement, the logic equations have to be further optimized depending upon the available resources. Fig 2.4 shows the process of synthesizing, optimizing and fitting a design into a CPLD and an FPGA.

    2.4.5 Simulate the post layout design model

    A post-layout simulation enables us to verify not only the functionality of our design but also its timing, such as setup, clock-to-output and register-to-register times. If the design objectives are not met, we may need to resynthesize the design and/or fit it to a new logic device.

    2.4.6 Program the device

    After completing the design description, synthesizing, optimizing, fitting and successfully simulating our design, we are ready to program our device and continue work on the rest of our system designs. The synthesis, optimization, and fitting software will produce a file for use in programming the device.

    2.5 Design Tool Flow:

    The above topics cover the design process. Fig 2.5 shows the EDA tool flow diagram, with the inputs and outputs of each tool used in the design process.

    The inputs to the synthesis software are the VHDL design source code, synthesis directives and device selection. The output of the synthesis software, an architecture-specific netlist or set of equations, is then used as the input to the fitter (or place and route software, depending on whether the target device is a CPLD or FPGA). The outputs of this tool are information about resource utilization, a static point-to-point timing analysis, a device programming file and a post-layout simulation model. The simulation model, along with a test bench or other stimulus format, is used as the input to the simulation software. The outputs of the simulation software are often waveforms or data files.

    2.6 History of VHDL

    In the search for a standard design and documentation tool for the Very High Speed Integrated Circuits (VHSIC) program, the United States Department of Defense (DoD), in 1981, sponsored a workshop on Hardware Description Languages (HDL) at Woods Hole, Massachusetts. In 1983, the DoD established requirements for a standard VHSIC Hardware Description Language (VHDL) based on the recommendations of the “Woods Hole” workshop. A contract for the development of the VHDL, its environment and its software was awarded to IBM, Texas Instruments and Intermetrics Corporations. The time line of VHDL is as follows.

    • Woods Hole Requirements, 1981
    • Intermetrics, TI, IBM under DoD contract 1983-1985: VHDL 7.2
    • IEEE Standardization : VHDL 1987
    • First synthesized chip, IBM 1988

    2.7 Describing a design in VHDL:

    In VHDL an entity is used to describe a hardware module. An entity can be described using,

    1. Entity declaration.
    2. Architecture.
    3. Configuration.
    4. Package declaration.
    5. Package body.

    Entity declaration:

    It defines the name of a hardware module and the names and modes of its input and output signals.


    entity entity_name is
        port (port_declarations);
    end entity_name;

    An entity declaration starts with the keyword 'entity' and ends with the keyword 'end'.

    Ports are the interfaces through which an entity communicates with its environment. Each port must have a name, a direction (mode) and a type. An entity may also have no port declaration at all. The direction can be in, out, inout or buffer.

    in      Port can be read
    out     Port can be written
    inout   Port can be read and written
    buffer  Port can be read and written; it can have only one source
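
A minimal sketch of an entity declaration that uses three of these port modes. The entity name bus_latch and its port names are hypothetical, chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are illustrative only.
entity bus_latch is
    port (
        clk   : in    std_logic;                     -- in: can only be read
        wr_en : in    std_logic;
        data  : inout std_logic_vector(7 downto 0);  -- inout: read and written
        full  : out   std_logic                      -- out: can only be written
    );
end entity bus_latch;
```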


    Architecture:

    It describes the internals of the design, that is, what is inside the entity. Each entity has at least one architecture, and an entity can have many architectures. An architecture can be written in a structural, dataflow, behavioral or mixed style.

    Architecture can be used to describe a design at different levels of abstraction like gate level, register transfer level (RTL) or behavior level.


    architecture architecture_name of entity_name is  -- declarations go here
    begin
        -- architecture statements go here
    end architecture_name;

    Here we should specify the name of the entity for which we are writing the architecture body. The architecture statements must lie between the begin and end keywords. The architecture declarative part, between is and begin, may contain signal, constant, or component declarations.
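
As an illustration, the sketch below pairs an entity with one architecture that has both a declarative part and a statement part. The full adder and the names full_adder, dataflow and a_xor_b are chosen for the example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity full_adder is
    port (
        a, b, cin : in  std_logic;
        sum, cout : out std_logic
    );
end entity full_adder;

architecture dataflow of full_adder is
    -- Declarative part: an internal signal visible only in this architecture.
    signal a_xor_b : std_logic;
begin
    -- Statement part: concurrent signal assignments.
    a_xor_b <= a xor b;
    sum     <= a_xor_b xor cin;
    cout    <= (a and b) or (a_xor_b and cin);
end architecture dataflow;
```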


    Configuration:

    If an entity has more than one architecture, a configuration selects which of them is bound to the entity. A configuration is used to bind an architecture body to its entity and to bind a component to an entity.


    configuration configuration_name of entity_name is
        block_configuration
    end configuration_name;

    Block_configuration defines the binding of components in a block. This can be written as

    for block_name
        component_binding;
    end for;

    block_name is the name of the architecture body. Component binding binds the components of the block to entities. This can be written as

    for component_labels : component_name
        binding_indication;
    end for;
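
A minimal sketch of a configuration that selects one of two architectures of the same entity. The inverter entity, the architecture names ideal and delayed, and the 2 ns delay are assumptions made for the example; component binding is not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity inverter is
    port (a : in std_logic; y : out std_logic);
end entity inverter;

-- First architecture: no delay.
architecture ideal of inverter is
begin
    y <= not a;
end architecture ideal;

-- Second architecture: same function with a propagation delay.
architecture delayed of inverter is
begin
    y <= not a after 2 ns;
end architecture delayed;

-- Bind the 'delayed' architecture to the entity.
configuration inverter_delayed_cfg of inverter is
    for delayed
    end for;
end configuration inverter_delayed_cfg;
```

Here the block name in the for clause is the architecture name, as described above.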

    Package declaration:

    A VHDL package declaration is identified by the package keyword, and is used to collect commonly used declarations for use globally among different design units. A package may be thought of as a common storage area, used to store such things as type declarations, constants, and global subprograms. Items defined within a package can be made visible to any other design unit in the complete VHDL design, and they can be compiled into libraries for later re-use. A package can consist of two basic parts: a package declaration and an optional package body. Package declarations can contain the following types of statements:

    • Type and subtype declarations
    • Constant declarations
    • Global signal declarations
    • Function and procedure declarations
    • Attribute specifications
    • File declarations
    • Component declarations
    • Alias declarations
    • Disconnect specifications
    • Use clauses

    Items appearing within a package declaration can be made visible to other design units through the use of a use statement.


    package package_name is
        package_declarations;
    end package_name;

    Package body:

    If the package contains declarations of subprograms (functions or procedures) or defines one or more deferred constants (constants whose value is not immediately given), then a package body is required in addition to the package declaration. A package body (which is specified using the package body keyword combination) must have the same name as its corresponding package declaration, but it can be located anywhere in the design, in the same or a different source file. A package body is used to define the functions and procedures that are declared in the corresponding package. Values are also assigned in the package body to deferred constants declared in the package.


    package body package_name is
    Function_procedure definitions;
    end package_name;
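
The sketch below shows a package declaration with its package body, including a deferred constant and a function. The package name gray_pkg and everything declared in it are hypothetical examples.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package; names are illustrative only.
package gray_pkg is
    constant WORD_WIDTH : natural := 8;
    subtype word_t is std_logic_vector(WORD_WIDTH - 1 downto 0);

    constant ALL_ONES : word_t;   -- deferred constant: value given in the body

    function bin_to_gray(b : word_t) return word_t;
end package gray_pkg;

package body gray_pkg is
    constant ALL_ONES : word_t := (others => '1');

    -- Gray code: each bit is XORed with the next more significant bit.
    function bin_to_gray(b : word_t) return word_t is
    begin
        return b xor ('0' & b(WORD_WIDTH - 1 downto 1));
    end function bin_to_gray;
end package body gray_pkg;
```

A design unit compiled into the same library would make these items visible with the clause use work.gray_pkg.all; placed before its entity declaration.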

    2.8 Modeling Hardware with VHDL:

    The internal working of an entity can be defined using different modeling styles inside architecture body. They are

    1. Dataflow modeling.
    2. Behavioral modeling.
    3. Structural modeling.

    Dataflow modeling:

    In this style of modeling, the internal working of an entity is implemented using concurrent signal assignments. Dataflow modeling is often called register transfer logic, or RTL. There are some drawbacks to using a dataflow method of design in VHDL. First, there are no built-in registers in VHDL; the language was designed to be general-purpose, and its designers placed the emphasis on its behavioral aspects.
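
A minimal dataflow sketch: a 4-to-1 multiplexer written as a single concurrent (selected) signal assignment. The entity name mux4 and its ports are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mux4 is
    port (
        d   : in  std_logic_vector(3 downto 0);
        sel : in  std_logic_vector(1 downto 0);
        y   : out std_logic
    );
end entity mux4;

architecture dataflow of mux4 is
begin
    -- Re-evaluated whenever d or sel changes.
    with sel select
        y <= d(0) when "00",
             d(1) when "01",
             d(2) when "10",
             d(3) when "11",
             'X'  when others;   -- covers metavalues such as 'U' or 'Z'
end architecture dataflow;
```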

    Behavioral modeling:

    The highest level of abstraction supported in VHDL is called the behavioral level of abstraction. When creating a behavioral description of a circuit, we describe our circuit in terms of its operation over time. The concept of time is the critical distinction between behavioral descriptions of circuits and lower-level descriptions (specifically descriptions created at the dataflow level of abstraction). Examples of behavioral forms of representation might include state diagrams, timing diagrams and algorithmic descriptions. In a behavioral description, the concept of time may be expressed precisely, with actual delays between related events (such as the propagation delays within gates and on wires), or it may simply be an ordering of operations that are expressed sequentially (such as in a functional description of a flipflop).

    In this style of modeling, the internal working of an entity is implemented using a set of sequential statements.

    It contains:

    • Process statements
    • Sequential statements
    • Signal assignment statements
    • Wait statements

    The process statement is the primary mechanism used to model the behavior of an entity. It contains sequential statements such as variable assignment (:=) statements and signal assignment (<=) statements. It may or may not contain a sensitivity list. If an event occurs on any of the signals in the sensitivity list, the statements within the process are executed.

    Inside a process the statements execute sequentially; if an entity has two or more processes, the processes execute concurrently with one another. After its last statement, a process suspends and waits for another event to occur.
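
A minimal behavioral sketch: a process with a sensitivity list that counts the number of 1s in a byte, using a variable for the running count and a signal assignment for the result. The entity name ones_counter and its ports are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ones_counter is
    port (
        data  : in  std_logic_vector(7 downto 0);
        count : out integer range 0 to 8
    );
end entity ones_counter;

architecture behavioral of ones_counter is
begin
    process (data)                       -- runs on any event on data
        variable n : integer range 0 to 8;
    begin
        n := 0;                          -- variable assignment takes effect immediately
        for i in data'range loop
            if data(i) = '1' then
                n := n + 1;
            end if;
        end loop;
        count <= n;                      -- signal is updated when the process suspends
    end process;
end architecture behavioral;
```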

    Structural modeling:

    The third level of abstraction, structure, is used to describe a circuit in terms of its components. Structure can be used to create a very low-level description of a circuit (such as a transistor-level description) or a very high-level description (such as a block diagram).

    In a gate-level description of a circuit, for example, components such as basic logic gates and flip-flops might be connected in some logical structure to create the circuit. This is what is often called a netlist.

    For a higher-level circuit - one in which the components being connected are larger functional blocks - structure might simply be used to segment the design description into manageable parts.

    Structure-level VHDL features, such as components and configurations, are very useful for managing complexity. The use of components can dramatically improve your ability to re-use elements of designs, and they can make it possible to work using a top-down design approach. The implementation of an entity in structural modeling is done through set of interconnected components.

    It contains:

    • Signal declaration.
    • Component instances
    • Port maps.
    • Component declarations.
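
A minimal structural sketch: a full adder assembled from two instances of a half adder component. The names half_adder, full_adder_struct, ha1 and ha2 are assumptions made for the example, and the final OR is written as a concurrent assignment rather than as a third component to keep the sketch short.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
    port (a, b : in std_logic; sum, carry : out std_logic);
end entity half_adder;

architecture dataflow of half_adder is
begin
    sum   <= a xor b;
    carry <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity full_adder_struct is
    port (a, b, cin : in std_logic; sum, cout : out std_logic);
end entity full_adder_struct;

architecture structural of full_adder_struct is
    -- Component declaration: the interface of the part being reused.
    component half_adder
        port (a, b : in std_logic; sum, carry : out std_logic);
    end component;

    -- Signal declarations: the wires between the instances.
    signal s1, c1, c2 : std_logic;
begin
    -- Component instances with port maps.
    ha1 : half_adder port map (a => a,  b => b,   sum => s1,  carry => c1);
    ha2 : half_adder port map (a => s1, b => cin, sum => sum, carry => c2);

    cout <= c1 or c2;
end architecture structural;
```

With no configuration given, each instance is bound by default to the entity of the same name in the working library.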

    2.9 Test Benches:

    One of the main reasons for using VHDL is that it is a powerful test stimulus language. As logic designs become more complex it is critical to have complex and comprehensive verification. To simulate the design an additional VHDL program called a test bench is required. They are used to apply a stimulus to the circuit over time and write the results to a screen or report file for analysis. Test benches can be used to:-

    • verify the design function (with no delays)
    • check assumptions about timing relationships (using estimates or unit delays)
    • simulate with post-route timing information
    • verify the circuit at speed

    During simulation the test bench will be at the top of the design hierarchy.
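    A minimal self-checking test bench might look like the sketch below. The design under test (and2_dut) and the test bench name (and2_tb) are hypothetical and deliberately trivial; the 10 ns step is an arbitrary assumption. The test bench has no ports, sits at the top of the hierarchy, applies stimulus over time, and reports mismatches to the simulator console.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a 2-input AND gate.
entity and2_dut is
  port (a, b : in std_logic; y : out std_logic);
end and2_dut;

architecture dataflow of and2_dut is
begin
  y <= a and b;
end dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- The test bench entity has no ports: it is the top of the hierarchy.
entity and2_tb is
end and2_tb;

architecture sim of and2_tb is
  component and2_dut
    port (a, b : in std_logic; y : out std_logic);
  end component;
  signal a, b : std_logic := '0';
  signal y    : std_logic;
begin
  dut: and2_dut port map (a => a, b => b, y => y);

  stimulus: process
  begin
    a <= '0'; b <= '0'; wait for 10 ns;
    assert (y = '0') report "0 and 0 should give 0" severity error;
    a <= '0'; b <= '1'; wait for 10 ns;
    assert (y = '0') report "0 and 1 should give 0" severity error;
    a <= '1'; b <= '0'; wait for 10 ns;
    assert (y = '0') report "1 and 0 should give 0" severity error;
    a <= '1'; b <= '1'; wait for 10 ns;
    assert (y = '1') report "1 and 1 should give 1" severity error;
    report "and2_tb: stimulus finished";
    wait;                        -- suspend forever, ending the stimulus
  end process stimulus;
end sim;
```

    The same pattern scales to larger designs: only the component declaration, the port map and the stimulus process change.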

    Chapter Three


    This chapter introduces the various programmable logic devices, their architectures and features, and the process of implementing a logic design.

    A programmable logic device or PLD is an electronic component used to build digital circuits. Unlike a logic gate, which has a fixed function, a PLD has an undefined function at the time of manufacture. Before the PLD can be used in a circuit, it must be programmed. These were the first chips that could be used to implement a flexible digital logic design in hardware. Other names we might encounter for this class of device are Programmable Logic Array (PLA), Programmable Array Logic (PAL), and Generic Array Logic (GAL).

    PLDs are often used for address decoding, where they have several clear advantages over the 7400-series TTL parts that they replaced. First, of course, is that one chip requires less board area, power, and wiring than several do. Another advantage is that the design inside the chip is flexible, so a change in the logic doesn't require any rewiring of the board. Rather, the decoding logic can be altered by simply replacing that one PLD with another part that has been programmed with the new design.

    Inside each PLD is a set of fully connected macrocells. These macrocells are typically comprised of some amount of combinatorial logic (AND and OR gates, for example) and a flip-flop. In other words, a small Boolean logic equation can be built within each macrocell. This equation will combine the state of some number of binary inputs into a binary output and, if necessary, store that output in the flip-flop until the next clock edge.
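    The idea of a macrocell can be sketched in VHDL as follows. This is only a behavioural illustration under stated assumptions: the entity name macrocell_sketch, its ports and the particular Boolean equation are hypothetical, and in a real PLD the equation is set by programming the device rather than written into the source.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical macrocell model: a small sum-of-products equation
-- followed by an optional flip-flop.
entity macrocell_sketch is
  port (clk        : in  std_logic;
        a, b, c    : in  std_logic;
        registered : in  std_logic;  -- '1' selects the flip-flop output
        y          : out std_logic);
end macrocell_sketch;

architecture rtl of macrocell_sketch is
  signal sop   : std_logic;  -- combinatorial sum of products
  signal sop_q : std_logic;  -- registered copy of sop
begin
  -- Two product terms (AND gates) combined by an OR gate.
  sop <= (a and b) or ((not a) and c);

  -- Flip-flop that holds the result until the next clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      sop_q <= sop;
    end if;
  end process;

  -- Output taken either directly or from the flip-flop.
  y <= sop_q when registered = '1' else sop;
end rtl;
```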

    Of course, the particulars of the available logic gates and flip-flops are specific to each manufacturer and product family. But the general idea is always the same. Hardware designs for these simple PLDs are generally written in languages like ABEL or PALASM (the hardware equivalents of assembly) or drawn with the help of a schematic capture tool.

    Programmable logic devices (PLDs) are divided into 3 basic architecture types, SPLD, CPLD and FPGA. Figure 3.1 shows the architecture tree of PLD.

    3.1 SPLD-architecture (Simple PLD-architecture, also known as PAL-architecture)

    In 1978 the first real PLD, called “Programmable Array Logic” (PAL), was developed by Monolithic Memories Inc. (MMI). This was the first kind of PLD-architecture. A SPLD-architecture based device consists of two or more vendor-specific macrocells to realize the logic functions. The PLDs in the following years were all based on this architecture type with some improvements. Fig 3.2 shows the macrocell of a SPLD.

    The architecture had a mesh of horizontal and vertical interconnect tracks. At each junction was a fuse. With the aid of software tools, designers could select which junctions would not be connected by “blowing” all unwanted fuses. (This was done by a device programmer, but these days it is more commonly achieved with in-system programming, ISP.)

    Input pins were connected to the vertical interconnect. The horizontal tracks were connected to AND-OR gates, also called “product terms”. These in turn connected to dedicated flip-flops, whose outputs were connected to output pins. PLDs provided as much as 50 times more gates in a single package than discrete logic devices! This was a huge improvement, not to mention fewer devices needed in inventory and a higher reliability over standard logic. Flash PLDs provide the ability to program the devices time and time again, electrically programming and erasing the device.

    Significant characteristics for the SPLD-architecture:

    »one macrocell per output

    »minimum two macrocells per device

    »typically all macrocells identical

    »one product term per macrocell

    »product term typically generated by an AND-matrix and OR-matrix

    »minimum one matrix (AND/OR) programmable

    »dedicated flip-flop (FF) per macrocell

    Typical vendor-specific names for PLDs with SPLD-architecture:

    »PAL (Programmable Array Logic)

    »GAL (Generic Array Logic)

    »PLA (Programmable Logic Array)

    Main-advantages :

    »predictable timing

    »easy to develop

    Main-disadvantages :

    »inefficient resource utilization

    »only for simple logic functions

    3.2 CPLD-architecture (Complex PLD-architecture)

    To reach more complexity, the logical consequence of the SPLD-architecture was the CPLD-architecture. A “complex programmable logic device” (CPLD) contains many SPLD-like (PAL-like) blocks whose inputs and outputs are connected together by a programmable global interconnection (switch) matrix. So a CPLD has two levels of programmability: each PLD block can be programmed, and then the interconnections between the PLD blocks can be programmed. The architecture of a CPLD is shown below in Fig 3.3.

    The SPLD-like devices were called logic-blocks, which contain many SPLD-like macrocells. Some PLD-vendors developed their own logic-block or switch-matrix architecture and gave them vendor-specific names. CPLDs are great at handling wide and complex gating at blistering speeds - 5 nanoseconds, for example, which is equivalent to 200 MHz.

    Significant characteristics for the CPLD-architecture :

    »product terms generated in programmable macrocells

    »typically one dedicated flip-flop per macrocell

    »many macrocells per logic-block

    »typically all logic-blocks identical

    »minimum two logic-blocks per device

    »routing between logic-blocks via global switch matrix

    Typical vendor-specific names for PLDs with CPLD-architecture :

    »CPLD (Complex Programmable Logic Device)

    »EPLD (Electrical Programmable Logic Device)

    »EPLD (Erasable Programmable Logic Device)

    »EEPLD (Electrically-Erasable Programmable Logic Device)

    »SPLD (Segmented Programmable Logic Device)

    »XPLD (eXpanded Programmable Logic Device)

    Main-advantages :

    »predictable timing

    »fast pin-to-pin delay

    »efficient resource utilization by switch-matrix

    »medium design complexities possible

    Main-disadvantages :

    »higher complexities need a very complex (expensive) switch-matrix

    3.3 FPGA-architecture

    To reach very high complexities, a channel-based routing strategy was adopted instead of the CPLD switch-matrix strategy. The FPGA-architecture consists of many logic-modules, which are placed in an array-structure. The channels between the logic-modules are used for routing. The array of logic-modules is surrounded by programmable I/O-modules and connected via programmable interconnects. This freedom of routing allows every logic-module to reach every other logic-module or I/O-module. The world's first PLD with FPGA-architecture was developed by Xilinx in 1984.

    There are two FPGA architecture subclasses, depending on the granularity of the logic-modules: coarse-grained and fine-grained FPGAs. The coarse-grained FPGAs have very large logic-modules, sometimes with two or more sequential logic elements, and the fine-grained FPGAs have very simple logic-modules. The FPGA-architecture offers the highest programmable logic capacity. Fig 3.4 shows the architectures of FPGA.

    There are two basic types of FPGAs: SRAM-based reprogrammable and OTP (One Time Programmable). These two types of FPGAs differ in the implementation of the logic cell and the mechanism used to make connections in the device. The dominant type of FPGA is SRAM-based.

    In the SRAM logic cell, instead of conventional gates, a look-up table (LUT) determines the output based on the values of the inputs. (In the “SRAM logic cell” diagram, shown in Fig 3.5, six different combinations of the four inputs determine the values of the output.) SRAM bits are also used to make connections.
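    The look-up table idea can be sketched as below. The entity name lut4_sketch and the generic INIT are hypothetical names used only for this illustration; INIT stands in for the 16 SRAM configuration bits, and the four inputs simply index into it. The sketch uses the standard ieee.numeric_std package for the index conversion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-input look-up table model.
entity lut4_sketch is
  generic (
    -- One stored bit per input combination. x"8000" has only
    -- bit 15 set, so the output is '1' only for i = "1111",
    -- which makes this LUT behave as a 4-input AND gate.
    INIT : std_logic_vector(15 downto 0) := x"8000");
  port (i : in  std_logic_vector(3 downto 0);
        o : out std_logic);
end lut4_sketch;

architecture rtl of lut4_sketch is
begin
  -- The inputs select which stored bit appears at the output.
  o <= INIT(to_integer(unsigned(i)));
end rtl;
```

    Changing INIT changes the logic function without changing the circuit, which is the essence of SRAM-based reprogrammability: for example, x"FFFE" would give a 4-input OR.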

    OTP FPGAs use anti-fuses (contrary to fuses, connections are made, not “blown,” during programming) to make permanent connections in the chip. Every time there is a design change, we must throw away the chip! The OTP logic cell is very similar to PLDs, with dedicated gates and flip-flops.

    Significant characteristics for the FPGA-architecture :

    »array of logic-modules

    »different logic-modules possible

    »distinction between FPGAs with coarse-grained or fine-grained logic-modules

    »coarse-grained have minimum one combinatorial and one sequential element

    »fine-grained have typically separated combinatorial- and sequential-modules

    »routing-channels physically between logic-modules

    »every logic-module can be interconnected to any other logic-module or I/O-module

    Typical vendor-specific names for PLDs with FPGA-architecture :

    »FPGA (Field Programmable Gate Array)

    »LCA (Logic Cell Array)

    »pASIC (programmable ASIC)

    »SPGA (System Programmable Gate Array)

    »XPGA (eXpanded Programmable Gate Array)

    Main-advantages :

    »efficient resource utilization

    »very high complexities possible

    »high system frequencies possible

    Main-disadvantages :

    »no predictable timing (some exceptions)

    »100% interconnect is very expensive

    3.4 Implementing a Logic Design in FPGA or CPLD

    Implementing a logic design with the FPGA or CPLD development software usually consists of the following steps (depicted in Fig 3.7 below):

    1. Enter a description of the logic circuit using a hardware description language (HDL) such as VHDL or Verilog. Alternatively, the design can be drawn using a schematic editor.
    2. Use a logic synthesizer program to transform the HDL or schematic into a netlist. The netlist is just a description of the various logic gates in the design and how they are interconnected.
    3. Use the implementation tools to map the logic gates and interconnections into the FPGA. The configurable logic blocks in the FPGA can be further decomposed into look-up tables that perform logic operations. The CLBs and LUTs are interwoven with various routing resources. The mapping tool collects netlist gates into groups that fit into the LUTs and then the place & route tool assigns the gate collections to specific CLBs while opening or closing the switches in the routing matrices to connect the gates together.
    4. Once the implementation phase is complete, a program extracts the state of the switches in the routing matrices and generates a bit stream where the ones and zeroes correspond to open or closed switches.
    5. The bit stream is downloaded into a physical FPGA chip. The electronic switches in the FPGA open or close in response to the binary bits in the bit stream. Upon completion of the downloading, the FPGA will perform the operations specified by HDL code or schematic. Apply input signals to the I/O pins of the FPGA to check the operation of design.

    Chapter Four


    In this chapter, the operation of the UART is explained using a block diagram. The following topics are covered.

    • Block diagram
    • Transmitter operation
    • Receiver operation
    • Baud rate generator operation.

    UARTs are built into microcontrollers such as the MC6805 and MC6811. The following 8-bit registers are used:

    • RSR (Receive Shift Register) - Receives the bits sequentially from RxD and shifts them to the right.
    • RDR (Receive Data Register) - Receives a data byte from the RSR and places it on the data bus.
    • TDR (Transmit Data Register) - Receives the byte to be transmitted from the data bus and transfers it to the TSR.
    • TSR (Transmit Shift Register) - A shift register that transmits the data bit by bit, shifting each bit to the right.
    • SCCR (Serial Communication Control Register) - The lower 3 bits in this register are used to select the required baud rate.
    • SCSR (Serial Communication Status Register) - A status register that holds the flags showing the status of the other registers.

    4.1 Transmitter

    The operation of UART Transmitter is as follows:

    1. The microprocessor/microcontroller waits until TDRE = 1, then loads a byte of data into TDR and clears TDRE.
    2. The UART transfers data from TDR to TSR and sets TDRE.
    3. The UART outputs a start bit (0) for one bit time and then shifts TSR right to transmit the 8 bit data followed by a stop bit (1).
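    The frame produced by the steps above can be sketched as a small VHDL function. The package name uart_frame_sketch and the function make_frame are hypothetical helpers for illustration and are not used by the transmitter code in this report; they only show the bit order on the line.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package (illustration only).
package uart_frame_sketch is
  -- Returns the 10-bit frame for one data byte.
  -- Bit 0 of the result is the first bit sent on TxD.
  function make_frame (data : std_logic_vector(7 downto 0))
    return std_logic_vector;
end uart_frame_sketch;

package body uart_frame_sketch is
  function make_frame (data : std_logic_vector(7 downto 0))
    return std_logic_vector is
    variable frame : std_logic_vector(9 downto 0);
  begin
    -- stop bit ('1') & eight data bits & start bit ('0')
    frame := '1' & data & '0';
    return frame;
  end make_frame;
end uart_frame_sketch;
```

    Because the shift register shifts to the right, the start bit goes out first, followed by data bit 0 (the least significant bit) up to data bit 7, and finally the stop bit. For the data byte “10101010”, the line therefore carries 0 (start), then 0, 1, 0, 1, 0, 1, 0, 1, then 1 (stop).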

    4.2 Receiver

    The operation of UART Receiver is as follows:

    1. When UART detects a start bit, it reads the remaining bits serially and shifts them into the RSR.
    2. When all the data bits and stop bit are received, the RSR is loaded into the RDR and RDRF flag is set in the SCSR.
    3. The microcontroller checks the RDRF flag and, if it is set, reads the RDR and clears the flag.

    The bit stream coming in on RxD is not synchronized with the local bit clock (BCLK). If the bit rate of the incoming stream differs from BCLK by even a small amount, sampling directly with BCLK can read wrong data. To avoid this problem, the input data stream is sampled at a much higher frequency than BCLK. Here a sampling clock of 8 times BCLK (BCLK*8) is assumed. Once the start bit has been detected, RxD is read once every eight BCLK*8 clocks until the stop bit is detected. The sampling of RxD with BCLK*8 is shown in Fig 4.4.

    Two counters, CT1 and CT2, are used while receiving on RxD. CT1 counts the number of BCLK*8 clock pulses and CT2 counts the number of bits received after the start bit. When CT2 is 8, all 8 data bits have been read and the receiver is in the middle of the stop bit. If RDRF is 1 at this point, the microcontroller has not yet read the previous data byte, so an Overrun Error has occurred and the OE flag is set. If the stop bit is not found, a Framing Error has occurred and the FE flag is set.

    4.3 Clock Divider

    Three bits in the SCCR are used to select any of eight baud rates. Here it is assumed that the system clock is 8 MHz and the required baud rates are 300, 600, 1200, 2400, 4800, 9600, 19200 and 38400. Therefore, the maximum BCLK*8 frequency needed is 38400*8 = 307200 Hz. To get this frequency, the system clock has to be divided by a factor of 8 MHz/307200 Hz = 26.04167. Since only division by an integer is possible, the clock is divided by 26, and the resulting small baud rate error (about 0.16%) is accepted.

    Fig 4.5 shows the block diagram for the baud rate generator. Using a counter, the 8 MHz system clock is first divided by 13. The output of this counter goes to an 8-bit binary counter. The outputs of the flip-flops in this counter correspond to divide by 2, divide by 4 and so on up to divide by 256. One of these outputs is selected by the multiplexer. The MUX select inputs come from the lower 3 bits of the SCCR. The MUX output corresponds to BCLK*8, which is further divided by 8 to give BCLK.
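    The divider arithmetic can be checked with a small simulation-only sketch. The entity name baud_table_sketch is hypothetical, and the mapping assumed here is that select value n picks the divide-by-2^(n+1) output of the 8-bit counter, so that select value 0 gives the highest baud rate. Integer division truncates, so the reported values are rounded down.

```vhdl
-- Hypothetical simulation-only entity: prints the baud rate obtained
-- for each select value, assuming an 8 MHz system clock.
entity baud_table_sketch is
end baud_table_sketch;

architecture sim of baud_table_sketch is
  constant SYSCLK_HZ : natural := 8_000_000;
begin
  process
    variable bclkx8_hz : natural;
    variable actual    : natural;
    variable nominal   : natural;
  begin
    for sel in 0 to 7 loop
      -- divide by 13, then by 2**(sel+1) in the 8-bit counter
      bclkx8_hz := SYSCLK_HZ / (13 * 2**(sel + 1));
      -- BCLK is BCLK*8 divided by 8
      actual    := bclkx8_hz / 8;
      -- nominal rates halve from 38400 down to 300
      nominal   := 38400 / 2**sel;
      report "sel=" & integer'image(sel) &
             "  nominal=" & integer'image(nominal) &
             "  actual (truncated)=" & integer'image(actual);
    end loop;
    wait;  -- stop after printing the table
  end process;
end sim;
```

    For the fastest setting the exact value is 8 000 000 / (13 × 2 × 8) ≈ 38 461.5 baud, about 0.16% above the nominal 38400, and the same relative error applies to every other setting because they are all derived by dividing the same clock by powers of two.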

    Chapter Five


    In this chapter, we present the Design Entity and Strategy considered for developing the software. The following topics are covered.

    • Design entity
    • Functional Description of pins

    5.1 The final Design Entity of UART

    Functional Description of Pins:

    1. SCI_Sel: This is the chip select signal. SCI stands for Serial Communication Interface.
    2. R_W: This indicates READ or WRITE operations
    3. Clk: This is connected to the system clock in which UART is incorporated.
    4. rst_b: Reset pin.
    5. RxD: Receiving Data pin.
    6. ADDR2: These 2 address lines along with R_W are used to select the mode of operation as summarized in the table below.
    7. Dbus: This is the data bus, which connects the UART and Processor.
    8. SCI_IRQ: This is the interrupt signal to the processor.
    9. TxD: Transmit Data pin.

    5.2 The design Entities of sub modules

    Functional Description of Pins (UART Transmitter):

    1. Bclk: This is the baud rate clock signal from the clock divider circuit.
    2. Sysclk: This is the system clock signal in which UART is incorporated.
    3. rst_b: Reset pin
    4. TDRE: This flag indicates the status of the TDR (Transmit Data Register)
    5. Load TDR: Signal to transfer data from data bus to TDR.
    6. Dbus: This is the data bus, which connects the transmitter and other components.
    7. Set TDRE: Signal to set the TDRE flag.
    8. TxD: Transmit Data pin.

    Functional Description of Pins (UART Receiver):

    1. RxD: Receiving Data pin.
    2. BclkX8: This is the clock signal generated in the clock divider circuit, with 8 times the frequency of the baud rate clock.
    3. Sysclk: This is the system clock signal in which UART is incorporated.
    4. rst_b: Reset pin
    5. RDRF: This flag indicates the status of RDR (Receive Data Register)
    6. RDR: Receive Data Register
    7. Set RDRF: Signal to set the RDRF Flag.
    8. Set OE: Signal to set the OE (Over run Error) flag.
    9. Set FE: Signal to set the FE (Frame Error) flag.

    Functional Description of Pins (Clock Divider):

    1. Sysclk: This is connected to the system clock in which UART is incorporated
    2. rst_b: Reset pin
    3. Sel: These are 3 selection lines, which are used to select the required baud rate.
    4. BclkX8: This is the clock signal with 8 times the frequency of the baud rate clock. Serves as clock for the Receiver module.
    5. Bclk: This is the baud rate clock signal. Serves as clock for the Transmitter module.

    Chapter Six


    In this chapter, VHDL Software code developed to implement the design entities is presented. The VHDL code is developed for the following:

    • Clock Divider.
    • UART Transmitter.
    • UART Receiver.
    • Complete UART.

    6.1 Clock Divider

    VHDL Code for Clock Divider:

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity clk_divider is
      port(Sysclk, rst_b: in std_logic;
           Sel: in std_logic_vector(2 downto 0);
           BclkX8: buffer std_logic;
           Bclk: out std_logic);
    end clk_divider;

    architecture baudgen of clk_divider is
      signal ctr1: std_logic_vector (3 downto 0) := "0000"; -- divide by 13 counter
      signal ctr2: std_logic_vector (7 downto 0) := "00000000"; -- div by 256 ctr
      signal ctr3: std_logic_vector (2 downto 0) := "000"; -- divide by 8 counter
      signal Clkdiv13: std_logic;
      -- note: rst_b is not used here; the counters rely on their initial values
    begin
      process (Sysclk) -- first divide system clock by 13
      begin
        if (Sysclk'event and Sysclk = '1') then
          if (ctr1 = "1100") then ctr1 <= "0000";
          else ctr1 <= ctr1 + 1; end if;
        end if;
      end process;

      Clkdiv13 <= ctr1(3); -- divide Sysclk by 13

      process (Clkdiv13) -- ctr2 is an 8-bit counter
      begin
        if (rising_edge(Clkdiv13)) then
          ctr2 <= ctr2 + 1;
        end if;
      end process;

      BclkX8 <= ctr2(CONV_INTEGER(Sel)); -- select baud rate

      process (BclkX8)
      begin
        if (rising_edge(BclkX8)) then
          ctr3 <= ctr3 + 1;
        end if;
      end process;

      Bclk <= ctr3(2); -- Bclk is BclkX8 divided by 8
    end baudgen;

    6.2 UART Transmitter

    The VHDL code for the UART Transmitter is developed by treating the whole operation as a set of states. The states are IDLE, SYNCH and TDATA. The states are defined as follows.

    State Description

    IDLE: This is the idle state. In this state the transmitter continuously checks the status of the TDRE flag. When TDRE = 0 (the TDR holds a byte to send), the byte is loaded from the TDR into the TSR.

    SYNCH: This is the synchronization state. In this state, the transmitter waits for the next rising edge of the bit clock (Bclk) so that the start bit begins on a bit boundary.

    TDATA: In this state, actual data transmission takes place.

    VHDL Code for UART Transmitter:

    library ieee;
    use ieee.std_logic_1164.all;

    entity UART_Transmitter is
      port(Bclk, sysclk, rst_b, TDRE, loadTDR: in std_logic;
           DBUS: in std_logic_vector(7 downto 0);
           setTDRE, TxD: out std_logic);
    end UART_Transmitter;

    architecture xmit of UART_Transmitter is
      type stateType is (IDLE, SYNCH, TDATA);
      signal state, nextstate : stateType;
      signal TSR : std_logic_vector (8 downto 0); -- Transmit Shift Register
      signal TDR : std_logic_vector(7 downto 0); -- Transmit Data Register
      signal Bct: integer range 0 to 9; -- counts number of bits sent
      signal inc, clr, loadTSR, shftTSR, start: std_logic;
      signal Bclk_rising, Bclk_dlayed: std_logic;
    begin
      TxD <= TSR(0);
      setTDRE <= loadTSR;
      Bclk_rising <= Bclk and (not Bclk_dlayed); -- indicates the rising edge of bit clock

      Xmit_Control: process(state, TDRE, Bct, Bclk_rising)
      begin
        inc <= '0'; clr <= '0'; loadTSR <= '0'; shftTSR <= '0'; start <= '0'; -- reset control signals
        case state is
          when IDLE =>
            if (TDRE = '0') then
              loadTSR <= '1'; nextstate <= SYNCH;
            else nextstate <= IDLE; end if;
          when SYNCH => -- synchronize with the bit clock
            if (Bclk_rising = '1') then
              start <= '1'; nextstate <= TDATA;
            else nextstate <= SYNCH; end if;
          when TDATA =>
            if (Bclk_rising = '0') then nextstate <= TDATA;
            elsif (Bct /= 9) then
              shftTSR <= '1'; inc <= '1'; nextstate <= TDATA;
            else clr <= '1'; nextstate <= IDLE; end if;
        end case;
      end process;

      Xmit_update: process (sysclk, rst_b)
      begin
        if (rst_b = '0') then
          TSR <= "111111111"; state <= IDLE; Bct <= 0; Bclk_dlayed <= '0';
        elsif (sysclk'event and sysclk = '1') then
          state <= nextstate;
          if (clr = '1') then Bct <= 0;
          elsif (inc = '1') then Bct <= Bct + 1; end if;
          if (loadTDR = '1') then TDR <= DBUS; end if;
          if (loadTSR = '1') then TSR <= TDR & '1'; end if;
          if (start = '1') then TSR(0) <= '0'; end if;
          if (shftTSR = '1') then TSR <= '1' & TSR(8 downto 1); end if; -- shift out one bit
          Bclk_dlayed <= Bclk; -- Bclk delayed by 1 sysclk
        end if;
      end process;
    end xmit;

    6.3 UART Receiver

    The VHDL code for the UART Receiver is developed by treating the whole operation as a set of states. The states are IDLE, START DETECTED and RECEIVE DATA. The states are defined as follows.

    State Description

    IDLE: This is the idle state. In this state the receiver is in a continuous loop to check the receiving data for the start bit.

    START DETECTED: In this state, the received start bit is examined for its validity i.e., to check whether the received bit is a valid start bit or not.

    RECEIVE DATA: In this state, actual data reception takes place.

    VHDL Code for UART Receiver:

    library ieee;
    use ieee.std_logic_1164.all;

    entity UART_Receiver is
      port(RxD, BclkX8, sysclk, rst_b, RDRF: in std_logic;
           RDR: out std_logic_vector(7 downto 0);
           setRDRF, setOE, setFE: out std_logic);
    end UART_Receiver;

    architecture rcvr of UART_Receiver is
      type stateType is (IDLE, START_DETECTED, RECV_DATA);
      signal state, nextstate: stateType;
      signal RSR: std_logic_vector (7 downto 0); -- receive shift register
      signal ct1 : integer range 0 to 7; -- indicates when to read the RxD input
      signal ct2 : integer range 0 to 8; -- counts number of bits read
      signal inc1, inc2, clr1, clr2, shftRSR, loadRDR : std_logic;
      signal BclkX8_Dlayed, BclkX8_rising : std_logic;
    begin
      BclkX8_rising <= BclkX8 and (not BclkX8_Dlayed); -- indicates the rising edge of bitX8 clock

      Rcvr_Control: process(state, RxD, RDRF, ct1, ct2, BclkX8_rising)
      begin
        -- reset control signals
        inc1 <= '0'; inc2 <= '0'; clr1 <= '0'; clr2 <= '0';
        shftRSR <= '0'; loadRDR <= '0'; setRDRF <= '0'; setOE <= '0'; setFE <= '0';
        case state is
          when IDLE =>
            if (RxD = '0') then nextstate <= START_DETECTED;
            else nextstate <= IDLE; end if;
          when START_DETECTED =>
            if (BclkX8_rising = '0') then nextstate <= START_DETECTED;
            elsif (RxD = '1') then clr1 <= '1'; nextstate <= IDLE;
            elsif (ct1 = 3) then clr1 <= '1'; nextstate <= RECV_DATA;
            else inc1 <= '1'; nextstate <= START_DETECTED; end if;
          when RECV_DATA =>
            if (BclkX8_rising = '0') then nextstate <= RECV_DATA;
            else inc1 <= '1';
              if (ct1 /= 7) then nextstate <= RECV_DATA; -- wait for 8 clock cycles
              elsif (ct2 /= 8) then
                shftRSR <= '1'; inc2 <= '1'; clr1 <= '1'; -- read next data bit
                nextstate <= RECV_DATA;
              else
                nextstate <= IDLE;
                setRDRF <= '1'; clr1 <= '1'; clr2 <= '1';
                if (RDRF = '1') then setOE <= '1'; -- overrun error
                elsif (RxD = '0') then setFE <= '1'; -- framing error
                else loadRDR <= '1'; end if; -- load recv data register
              end if;
            end if;
        end case;
      end process;

      Rcvr_update: process (sysclk, rst_b)
      begin
        if (rst_b = '0') then
          state <= IDLE; BclkX8_Dlayed <= '0';
          ct1 <= 0; ct2 <= 0;
        elsif (sysclk'event and sysclk = '1') then
          state <= nextstate;
          if (clr1 = '1') then ct1 <= 0;
          elsif (inc1 = '1') then ct1 <= ct1 + 1; end if;
          if (clr2 = '1') then ct2 <= 0;
          elsif (inc2 = '1') then ct2 <= ct2 + 1; end if;
          if (shftRSR = '1') then RSR <= RxD & RSR(7 downto 1); end if; -- update shift reg.
          if (loadRDR = '1') then RDR <= RSR; end if;
          BclkX8_Dlayed <= BclkX8; -- BclkX8 delayed by 1 sysclk
        end if;
      end process;
    end rcvr;

    6.4 Complete UART:

    The VHDL code for the complete UART is developed by component instantiation. The parts of the UART, namely the Clock Divider, UART Transmitter and UART Receiver, are declared as components and connected by port mapping.

    VHDL Code for Complete UART:

    library ieee;
    use ieee.std_logic_1164.all;

    entity UART is
      port (SCI_sel, R_W, clk, rst_b, RxD : in std_logic;
            ADDR2: in std_logic_vector(1 downto 0);
            DBUS : inout std_logic_vector(7 downto 0);
            SCI_IRQ, TxD : out std_logic);
    end UART;

    architecture uart1 of UART is
      component UART_Receiver
        port (RxD, BclkX8, sysclk, rst_b, RDRF: in std_logic;
              RDR: out std_logic_vector(7 downto 0);
              setRDRF, setOE, setFE: out std_logic);
      end component;

      component UART_Transmitter
        port (Bclk, sysclk, rst_b, TDRE, loadTDR: in std_logic;
              DBUS: in std_logic_vector(7 downto 0);
              setTDRE, TxD: out std_logic);
      end component;

      component clk_divider
        port (Sysclk, rst_b: in std_logic;
              Sel: in std_logic_vector(2 downto 0);
              BclkX8: buffer std_logic;
              Bclk: out std_logic);
      end component;

      signal RDR : std_logic_vector(7 downto 0); -- Receive Data Register
      signal SCSR : std_logic_vector(7 downto 0); -- Status Register
      signal SCCR : std_logic_vector(7 downto 0); -- Control Register
      signal TDRE, RDRF, OE, FE, TIE, RIE : std_logic;
      signal BaudSel : std_logic_vector(2 downto 0);
      signal setTDRE, setRDRF, setOE, setFE, loadTDR, loadSCCR : std_logic;
      signal clrRDRF, Bclk, BclkX8, SCI_Read, SCI_Write : std_logic;
    begin
      RCVR: UART_Receiver port map(RxD, BclkX8, clk, rst_b, RDRF, RDR, setRDRF,
                                   setOE, setFE);
      XMIT: UART_Transmitter port map(Bclk, clk, rst_b, TDRE, loadTDR, DBUS,
                                      setTDRE, TxD);
      CLKDIV: clk_divider port map(clk, rst_b, BaudSel, BclkX8, Bclk);

      -- This process updates the control and status registers
      process (clk, rst_b)
      begin
        if (rst_b = '0') then
          TDRE <= '1'; RDRF <= '0'; OE <= '0'; FE <= '0';
          TIE <= '0'; RIE <= '0';
        elsif (rising_edge(clk)) then
          TDRE <= (setTDRE and not TDRE) or (not loadTDR and TDRE);
          RDRF <= (setRDRF and not RDRF) or (not clrRDRF and RDRF);
          OE <= (setOE and not OE) or (not clrRDRF and OE);
          FE <= (setFE and not FE) or (not clrRDRF and FE);
          if (loadSCCR = '1') then
            TIE <= DBUS(7); RIE <= DBUS(6);
            BaudSel <= DBUS(2 downto 0);
          end if;
        end if;
      end process;

      -- IRQ generation logic
      SCI_IRQ <= '1' when ((RIE = '1' and (RDRF = '1' or OE = '1'))
                           or (TIE = '1' and TDRE = '1'))
                 else '0';

      -- Bus Interface
      SCSR <= TDRE & RDRF & "0000" & OE & FE;
      SCCR <= TIE & RIE & "000" & BaudSel;
      SCI_Read <= '1' when (SCI_sel = '1' and R_W = '0') else '0';
      SCI_Write <= '1' when (SCI_sel = '1' and R_W = '1') else '0';
      clrRDRF <= '1' when (SCI_Read = '1' and ADDR2 = "00") else '0';
      loadTDR <= '1' when (SCI_Write = '1' and ADDR2 = "00") else '0';
      loadSCCR <= '1' when (SCI_Write = '1' and ADDR2 = "10") else '0';

      DBUS <= "ZZZZZZZZ" when (SCI_Read = '0') -- tristate bus when not reading
              else RDR when (ADDR2 = "00") -- write appropriate register to the bus
              else SCSR when (ADDR2 = "01")
              else SCCR; -- dbus = sccr, if ADDR2 is "10" or "11"
    end uart1;

    Chapter Seven


    In this chapter, the results obtained after running the simulation and synthesis process for the UART are presented. The results are categorized as follows:

    • Clock Divider.
    • UART Transmitter.
    • UART Receiver.
    • Complete UART.

    Simulation and synthesis are required to ensure that the design works according to the intended application. Simulation is performed by ModelSim XE II /Starter software Version 5.7c. Synthesis is performed by Xilinx ISE Software Version 6.1i. The results obtained are as follows.

    7.1 Clock Divider


    The clock divider circuit is simulated for various selection inputs by creating a test bench from the Test Bench Waveform source of the Xilinx software. The system clock is assumed to be 8 MHz (time period of 125 ns). Accordingly, a clock high time of 52 ns, a clock low time of 53 ns, an input setup time of 10 ns and an output valid delay time of 10 ns, totalling 125 ns, are chosen in the test bench waveform. The Test Bench Waveform and simulation result for the selection input “101” are shown in Fig 7.1.


    The synthesized Clock Divider circuit is as shown in Fig 7.2

    7.2 UART Transmitter


    In the Test Bench waveform of the Xilinx software, the multiple clock feature is selected. The clocks are Sysclk and Bclk. Sysclk is the same as for the Clock Divider. Bclk is assumed to be 9600 Hz (period of about 104 us). Therefore, the Bclk signal is configured with a clock high time of 42000 ns, a clock low time of 42000 ns, an input setup time of 10000 ns and an output valid delay time of 10000 ns, totalling 104 us. The Test Bench waveform and simulation results for the input “10101010” are shown in Fig 7.3.


    The synthesized UART Transmitter is as shown in Fig 7.4

    7.3 UART Receiver


    In the Test Bench waveform of the Xilinx software, the multiple clock feature is selected. The clocks are Sysclk and BclkX8. Sysclk is the same as in the earlier modules, but the receiver clock is BCLK x 8. Hence the clock frequency is 9600 x 8 = 76800 Hz, which gives a clock period of about 13 us. Accordingly, a clock high time of 5 us, a clock low time of 6 us, an input setup time of 1 us and an output valid delay time of 1 us, totalling 13 us, are selected. The Test Bench waveform and simulation results for receiver inputs of “1111000” and “11001100” are shown in Fig 7.5.


    The synthesized UART Receiver is as shown in Fig 7.6

    7.4 Complete UART

    The complete UART is synthesized by selecting the Virtex family FPGA XCV50-6-BG256. The synthesis report and RTL schematics generated by the software are analysed. The design is then implemented, and the placed and routed design is observed in the FPGA Floor Planner and Editor programs. Finally a program file is generated. The results are given below.

    Synthesis report:

    Release 6.1i - xst G.2

    Copyright (c) 1995-2003 Xilinx, Inc. All rights reserved.

    --> Parameter TMPDIR set to __projnav

    CPU : 0.00 / 0.41 s | Elapsed : 0.00 / 1.00 s

    --> Parameter xsthdpdir set to ./xst

    CPU : 0.00 / 0.41 s | Elapsed : 0.00 / 1.00 s

    Reading design: uart.prj

    Synthesis Options Summary

    HDL Compilation

    Compiling vhdl file C:/Xilinx/UART/uart_receiver.vhd in Library work.

    Architecture rcvr of Entity uart_receiver is up to date.

    Compiling vhdl file C:/Xilinx/UART/uart_transmitter.vhd in Library work.

    Architecture xmit of Entity uart_transmitter is up to date.

    Compiling vhdl file C:/Xilinx/UART/clk_divider.vhd in Library work.

    Architecture baudgen of Entity clk_divider is up to date.

    Compiling vhdl file C:/Xilinx/UART/uart.vhd in Library work.

    Architecture uart1 of Entity uart is up to date.

    HDL Analysis

    Analyzing Entity (Architecture ).

    INFO:Xst:1739 - HDL ADVISOR - C:/Xilinx/UART/uart.vhd line 8: declaration of a buffer port will make it difficult for you to validate this design by simulation. It is preferable to declare it as output.

    Entity analyzed. Unit generated.

    Analyzing Entity (Architecture ).

    Entity analyzed. Unit generated.

    Analyzing Entity (Architecture ).

    Entity analyzed. Unit generated.

    Analyzing Entity (Architecture ).

    Entity analyzed. Unit generated.

    HDL Synthesis

    Synthesizing Unit .

    Related source file is :/Xilinx/UART/clk_divider.vhd.

    WARNING:Xst:647 - Input is never used.

    Found 1-bit 8-to-1 multiplexer for signal .

    Found 4-bit up counter for signal .

    Found 8-bit up counter for signal .

    Found 3-bit up counter for signal .

    Summary: inferred 3 Counter(s).

         inferred 1 Multiplexer(s).

    Unit synthesized.

    Synthesizing Unit .

    Related source file is C:/Xilinx/UART/uart_transmitter.vhd.

    Found finite state machine for signal .

    Found 4-bit up counter for signal .

    Found 8-bit register for signal .

    Found 9-bit register for signal .

    Found 7 1-bit 2-to-1 multiplexers.


         inferred 1 Finite State Machine(s).

         inferred 1 Counter(s).

         inferred 18 D-type flip-flop(s).

         inferred 7 Multiplexer(s).

    Unit synthesized.

    Synthesizing Unit .

    Related source file is C:/Xilinx/UART/uart_receiver.vhd.

    Found finite state machine for signal .

    Found 1-bit register for signal .

    Found 3-bit up counter for signal .

    Found 4-bit up counter for signal .

    Found 8-bit register for signal .


         inferred 1 Finite State Machine(s).

         inferred 2 Counter(s).

         inferred 17 D-type flip-flop(s).

    Unit synthesized.

    Synthesizing Unit .

    Related source file is C:/Xilinx/UART/uart.vhd.

    Found 8-bit tristate buffer for signal .

    Found 3-bit register for signal .

    Found 1-bit register for signal .

    Found 1-bit register for signal .

    Found 1-bit register for signal .

    Found 1-bit register for signal .

    Found 1-bit register for signal .

    Found 1-bit register for signal .


         inferred 9 D-type flip-flop(s).

         inferred 8 Tristate(s).

    Advanced HDL Synthesis

    Selecting encoding for FSM_1 ...

    Optimizing FSM on signal with one-hot encoding.

    Selecting encoding for FSM_0 ...

    Optimizing FSM on signal with one-hot encoding.

    Low Level Synthesis

    Optimizing unit ...

    Optimizing unit ...

    Optimizing unit ...

    Loading device for application Xst from file 'v50.nph' in environment C:/Xilinx.

    Mapping all equations...

    Building and optimizing final netlist ...

    Found area constraint ratio of 100 (+ 5) on block uart, actual ratio is 8.

    Final Report

    Final Results

    Device utilization summary:





    (*) This 1 clock signal(s) are generated by combinatorial logic, and XST is not able to identify which are the primary clock signals. Please use the CLOCK_SIGNAL constraint to specify the clock signal(s) generated by combinatorial logic.

    Timing Summary:

    Speed Grade: -6

    Minimum period: 11.093ns (Maximum Frequency: 90.147MHz)

    Minimum input arrival time before clock: 7.861ns

    Maximum output required time after clock: 10.823ns

    Maximum combinational path delay: 10.931ns

    Timing Detail:

    All values displayed in nanoseconds (ns)
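
    The maximum frequency quoted in the timing summary is the reciprocal of the minimum period. The following Python sketch reproduces that figure and compares it with the fastest clock used in simulation (BclkX8 at 76.8 kHz). The names are illustrative assumptions, and the comparison is only a rough indication of timing headroom, not a result taken from the synthesis report.

```python
# Reproduce the "Maximum Frequency" figure from the reported minimum period.
MIN_PERIOD_NS = 11.093      # minimum period from the timing summary
BCLKX8_HZ = 9600 * 8        # receiver sampling clock used in simulation

def max_frequency_mhz(min_period_ns):
    """Convert a minimum clock period in ns to a maximum frequency in MHz."""
    return 1000.0 / min_period_ns

f_max_mhz = max_frequency_mhz(MIN_PERIOD_NS)

# Ratio between the highest supported clock rate and the sampling clock.
headroom = f_max_mhz * 1_000_000 / BCLKX8_HZ

print(f"maximum frequency : {f_max_mhz:.3f} MHz")
print(f"headroom          : about {headroom:.0f} times the BclkX8 rate")
```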


    The UART is designed using VHDL. The Transmitter, Receiver and Clock Divider are designed as separate modules, and port mapping is used to instantiate each sub-module in the top-level design. VHDL helps to ensure design accuracy and design portability, and it replaces the costly prototyping process with validation through simulation. VHDL is therefore well suited to designing and simulating complicated systems.

    The operation of the Clock Divider, Transmitter and Receiver is tested by simulating their behavior with ModelSim XE II/Starter version 5.7c. Synthesis is performed with Xilinx ISE version 6.1i.

    The ability of VHDL to describe complicated circuits is evident from the RTL schematics generated by the synthesis software. The synthesis report was analyzed, and the placement and routing of the design in the selected FPGA were observed.

