The graphics processing unit, or GPU as it is popularly known, has become an integral part of modern computing devices and systems. Over the last decade, the industry has seen rapid development in both the capability and the performance of GPUs. The modern GPU is not just a dedicated and powerful graphics engine; it is also a highly parallel programmable processor whose arithmetic throughput and memory bandwidth considerably surpass those of its CPU counterpart.
The GPU's rapid growth in both capability and programmability has given rise to a research community that has successfully mapped a wide range of computationally demanding problems onto the GPU. This effort in general-purpose computing on the GPU has positioned the GPU as a compelling alternative to conventional microprocessors in the high-performance computer systems of the future.
This paper introduces graphics processing units (GPUs) and provides an overview of their functions and various configurations.
A graphics processing unit (GPU) is a dedicated graphics rendering device for the following:-
Contemporary GPUs are very efficient at displaying and manipulating computer graphics. Owing to their highly parallel structure, they outperform general-purpose central processing units (CPUs) on a specific range of complex algorithms.
A GPU is found in one of two places:-
On a video card
Directly integrated into the motherboard
GPUs mounted on a graphics card are more powerful than those integrated into the motherboard, although integrated GPUs are found in more than 85% of the desktops and notebooks being manufactured today.
1.1 Graphics Accelerators
A GPU is a processor attached to a graphics card and dedicated to floating-point and similar calculations.
Graphics accelerators contain custom microchips that implement special mathematical operations commonly used in graphics rendering. The efficiency of these microchips therefore determines the effectiveness of the graphics accelerator. They are generally employed for high-end 3D rendering and 3D gaming.
A GPU executes a number of primitive graphics operations much faster than the host CPU can draw them directly to the screen. The most common operations for 2D graphics include the BitBLT (bit-block transfer) operation and operations for drawing figures such as circles, triangles, arcs and rectangles. Most GPUs today also support 3D computer graphics and include functions related to digital video.
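The BitBLT operation mentioned above can be illustrated with a minimal sketch. It assumes that images are plain lists of pixel rows and that the transfer is a straight copy; the names `bitblt`, `sprite` and `framebuffer` are hypothetical, and hardware blitters typically also offer raster operations that combine source and destination pixels, which are not shown here.

```python
def bitblt(src, dst, src_x, src_y, dst_x, dst_y, width, height):
    """Copy a width x height block of pixels from src into dst (plain copy)."""
    for row in range(height):
        for col in range(width):
            dst[dst_y + row][dst_x + col] = src[src_y + row][src_x + col]
    return dst


# A 2x2 sprite and a 4x4 frame buffer, both stored as lists of pixel rows.
sprite = [[1, 2],
          [3, 4]]
framebuffer = [[0] * 4 for _ in range(4)]

# Copy the whole sprite so that its top-left corner lands at (1, 1).
bitblt(sprite, framebuffer, 0, 0, 1, 1, 2, 2)

for line in framebuffer:
    print(line)
```

The double loop makes the cost visible: every pixel is moved individually, which is the kind of repetitive work a graphics accelerator takes over from the CPU.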
Companies that produce GPUs
AMD - Manufacturer of the ATI Radeon and ATI FireGL graphics chip lines
NVIDIA - Manufacturer of the NVIDIA GeForce and NVIDIA Quadro graphics chip lines
Intel - Manufacturer of integrated graphics products that compete with the add-on GPUs offered by NVIDIA and ATI
GeForce 6600GT (NV43) GPU
Functions of GPUs
Support programmable shaders that manipulate textures and vertices with many of the same operations supported by the central processing unit (CPU), along with oversampling and interpolation techniques to reduce aliasing, and very high-precision color spaces. Since most of these computations involve matrix and vector operations, engineers and scientists have increasingly studied the use of GPUs for non-graphical calculations.
Expedite geometric calculations, including the rotation and translation of vertices into different coordinate systems
Devote most of their transistors to calculations related to 3D computer graphics
Accelerate the memory-intensive work of texture mapping and polygon rendering
Decode high-definition video on the card, taking some load off the CPU
Provide frame buffer capabilities and 2D acceleration
Support the YUV color space, hardware overlays and MPEG primitives such as the inverse discrete cosine transform (iDCT) and motion compensation
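The rotation and translation of vertices listed among the functions above can be sketched in two dimensions as follows. This is only an illustration under stated assumptions: the function name `rotate_translate` and the sample triangle are hypothetical, and a real GPU applies such transforms to 3D vertices using matrix hardware rather than a Python loop.

```python
import math


def rotate_translate(vertices, angle_deg, tx, ty):
    """Rotate each (x, y) vertex about the origin, then shift it by (tx, ty)."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    transformed = []
    for x, y in vertices:
        # Standard 2D rotation followed by a translation.
        transformed.append((c * x - s * y + tx, s * x + c * y + ty))
    return transformed


triangle = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
moved = rotate_translate(triangle, 90.0, 2.0, 3.0)
print(moved)
```

Every vertex is transformed independently of the others, which is why this kind of calculation suits the parallel structure of a GPU.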
GPGPU stands for General-Purpose computation on GPUs. With the increasing programmability of commodity graphics processing units (GPUs), these chips are capable of performing more than the specific graphics computations for which they were designed. They are now capable coprocessors, and their high speed makes them useful for a variety of applications.
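The style of computation that suits a GPU used as a coprocessor can be sketched with a small data-parallel example. This is an assumption-laden illustration in ordinary Python, not GPU code: `saxpy_kernel` and `launch` are hypothetical names, and the loop inside `launch` runs sequentially where a GPU would run the invocations concurrently.

```python
def saxpy_kernel(i, a, x, y):
    """Work for a single element: a * x[i] + y[i]."""
    return a * x[i] + y[i]


def launch(kernel, n, *args):
    """Invoke the kernel once per index; a GPU would run these in parallel."""
    return [kernel(i, *args) for i in range(n)]


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
out = launch(saxpy_kernel, len(x), 2.0, x, y)
print(out)
```

Because no invocation depends on the result of another, the work can be spread across as many processing elements as the hardware provides.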
2.1 Dedicated graphics cards
The most powerful class of GPUs typically interfaces with the motherboard through an expansion slot such as Accelerated Graphics Port (AGP) or PCI Express (PCIe), and can usually be replaced or upgraded with relative ease, provided that the motherboard supports the upgrade. A few graphics cards still use Peripheral Component Interconnect (PCI) slots, but because of their limited bandwidth they are generally used only when an AGP or PCIe slot is unavailable.
NVIDIA 7600 GT GPU
A dedicated GPU is not necessarily removable, nor does it necessarily interface with the motherboard in a standard fashion. The term "dedicated" refers to the fact that dedicated graphics cards have Random Access Memory (RAM) that is reserved for the card's use, not to the fact that most dedicated GPUs are removable. Dedicated GPUs for laptops and other portable devices are typically interfaced through a non-standard and often proprietary slot because of size and weight constraints. Such ports may still be regarded as AGP or PCIe in terms of their logical host interface, even though they are not physically interchangeable with their counterparts.
Several cards can jointly draw a single image, so that the number of pixels can be doubled and anti-aliasing can be set to a higher quality. If the screen is divided into left and right halves, each card can cache the textures and geometry for its own half.
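The idea of dividing the screen into left and right halves can be sketched as follows. This is a simplified illustration only: `render_half`, `split_frame_render` and `shade` are hypothetical names, the two halves are rendered one after the other rather than on separate cards, and the per-pixel function stands in for real rendering work.

```python
def render_half(x_start, x_end, height, shade):
    """Render the columns x_start..x_end-1 of every row (one card's share)."""
    return [[shade(x, y) for x in range(x_start, x_end)] for y in range(height)]


def split_frame_render(width, height, shade):
    """Render the left and right halves separately, then join them row by row."""
    mid = width // 2
    left = render_half(0, mid, height, shade)        # work for the first card
    right = render_half(mid, width, height, shade)   # work for the second card
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def shade(x, y):
    """Placeholder pixel function standing in for real shading."""
    return x + 10 * y


frame = split_frame_render(4, 2, shade)
print(frame)
```

Each half only touches the pixels in its own column range, which is what allows each card to keep just the textures and geometry for its side.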
2.2 Integrated graphics solutions
Integrated graphics solutions, or shared graphics solutions, are graphics processors that use a portion of a computer's system Random Access Memory (RAM) rather than dedicated graphics memory. Computers with integrated graphics account for 93% of all Personal Computer (PC) shipments. These solutions are cheaper to implement than dedicated graphics solutions, but are less capable. Historically, integrated solutions were often considered unfit to play 3D games or run graphically intensive programs such as Adobe Flash. However, today's integrated solutions, such as Intel's GMA X3000, AMD's Radeon HD 3200 and NVIDIA's GeForce 8200, are more than capable of handling extensive 2D graphics or low-end 3D graphics. Nevertheless, these GPUs still struggle with high-end video games. Modern desktop motherboards often come with an integrated graphics solution as well as expansion slots, so that the consumer can add a dedicated graphics card later if desired.
Intel GMA X3000 IGP (Under Heatsink)
Integrated graphics solutions use system RAM, which offers between 2 GB/s and 12.8 GB/s of bandwidth
Dedicated GPUs enjoy much higher memory bandwidth, from 10 GB/s to above 100 GB/s depending on the model, and are therefore much faster than integrated ones.
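To give a feel for what these bandwidth figures mean, the following sketch estimates how long it takes to move one frame of pixel data at each end of the quoted ranges. It reads the figures as gigabytes per second and assumes a hypothetical 1920 x 1080 frame at 4 bytes per pixel; the function name `transfer_time_ms` is likewise an illustration, and real transfers involve latency and overheads that are ignored here.

```python
def transfer_time_ms(num_bytes, bandwidth_gb_per_s):
    """Ideal time in milliseconds to move num_bytes at the given bandwidth."""
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1e3


# Hypothetical workload: one 1920 x 1080 frame at 4 bytes per pixel.
frame_bytes = 1920 * 1080 * 4

bandwidths = [
    ("integrated, low end", 2.0),
    ("integrated, high end", 12.8),
    ("dedicated, low end", 10.0),
    ("dedicated, high end", 100.0),
]

for label, gb_per_s in bandwidths:
    print(f"{label}: {transfer_time_ms(frame_bytes, gb_per_s):.4f} ms")
```

The ratio between the slowest and fastest case is simply the ratio of the bandwidths, so a 50-fold difference in bandwidth translates directly into a 50-fold difference in ideal transfer time.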
2.3 Hybrid solutions
These GPU solutions were introduced to compete with integrated graphics accelerators in the laptop and personal computer markets. The most popular are:-
Hybrid graphics cards are costlier than integrated graphics solutions, but much more affordable than dedicated graphics cards. They share memory with the system and carry a small amount of dedicated memory to compensate for the high latency of the system Random Access Memory (RAM). Technologies within PCI Express make this possible. While these solutions are often marketed as having as much as 1024MB of RAM, this figure refers to the amount that can be shared with the system memory.
2.4 Programming model
The Stream Programming Model defines components with pipelines of kernels, and the development tools relieve the programmer of having to deal with memory or synchronization.
Any ANSI C application can be compiled to run as is, and kernels can then be mapped to the DPU for performance optimization. Because the code remains in C, algorithmic changes can be introduced quickly without unraveling assembly or RTL, and source-tree management becomes trivial.
DSP PROGRAM FLOW (STREAM PROCESSING)
For kernels, a few StreamC™ keywords define the streams, and intrinsics are available for DSP operations that have no standard C equivalent, such as the dot product.
For multiple threads and processes, conventional OS-level task switching is used, and SPI provides an application framework and plug-in DSP libraries.
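The notion of a pipeline of kernels operating on streams can be sketched in ordinary Python. This is not StreamC and uses none of its keywords or intrinsics; the names `scale_kernel`, `dot_kernel` and `run_pipeline` are hypothetical, and Python generators stand in for the streams that the development tools would manage.

```python
def scale_kernel(stream, gain):
    """Kernel 1: multiply every sample in the stream by a constant gain."""
    for sample in stream:
        yield sample * gain


def dot_kernel(stream, coeffs):
    """Kernel 2: dot product of the most recent samples with the coefficients."""
    window = []
    for sample in stream:
        window.append(sample)
        if len(window) > len(coeffs):
            window.pop(0)
        if len(window) == len(coeffs):
            yield sum(w * c for w, c in zip(window, coeffs))


def run_pipeline(source, kernels):
    """Feed the source through each kernel in turn and collect the output."""
    stream = iter(source)
    for kernel in kernels:
        stream = kernel(stream)
    return list(stream)


result = run_pipeline(
    [1, 2, 3, 4],
    [lambda s: scale_kernel(s, 2), lambda s: dot_kernel(s, [1, 1])],
)
print(result)
```

Each kernel sees only its input stream and produces an output stream, so the code that wires kernels together never handles buffers or synchronization explicitly, which mirrors the division of labor described above.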