Video is the technology of electronically capturing, recording, processing, storing, transmitting, and reconstructing a sequence of still images representing scenes in motion. Video technology was first developed for television systems, but has been further developed in many formats to allow for consumer video recording.
Starting in the late 1970s and continuing into the early 1980s, several types of video production equipment, such as time base correctors (TBCs) and digital video effects (DVE) units (among the latter the Ampex ADO and the NEC DVE), were introduced that operated by taking a standard analog video input and digitizing it internally. This made it easier to correct or enhance the video signal, as in the case of a TBC, or to manipulate and add effects to it, in the case of a DVE unit. The digitized and processed video from these units would then be converted back to standard analog video.
Also in the late 1970s, manufacturers of professional video broadcast equipment, such as Bosch (through its Fernseh division), RCA, and Ampex, developed prototype digital videotape recorders in their research and development labs. Bosch's machine used a modified 1" Type B transport and recorded an early form of CCIR 601 digital video. None of these machines was ever marketed commercially, however.
Digital video was first introduced commercially in 1986 with the Sony D-1 format, which recorded an uncompressed standard definition component video signal in digital
form instead of the high-band analog forms that had been commonplace until then. Due to the expense, D-1 was used primarily by large television networks. It would eventually be replaced by cheaper systems using compressed data, most notably Sony's Digital Betacam, still heavily used as a field recording format by professional television producers.
Consumer digital video first appeared in the form of QuickTime, Apple Computer's architecture for time-based and streaming data formats, which appeared in crude form around 1990. Initial consumer-level content creation tools were crude, requiring an analog video source to be digitized to a computer-readable format. While low-quality at first, consumer digital video increased rapidly in quality, first with the introduction of playback standards such as MPEG-1 and MPEG-2 (adopted for use in television transmission and DVD media), and then the introduction of the DV tape format allowing recording direct to digital data and simplifying the editing process, allowing non-linear editing systems to be deployed wholly on desktop computers.
Attempts to display media on computers were made as far back as the earliest days of computing, in the mid-20th century. However, little progress was made for several decades, due primarily to the high cost and limited capabilities of computer hardware.
Academic experiments in the 1970s proved out the basic concepts and feasibility of streaming media on computers, and during the late 1980s consumer-grade computers became powerful enough to display various media. The primary technical issues with streaming were:
• having enough CPU power and bus bandwidth to support the required data rates
• creating low-latency interrupt paths in the OS to prevent buffer under-run
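The first of these issues can be made concrete with a back-of-the-envelope calculation (the figures below are illustrative, not from the text): uncompressed standard-definition video demands a data rate far beyond what early consumer hardware could move.

```python
# Rough data-rate estimate for uncompressed video, illustrating why early
# consumer hardware could not stream it without compression.

def uncompressed_rate_mbps(width, height, fps, bits_per_pixel=24):
    """Return the raw video data rate in megabits per second."""
    return width * height * fps * bits_per_pixel / 1_000_000

# A 640x480 frame at 30 fps, 24 bits per pixel:
rate = uncompressed_rate_mbps(640, 480, 30)
print(f"{rate:.0f} Mbit/s")  # roughly 221 Mbit/s of sustained throughput
```

This is why the introduction of compressed formats such as MPEG-1 and MPEG-2, discussed below, was a precondition for consumer streaming.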
However, computer networks were still limited, and media was usually delivered over non-streaming channels, such as CD-ROMs.
Eventually, in the 1990s, the technology matured rapidly, bringing:
• greater network bandwidth, especially in the last mile
• increased access to networks, especially the Internet
• use of standard protocols and formats, such as TCP/IP, HTTP, and HTML
• commercialization of the Internet
These advances in computer networking, combined with powerful home computers and modern operating systems, made streaming media practical and affordable for ordinary consumers. Stand-alone Internet radio devices now offer listeners a "no-computer" option for listening to audio streams.
In the following pages we describe one of the most widely adopted methods currently used to broadcast digital video to millions of viewers: Flash Video.
2. Digital Video Broadcasting Technologies
There are numerous video broadcasting (streaming) technologies available nowadays, which are briefly described in the following sections:
2.1 Adobe Flash
Adobe Flash has historically been called Shockwave Flash and, more recently, Macromedia Flash, but has come to be known simply as Flash. Flash is a set of multimedia technologies originally developed by Macromedia and now distributed by Adobe Systems, which acquired Macromedia in 2005. Since its introduction in 1996, Flash technology has become a popular method for adding animation and interactivity to web pages; Flash is commonly used to create animation, advertisements and various web page components, to integrate video into web pages and, more recently, to develop rich Internet applications.
Flash can manipulate vector and raster graphics and supports bi-directional streaming of audio and video. It contains a scripting language called ActionScript. It is available in most common web browsers and on some mobile phones and other electronic devices (using Flash Lite). Several software products, systems, and devices are able to create or display Flash content, including the Adobe Flash Player. The Adobe Flash Professional multimedia authoring program is used to create content for the Adobe Engagement Platform, such as web applications, games and movies, and content for mobile phones and other embedded devices.
Files in the SWF format, traditionally called "Flash movies" or "Flash games", usually have a .swf file extension and may be an object of a web page, "played" in a standalone Flash Player, or incorporated into a Projector, a self-executing Flash movie (with the .exe extension in Microsoft Windows). Flash Video (FLV) files have a .flv file extension and are played from within .swf files.
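The FLV container just described begins with a small fixed header (signature, version, audio/video flags, data offset, per Adobe's published FLV specification) that a player parses before reading the stream tags. A minimal sketch in Python:

```python
import struct

def parse_flv_header(data: bytes):
    """Parse the 9-byte FLV file header defined in Adobe's FLV specification."""
    if data[:3] != b"FLV":
        raise ValueError("not an FLV file")
    version = data[3]
    flags = data[4]                                  # bit 2 = audio, bit 0 = video
    data_offset = struct.unpack(">I", data[5:9])[0]  # big-endian UI32, usually 9
    return {
        "version": version,
        "has_audio": bool(flags & 0x04),
        "has_video": bool(flags & 0x01),
        "data_offset": data_offset,
    }

# A typical FLV version 1 header with both audio and video present:
hdr = parse_flv_header(b"FLV\x01\x05\x00\x00\x00\x09")
print(hdr)
```

This is how players and servers can recognize FLV content even when, as in YouTube's case discussed later, the file extension has been stripped.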
2.2 Microsoft Windows Media
Microsoft Windows Media is a multimedia framework for media creation
and distribution for Microsoft Windows. It consists of a software development kit with several application programming interfaces and a number of prebuilt technologies. It supports ASF, ASX, WMA and WMV files over HTTP and MMS, as well as Windows Media DRM, which is an implementation of digital rights management.
2.3 RealNetworks
RealNetworks is a provider of Internet media delivery software and services based in Seattle, United States. The company is best known for the creation of RealAudio, a compressed audio format; RealVideo, a compressed video format; and RealPlayer, a media player. The company is also known for its subscription-based online entertainment services like Rhapsody, SuperPass, and RealArcade, and for its media properties like Film.com and RollingStone.com (which it operates in partnership with Rolling Stone owners Wenner Media).
RealMedia streaming files can contain RealAudio and RealVideo streams, and several other formats like SMIL. Helix is their free software / open source media framework. The code is released under various licenses, like the RealNetworks Public Source License starting in 2003 and the GPL in 2004. The codecs used however are not provided open-source.
2.4 SHOUTcast
SHOUTcast is a multiplatform freeware digital audio streaming technology developed by Nullsoft. It allows audio content, primarily in MP3 or HE-AAC format, to be broadcast to and from media player software, enabling hobbyists and professionals to create Internet radio/Web radio networks. Additionally, it provides video streaming through Nullsoft Streaming Video (NSV), a media container designed for streaming video content over the Internet.
SHOUTcast consists of a client-server model, with each component communicating via a network protocol that intermingles audio data with metadata such as song titles and the station name. It uses HTTP as a transport protocol, although multicast is another option.
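The intermingling of audio data and metadata works roughly as follows: the client requests metadata with an `Icy-MetaData: 1` request header, the server announces the audio chunk size in an `icy-metaint` response header, and after every chunk it inserts one length byte followed by a NUL-padded metadata string. A sketch of the client-side parsing step, run here on synthetic bytes rather than a live connection:

```python
def split_icy_chunk(chunk: bytes, metaint: int):
    """Split one ICY-framed chunk into (audio_bytes, metadata_string).

    After every `metaint` audio bytes the server inserts one length byte
    (a count of 16-byte units) followed by that much metadata, e.g.
    "StreamTitle='Artist - Song';" padded with NULs.
    """
    audio = chunk[:metaint]
    meta_len = chunk[metaint] * 16
    meta = chunk[metaint + 1 : metaint + 1 + meta_len]
    return audio, meta.rstrip(b"\x00").decode("utf-8", errors="replace")

# Synthetic example: 8 audio bytes, then a 16-byte metadata block.
raw = b"AUDIODAT" + bytes([1]) + b"StreamTitle='X';".ljust(16, b"\x00")
audio, meta = split_icy_chunk(raw, metaint=8)
print(meta)  # StreamTitle='X';
```

A real client would repeat this for every `metaint`-sized chunk of the HTTP response body, feeding the audio bytes to the decoder and the titles to the display.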
SHOUTcast servers and clients are available for Palm OS, Microsoft Windows, FreeBSD, Linux, Mac OS X, and Solaris. Client-only versions exist on Windows Mobile, Series 60 and the PlayStation Portable (PSPradio).
The output format is supported by multiple clients, including Nullsoft's own Winamp, VLC media player, Amarok, XMMS, Zinf and Apple iTunes.
2.5 QuickTime
QuickTime is a multimedia framework developed by Apple Inc. capable of handling various formats of digital video, media clips, sound, text, animation, music, and several types of interactive panoramic images. The QuickTime framework supports a large number of file types and makes it possible to stream this content from the QuickTime Streaming Server to any QuickTime-compliant client.
2.6 Ampache
Ampache is a free software Web-based audio file manager / web media server. The name is a play on the words Apache and Amplifier. It was originally written to take advantage of Apache's Mod_mp3 but has since been adapted to use its own streaming method. It is one of the oldest applications of its kind that is still actively developed.
2.7 FORscene
FORscene is an integrated Internet video platform, covering non-linear editing and publishing for broadcast, web and mobile.
Designed by Forbidden Technologies plc to allow collaborative editing of video, its capabilities extend to video logging, reviewing, publishing and hosting. The system is implemented as a web application with a Java applet as part of its user interface. It runs on multiple platforms without application installation, codec installation, or machine configuration and has many Web 2.0 features.
FORscene has been recognised by the Royal Television Society, winning their award for Technology in the Post Production Process in December 2005, and is now used internationally. However it should be noted that both the underlying compression technology and the user interface are proprietary and covered by software patents.
2.8 FreeCast
FreeCast is a free software application which allows peer-to-peer streaming, sometimes called peercasting. It makes it possible to broadcast an audio (Ogg Vorbis) or video (Ogg Theora) stream to a large number of listeners from a simple DSL connection.
The FreeCast client used by listeners does not require any configuration to work and supports NAT traversal. FreeCast is made up of Java applications released under the GPL and is available for most operating systems.
2.9 Icecast
Icecast is a free streaming media project maintained by the Xiph.org Foundation. The name also refers specifically to the server program which is part of the project.
Icecast was created in December 1998/January 1999 by Jack Moffitt and Barath Raghavan to provide an open source audio streaming server that anyone could modify, use, and tinker with. Version 2 was started in 2001, a ground-up rewrite aimed at multi-format support (initially targeting Ogg Vorbis) and scalability.
The Icecast server is capable of streaming content as Vorbis over standard HTTP, Theora over HTTP, MP3 (via the SHOUTcast protocol), AAC, and NSV over the SHOUTcast protocol.
2.10 Orb
Orb is freeware streaming software that enables users to remotely access all their personal digital media files, including pictures, music, videos, webcams and television. It can be used from any Internet-enabled device, including laptops, Pocket PCs, smart phones, and the PS3 and Xbox 360 video game consoles. The current version of Orb can be used as a replacement for Microsoft's Windows Media Connect software on computers running the Windows operating system.
2.11 Unreal Media Server
Unreal Media Server is a proprietary streaming server for Windows platforms, designed to provide multimedia delivery over LAN and Internet.
It supports media files and live media streams based on the following file formats: AVI (DivX, XviD, VP6 and other codecs), MPEG-1/2/4, WMV/WMA, MP3, ASF and QuickTime. Playlist functionality allows all the files in a server's virtual folder to be played automatically in loop mode. Supported live media sources include digital cameras, microphones, TV tuners, and analog video sources connected to a video card or capture card that supports the DirectShow interface. Live audio/video is encoded in real time with WMA/MP3/GSM 6.10 audio and WMV/MPEG-4 video codecs. Hardware encoder appliances are supported for streaming hardware-compressed content without software transcoding.
3. Players, Encoders, Converters and Libraries
There are numerous players available at present which are capable of playing FLV files either directly on the user's computer or remotely through a web page.
3.1 Players
3.1.1 Desktop Players
Locally installed players are utilities that must be manually installed on the user's computer; they make playback of locally stored FLV and SWF files possible.
Wimpy Desktop FLV Player
Wimpy is a free Mac OS X and Windows desktop player that is capable of playing FLV and SWF files.
FLV Player
FLV Player is a standalone utility to play Adobe Flash Video (FLV) files. It is compatible with Windows 2000, XP and Vista, supports both local and Internet streaming content, has a fullscreen mode and is free.
In addition to the above applications, which have been developed solely to play back FLV files, there are applications that can play back FLV content with the use of external libraries or DirectShow filters. Some of these applications are mentioned below:
Adobe Media Player
VLC media player
Media Player Classic (ffdshow DirectShow codec)
Windows Media Player (ffdshow DirectShow codec)
3.1.2 Server-Side Players
Server-side players are applications that can be embedded in a web page, making it possible to view FLV and SWF files that are stored on the remote server. The files are cached within the player and playback starts with as little delay as possible.
Sonettic Cinema HD Player 4.0
Sonettic Cinema HD Player is a new FLV online player that is also capable of streaming other media content, such as Apple QuickTime (MOV) and MP4.
3.3 Encoders and Converters
Data conversion is the conversion of one form of computer data to another: the changing of bits from one format to a different one, usually for the purpose of application interoperability or to make use of new features. Data encoding, on the other hand, is the compression of a stream in order to decrease its size, so that less bandwidth is required to transfer it. With the use of specific tools, it is possible to convert different video formats such as AVI, WMV, MPG and MOV to FLV files.
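Besides the commercial tools listed below, such a conversion can also be scripted around the open-source ffmpeg tool. The sketch below only builds the command line rather than running it; file names are placeholders, and the options assume an ffmpeg build with FLV and MP3 support (FLV audio is restricted to 44100/22050/11025 Hz sample rates):

```python
import shlex

def flv_convert_command(src: str, dst: str, video_kbps: int = 500) -> list[str]:
    """Build an ffmpeg argument list that converts `src` to Flash Video."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "flv",            # FLV (Sorenson Spark) video codec
        "-c:a", "libmp3lame",     # MP3 audio, as used by most FLV content
        "-ar", "44100",           # resample to an FLV-legal audio rate
        "-b:v", f"{video_kbps}k", # target video bitrate
        dst,
    ]

print(shlex.join(flv_convert_command("input.avi", "output.flv")))
```

The resulting list can be handed to `subprocess.run` when ffmpeg is installed.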
A few of the best-known FLV converters are listed in the following table, together with the input file formats that they support, the operating systems they run on, and their vendors:
Name                          Input video formats                                                       OS             Vendor
Flash Video MX                AVI, MOV, WMV, MPG, ASF, MPEG, MPEG-2, MPEG-4, FLV, 3GP, H.264, RM, RMVB  Windows        Moyea
Kibisis                       AVI, MPG, MPEG                                                            Windows        Kibisis
On2 Flix Pro                  MOV, WMA, ASF, AV, DV, AVI, WMV                                           Windows & Mac  On2
Riva FLV Encoder              AVI, MPEG, MPEG-2, MPEG-4, MOV, WMV                                       Windows        Rothenberger GTS
Video to Flash Converter Pro  AVI, ASF, WMV, MOV, MP4, MPEG, MPG                                        Windows        Geovid
Video to Flash Encoder        AVI, MOV, WMV, MPG, ASF, MPEG, MPEG-2, MPEG-4, FLV                        Windows        Wondershare
Video to SWF Converter Pro    AVI, MPG, MPEG, MPE, WMV, ASF, RM, RMVB, MOV, MP4                         Windows        AdShareIt
3.4 Libraries & SDKs
In addition to all the above applications, libraries and SDKs have been developed that make it possible to play back FLV files, or to encode and convert other video formats to FLV. The best-known libraries for this purpose are:
Turbine Video Engine SDK
Adobe SWF and FLV File Format Specification
4. Content Delivery Networks
A content delivery network or content distribution network (CDN) is a system of computers networked together across the Internet that cooperate transparently to deliver content (especially large media content) to end users. The first web content based CDNs were Sandpiper, Mirror Image and Skycache followed by Akamai and Digital Island. The first video based CDN was iBEAM Broadcasting.
Currently there are approximately 30 different Content Delivery providers on the market. They all range in size, type, reach and reliability. The top 3 CDNs are considered to be:
Limelight Networks (http://www.limelightnetworks.com/)
CDN nodes are deployed in multiple locations, often over multiple backbones. These nodes cooperate with each other to satisfy requests for content by end users, transparently moving content behind the scenes to optimize the delivery process. Optimization can take the form of reducing bandwidth costs, improving end-user performance, or both.
The number of nodes and servers making up a CDN varies depending on the architecture; some deployments reach thousands of nodes with tens of thousands of servers.
Requests for content are intelligently directed to nodes that are optimal in some way. When optimizing for performance, locations that can serve content quickly to the user may be chosen. This may be measured by choosing locations that are the fewest hops or fewest number of network seconds away from the requestor, so as to optimize delivery across local networks. When optimizing for cost, locations that are less expensive to serve from may be chosen instead. Often these two goals tend to align, as servers that are close to the end user sometimes have an advantage in serving costs, perhaps because they are located within the same network as the end user.
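The two optimization goals just described can be sketched as a toy routing policy; the node names, hop counts and per-gigabyte costs below are invented for illustration:

```python
# Toy request-routing policy: pick the CDN node that minimizes either
# network distance (hops to the requestor) or serving cost.

NODES = [
    {"name": "edge-eu", "hops": 3,  "cost_per_gb": 0.08},
    {"name": "edge-us", "hops": 9,  "cost_per_gb": 0.05},
    {"name": "origin",  "hops": 14, "cost_per_gb": 0.02},
]

def pick_node(nodes, optimize_for="performance"):
    """Choose the best node: fewest hops, or cheapest serving cost."""
    if optimize_for == "performance":
        return min(nodes, key=lambda n: n["hops"])
    return min(nodes, key=lambda n: n["cost_per_gb"])

print(pick_node(NODES)["name"])          # edge-eu (fewest hops)
print(pick_node(NODES, "cost")["name"])  # origin (cheapest)
```

A production CDN would feed this decision with live measurements (probing, connection monitoring) rather than static figures, and would often blend both objectives into one score.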
The techniques that are most commonly used by CDNs are briefly described in the following section:
Web caches store popular content closer to the user. These shared network appliances reduce bandwidth requirements, reduce server load, and improve client response times for content stored in the cache.
Server-load balancing uses one or more layer 4-7 switches, also known as a web switch, content switch, or multilayer switch to share traffic among a number of servers or web caches. In this case the switch is assigned a single virtual IP address. Traffic arriving at the switch is then directed to one of the real web servers attached to the switch. This has the advantages of balancing load, increasing total capacity, improving scalability, and providing increased reliability by redistributing the load of a failed web server and providing server health checks.
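A minimal sketch of this idea: requests arrive at the switch's single virtual IP and are handed round-robin to the real servers behind it, with failed servers skipped via health checks. Server names and health flags are illustrative:

```python
import itertools

class WebSwitch:
    """Toy layer 4-7 switch: one virtual IP in front of real web servers."""

    def __init__(self, servers):
        self.servers = servers                    # {"name": healthy?}
        self._rr = itertools.cycle(sorted(servers))

    def route(self):
        """Return the next healthy server, skipping failed ones."""
        for _ in range(len(self.servers)):
            name = next(self._rr)
            if self.servers[name]:                # health check
                return name
        raise RuntimeError("no healthy servers behind the virtual IP")

switch = WebSwitch({"web1": True, "web2": False, "web3": True})
print([switch.route() for _ in range(4)])  # ['web1', 'web3', 'web1', 'web3']
```

Marking `web2` healthy again would automatically return it to the rotation, which is how such switches redistribute load after a server recovers.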
A content cluster or service node can be formed using a layer 4-7 switch to balance load across a number of servers or a number of web caches within the network.
Request routing directs client requests to the content source best able to serve the request. This may involve directing a client request to the service node that is closest to the client, or to the one with the most capacity. A variety of algorithms are used to route the request. These include Global Server Load Balancing, DNS-based request routing, HTML rewriting, and anycasting. Proximity (choosing the closest service node) is estimated using a variety of techniques, including reactive probing, proactive probing, and connection monitoring.
Service providers increasingly provide value-added services beyond basic data transport. Features such as virus scanning and parental
control are being offered, hoping to increase service attractiveness, user loyalty, revenue, and profit. Web caches and service nodes distributed throughout the content delivery network provide convenient dispatch points for connecting to enhanced services. This method is sometimes called vectoring of messages.
Finally, it should be mentioned that two protocol suites are designed to provide access to a wide variety of content services distributed throughout a content network. The Internet Content Adaptation Protocol (ICAP) was developed in the late 1990s to provide an open standard for connecting application servers. A more recently defined and more robust solution is provided by the Open Pluggable Edge Services (OPES) protocol. This architecture defines OPES service applications that can reside on the OPES processor itself or be executed remotely on a callout server.
5. Technologies and Transmission Protocols
5.1 Unicast Streaming
Unicast streaming is a one-to-one streaming session in which the client machine contacts the server to request the streaming of Online Video Content. Unicast uses IP delivery methods such as Transmission Control Protocol (TCP) and User Datagram Protocol (UDP), which are session-based protocols. This method of content streaming, in which a packet is sent from a single source to a specified destination, is still the predominant form of transmission on Local Area Networks (LANs) and within the Internet. Each unicast client that connects to the server takes up additional
bandwidth. For example, if you have 10 clients all playing 100-kilobits per second (Kbps) streams, those clients as a group are taking up 1,000 Kbps. If you have only one client playing the 100 Kbps stream, only 100 Kbps is being used.
The actual implementation of unicast streaming has one major drawback, however. Because the bandwidth requirement is proportional to the number of clients the server has to serve, using unicast for online content streaming can be a costly proposition.
5.2 Multicast Streaming
Multicast streaming is a technique for "one to many" communication over an IP infrastructure. It scales to a larger receiver population by not requiring prior knowledge of who or how many receivers there are. Multicast utilizes network infrastructure efficiently by requiring the source to send a packet only once, even if it needs to be delivered to a large number of receivers. The nodes in the network take care of replicating the packet to reach multiple receivers only where necessary. Multicast is considered to be the true broadcast. The actual server hosting the video content does not deliver the actual content to each single client but instead sends the data packet once, and the clients connected to the Multicast Group receive it. This is similar to tuning into a station on a radio. Each client that listens to the multicast adds no additional overhead on the server. In fact, the server sends out only one stream per multicast station.
The same load is experienced on the server whether only one client or 1,000 clients are listening.
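The contrast between the two delivery modes can be expressed numerically, reusing the figures from the unicast example above:

```python
# Server-side bandwidth: unicast grows linearly with the audience, while
# multicast sends each stream once regardless of how many clients listen.

def server_bandwidth_kbps(clients: int, stream_kbps: int, multicast: bool = False) -> int:
    """Return the bandwidth the server must emit for `clients` viewers."""
    return stream_kbps if multicast else clients * stream_kbps

print(server_bandwidth_kbps(10, 100))                    # 1000 Kbps (unicast)
print(server_bandwidth_kbps(1000, 100, multicast=True))  # 100 Kbps
```

The replication work that unicast pushes onto the server is, under multicast, performed by the routers along the delivery tree, which is exactly why those routers must be multicast-enabled.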
Like unicast, however, multicast suffers from one major disadvantage: multicast on the Internet is generally not practical, because only small sections of the Internet are multicast-enabled. For multicast addresses and packets to function properly, they need to traverse multicast-enabled routers, which is mostly not the case today. Efforts are being made, however; major broadcasters such as the BBC have begun encouraging UK-based ISPs to adopt multicast on their networks.
6. User generated content
User generated content (UGC, often hyphenated), also known as Consumer Generated Media (CGM) or User created Content (UCC), refers to various kinds of media content, publicly available, that are produced by end-users.
The term entered mainstream usage during 2005 after arising in web publishing and new media content production circles. It reflects the expansion of media production through new technologies that are accessible and affordable to the general public. These include digital video, blogging, podcasting, news, gossip, research, mobile phone photography and wikis. In addition to these technologies, user generated content may also employ a combination of open source, free software, and flexible licensing or related agreements to further diminish the barriers to collaboration, skill-building and discovery.
Sometimes UGC can constitute only a portion of a website. For example
on Amazon.com the majority of content is prepared by administrators, but numerous user reviews of the products being sold are submitted by regular visitors to the site. Often UGC is partially or totally monitored by website administrators to avoid offensive content or language, copyright infringement issues, or simply to determine if the content posted is relevant to the site's general theme.
The advent of user generated content marks a shift among some media organizations from creating on-line content to creating the facilities and framework for non-media professionals (i.e., 'ordinary people') to publish their own content in prominent places.
User generated content has also been characterized as 'Conversational Media', as opposed to 'Packaged Goods Media' (that is, traditional media). The former is a two-way process in contrast to the one-way distribution of the latter. Conversational or two-way media is a key characteristic of so-called Web 2.0 which encourages the publishing of one's own content and commenting on other people's.
The notion of the passive audience has therefore shifted since the birth of New Media, and an ever-growing number of participatory users are taking advantage of interactive opportunities, especially on the Internet, to create independent content. Grassroots experimentation has generated innovations in sounds, artists, techniques and associations with audiences which are then used in mainstream media. The active, participatory and creative audience prevails today with relatively accessible media, tools and applications, and its culture is in turn impacting mass media corporations and global audiences.
The Organization for Economic Co-operation and Development has defined three central characteristics for UGC:
Publication requirement: While UGC could be made by a user and never published online or elsewhere, we focus here on the work that is published in some context, be it on a publicly accessible website or on a page on a social networking site only accessible to a select group of people (e.g., fellow university students). This is a useful way to exclude email, two-way instant messages and the like.
Creative effort: This implies that a certain amount of creative effort was put into creating the work or adapting existing works to construct a new one; i.e. users must add their own value to the work. The creative effort behind UGC often also has a collaborative element to it, as is the case with websites which users can edit collaboratively. For example, merely copying a portion of a television show and posting it to an online video website (an activity frequently seen on the UGC sites) would not be considered UGC. If a user uploads his/her photographs, however, expresses his/her thoughts in a blog, or creates a new music video, this could be considered UGC. Yet the minimum amount of creative effort is hard to define and depends on the context.
Creation outside of professional routines and practices: User generated content is generally created outside
of professional routines and practices. It often does not have an institutional or a commercial market context. In extreme cases, UGC may be produced by non-professionals without the expectation of profit or remuneration. Motivating factors include: connecting with peers, achieving a certain level of fame, notoriety, or prestige, and the desire to express oneself.
Mere copy & paste or even a link could also be seen as user generated self-expression. The action of linking to or copying a work could in itself motivate the creator and express the taste of the person linking or copying. Digg.com, Stumbleupon.com and leaptag.com are good examples of sites where such linkage happens. The culmination of such linkages could very well identify the tastes of a person in the community and make that person unique through statistical probabilities.
Common examples of websites based on user generated content are examined in the following case studies:
6.1 Case Study: YouTube
YouTube is a video sharing website where users can upload, view and share video clips. YouTube was created in mid-February 2005 by three former PayPal employees. The San Bruno-based service uses Adobe Flash technology to display a wide variety of video content, including movie clips, TV clips and music videos, as well as amateur content such as videoblogging
and short original videos. In October 2006, Google Inc. announced that it had reached a deal to acquire the company for US$1.65 billion in Google stock, and eventually the deal closed on November 13, 2006.
Unregistered users can watch most videos on the site, while registered users are permitted to upload an unlimited number of videos. Some videos are available only to users aged 18 or older (e.g. videos containing potentially offensive content), while the uploading of pornography or videos containing nudity is prohibited. Related videos, determined by title and tags, appear onscreen to the right of a given video. In YouTube's second year, functions were added to enhance user ability to post video 'responses' and subscribe to content feeds.
Few statistics are publicly available regarding the number of videos on YouTube. However, in July 2006, the company revealed that more than 100 million videos were being watched every day, and 2.5 billion videos were watched in June 2006. 50,000 videos were being added per day in May 2006, and this increased to 65,000 by July.
As of November 2007, YouTube plays back videos limited in both size and quality. The size is limited to pixel dimensions of 320 by 240 and the quality is limited to a bitrate of around 314 kbit/s, with a frame rate dependent on the uploaded video. YouTube limits the playback size and quality by re-encoding the user's uploaded video at the time of upload. In 2006 YouTube permitted playback at higher quality, larger sizes, and in stereo, but some time after January 2007 it applied quality reductions to new uploads.
YouTube's video playback technology is based on Adobe's Flash Player 9 and uses the Sorenson Spark H.263 video codec. This technology allows the site to display videos with quality comparable to more established video playback technologies (such as Windows Media Player, QuickTime and RealPlayer) that generally require the user to download and install a web browser plugin in order to view video. Flash also requires a plug-in, but Adobe considers the Flash 7 plug-in to be present on about 90% of online computers. The video can also be played back with third-party media players like the ones mentioned earlier in the text.
Regarding audio, YouTube files contain an MP3 audio stream which, by default, is mono, encoded at 65 kbit/s with a sampling rate of 22,050 Hz.
YouTube converts videos into .FLV (Adobe Flash Video) format after uploading. The extension is then stripped from the file (the format can be determined from the server's MIME type). The different files are stored in obscurely named subdomains, accessible either directly or through YouTube's get_video PHP script. YouTube also converts content to other formats so that it can be viewed outside of the website.
YouTube officially accepts uploaded videos in the .WMV, .AVI, .MOV, .MPEG and .MP4 formats.
Users can view videos in windowed or full-screen mode, and it is possible to switch modes during playback without reloading the video, thanks to the full-screen function of Adobe Flash.
6.2 Case Study: Revver
Revver is a video sharing website that hosts user-generated content. Revver attaches advertising to user-submitted video clips and shares all ad revenue 50/50 with the creators. Videos can be displayed, downloaded and shared across the web in either Apple QuickTime or FLV format. In addition, Revver is a Video Publishing Platform that can enable any third-party to build their own "Revverized" site. The site at http://www.revver.com is actually built on top of Revver's own API, and third-parties can build identical functionality into their own sites. Revver allows developers to create a complete white label of the Revver platform.
Revver is the first video-sharing website to monetize user-generated content through advertising and share ad revenue with the creator. Revver's system is often compared to Google's AdWords, but for video rather than websites.
The key technology behind Revver is the RevTag, a tracking tag that is attached to videos that users upload. The RevTag automatically displays a clickable advert at the end of each video. When viewers click on it, the advertiser is charged and the advertising fee is split between the video creator and Revver.
RevTags are trackable across the web, so users are encouraged to share Revver videos as widely as possible. Since the RevTag is part of the video file itself (thanks to the interactivity made possible by Flash-based video players and by the QuickTime format), the technology works no matter where the video file is hosted or displayed, be it at Revver.com, at another website, or on a user's hard drive or portable video player. Therefore Revver's monetization of the video is not hampered by the downloading or sharing of the video file by users.
The RevTag can fail to load an ad, or to register an ad click, if the device playing the video is not connected to the internet, in which case a default "Brought to you by Revver" message is shown at the end. Of course, if the video file is transcoded into a different format (such as by uploading it to YouTube or Google Video, or by running it through a program that changes the format of the video file, e.g. into MPEG or RealPlayer), then the RevTag would almost certainly be lost. Network problems between the viewer's computer and Revver's ad server, or problems on the ad server, can also prevent the loading of ads.
The Revver website provides tools for sharing including RSS, podcasting, and embeddable FLV or QuickTime players. This minimizes any added benefits of transcoding. Revver thus makes it easy for creators and fans to embed the video anywhere while still in its original RevTagged version.
Users are further encouraged to share by Revver's affiliate program. An Affiliate is a user who helps to promote their favorite videos (or any videos they believe will become popular), be it through email, sneakernet, peer-to-peer sharing, or posting on their own website or on social-networking webpages. Revver affiliates earn 20% of ad revenue for sharing videos. The remaining
revenue for each video is split 50/50 between the video creator and Revver. This is possible because the RevTag in a video file that is promoted by an affiliate contains information not only about the video being played but also about the affiliate.
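The revenue split described above is simple arithmetic. The following sketch illustrates it; the integer-cent rounding rules are an assumption for illustration, not Revver's actual accounting:

```python
def split_ad_revenue(revenue_cents, via_affiliate=False):
    """Split one ad fee per the scheme above: an affiliate (if any)
    takes 20% off the top, and the remainder is split 50/50 between
    the video creator and Revver. Whole-cent rounding is assumed."""
    affiliate = revenue_cents * 20 // 100 if via_affiliate else 0
    remainder = revenue_cents - affiliate
    creator = remainder // 2
    revver = remainder - creator
    return {"affiliate": affiliate, "creator": creator, "revver": revver}
```

For a 100-cent ad fee shared by an affiliate, this yields 20 cents for the affiliate and 40 cents each for the creator and Revver.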
By using the Revver API in conjunction with sharing options such as embedded players, developers can create user-interactive sites where video creators, as the users of such sites, provide video content and where the affiliate revenue for the video content goes to the site owner.
In the past, creators were able to restrict what kinds of advertisements could be placed at the end of their videos, but this is currently not possible. However, advertisers may choose to request that their ads be shown in videos of certain categories (such as videos that are most popular on certain websites), thus allowing them to better target their desired demographics.
To enable lawful sharing of Revver videos, the Revver upload license allows for redistribution under the Attribution-NonCommercial-NoDerivs 2.5 Creative Commons License.
7. Compression and Metadata
7.1.1 Video Compression
Video compression refers to reducing the quantity of data used to represent video images; it is essentially a combination of image compression and motion compensation. The major advantage of compressed video is that it effectively reduces the bandwidth required to transmit digital video via terrestrial broadcast, cable, or satellite services.
Video compression is typically lossy, i.e. it operates on the premise that much of the data present before compression is not necessary for achieving good perceptual quality. For example, DVDs use a video coding standard called MPEG-2 that can compress ~2 hours of video data by 15 to 30 times while still producing a picture quality that is generally considered high quality for standard-definition video. Video compression, like data compression, is a tradeoff between disk space, video quality, and the cost of the hardware required to decompress the video in a reasonable time. However, if the video is over-compressed in a lossy manner, visible (and sometimes distracting) artifacts can appear.
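Those compression ratios can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes NTSC-like standard-definition dimensions (720x480 at 30 frames per second) and 4:2:0 chroma sampling:

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=12):
    # 4:2:0 sampling stores on average 12 bits per pixel
    # (8 bits of luma plus 4 bits of shared chroma).
    return width * height * bits_per_pixel * fps

def compressed_bitrate_bps(raw_bps, ratio):
    return raw_bps / ratio

raw = raw_bitrate_bps(720, 480, 30)    # roughly 124 Mbit/s of raw SD video
dvd = compressed_bitrate_bps(raw, 25)  # roughly 5 Mbit/s at a 25:1 ratio
```

A 25:1 ratio, in the middle of the 15-30x range quoted above, lands near the ~5 Mbit/s video bitrates typical of DVDs.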
To compress digital video, a video codec is required: a device or piece of software that enables compression and/or decompression of digital video.
The best-known video codecs currently used in the digital video industry are the following:
H.261
H.261, developed by the ITU-T, was the first practical digital video compression standard; it is used primarily in older videoconferencing and videotelephony products. Essentially all subsequent standard video codec designs are based on it. It introduced such well-established concepts as YCbCr color representation, the 4:2:0 sampling format, 8-bit sample precision, 16x16 macroblocks, block-wise motion compensation, 8x8 block-wise discrete cosine transformation, zig-zag coefficient scanning, scalar quantization, run+value symbol mapping, and variable-length coding. H.261 supported only progressive scan video.
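Two of those concepts, zig-zag coefficient scanning and run+value symbol mapping, can be sketched in a few lines. This is a toy illustration of the techniques, not any standard's actual bitstream:

```python
def zigzag_order(n=8):
    # Visit the (row, col) positions of an n x n coefficient block
    # along anti-diagonals, alternating direction: the classic
    # zig-zag scan that orders coefficients from low to high frequency.
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def run_value_pairs(coeffs):
    # Map a scanned coefficient sequence to (run-of-zeros, value)
    # symbols, which a variable-length code would then encode.
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs
```

After quantization most high-frequency coefficients are zero, so the zig-zag scan groups the zeros into long runs that the run+value mapping compresses efficiently.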
MPEG-1 Part 2
It is used for Video CDs, and also sometimes for online video. If the source video quality is good and the bitrate is high enough, VCD can look slightly better than VHS. To exceed VHS quality, a higher resolution would be necessary. However, to get a fully compliant VCD file, bitrates higher than 1150 kbit/s and resolutions higher than 352 x 288 should not be used. When it comes to compatibility, VCD has the highest compatibility of any digital video/audio system. Very few DVD players do not support VCD, but they all inherently support the MPEG-1 codec. Almost every computer in the world can also play videos using this codec. In terms of technical design, the most significant enhancements in MPEG-1 relative to H.261 were half-pel and bi-predictive motion compensation support. MPEG-1 supports only progressive scan video.
MPEG-2 Part 2 (a common-text standard with H.262)
It is used on DVD, SVCD, and in most digital video broadcasting and cable distribution systems. When used on a standard DVD, it offers good picture quality and supports widescreen. When used on SVCD, it is not as good as DVD but is certainly better than VCD due to higher resolution and allowed bitrate. Though uncommon, MPEG-1 can also be used on SVCDs, and anywhere else MPEG-2 is allowed, as MPEG-2 decoders are inherently backwards compatible. In terms of technical design, the most significant enhancement in MPEG-2 relative to MPEG-1
was the addition of support for interlaced video. MPEG-2 is now considered an aged codec, but has tremendous market acceptance and a very large installed base.
H.263
Primarily used for videoconferencing, videotelephony, and internet video, H.263 represented a significant step forward in standardized compression capability for progressive scan video. Especially at low bit rates, it could provide a substantial improvement in the bitrate needed to reach a given level of fidelity.
MPEG-4 Part 2
It is an MPEG standard that can be used for internet, broadcast, and on storage media. It offers improved quality relative to MPEG-2 and the first version of H.263. Its major technical features beyond prior codec standards consisted of object-oriented coding features and a variety of other such features not necessarily intended for improvement of ordinary video coding compression capability. It also included some enhancements of compression capability, both by embracing capabilities developed in H.263 and by adding new ones such as quarter-pel motion compensation. Like MPEG-2, it supports both progressive scan and interlaced video.
MPEG-4 Part 10
This emerging new standard (technically aligned with the ITU-T's H.264 and often also referred to as AVC) is the current state of the art of ITU-T and MPEG standardized compression technology, and is rapidly gaining adoption in a wide variety of applications. It contains a number of significant advances in compression capability, and it has recently been adopted
into a number of company products, including for example the XBOX 360, PlayStation Portable, iPod, the Nero Digital product suite, Mac OS X v10.4, as well as HD DVD/Blu-ray Disc.
DivX, Xvid, FFmpeg MPEG-4 and 3ivx
The above codecs are all different implementations of MPEG-4 Part 2.
On2 VP6
A proprietary video codec developed by On2 Technologies and used in Adobe Flash Player 8 and above.
It is a codec that is popularly used by Apple's QuickTime, basically the ancestor of H.264. Many of the QuickTime movie trailers found on the web use this codec.
Sorenson Spark
A codec that was licensed to Macromedia for use in its Flash Player 6 and is in the same family as H.263. This codec will be discussed extensively later in this document.
Theora
Theora was developed by the Xiph.org Foundation as part of their Ogg project. Based upon On2 Technologies' VP3 codec and christened by On2 as the successor in VP3's lineage, Theora is targeted at competing with MPEG-4 video and similar lower-bitrate video compression schemes.
WMV (Windows Media Video)
Microsoft's family of video codec designs including WMV 7, WMV 8, and WMV 9. It can handle anything from low resolution video for dial-up internet users to HDTV. The latest generation of WMV is standardized by SMPTE as the VC-1 standard.
VC-1
VC-1 is an SMPTE-standardized video compression standard (SMPTE 421M). It is based on Microsoft's WMV 9 video codec and is one of the three mandatory video codecs in both the HD DVD and Blu-ray high-definition optical disc standards. It is commonly found in portable devices and on streaming video websites in its Windows Media Video implementation.
RealVideo
RealVideo is a codec technology developed by RealNetworks that was popular a few years ago but is now fading in importance for a variety of reasons.
A very early codec used by Apple's QuickTime.
x264
A GPL-licensed implementation of the H.264 encoding standard; x264 is an encoder only.
Huffyuv
Huffyuv (or HuffYUV) is a very fast, lossless Win32 video codec written by Ben Rudiak-Gould and published under the terms of the GPL as free software, meant to replace uncompressed YCbCr as a video capture format. A more up-to-date fork of Huffyuv is available as Lagarith.
SheerVideo
A family of ultrafast lossless QuickTime and AVI codecs, developed by BitJazz Inc., for RGB[A] 4:4:4[:4] and Y'CbCr[A] 4:2:2[:4] formats; for both 10-bit and 8-bit channels; for both progressive and interlaced data; for both Mac and PC.
The two most common video codecs that are currently being used for digital video broadcast in the form of FLVs are the following:
Sorenson's Spark (H.263)
Sorenson's Spark (H.263) is a video codec standard originally designed by the ITU-T in a project ending in 1995/1996 as a low-bitrate compressed format for videoconferencing. It is one member of the H.26x family of video coding standards in the domain of the ITU-T Video Coding Experts Group (VCEG).
The codec was first designed to be utilized in H.324 based systems (PSTN and other circuit-switched network videoconferencing and videotelephony), but has since also found use in H.323 (RTP/IP-based videoconferencing), H.320 (ISDN-based videoconferencing), RTSP (streaming media) and SIP (Internet conferencing) solutions.
H.263 was developed as an evolutionary improvement based on experience from H.261, the previous ITU-T standard for video compression, and the MPEG-1 and MPEG-2 standards. Its first version was completed in 1995 and provided a suitable replacement for H.261 at all bitrates. It was further enhanced in projects known as H.263v2 (also known as H.263+ or H.263 1998) and H.263v3 (also known as H.263++ or H.263 2000).
The next enhanced codec developed by ITU-T VCEG (in partnership with MPEG) after H.263 is the H.264 standard, also known as AVC and MPEG-4 part 10. As H.264 provides a significant improvement in capability beyond H.263, the H.263 standard is now considered primarily a legacy design (although this is a recent development). Most new videoconferencing products now include H.264 as well as H.263 and H.261 capabilities.
TrueMotion VP6 on the other hand is a video codec developed by On2 Technologies as a successor to earlier efforts such as VP3 and VP5. The VP6 codec has been used in products for broadcasting in the field, such as with BBC reporters and QuickLink software.
TrueMotion VP6 can provide higher visual quality than H.263, especially at lower bit rates, but on the other hand it is computationally more complex and therefore will not run as well on certain older system configurations.
In August 2005, Macromedia announced they had selected VP6 as the flagship new codec for video playback in the new Flash Player 8. It should be noted, however, that the latest Flash Player (version 9) supports the H.264 codec (also known as MPEG-4 part 10, or AVC), which is even more computationally demanding but offers a significantly better quality/bitrate ratio.
7.1.2 Audio Compression
Audio compression, like video compression, refers to the process of reducing the size of the audio stream that accompanies a video file. Audio compression algorithms are implemented in computer software as audio codecs. The most common audio codecs used nowadays are the following:
Advanced Audio Coding (AAC) is a standardized, lossy compression and encoding scheme for digital audio. Designed to be the successor of the MP3 format, AAC generally achieves better sound quality than MP3 at the same bitrate, particularly below 192 kbit/s. AAC's best known use is as the default audio format of Apple's iPhone, iPod, iTunes, and the format used for all iTunes Store audio (with extensions for proprietary digital rights management).
AC-3, or Dolby Digital, is the common version containing up to six discrete channels of sound, with five channels for normal-range speakers (20 Hz - 20,000 Hz) (right front, center, left front, right rear and left rear) and one channel (20 Hz - 120 Hz) for the subwoofer-driven low-frequency effects. Mono and stereo modes are also supported. AC-3 supports audio sample rates up to 48 kHz.
ALAC (also known as Apple Lossless) is an audio codec developed by Apple Inc. for lossless data compression of digital music. Apple Lossless data is stored within an MP4 container with the filename extension .m4a. While Apple Lossless has the same file extension as AAC, it is not a variant of AAC, but uses linear prediction similar to other lossless codecs such as FLAC and Shorten. iPods with a dock connector (not the Shuffle) and recent firmware can play Apple Lossless-encoded files. It does not utilize any digital rights management (DRM) scheme, but by the nature of the container, it is thought that DRM can be applied to ALAC much the same way it can with other files in QuickTime containers.
Adaptive Transform Acoustic Coding (ATRAC) is a family of proprietary audio compression algorithms developed by Sony. MiniDisc was the first commercial product to incorporate ATRAC, in 1992. ATRAC allowed a relatively small disc like MiniDisc to have the same running time as a CD while storing audio information with minimal loss in perceptible quality. Today ATRAC is used in many Sony-branded audio players. Improvements to the codec in the form of ATRAC3, ATRAC3plus and ATRAC Advanced Lossless followed in 1999, 2002 and 2006 respectively. On August 30, 2007, Sony announced that its online music store, Connect Music Services, would no longer support the ATRAC format in Europe.
Free Lossless Audio Codec (FLAC) is a file format for audio data compression. Being a lossless compression format, FLAC does not remove information from the audio stream, as lossy compression formats such as MP3, AAC, and Vorbis do. Like other methods of compression, FLAC's main advantage is the reduction of bandwidth or storage requirements, but without sacrificing the integrity of the audio source. For example, a digital recording (such as a CD) encoded to FLAC can be decompressed into an identical copy of the audio data. Audio sources encoded to FLAC are typically reduced in size 40 to 50 percent (47% according to their own comparison).
Monkey's Audio is an audio codec developed by Matthew T. Ashland that is used primarily on Windows operating systems. While lossless compression ratios depend heavily on the file being compressed, Monkey's Audio generally achieves compression rates that are slightly better than FLAC and significantly better than the older Shorten. On the other hand, both encoding and decoding are generally slightly slower than with FLAC and Shorten, and due to design decisions the decoder is problematic to implement on portable digital audio players. It also suffers from relatively slow seeking, depending on the compression level chosen.
MP3 or more precisely MPEG-1 Audio Layer 3, is a digital audio encoding format. It was invented by a team of international engineers at Philips, CCETT (Centre commun d'études de télévision et télécommunications), IRT, AT&T-Bell
Labs and Fraunhofer Society, and it became an ISO/IEC standard in 1991. This method is described in more detail later in this document.
Vorbis is a free and open source, lossy audio codec project headed by the Xiph.Org Foundation and intended to serve as a replacement for MP3. It is most commonly used in conjunction with the Ogg container and is therefore called Ogg Vorbis. Vorbis development began following a September 1998 letter from Fraunhofer Gesellschaft announcing plans to charge licensing fees for the MP3 audio format. Soon after, founder Christopher "Monty" Montgomery commenced work on the project and was assisted by a growing number of other developers. They continued refining the source code until a stable version 1.0 of the codec was released on July 19, 2002.
WavPack compression (.WV files) can compress (and restore) 8, 16, 24 & 32-bit float audio files in the .WAV file format. It also supports surround sound streams and high frequency sampling rates. Like other lossless compression schemes the data reduction rate varies with the source, but it is generally between 30% and 70% for typical popular music and somewhat better than that for classical music and other sources with greater dynamic range.
WavPack also incorporates a "hybrid" mode that provides all the advantages of lossless compression with an additional bonus: instead of creating a single file, this mode creates both a relatively small, high-quality lossy file (.wv) that can be used all by itself, and a "correction" file (.wvc) that, when combined with the lossy file, provides full lossless restoration. This combines the advantages of lossy and lossless codecs, a feature offered only by this codec and by OptimFROG DualStream.
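The lossy-plus-correction idea can be demonstrated in a few lines. The quantization scheme below is a deliberate simplification chosen for illustration; the actual WavPack format is far more sophisticated:

```python
def hybrid_encode(samples, quant=16):
    """Split a sample stream into a lossy stream (samples quantized to
    multiples of `quant`) and a correction stream of the residuals.
    Adding the two streams back together restores the original exactly."""
    lossy = [(s // quant) * quant for s in samples]
    correction = [s - l for s, l in zip(samples, lossy)]
    return lossy, correction

def hybrid_decode(lossy, correction=None):
    # Without the correction file, play the lossy stream as-is;
    # with it, reconstruct the original losslessly.
    if correction is None:
        return lossy
    return [l + c for l, c in zip(lossy, correction)]
```

Because each residual is small (here, between 0 and 15), the correction stream compresses well, which is what makes shipping it as a separate optional file practical.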
Windows Media Audio
Windows Media Audio (WMA) is an audio data compression technology developed by Microsoft. It is a proprietary technology which forms part of the Windows Media framework. WMA consists of four distinct codecs. The original WMA codec, known simply as WMA, was conceived as a competitor to the popular MP3 and RealAudio codecs. Today it is one of the most popular codecs, together with MP3 and MPEG-4 AAC. In 2003 it came second after MP3 in terms of standalone players supporting it. WMA Pro, a newer and more advanced codec, supports multichannel and high resolution audio. A lossless codec, WMA Lossless, compresses audio data without loss of audio fidelity. And WMA Voice, targeted at voice content, applies compression using a range of low bit rates.
Musepack or MPC is an open source lossy audio codec, specifically optimized for transparent compression of stereo audio at bitrates of 160-180 kbit/s (manual settings allow bitrates up to 320 kbit/s). It was formerly known as MPEGplus, MPEG+ or MP+. Development of MPC was initiated in 1997 by Andree Buschmann and later taken over by Frank Klemm, and it is currently maintained by the Musepack Development Team (MDT) with assistance from Frank Klemm. Encoders and decoders are available
for Microsoft Windows, Linux and Mac OS X, along with plugins for several third-party media players available from the Musepack website, licensed under the LGPL or BSD licenses, and an extensive list of programs supporting the format.
Flash Video extensively uses the following codecs: MP3, the Nellymoser Asao codec, and ADPCM. Recent versions of Flash Player also support AAC (HE-AAC/AAC SBR, AAC Main Profile, and AAC-LC).
MP3, or more precisely MPEG-1 Audio Layer 3, was described earlier in this document. Its compression works by reducing the accuracy of certain parts of the sound that are deemed beyond the auditory resolution ability of most people. This method is commonly referred to as Perceptual Coding.
It provides a representation of sound within a short term time/frequency analysis window, by using psychoacoustic models to discard or reduce precision of components less audible to human hearing, and recording the remaining information in an efficient manner.
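A heavily simplified illustration of that idea is to discard spectral components that fall far enough below the loudest one. Real psychoacoustic models use frequency-dependent masking curves rather than the flat threshold assumed here:

```python
def perceptual_prune(spectrum, mask_db=30.0):
    """Toy illustration of perceptual coding: zero out spectral
    components more than `mask_db` decibels below the loudest one,
    on the assumption that they are inaudible. The flat, signal-wide
    threshold is a simplification for illustration only."""
    peak = max(abs(a) for a in spectrum)
    floor = peak * 10 ** (-mask_db / 20)  # convert dB to an amplitude ratio
    return [a if abs(a) >= floor else 0 for a in spectrum]
```

The zeroed components then cost almost nothing to encode, which is where the size reduction comes from.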
The Nellymoser Asao codec is a proprietary single-channel (mono) format optimized for low-bitrate transmission of audio, developed by Nellymoser Inc. Sound data is grouped into frames of 256 samples. Each frame is converted into the frequency domain and the most significant (highest-amplitude) frequencies are identified. A number
of frequency bands is selected for encoding; the rest are discarded. The bitstream for each frame then encodes which frequency bands are in use and what their amplitudes are. This codec does not take the actual sample rate into consideration, and it has a fixed ratio between the number of input samples and the output packet size (2 bits per input sample).
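The band-selection step described above can be sketched as follows. The selection rule and the returned layout are illustrative; they are not the actual Asao bitstream format:

```python
def encode_frame(freq_bins, keep=8):
    """Keep only the `keep` highest-amplitude frequency bins of one
    frame, recording which bins are in use and their amplitudes, in
    the spirit of the scheme described above."""
    ranked = sorted(range(len(freq_bins)),
                    key=lambda i: abs(freq_bins[i]),
                    reverse=True)[:keep]
    used = sorted(ranked)                      # bins in use, in order
    return used, [freq_bins[i] for i in used]  # bin list + amplitudes
```

Everything outside the selected bins is simply dropped, which is why the codec achieves low bitrates at the cost of fidelity.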
Finally, ADPCM is a digital representation of an analog signal in which the magnitude of the signal is sampled regularly at uniform intervals and quantized to a series of symbols in a digital (usually binary) code. The major benefit of ADPCM is that it encodes each section of the analog signal as the difference from the previous one, while varying the size of the quantization step, allowing further reductions of the required bandwidth for a given signal-to-noise ratio.
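The difference-plus-adaptive-step idea can be sketched as below. The 4-bit code range and the doubling/halving adaptation rule are simplifications of the lookup tables real ADPCM variants (such as IMA ADPCM) use:

```python
def _adapt(step, code):
    # Grow the step after large codes, shrink it after small ones.
    if abs(code) >= 6:
        return step * 2
    if abs(code) <= 1:
        return max(1, step // 2)
    return step

def adpcm_encode(samples, step=4):
    """Encode each sample as a quantized 4-bit difference from the
    predicted (previous decoded) sample, adapting the step size."""
    codes, pred = [], 0
    for s in samples:
        code = max(-8, min(7, round((s - pred) / step)))
        codes.append(code)
        pred += code * step        # track what the decoder will see
        step = _adapt(step, code)
    return codes

def adpcm_decode(codes, step=4):
    out, pred = [], 0
    for code in codes:
        pred += code * step
        out.append(pred)
        step = _adapt(step, code)  # mirror the encoder's adaptation
    return out
```

Because the encoder predicts from the decoder's reconstruction, slowly varying signals round-trip with little or no error while needing only 4 bits per sample.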
Metadata is used to facilitate the understanding, use and management of actual data. The metadata required for effective data management varies with the type of data and context of use.
In this case, the information that composes an FLV file is the primary data, and the metadata is information about that video. The length of the video (duration), the number of frames per second that the video displays (frame rate), and the number of kilobytes of data transferred per second when the video plays (video and audio data rates, where applicable) are all examples of video metadata. In many cases metadata is useful non-essential information that comes in handy when
searching for or querying a data asset. For example, think about a digital image database, and think of a keyword field in which you can add descriptive words about the image. Other examples of metadata, however, can border on indispensable. Reconsider the same digital image library, and think about the width and height fields in that database. Without that information, it would be much harder to display the image properly. The same types of metadata and, unfortunately, the same kinds of problems, apply to FLVs.
The most common metadata included in an FLV video are the following:
Name | Type | Description
Duration | Number | Length of the FLV in seconds.
Width | Number | Width of the video in pixels.
Height | Number | Height of the video in pixels.
Video Data Rate | Number | The video data rate.
Audio Data Rate | Number | The audio data rate.
Frame Rate | Number | The frame rate.
Creation Date | String | The date of creation.
Last Timestamp | Number | Timestamp of the last tag in the FLV file.
Last Key Frame Timestamp | Number | Timestamp of the last video tag which is a key frame.
File Size | Number | File size in bytes.
Video Size | Number | Total size of video tags in the file in bytes.
Audio Size | Number | Total size of audio tags in the file in bytes.
Data Size | Number | Total size of data tags in the file in bytes.
Meta Data Creator | String | The name of the metadata creator.
Meta Data Date | Date | Date and time the metadata was added.
Extra Data | String | Additional string data if specified.
Video Codec ID | Number | Video codec ID number used in the FLV (Sorenson H.263 = 2, On2 VP6 = 4 and 5).
Audio Codec ID | Number | Audio codec ID number used in the FLV (Uncompressed = 0, ADPCM = 1, MP3 = 2, NellyMoser = 5 and 6).
Audio Delay | Number | Audio delay in seconds. The Flash 8 Video Encoder delays the video for better sync with audio; cue points added with F8VE are shifted by this delay to match their intended position when encoding.
Can Seek To End | Boolean | True if the last video tag is a keyframe, so that seeking to the end is possible.
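The video and audio codec ID numbers listed above can also be read directly from an FLV file's first tags, since each tag's first payload byte carries the codec identifier. The following minimal reader is a sketch that assumes a well-formed file:

```python
def flv_codec_ids(path):
    """Return (video_codec_id, audio_codec_id) from the first video
    and audio tags of an FLV file. Tag layout: 9-byte file header,
    then 4-byte PreviousTagSize, then tags of the form
    type(1) + data size(3) + timestamp(4) + stream ID(3) + payload."""
    video_id = audio_id = None
    with open(path, "rb") as f:
        header = f.read(9)
        assert header[:3] == b"FLV", "not an FLV file"
        f.read(4)                             # PreviousTagSize0
        while video_id is None or audio_id is None:
            tag = f.read(11)                  # fixed-size tag header
            if len(tag) < 11:
                break                         # end of file
            tag_type = tag[0]                 # 8 = audio, 9 = video
            data_size = int.from_bytes(tag[1:4], "big")
            payload = f.read(data_size)
            f.read(4)                         # trailing PreviousTagSize
            if tag_type == 9 and video_id is None:
                video_id = payload[0] & 0x0F  # low nibble: video codec
            elif tag_type == 8 and audio_id is None:
                audio_id = payload[0] >> 4    # high nibble: sound format
    return video_id, audio_id
```

A file whose first video tag carries codec ID 2 and whose first audio tag carries sound format 2 would report Sorenson H.263 video with MP3 audio, matching the table.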
To store additional metadata, an external mechanism such as a database can be used that links meta-information to a unique string identifying the FLV file. This makes it possible to index the metadata further and make searching through web pages easier and faster.
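One way to realize this external mechanism is a small relational table keyed by the FLV's unique identifier. The schema and field names below are illustrative assumptions, not a standard:

```python
import sqlite3

def make_metadata_store(db_path=":memory:"):
    """Create a store linking each FLV's unique ID string to extra,
    searchable metadata that the FLV file itself does not carry."""
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS flv_meta (
                     flv_id   TEXT PRIMARY KEY,
                     title    TEXT,
                     keywords TEXT)""")
    return con

con = make_metadata_store()
con.execute("INSERT INTO flv_meta VALUES (?, ?, ?)",
            ("abc123", "Holiday clip", "beach,summer"))
rows = con.execute("SELECT flv_id FROM flv_meta WHERE keywords LIKE ?",
                   ("%summer%",)).fetchall()
```

A keyword query like the one above then resolves to FLV IDs without ever opening the video files themselves.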
8. Transmission Channels
Flash Video files can be delivered to the final destination in several different ways:
As a standalone .FLV file. Although Flash Video files are normally delivered using a Flash player for control, like the ones mentioned in the previous section, the .FLV file itself is fully functional on its own and can be played or converted to other formats from local storage such as a hard disk or a CD. However, this method requires the file to be fully downloaded to the user's computer before playback can start.
Embedded in a SWF file using the Flash authoring tool (supported in Flash Player 6 and later). Again, in this case the entire file must be transferred before playback can begin. As an additional drawback, if there is a need to change the video, the SWF file needs to be rebuilt to contain the new video file.
Via progressive download over HTTP (supported in Flash Player 7 and later). This method uses ActionScript to include an externally hosted Flash Video file client-side for playback. Progressive download has several advantages, including buffering, use of generic HTTP servers, and the ability to reuse a single SWF player for multiple Flash Video sources. Flash Player 8 includes support for random access within video files using the partial download functionality of HTTP; this is sometimes referred to as streaming. However, unlike streaming via RTMP, HTTP "streaming" does not support real-time broadcasting. Streaming via HTTP requires a custom player and the injection of specific Flash Video metadata containing the exact starting position in bytes and timecode of each key frame. Using this information, a custom Flash Video player can request any part of the Flash Video file starting at a specified key frame. For example, Google Video and YouTube support progressive downloading and can seek to any part of the video before buffering is complete. The server-side part of this "HTTP pseudo-streaming" method is fairly simple to implement, for example in PHP, as an Apache HTTPD module, or as a lighttpd module.
Streamed via RTMP to the Flash Player using the Flash Media Server (formerly called Flash Communication Server), VCS, ElectroServer, or the open source Red5 server.
The RTMP protocol has three variations:
o The "plain" protocol which works on top of TCP and uses port number 1935
o RTMPT which is encapsulated within HTTP requests to traverse firewalls
o RTMPS which works just like RTMPT, but over a secure HTTPS connection.
The raw TCP-based RTMP protocol maintains a single persistent connection and allows real-time communication. To guarantee smooth delivery of video and audio streams, while still maintaining the ability to transmit bigger chunks of information, the protocol splits video and data into 128-byte fragments (except for audio which uses 64-byte fragments). Fragments from different streams are then interleaved and multiplexed over a single connection. With longer data chunks, the protocol only carries a one-byte header per fragment, thus incurring very little overhead.
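The fragmenting and interleaving described above can be sketched as follows. The strict round-robin ordering is an assumption for illustration; a real RTMP server schedules fragments dynamically:

```python
def fragment(stream, size):
    # Split a byte stream into fixed-size fragments (the last may be short).
    return [stream[i:i + size] for i in range(0, len(stream), size)]

def interleave(video, audio):
    """Chunk video into 128-byte and audio into 64-byte fragments, then
    interleave them for transmission over a single connection."""
    v, a = fragment(video, 128), fragment(audio, 64)
    out = []
    for i in range(max(len(v), len(a))):
        if i < len(v):
            out.append(("video", v[i]))
        if i < len(a):
            out.append(("audio", a[i]))
    return out
```

Because audio fragments are half the size of video fragments, audio data is never held up for long behind a large video chunk, which is what keeps playback smooth.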
At a higher level, the RTMP protocol encapsulates MP3 and Flash Video multimedia streams, and can make remote procedure calls (RPCs) using the Action Message Format.
Other RPC services are handled asynchronously using a single client/server request/response model, so real-time communication is not necessary.
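The server side of the "HTTP pseudo-streaming" delivery method described earlier can be sketched in a few lines: when the player requests a start offset (a keyframe position it learned from the injected metadata), the server replies with the FLV file header followed by the file contents from that offset. The 13-byte header size and the offset-handling rule are assumptions of this sketch:

```python
FLV_HEADER_SIZE = 13  # 9-byte FLV header plus the 4-byte PreviousTagSize0

def pseudo_stream(path, start=0):
    """Return the bytes to send for a pseudo-streaming request: the
    FLV header, then the file from `start` (a keyframe byte offset),
    so the player can begin playback mid-file."""
    with open(path, "rb") as f:
        header = f.read(FLV_HEADER_SIZE)
        if start <= FLV_HEADER_SIZE:
            f.seek(0)
            return f.read()      # play from the beginning as-is
        f.seek(start)
        return header + f.read() # header + data from the keyframe
```

This is why the metadata must list exact keyframe byte positions: starting mid-tag, or on a non-keyframe, would hand the decoder an unplayable stream.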
Many corporate network firewal