Study On Untangling The World Wide Web Computer Science Essay



The web is one of the wonders of the modern world. It should not be confused with the internet. The internet is the infrastructure (all the cables and technology) that allows computers to communicate. The web is a collection of web pages carried over that infrastructure. A web page is a plain text file, so it can be written in a simple editor such as Notepad, which does not add any formatting of its own. A web browser, such as Internet Explorer by Microsoft, can render the file provided it is written in the correct code, in this case HTML and JavaScript. The internet revolutionised the way in which we go about our lives; one of the best examples is research. Before the internet became widely available to ordinary people, the library would have been the first port of call for research. Then search engines were developed, with which people could, and still can, search what is effectively a library of web pages.

A Brief History of the Web

The origins of the web can be traced back to the work of Sir Tim Berners-Lee, whose invention has since brought billions of people online. Berners-Lee was working for CERN (the European Organization for Nuclear Research). His job was to co-ordinate research and data from computers around the globe, but they were all incompatible with one another. This led to the creation of the web, where everything was compatible no matter where you were in the world. He set about creating a place where any piece of information could be linked to another piece of information. To do this he combined hypertext and the internet. The internet is a "global system of interconnected computer networks". Within this network is the World Wide Web, which is simply "a collection of internet sites that offer text, graphics, sound and animation resources through hypertext".

The Anatomy of a Web Page

Web pages are viewed using software called web browsers. There are many web browsers, such as Internet Explorer by Microsoft or Safari by Apple. A web browser simply reads the source code of a web page and works out, from the code, how it should be displayed; this is called rendering a web page. The source of a web page contains a number of tags; these are:

The hypertext would be displayed like this:

This shows how a web browser renders hypertext markup language (HTML). The text has been given a font and a specific size; to change these, more HTML would be needed, either in the document itself or in a cascading style sheet. Cascading style sheets (CSS) determine the way a web page is laid out: for example, the alignment of text or images, or the colours, would all be changed within the CSS.
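The first step a browser takes, reading the source and picking out the tags, can be sketched in a few lines of Python. This is only an illustration: the page source below is invented for the example, and a real browser does far more work to lay out and draw the page. The sketch uses the HTMLParser class from Python's standard library; the names TagLister and list_tags are made up here.

```python
from html.parser import HTMLParser

# Hypothetical page source, invented for this example.
PAGE_SOURCE = """<html>
<head><title>My first page</title></head>
<body>
<h1>Hello</h1>
<p>This is <b>bold</b> text.</p>
</body>
</html>"""


class TagLister(HTMLParser):
    """Records every opening tag and every non-empty piece of text."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Called once for each opening tag, e.g. <p> gives "p".
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text between tags; skip pure whitespace.
        if data.strip():
            self.text.append(data.strip())


def list_tags(source):
    """Return (opening tags, text pieces) found in the page source."""
    parser = TagLister()
    parser.feed(source)
    parser.close()
    return parser.tags, parser.text


tags, text = list_tags(PAGE_SOURCE)
print(tags)
print(text)
```

The tags tell the browser what each piece of text is (a heading, a paragraph, bold text), and the browser decides from that how to display it.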

(In this section explain, with examples, the structure of the language the source code is written in).

In the early days all the code to produce a web page was written in one document. But as the web expanded, web designers often wanted to redesign their web sites. This led to the development of style sheets. A style sheet, or cascading style sheet, as previously mentioned, determines the layout of the page. Because the layout was separate code, it could easily be debugged if anything went wrong. This was a breakthrough because people could change the appearance of a web page while being certain that none of the page's content would change.
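The idea that a redesign touches only the style sheet can be shown with a small Python sketch. Everything here is an assumption made for illustration: the page, the two sets of CSS rules, the file names and the redesign function are all invented. The web site is modelled as a dictionary of file names and their contents, and the page pulls in its style sheet with a link tag.

```python
# Hypothetical page: its content lives here, its layout lives in style.css.
PAGE = """<html>
<head><link rel="stylesheet" href="style.css"></head>
<body>
<h1>Welcome</h1>
<p>Our opening hours are 9 to 5.</p>
</body>
</html>"""

# Two hypothetical designs for the same page.
OLD_CSS = "h1 { color: black; text-align: left; }"
NEW_CSS = "h1 { color: navy; text-align: center; }"


def redesign(site, new_css):
    """Return a copy of the site with only the style sheet replaced."""
    updated = dict(site)
    updated["style.css"] = new_css
    return updated


site = {"index.html": PAGE, "style.css": OLD_CSS}
redesigned = redesign(site, NEW_CSS)

# The page content is identical before and after the redesign.
print(redesigned["index.html"] == site["index.html"])
# Only the style sheet has changed.
print(redesigned["style.css"] != site["style.css"])
```

Both checks print True: the appearance rules changed, while the document holding the content was never edited.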

(Explain, with examples of their source code, how style sheets work and are linked to web pages. Explain how they make the job of designing a website easier. You may want to give an example of a famous website and how its design has changed. Use to find a good example.)

Fairly quickly people wanted the web to do more than display information. They wanted to interact with it. Rather than just having static web pages, they wanted web pages to provide web services, which needed input from the viewers. Web pages developed so they could support a variety of scripting languages. An example of such a scripting language is JavaScript. This meant that navigation bars and forums could now be made; JavaScript allowed people to use web pages for much more than just reading and research. JavaScript is used heavily on websites such as Amazon, where reviews can be written and viewed by anyone visiting the web site. It also allowed search engines to evolve, because people could now input what they were searching for.

(Explain here, with examples, how a web page can run a script (program) written in JavaScript).

There are much more sophisticated ways that web servers can offer services today. We'll study these at A level. The growth of an interactive web is often known as Web 2.0.

Finding a Needle in a Haystack

The web is growing at an incredible rate. In fact, there are 5000 new pages being created every second. The web today is estimated to contain about 15 billion web pages.

Finding things on the web soon became a lot harder, so new services called search engines were developed. The first search engine was called Archie and was designed by Alan Emtage in 1990; its name is a shortened version of archive. The name archive was chosen because of the way search engines work.

(give a brief history of search engines here)

How Search Engines Work

Search engines use spiders to crawl the web. A spider, or crawler, is a piece of software that looks at web pages and archives keywords, although to the program these are just bit patterns. A spider first narrows down what it records by ignoring certain very common words, known as stop words, such as 'a', 'the', 'it', 'if' and 'and'. This narrows the search, but not enough. The next thing a spider does is stem words: word endings such as ed or ing are removed, so if 'spider' was found on a web page it would become 'spid'. Spid doesn't make sense to a human who speaks English, but a crawler is only looking for bit patterns that aren't stop words. Stemming can also produce patterns that are very similar to another word: stripping the ending from swimming gives swimm, which is not the same bit pattern as swim, so the two are not confused. When a keyword is found it is archived to a database, which is where Archie got its name. When a word is searched for, the web pages it was found on are simply listed as links.
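The steps described above (ignore stop words, stem what is left, archive each keyword against the page it came from) can be sketched in Python. This is a minimal illustration under stated assumptions, not how any real search engine is built: the three pages and their addresses are invented, the stop word list adds 'is' and 'in' to the words named above, and the stemmer simply chops off 'ing', 'ed' or 'er', which is much cruder than the stemmers used in practice.

```python
import re
from collections import defaultdict

# Stop words named in the text, plus "is" and "in" (an assumption).
STOP_WORDS = {"a", "the", "it", "if", "and", "is", "in"}

# Endings removed by this deliberately crude stemmer (an assumption).
SUFFIXES = ("ing", "ed", "er")

# Hypothetical pages, invented for this example.
PAGES = {
    "example.org/spiders": "The spider crawled the web and indexed it",
    "example.org/pool": "Swimming in the pool is fun",
    "example.org/crawl": "A crawler is a spider",
}


def stem(word):
    """Strip the first matching ending, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keywords(text):
    """Lower-case the text, drop stop words and stem the remaining words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(word) for word in words if word not in STOP_WORDS]


def build_index(pages):
    """Archive every keyword against the set of pages it was found on."""
    index = defaultdict(set)
    for url, text in pages.items():
        for keyword in keywords(text):
            index[keyword].add(url)
    return index


def search(index, query):
    """Return the pages that contain every keyword in the query."""
    terms = keywords(query)
    if not terms:
        return []
    found = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(found)


index = build_index(PAGES)
print(stem("spider"))                # spid
print(search(index, "spider"))       # both pages that mention a spider
print(search(index, "swimming"))     # the pool page
print(search(index, "swim"))         # no match: "swim" is not "swimm"
```

The program never understands the words. It only checks whether one pattern of characters is identical to another, which is why a search for swim finds nothing here even though a person would see the connection.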

(In this section explain in your own words how a search engine works. Use the PowerPoint as a starting point, but do some of your own research - a good starting point might be HowStuffWorks. Remember, the key point to get over is that computers don't understand the pages they display. Nor can they read. A computer only works with patterns of bits - so try to explain how it can recognise words)

How A Search Engine Ranks the Pages It Finds

With so many pages on the web, a search will often return far too many 'hits' for you to look through. Search engines therefore have to rank the pages in order of importance. Website owners try to improve their position through search engine optimisation (SEO), which comes in two kinds: white hat SEO, which works within the search engines' rules, and black hat SEO, which tries to trick them. Search engines such as Google start by finding keywords; if racing was searched for you would find almost 400,000,000 hits. These are ranked in a certain order. They are ranked first by the number of times the keyword appears in the web page. Search engines also use link analysis to see how many other pages link to a certain page; the more pages link to it, the more relevant it is assumed to be. Search engines like Google also weigh the importance of those links: if a highly ranked page links to your page, your page will be boosted to a higher rank.
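The two ranking ideas above, keyword counts and link analysis, can be combined in a short Python sketch. The link scoring is a simplified version of the PageRank idea, in which a page passes a share of its own score to each page it links to, so a link from a highly ranked page is worth more. It is an illustration only and not the method any real search engine uses today. The four pages, their addresses, the damping value of 0.85 and the number of rounds are assumptions, and the sketch assumes every page links to at least one other page in the collection.

```python
# Hypothetical pages and the links between them, invented for this example.
PAGES = {
    "news.example": "racing news and racing results",
    "tips.example": "horse racing tips",
    "calendar.example": "racing calendar",
    "club.example": "racing club",
}
LINKS = {
    "news.example": ["calendar.example"],
    "tips.example": ["calendar.example"],
    "calendar.example": ["club.example"],
    "club.example": ["calendar.example"],
}


def keyword_count(text, keyword):
    """Count how many times the keyword appears as a whole word."""
    return text.lower().split().count(keyword.lower())


def link_scores(links, damping=0.85, rounds=50):
    """Score pages by incoming links, weighted by the linking page's score.

    Assumes every page has at least one outgoing link and that every
    link target is itself a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(rounds):
        # Every page starts each round with a small base score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            # A page shares its current score equally among its links.
            share = damping * scores[page] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores


def rank(pages, links, keyword):
    """Order matching pages by keyword count, then by link score."""
    scores = link_scores(links)
    hits = [url for url, text in pages.items() if keyword_count(text, keyword) > 0]
    return sorted(
        hits,
        key=lambda url: (keyword_count(pages[url], keyword), scores[url]),
        reverse=True,
    )


for url in rank(PAGES, LINKS, "racing"):
    print(url)
```

In this made-up collection news.example comes first because it mentions racing twice. Among the pages that mention it once, calendar.example leads because three pages link to it, and club.example beats tips.example because its single incoming link comes from the highly scored calendar page.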

(In this section explain in your own words how a page ranking system might work. Use the PowerPoint as a starting point, but watch Chris Bishop again, and do some extra work as well. See if you can find a good diagram that might help.)

A Bit More About Web Services

Web services have grown enormously over the last ten years. Most important has been the growth of e-commerce. This has only been possible because people can pay for things online. The internet is a public network: everyone can see it, send messages across it and, if they are clever enough, intercept other people's messages. So how can people pay for things without their credit card details being stolen?

(in this section explain the principles behind secure transactions. Use Chris Bishop again as a starting point - his example of how a message is encoded, and how secret keys can be exchanged.)

5000 pages a second, 15 billion pages on the web

Page rank is how important a page is. Pages are ranked by the links they receive from other web pages (anchor text). HTTPS means the connection is secure.