In short, we build a Java application that starts with a main thread called the web daemon. This daemon creates a special listening-only socket type, called ServerSocket in Java, and listens on this main socket in an endless loop. The web daemon never dies - it is created when the application begins.
For each request arriving on the main socket, the web daemon creates a new thread to handle it. This newly created thread is called a proxy, and it handles both the connection between the application and the client and the connection between the application and the server. When it finishes serving the request, the proxy thread terminates. Thus we have one proxy thread per client request - which makes us multithreaded!
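The accept-and-dispatch loop described above can be sketched roughly as follows. This is an illustrative sketch, not our actual source: the Handler interface stands in for the proxy thread class.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch of the web daemon; names are illustrative.
public class WebDaemon extends Thread {
    public interface Handler { void handle(Socket client); }

    private final ServerSocket mainSocket; // listening-only socket
    private final Handler handler;

    public WebDaemon(int port, Handler handler) throws IOException {
        this.mainSocket = new ServerSocket(port);
        this.handler = handler;
    }

    public int getLocalPort() { return mainSocket.getLocalPort(); }

    @Override
    public void run() {
        while (true) { // the daemon never dies
            try {
                final Socket client = mainSocket.accept();        // blocks for a request
                new Thread(() -> handler.handle(client)).start(); // one proxy thread per request
            } catch (IOException e) {
                break; // socket closed: stop listening
            }
        }
    }
}
```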
The proxy thread handles the client request. First it parses the request to retrieve the server name and other details. Then it checks whether the requested object is cached, by calling a method on the cache manager object (the cache manager object is explained later). If the requested object is cached, the proxy receives it from the cache manager and transfers the bits to the client.
If the requested object is not cached, the proxy thread creates a socket and forwards the request to the server. When the reply arrives, it forwards it both to the client and to the cache manager (to cache it, of course...). Then it terminates.
We have a static object to manage all the caching activity: the cache manager. It hides all the details of the caching implementation from the other components of the application, and exposes a simple interface to get and set cached information. The main principles of the caching are as follows:
At startup, nothing is cached - the cache is cleaned.
When a reply is received from a web server, the information is stored on persistent storage (a disk file), and the name of that file is generated from the URL. If there is not enough space for caching, the cache-purge algorithm is called, which frees enough space for the desired object to be cached. This algorithm is explained later.
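One plausible way to generate a file name from a URL is hashing; the report does not fix a particular scheme, so the following is only an assumption of ours:

```java
// Hypothetical file-name generation; the real scheme is not specified.
public class CacheFileName {
    public static String fromUrl(String url) {
        // hashCode keeps the name short and file-system safe;
        // a real implementation would also handle collisions.
        return "cache_" + Integer.toHexString(url.hashCode()) + ".dat";
    }
}
```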
When a proxy thread asks whether a URL is cached, the cache manager checks for a matching file. If there is none, it returns 'not cached' to the proxy thread. If the file exists, the cache manager checks whether the information stored in it is up to date (according to the timeout definitions). If it is not, it deletes the file from storage and returns 'not cached' to the proxy thread; otherwise it returns 'cached'.
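The lookup logic just described can be sketched like this (the class name and the millisecond-based timeout are assumptions, not the original code):

```java
import java.io.File;

// Illustrative sketch of the cache lookup: a file must exist AND be
// fresh; a stale file is deleted and reported as a miss.
public class CacheLookup {
    private final long maxAgeMillis;

    public CacheLookup(long maxAgeMillis) { this.maxAgeMillis = maxAgeMillis; }

    public boolean isCached(File cacheFile) {
        if (!cacheFile.exists()) return false;        // no matching file
        long age = System.currentTimeMillis() - cacheFile.lastModified();
        if (age > maxAgeMillis) {
            cacheFile.delete();                       // out of date: purge and miss
            return false;
        }
        return true;                                  // cached and up to date
    }
}
```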
Not all replies from web servers are stored in the cache - for instance, when the client requests dynamic (generated) objects, such as database query results.
The algorithm to purge the cache is relatively simple: get rid of big files and infrequently used files. For this purpose, a data structure holds information about the cached objects, such as the hit rate over the last X hours (to measure usage frequency) and the size on disk. A special attribute holds an overall grade (or weight) for each file; this weight is calculated from how frequently the file was used and from its size (the ratio between these two factors is configurable). The "heaviest" file is then deleted, repeatedly, until there is enough space.
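A minimal sketch of such a weight calculation, assuming a configurable ratio between size and usage frequency (the exact formula here is ours, not the original):

```java
// Hypothetical weight formula for the purge algorithm.
public class PurgeWeight {
    // Higher weight = better candidate for deletion: big files and
    // infrequently used files weigh more. 'ratio' (0..1) balances the two.
    public static double weight(long sizeBytes, int recentHits, double ratio) {
        double sizeScore = sizeBytes / 1024.0;      // size in kilobytes
        double idleScore = 1.0 / (1 + recentHits);  // fewer hits -> closer to 1
        return ratio * sizeScore + (1 - ratio) * idleScore;
    }
}
```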
Periodically, the cached files are checked to make sure they are up to date. The check criterion is simple: if a cached file is older than a certain (configurable) number of hours, it is deleted from the cache. A special thread called the cache purge thread performs this task - it is created at startup and wakes up every few hours to do the check.
One more thread to go: the so-called admin thread. It is created at startup and, like the web daemon, it creates a ServerSocket and listens on a special (admin) port. Its task is to receive requests from the application administrator and respond to them. When a request arrives on this port, this thread generates an HTML page and sends it back to the sender. The HTML page contains an applet that allows the administrator to configure the application parameters. The administrator has to enter a password and send it to the admin thread on the same port; if the password is verified, the administrator gets a new dialog in which he can view and change the application parameters. The statistical information is taken from the data structure mentioned above.
Proxy - implements the proxy thread and the web daemon.
Cache - implements the cache manager and the cache purge thread.
Admin - implements the admin thread.
Parse - implements various parsing manipulations on strings.
Processes and Threads
The application is implemented as a single multithreaded process.
The threads are as follows:
Web daemon - listens on the main socket and dispatches the proxy threads. It is always alive.
Proxy thread - for each client request a proxy thread is created to handle the request. The proxy threads share resources, such as the data structure which holds statistical and cache information; these shared resources are protected by thread synchronization mechanisms.
Cache purge thread - wakes up every X hours and cleans outdated files from the cache. It is always alive, but most of the time not active.
Admin thread - listens on the admin socket (on a special admin port) and handles requests from the proxy administrator. For such a request, it generates an HTML page and sends it to the administrator, then gets the page back with new parameters for the application. It also verifies a password and replies to the administrator with a special statistics HTML page.
The design and implementation are fully object oriented, as supported by the Java language.
The main objects are:
Web daemon - static object. Inherits from the Java Thread object. Holds the main ServerSocket as a member variable. Creates a proxy object for each client request.
Proxy - inherits from the Java Thread object. Holds as member variables the socket to the client and the socket to the server (if necessary). Calls methods of the cache manager object.
Cache - static. Manages all the cache activities. Exposes interface methods to be called by the proxy object and other objects. Holds a data structure as a member variable, to store statistical and cache information. Holds a set of fields that influence the cache behavior, and exposes methods to set/get these fields (get - called by the cache purge thread and the admin thread; set - called by the admin thread).
Purge - static. Inherits from the Java Thread object.
Admin - static. Inherits from the Java Thread object. Contains methods for parsing and generating HTML pages.
The application serves as a web proxy server handling only HTTP requests.
TCP/IP communication is handled via Java's built-in socket classes.
A Socket is a Java object that handles network traffic to and from the local machine.
A ServerSocket is a socket that only listens for requests, while a Socket is a full duplex type.
Web daemon object:
Main method - this method is the first to be executed and takes care of all initializations.
Dispatch method - listens on main socket and creates proxies to handle client requests.
Run method - this method is called to activate the thread. It is the main method of the object, in the sense that all the work of this object is done by it.
GetObject - receives as arguments the client URL and a reference to a stream object. Opens the file in which the cached information is stored and transfers the bits as a stream to the stream object passed as an argument. Returns true iff it succeeded.
IsCached - receives as an argument a URL and checks whether it is stored in the cache. Returns a Boolean answer.
IsUpToDate - receives as an argument the filename that stores the cached information and checks whether the data is up to date (not older than a certain number of hours). Returns true iff the information is up to date.
IsCachable - receives as arguments a client URL and an HTTP reply header (a string from the web server) and decides whether the object should be cached. Returns true iff it should.
PutObject - receives as arguments a client URL and a reference to a stream object. Stores the bits from the stream in a file whose name is generated from the URL.
Run - does the work of purging the cache and then goes to sleep for a while...
Run - listens on the admin socket and does the work of communicating with the administrator via HTML pages. Calls get/set methods of the cache object (get - to reply to the administrator with a statistics HTML page; set - to alter the behavior of the application).
1. The purge cache thread - is it really necessary?
Reasons for 'yes': this thread keeps the cache clean, so (ideally) we always have enough space to cache new objects and do not waste time cleaning the cache while we are in the middle of caching an object. It improves the caching time of a new object, and thus the concurrency of the application (since each caching operation locks the shared data structure for thread synchronization reasons).
Reasons for 'no': out-of-date cached objects harm no one. We should not perform an expensive cleaning operation (especially since it locks the shared data structure for a relatively long time), because the next time a client asks for an out-of-date object, the cache manager will delete it from disk anyway. (And when there is no more space left to cache new objects, old ones will be deleted by the purge algorithm.)
2. What are the user-configurable parameters? Which parameters should be kept constant and which should not? We think that, in general, parameters that concern the environment should be configurable (cache disk size, etc.), and parameters that influence the purge algorithm should also be configurable.
We designed and implemented a caching web proxy server which can be configured remotely over the Internet by its administrator. The proxy handles only HTTP communications. The source is written in Java, which makes the code compiler independent and the application cross-platform. The Java technology enables the main proxy application to send Java applets to a remote administrator, sitting on a different machine on a remote network and accessed over the Internet by standard TCP/IP communication.
The proxy comprises two applications. The main application starts up and serves as a proxy server which listens for client requests on a specific port, forwards the requests to a web server or to another web proxy (a father proxy), and then sends the replies back to the clients. This application will be referred to as the proxy application.
The proxy application also does caching. When a client requests an object, the proxy checks whether the object is cached. If so, it does not forward the request, and replies to the client with the cached object (this is called a cache hit). If the requested object is not cached (a cache miss), the request is forwarded, and when the reply arrives at the proxy, it both sends it back to the client and caches it on its machine for future use. Thus, if a client requests an object which has already been requested (by the same client or by another), the proxy retrieves the object from its local cache and sends it back, without searching for the object out on the Internet. This feature gives a performance boost to the proxy itself and to the client (and to all other clients who request these cached objects in the future). Objects are cached on the local machine using the host file system, so the cache size depends on the local system's free hard disk space.
The ability to forward requests to another web proxy or to a web server, along with the caching behavior, enables one to use a hierarchy of caching proxy servers. Modern client browsers cache on the client's local machine, so caching is really done in levels (a hierarchy). The first level is in the RAM on the client machine (usually the browser caches a few web pages in memory); the size of this level depends on the memory resources of the browser machine. The second level, also handled by the browser, keeps objects in persistent storage, using the local file system on the browser machine. The third level is the caching done by the proxy, on its own machine. Then, if the proxy can forward requests to another proxy, a chain or hierarchy of proxies is created, each of them caching on its own machine and thereby adding a new caching level. If the requested object cannot be found anywhere in the cache chain, the request is finally forwarded to the web server, which delivers the bits back through the chain of proxies, enabling each of them to cache the reply for future use.
Why is it good?
It seems like a lot of disk space is being spent that way, so what's the point?
The answer is that disk space can be cheaper than the time spent retrieving the requested objects from the Internet. Access to the Internet is very time consuming. In any case, the proxy administrator can choose whether to enable this feature, so it is up to him to decide - the technology is there for him. This fact is especially important when considering Intranets. Today, organizations manage their Internet access by forming a local network (an Intranet) which has a gateway to the Internet. A proxy server usually runs on this gateway machine. For big organizations, a local flat-hierarchy Intranet is not a good enough solution. Instead, the Intranet is built as a hierarchy of sub-Intranets; each of these sub-nets is gatewayed to the main Intranet through a proxy, and the entire Intranet is gatewayed through a proxy to the Internet. For example, let's say corporation Turtle Inc. has a couple of branches. The branch in Italy has an Intranet, and so do the branches in France and New York. The Turtle corporation has a hierarchy of Intranets and proxies as follows:
The Italy branch is gatewayed to the Europe Intranet via a proxy server, as is the France branch. The Europe Intranet is gatewayed to the Corp Intranet (in the U.S.) through a proxy server, and the New York branch is gatewayed to the Corp Intranet through its own proxy server. Now, when a client browser in Italy searches for an object that was requested earlier by a French client, it gets the reply from the cache in the Europe Intranet proxy. When a client browser in New York searches for an object that was earlier requested by the French client, it gets the reply from the cache in the Corp Intranet proxy. And when a client in France searches for an object previously requested by him or by another French client, it gets the reply from the cache in the France Intranet proxy. Only the first request goes out to the Internet; all future requests for the same object are satisfied by the proxy hierarchy cache. Designing an organization's local net in this structure leads to a performance boost all over the enterprise, and that's what makes these features of caching and chaining so cool!
The second application is a Java applet, referred to as the applet. It enables the proxy administrator to remotely manage the proxy from his own machine, via his web browser. Whenever he wants to, the administrator can access the desired proxy server by typing the proxy machine's IP address (or machine name) in the browser's URL address window, plus the suffix '/admin' (for instance, 'techst02/admin'). The browser considers this a valid client request, and forwards it on to the proxy (or chain of proxies). When the proxy sees this request, it compares it at run time to the IP address (or machine name) of the host machine on which it runs. If they do not match, the proxy forwards it as a normal request to the father proxy (or web server). But if they do match, the proxy assumes that an administrator is trying to attach. It responds by sending back a Java applet that handles all the remote configuration and the necessary security issues (such as password login), and of course does not forward that request.
Back on the administrator's machine, the browser gets the Java applet and starts it. The applet begins by requesting a password from the administrator and sending it to the proxy application. The applet and the proxy now talk full duplex. If the password is correct, the proxy sends an Ack to the applet, and the applet responds by presenting all the parameters and status of the proxy, enabling the administrator to alter parameters (thus changing the proxy's behavior in terms of traffic management and cache activities) and to send the new parameters to the proxy. All of the operations on the administrator's machine take place through the browser, by the applet, thus having full Graphical User Interface (GUI) support which is not restricted to HTML web forms, as seen on search engines for example. It is the power of Java applet technology that gives the administrator the ability to control the proxy remotely, plus the friendly graphical environment to do so - thus, if we or anyone else in the future decide to enhance the set of configurable parameters or the administrator's user interface, the infrastructure is there to be used and enhanced. We're talking applets here, not just a dull HTML page form.
A note on security: the applet only presents a GUI to the administrator and handles communications with the proxy application. When the administrator enters the login password, the applet sends it over the Internet to the proxy. The proxy checks the validity of the password and sends back to the applet an Ack/Nack, based on which the applet logs the administrator in or not. So attackers can learn nothing about the password from viewing the applet's operations.
Accessing multiple proxies remotely
We talked about a structure of proxy hierarchy, and how the proxy and the cache behavior support it. The remote access feature also supports this design.
The administrator can access a particular proxy by specifying its IP address or machine name in the browser's URL window. The requested proxy will be the only one to respond to the administrator by sending the applet. The applet and the requested proxy will talk full duplex over the Internet, across all other chained proxies in the hierarchy. When the administrator alters the proxy parameters, only the requested proxy is affected - all the other proxies remain unchanged. This is important because it enables one administrator to fully control the behavior of each and every proxy in the structure. It also enables a group of administrators to control their own proxies (the US administrator controls the proxies in the US, and independently the France administrator controls the proxies in France. Now, is that hot or what?).
Still, there is a potential design problem here. Say two or more administrators want to control a certain proxy. They both access the proxy at the same time, and neither knows about the other. Now, if one of them wants the proxy to behave in a certain manner and the other administrator wants the opposite, it will lead - in the best implementation case - to inconsistent results from one administrator's point of view ("what the hell is going on here, I instructed the proxy to do something and it does the opposite!"). This is a design problem, and we solved it by deciding that remote control of a certain proxy is limited to one human administrator at a time. When an administrator controls the proxy, no other administrator can control the same proxy (this is very similar to protecting shared resources from multiple threads in a multithreaded environment). This solution ensures that such a scenario cannot happen.
All the client requests, plus the administrator's remote configuration, are handled in parallel using threads. A certain proxy can handle multiple client requests at the same time, while also being remotely controlled by an administrator. For instance, in a certain time frame, the proxy can handle a request from client A, a request from client B, two requests from client C (this can happen because modern browsers use threads too...) and communications with a remote applet running on the administrator's machine. None of these users (clients and administrator) would notice the difference.
Using threads leads to special problems, concerning shared resources and synchronization issues, and we will discuss that later (see 'Overview' ).
The development environment
The development was done on the Windows 95 and Windows NT platforms. Developing in Java means that the code is compiler and platform independent, because the Java language encapsulates the details of the host operating system and presents a uniform standard interface to the developer. Code written in Java should compile with any Java compiler, on any platform. Plus, the Java byte code (the binary result of the compilation) should run on any machine supporting the Java Virtual Machine. That's a plus when developing Internet-related applications.
The main proxy application was developed using Microsoft Visual J++ 1.0. We found it convenient to write the code with this tool, because it is a code-based development tool (as opposed to other tools which give the developer a more 'visual'-centric view), and the application is basically an engine running without GUI support and without direct human interaction (unless it is done remotely through a browser).
The applet was developed with Symantec Visual Cafe. Much of its code is related to GUI, and we found the Visual Cafe environment to be very friendly and powerful when it comes to that (this is a visual tool).
So how does it work?
When the proxy application starts, one thread always listens on the main socket and dispatches other threads to do the job of handling each client request. This dispatcher thread is referred to as the web daemon, while the connection-handling threads are simply called proxies. A proxy thread is also in charge of catching the administrator's special request and sending the applet back to him. A proxy thread is in charge of handling one request - handling means that it must reply to the client through a socket. The proxy thread checks whether the requested object is cached; it does so by invoking a method (service) on the cache manager. If the object is cached, the cache manager returns it and the proxy sends it to the client. If the object is not cached, the proxy forwards the request to the father proxy or web server and waits for the reply to come (by listening on a socket). When the reply arrives, the proxy thread delivers it to the client and caches it. Again, caching is done by calling methods of the cache manager. When this work is done, the proxy thread terminates itself.
While a proxy thread is running, the web daemon still listens on the main socket for requests, and for each one creates another proxy thread. That enables many requests to be handled simultaneously. The web daemon is created at startup and never dies. When starting, it performs general initializations (such as creating the cache manager object and cleaning junk from the cache directory), then enters an endless loop of listening on the main socket and creating a proxy thread for each request.
To help the proxy threads use the HTTP protocol, we use two classes: HttpRequestHdr and HttpReplyHdr. Sending or receiving an HTTP message body is simple: we read/write the bits through the socket. But HTTP requires special header fields to be sent with a message. HttpRequestHdr does the work of creating these fields upon sending, and HttpReplyHdr does the work of receiving them upon arrival. For example, if the proxy fails to forward a request because the web server could not be reached, it generates an HTML web page with a proper message and constructs an HTTP message to send back to the client, informing him of the error and supplying the correct headers (such as the return status code header).
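For illustration, parsing an HTTP request line might look like this (the real HttpRequestHdr class is more elaborate and is not reproduced in this report):

```java
// Illustrative parsing of an HTTP/1.0 request line.
public class RequestLine {
    public final String method;
    public final String url;
    public final String version;

    public RequestLine(String line) {
        // e.g. "GET http://host/path HTTP/1.0"
        String[] parts = line.split(" ");
        method = parts[0];
        url = parts[1];
        version = parts[2];
    }
}
```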
The cache manager is a static object. It hides all the details of caching from the other components of the application. For example, the proxy thread is not aware of the file name in which the requested object is cached - it just passes a URL argument to the cache manager, which in turn generates a file name from that URL. Thus, if in the future we want to change the file-name generation mechanism, the only code we need to change is the cache manager's - the proxy thread's code is not affected. This kind of encapsulation is supported by object-oriented development environments such as Java.
Objects are cached using the file system of the host machine on which the proxy runs. When the cache manager is called to cache an object, it first generates a file name from the object's URL. The stream of bits coming from the father proxy or web server is written to a file. In addition, the cache manager holds a hash table data structure. Each entry in the hash table has two fields: a key and a value. The key is the file name, and the value is a date. When a new object is cached, it is stored on disk and a new entry is created in the hash table, into which the cache manager inserts the file name and the creation date (year, month, day, hour, minute, etc.). Later, when the cache manager is asked whether an object is cached, it first generates a file name from the object's URL, then enumerates the hash table looking for that file name. If the file name is found, the cache manager returns it to the proxy thread; otherwise, it returns a status indicating 'not cached'.
If a requested object is found in the cache (a cache hit), its entry in the hash table is updated with the current date. This enables a Least Recently Used (LRU) algorithm to take over when the cache is full and a file needs to be deleted. So if the free space of the cache drops below a minimum level, the LRU algorithm enumerates the hash table and deletes the least recently used file. If we would like to change this policy in the future (to Least Frequently Used, for instance), we would change the value field of the hash table to some other class and change the code in the methods in charge of freeing space - all other components are unaware of this mechanism.
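The hash-table index and the LRU selection can be sketched as follows; the class and method names here are our illustrative assumptions:

```java
import java.util.Hashtable;
import java.util.Map;

// Sketch of the hash-table index with LRU victim selection.
public class CacheIndex {
    // Hashtable synchronizes its operations internally, which is why
    // multiple proxy threads can call it safely.
    private final Hashtable<String, Long> lastUsed = new Hashtable<>();

    // Called when an object is cached, and again on every cache hit,
    // so the stored date always reflects the most recent use.
    public void record(String fileName, long usedAt) {
        lastUsed.put(fileName, usedAt);
    }

    // Enumerate the table and pick the least recently used file
    // as the victim when the cache needs more free space.
    public String leastRecentlyUsed() {
        String victim = null;
        long oldest = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : lastUsed.entrySet()) {
            if (e.getValue() < oldest) {
                oldest = e.getValue();
                victim = e.getKey();
            }
        }
        return victim;
    }
}
```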
The cache manager methods are called from multiple proxy threads. This raises problems of synchronization and shared-resource protection. We solve these problems by putting synchronization locks on all shared resources, using Java's synchronized mechanism. Methods involving the hash table are thread-safe because the hash table object internally synchronizes all actions performed on it (this is provided by Java's Hashtable methods). Files are protected by the Java File object and by our own code. For example, there is a chance that one thread will read from a file while a second thread tries to delete it (such a scenario could happen if the first thread is reading a cached object A, and the second thread is caching object B, making the cache grow and causing the cache manager to free more space; the algorithm may detect that object A is the LRU and try to delete it). This would cause the proxy to break in the worst case, or to fail in either the read or the delete operation in the best case. We must not allow it, so we added code to check whether write and delete operations are allowed before performing them.
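The 'check before delete' guard can be sketched with a reader count per file; this bookkeeping is an assumption of ours, not the original code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a guard that prevents deleting a file while it is being read.
public class SafeCacheFiles {
    private final Map<String, Integer> readers = new HashMap<>();

    // A proxy thread marks the file busy before reading it...
    public synchronized void beginRead(String fileName) {
        readers.merge(fileName, 1, Integer::sum);
    }

    // ...and releases it when done.
    public synchronized void endRead(String fileName) {
        Integer count = readers.get(fileName);
        if (count != null) {
            if (count <= 1) readers.remove(fileName);
            else readers.put(fileName, count - 1);
        }
    }

    // The purge code asks this instead of deleting blindly.
    public synchronized boolean mayDelete(String fileName) {
        return !readers.containsKey(fileName);
    }
}
```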
Cacheable Vs. Non-Cacheable objects
Not all replies should be cached. For example, when a client sends a request to a search engine, the engine typically treats the request as a query and generates an HTML web page with the results. This result page should not be cached on the proxy, because the next time an identical query is sent to the engine, different results are likely to be returned (because of frequent changes in the engine's database). These generated pages are called dynamic pages, while regular web pages that sit somewhere on a web server waiting to be retrieved are called static pages.
The problem is identifying that a certain web page is dynamic. We do not know a 100 percent solution to this problem, so we can only follow conventions. For example, dynamic pages can be generated by CGI scripts, and there is no way for the proxy to know that a page is the result of a CGI script unless it gets some help from the HTTP reply headers or the URL of the page. It is a convention that CGI-generated pages are served from URLs containing the substring "cgi-bin", so we added code to check the URL of each request; if it contains this substring, we do not cache the reply.
We also check for URLs containing special characters indicating that the URL is a query. For example, the question mark "?" is a typical character used to submit queries. The proxy also gets help from the reply's return code: if it contains a return code other than OK, for example, we do not cache the reply.
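These heuristics can be collected into one method; treating any status other than 200 OK as non-cacheable is a simplification of ours:

```java
// Sketch of the caching heuristics: skip CGI paths, query URLs,
// and non-OK replies.
public class CachePolicy {
    public static boolean isCachable(String url, int statusCode) {
        if (url.contains("cgi-bin")) return false; // CGI-generated page
        if (url.contains("?")) return false;       // query URL -> dynamic result
        return statusCode == 200;                  // cache only OK replies
    }
}
```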
Note that this problem is not as serious for the browser cache mechanism as it is for the proxy cache manager. Modern browsers manage their own caches, and theoretically they face the same problem of what not to cache. But even if the browser does cache a non-cacheable object, the user can always instruct it to "refresh" or "reload" the object from the Internet - in that case, the browser re-sends the request and waits for the reply. Proxies, on the other hand, behave very differently, mainly because the user should not be aware of them. So, if a proxy caches a non-cacheable object and the client asks to refresh (reload) on his machine, the browser re-sends the request to the proxy, and the proxy treats it as a cache hit (because it has the requested object in its cache), does not forward the request, and replies with the cached bits. So it is extremely important for proxies to try to identify which objects should not be cached.
The admin thread
One of the initialization operations done at startup by the web daemon is constructing the admin thread. This thread handles communication with a remote administrator, and sets/gets parameters from other components (such as the cache manager). It first creates a socket (admin socket) and listens on a special (admin) port.
When the administrator accesses the proxy, the proxy thread catches the event and sends back a web page with the admin applet. The applet starts by presenting a login dialog box and waiting for the administrator to enter a password; it then sends the password to the proxy application on the admin port, so the admin thread in the main application can catch that request without interfering with the handling of client requests. The admin thread in the main application processes the password and sends back to the applet, on the admin port, an answer (Ack/Nack). Again, having the applet and the main application "talking" full duplex does not affect the activities going on in the main application (handling client requests).
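A minimal sketch of the password check and Ack/Nack reply; the wire strings "ACK" and "NACK" are assumptions - the report only says the reply is Ack/Nack:

```java
// Hypothetical password verification on the admin thread's side.
public class AdminAuth {
    private final String password;

    public AdminAuth(String password) { this.password = password; }

    // The admin thread processes the received password and replies.
    public String respond(String attempt) {
        return password.equals(attempt) ? "ACK" : "NACK";
    }
}
```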
At this phase, if another administrator accesses the main application, the proxy thread will catch that and send an admin applet to him too. The applet will run on his machine and will try to talk to the main application. However, the admin thread in the main application talks only with the first administrator and ignores the second. As soon as the first administrator finishes, the admin thread will serve the second one. This protects the main application from multiple administration accesses, preventing the scenario of two or more remote administrators trying to control the same proxy, which would cause inconsistent behavior of the proxy or unfriendly behavior in the applets on their machines (see above for a full description of the potential problem). The second administrator does get the applet (otherwise he might think the proxy is unreachable), but sending admin requests to the proxy is blocked until the first administrator finishes. We thought this was the best design approach to the problem, and implemented the solution that way.
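The one-administrator-at-a-time rule amounts to a simple mutual exclusion lock, sketched here with illustrative names:

```java
// Minimal single-administrator lock mirroring the mutual exclusion above.
public class AdminLock {
    private boolean busy = false;

    // The first administrator acquires the lock; later ones are
    // refused until release() is called.
    public synchronized boolean tryAcquire() {
        if (busy) return false;
        busy = true;
        return true;
    }

    public synchronized void release() {
        busy = false;
    }
}
```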
After the password check, the administrator gets a full GUI window showing the status of the proxy and the parameters that can be altered. This is not just an HTML-based form, but a full Java applet with all the GUI we wanted (or any developer will want in the future), including dialogs, check boxes, etc., with the user interface controls presented in their own windows outside the browser area.
The config class
When the administrator changes parameters and chooses to send them to the main application, the applet sends the bits to the proxy, and the proxy receives them and alters its behavior accordingly. To help both the proxy and the applet with the job of getting and setting parameters, we designed a special class called config. This class appears both in the applet code and in the proxy code, and handles all get/set methods involved with the configurable parameters.
The config class calls methods on other objects to retrieve the status of their parameters (the get methods), and calls a different set of methods on those objects to change their parameters (the set methods). It also does the job of packaging these parameters and sending them over sockets to a remote machine. So the applet uses the get ability of the config class to learn the status of the main application and present it to the administrator, providing the interface required to change that status, while the main application uses the set ability of the config class to change its parameters. Both the applet and the main application use the config class's ability to send the parameters to each other.
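A minimal sketch of such a config class is shown below. The parameter names ("cacheSize", "adminPassword") and the line-based `name=value` packaging are assumptions for illustration; the real class would pack whatever parameters the proxy exposes and write the packed form to the admin socket.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the shared config class: it holds named
// parameters, exposes get/set methods, and packs/unpacks the whole set
// into a single string so it can be written to and read from a socket.
public class Config {
    private final Map<String, String> params = new LinkedHashMap<>();

    public String get(String name) {
        return params.get(name);
    }

    public void set(String name, String value) {
        params.put(name, value);
    }

    // Package all parameters as "name=value" lines for the socket stream.
    public String pack() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Rebuild a Config on the receiving side from the packed form.
    public static Config unpack(String packed) {
        Config c = new Config();
        for (String line : packed.split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                c.set(line.substring(0, eq), line.substring(eq + 1));
            }
        }
        return c;
    }
}
```

Because both sides compile the same class, the applet's `pack()` output is guaranteed to be readable by the proxy's `unpack()`, which is exactly the sharing argument made above.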
We thought this was a good design, for a couple of reasons. First, we had to write all the code in charge of managing the configurable parameters only once, and let both the proxy and the applet use it. Second, it supports encapsulation and object-oriented philosophy: all the details are in the config class rather than spread throughout the code, so future changes can be made more easily. For example, if we decide to encrypt the parameters before sending and decrypt them on arrival, thus making the communication between the applet and the proxy more secure, we would only have to add code in the config class; no other object would be aware of the change.
To install on your machine, do the following:
Create a directory (the proxy directory) and copy the *.java files of the main proxy application to it.
From the proxy directory, create a subdirectory called 'Applet' and copy the *.java files of the applet to it.
Note that the file Config.java appears both in the proxy directory and in the applet directory, because both share the config class. Note also that there are two different files named Admin.java: one for the admin thread in the proxy, and one for the applet's main class.
Compile the files in the proxy directory and the files in the applet directory. Now you have the *.class files of the proxy in the proxy directory and the *.class files of the applet in the applet directory.
That's it - you can run the proxy by starting the Daemon class in the proxy directory, specifying the daemon port as an optional parameter (if omitted, the default is 8080). The daemon performs some initializations, including creating a subdirectory called 'Cache' under the proxy directory (if it does not already exist) and cleaning the cache directory (each time you run the proxy, it cleans the cache directory on startup).
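The optional-port behavior can be sketched as follows. This is an illustration of the argument handling only (the method name `parsePort` is an assumption); the startup work itself is left as comments.

```java
// Hypothetical sketch of the daemon's argument handling: the port is an
// optional command-line parameter that falls back to 8080 when omitted.
public class Daemon {
    public static final int DEFAULT_PORT = 8080;

    public static int parsePort(String[] args) {
        if (args.length > 0) {
            return Integer.parseInt(args[0]);
        }
        return DEFAULT_PORT;
    }

    public static void main(String[] args) {
        int port = parsePort(args);
        System.out.println("Proxy daemon starting on port " + port);
        // ... create and clean the 'Cache' subdirectory, open the
        // ServerSocket on `port`, and enter the accept loop ...
    }
}
```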
Point your browser to the proxy machine name and port. Now all requests are channeled through the proxy. To control it remotely, type <MachineName>/admin in the browser and wait a moment for the applet to load. The initial password is 'admin'; after the first login you can change it.