A Distributed File System Computer Science Essay


In computing, a distributed file system (or network file system) is any file system that allows files to be accessed from multiple hosts sharing them over a computer network. This makes it possible for multiple users on multiple computers to share files and storage resources.

The client nodes do not have direct access to the underlying block storage but interact with it over the network using a protocol. This makes it possible to restrict access to the file system through access lists or capabilities on both the servers and the clients, depending on how the protocol is designed.

In contrast, in a shared disk file system all nodes have equal access to the block storage where the file system is located. On these systems the access control must reside on the client.

Distributed file systems may include facilities for transparent replication and fault tolerance. That is, when a limited number of nodes in a file system go offline, the system continues to work without any data loss.
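The idea of surviving a limited number of offline nodes can be illustrated with a minimal sketch. It assumes a toy in-memory model: the names Replica, ReplicaOffline, replicated_write and read_with_failover are hypothetical and do not belong to any real file system, and the sketch ignores resynchronizing a replica that missed a write while it was offline.

```python
class ReplicaOffline(Exception):
    """Raised by a (hypothetical) replica that cannot be reached."""


class Replica:
    """Toy stand-in for one file server holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.files = {}

    def write(self, path, data):
        if not self.online:
            raise ReplicaOffline(self.name)
        self.files[path] = data

    def read(self, path):
        if not self.online:
            raise ReplicaOffline(self.name)
        return self.files[path]


def replicated_write(replicas, path, data):
    """Write to every reachable replica and return how many accepted it."""
    accepted = 0
    for replica in replicas:
        try:
            replica.write(path, data)
            accepted += 1
        except ReplicaOffline:
            continue  # skip nodes that are down
    return accepted


def read_with_failover(replicas, path):
    """Return the data from the first replica that answers."""
    for replica in replicas:
        try:
            return replica.read(path)
        except ReplicaOffline:
            continue  # try the next copy
    raise ReplicaOffline("no replica available for " + path)
```

With three replicas, a read still succeeds after two of them go offline, because the remaining copy answers the request.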

The difference between a distributed file system and a distributed data store can be vague, but DFSes are generally geared towards use on local area networks.

DFS Characteristics

The permissions of shared folders that become part of the DFS remain unchanged.

Shares with important information can be replicated to several servers, providing fault tolerance.

The DFS root must be created first.

DFS Components

DFS root - A shared directory that can contain other shared directories, files, DFS links, and other DFS roots. One root is allowed per server. Types of DFS roots:

Stand-alone DFS root - Not published in Active Directory, cannot be replicated, and can be on any Windows 2000 Server. This provides no fault tolerance, since the DFS topology is stored on one computer. A stand-alone DFS root can be accessed using the following syntax:

Domain DFS root - It is published in Active Directory, can be replicated, and can be on any Windows 2000 Server. Files and directories must be manually replicated to other servers or Windows 2000 must be configured to replicate files and directories. Configure the domain DFS root, then the replicas when configuring automatic replication. Links are automatically replicated. There may be up to 31 replicas. Domain DFS root directories can be accessed using the following syntax:

DFS link - A pointer to another shared directory. There can be up to 1000 DFS links for a DFS root.

DFS administration is done with the "Distributed File System" administrative tool. This tool is available on Windows 2000 Server computers and on Windows 2000 Professional computers that have the ADMINPAK installed.

Client Computers

Example 1 Windows 2000 Professional

Example 2 Windows 2000 Server

Example 3 Windows 95 and Windows 98 with DFS client software. (No access to DFS links on NetWare servers).

Example 4 Windows NT 4.0 or later Server and Workstation

Distributed File system = DFS


The File Replication Service (FRS) can be used to replicate DFS shares automatically.

The Distributed File System is used to build a hierarchical view of multiple file servers and shares on the network. Instead of having to think of a specific machine name for each set of files, the user only has to remember one name, which is the 'key' to a list of shares found on multiple servers on the network. Think of it as the home of all file shares, with links that point to one or more servers that actually host those shares. DFS can route a client to the closest available file server by using Active Directory site metrics. It can also be installed on a cluster for even better performance and reliability. Medium to large sized organizations are most likely to benefit from the use of DFS; for smaller companies it is simply not worth setting up, as an ordinary file server would be just fine.
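Routing a client to the closest server can be pictured as sorting the candidate servers by a per-site cost. The following is only a sketch under stated assumptions: the function order_targets, the dictionary layout of a target, and the site_cost table are all hypothetical, and this is not a description of how Active Directory actually computes or stores its site metrics.

```python
def order_targets(targets, client_site, site_cost):
    """Return the targets sorted so the cheapest-to-reach site comes first.

    targets     -- list of dicts with hypothetical keys "server" and "site"
    client_site -- name of the site the client is in
    site_cost   -- dict mapping (from_site, to_site) to a numeric cost
    """
    def cost(target):
        if target["site"] == client_site:
            return 0  # a server in the client's own site is always preferred
        # Sites with no known cost sort last.
        return site_cost.get((client_site, target["site"]), float("inf"))

    # sorted() is stable, so targets with equal cost keep their original order.
    return sorted(targets, key=cost)
```

For example, a client in a site called "London" with one target in "Paris" and one in "London" would be offered the London server first, and the Paris server only as a fallback.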

Understanding the DFS Terminology

It is important to understand the new concepts that are part of DFS. Below is a definition of each of them.

Dfs root: You can think of this as a share that is visible on the network, and in this share you can have additional files and folders.

Dfs link: A link is another share somewhere on the network that goes under the root. When a user opens this link they will be redirected to a shared folder.

Dfs target (or replica): This can be referred to as either a root or a link. If you have two identical shares, normally stored on different servers, you can group them together as Dfs Targets under the same link.
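The relationship between a root, its links and their targets can be sketched as a small data structure. This is an illustrative model only: the classes DfsLink and DfsRoot, the resolve method, and all server and share names are hypothetical and are not part of any Windows API.

```python
from dataclasses import dataclass, field


@dataclass
class DfsLink:
    name: str
    # Paths of identical shares, normally stored on different servers.
    targets: list = field(default_factory=list)


@dataclass
class DfsRoot:
    unc: str  # the single name users remember (hypothetical example below)
    links: dict = field(default_factory=dict)

    def add_link(self, name, *targets):
        # Share names are treated as case-insensitive in this sketch.
        self.links[name.lower()] = DfsLink(name, list(targets))

    def resolve(self, path):
        """Map a path under the root to candidate paths on the real servers."""
        prefix = self.unc.lower() + "\\"
        if not path.lower().startswith(prefix):
            raise ValueError("path is not under this Dfs root")
        rest = path[len(prefix):]
        link_name, _, remainder = rest.partition("\\")
        link = self.links[link_name.lower()]
        suffix = "\\" + remainder if remainder else ""
        return [target + suffix for target in link.targets]


root = DfsRoot(r"\\example.local\public")
root.add_link("Docs", r"\\srv1\docs", r"\\srv2\docs")
candidates = root.resolve(r"\\example.local\public\Docs\report.txt")
```

Here candidates holds one path per target, so a client could try the second server if the first is unavailable.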

The image below shows the actual folder structure of what the user sees when using DFS and load balancing.

A distributed file system stores files on one or more computers called servers, and makes them accessible to other computers called clients, where they appear as normal files. There are several advantages to using file servers: the files are more widely available since many computers can access the servers, and sharing the files from a single location is easier than distributing copies of files to individual clients. Backups and safety of the information are easier to arrange since only the servers need to be backed up. The servers can provide large storage space, which might be costly or impractical to supply to every client. The usefulness of a distributed file system becomes clear when considering a group of employees sharing documents. However, more is possible. For example, sharing application software is an equally good candidate. In both cases system administration becomes easier.

There are many problems facing the design of a good distributed file system. Transporting many files over the net can easily create sluggish performance and latency; network bottlenecks and server overload can result. The security of data is another important issue: how can we be sure that a client is really authorized to have access to information and how can we prevent data being sniffed off the network? Two further problems facing the design are related to failures. Often client computers are more reliable than the network connecting them and network failures can render a client useless. Similarly a server failure can be very unpleasant, since it can disable all clients from accessing crucial information. The Coda project has paid attention to many of these issues and implemented solutions to them as a research prototype.

From caching to disconnected operation

The origin of disconnected operation in Coda lies in one of the original research aims of the project: to provide a file system with resilience to network failures. AFS, which supported thousands of clients in the late 1980s on the CMU campus, had become so large that network outages and server failures happening somewhere almost every day became a nuisance. It turned out to be a well timed effort: with the rapid advent of mobile clients (viz. laptops), Coda's support for failing networks and servers applied equally to mobile clients.

We saw in the previous section that Coda caches all information needed to provide access to the data. When updates to the file system are made, these need to be propagated to the server. In normal connected mode, such updates are propagated synchronously to the server, i.e. when the update is complete on the client it has also been made on the server. If a server is unavailable, or if the network connections between client and server fail, such an operation will incur a time-out error and fail. Sometimes, nothing can be done. For example, trying to fetch a file, which is not in the cache, from the servers, is impossible without a network connection. In such cases, the error must be reported to the calling program. However, often the time-out can be handled gracefully as follows.

To support disconnected computers or to operate in the presence of network failures, Venus will not report failure(s) to the user when an update incurs a time-out. Instead, Venus realizes that the server(s) in question are unavailable and that the update should be logged on the client. During disconnection, all updates are stored in the CML, the client modification log, which is frequently flushed to disk. The user doesn't notice anything when Coda switches to disconnected mode. Upon re-connection to the servers, Venus will reintegrate the CML: it asks the server to replay the file system updates on the server, thereby bringing the server up to date. Additionally the CML is optimized: for example, the log entries cancel out if a file is first created and then removed.
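The logging, optimization and reintegration steps described here can be sketched in a few lines. This is a toy model under stated assumptions, not Coda's implementation: the classes ClientModificationLog, Server and Client, the operation names "create", "store" and "remove", and the use of TimeoutError to signal an unreachable server are all hypothetical, and the log lives in memory rather than being flushed to disk.

```python
class ClientModificationLog:
    """In-memory stand-in for the CML."""

    def __init__(self):
        self.records = []  # (op, path, data) tuples in the order they happened

    def append(self, op, path, data=None):
        if op == "remove":
            creates = [i for i, r in enumerate(self.records)
                       if r[0] == "create" and r[1] == path]
            if creates:
                # The file was created while disconnected: the create, any later
                # records for it, and this remove cancel each other out.
                start = creates[-1]
                self.records = [r for i, r in enumerate(self.records)
                                if i < start or r[1] != path]
                return
        self.records.append((op, path, data))

    def reintegrate(self, server):
        """Replay logged updates on the server, oldest first."""
        while self.records:
            server.apply(*self.records[0])
            self.records.pop(0)  # drop a record only after it was replayed


class Server:
    def __init__(self):
        self.files = {}
        self.reachable = True

    def apply(self, op, path, data=None):
        if not self.reachable:
            raise TimeoutError("server unreachable")
        if op == "remove":
            self.files.pop(path, None)
        else:  # "create" or "store"
            self.files[path] = data


class Client:
    def __init__(self, server):
        self.server = server
        self.cml = ClientModificationLog()
        self.disconnected = False

    def update(self, op, path, data=None):
        if not self.disconnected:
            try:
                self.server.apply(op, path, data)  # synchronous in connected mode
                return
            except TimeoutError:
                self.disconnected = True  # switch modes without reporting an error
        self.cml.append(op, path, data)

    def reconnect(self):
        self.cml.reintegrate(self.server)
        self.disconnected = False
```

In this sketch a temporary file that is created and removed during a disconnection never reaches the server at all, which is the kind of saving the optimization is meant to provide.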

There are two other issues of profound importance to disconnected operation. First there is the concept of hoarding files. Since Venus cannot serve a cache miss during a disconnection, it would be nice if it kept important files in the cache up to date, by frequently asking the server to send the latest updates if necessary. Such important files are in the user's hoard database (which can be automatically constructed by "spying" on the user's file access). Updating the hoarded files is called a hoard walk. In practice our laptops hoard enormous amounts of system software, such as the X11 window system binaries and libraries, or Wabi and Microsoft Office. Since a file is a file, legacy applications run just fine.
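A hoard walk can be pictured as a loop over the hoard database that refreshes anything missing or stale in the cache. The sketch below is an assumption-laden illustration, not Venus's algorithm: the function hoard_walk, the numeric priorities, the integer version numbers and the fetch callback are all hypothetical.

```python
def hoard_walk(hoard_db, cache, server_versions, fetch):
    """Refresh every hoarded file whose cached copy is missing or out of date.

    hoard_db        -- dict mapping path to a priority (higher is refreshed first)
    cache           -- dict mapping path to a (version, data) tuple
    server_versions -- dict mapping path to the latest version on the server
    fetch           -- callable that returns the current data for a path
    """
    refreshed = []
    for path in sorted(hoard_db, key=hoard_db.get, reverse=True):
        cached = cache.get(path)
        latest = server_versions[path]
        if cached is None or cached[0] != latest:
            cache[path] = (latest, fetch(path))
            refreshed.append(path)
    return refreshed
```

Running the walk periodically while connected means that, when a disconnection does happen, the files listed in the hoard database are already in the cache.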