Welcome to the chapter 'Network Troubleshooting'. This chapter describes how to narrow down the exact problem in the network. Additionally, it discusses troubleshooting in wireless networks, troubleshooting steps and resources required for troubleshooting.
In this chapter, you will learn to:
Explain how to narrow down the problem
Discuss troubleshooting in wireless networks
Explain the troubleshooting steps
List the resources required for troubleshooting
Describe the troubleshooting tips
16.1 Narrowing Down the Problem
Resolving a network problem is always a challenge for a system administrator, particularly given the uptime expected of the network.
A systematic approach helps, and the first step is to narrow the problem down to its exact root cause. The following checks help the system administrator do so:
Check the basics.
Check whether the issue is caused by hardware or software.
Check whether the issue is at the server or the workstation.
Check whether the issue is caused by an internal malfunction or an external intrusion.
In this section, the above points are examined in detail.
16.1.1 The Basics
The problem could be as basic as a router, switch or server that is not turned on. Often a very simple issue is made to look complex. Someone has said that "God lies in the basics", and this is sometimes very true. In the eagerness to get the issue resolved, the basics are ignored and effort is wasted treating it as a complex problem.
The basic checklist is: check for valid users, check that the policies are applied successfully, check the status of the link LED, check the log files and check for abnormal activity.
If users are not able to log in, there could be various reasons behind it. Some of them are listed below:
The user is no longer a valid user, that is, the user account has been either deleted or disabled.
The group membership does not allow the user to log in at that time or on that day, according to the policy set.
The user is not entering the correct credentials. The password is entered wrongly because of a spelling mistake, because an old password is used, or because it is typed in the wrong case.
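The login checks listed above can be sketched as a small decision function. This is only an illustration: the Account record, its field names and the diagnose_login function are hypothetical, and a real NOS stores password hashes rather than plain passwords.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical account record; the field names are illustrative only."""
    username: str
    password: str  # plain text only to keep the sketch short
    enabled: bool = True
    # Hours of the day (0-23) during which the group policy allows logon.
    allowed_hours: range = range(0, 24)


def diagnose_login(accounts, username, password, hour):
    """Return the first reason a login would be refused, or 'ok'."""
    account = accounts.get(username)
    if account is None:
        return "account deleted or unknown"
    if not account.enabled:
        return "account disabled"
    if hour not in account.allowed_hours:
        return "logon not permitted at this time"
    if password != account.password:
        # Separate a wrong-case entry from any other wrong password.
        if password.lower() == account.password.lower():
            return "password entered in the wrong case"
        return "incorrect password"
    return "ok"
```

For example, an account restricted to range(9, 18) would report "logon not permitted at this time" for a login attempted at hour 20, even with the correct password.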
The status of the LEDs on a device indicates the state of various operations, which makes them helpful for detecting network issues easily.
Network devices such as routers, switches and hubs, and importantly the Network Interface Card (NIC), have a link light. It is typically green and labelled "link".
On a 10Base-T network, the link light indicates that there is a connection between the NIC and the hub or switch. If the link light is steadily lit on both devices, the two devices are communicating properly.
Checking the log files on a Network Operating System (NOS), for example a Windows server OS, gives a good idea of what errors have occurred on the workstation or server with respect to local devices or network devices.
The system log on a Windows server OS shows whether a system device is malfunctioning or a network protocol has an issue.
The application log shows whether an application is malfunctioning or has crashed. This could be software that is being used as a network application.
Abnormal activity is usually hard to find, but monitoring tools can detect it. Many NOSs come with built-in monitoring tools that monitor network activity and report any abnormal behaviour they detect.
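Reviewing log files by hand becomes slow once they grow, so a short script can summarise them. The sketch below assumes a hypothetical plain-text export in which each line reads "LEVEL source: message"; this is not the native format of any particular NOS, and summarize_errors is an illustrative name.

```python
from collections import Counter


def summarize_errors(lines, levels=("ERROR", "CRITICAL")):
    """Count log entries per source for the given severity levels.

    Assumes each line looks like 'LEVEL source: message' (hypothetical format).
    Lines that do not match this shape are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if len(parts) < 2 or parts[0] not in levels:
            continue
        # The source is the text between the level and the first colon.
        source = parts[1].split(":", 1)[0]
        counts[source] += 1
    return counts
```

A source that appears far more often than the others in the result is a reasonable first place to look.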
16.1.2 Hardware or Software
It is always important to find out whether the root cause of the network problem is hardware or software.
If the link LEDs are fine, network transmission is working, so the cause is most probably some software that was installed or updated recently.
Sometimes a patch or version upgrade for the OS or the anti-virus software can also cause an issue.
16.1.3 Server or Workstation
If a user is unable to log in to the network, it is important to find out whether the issue lies with the local workstation or with the server.
If the user cannot log on from the local workstation, try logging on to the network from another workstation. If that does not work either, the issue is with the server and not the workstation.
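The reasoning in this section can be written down as a tiny decision function. It is a sketch of the rule of thumb described here, not a complete diagnosis; locate_login_fault and its parameter names are hypothetical.

```python
def locate_login_fault(fails_on_own_workstation, fails_on_other_workstation):
    """Apply the rule of thumb: a login that fails everywhere points at the server."""
    if not fails_on_own_workstation:
        return "no fault observed"
    if fails_on_other_workstation:
        # The same login fails from a second machine, so suspect the server side.
        return "server"
    # The login works elsewhere, so the original workstation is the suspect.
    return "workstation"
```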
16.1.4 External Intrusion
If everything within the network seems to be working perfectly but there still seems to be abnormal activity in the network, log on to the server and check the firewall.
The firewall should indicate whether there has been a successful or unsuccessful intrusion attempt by an external network user. Within the network, the following things can be checked:
Whether a particular workstation or a particular segment of the network has been affected.
Whether there is an issue with the hardware connections, that is, cabling, connectors and so on.
Whether the NIC is still bound to its network protocols, because an OS malfunction can unbind the network device from its corresponding software component in the OS.
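To see whether a single workstation or a whole segment is affected, the addresses of the failing hosts can be grouped by subnet. The sketch below uses Python's standard ipaddress module; the function name group_by_subnet and the sample addresses are assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(failing_hosts, subnets):
    """Group failing host addresses under the first known subnet containing them."""
    networks = [ipaddress.ip_network(subnet) for subnet in subnets]
    grouped = defaultdict(list)
    for host in failing_hosts:
        address = ipaddress.ip_address(host)
        match = next((net for net in networks if address in net), None)
        # Hosts outside every listed subnet are collected under 'unknown'.
        grouped[str(match) if match else "unknown"].append(host)
    return dict(grouped)
```

If most failing hosts fall under one subnet, the segment (its switch, cabling or router interface) is a more likely suspect than any individual workstation.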
16.2 Troubleshooting in Wireless Network
Wireless networks are somewhat more challenging for network administrators to troubleshoot because both the wired and the wireless components have to be monitored and the troubleshooting steps designed accordingly. Table 16.1 explains the factors that affect wireless networks. These factors need to be taken into account while troubleshooting a wireless network.
Wireless networks are more prone to interference because they transmit signals through radio waves.
Interference can come from other wireless devices such as Bluetooth devices, cordless phones and other devices that operate in the same frequency ranges.
The signal strength of wireless devices is affected by factors such as the intensity of interference and the distance between the Wireless Access Point (WAP) and the client.
Wireless networks can be set up with encryption such as Wired Equivalent Privacy (WEP) or Wi-Fi Protected Access 2 (WPA2) with Advanced Encryption Standard (AES).
To avoid discrepancies while setting up encryption, ensure that the same encryption standard is used on the sender, the receiver and any relaying or amplifying WAP devices.
Wireless networks generally do not have configuration issues, because they typically use channel 1, 6 or 11 in the 2.4 GHz band, or a channel in the 5 GHz band, and the clients automatically configure themselves to the channel on which the WAP is transmitting.
These networks have more bandwidth, so the only reason for the signal to be affected is that the client and the WAP are not tuned to the same channel.
Wireless devices these days use the 802.11g or 802.11b standard. Additionally, they have a console that can be used to set the standard that one wants to use.
Always ensure that the same standard is used by both the sending and the receiving device.
Service Set Identifiers (SSID) Mismatch
A wireless device has an SSID, which is relayed to other wireless devices within range. SSIDs can be either Basic Service Set Identifiers (BSSIDs), which identify an individual WAP, or Extended Service Set Identifiers (ESSIDs), which identify a network that can span several WAPs.
Ensure that the receiving devices detect the SSID. If the wireless card is unable to pick up the signal from the WAP, either the WAP or the receiving device is not functioning properly, or they are out of range.
Table 16.1: Factors Affecting Wireless Networks
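Several of the factors in Table 16.1 come down to the WAP and the client using the same settings. A minimal sketch of such a comparison follows; the WirelessSettings record, its field names and find_mismatches are hypothetical and do not correspond to any vendor's configuration console.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WirelessSettings:
    """Hypothetical summary of the settings that must agree on both sides."""
    ssid: str
    encryption: str  # for example "WEP" or "WPA2-AES"
    standard: str    # for example "802.11g"
    channel: int


def find_mismatches(wap, client):
    """Return the names of the settings that differ between WAP and client."""
    checks = ("ssid", "encryption", "standard", "channel")
    return [name for name in checks if getattr(wap, name) != getattr(client, name)]
```

An empty list means the compared settings agree, in which case interference or signal strength becomes the more likely cause.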
16.3 Troubleshooting Steps
Table 16.2 describes the troubleshooting steps in detail.
The first step is to collect data. Data collection helps in identifying symptoms and determining the root cause of the issue.
Identify the affected areas
The second step is to identify the exact affected area in the network.
It is always essential to narrow down the problem to a specific workstation or a subnet of a small or a large network.
Determine the recent changes
The third step is determining the recent changes.
Information about the recent changes is very helpful to narrow down to the root cause.
Determine the probable root cause
The fourth step is determining the most probable root cause.
The third step is very likely to suggest the probable root cause, but what is more important is to determine the actual root cause.
Draft the action plan
The fifth step is drafting the action plan.
After collecting the data on the recent changes, one then needs to draft an action plan to resolve the issue.
Note: While drafting the action plan, take the short-term and long-term after effects into consideration as well.
Implement the action plan
The sixth step is to implement the action plan.
This step is critical to resolving the issue. Remember, the steps first need to be implemented in a test environment before they are implemented in the live environment.
Analyse the after effects
The seventh step is analysing the after effects.
Analyse both the short-term and the long-term after effects.
Document the procedure
The eighth step is documenting the entire procedure.
Documentation covers the entire process, starting with the arrival of the issue and ending with its resolution. This helps others while tackling a similar issue later.
Escalate the requirement
The ninth step is escalating the requirement.
At times the issue needs to be escalated to a higher level. One can escalate at the very beginning, when the engineer knows the issue is beyond his or her knowledge or support boundaries, or later, when the initial troubleshooting steps fail to resolve the issue.
Follow the correct order
The last step is following the correct order while troubleshooting.
Remember to first collect data, then identify the root cause, create an action plan, implement the plan and analyse the after effects.
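The ordering rule can be checked mechanically against a record of what was actually done. The sketch below lists the eight sequential steps from this section (escalation is left out because it can happen at more than one point) and reports the first step that was carried out late; STEPS and first_out_of_order are illustrative names.

```python
STEPS = [
    "collect data",
    "identify the affected areas",
    "determine the recent changes",
    "determine the probable root cause",
    "draft the action plan",
    "implement the action plan",
    "analyse the after effects",
    "document the procedure",
]


def first_out_of_order(performed):
    """Return the first step carried out after a later step, or None.

    Raises ValueError if a name in 'performed' is not one of STEPS.
    """
    highest = -1
    for step in performed:
        index = STEPS.index(step)
        if index < highest:
            return step
        highest = index
    return None
```

For instance, drafting the action plan before determining the recent changes would cause "determine the recent changes" to be reported.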
16.4 Troubleshooting Resources
While troubleshooting hardware, one port of a switch should always be kept unconnected to the network so that it can be used for experiments. In addition, one switch and one router should always be kept in reserve as backups.
Additionally, backup of the data including system data should be performed regularly. Following are the resources required by a system administrator while troubleshooting:
Screwdriver set
OS installation CD(s)
OS Updates and Patches CD
Monitoring tools software
Support contract phone number
16.5 Troubleshooting Tips
So far, the basics of network troubleshooting, the troubleshooting steps and the troubleshooting resources have been covered. Now, let's review some troubleshooting tips.
Never Ignore the Basics
The problem could be as simple as a link light that is not lit on a network device. Human errors, where the system administrator has not turned on the power, has left a cable unplugged, or has mistyped a username or password, are quite possible.
Identify the Priority Issues
A network administrator should always prioritize the tasks to be handled when multiple issues are reported at the same time. The most important issues are usually addressed first. The following are some issues listed from highest to lowest priority:
Entire network failure
A particular subnet of users experiencing network connection issues
A group of workstations affected by a network issue
Workstation rebooting continuously
Workstation going into a hung state intermittently
Workstation applications going into a hung state intermittently
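The priority list above maps naturally onto a sort. In this sketch the category labels are shortened versions of the six entries, and both the labels and the triage function are assumptions made for illustration.

```python
PRIORITY = [
    "entire network failure",
    "subnet connection issue",
    "group of workstations affected",
    "workstation rebooting continuously",
    "workstation hanging intermittently",
    "application hanging intermittently",
]


def triage(issues):
    """Sort (category, description) pairs from highest to lowest priority."""
    rank = {category: position for position, category in enumerate(PRIORITY)}
    # sorted() is stable, so issues of equal priority keep their reported order.
    return sorted(issues, key=lambda issue: rank[issue[0]])
```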
Check for Software Configuration
More often than not, the root cause of a network issue is related to software configuration. The following usually need to be checked:
WINS configuration (not much in use nowadays)
In a Windows server OS, the application and system log files can help in identifying the nature of the issue.
Check for Hardware Issues
It is always advisable to take care of the physical environment of a server and to ensure that it is optimized for factors such as location, temperature and humidity.
This becomes very helpful during actual root cause determination and troubleshooting. Factors such as radio frequency interference (RFI), electromagnetic interference (EMI) and hardware cabling should be kept in mind, and tools such as cable testers and LAN analyzers (testers) are certainly important.
Check for Software Issues
It is quite possible that software is also the cause of network issues. Some of the software issues are as follows:
Viruses, spyware, malware
Operating system service packs and updates
Security patch for OS
Other software on the network workstation
A good way of troubleshooting is to set up a test workstation with the same changes and then gradually roll back the software changes. More often than not, rolling back one or more software updates resolves the issue immediately.
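The gradual rollback on a test workstation can be sketched as a loop that undoes the most recent change first and re-tests after each one. The function roll_back_until_fixed and the issue_present callback are hypothetical; in practice the callback would stand for whatever manual or automated test reproduces the problem.

```python
def roll_back_until_fixed(changes, issue_present):
    """Undo changes newest-first until the issue disappears.

    'changes' lists the applied software changes, oldest first.
    'issue_present' is a callable that takes the changes still applied
    and returns True while the problem can be reproduced.
    Returns the changes that were rolled back, or None if the issue
    persists even after every change has been undone.
    """
    applied = list(changes)
    rolled_back = []
    while issue_present(applied):
        if not applied:
            return None
        rolled_back.append(applied.pop())
    return rolled_back
```

The last entry in the returned list is the change whose removal made the issue disappear, which makes it the first candidate for closer inspection.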