Research is becoming an increasingly exciting and interesting process nowadays. It helps academicians and managers solve various problems in organizations. Business research can be defined as a critical, organized, systematic, and objective inquiry that provides solutions to issues or problems (Cavana, Delahaye & Sekaran, 2001). It is divided into basic and applied research. Basic research is done to improve the understanding of problems that occur in organizations and to find appropriate solutions, whereas applied research uses particular results to solve problems in organizations. The findings from both basic and applied research contribute to knowledge in various disciplines. The research process consists of 11 main steps: preparation for research, preliminary information gathering, defining the problem, framework development, research objectives, research design, data collection, data analysis, interpretation of findings, report presentation and final action (Cavana, Delahaye & Sekaran, 2001).
Hence, it is very important for the researcher to test the goodness of the data through validity and reliability. Normally, the goodness of the data is assessed during the data analysis process. In empirical studies it is essential to establish the reliability and validity of test instruments: do the tests give consistent measurements (the main aspect of reliability), and do they measure what is claimed for them (the main aspect of validity)? (Zikmund, 2003). I discuss validity and reliability in depth in the following sections. I then describe the science question paper used for testing and work through the test-retest reliability coefficient using the results obtained from the tests.
Validity is the capability of a scale or measuring instrument to measure what it is intended to measure. For example, a student who gets a low mark may complain that the test did not measure her understanding of the subject, which is what the lecturer had planned to measure; the test may have measured something else. Researchers are therefore concerned with whether or not their measure is valid, and the evaluation of validity expresses this concern with accurate measurement. If the scale does not measure what it is supposed to measure, the results obtained from it cannot be relied upon (Messick, 1995).
Measurement validity is divided into a number of different types, which reflect different ways of assessing the validity of a measure of a concept. There are three main categories: face or content validity, criterion validity and construct validity.
2.1 Face Validity or Content Validity
Face validity refers to whether a measure, on the face of it, reflects the content of the concept in question. It can be established by asking other people whether or not the measure appears to relate to the concept, or by putting the same question to expert judges. It is important to see whether or not the measure represents the whole concept. Content validity increases as more scale items represent the main concepts. It can be established in three ways: from the literature, from qualitative research and from expert judges (Haynes, Richard & Kubany, 1995).
2.2 Criterion Validity
Criterion validity is the ability of a measure to correlate with other measures of the same construct. For example, the traditional way of measuring length is with a tape measure, while a newer way is with laser technology. If the new measure correlates with the established measure, it is said to have criterion validity (Zikmund, 2003). Criterion validity is about predicting a phenomenon rather than explaining it. Prediction is related to a mathematical approach, while explanation leans towards a logical approach (Lacity & Jansen, 1994). Criterion validity is divided into concurrent validity and predictive validity. Concurrent validity applies when the measure is taken at the same time as the criterion measure and is shown to be valid, whereas predictive validity applies when the measure predicts a future event.
2.3 Construct Validity
Construct validity is the capability of a measure to bear out the hypothesized relationships derived from a theory. It is assessed during the statistical analysis process. In other words, construct validity concerns the consistency between empirical evidence and theoretical logic. Construct validity is said to exist when the measure behaves as it is supposed to and is intercorrelated with the other variables (Hunter & Schmidt, 1990). The authors also indicate that construct validity is assessed through the correlation between the independent and dependent variables used in the research. In addition, construct validity cannot be expressed as a single coefficient.
A researcher should have a basic understanding of convergent validity and discriminant validity before determining construct validity. Convergent validity is the ability of a new measure to correlate highly with different measures of similar constructs, whereas discriminant validity exists when measures of dissimilar concepts have a low correlation. Establishing discriminant validity is considered a complex task in the research process (Kane, 2001).
We cannot deny that the reliability of the measurement scale plays as important a role as validity in the research process. According to Bryman (2008), reliability means the consistency of a measure of a concept. In other words, reliability refers to measures that yield consistent results over time and are free from error. For example, a tailor obtains the length of a piece of fabric by using a measuring tape. The measuring tape is considered reliable if repeated measurements of the fabric show the same length each time (Zikmund, 2003).
Besides that, we as researchers can expect stable deviation scores across testing situations that use the same testing instruments. That is why reliability involves consistency. We must understand that the reliability value changes with different samples, subject populations, situations and research settings. The three major concepts involved in measuring reliability are stability, internal reliability and inter-observer consistency (Bryman, 2008).
Stability relates to whether a measure is stable over time and across situations, so that the results of the measure for a sample do not fluctuate. Internal reliability, which can be tested using the split-half method, relates to multiple-indicator measures. It arises when a respondent's answers to each question are added up to form an overall score on a multiple-item measure. This process can lead to a lack of coherence if the indicators do not relate to the same thing. So, we must make sure that all the indicators are related to each other (Hunter & Schmidt, 1990).
Inter-observer consistency relates to qualitative studies. A lack of consistency can arise when subjective judgment is involved, for example in recording observations or translating data into categories, and more than one person takes part in the activity. There are three main methods of assessing reliability: the test-retest method, the split-half method and the equivalent-form method (Zikmund, 2003).
3.1 Test-Retest Method
The test-retest method involves administering the same scale or measure to the same respondents at different points in time to test for stability. Normally, similar results will be found if the measure is stable and is administered under the same conditions each time. For example, if a researcher finds that 75 percent of employees are committed to the organization at different times under the same conditions, the measure has repeatability. A high correlation between the two sets of measurements taken at different times indicates a high level of reliability (Cortina, 1993).
If the measurement produces different results at different points in time, the results are considered unreliable due to measurement error. Although the test-retest method gives the researcher advantages in terms of measurement, it still has weaknesses, namely pre-measure effects and homogeneity of the measure. A pre-measure effect occurs when the respondent's behavior changes between the first test and the second test. Human behavior is unpredictable, so a low or moderate correlation may reflect behavior changes rather than the measure's reliability. The second weakness relates to the homogeneity of the measure, when parallel scale items need to be presented for measurement purposes (Cortina, 1993; Zikmund, 2003).
3.2 Split-Half Method
The split-half method evaluates internal consistency by splitting a large set of items into two halves. The researcher compares the results obtained from one half of the items with those obtained from the other half. The split-half estimate depends on how the items are split in two, and different splits will give different results (McLean, Smits & Tanner, 1996).
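The dependence on the split can be seen in a small sketch. The item-level responses below are hypothetical values made up for illustration (they are not the marks collected in this study), and the helper names `pearson_r`, `half_scores` and `split_half_r` are likewise my own. The sketch correlates students' scores on one half of the items with their scores on the other half, once for an odd/even split and once for a first-half/second-half split.

```python
from math import sqrt

# Hypothetical data for illustration only: one row per student,
# one column per item (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 1, 0],
]


def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def half_scores(rows, item_indices):
    """Total score of each student over the chosen subset of items."""
    return [sum(row[i] for i in item_indices) for row in rows]


def split_half_r(rows, half_a, half_b):
    """Correlation between students' scores on two halves of the test."""
    return pearson_r(half_scores(rows, half_a), half_scores(rows, half_b))


n_items = len(responses[0])
odd_items = list(range(0, n_items, 2))        # 1st, 3rd, 5th, ... item
even_items = list(range(1, n_items, 2))       # 2nd, 4th, 6th, ... item
first_half = list(range(0, n_items // 2))     # items in the first half of the paper
second_half = list(range(n_items // 2, n_items))

r_odd_even = split_half_r(responses, odd_items, even_items)
r_first_second = split_half_r(responses, first_half, second_half)

print(f"odd/even split:          r = {r_odd_even:.2f}")
print(f"first/second-half split: r = {r_first_second:.2f}")
```

With these made-up responses the two splits give noticeably different correlations, which is the weakness of the method described above.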
3.3 Equivalent-Form Method
The equivalent-form method designs two alternative instruments and administers them to the same group of respondents. The scale is considered reliable if there is a high correlation between the two instruments. A problem arises, however, if the reliability is low or the designed instruments are not sufficiently similar (McLean, Smits & Tanner, 1996; Zikmund, 2003).
As a teacher, I need to plan properly when preparing test questions for my students. A test is a necessary tool for identifying students' level of understanding after the teaching and learning process. It needs good planning and a systematic way of conducting it. Tests can take the form of multiple-choice questions, subjective questions, essay questions, true-or-false questions and others. All these formats are important for obtaining information. Teachers can identify students' weaknesses and plan for the future using this information.
The test questions should be prepared based on the 'Jadual Penentuan Ujian (JPU)' (test specification table), since it provides an appropriate method for designing questions that are valid and reliable (Nitko, 2003). This table helps the teacher decide which types of question to include.
I have prepared a science question paper consisting of 30 questions to determine the level of students' achievement. The total number of questions is important since it affects the reliability and validity of the measure. This study involves 20 Year 5 students altogether. The questions are all multiple-choice with clear pictures, which help the students understand the questions better. I allocated 1 hour each for test 1 and test 2. Test 1 and test 2 were conducted on different days but used the same question paper.
I conducted a reliability test on a set of examination questions that I used in the classroom, using the test-retest method to determine the reliability of the questions. The test-retest reliability coefficient is calculated by computing the correlation coefficient between the two sets of test results. The calculations are shown below, and Table 1 shows the total marks obtained by the students using the same set of questions at different times.
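The correlation coefficient used in the calculation that follows is the Pearson product-moment coefficient in its raw-score form. Written out, with X standing for a student's Test 1 mark, Y for the same student's Test 2 mark and N for the number of students (X and Y are labels introduced here for brevity, not notation used elsewhere in this essay), it is:

```latex
r = \frac{\sum XY - \dfrac{\left(\sum X\right)\left(\sum Y\right)}{N}}
         {\sqrt{\left(\sum X^{2} - \dfrac{\left(\sum X\right)^{2}}{N}\right)
                \left(\sum Y^{2} - \dfrac{\left(\sum Y\right)^{2}}{N}\right)}}
```

The numerator is the sum of cross-products corrected for the means, and each bracket under the square root is the corresponding corrected sum of squares for one test.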
SJK (T) JAVA LANE, 70000 SEREMBAN, NEGERI SEMBILAN.
LIST OF PUPILS, YEAR 5 VALLUVAR - 2010
CLASS TEACHER: PN. SHARALLA KOLANDAISAMY
Table 1: Total marks obtained by the students using the same set of questions at different times
∑ (Test 1 × Test 2)
= (83×85) + (80×82) + (93×97) + (77×85) + (73×82) + (87×94) + (97×90) + (80×85) +
(87×93) + (70×85) + (93×87) + (83×90) + (97×90) + (100×98) + (73×88) + (97×97) +
(93×95) + (83×90) + (100×100) + (80×90)
= 156,345
∑ (Test 1)
= 83 + 80 + 93 + 77 + 73 + 87 + 97 + 80 + 87 + 70 + 93 + 83 + 97 + 100 + 73 + 97 + 93
+ 83 + 100 + 80
= 1726
∑ (Test 1^2)
= (83×83) + (80×80) + (93×93) + (77×77) + (73×73) + (87×87) + (97×97) + (80×80) +
(87×87) + (70×70) + (93×93) + (83×83) + (97×97) + (100×100) + (73×73) + (97×97) +
(93×93) + (83×83) + (100×100) + (80×80)
= 150,666
∑ (Test 2)
= 85 + 82 + 97 + 85 + 82 + 94 + 90 + 85 + 93 + 85 + 87 + 90 + 90 + 98 + 88 + 97 + 95 +
90 + 100 + 90
= 1803
∑ (Test 2^2)
= (85×85) + (82×82) + (97×97) + (85×85) + (82×82) + (94×94) + (90×90) + (85×85) +
(93×93) + (85×85) + (87×87) + (90×90) + (90×90) + (98×98) + (88×88) + (97×97) +
(95×95) + (90×90) + (100×100) + (90×90)
= 163,093
N = 20
∑ (Test 1 × Test 2) - [∑ (Test 1) × ∑ (Test 2)] / N
= 156,345 - [1726 × 1803] / 20
= 156,345 - 155,598.9
= 746.1
∑ (Test 1^2) - [∑ (Test 1)]^2 / N
= 150,666 - [1726 × 1726] / 20
= 150,666 - 148,953.8
= 1712.2
∑ (Test 2^2) - [∑ (Test 2)]^2 / N
= 163,093 - [1803 × 1803] / 20
= 163,093 - 162,540.45
= 552.55
r = 746.1 / √(552.55 × 1712.2)
So, the test-retest reliability coefficient is approximately 0.77.
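The hand calculation can be cross-checked with a short script. This is a minimal sketch using only the Python standard library; the marks are the ones listed in the sums above, while the function name `pearson_r` and the variable names are my own choices. It applies the same raw-score formula as the working above.

```python
from math import sqrt

# Marks of the 20 Year 5 students, in the order they appear in the sums above.
test1 = [83, 80, 93, 77, 73, 87, 97, 80, 87, 70,
         93, 83, 97, 100, 73, 97, 93, 83, 100, 80]
test2 = [85, 82, 97, 85, 82, 94, 90, 85, 93, 85,
         87, 90, 90, 98, 88, 97, 95, 90, 100, 90]


def pearson_r(x, y):
    """Pearson correlation via the raw-score (computational) formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Corrected sum of cross-products: sum(XY) - sum(X) * sum(Y) / N
    sp_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    # Corrected sums of squares: sum(X^2) - (sum X)^2 / N, and likewise for Y
    ss_x = sum(a * a for a in x) - sum_x ** 2 / n
    ss_y = sum(b * b for b in y) - sum_y ** 2 / n
    return sp_xy / sqrt(ss_x * ss_y)


r = pearson_r(test1, test2)
print(f"Test-retest reliability coefficient: r = {r:.3f}")
```

Run on these marks, the script gives r of about 0.767, which agrees with the hand calculation to within rounding.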
We cannot deny that the validity and reliability of the measurement scale play an important role in the research process. Validity and reliability are interrelated: if a measure is not reliable, it cannot be valid either. So, we should be very clear on the three main aspects of reliability, namely stability, internal reliability and inter-observer consistency (Bryman, 2008). Besides that, to obtain accurate results, the measurement scale must fulfill the requirements of validity and reliability simultaneously. In conclusion, we as researchers need to use well-validated and reliable measures to obtain sound results and fulfill the objectives of the research, in the academic field as elsewhere.