1.0 DEFINITION OF FACTOR ANALYSIS

Factor analysis (FA) is a latent structure approach used to analyze the interrelationships among a large number of observed (manifest) variables by identifying the underlying unobservable (latent) variables, known as factors, that the observed variables reflect.

With FA, the researcher can first identify the separate dimensions of the structure and then determine the extent to which each variable is explained by each dimension.  Once these dimensions and the explanation of each variable are determined, the summarization and reduction of data can be achieved.

In summarizing the data, FA describes the underlying dimensions of the data using far fewer dimensions than there are original variables. It does so by examining the pattern of correlations (or covariances) between the observed measures.

Data reduction can be achieved by calculating scores for each underlying dimension and substituting them for the original variables.

FA is an interdependence technique in which variates (factors) are formed to maximize their explanation of the entire variable set. The resulting groups of variables represent dimensions within the data, which the researcher then needs to label.

Basically, there are two types of FA: exploratory and confirmatory. Exploratory FA is used to discover the nature of the constructs that influence a set of responses, whereas confirmatory FA tests whether a specified set of constructs influences responses in a predicted way.

Data summarization

The goal of data summarization is achieved by defining a small number of factors that adequately represent the original set of variables.

Data reduction

Data reduction is achieved by identifying representative variables from a much larger set of variables for use in subsequent multivariate analyses or creating an entirely new set of variables whilst retaining the nature and character of the original variables.

Data reduction relies on the factor loadings and uses them as a basis for either identifying variables for subsequent analysis with other techniques or making estimates of the factors themselves (factor scores or summated scales), which then replace the original variables in subsequent analysis.
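
As an illustration of the summated-scale route to data reduction, the following Python sketch averages the items assigned to each factor, so that one score per factor replaces the original variables. The responses, the item names and the item-to-factor grouping are hypothetical and serve only to show the mechanics.

```python
from statistics import mean

# Hypothetical survey responses: one dict per respondent, keyed by item name.
responses = [
    {"item1": 4, "item2": 5, "item3": 2, "item4": 3},
    {"item1": 2, "item2": 3, "item3": 5, "item4": 4},
]

# Hypothetical grouping of items into factors, as might be read off the loadings.
factor_items = {"factor_a": ["item1", "item2"], "factor_b": ["item3", "item4"]}


def summated_scales(row, groups):
    """Return one score per factor: the mean of the items assigned to it."""
    return {name: mean(row[item] for item in items) for name, items in groups.items()}


scores = [summated_scales(row, factor_items) for row in responses]
for respondent, score in enumerate(scores, start=1):
    print(respondent, score)
```

Factor scores proper weight every variable by its loading rather than averaging a subset, so this sketch covers only the simpler of the two options named above.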

Factor analytic techniques are run from either an exploratory or a confirmatory perspective, according to their purpose. Many researchers use Exploratory Factor Analysis (EFA) when they are searching for structure among a set of variables or need a data reduction technique. Unlike Confirmatory Factor Analysis (CFA), EFA does not set any a priori constraints on the estimation of the components or on the number of components to be extracted. CFA is used to confirm what is expected on the basis of pre-established theory.

2.0 PURPOSE OF FACTOR ANALYSIS

The primary purpose of FA is to discover simple patterns in the relationships amongst variables by defining the underlying structure in a data matrix. This is done through data summarization and data reduction.

3.0 HISTORY OF FACTOR ANALYSIS

FA was pioneered in 1904 by the psychologist Charles Spearman, who hypothesized that the enormous variety of tests of mental ability (measures of mathematical skill, vocabulary, verbal skills and others) could be explained by one underlying factor of general intelligence, which he called g. FA was developed to analyze test scores so as to determine whether intelligence is made up of a single underlying general factor or of several more limited factors measuring attributes such as mathematical ability.

Raymond Cattell expanded on Spearman's g by using a multi-factor theory to explain intelligence. He also developed several mathematical methods, such as the scree test and the similarity coefficient. His statistical methods led statisticians to improved versions of factor analysis.

4.0 PRINCIPAL COMPONENT (PCA) VERSUS FACTOR ANALYSIS (FA)

There is much debate amongst statisticians on the difference between Principal Component Analysis (PCA) and FA. One distinct difference is that FA assumes the measured responses are based on the underlying factors, whilst in PCA the principal components are based on the measured responses.

Principal component analysis is used when the objective is to summarize most of the original information (variance) in a minimum number of factors for prediction purposes. In contrast, FA is used primarily to identify underlying factors or dimensions that reflect what the variables share in common.

Principal components are defined as linear combinations of the measurements and therefore contain small proportions of unique variance and, in some instances, error variance. FA, by contrast, considers only the common or shared variance, assuming that both the unique and the error variance are not of interest in defining the structure of the variables.

PCA produces an orthogonal transformation of the variables without reference to any underlying model, whilst FA is based on a proper statistical model and is more concerned with explaining the covariance structure of the variables than with explaining the variances (Chatfield, 1980).

The calculation of PC scores is straightforward whilst the calculation of factor scores is more complex and a variety of methods can be used.

From a practical perspective, principal component analysis is most appropriate when the primary concern is data reduction, focusing on the minimum number of factors needed to account for the maximum portion of the total variance represented in the original set of variables. FA is most appropriate when the primary objective is to identify the latent dimensions or constructs represented in the original variables.

5.0 STEPS IN FACTOR ANALYSIS

5.1 TEST ASSUMPTIONS

5.1.1 FA is robust to assumptions of normality

If the variables are normally distributed, then the solution is enhanced. To check normality, .....

5.1.2 Check the adequacy of the sample size

Many sample sizes have been proposed for FA. Guilford (1954) recommended that the sample size should be at least 200, whilst Hair, Black, Babin & Anderson (2010) stated that the minimum is at least five times as many observations as the number of variables to be analyzed, with a 10:1 ratio being more acceptable. Comrey and Lee (1992) provided the following guidance in determining the adequacy of sample size:

Table 1: Determining the Adequacy of Sample Size

Sample Size      Indication
100              Poor
200              Fair
300              Good
500              Very good
1,000 or more    Excellent
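
A minimal Python sketch of these sample-size checks is given below. The function names are illustrative, and reading Table 1 as "the label of the largest tabulated size that the sample reaches" is an assumption, since the table lists only five reference points. The study size used in the example is hypothetical.

```python
# Reference points from Table 1 (Comrey and Lee, 1992), largest first.
TABLE_1 = [(1000, "Excellent"), (500, "Very good"), (300, "Good"),
           (200, "Fair"), (100, "Poor")]


def comrey_lee_label(n_obs):
    """Label of the largest Table 1 sample size that n_obs reaches (assumed reading)."""
    for size, label in TABLE_1:
        if n_obs >= size:
            return label
    return "Below the tabulated range"  # Table 1 gives no label under 100


def observations_per_variable(n_obs, n_variables):
    """Ratio used by the 5:1 minimum and 10:1 preferred rules of Hair et al. (2010)."""
    return n_obs / n_variables


n_obs, n_variables = 320, 16  # hypothetical study
ratio = observations_per_variable(n_obs, n_variables)
print("Comrey and Lee label:", comrey_lee_label(n_obs))
print("Observations per variable:", ratio)
print("Meets 5:1 minimum:", ratio >= 5, "| meets 10:1:", ratio >= 10)
print("Meets Guilford's 200:", n_obs >= 200)
```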

5.1.3 All variables must be suitable for correlational analysis.

The sample must be homogeneous with respect to the underlying factor structure. It is inappropriate to apply FA to a sample that pools subgroups known to differ on the items, such as males and females, because the pooled solution will misrepresent the unique structure of each group.

There are various ways to quantify the degree of intercorrelation amongst the variables, such as the Measure of Sampling Adequacy (MSA). The index ranges from 0 to 1, reaching 1 when each variable is perfectly predicted without error by the other variables. If the MSA value falls below 0.50, the researcher should identify variables for deletion until an overall value of at least 0.50 is achieved. According to Hair et al. (2010), the MSA can be interpreted as follows:

Table 2: Measure of Sampling Adequacy (MSA)

Measure of Sampling Adequacy    Indication
0.80 or above                   Meritorious
0.70 or above                   Middling
0.60 or above                   Mediocre
0.50 or above                   Miserable
Below 0.50                      Unacceptable

Two further methods of determining the appropriateness of FA are the Bartlett test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure. The Bartlett test is a statistical test for the presence of correlations among the variables; a significant result indicates that the correlation matrix has significant correlations among at least some of the variables. The KMO value should be more than 0.5.
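
For reference, Bartlett's statistic is commonly computed as chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where R is the p x p correlation matrix and n is the sample size. The Python sketch below implements that formula with the standard library only. The 3-variable correlation matrix and the sample size are hypothetical, and the p-value is not computed because it would require a chi-square distribution function.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for a correlation matrix."""
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df


# Hypothetical correlation matrix for three variables, n = 100 respondents.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
chi_square, df = bartlett_sphericity(corr, n_obs=100)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

With 16 variables the degrees-of-freedom formula gives 16 x 15 / 2 = 120, which is the df reported in the SPSS output of Section 7.0.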

The factor analyst must ensure that the data matrix has sufficient correlations to justify the application of FA.

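The anti-image correlation matrix can also be used to indicate whether the data matrix is suitable for FA. Its off-diagonal elements are the negatives of the partial correlations among the variables, that is, the correlations that remain once the effects of the other variables have been accounted for, and its diagonal holds the MSA of each individual variable. A variable whose MSA is less than 0.5 lacks sufficient correlation with the other variables and should not be included in the FA.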
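
The overall KMO value and the per-variable MSA values can be sketched in Python as follows, using the usual definition: the sum of squared correlations divided by the sum of squared correlations plus the sum of squared partial correlations, with the partial correlations taken from the inverse of the correlation matrix. The helper names and the 3-variable correlation matrix are hypothetical, and only the standard library is used.

```python
def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    a = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        pivot_value = a[col][col]
        a[col] = [x / pivot_value for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def sampling_adequacy(corr):
    """Return (overall KMO, list of per-variable MSA values)."""
    n = len(corr)
    inv = invert(corr)
    # Partial correlation of i and j, controlling for all other variables.
    partial = [[-inv[i][j] / (inv[i][i] * inv[j][j]) ** 0.5 for j in range(n)]
               for i in range(n)]
    per_variable = []
    r2_total = p2_total = 0.0
    for i in range(n):
        r2 = sum(corr[i][j] ** 2 for j in range(n) if j != i)
        p2 = sum(partial[i][j] ** 2 for j in range(n) if j != i)
        per_variable.append(r2 / (r2 + p2))
        r2_total += r2
        p2_total += p2
    return r2_total / (r2_total + p2_total), per_variable


# Hypothetical correlation matrix for three variables.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
kmo, msa_values = sampling_adequacy(corr)
print(f"KMO = {kmo:.3f}")
print("MSA per variable:", [round(v, 3) for v in msa_values])
print("Variables below 0.5:", [i for i, v in enumerate(msa_values) if v < 0.5])
```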

6.0 SELECT TYPE OF ANALYSIS

6.2.1 EXTRACTION

In FA, the researcher groups variables by their correlations, such that the variables in a group (factor) have high correlations with each other. It is important to understand how much of a variable's variance is shared with the other variables in that factor versus what cannot be shared. The total variance of any variable is composed of its common, unique and error variances. As a variable becomes more highly correlated with one or more variables, its common variance, known as the communality, increases.

6.2.2 ROTATION

This important tool refers to turning the reference axes of the factors about the origin until some other position has been reached. The ultimate effect of rotating the factor matrix is to redistribute the variance from earlier factors to later ones to achieve a simpler, theoretically more meaningful pattern.
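
The redistribution of variance can be seen in a small two-factor Python sketch. An orthogonal rotation by a fixed angle changes how much variance each factor carries but leaves every variable's communality, and the total variance explained, unchanged. The loadings and the 45-degree angle are hypothetical; a method such as Varimax would choose the angle by optimizing a criterion rather than taking it as given.

```python
import math


def rotate(loadings, degrees):
    """Rotate two-factor loadings orthogonally by the given angle."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]


def communalities(loadings):
    """Row sums of squared loadings: variance of each variable explained."""
    return [sum(x * x for x in row) for row in loadings]


def factor_variance(loadings):
    """Column sums of squared loadings: variance carried by each factor."""
    return [sum(row[k] ** 2 for row in loadings) for k in range(len(loadings[0]))]


# Hypothetical unrotated loadings of four variables on two factors.
unrotated = [(0.70, 0.50), (0.65, 0.55), (0.60, -0.50), (0.70, -0.45)]
rotated = rotate(unrotated, 45)

print("Variance per factor before:", [round(v, 3) for v in factor_variance(unrotated)])
print("Variance per factor after: ", [round(v, 3) for v in factor_variance(rotated)])
print("Communalities before:", [round(h, 3) for h in communalities(unrotated)])
print("Communalities after: ", [round(h, 3) for h in communalities(rotated)])
```

After the rotation each variable loads mainly on one factor, which is the simpler pattern the paragraph above describes.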

There are two kinds of rotation: orthogonal factor rotation and oblique factor rotation. In orthogonal factor rotation the axes are maintained at 90 degrees to each other, whereas in oblique factor rotation they are not. The major orthogonal approaches are Varimax, Quartimax and Equimax. The Varimax method encourages the detection of factors each of which is related to few variables. Quartimax, on the other hand, seeks to maximize the variance of the squared loadings for each variable and tends to produce factors with high loadings for all variables. Equimax is a compromise between Varimax and Quartimax.

For oblique factor rotation, methods such as Oblimin, Promax, Orthoblique, Dquart and Doblimin have been developed. Oblimin allows factors to covary and to correlate with each other.

The researcher needs to choose either orthogonal or oblique factor rotation based on the particular needs of a given research problem. Hair et al. (2010) suggested that orthogonal rotation methods are preferred when the research goal is data reduction to either a smaller number of variables or a set of uncorrelated measures for subsequent use in other multivariate techniques, whereas oblique rotation methods are best suited to the goal of obtaining several theoretically meaningful factors or constructs.

The Significance of Factor Loadings

Factor loadings indicate how strongly a measured variable is correlated with a factor. Because the squared loading is the proportion of the variable's variance accounted for by the factor, a loading of 0.30 translates to approximately 10 percent explanation and a loading of 0.50 indicates that 25 percent of the variance is accounted for by the factor. Using the practical significance of factor loadings, Hair et al. (2010) proposed the following (for sample sizes of 100 or above):

Table 3: Significance of Factor Loadings

Factor Loadings       Indication
± 0.30 to 0.49        Meets the minimal level for interpretation of structure
± 0.50 or greater     Practically significant
Exceeding ± 0.70      Indicative of well-defined structure
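
The percentages quoted for loadings of 0.30 and 0.50 come from squaring the loading. A short Python sketch of that arithmetic, with an illustrative function name, is:

```python
def variance_explained(loading):
    """Proportion of a variable's variance accounted for by one factor."""
    return loading ** 2


for loading in (0.30, 0.50, 0.70):
    share = variance_explained(loading)
    print(f"loading {loading:.2f} -> {share:.0%} of the variance")
```

The sign of a loading does not matter here, since a loading of -0.50 accounts for the same 25 percent as one of 0.50.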

Comrey & Lee (1992) also proposed guidelines for the practical significance of factor loadings, as below:

Table 4: Significance of Factor Loadings

Factor Loadings     Indication
More than 0.70      Excellent
0.63 or above       Very good
0.55 or above       Good
0.45 or above       Fair
0.32 or above       Poor

In relation to the table above, Hair et al. (2010) provide guidelines for identifying significant factor loadings based on sample size, as below:

Table 5: Guidelines for Identifying Significant Factor Loadings Based on Sample Size

Factor Loading    Sample Size Needed for Significance (a)
0.30              350
0.35              250
0.40              200
0.45              150
0.50              120
0.55              100
0.60              85
0.65              70
0.70              60
0.75              50

(a) Significance is based on a 0.05 significance level (α), a power level of 80 percent, and standard errors assumed to be twice those of conventional correlation coefficients.

Source: Computations made with SOLO Power Analysis, BMDP Statistical Software, Inc., 1993

Assess the Communalities of the Variables

A communality measures the proportion of variance in a given variable explained by all the factors jointly and may be interpreted as the reliability of the indicator. Communalities are used to identify any variables that are not adequately accounted for by the factor solution. Variables with communalities of less than 0.50 are considered not to have an acceptable level of explanation, and the researcher may then need to extract more factors to explain the variance.
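
As a worked check, a communality can be reproduced as the sum of a variable's squared loadings across the factors. The Python sketch below does this for att5 and att3, the two variables whose three unrotated loadings are all displayed in the Factor Matrix of Section 7.0, and then flags the variables whose reported extraction communality falls below 0.50. The function and variable names are illustrative.

```python
# Unrotated loadings on factors 1 to 3, taken from the Factor Matrix in Section 7.0.
loadings = {"att5": (0.562, -0.401, 0.410), "att3": (0.487, 0.441, 0.308)}

# Extraction communalities as reported in the Communalities table in Section 7.0.
extraction = {
    "att1": 0.617, "att2": 0.606, "att3": 0.526, "att4": 0.514,
    "att5": 0.645, "att6": 0.360, "att7": 0.278, "att8": 0.164,
    "att9": 0.598, "att10": 0.308, "att11": 0.576, "att12": 0.274,
    "att13": 0.320, "att14": 0.550, "att15": 0.356, "att16": 0.682,
}


def communality(row):
    """Sum of squared loadings of one variable across all factors."""
    return sum(value ** 2 for value in row)


for name, row in loadings.items():
    print(name, "computed:", round(communality(row), 3), "reported:", extraction[name])

# Variables not adequately explained by the three-factor solution (cutoff 0.50).
low = [name for name, h2 in extraction.items() if h2 < 0.50]
print("Communality below 0.50:", low)
```

Small differences between the computed and reported values are expected, because the displayed loadings are rounded to three decimals.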

6.3 DETERMINE NUMBER OF FACTORS

There are a number of methods to determine the optimal number of factors.

Latent root Criterion/Kaiser Criterion.

The latent root criterion, also known as the Kaiser criterion, states that factors having latent roots (eigenvalues of the correlation matrix) greater than 1 are considered significant. An eigenvalue is the amount of variance accounted for by a factor. Hair et al. (2010) suggested that using the eigenvalue to establish a cutoff is most reliable when the number of variables is between 20 and 50.

Scree Test Criterion.

The Cattell scree test is derived by plotting the latent roots against the number of factors in their order of extraction; the shape of the resulting curve is used to evaluate the cutoff point. As one moves to the right in the scree plot, toward later components, the eigenvalues drop. The Cattell scree test says to drop all components after the one starting the elbow (the point after which the remaining eigenvalues decline in an approximately linear fashion).

Variance Criterion

The variance criterion is an approach that ensures practical significance for the derived factors by requiring the cumulative percentage of variance extracted by successive factors to reach a specified level. Hair et al. (2010) noted that it is not uncommon to accept a solution that accounts for 60 percent of the total variance as satisfactory.
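
The two eigenvalue-based criteria above can be sketched in Python using the sixteen initial eigenvalues reported in the Total Variance Explained table of Section 7.0. The function names are illustrative, and the 60 percent target is the level mentioned above, not a fixed rule.

```python
# Initial eigenvalues from the Total Variance Explained table in Section 7.0.
eigenvalues = [6.452, 1.340, 1.062, 0.951, 0.841, 0.756, 0.656, 0.643,
               0.577, 0.528, 0.499, 0.421, 0.389, 0.348, 0.302, 0.235]


def kaiser_count(values):
    """Number of factors with an eigenvalue greater than 1 (latent root criterion)."""
    return sum(1 for value in values if value > 1.0)


def cumulative_percent(values):
    """Cumulative percentage of total variance after each successive factor."""
    total = sum(values)
    running, result = 0.0, []
    for value in values:
        running += value
        result.append(100.0 * running / total)
    return result


def factors_for_target(values, target_percent):
    """Smallest number of factors whose cumulative variance reaches the target."""
    for count, percent in enumerate(cumulative_percent(values), start=1):
        if percent >= target_percent:
            return count
    return len(values)


print("Kaiser criterion retains:", kaiser_count(eigenvalues))
print("Cumulative % after 3 factors:", round(cumulative_percent(eigenvalues)[2], 1))
print("Factors needed for 60 %:", factors_for_target(eigenvalues, 60))
```

On these eigenvalues the two criteria do not agree: three eigenvalues exceed 1, while the initial cumulative variance passes 60 percent only with the fourth factor, so the choice remains a judgment for the researcher.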

6.4 NAME AND DEFINE FACTORS

As correlated variables group together, the researcher needs to give each group a label that represents its variables as accurately as possible.

6.5 ANALYSE INTERNAL RELIABILITY

Reliability is assessed here through internal consistency. The rationale for internal consistency is that the individual items or indicators of the scale should all be measuring the same construct and thus be highly intercorrelated. There are two diagnostic measures of reliability: the item-to-total and inter-item correlations, or the reliability coefficient. If the researcher chooses the first method, the item-to-total correlations should exceed 0.50 and the inter-item correlations should exceed 0.30. For the reliability coefficient, Zikmund, Babin, Carr & Griffin (2010) provide the guidelines in Table 6 below:

Table 6: Coefficient alpha (α) to Determine Reliabilities

Coefficient alpha (α)    Indication
Between 0.80 and 0.95    Very good
Between 0.70 and 0.80    Good
Between 0.60 and 0.70    Fair
Below 0.60               Poor
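
Coefficient alpha is usually computed as α = k/(k - 1) x (1 - sum of item variances / variance of the summated score), where k is the number of items. The following Python sketch applies that formula to hypothetical scores for three items answered by four respondents. The function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a list of item-score lists.

    Each inner list holds one item's scores, with respondents in the same order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # summated score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical scores: three items, four respondents.
items = [
    [2, 3, 4, 5],
    [3, 3, 4, 5],
    [2, 4, 4, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

The resulting value is then read against the bands in Table 6.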

7.0 EXPLORATORY FACTOR ANALYSIS USING STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES (SPSS)

Correlation Matrix

        att1   att2   att3   att4   att5   att6   att7   att8   att9  att10  att11  att12  att13  att14  att15  att16
att1   1.000   .664   .250   .435   .490   .315   .378   .328   .574   .336   .575   .338   .176   .436   .379   .560
att2    .664  1.000   .383   .506   .444   .456   .345   .260   .525   .316   .468   .414   .320   .533   .480   .674
att3    .250   .383  1.000   .457   .210   .321   .216   .054   .217   .206   .231   .225   .429   .425   .314   .296
att4    .435   .506   .457  1.000   .351   .352   .336   .240   .415   .352   .405   .416   .331   .558   .439   .529
att5    .490   .444   .210   .351  1.000   .210   .318   .194   .303   .216   .603   .330   .188   .296   .238   .352
att6    .315   .456   .321   .352   .210  1.000   .358   .128   .379   .475   .329   .290   .276   .421   .311   .486
att7    .378   .345   .216   .336   .318   .358  1.000   .256   .373   .344   .332   .320   .175   .333   .265   .397
att8    .328   .260   .054   .240   .194   .128   .256  1.000   .348   .209   .215   .128   .128   .200   .231   .265
att9    .574   .525   .217   .415   .303   .379   .373   .348  1.000   .437   .368   .383   .203   .492   .398   .609
att10   .336   .316   .206   .352   .216   .475   .344   .209   .437  1.000   .366   .296   .181   .325   .289   .419
att11   .575   .468   .231   .405   .603   .329   .332   .215   .368   .366  1.000   .338   .176   .382   .333   .445
att12   .338   .414   .225   .416   .330   .290   .320   .128   .383   .296   .338  1.000   .186   .377   .266   .386
att13   .176   .320   .429   .331   .188   .276   .175   .128   .203   .181   .176   .186  1.000   .391   .233   .318
att14   .436   .533   .425   .558   .296   .421   .333   .200   .492   .325   .382   .377   .391  1.000   .428   .579
att15   .379   .480   .314   .439   .238   .311   .265   .231   .398   .289   .333   .266   .233   .428  1.000   .559
att16   .560   .674   .296   .529   .352   .486   .397   .265   .609   .419   .445   .386   .318   .579   .559  1.000

Anti-image Matrices

Anti-image Covariance

        att1   att2   att3   att4   att5   att6   att7   att8   att9  att10  att11  att12  att13  att14  att15  att16
att1    .399  -.141  -.001  -.012  -.051   .047  -.042  -.058  -.112  -.005  -.120   .029   .048  -.002   .014  -.016
att2   -.141   .366  -.057  -.013  -.054  -.080   .030  -.010  -.004   .051   .016  -.060  -.030  -.024  -.045  -.102
att3   -.001  -.057   .647  -.135  -.004  -.062  -.023   .075   .021  -.001   .006   .016  -.189  -.074  -.066   .063
att4   -.012  -.013  -.135   .519  -.027   .021  -.017  -.052   .011  -.046  -.024  -.097  -.019  -.107  -.057  -.047
att5   -.051  -.054  -.004  -.027   .570   .040  -.068  -.010   .010   .034  -.224  -.063  -.038   .021   .028   .009
att6    .047  -.080  -.062   .021   .040   .605  -.090   .041  -.010  -.183  -.039  -.004  -.033  -.046   .017  -.060
att7   -.042   .030  -.023  -.017  -.068  -.090   .719  -.089  -.024  -.061  -.002  -.075   .008  -.015   .001  -.035
att8   -.058  -.010   .075  -.052  -.010   .041  -.089   .817  -.102  -.032   .002   .053  -.055   .016  -.054   .017
att9   -.112  -.004   .021   .011   .010  -.010  -.024  -.102   .482  -.099   .039  -.071   .022  -.072  -.013  -.095
att10  -.005   .051  -.001  -.046   .034  -.183  -.061  -.032  -.099   .647  -.082  -.040  -.006   .023  -.014  -.029
att11  -.120   .016   .006  -.024  -.224  -.039  -.002   .002   .039  -.082   .491  -.025   .020  -.029  -.034  -.015
att12   .029  -.060   .016  -.097  -.063  -.004  -.075   .053  -.071  -.040  -.025   .711   .006  -.038   .011   .004
att13   .048  -.030  -.189  -.019  -.038  -.033   .008  -.055   .022  -.006   .020   .006   .737  -.094   .016  -.040
att14  -.002  -.024  -.074  -.107   .021  -.046  -.015   .016  -.072   .023  -.029  -.038  -.094   .501  -.026  -.064
att15   .014  -.045  -.066  -.057   .028   .017   .001  -.054  -.013  -.014  -.034   .011   .016  -.026   .632  -.125
att16  -.016  -.102   .063  -.047   .009  -.060  -.035   .017  -.095  -.029  -.015   .004  -.040  -.064  -.125   .362

Anti-image Correlation

        att1   att2   att3   att4   att5   att6   att7   att8   att9  att10  att11  att12  att13  att14  att15  att16
att1   .897a  -.369  -.002  -.026  -.107   .095  -.079  -.101  -.256  -.009  -.271   .055   .089  -.005   .028  -.042
att2   -.369  .911a  -.118  -.030  -.118  -.169   .059  -.019  -.008   .105   .037  -.118  -.058  -.055  -.095  -.280
att3   -.002  -.118  .865a  -.234  -.006  -.099  -.034   .104   .037  -.001   .011   .024  -.274  -.130  -.104   .131
att4   -.026  -.030  -.234  .939a  -.050   .037  -.027  -.079   .023  -.079  -.047  -.159  -.031  -.209  -.100  -.107
att5   -.107  -.118  -.006  -.050  .875a   .069  -.106  -.015   .018   .056  -.423  -.098  -.058   .039   .047   .021
att6    .095  -.169  -.099   .037   .069  .907a  -.136   .058  -.019  -.293  -.072  -.007  -.049  -.083   .027  -.127
att7   -.079   .059  -.034  -.027  -.106  -.136  .950a  -.117  -.041  -.089  -.003  -.106   .011  -.025   .002  -.070
att8   -.101  -.019   .104  -.079  -.015   .058  -.117  .894a  -.162  -.044   .003   .069  -.071   .024  -.075   .032
att9   -.256  -.008   .037   .023   .018  -.019  -.041  -.162  .921a  -.177   .080  -.121   .036  -.147  -.023  -.229
att10  -.009   .105  -.001  -.079   .056  -.293  -.089  -.044  -.177  .901a  -.145  -.059  -.008   .041  -.022  -.061
att11  -.271   .037   .011  -.047  -.423  -.072  -.003   .003   .080  -.145  .883a  -.042   .034  -.057  -.060  -.036
att12   .055  -.118   .024  -.159  -.098  -.007  -.106   .069  -.121  -.059  -.042  .944a   .009  -.063   .016   .008
att13   .089  -.058  -.274  -.031  -.058  -.049   .011  -.071   .036  -.008   .034   .009  .887a  -.154   .023  -.078
att14  -.005  -.055  -.130  -.209   .039  -.083  -.025   .024  -.147   .041  -.057  -.063  -.154  .946a  -.047  -.150
att15   .028  -.095  -.104  -.100   .047   .027   .002  -.075  -.023  -.022  -.060   .016   .023  -.047  .943a  -.262
att16  -.042  -.280   .131  -.107   .021  -.127  -.070   .032  -.229  -.061  -.036   .008  -.078  -.150  -.262  .922a

a. Measures of Sampling Adequacy (MSA)

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy             .914
Bartlett's Test of Sphericity    Approx. Chi-Square     2491.010
                                 df                          120
                                 Sig.                       .000

Communalities

       Initial  Extraction
att1      .601        .617
att2      .634        .606
att3      .353        .526
att4      .481        .514
att5      .430        .645
att6      .395        .360
att7      .281        .278
att8      .183        .164
att9      .518        .598
att10     .353        .308
att11     .509        .576
att12     .289        .274
att13     .263        .320
att14     .499        .550
att15     .368        .356
att16     .638        .682

Extraction Method: Principal Axis Factoring.

Total Variance Explained

        Initial Eigenvalues                  Extraction Sums of Squared Loadings  Rotation Sums of Squared Loadings
Factor   Total  % of Variance  Cumulative %   Total  % of Variance  Cumulative %   Total  % of Variance  Cumulative %
1        6.452         40.324        40.324   5.959         37.243        37.243   3.346         20.915        20.915
2        1.340          8.373        48.697    .833          5.206        42.449   2.150         13.438        34.353
3        1.062          6.639        55.336    .582          3.637        46.086   1.877         11.733        46.086
4         .951          5.942        61.278
5         .841          5.253        66.531
6         .756          4.727        71.257
7         .656          4.101        75.359
8         .643          4.017        79.376
9         .577          3.608        82.985
10        .528          3.298        86.283
11        .499          3.118        89.401
12        .421          2.633        92.033
13        .389          2.431        94.464
14        .348          2.176        96.640
15        .302          1.889        98.529
16        .235          1.471       100.000

Extraction Method: Principal Axis Factoring.

Factor Matrix (a)

       Factor
           1      2      3
att16   .797
att2    .778
att1    .725  -.301
att14   .702
att9    .696         -.324
att4    .688
att11   .643  -.333
att15   .581
att6    .569
att5    .562  -.401   .410
att10   .526
att12   .522
att7    .519
att3    .487   .441   .308
att13   .412   .345
att8    .352

Extraction Method: Principal Axis Factoring.

a. 3 factors extracted. 18 iterations required.

Rotated Factor Matrix (a)

       Factor
           1      2      3
att9    .732
att16   .710   .359
att1    .567          .525
att2    .555   .391   .382
att10   .498
att6    .464   .366
att15   .462   .344
att7    .424
att8    .370
att12   .361
att3           .709
att13          .545
att14   .470   .544
att4    .403   .527
att5                  .770
att11   .341          .655

Extraction Method: Principal Axis Factoring.

Rotation Method: Varimax with Kaiser Normalization.

a. Rotation converged in 6 iterations.

Factor Transformation Matrix

Factor      1      2      3
1        .717   .515   .470
2       -.113   .751  -.650
3       -.688   .412   .597

Extraction Method: Principal Axis Factoring.

Rotation Method: Varimax with Kaiser Normalization.