Research on the sixteen personality factor questionnaire


The Sixteen Personality Factor Questionnaire (16PF) is a measure of normal personality based on R. B. Cattell's factor-analytic theory of personality (Cattell, 1933, 1946). Since the original 16PF Questionnaire was published in 1949, it has been revised four times: in 1956, 1962, and 1968, and most recently as the fifth edition in 1993.

Description of Test

The 16PF Fifth Edition Questionnaire contains 185 multiple-choice items. The questionnaire is written in simple (fifth-grade), updated language and is meant for individuals above 16 years of age.

Personality items have a three-choice answer format, and the middle response is a question mark. The items are nonthreatening and ask about personal preferences, interests, behaviors, and opinions. Items were also reviewed for gender, race, or cultural bias, compliance with the Americans with Disabilities Act, and cross-cultural translatability.

Parallel versions of the 16PF test include the 16PF Adolescent Personality Questionnaire for the lower age range of 12 to 18 years (Schuerger, 2001); the 16PF Select, a shorter version used for employee selection (Cattell et al., 1999); and the 16PF Express, with fewer items for each factor (Gorsuch, 2006). The 16PF is also included in the PsychEval Personality Questionnaire, which measures both normal and abnormal personality dimensions (Cattell et al., 2003).


The test consists of sixteen primary factor scales including a cognitive ability scale, five global factor scales, and two super factor scales. In addition, it features three response style indices to measure validity. All scales were determined through factor analysis.

The sixteen primary scales are basic elements of personality. Each primary scale contains 10-15 items. The sixteen primary scales are Warmth (A), Reasoning (B), Emotional Stability (C), Dominance (E), Liveliness (F), Rule-Consciousness (G), Social Boldness (H), Sensitivity (I), Vigilance (L), Abstractedness (M), Privateness (N), Apprehension (O), Openness to Change (Q1), Self-Reliance (Q2), Perfectionism (Q3), and Tension (Q4).

Global scales are based on more items (40-50) than primary scales and are therefore more reliable and robust, so more confidence can be placed in their accuracy. The five global scales are Extraversion, Anxiety (Neuroticism), Tough-Mindedness, Independence, and Self-Control. Each global factor is made up of four or five specific primary traits. The global scales align fairly well with other Big-Five measures, but they are broad in meaning.

Super factor I, called active outward engagement, consists of both Extraversion and Independence. It involves tendencies to connect socially with the world, and to explore and master the environment. Super factor II, called self-disciplined practicality versus unrestrained creativity, consists of both Self-Control and Receptivity. Self-controlled people tend to be more tough-minded and less receptive to feelings and new ideas, while impulsive and undisciplined people tend to be more creative and receptive to emotions and ideas.

The three response-style scales help to identify unusual response patterns, which especially affect predictive validity. The three response indices are the Infrequency (INF) Scale, the Acquiescence (ACQ) Scale, and the Impression Management (IM) Scale. Factors that affect accurate responding include low reading comprehension, test anxiety, and giving socially desirable answers. The INF scale measures whether the individual responded meaningfully or randomly, while the IM and ACQ scales indicate whether the individual was motivated to present an accurate self-portrayal.

Administration and Scoring

Administration and scoring are easy and can be completed by a trained nonprofessional. The test can be administered individually or in group settings, and requires minimal supervision. It is untimed and has simple, straightforward instructions. The paper-and-pencil format takes 35 to 50 minutes, while computerized administration takes 25 to 35 minutes.

Online administration and scoring are available on the Internet via NetAssess. The 16PF Questionnaire is available in over 35 languages, and provides internet multilingual testing and scoring with the appropriate norms.

Hand scoring is quick and simple, and takes an experienced scorer only 6 or 7 minutes to complete. It requires a set of four scoring keys, a norm table, and an Individual Record Form. Detailed hand-scoring instructions are provided in the test administrator's manual.

Computer scoring can generate additional scores and information that enhance test interpretation. Test answer sheets may be mailed or faxed to the test publisher, or scored on a personal computer via the Internet or software.

Score Interpretation

Scores are presented as "stens" (standard-ten scores), which range from 1 to 10, with a mean of 5.5 and a standard deviation of 2. Sten 4 is considered low, sten 5 or 6 average, and sten 7 high. A+ indicates a high score (right pole) while A- indicates a low score (left pole) on the primary scale Warmth (A).
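The sten metric described above can be illustrated with a minimal sketch. It assumes a raw score is first standardized against a norm-group mean and standard deviation, then mapped to the sten scale (mean 5.5, standard deviation 2), rounded to the nearest whole sten, and limited to the range 1 to 10. The function names and the norm values in the usage note are hypothetical; in practice, stens are read from the published norm tables rather than computed this way.

```python
import math


def z_to_sten(z: float) -> int:
    """Map a standard (z) score to a sten: 5.5 + 2z, rounded half up, limited to 1..10."""
    sten = math.floor(5.5 + 2.0 * z + 0.5)  # add 0.5 and floor to round half up
    return max(1, min(10, sten))


def raw_to_sten(raw: float, norm_mean: float, norm_sd: float) -> int:
    """Standardize a raw scale score against (hypothetical) norm values, then convert to a sten."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    return z_to_sten(z)
```

For example, with an illustrative norm mean of 20 and standard deviation of 5, `raw_to_sten(21, 20, 5)` gives a sten of 6, and very high or very low raw scores are capped at 10 and 1 respectively.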

The 16PF primary and global scales are bipolar, with well-defined meanings at both poles rather than varying degrees of the scale. A high or low score on a scale is not regarded as good or bad. Rather, the score increases the likelihood that the trait defined at the pole will be distinctive of the individual's behavior. Whether that trait is determined to have positive or negative effects depends on the context.

According to Cattell and Schuerger (2003), interpreting the 16PF Questionnaire first requires the interpreter to consider all other sources of information about the person, in order to better understand the individual and the context in which the testing is taking place. These sources include the purpose of the testing, the person's anxiety about or unfamiliarity with tests, results from other tests, life history accounts, and interview data. The interpreter then evaluates the response indices, followed by the global and primary scores, determines scale interactions, and finally integrates all information in relation to the assessment question.

Individual scores are compared to the mean score based on the 16PF normative sample. Scores that fall outside the average range are the focus of interpretation. Extreme scores are central to the individual's identity.

Interpreting Primary Scales

According to the 16PF Fifth Edition Administrator's Manual (Russell & Karol, 2002), having two to seven extreme primary scores is within the average range. Extreme primary scores represent strong behavioral tendencies that may be difficult for the person to shift away from. The more extreme primary scores an individual has, the more well-defined his or her personality style will be. Possible explanations for a profile with few extreme scores are that the person's behavior is genuinely average; that the person has an unclear self-picture on certain traits and answered similar items in inconsistent directions; that the person tried to avoid making a poor impression; or that the person chose a relatively high number of b responses.

Interpreting Global Scales

According to the 16PF Fifth Edition Administrator's Manual (Russell & Karol, 2002), 86% of the general population have zero to two extreme global scores in their profiles, 10% have three, 3% have four, and fewer than 1% have extreme scores on all five global factors. People with three extreme global scale scores have an above-average number of distinctive traits, and those with four or five extreme global scores are rather unique in the distinctiveness of their personality.
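The counting rule above can be sketched as follows. The function names are hypothetical, and the default cutoffs (sten 4 or below as low, sten 7 or above as high) are an assumption taken from the Score Interpretation section; they are left as parameters so that other cutoffs can be supplied. The interpretive labels simply restate the manual's figures quoted above.

```python
def count_extreme(stens: dict, low_max: int = 4, high_min: int = 7) -> int:
    """Count scales whose sten falls at or beyond the (assumed) low/high cutoffs."""
    return sum(1 for s in stens.values() if s <= low_max or s >= high_min)


def describe_global_distinctiveness(n_extreme: int) -> str:
    """Restate the manual's population figures for a given number of extreme global scores."""
    if not 0 <= n_extreme <= 5:
        raise ValueError("there are five global scales, so n_extreme must be 0..5")
    if n_extreme <= 2:
        return "typical: zero to two extreme global scores (86% of the general population)"
    if n_extreme == 3:
        return "above-average number of distinctive traits"
    return "rather unique in the distinctiveness of personality"


# Illustrative (made-up) global sten profile
profile = {
    "Extraversion": 8,
    "Anxiety": 5,
    "Tough-Mindedness": 6,
    "Independence": 9,
    "Self-Control": 2,
}
n = count_extreme(profile)
summary = describe_global_distinctiveness(n)
```

With the made-up profile shown, three global scores count as extreme under the assumed cutoffs, so the profile falls in the "above-average number of distinctive traits" group.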

Primary scales may contribute to global scales in either a positive or a negative direction. For example, a high score on the global Extraversion scale can come from high scores on Warmth (A+), Liveliness (F+), or Social Boldness (H+), or from low scores on Privateness (N-) and Self-Reliance (Q2-).
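The direction of each contribution can be made concrete with a simplified sketch. It assumes a unit-weighted average in which negatively contributing scales are reversed on the sten scale (11 minus the sten) before averaging. This is an illustration only: the published global scores are derived from the scoring procedure in the manual, not from this formula, and the names below are hypothetical.

```python
# Direction in which each primary scale contributes to global Extraversion,
# as described in the text: A+, F+, H+ (positive) and N-, Q2- (negative).
EXTRAVERSION_CONTRIBUTORS = {"A": +1, "F": +1, "H": +1, "N": -1, "Q2": -1}


def unit_weighted_composite(primary_stens: dict, contributors: dict) -> float:
    """Average the contributing stens, reversing scales that contribute negatively."""
    values = []
    for scale, direction in contributors.items():
        sten = primary_stens[scale]
        # On a 1..10 scale, 11 - sten swaps the poles (1 <-> 10, 2 <-> 9, ...).
        values.append(sten if direction > 0 else 11 - sten)
    return sum(values) / len(values)


# Illustrative (made-up) primary stens: warm, lively, bold, forthright, group-oriented
stens = {"A": 8, "F": 7, "H": 9, "N": 3, "Q2": 2}
extraversion_estimate = unit_weighted_composite(stens, EXTRAVERSION_CONTRIBUTORS)
```

In this made-up profile the low Privateness and Self-Reliance scores raise the composite just as the high Warmth, Liveliness, and Social Boldness scores do.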

Scale Interaction

An essential part of 16PF interpretation is understanding which primary scales fit together to form the global scales. Two or more scores can also interact so that their combined meaning differs from that of each score alone; such combinations are called scale interactions or score patterns. For instance, with a score pattern of high Warmth (A+) and high Dominance (E+), the aggressive and overbearing qualities present in E+ individuals are softened by the tendency to be concerned about others and their feelings (A+). The individual is likely to be persuasive and socially facilitative rather than stubborn or domineering (Karson et al., 1997).

Interpreting Response Indices

Scores on the response style indices are presented as raw scores and percentiles rather than as sten scores. The 16PF manuals (Russell & Karol, 2002) suggest using the 95th percentile as the cutoff for a response set. If any of the indices is extreme, the interpreter should evaluate whether the individual's response set might be affecting the validity of the profile.
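A minimal sketch of this screening step is given below. It assumes that each index is already expressed as a percentile and that a score at or above the cutoff counts as extreme; whether the boundary value itself is included is an assumption, and the function name is hypothetical.

```python
def flag_response_indices(percentiles: dict, cutoff: float = 95.0) -> list:
    """Return the names of response-style indices at or above the cutoff percentile."""
    return [name for name, pct in percentiles.items() if pct >= cutoff]


# Illustrative (made-up) percentiles for the three indices
flags = flag_response_indices({"IM": 97.0, "INF": 40.0, "ACQ": 95.0})
```

A non-empty result does not invalidate the profile by itself; it signals that the interpreter should consider whether a response set is affecting the scores.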

Normative and Standardization Procedure

Information describing the norming process is vague. The current standardization sample was released in 2002 and is based on a stratified random sample of 10,261 individuals, matched to the U.S. Census data from 2000 for sex, race, and age (Cattell & Schuerger, 2003).


Test-retest Reliability

The 16PF Fifth Edition Technical Manual (Conn & Rieke, 1994, cited in Cattell & Schuerger, 2003) reports strong test-retest reliabilities, estimated on a sample of 204 people for a two-week interval and 159 people for a two-month interval.

Two-week test-retest estimates for the 16PF primary scales ranged from .69 to .87, with a mean of .80, while two-month test-retest reliabilities ranged from .56 to .79, with a mean of .69.

Two-week test-retest estimates for the global scales ranged from .84 to .91 with a mean of .87, and two-month test-retest estimates ranged from .70 to .82 with a mean of .78.
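Test-retest coefficients of this kind are conventionally Pearson correlations between the scores the same people obtain on two occasions. The sketch below shows that computation under this assumption; the function name and the data are made up for illustration and are not drawn from the manual.

```python
import math


def pearson_r(first: list, second: list) -> float:
    """Pearson correlation between scores from two administrations of the same scale."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long score lists with at least two people")
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    s12 = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    s11 = sum((a - mean_1) ** 2 for a in first)
    s22 = sum((b - mean_2) ** 2 for b in second)
    if s11 == 0 or s22 == 0:
        raise ValueError("scores must vary on both occasions")
    return s12 / math.sqrt(s11 * s22)


# Illustrative (made-up) stens for five people tested two weeks apart
time_1 = [4, 6, 5, 8, 7]
time_2 = [5, 6, 5, 9, 7]
retest_r = pearson_r(time_1, time_2)
```

Values close to 1 indicate that people keep nearly the same rank order across the two administrations, which is what the reported estimates summarize for each scale.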

International 16PF editions also show strong test-retest reliabilities. For instance, one-month estimates of primary scales had a mean of .83 for the German edition (Schneewind & Graf, 1998, cited in Cattell & Mead, 2008); .86 for the Danish edition (IPAT, 2004c, cited in Cattell & Mead, 2008); and .73 for the French edition (IPAT, 1995, cited in Cattell & Mead, 2008).

Internal Consistency

The test manual (Conn and Rieke, 1994, cited in Cattell & Schuerger, 2003) also reports good internal consistency for the 16PF scales. Estimated on a stratified random sample of 10,261 people, Cronbach's alpha ranged from .66 to .86, with a mean of .76. Internal consistency estimates are not provided for the global scales.
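For reference, Cronbach's alpha for a scale with k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements that standard formula with sample variances; the function name and the response data are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(responses: list) -> float:
    """Cronbach's alpha from a list of respondents, each a list of item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total scale score
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Illustrative (made-up) responses: four people, three items
alpha = cronbach_alpha([[1, 2, 1], [2, 1, 2], [3, 3, 2], [1, 1, 3]])
```

Alpha rises as the items covary more strongly and as the number of items increases.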


The three response style indices are used to measure validity.

Content-related Evidence: Factorial validity

Factorial validity of 16PF scales is particularly important as the 16PF Questionnaire is developed through factor analysis. Several factor-analytic studies have established strong support for the structure of the primary and global traits across diverse sample groups.

For instance, Hofer, Horn, and Eber (1997) found the factor structure to be robust across six diverse samples of a total of 30,732 individuals. Dancer and Woods (2007) found strong support for the global traits through factor analysis of the primary traits based on a sample of 4,414 business employees.

Factorial validity has also been confirmed in the international editions, for instance the German edition (Schneewind and Graf, 1998), Italian edition (Barbaranelli & Caprara, 1996), Chinese edition (Jia-xi and Guo-peng, 2006), French edition (Rolland and Mogenet, 1996), and Japanese edition (IPAT, 2007).

Construct-related Evidence: Convergent Validity

Convergent validity is established by the correlations between the 16PF scales and scales on other instruments. Strong relationships with other measures of personality help to validate the meanings of the 16PF scales. Such measures include the NEO-PI-R (Costa and McCrae, 1992a), the California Psychological Inventory (Gough, 1987), the Myers-Briggs Type Indicator (Myers and McCaulley, 1985), and the Personality Research Form (Jackson, 1989).

There is good convergent validity for international 16PF editions too. For instance, the German edition has strong correlations with the NEO-PI-R and the Personality Research Form (Schneewind and Graf, 1998).

Criterion-related Evidence: Predictive Validity

Predictive validity of the 16PF scales has been established by their usefulness in a range of settings, for instance employee selection, career development, clinical and counseling work, education, and research.

The 16PF scales have also been useful in understanding and predicting a range of areas, for instance leadership potential (Conn and Rieke, 1994), social skills (Conn and Rieke, 1994), creativity (Guastello and Rieke, 1993), and several occupational profiles (Cattell, R.B. et al., 1970; Conn and Rieke, 1994; Schuerger and Watterson, 1998; Walter, 2000).

