| Psychological Assessment: A Journal of Consulting and Clinical Psychology | © 1991 by the American Psychological Association, Inc. |
December 1991 Vol. 3, No. 4, 648-653 | For personal use only--not for distribution. |
Elevated T scores on the Minnesota Multiphasic Personality Inventory (MMPI) F scale have been found among Chinese normative and criterion groups. The elevation may be attributed to cultural differences in endorsement percentages: Chinese subjects in the People's Republic of China (PRC) and Hong Kong tend to endorse the F scale items more frequently than the scale's original design assumed. To construct a Chinese infrequency scale, 2 sets of items (ICH 1 and ICH 2) were chosen, using 10% and 15% endorsement percentages, respectively, of the combined group of Hong Kong and PRC subjects as the criteria for item selection. These infrequency scales were validated using criterion groups in Hong Kong and the PRC, including psychiatric patients, prisoners, fake-good subjects, and fake-bad subjects.
Studies using the Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1967) with Chinese subjects have invariably reported elevated scores on Scale F for normative and criterion groups when the American norm is used. These elevations were observed whether the English version was taken by Chinese American college students (Sue & Sue, 1974) or the Chinese version was taken by "normals" as well as by psychiatric patients in the People's Republic of China (PRC) and in Hong Kong (Cheung, 1985, 1986; Cheung & Song, 1989; National MMPI Coordinating Group, 1982). For the Chinese MMPI normative sample in the PRC, respective T scores for male and female subjects were 75 and 69 (National MMPI Coordinating Group, 1989), whereas for the subset of college students within this Chinese sample, the T scores were 74 (men) and 67 (women). In Hong Kong, the college-sample T score on F for both male and female subjects was 68. For psychiatric patients, the elevation was even more pronounced, with T scores ranging from 70 to 100 (Cheung & Song, 1989).
The elevation on Scale F has raised a number of problems in the administration and interpretation of the Chinese MMPI. The use of a high F or the F − K index (Gough, 1950) to invalidate profiles would have excluded an exceedingly large number of cases. Similar problems with attrition of data have been reported in earlier studies in the U.S. when high F scores (F > 16) alone were used to exclude profiles (Gynther, 1961; Gynther & Shimkunas, 1965). These criteria have been suspended from the exclusion rule in research with the Chinese MMPI in China and Hong Kong. Instead, only the test-retest (TR) index (Dahlstrom, Welsh, & Dahlstrom, 1975) was found to be useful in identifying subjects who might be checking randomly or were disorganized in their responses.
Taken at face value, the elevated F scores on MMPI profiles of Chinese individuals may be interpreted as indicative of malingering, confusion, carelessness, or openness to endorsing psychological disturbance. An examination of the original construction of this validity scale raises doubts about the validity of such an interpretation for Chinese subjects. Unlike most of the other basic scales that were constructed empirically by comparing criterion groups with the normative sample, the F scale was developed statistically by including those items that were endorsed infrequently by the normative sample (Dahlstrom, Welsh, & Dahlstrom, 1975). Sixty-four items were chosen on the basis of no more than 10% endorsement by men and women in the Minnesota normative sample. With the American normative sample, the mean raw score on Scale F was 3 out of 64 items (Hathaway & McKinley, 1967, Table 7). For the Chinese samples, the average man would be endorsing 11 to 14 items, whereas the average woman would be endorsing 11 to 12 items on the F scale (Cheung, 1985; National MMPI Coordinating Group, 1989). This suggests that the items on the F scale are not as infrequently endorsed by the Chinese subjects as they are supposed to be.
Table 1 lists the endorsement percentages of true and false items on the F scale by the Chinese normative sample in the PRC, the college sample in Hong Kong, and the college sample in the U.S. (Loper, Robertson, & Swanson, 1968). For the American college sample, the endorsement rate exceeds 10% for 11 items among men and 7 items among women. In contrast, the number of items exceeding the 10% endorsement rate is much higher among the Chinese and Hong Kong subjects. For the Chinese men and women, there are 45 and 42 such items, respectively, whereas for both the male and female Hong Kong college students, the total number of items exceeding 10% endorsement is 37. Some of the infrequent items on the original F scale are endorsed by a majority of the subjects in the PRC or Hong Kong, nullifying their function as "infrequent" items. The mean endorsement percentages for the PRC and the Hong Kong subjects are all significantly higher than those for the U.S. subjects. There is no significant difference between the Chinese subjects from the PRC and those from Hong Kong.
Items that were endorsed most differently between the American subjects and the two groups of Chinese subjects include religious items (Item 115, I believe in a life hereafter, and Item 258, I believe there is a God ), sex items (Item 20, My sex life is satisfactory ), and activity level (Item 40, Most any time I would rather sit and daydream than do anything else ). The discrepancy between American and Chinese subjects in endorsing these items is more likely a reflection of cultural differences in beliefs, values, and practices than in deviance.
Although discrepancies in item endorsement have been found in other MMPI basic scales (Cheung, 1985), the elevations on such scales may be adjusted when the Chinese norm is applied (Cheung & Song, 1989). The construction of the F scale, however, differed from the other scales in that items were selected entirely on the basis of infrequency of endorsement. The meaning of the scale will be altered if the original F scale is simply renormed.
To construct a parallel infrequency scale on the Chinese MMPI, a procedure similar to that adopted for the original MMPI F scale was used. Whereas an endorsement rate of no more than 10% was used in the original F scale construction, different criteria for item selection were tried out on several versions of the Chinese infrequency scale. In this study, the new scales were rescored on groups of subjects, including patients, prisoners, and subjects who were asked to fake good or to fake bad.
The normative subject groups in both the PRC and Hong Kong were used to provide a cross-national sample of Chinese subjects for item selection. The normative sample for the Chinese MMPI in the PRC consisted of 1,553 men and 1,516 women and was collected from a national stratified sample (National MMPI Coordinating Group, 1989). The sample covered different age groups, various educational levels from Grade 6 upward, and six geographical regions. The Hong Kong sample consisted of 334 male and 271 female college students whose group profiles were found to be similar to those of less educated factory workers (Cheung, 1985). All of the cases had been screened for validity on the basis of the TR index (Dahlstrom, Welsh, & Dahlstrom, 1975).
Instrument. Except for minor variations in a few items in terms of linguistic expression or adaptation of the contents to fit the local context, the Chinese versions of the MMPI in Hong Kong and in the PRC are highly similar. The PRC version was translated using the Hong Kong version as a blueprint. Details of the translation procedures of the two versions have been reported by Cheung (1985) and the National MMPI Coordinating Group (1982). The overall Pearson product-moment correlations between the two Chinese versions and the original MMPI were found to be quite high, supporting the cross-cultural equivalence of the translated versions.
Procedure. The endorsement percentages of all the MMPI items for men and women were listed for the PRC and the Hong Kong samples. Two selection criteria were tried out: For the first infrequency scale for China and Hong Kong (ICH 1), items that were endorsed by no more than 10% of both the Hong Kong and the PRC samples were included; for ICH 2, items that were endorsed by no more than 15% of both the Hong Kong and the PRC samples were included. Earlier versions using only items selected from the Hong Kong sample or the PRC sample alone were found to be less satisfactory when applied to the other sample.
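The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_infrequent_items`, the dictionary layout, and the toy percentages are hypothetical and are not data from the study. The sketch also assumes that each percentage refers to the keyed (deviant) response and that the cutoff must hold in every sample supplied; whether the cutoff was applied separately to men and women is not specified in the text.

```python
def select_infrequent_items(pct_by_sample, cutoff):
    """Return item numbers endorsed by no more than `cutoff` percent
    in every sample (hypothetical helper, not the authors' code)."""
    samples = list(pct_by_sample.values())
    # Only items that were tabulated in all samples can qualify.
    common = set(samples[0]).intersection(*samples[1:])
    return sorted(
        item for item in common
        if all(sample[item] <= cutoff for sample in samples)
    )


# Toy endorsement percentages keyed by item number (illustrative only).
toy = {
    "PRC": {1: 4.0, 2: 12.5, 3: 9.0, 4: 30.0},
    "HK": {1: 8.0, 2: 14.0, 3: 11.0, 4: 55.0},
}

ich1_like = select_infrequent_items(toy, 10.0)  # 10% criterion -> [1]
ich2_like = select_infrequent_items(toy, 15.0)  # 15% criterion -> [1, 2, 3]
print(ich1_like, ich2_like)
```

Requiring the cutoff to hold in both samples mirrors the observation that scales built from one sample alone transferred less well to the other.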
Validation. The derived infrequency scales were rescored on different criterion groups of subjects who had been tested in previous studies in Hong Kong and in the PRC. The Hong Kong groups included 91 male and 49 female psychiatric patients, 89 male prisoners, 50 male and 41 female college students who had been asked to fake good, and 52 male and 41 female students who had been asked to fake bad on the MMPI. The PRC groups included 125 male convicted murderers, 36 male and 31 female obsessive-compulsive patients, 128 male and 103 female manic-depressive patients, and 121 male and 64 female schizophrenic patients. All of the cases had been screened for validity on the basis of the TR index. Linear T scores for the Hong Kong groups were derived by using the Hong Kong college sample as the norm on each of the new infrequency scales. Similarly, T scores for the PRC groups were derived using a PRC college sample. In addition, the raw scores for each of the infrequency scales were prorated to simulate the same number of items as in the original F scale for the purpose of comparison. These prorated raw scores were converted to T scores using the American norm.
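The two scoring steps in this paragraph can be sketched as follows. The linear T score is the standard transformation with a mean of 50 and a standard deviation of 10 in the norm group. The proration shown here is simple proportional scaling to the 64 items of the original F scale; the text does not give the exact proration formula, so that part is an assumption. The function names and the numbers in the usage lines are hypothetical, not values from the study.

```python
def linear_t(raw, norm_mean, norm_sd):
    """Linear T score: 50 plus 10 times the z score in the norm group."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def prorate(raw, n_items, target_items=64):
    """Scale a raw score to a scale of `target_items` items.

    Assumes simple proportional scaling; 64 is the length of the
    original F scale as stated in the text.
    """
    return raw * target_items / n_items


# Hypothetical example: 3 items endorsed on a 15-item scale.
prorated = prorate(3, 15)                       # 12.8 on a 64-item basis
# Norm mean and SD below are placeholders, not published norms.
t_score = linear_t(prorated, norm_mean=5.0, norm_sd=4.0)
print(prorated, t_score)
```

Prorating before conversion puts the shorter scales on the same raw-score metric as the 64-item F scale, so that a single set of American norms can be applied for comparison.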
Items on the two infrequency scales derived on the basis of the endorsement percentage of the Hong Kong and the PRC normative subjects are listed in the Appendix. For ICH 1, there are 15 items with an 80.0% overlap with the original F scale. For ICH 2, there are 37 items with a 67.6% overlap. The contents of the items resembled those found in the original F scale, including bizarre sensations, peculiar thoughts, strange experiences, sense of isolation, antisocial behavior, and isolation from family. The additional items not included in the original F scale covered mostly symptoms of physical illness.
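The overlap percentages can be turned back into item counts with a short check. The counts themselves are not reported in the text; the sketch below merely infers which whole number of shared items is consistent with each reported percentage, assuming the percentage is the number of shared items divided by the length of the new scale and rounded to one decimal place. The function name is hypothetical.

```python
def implied_overlap_counts(n_items, reported_pct):
    """Return every count k for which k / n_items rounds to reported_pct."""
    return [
        k for k in range(n_items + 1)
        if round(100 * k / n_items, 1) == reported_pct
    ]


print(implied_overlap_counts(15, 80.0))  # ICH 1 -> [12]
print(implied_overlap_counts(37, 67.6))  # ICH 2 -> [25]
```

Under that assumption, 12 of the 15 ICH 1 items and 25 of the 37 ICH 2 items would also appear on the original F scale, leaving 3 and 12 additional items, respectively.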
The scores for the Hong Kong subjects on each of the infrequency scales are listed in Table 2. Those for the PRC subjects are listed in Table 3. When the U.S. norm was used on the original MMPI F scale, the scores of the normative group as well as the fake-good group were all moderately elevated. The psychiatric and delinquent groups were even more elevated, with the mean T score for the Hong Kong female patients at 80, and up to 90 for the Chinese female manic-depressive patients. These elevations were largely suppressed when the T scores were normed using the respective local Chinese sample. The question of whether cross-cultural adjustment could have been made by using the original F scale with a local norm then arises. Even though the renorming of the F scale would have lowered the elevation found among Chinese subjects, the interpretative problem would still have remained concerning the meaning of the score on the F scale, which was originally based on empirical deviation from the norm.
The statistically derived infrequency scales on the Chinese MMPI were then examined for their ability to discriminate between the normative and the criterion groups. On both infrequency scales, the fake-bad groups scored extremely high, with scores much more elevated than those of the patients or the prisoners. The fake-good groups, however, did not score significantly lower than did the normative group subjects. This pattern is consistent with other fake-good studies, especially when the present group of subjects were only instructed to respond in a way in which they would appear to be "normal" and well adjusted.
On ICH 1, patients and prisoners in Hong Kong scored significantly higher than did the college students. The elevation among these criterion groups was only moderate. For ICH 2, no significant difference was found between the normals and the male patients or between the normals and the male prisoners in Hong Kong. However, significant differences were found between the normals and all of the criterion groups in the PRC for both ICH 1 and ICH 2. When the raw scores on these infrequency scales were prorated and converted to T scores using the American norm, the scores were all lower than those that were obtained on the original F scale. However, the converted ICH 2 T scores for a few patient groups in the PRC remained on the high side.
The consistently elevated F score on the Chinese MMPI may be attributed to the cultural differences in item endorsement pattern between the American normative groups and the Chinese subjects. One way to adjust the discrepancy is to rescore the F scale using local norms. This adjustment, however, ignores the original rationale of the F scale construction and limits the interpretation of the meaning of score elevation.
Using the same statistical rationale for the original F scale, we constructed scales that consist of infrequently endorsed items. Two endorsement criteria were used: (a) no more than 10% endorsement as in the original F scale and (b) no more than 15% endorsement, which is being used in the development of the new Adolescent Form of the MMPI-2. ICH 1 and ICH 2 were derived from the combined consideration of endorsement percentages of Chinese subjects in Hong Kong as well as in the PRC. For validation, Hong Kong and PRC data sets consisting of students, psychiatric patients, prisoners, and fake-good and fake-bad subjects were rescored on the two newly constructed infrequency scales.
The results showed that both infrequency scales were able to discriminate between normative group subjects and fake-bad subjects. Although the number of items on the scales was less than that of the original F scale, proration of the raw scores provided T scores on the American norm that were much more moderate than the original T scores. On the Chinese infrequency scales, the T scores of the fake-bad groups were all extremely elevated beyond 120 whether scored using the Hong Kong or the U.S. norm. The fake-good groups, on the other hand, showed scores similar to those of the normal student groups. The lack of difference between the fake-good groups and the normal groups may be due to the instructions used in the original study in which the subjects were just asked to present themselves as normal and well adjusted.
The choice between ICH 1 and ICH 2 is more difficult. On the one hand, ICH 1 used the more stringent endorsement criterion of 10% adopted by the original F scale. However, the total number of items selected for this scale was only 15. ICH 2, on the other hand, used the 15% endorsement criterion and contained a total of 37 items. This higher endorsement criterion is similar to the new Adolescent Form of the MMPI, which adopts 20% as the criterion and contains 66 items. The contents of items in ICH 1 and ICH 2 were similar in nature, including bizarre sensations, peculiar thoughts, strange experiences, physical symptoms, antisocial behavior, and alienation from parents. Results from the validation study in Hong Kong and the PRC suggest that ICH 1 may provide better discrimination between normals and patients within a valid range. Despite its shorter length, ICH 1 may be preferred as the Chinese infrequency scale.
All item numbers are based on Form R of the MMPI.
ICH 1
True items
47, 49, 56, 85, 123, 151, 184, 209, 210, 246, 360
False items
177, 192, 220, 272
ICH 2
True items
14, 27, 29, 47, 49, 50, 56, 61, 85, 114, 123, 151, 184, 197, 200, 202, 206, 209, 210, 227, 246, 273, 275, 291, 350, 360
False items
17, 65, 177, 187, 192, 193, 196, 220, 272, 405, 486