Stroop Effect Differences of Native and Non-Native Japanese Speakers

Aaron Tiesling-Rusch
Andrew Dimond
Beloit College

Abstract

An experiment was carried out to examine the differences in Stroop effect between native Japanese speakers with knowledge of English and English speakers with knowledge of Japanese. Three separate numeric Stroop tests were administered to participants (N = 38), each written in different graphemes: Japanese kanji, Japanese hiragana, and the English alphabet. There were no significant differences in Stroop effect between the two groups, regardless of which graphemes were used.


The original Stroop test is simple. Participants are presented with color words, such as Red, Blue, and Orange, and are asked to name the ink color in which each word is printed. When the Stroop task was first devised, its purpose was to test interference with a person's attention. The participant sees the word Blue and wants to say Blue but has to ignore the meaning of the word and instead say what color it is written in (Stroop, 1935). Since then, the task has been used in several cultures and languages to test its validity (MacLeod, 1991). The overall idea of any Stroop task is to ignore the meaning of the word and respond according to another feature of the stimulus, be it color, quantity, or size (Konkle & Oliva, 2012). One of the simpler variations is the numeric Stroop task (Windes, 1968). This task presents a number word repeated some number of times and asks the participant to report how many words there are. For instance, the trial "Three Three" would require the response two.

Like the original Stroop task, the numeric task has congruent and incongruent trials. A congruent trial contains one instance of the word One, two instances of the word Two, or three instances of the word Three. Incongruent trials are any other combination, such as three instances of Two. Because reading has trained participants to attend automatically to the meaning of a word rather than to the number of times it appears, they should respond more slowly on incongruent trials. The difference between the time it takes a participant to respond correctly to incongruent trials and to congruent trials is known as the Stroop effect. The Stroop effect, therefore, is a simple measure of how long it takes people to inhibit their automatic response to certain words.
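
The definitions of congruency and of the Stroop effect can be sketched in a few lines of Python. This is only an illustration: the function names, the tuple layout, and the reaction times are hypothetical and are not taken from the study's program or data.

```python
from statistics import mean

# Number words used as stimuli, mapped to the quantity each one names.
WORD_VALUES = {"One": 1, "Two": 2, "Three": 3}


def is_congruent(word: str, count: int) -> bool:
    """A trial is congruent when the word's meaning equals how many times it is shown."""
    return WORD_VALUES[word] == count


def stroop_effect(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials.

    `trials` is a list of (word, count, rt_ms) tuples for correct responses only.
    """
    congruent = [rt for word, count, rt in trials if is_congruent(word, count)]
    incongruent = [rt for word, count, rt in trials if not is_congruent(word, count)]
    return mean(incongruent) - mean(congruent)


# Made-up reaction times in milliseconds (illustrative only).
example_trials = [
    ("Two", 2, 700.0),    # congruent
    ("Three", 3, 720.0),  # congruent
    ("Three", 2, 750.0),  # incongruent: correct answer is 2
    ("One", 3, 740.0),    # incongruent: correct answer is 3
]
effect = stroop_effect(example_trials)
```

A positive value of `effect` means that incongruent trials were answered more slowly than congruent ones.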

As mentioned previously, because of its simplicity the Stroop task has been translated into many different languages (Hatta, 1981). It has been heavily tested with logographic languages, which use symbols in place of certain words, such as Japanese and Chinese (Chen & Ho, 1986; Vaid, 1985). Variations of the Stroop task, such as the numeric Stroop task, have not been given as much attention abroad (Zhou et al., 2007). Likewise, there is little research comparing Stroop effects between an individual's primary and secondary languages.

This study set out to be one of the first to test what effect primary versus secondary language has on the Stroop effect in a numeric Stroop task. Specifically, we looked at how native Japanese speakers who knew English differ from non-native Japanese speakers who knew Japanese in their responses to a numeric Stroop test. Because both groups had knowledge of both English and Japanese, the test was written in both languages. However, the Japanese language has more than one system of writing. The most used system is logographic and is called kanji. The second most used system, called hiragana, is a syllabary in which each symbol represents a sound (such as shi). Numbers are among the first kanji that people learning Japanese are taught, so we used the numeric Stroop test in our study. Morikawa found that Japanese speakers showed a different Stroop effect depending on which writing system was used (Moriguchi & Morikawa, 1998; Morikawa, 1981). We wished to replicate Morikawa's findings and therefore used all three systems of writing, or graphemes.

We predicted that English speakers, because of their proficiency in English, would have a larger Stroop effect for the Stroop test written in English. Although they would have knowledge of Japanese, we expected them to show no significant difference in Stroop effect between kanji and hiragana. Conversely, we expected Japanese speakers to show a larger Stroop effect for kanji and hiragana than for English. We also expected to replicate Morikawa and find a difference between hiragana and kanji in Japanese speakers. More broadly, we wished to see whether the Stroop effect, and thus the time it takes to inhibit learned responses, differs between primary and secondary languages.

Method

Participants

The participants were 20 native Japanese speakers and 18 non-native Japanese speakers (N = 38). The Japanese-speaking participants were students at Akita International University (AIU) in Akita City, Akita Prefecture, Japan. They were recruited via word of mouth and posters. The English-speaking participants were students at Beloit College in Beloit, Wisconsin, in the United States. They were recruited by word of mouth and email campaigns.

Research Materials

All data were recorded using a program that presented participants with stimuli, such as "Two Two Two," and recorded how long it took them to answer. The program was written in JavaScript and HTML and was delivered on a laptop running Windows 7.

Procedure

Prior to the Stroop task, participants completed a short survey asking for their native language, years of study of their foreign language, and gender. Following the survey and the consent form, 10 practice trials were given to ensure that the participants understood how the program worked. These practice trials were not recorded because it was often during these trials that the participants had the most questions. The trials were simple numeric Stroop tasks: a stimulus word (One, Two, or Three) was shown one, two, or three times. The participants were asked to input, via the number pad, how many copies of the stimulus appeared: 1 if only one was on the screen, 2 if two appeared, and 3 if there were three. Therefore, if the participants saw "Three Three" on the screen, they were asked to press 2, as that is the quantity. Participants were instructed to respond as quickly and accurately as possible. The Stroop effect can only be measured from correct responses, so if a participant answered a trial incorrectly, that trial was thrown out and replaced with another random trial. Participants had to answer 40 trials correctly in order to proceed to the next section. Each section was structured in exactly the same way, including the practice trials. The only difference was that one section was in English, another in hiragana, and another in kanji. The order of the sections was randomly chosen for counterbalancing purposes.
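
The replacement rule for incorrect trials can be sketched as a simple loop. The sketch below is written in Python rather than the JavaScript of the actual program, and it rests on assumptions not stated in the text: trials are drawn uniformly at random, `respond` is a hypothetical callback standing in for the participant's keypress, and the simulated participant and its fixed reaction times are invented for illustration.

```python
import random

WORDS = {"One": 1, "Two": 2, "Three": 3}
REQUIRED_CORRECT = 40  # correct responses needed to finish a section


def random_trial(rng):
    """Pick a number word and how many times (1 to 3) it is shown.

    Uniform sampling is an assumption; the original sampling scheme is not stated.
    """
    return rng.choice(list(WORDS)), rng.randint(1, 3)


def run_section(respond, rng, required=REQUIRED_CORRECT):
    """Collect trials until `required` correct responses have been recorded.

    `respond(word, count)` is a hypothetical callback returning (answer, rt_ms).
    Incorrect trials are discarded and replaced by a new random trial.
    """
    kept = []
    while len(kept) < required:
        word, count = random_trial(rng)
        answer, rt_ms = respond(word, count)
        if answer == count:
            kept.append((word, count, rt_ms))
    return kept


def simulated_participant(rng, error_rate=0.1):
    """Stand-in for a real participant: usually answers correctly, sometimes errs."""
    def respond(word, count):
        if rng.random() < error_rate:
            wrong_answers = [n for n in (1, 2, 3) if n != count]
            return rng.choice(wrong_answers), 800.0
        return count, 750.0
    return respond


rng = random.Random(0)
section = run_section(simulated_participant(rng), rng)
```

Because errors are replaced rather than counted, every section ends with exactly 40 usable reaction times, although the total number of trials shown can vary between participants.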

Results

Reaction time data generally are not normally distributed. Therefore, before any analysis was done, a logarithmic transformation was applied to the data (Whelan, 2008). We first tested whether the task actually produced a Stroop effect. A paired-samples t-test confirmed a Stroop effect (M = 27 ms): congruent trials were responded to more quickly than incongruent trials (t(4558) = 3.61, p < .001). An analysis of variance (ANOVA) found no significant difference in Stroop effect between groups (F(2, 11) = 2.089, p = .124). Both groups showed similar Stroop effects for every grapheme used. However, a post-hoc test found that English trials (M = 734 ms) were significantly faster than hiragana trials (M = 767 ms; p = .001) and kanji trials (M = 776 ms; p = .003), whereas hiragana and kanji trials were not significantly different (p = .956). A second ANOVA tested the difference in reaction times between native and non-native Japanese speakers. Japanese speakers reacted faster, regardless of test, than their non-native Japanese-speaking counterparts (F(1, 11) = 170.7, p < .001).
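
The first two analysis steps, the logarithmic transformation and the paired-samples t statistic, can be sketched with the Python standard library alone. Several things here are assumptions: the text does not say which logarithm base was used (the natural log is used below), how observations were paired, or which software ran the tests, and the reaction times are made up rather than drawn from the study.

```python
import math
from statistics import mean, stdev


def log_transform(rts_ms):
    """Natural-log transform of reaction times to reduce right skew."""
    return [math.log(rt) for rt in rts_ms]


def paired_t(xs, ys):
    """Paired-samples t statistic and degrees of freedom for the differences xs - ys."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Made-up paired reaction times in milliseconds (not the study's data).
incongruent_ms = [780.0, 760.0, 802.0, 745.0, 770.0]
congruent_ms = [750.0, 741.0, 760.0, 730.0, 748.0]

t_stat, df = paired_t(log_transform(incongruent_ms), log_transform(congruent_ms))
```

A positive `t_stat` indicates slower responses on incongruent trials; the p value would then be read from a t distribution with `df` degrees of freedom.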

Discussion

We began our experiment with the hypothesis that Japanese speakers would show a larger Stroop effect with kanji than with English. Likewise, we hypothesized that non-native Japanese speakers would show a larger Stroop effect for English words than for Japanese words. We also expected to replicate Morikawa's findings by showing a difference in Stroop effect between hiragana and kanji for native speakers. We found no such evidence. Our results show no differences in Stroop effect between the two groups for any of the graphemes. This suggests that inhibition of automatic responses does not differ between an individual's primary and secondary language. Common sense suggests that Japanese speakers would be more specialized in Japanese than in English and would therefore show more of a Stroop effect for their native language. Likewise, English speakers would be expected to be more specialized in English than in Japanese and to show more of a Stroop effect for their native language. However, this was not the case; participants showed similar Stroop effects for both primary and secondary languages.

There are several possible reasons for this result. Certainly, the simplicity of numeric kanji could be a factor. Even an individual with no knowledge of kanji would be able to tell which numbers 一, 二, and 三 represent. However, hiragana was also used in the study and did not differ from kanji in terms of Stroop effect. A more immediate explanation may be that numbers are taught near the beginning of every language course, whereas colors are not taught in Japanese kanji until the third or fourth year of study. This familiarity with numbers in both languages may have caused the similar Stroop effects. The result is still striking: numbers in a second language that had been studied for only a year showed the same Stroop effect as numbers that had been studied and used in everyday life for almost 20 years.

The fact that we did not replicate Morikawa's results is an important point. Morikawa studied the Stroop effect using the classic Stroop task and found that Japanese speakers showed a larger Stroop effect for kanji than for hiragana. The only difference between the Morikawa experiment and ours was the use of numbers instead of colors. Our failure to replicate those findings could point to a difference in how Japanese people perceive numeric kanji. It could also be explained by our small sample size and low statistical power: our experiment may not have had as much power as Morikawa's. Even with low power, however, we were able to replicate Lynn and Shigehisa's findings regarding reaction times. Lynn and Shigehisa (1991) found that Japanese children had significantly faster reaction times than British children. We, too, found that the Japanese participants reacted faster to stimuli than their non-native-speaking counterparts. Given our relatively low power, we can conclude only that any difference in Stroop effect between the groups in this study is either much smaller than we could detect or does not exist.

References

Chen, H., & Ho, C. (1986). Development of Stroop interference in Chinese-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 397-401. 

Hatta, T. (1981). Differential processing of kanji and kana in Japanese people: Some implications from Stroop test results. Neuropsychologia, 19, 87-93.

Konkle, T., & Oliva, A. (2012). A familiar-size Stroop effect: Real-world size is an automatic property of object representation. Journal of Experimental Psychology: Human Perception and Performance, 38, 561-569.

Lynn, R., & Shigehisa, T. (1991). Reaction times and intelligence: A comparison of Japanese and British children. Journal of Biosocial Science, 23, 409-416.

MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163-203.

Moriguchi, K., & Morikawa, Y. (1998). Time course analysis of the reverse-Stroop effect in Japanese kanji. Perceptual and Motor Skills, 87, 163-174.

Morikawa, Y. (1981). Stroop phenomena in the Japanese language: The case of ideographic characters and syllabic characters. Perceptual and Motor Skills, 53, 67-77.

Stroop, J. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643-662.

Vaid, J. (1985). Numerical size comparisons in phonologically transparent script. Attention, Perception & Psychophysics, 37, 592-595.

Whelan, R. (2008). Effective analysis of reaction time data. The Psychological Record, 58, 475-482.

Windes, J. D. (1968). Reaction time for numerical coding and naming of numerals. Journal of Experimental Psychology, 78, 318-322.

Zhou, X., Chen, Y., Chen, C., Jiang, T., Zhang, H., & Dong, Q. (2007). Chinese kindergarteners' automatic processing of numerical magnitude in Stroop-like tasks. Memory & Cognition, 35, 464-470.




©2002-2021 All rights reserved by the Undergraduate Research Community.
