Journal of Linguistics and Language Teaching (JLLT)
Edited by Thomas Tinnefeld
Volume 5 (2014) Issue 2, pp. 161-180




The Maze Task:
Examining the Training Effect of Using a Psycholinguistic Experimental Technique for Second Language Learning


Elizabeth Enkin (Lincoln (Nebraska), USA) / Kenneth Forster (Tucson (Arizona), USA)


Abstract (English)
The maze task is a psycholinguistic procedure that measures “real time” sentence processing. However, unlike other psycholinguistic tasks, it is unique in that it forces incremental processing of a sentence. This study therefore explores the task’s merits in a rather different arena, namely as a language training program for beginner (second semester) Spanish learners. Through a maze training-test paradigm, results showed that maze training can assist in developing procedural representations. Specifically, as compared to learners who were trained on Spanish structures that were similar to English (the learners’ L1), those trained on structures that differed from English showed comparable reaction times to both structure types. There was also a carryover effect from this training to a posttest (an untimed grammaticality judgment task), which suggests that maze training can also help develop explicit knowledge of a language. Moreover, results on a production-based paper-and-pencil pretest-posttest indicated that learners showed significant improvement after maze training. Lastly, questionnaire results demonstrated learners’ enthusiasm for the task.
Key words: Second language learning, psycholinguistics, Spanish language learning, maze task


Abstract (Español)
El "maze task" (en español, la tarea del laberinto) es un método psicolingüístico que mide el procesamiento de las oraciones en “tiempo real”. Sin embargo, a diferencia de otras técnicas psicolingüísticas, tiene la particularidad de forzar el procesamiento de una oración de una manera gradual. Por lo tanto, este estudio examina la efectividad de este método en un área diferente, específicamente como un programa de ayuda en el aprendizaje de lenguas para los estudiantes principiantes (de segundo semestre) de español. Utilizando un paradigma de instrucción-examen, los resultados mostraron que el entrenamiento con el maze task puede ayudar con el desarrollo de representaciones implícitas. Específicamente, en comparación con los estudiantes que fueron entrenados con estructuras españolas similares a estructuras del inglés (la L1 de los estudiantes), los estudiantes que fueron entrenados con estructuras diferentes a las del inglés, mostraron, para ambos tipos de estructuras, tiempos de reacción comparables. Asimismo, este tipo de entrenamiento tuvo un efecto adicional (en una tarea de juicios de gramaticalidad sin límite de tiempo). Este efecto implica que el entrenamiento con el maze task también puede ayudar con el desarrollo de conocimiento explícito de una lengua. Además, los estudiantes completaron un pretest-posttest a mano que examinó habilidades de producción. Los resultados del pretest-posttest mostraron que los estudiantes mejoraron significativamente. Finalmente, los resultados de un cuestionario mostraron una reacción positiva de los estudiantes hacia el maze task.
Palabras clave: Aprendizaje de segunda lengua, psicolingüística, aprendizaje de lengua española, maze task



1 Introduction

1.1 General Remarks

The maze task is a technique used in psycholinguistic experiments that is designed to measure reaction times as subjects read and comprehend sentences (Forster 2010, Forster, Guerrera & Elliot 2009, J. Witzel & Forster 2014, N. Witzel, J. Witzel & Forster 2012, Qiao, Shen & Forster 2012). The task requires subjects to 'weave' their way through a sentence word by word by choosing the correct grammatical alternative from two choices (thus the name 'maze'). As is common in psycholinguistic tasks, the maze task requires subjects to complete the activity as quickly as possible, but not so quickly that a mistake is made, thereby measuring real-time (or more formally, 'online') processing speed as opposed to other types of tasks that allow subjects more time to reflect on their responses (referred to as 'offline' tasks).

In this task, two words are presented side by side, and the participant is asked to decide which word could correctly continue the sentence. One of the alternatives is the correct choice, while the other choice would be ungrammatical and unnatural when taking into consideration the previous words that have already been chosen. Participants respond by pressing either the left arrow key, which indicates that they are choosing the word on the left-hand side of the screen, or the right arrow key for the word on the right-hand side. When a correct response is selected, another two words appear on the screen and this procedure continues until the entire sentence has been constructed. An example item can be found in Figure 1 below:


Figure 1. A sample maze task sentence, frame by frame
In each frame there is only one correct continuation to the sentence. Subjects view each frame separately and when a correct choice is made, they are then able to view the next frame.

In the first frame of the sentence in Figure 1, the participant would see the first word (in this case “The”) on the left-hand side of the screen and alongside it “x-x-x”, which signals that this is the beginning of the sentence and that he / she is free to press any key. Once this happens, the subsequent frame is presented (in this case “walked school”), which replaces the previous frame. For the remainder of the frames, the correct sequence would be left and right, which would make the sentence The school needs books. In experiments that utilize this task, when an error occurs (that is, when an incorrect alternative is erroneously chosen), the trial is aborted1.
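
To make the trial structure concrete, the following is a minimal, hypothetical console sketch (written in R, the software environment later used for the analyses) of how a single maze item unfolds: each frame pairs the correct continuation with a foil, the side on which the correct alternative appears is randomized, a response time is recorded, and an error aborts the trial. This is an illustration only, not the DMDX implementation used in the study, and the foils beyond the second frame are invented for the example.

# Hypothetical console sketch of a single maze-task item (illustration only; the study
# itself used the DMDX software, not this code).
run_maze_item <- function(frames) {
  for (i in seq_along(frames)) {
    correct <- frames[[i]]$correct
    foil    <- frames[[i]]$foil
    left_is_correct <- sample(c(TRUE, FALSE), 1)        # randomize side of correct word
    left  <- if (left_is_correct) correct else foil
    right <- if (left_is_correct) foil else correct
    t0  <- Sys.time()
    ans <- readline(sprintf("%-12s | %-12s   choose (l/r): ", left, right))
    rt  <- as.numeric(difftime(Sys.time(), t0, units = "secs")) * 1000   # RT in ms
    if ((tolower(ans) == "l") != left_is_correct) {      # wrong alternative chosen
      cat(sprintf("Error at frame %d - trial aborted.\n", i))
      return(invisible(FALSE))
    }
    cat(sprintf("Correct (%.0f ms)\n", rt))
  }
  cat("Sentence completed.\n")
  invisible(TRUE)
}

# Frames for "The school needs books." (the initial "The / x-x-x" frame is omitted)
frames <- list(
  list(correct = "school", foil = "walked"),
  list(correct = "needs",  foil = "into"),     # hypothetical foil
  list(correct = "books.", foil = "write.")    # hypothetical foil
)
# run_maze_item(frames)   # run in an interactive R session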

Unlike traditional procedures that measure online sentence processing time, the maze task is unique because it forces the reader into an incremental processing mode. As Forster et al. (2009) explain, this means that the task does not require the use of comprehension questions (i.e. comprehension checks), which are routinely incorporated into other procedures, such as eye tracking (Reichle, Rayner & Pollatsek 2003) and self-paced reading (Just, Carpenter & Woolley 1982). Evidence for not needing comprehension questions stems from Freedman & Forster (1985), in whose study the maze task procedure yielded processing costs for ungrammatical sentences containing a subjacency violation. Furthermore, the maze task has also been used to replicate past findings from both a sentence production task (Nicol, Forster & Veres 1997) and an eye tracking experiment (Forster et al. 2009).

The effectiveness of the maze task has recently been tested in a study comparing various online reading paradigms (N. Witzel, J. Witzel & Forster 2012). The authors included in their comparison two versions of the maze: the lexicality maze (L-maze) and the grammaticality maze (G-maze). The difference between these two versions is that in the L-maze, the incorrect alternative is an orthographically legal nonword (this version is therefore considered the easier one), whereas in the G-maze, it is an ungrammatical word choice (as described at the beginning of this article). Both maze versions were highly effective in detecting the expected syntactic effects and, moreover, yielded highly robust localized effects for processing difficulties as compared to eye tracking and self-paced reading. Against the background of these results, in conjunction with the findings discussed thus far, two critical points that are also made in Forster et al. (2009) are important to highlight:
  • participants cannot perform the maze task without full comprehension of the input, and
  • participants are aware that they must process each word of each sentence deeply enough so that they can continue constructing the sentence.
As is evident, the maze task is generally used to measure the processing costs of specific constructions in psycholinguistic experiments. However, as it creates conditions in which participants must adopt a rapid and incremental approach to processing and comprehending sentences, it is hypothesized that it could also be successfully used for language learning purposes - specifically, we use the G-maze here to ensure that learners must incrementally integrate the words of a sentence. Moreover, although the maze task is a reading paradigm, it is quite a different procedure as compared to other tasks. In a sense, the maze task can be viewed as a procedure for building sentences, which may require participants to produce sentences as well. This attribute makes the maze task a further promising candidate for language learning, since it is possible that training with the maze task may invoke production processes as well as comprehension processes.

The present study therefore investigates the merits of the maze task in a rather different field as compared to past psycholinguistic studies utilizing this procedure. Namely, we investigate how the maze task fares as a language training / learning program for beginner (late / university-aged, second semester) learners of Spanish. When the maze task is used as a training instrument, it is possible that repeated practice on structures could strengthen the connections between already existing associations, thereby helping to develop language automaticity and even fluency. This study is therefore unique in nature and offers an interdisciplinary approach to studying second language (L2) acquisition. Specifically, it bridges the gap between psycholinguistic experimentation and actual language learning. The present article is a substantially revised and improved version of an earlier working paper on this topic (Enkin 2012). It has been revised in all respects, including the review of the literature, the description of participants and methodology, and the data analyses and their interpretation.


1.2 The Maze Task and Second Language Learning

In the field of second language acquisition, implicit and explicit knowledge have been identified as two knowledge bases that may affect ultimate L2 attainment (N. Ellis 1994, Gasparini 2004). Implicit knowledge is the intuitive understanding of the manner in which a language works, whereas explicit knowledge is considered conscious awareness of the grammatical rules of a language (R. Ellis 2009a). There is general agreement that linguistic knowledge is primarily comprised of intuitive (implicit) and tacit knowledge, and that acquisition of a second language involves the development of this type of knowledge (R. Ellis 2005, 1993). However, when classroom learners (i.e. adult learners) are considered, explicit knowledge may also be helpful (DeKeyser & Juffs 2005).

In this study, we ask whether the maze task can be used as a language learning tool for aiding in L2 knowledge development. Specifically, when used for training, the maze task can potentially reinforce instruction from class - learners must go through the process of selecting the correct continuation to a sentence, word by word, as quickly as possible. They are notified when they make a mistake and are permitted to try the sentence again. In this way, the maze task could play a key role in reinforcing formal instruction by way of practice.


1.3 The Present Study: Research Questions

In the present study, we investigate the main question of whether or not maze task training can assist with L2 learning. In order to investigate this question, three separate measures are considered. First, we examine whether or not learners can gain implicit knowledge from maze task training. As the maze task asks for rapid responses, when it is used in a training-test paradigm, it becomes possible to measure implicit knowledge of the L2 through an analysis of the test session (see R. Ellis 2009b for a review of measures of implicit and explicit knowledge). However, as the present study involves late learners, it is also important to ask our second research question, which is whether maze task training can offer benefits for explicit knowledge. This is the reason why an untimed grammaticality judgment task (a measure of explicit knowledge that was administered after the maze training-test portion) was also incorporated. If the results are overall encouraging, then it will be possible to say that maze training can be used as a complement to formal instruction.

The third research question asks whether there is an overall learning benefit from maze task training. We use a pretest-posttest design to examine whether or not learners show an improvement in scores after the maze task training. This pretest-posttest is a paper-and-pencil production task in which learners must fill in the blanks with the correct verb and verb form (this was one of the constructions, copulative verbs, trained in the maze task – see Section 2). We use a fill-in-the-blank task because, as described earlier, the maze task may also require production processes. The format chosen for the production task (i.e. a fill-in-the-blank task) is one with which learners are familiar (from exams, classroom work, and homework), and we therefore reason that it is a good fit for testing improvement.

Lastly, as we are interested in using the maze task for the purposes of L2 learning, we also report on the results of a questionnaire. Participants were given an attitude questionnaire to complete at the conclusion of the experiment. In this survey, participants were asked about their perceptions of the task with respect to helpfulness, level of interest, and potential usefulness. It included questions that required students to rate their answers (from a strong Yes to a strong No).


1.4 The Maze Task Test Session and the Untimed Grammaticality Judgment Task

With respect to investigating the first and second questions described above, a brief explanation regarding the general design is needed. There is general agreement that L1 influence affects L2 acquisition, and that this influence can be rather strong (Gass & Selinker 1992). In fact, Tokowicz & MacWhinney (2005) found in an event-related potential (ERP) study that, when presented with L2 constructions that differed from their L1, learners were not sensitive to grammatical violations (no P600 effect, which is a positive-going deflection in an ERP waveform caused by a grammatical violation; Osterhout & Holcomb 1992). These results emphasize that, with respect to structures that differ between the L1 and L2, learners may have difficulties storing these structures as procedural representations, and thus may not process them in a native-like way. Therefore, the sentence constructions used for the two maze training types in this study had a “similar-to-English” version (henceforth called English-similar versions) and a “different-from-English” version (henceforth referred to as Spanish-specific versions), the labels referring to their similarity to English, i.e. the learners’ L1. Thus, participants were maze-trained on either English-similar or Spanish-specific versions.

We set out to measure if maze training on these sentence types could show learning benefits. The term learning benefits as used here refers to reaction times (in milliseconds) on a maze task session given after training sessions (i.e. the test session). This post-training maze task session contained both English-similar and Spanish-specific versions, and the objective was to examine how each training group fared with respect to response times for each version (i.e. trained structures as compared to untrained structures). We were particularly interested in measuring whether training on Spanish-specific structures would yield learning benefits as compared to those learners trained on English-similar structures. The logic here is that training with Spanish-specific types, if successful, would have important implications for language learning.

In order to test a carryover effect to a different type of task, we incorporated an untimed grammaticality judgment task as a posttest (administered after the maze training-test portion of the experiment). The logic here was that if there was a training effect observed on the post-training maze task session, then including a grammaticality judgment task as a posttest would allow us to examine whether the training effect could generalize to a measure of explicit knowledge. That is, due to the nature of maze training, it is also possible that there are benefits on explicit knowledge since, during the training, the task will display an error message in the precise location of an error (and then learners have the opportunity to try the sentence again). This allows learners to deduce for themselves why the error had occurred. Thus, we also question whether there would be maze training benefits (and specifically, from maze training with the Spanish-specific structures) on this type of task.


1.5 Hypotheses

For the maze training-test portion, one hypothesis is that the participants trained on the Spanish-specific versions will, after the training, show benefits for both the Spanish-specific and the English-similar sentence versions, but the same may not be true for the participants trained on the English-similar versions; for them, the Spanish-specific versions should be more difficult than the English-similar versions with which they were trained. One piece of evidence for this proposed asymmetry comes from aphasia research where investigators found that training on syntactically complex sentences benefitted performance on syntactically less complex sentences, but not the reverse (Thompson, Shapiro, Kiran & Sobecks 2003). Additional evidence comes from English as a second language (ESL) research where instruction on a difficult structure (object of prepositional relative clauses) improved performance on simpler structures (object relative and subject relative clauses), but the reverse was not true (Eckman, Bell & Nelson 1988). In the present study, beginner learners may not have implicit representations for the Spanish-specific versions, and thus, these structures may be more difficult to store as procedural representations, whereas English-similar versions are more readily stored in implicit knowledge. Therefore, it is hypothesized that the training effect examined here may show a similar asymmetry as discussed in the studies above.

For the untimed grammaticality judgment task, we hypothesize that maze training benefits will show a carryover effect. In other words, we anticipate a similar effect to what we expect for the post-training maze task session (as discussed above). However, since this is an untimed task, it allows a sufficient amount of time to access explicit knowledge (R. Ellis 2009b, Loewen 2009). Therefore, as it is possible that the maze task may be able to assist with building both implicit and explicit knowledge, when given enough time, learners trained on Spanish-specific versions may be able to call on both knowledge bases, thus yielding better accuracy overall (i.e., better performance on both “Spanish-specific” and “English-similar” versions).

For the pretest-posttest, we hypothesize that learners will show significant improvement after training. The idea here is that after undergoing the maze training, when asked to perform a fill-in-the-blank task that includes one of the sentence constructions used in the maze training, learners will show improvement. This task also serves as an indication regarding whether or not learners, after undergoing maze training, can show improvement on a production task.

Finally, for the attitude questionnaire, we hypothesize that learners will be welcoming of the task. Due to its fast-paced nature as well as its novelty, we believe that learners will be enthusiastic to complete the task, especially since it is such a different activity as compared to online workbooks (though online workbook activities certainly have their place). One of the most important results we are examining is whether learners perceive the task to be both helpful and fun.


2 Method

2.1 Participants

Twenty-one undergraduate students enrolled in one Spanish 102 class (a second semester beginner level Spanish class) at a large university in the United States Southwest participated for course credit. Participants were native speakers of English. Subjects were randomly assigned to one of two training groups, either English or Spanish (eleven subjects in the English training group and ten in the Spanish training group). These group labels refer to the type of sentences students received during the maze training sessions. The English training group received sentence stimuli referred to as similar-to-English, while the Spanish training group received different-from-English sentences. These sentence types are explained in more detail in the following section.


2.1.1 Proficiency Test

In order to be enrolled in the 102 level at this university, students must take a 20-25 minute computer-adaptive placement (proficiency) examination administered by the university. The web-based examination is the BYU (Brigham Young University) WebCAPE (Computer-Adaptive Placement Exam). This examination asks questions of varying difficulty level and adapts its questions according to students’ answers. The qualifying score to be placed into Spanish 102 is a range of 201-309. Students scoring below 201 are placed into Spanish 101. Scores above 480 place a student into advanced (third-year) classes. The only alternative way that students can register for the 102 level is by having college transfer credits from the previous level of Spanish (101). This proficiency requirement served to ensure that students were of comparable skill level in Spanish.


2.2 Materials and Design

Before discussing the specific design of the present study, a brief discussion regarding the types of structures used throughout the experiment is necessary. There were two types of sentence structures used: Spanish-specific types, which contained Spanish structures that were different-from-English, and English-similar types, which contained structures that were similar-to-English. In order to add variety for participants during the maze task training, the two conditions (English-similar and Spanish-specific types) were comprised of three sentence constructions – object relative clauses, direct object pronouns, and the to be verbs ser and estar (i.e. copulative verbs) – because they each have a Spanish-specific version and an English-similar version (these versions being illustrated in example maze sentences found in Table 1 further below).

Object relative clauses and direct object pronouns can both be structurally English-similar or Spanish-specific. For the object relative clauses, the English-similar versions contained an overt subject after the relative pronoun, whereas in the Spanish-specific versions, there was an omission of the subject pronoun after the relative pronoun (Spanish is a pro-drop language whereas English is not). When pro-drop occurs after a relative pronoun, the object relative clause construction becomes quite Spanish-specific since in English, an overt subject (noun or pronoun) is obligatory after the relative pronoun. Thus, beginner Spanish learners (with English as the L1) must learn to pay close attention to verb conjugation, especially in this context, if the sentence is to be interpreted correctly.

With direct object pronouns, the placement of the clitic identified the sentence as English-similar or Spanish-specific. In Spanish sentences containing a tensed verb followed by an infinitive verb, a clitic can appear in one of two positions: preverbal (raised, appearing before the tensed verb), or postverbal (attached to the end of the infinitive verb). Thus, the English-similar versions contained a direct object pronoun attached to the end of an infinitive verb (since English only has postverbal placement), whereas the Spanish-specific versions contained a direct object pronoun that was preverbal (which would never occur in English).

The final construction focused on a lexical contrast between English and Spanish. Spanish has two copulative verbs (ser and estar) that both translate as to be. Ser expresses permanency while estar expresses temporary states, which is not a distinction made in English. According to VanPatten (1987), the verb ser is assimilated into its English translation first, and as learners whose L1 is English gradually acquire ser and estar, they linger at the stage of acquisition where they use and overuse ser exclusively. As VanPatten proposes, this may occur due to L1 influence because at that point in acquisition, only one copula exists in both languages. This would allow learners to equate Spanish ser with English to be, thereby facilitating learners to linger at the stage of acquisition in which they solely use ser. Due to this hypothesis, we used ser in the English-similar versions of the verb to be. Sentences focused on specific uses of the verb, such as describing occupations and expressing time. The counterpart verb, estar, was used in the Spanish-specific versions. These sentences focused on uses such as describing emotion and location.

Sentence Types (English-similar vs. Spanish-specific versions)

Object relative clauses (with vs. without overt subject)
English-similar: Los perros que muchas personas tienen comen mucha comida. (The dogs that many people have eat a lot of food.)
Spanish-specific: El libro que necesitas cuesta mucho dinero. (The book that Ø (you) need costs a lot of money.)

Direct object pronouns that follow vs. precede the verb
English-similar: Me gustan los restaurantes italianos y quiero recomendarlos para las fiestas. (I like Italian restaurants and I want to recommend them for parties.)
Spanish-specific: Me gustan las mascotas y las quiero recomendar para las personas viejas. (I like pets and I want to recommend them for elderly people.)

To be as ser vs. estar
English-similar: Trabajo en una oficina grande porque soy abogada. (I work in a big office because I am a lawyer.)
Spanish-specific: No quiero salir porque estoy triste esta noche. (I do not want to go out because I am sad tonight.)

Table 1: Sentence types

The sentence types used in this study were level-appropriate for Spanish 102 learners and their curriculum. All structures were familiar to students by mid-semester before the maze training began, but there was no additional classroom instruction specifically focusing on these structures after the experiment had started. The course was taught communicatively and fully in Spanish. Grammar instruction consisted of outlining rules and then, class time was largely devoted to practice in context. Class activities were geared towards developing language skills (listening, speaking, reading, and writing) and included pair and small group work focusing on interaction.


2.2.1 Pretest (Paper-and-Pencil Test)

One week prior to the start of the maze task training, all participants were given a paper-and-pencil pretest to complete during class time. This task was a fill-in-the-blank test in which participants needed to complete sentences by choosing either ser or estar, and then needed to conjugate the chosen verb correctly. Thus, this task was comprised of the to be structures seen in the maze task portion of the experiment. The main reason why this construction was chosen was that it could easily be used for a fill-in-the-blank task and also that it requires learners to do two things (to choose and conjugate a verb). Therefore, using this construction allowed more room for improvement. Two versions of this task were created, and half of the subjects received Version A as their pretest whereas the other half received Version B as their pretest (this was done so that a counterbalanced design could be used with the corresponding posttest – see below). The two versions were of the same skill level (suitable for Spanish 102), with each one containing 10 sentences in total (5 English-similar types and 5 Spanish-specific types). The only difference between the two versions lay in the lexical items, which were all of appropriate level:

Examples:
Mi novio y yo _______ muy contentos con la idea de comer helado. (My boyfriend and I ______ very happy with the idea of eating ice cream.)
Mi amigo y yo _______ muy tristes con la idea de ir a la clase. (My friend and I ______ very sad with the idea of going to the class.)

2.2.2 Maze Task Training

Both training groups completed three training sessions, one per week, over a three-week period. There were 20 sentences in total in each training session: 15 sentences contained target structures (either English-similar or Spanish-specific, depending on the training group) while the remaining five sentences were grammatical fillers that were the same for both groups. Each group was trained on their sentence types, i.e. the “English” training group was trained on English-similar types, whereas the “Spanish” training group was trained on Spanish-specific types. The sentences for each group were the same from session to session as were the incorrect alternatives (which were of appropriate Spanish level). During the training sessions, the subjects were asked to try the sentence again if they made a mistake. The location of the mistake in the sentence was pointed out immediately so that students could see where they had made an error.


2.2.3 Maze Task Test

A final maze task test session, which was administered to participants in the fourth week, contained all new sentences, with an equal number of English-similar and Spanish-specific types. There were a total of 32 sentences, 28 of which were experimental sentences, and subjects were not able to try a sentence again when they made a mistake. The feature of immediate feedback (i.e., pointing out the precise location of an error) was still present.


2.2.4 Grammaticality Judgment Task

An untimed grammaticality judgment task was administered to participants during the fifth week. There were a total of 24 items – 12 grammatical sentences and 12 ungrammatical filler sentences. Of the 12 grammatical items, six were the experimental sentences (3 English-similar types and 3 Spanish-specific types) and the remaining six were fillers. Each sentence was ten words long:

Example: Usualmente no como el jamón pero hoy lo quiero comer. (Usually I do not eat ham but today I want to eat it.)

2.2.5 Posttest (Paper-and-Pencil Test)

The paper-and-pencil posttest was administered to participants in class one week after the maze training-test portion of the experiment. The subjects completed the test during class time, and received the version of the test (Version A or B) they had not yet completed for the pretest, and therefore a counterbalanced design was implemented (e.g. if a subject completed Version A for the pretest, then they would complete Version B for the posttest, and vice versa).


2.2.6 Attitude Questionnaire

At the conclusion of the study, participants filled out a questionnaire in class about the maze task. There were 11 questions that asked for feedback on the likeability and usefulness of the task. Participants rated each question on a scale from 5 to 1 (5= strong yes, 4= yes, 3= neutral, 2= no, 1= strong no). The subjects were asked questions about how enjoyable and helpful the task was, how motivating it was for learning (especially compared to their online workbook), if the task could help improve performance elsewhere (e.g. on examinations), and if they could see it as part of a curriculum for Spanish and other languages.


2.3 Procedure for Computerized Sessions

2.3.1 Maze Task Training

The training sessions (as well as the maze task test and the grammaticality judgment task) were run using the DMDX software package, which was developed at the University of Arizona (Forster & Forster 2003). Each session was sent via email as a link, and once students clicked on a link, DMDX (Display Master DirectX) would automatically install on their PCs for the duration of the task. Students completed each session in one sitting and only one time. They had a full week to complete each session so as to allow them to do each one at their convenience.

The items were presented in black letters on a white background. Every item, each making up a sentence, consisted of a series of frames. After the first frame, each subsequent frame contained two words side by side, where one was the correct next word in the sentence, while the other was grammatically and semantically incorrect.
Example:
El x-x-x / libro unas / que contaminación / necesitas porque / cuesta banco / mucho comen / dinero. sin.
(The x-x-x / book a / that pollution / [you need] because / costs bank / [a lot of] [they eat] / money. without.)
Correct and incorrect alternatives appeared randomly on the left-hand side or on the right-hand side of the screen. Furthermore, since the training sessions contained the same sentences and incorrect alternatives for each training group, the incorrect alternatives appeared on random sides of the screen (left or right) from session to session. This was done so that subjects could not memorize the position of the correct alternatives on the screen. The sentences were presented in a randomized order for each subject for each session.

The participants were instructed to choose the correct word in each frame as quickly and as accurately as possible by pushing the corresponding left or right button (either the left or right arrow key). If the word was correctly selected, the next frame was displayed immediately. If the incorrect alternative was selected, an "error" message was displayed. When an error occurred, subjects were given the choice to try the sentence again by pushing the corresponding key. If the participants made the correct choice throughout the frames for an item, the final frame was followed by a “correct” message. Subsequently, the beginning of the next item (i.e. the start of a new sentence) would appear. Participants received the same four practice sentences at the beginning of each session.


2.3.2 Maze Task Test

The link for the maze task test session was sent via email, and the participants had one week during which they could complete this session (in one sitting). The instructions remained the same and the items were presented in the same manner as in the training sessions. Once again, the sentences were presented in a randomized order for each subject. In this session, however, the participants were not given the choice of trying a sentence again, and thus, when an error occurred, the program would display an error message, and then move on to the next item. The participants were given six practice sentences at the beginning of the session.


2.3.3 Grammaticality Judgment Task

The link for the untimed grammaticality judgment task was sent to the subjects via email, and the participants had one week during which they could complete this session (in one sitting). Each frame displayed a full sentence, and the order of sentences was randomized for each subject. Each sentence appeared in black letters on a white background. The participants were instructed to decide whether the sentences they saw were grammatical or not (by pressing one of two response keys, when they were ready). At the start of the task, participants were given five practice items.


3 Results (Data Analyses)

Unless otherwise stated, all analyses were carried out using linear mixed effects modeling. The analyses involved fitting linear mixed effects models (LMERs) to the data points of interest, which was done using the lmer function from the lme4 package in R (Baayen 2008a, 2008b, Baayen, Davidson & Bates 2008, Pinheiro & Bates 2000, R Development Core Team 2013). The method using LMERs offers a critical advantage over the F1 / F2 method (i.e. traditional analysis carried out through ANOVA) because it allows for two crossed random effects (i.e., subjects and items can both be treated as random effects within the same model). Moreover, the software analyzes the data from each individual trial without needing to aggregate over items and subjects, and then arrives at the best-fitting linear model with both subjects and items as random effects. Thus, these models have the advantage of working over the complete set of data points for each subject and each item. The p-values for the effects were generated by Markov Chain Monte Carlo (MCMC) simulation, using 10,000 iterations (Baayen et al. 2008). In the analyses presented below, the main effects were derived from running a model looking only at main effects, whereas interactions of interest were obtained by running a model that included an interaction between the factors.


3.1 Maze Task Test

Prior to the analyses, the raw reaction times (RTs) were log-converted in order to correct for the marked positive skew that is typical of reaction times. Filler sentences were removed from the analysis, and all trials on which an error occurred were discarded. In addition, responses that were never seen due to a prior error were ignored; this occurred when a subject “errored out” of a sentence and thereby never saw the rest of it. The first word in each sentence (i.e. where the correct response was provided) was also removed from the analysis. Lastly, RTs were trimmed so that those under 300 ms and over 5000 ms were not included in the analysis.
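
As an illustration, the trimming steps just described could be expressed as in the following sketch in R (the data frame and column names are assumptions made for this example, not the authors' original script):

# Hypothetical word-level response data: rt (ms), error (0/1), filler (0/1),
# word_position, subject, item; column names are assumptions for illustration.
keep <- raw$filler == 0 &                 # drop filler sentences
        raw$error == 0 &                  # drop responses from trials with an error
        raw$word_position > 1 &           # drop the first word of each sentence
        raw$rt >= 300 & raw$rt <= 5000    # trim extreme reaction times
maze_words <- raw[keep, ]
maze_words$logRT <- log(maze_words$rt)    # log-transform to correct positive skew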

LMERs were fitted to the data with subjects and items as random effects. The training group was analyzed as a fixed-effect between-subjects factor, with the levels English (for the training group receiving English-similar sentences) and Spanish (for the training group receiving Spanish-specific sentences). The sentence type was analyzed as a second fixed-effect within-subjects factor, with the levels English-similar types and Spanish-specific types.
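
To make the model specification concrete, a sketch of the models described here is given below, using the lme4 package mentioned above (the object and column names are assumptions for illustration, not the authors' original script):

# Minimal sketch of the mixed-effects analysis (assumed object and column names).
# 'maze_test' has one row per retained data point: logRT, group (English vs. Spanish
# training), stype (English-similar vs. Spanish-specific), subject, item.
library(lme4)

# Main-effects-only model with crossed random effects for subjects and items
m_main <- lmer(logRT ~ group + stype + (1 | subject) + (1 | item), data = maze_test)

# Model adding the critical training group x sentence type interaction
m_int <- lmer(logRT ~ group * stype + (1 | subject) + (1 | item), data = maze_test)
summary(m_int)   # t-values for the fixed effects

# At the time of the study, MCMC-based p-values (Baayen et al. 2008) could be obtained
# with languageR::pvals.fnc(m_int, nsim = 10000), which works only with older lme4 versions.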

Using RTs to whole sentences as the dependent variable (since the focus was on the training effect rather than specific sentence regions), the critical interaction of the training group by sentence type was significant (t = 2.74, p < .01), reflecting a training effect (there was no main effect of sentence type, t = 0.57, p > .05, nor of training group, t = 1.50, p > .05). This result meant that the difference in RTs between the two training groups for the Spanish-specific sentences (94 ms) was significantly greater than the difference for the English-similar sentences (36 ms). Stated differently, the interaction showed that there was little difference in difficulty (3 ms) between English-similar and Spanish-specific sentences for the "Spanish" training group, but a substantial difference (61 ms) for the “English” training group (Figure 2 below illustrates the mean RTs, which have been log-converted). The means illustrated that the students having been trained on Spanish-specific versions (those subjects in the “Spanish” training group) yielded comparable RTs on both Spanish-specific and English-similar versions, whereas the learners that had received training on English-similar structures (i.e. the “English” training group) found the Spanish-specific structures more difficult.


Figure 2. Mean reaction times (in ms)
(on English-similar and Spanish-specific sentence types for the maze task “test” session, where “English” and “Spanish” training groups refer to the sentence types received during training (“English-similar” or “Spanish-specific”))


3.2 Grammaticality Judgment Task

For the untimed grammaticality judgment task, the factors remained the same as in the maze task test session. Error rates on the target grammatical items (English-similar and Spanish-specific sentences) were used as the dependent variable. There was no significant interaction of training group by sentence type (t = 0.63, p > .05). However, there was a significant main effect of training group (t = 1.87, p < .05) (no main effect of sentence type, t = 0.40, p > .05), which indicated that the “Spanish” training group, which was trained on the Spanish-specific types, made significantly fewer errors overall as compared to the “English” training group (those participants trained on English-similar types) (Figure 3 below).


Figure 3. Mean error rates
(on “English-similar” and “Spanish-specific” sentence types for the grammaticality judgment task, where “English” and “Spanish” training groups refer to the sentence types received during training (“English-similar” or “Spanish-specific”))


3.3 Pretest-Posttest (Paper-and-Pencil Test)

For the pretest-posttest, learners showed significant improvement from the pretest (mean score = 7.6 out of 10) to the posttest (mean score = 8.3 out of 10), as shown by a one-tailed paired-samples t-test (t(20) = 1.95, p < .05).
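
For reference, this comparison corresponds to a paired, one-tailed t-test of the following form in R (the score vectors below are placeholders rather than the study's data; in the study there was one pretest and one posttest score per learner, in matching order):

# Sketch of the paired, one-tailed comparison (scores out of 10 per learner).
pre  <- c(7, 8, 6, 9, 7)    # placeholder pretest scores, not the study's data
post <- c(8, 8, 7, 10, 8)   # placeholder posttest scores, not the study's data
t.test(post, pre, paired = TRUE, alternative = "greater")   # tests posttest > pretest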


3.4 Attitude Questionnaire

The attitude questionnaire yielded an average score of 4.3 out of 5 on all questions. Top scoring questions revealed that the maze task was an enjoyable supplement to online workbooks (most probably due to its interactive nature), that students felt motivated to complete the task, that it was helpful for Spanish learning, and that it was a fun task. Learners also indicated that they thought the task would be a good addition to a Spanish curriculum and that it may be helpful for learning other languages.


4 Discussion

The present study has focused on examining whether the maze task could be used for second language learning. The overall objective was to establish that the task could assist with language learning because
  • it forces incremental and rapid sentence processing (Forster 2010, Forster et al. 2009, and N. Witzel et al. 2012), and
  • participants not only need to comprehend the language stimuli fully, but must also process each word of each sentence quite deeply (Forster et al. 2009).
We used several different measures to assess the effect of maze task training, and the results presented here are promising.

Since the maze task asks for rapid responses, it can be considered as a measure of implicit knowledge, and thus, the first research question asked whether the maze task could be used successfully to develop L2 implicit knowledge for L2 learners. In order to test for a maze training effect, we utilized a training-test paradigm with two training groups. One training group was trained on structures that were English-similar, whereas the other group was trained on Spanish-specific structures (the learners’ L1 being English). The critical findings revealed that on the post-training maze task “test” session, there was an effect of training.

Supporting our hypothesis, learners who had been maze-trained on Spanish-specific structures showed little difference in difficulty when constructing Spanish-specific and English-similar structures (a difference of 3 ms in RTs), whereas the learners trained on English-similar sentences showed a substantial difference (61 ms). The mean RTs highlighted that participants trained on English-similar sentences experienced difficulty when constructing the Spanish-specific sentences on which they were not trained. Interestingly, this effect suggests that after maze training, learners in both training groups can understand English-similar sentences equally quickly, but that training with the Spanish-specific sentences may be needed so that learners can then construct them as quickly as the English-similar sentences - a result that may be related to past research in which learning has been shown to generalize from more difficult structures to simpler ones (e.g. Eckman et al. 1988, Thompson et al. 2003).

The findings of the post-training maze task test session suggest that maze training is effective for language learning, and importantly, that it can help build L2 implicit knowledge. Specifically, we reason that while learners may gain explicit knowledge (i.e. an understanding of grammar rules) from class, maze task training may act to reinforce classroom instruction, thus making this knowledge implicit. We suggest this conclusion because the structures trained in the maze task had also been covered in class, and when considering L2 instruction, a large body of evidence has suggested that classroom instruction is most effective for building explicit knowledge (see De Graaf & Housen 2009 for a summary). Thus, the maze task may play a role in converting explicit knowledge (i.e. representations stored in declarative memory) into implicit (procedural) representations, which is a key component of eventual fluency.

An untimed grammaticality judgment task was utilized in order to assess carryover benefits for explicit knowledge development; this task was completed after the maze training-test phase of the experiment. The results showed that participants trained on Spanish-specific types performed better (fewer errors) on both Spanish-specific and English-similar sentence types alike. This effect can be explained in light of the lack of time constraints in the task. That is, when allowed time to reflect on responses, learners trained on Spanish-specific structures may be able to perform significantly better on a task that asks them to judge the grammaticality of both Spanish-specific and English-similar structures. It is therefore possible that maze task training has benefits for developing explicit knowledge as well.

In order to determine whether or not the maze task can assist with learning as a whole, learners were given a pretest-posttest, i.e. a paper-and-pencil fill-in-the-blank task testing a maze-trained sentence construction. The results showed that participants improved significantly from pretest to posttest scores (that is, after undergoing maze training, participants improved on a posttest as compared to a pretest). Interestingly, this result not only suggests that, when content is similar, maze training can positively impact L2 learning, but also that the maze task may require an element of sentence production as well as comprehension. These findings thus have important implications for language learning.

Finally, the results from the attitude questionnaire indicated that learners were very welcoming of the maze task, that they saw the value in it, and that they thought it was both fun and helpful for learning. These results are interesting especially since getting language students excited about their homework may, at times, be a challenge. The format of this task is one that learners are not familiar with, and thus, it may offer them a break from their normal assignments. Furthermore, and anecdotally, learners expressed that the task felt more like a “videogame” than any other homework that had been assigned, which certainly could have added to the attractiveness of the task.

In general, getting learners to practice language skills where rapid processing is required may prove difficult since language learners, especially at the beginner levels, may be too unsure of their language abilities to engage in conversation. The maze task may therefore offer a unique opportunity for learners to practice an L2. Moreover, the task can be completed in the comfort of students’ homes, without anyone else present, which could help to eliminate the anxiety that may accompany practicing the L2 with others - although the task is not meant to substitute for critical language interaction, which is necessary for successful learning. Thus, the conclusion that the maze task can be beneficial for learning is an important one. However, as using this procedure for language learning is a new idea, with further testing, it will be possible to investigate whether the task can assist learning at various other proficiency levels.


5 Conclusion and Further Research

The adage practice makes perfect certainly applies to maze task learning, and in light of the encouraging results presented here, there may be a place for the maze task in second language learning. Therefore, further research with regard to developing the maze task into a more complete language learning activity may be important to pursue. This is particularly the case because university-aged learners are more enthusiastic about completing their work if computer use is involved (Blake 2012). By expanding upon the bare-bones structure of the maze task, extensions that create a more “videogame-like” environment could be the future of learning with this task. The maze task lends itself well to a videogame arena partly because it requires people to construct sentences so quickly. Task materials would preferably need to be made available through the Internet or on a smartphone, for example as an application.



References

Baayen, Harald R. (2008a). Analyzing linguistic data: A practical introduction to statistics. Cambridge, UK: Cambridge University Press.

Baayen, Harald R. (2008b). languageR: Data sets and functions with "Analyzing Linguistic Data: A practical introduction to statistics". R package.

Baayen, Harald R., Douglas J. Davidson & Douglas M. Bates (2008). Mixed-effects modeling with crossed random effects for subjects and items. In: Journal of Memory and Language 59 (2008), 390–412.

Blake, Robert (2012). Best practices in online learning: Is it for everyone? In: Rubio, Fernando, Joshua J. Thoms & Stacey Katz Bourns (Eds.) (2012). Hybrid language teaching and learning: Exploring theoretical, pedagogical and curricular issues. Boston, MA: Heinle Cengage Learning: AAUSC 2012 Volume, 10-26.

De Graaf, Rick & Alex Housen (2009). Investigating the effects and effectiveness of L2 instruction. In: Long, Michael & Catherine Doughty (Eds.) (2009). The handbook of language teaching. Malden, MA: Blackwell, 726-755.

DeKeyser, Robert & Alan Juffs (2005). Cognitive considerations in L2 learning. In: Hinkel, Eli (Ed.) (2005). Handbook of research in second language teaching and learning. Mahwah, NJ: Lawrence Erlbaum Associates, 437-454.

Eckman, Fred R., Lawrence Bell & Diane Nelson (1988). On the generalization of relative clause instruction in the acquisition of English as a second language: In: Applied Linguistics 9 (1988), 1-20.

Ellis, Nick C (1994). Implicit and explicit language learning: An overview. In: Ellis, Nick C. (Ed.) (1994). Implicit and explicit learning of languages. San Diego, CA: Academic Press, 1-32.

Ellis, Rod (2005). Measuring implicit and explicit knowledge of a second language: A psychometric study: In: Studies in Second Language Acquisition 27 (2005), 141-172.

Ellis, Rod (1993). A theory of instructed second language acquisition. In: Ellis, Nick C. (Ed.) (1993). Implicit and explicit language learning. San Diego, CA: Academic Press, 79-114.

Ellis, Rod (2009a). Implicit and explicit learning, knowledge and instruction. In: Ellis, Rod, Shawn Loewen, Catherine Elder, Rosemary Erlam, Jenefer Philp & Hayo Reinders (Eds.) (2009). Implicit and explicit knowledge in second language learning, testing, and teaching. Tonawanda, NY: Multilingual Matters, 3-25.

Ellis, Rod (2009b). Measuring implicit and explicit knowledge of a second language. In: Ellis, Rod, Shawn Loewen, Catherine Elder, Rosemary Erlam, Jenefer Philp & Hayo Reinders (Eds.) (2009). Implicit and explicit knowledge in second language learning, testing, and teaching. Tonawanda, NY: Multilingual Matters, 237-261.

Enkin, Elizabeth (2012). The maze task: Training methods for second language learning: In: Arizona Working Papers in Second Language Acquisition & Teaching 19 (2012) 5, 56-81. 

Forster, Kenneth I. (2010). Using a maze task to track lexical and sentence processing: In: The Mental Lexicon 5 (2010) 3, 347-357.

Forster, Kenneth I. & Jonathan C. Forster (2003). DMDX: A windows display program with millisecond accuracy: In: Behavior Research Methods, Instruments, & Computers 35 (2003) 1, 116-124.

Forster, Kenneth I., Christine Guerrera & Lisa Elliot (2009). The maze task: Measuring forced incremental sentence processing time: In: Behavior Research Methods 41 (2009) 1, 163-171.

Freedman, Sandra E. & Kenneth I. Forster (1985). The psychological status of overgenerated sentences: In: Cognition 19 (1985), 101-131.

Gass, Susan, & Larry Selinker (Eds.) (1992). Language transfer in language learning. Amsterdam: Benjamins.

Gasparini, Silvia (2004). Implicit versus explicit learning: Some implications for L2 teaching: In: European Journal of Psychology of Education 19 (2004) (2), 203-219.

Just, Marcel A., Patricia A. Carpenter & Jacqueline D. Woolley (1982). Paradigms and processes in reading comprehension: In: Journal of Experimental Psychology: General 111 (1982), 228-238.

Loewen, Shawn (2009). Grammaticality judgment tests and the measurement of implicit and explicit L2 knowledge. In: Ellis, Rod, Shawn Loewen, Catherine Elder, Rosemary Erlam, Jenefer Philp & Hayo Reinders (Eds.) (2009). Implicit and explicit knowledge in second language learning, testing, and teaching. Tonawanda, NY: Multilingual Matters, 94-112.

Nicol, Janet L., Kenneth I. Forster & Csaba Veres (1997). Subject-verb agreement processes in comprehension: In: Journal of Memory and Language 36 (1997), 569-587.

Osterhout, Lee & Phillip J. Holcomb (1992). Event-related potentials elicited by syntactic anomaly: In: Journal of Memory and Language 31 (1992) 6, 785-806.

Pinheiro, José C. & Douglas M. Bates (2000). Mixed-effects models in S and S-PLUS. NY: Springer.

Qiao, Xiaomei, Liyao Shen & Kenneth I. Forster (2012). Relative clause processing in Mandarin: Evidence from the maze task: In: Language and Cognitive Processes 27 (2012) 4, 611-630.

R Development Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Reichle, Erik D., Keith Rayner & Alexander Pollatsek (2003). The E-Z reader model of eye-movement control in reading: Comparisons to other models: In: Behavioral and Brain Sciences 26 (2003), 445-526.

Thompson, Cynthia K., Lewis P. Shapiro, Swathi Kiran & Jana Sobecks (2003). The role of syntactic complexity in treatment of sentence deficits in agrammatic aphasia: The complexity account of treatment efficacy (CATE): In: Journal of Speech, Language, and Hearing Research 46 (2003), 591-607.

Tokowicz, Natasha & Brian MacWhinney (2005). Implicit and explicit measures of sensitivity to violations in second language grammar: An event-related potential investigation: In: Studies in Second Language Acquisition 27 (2005), 173-204.

VanPatten, Bill (1987). Classroom learners’ acquisition of ser and estar: Accounting for developmental patterns. In: VanPatten, Bill, Trisha R. Dvorak & James F. Lee (Eds.) (1987). Foreign language learning: A research perspective. Cambridge, MA: Newbury House, 61-75.

Witzel, Jeffrey & Kenneth I. Forster (2014). Lexical co-occurrence and ambiguity resolution: In: Language, Cognition and Neuroscience 29 (2014) 2, 158-185.


Witzel, Naoko, Jeffrey Witzel & Kenneth I. Forster (2012). Comparisons of online reading paradigms: eye tracking, moving window, and maze: In: Journal of Psycholinguistic Research 41 (2012) 2, 105-128.




Authors:

Dr. Elizabeth Enkin
Assistant Professor of Spanish Applied Linguistics
University of Nebraska-Lincoln
Department of Modern Languages and Literatures
Lincoln, NE 68588, USA
E-mail: eenkin@unl.edu

Kenneth Forster
Professor of Psychology
University of Arizona
Department of Psychology
Tucson, AZ 85721, USA
E-mail: kforster@u.arizona.edu



1 A live demonstration of the task can be found at the following website:
http://www.u.arizona.edu/~kforster/MAZE/