Unraveling the Effects of Task Sequencing on the Syntactic Complexity, Accuracy, Lexical Complexity, and Fluency of L2 Written Production



Background

Task Complexity and Task Sequencing: Framing Theoretical Perspectives
Over recent decades, many task complexity studies have been driven by two robust and competing models: Skehan's (1998, 2001) Limited Attentional Capacity (LAC) Model and Robinson's (2001, 2003) Cognition Hypothesis, which make different predictions about the cognitive operations and attentional resources affecting L2 development. The LAC Model, grounded in psycholinguistic theories of first language (L1) acquisition, conceptualizes the relationship between cognitive and attentional resources during L2 processing (Skehan, 1998). This model predicts that increasing task complexity reduces cognitive capacity for monitoring linguistic form because complexity, accuracy, and fluency (CAF) compete intensely for the same attentional resources. Therefore, learners allocate attentional resources to CAF such that one of these constructs (e.g., accuracy) benefits at the expense of others (e.g., complexity and fluency), resulting in a tradeoff effect due to reduced cognitive capacity for monitoring formal aspects of the task (Abrams, 2019). The LAC Model proposes that tasks should balance and distribute learner attention so that no CAF elements are neglected during L2 development. Conversely, Robinson's (2001, 2003) Cognition Hypothesis (CH), grounded in functional/cognitive linguistics, takes an alternate view of learners' cognitive abilities, arguing that learners possess multiple, rather than limited, attentional resources that do not compete. This forms the basis for the central claim of the CH, which states that, since different CAF elements draw on different attentional resources, complexity and accuracy can be attended to concurrently, with possible decays in fluency.
To provide a more comprehensive classification for determining task complexity, Robinson (2007) expanded on the CH with the Triadic Componential Framework (TCF), recently renamed the SSARC Model (Robinson, 2010). The TCF classifies task characteristics into three categories: task conditions, task difficulty, and task complexity. Task conditions represent interactional factors influencing the type and quantity of interactions required in a task, while task difficulty describes individual abilities and affective factors learners bring to task performance. Task complexity refers to task features that can be manipulated to challenge learners' cognitive resources and is further divided into resource-dispersing and resource-directing factors (see Table 1). Resource-directing factors direct learner attention to language needed for task completion, while resource-dispersing variables make procedural and performative cognitive demands. Robinson (2001, 2003) stipulates that simple tasks yield greater fluency as resource-dispersing variables become more complex, while further complexification along resource-directing variables yields greater accuracy. Such complexification increases linguistic production and pushes learners to adjust and expand their interlanguage but decreases fluency.

Table 1
Task Variables in Robinson's Triadic Componential Framework (Adapted from Robinson & Gilabert, 2007, p. 164)

Several studies have investigated the effects of task complexity on learners' CAF by manipulating resource-directing and resource-dispersing factors. Relevant to the present research are studies examining the effect of raising task complexity along the resource-directing variable of number of elements (± elements) and the resource-dispersing variable of pre-task planning time (± planning) on L2 written production.
Expanding on earlier theories of task complexity, Robinson's (2010) SSARC model distinguishes which task variables should be manipulated during task sequencing, favouring cognitive factors over interactional and learner factors. This comprises the first principle in the SSARC model: Tasks should be sequenced only according to cognitive complexity, operationalized as resource-directing and resource-dispersing variables, while other variables remain constant. The second task sequencing principle states that task complexity should first be increased along resource-dispersing variables, followed by resource-directing variables (Robinson, 2010).
The SSARC model illustrates the rationale for these two principles. In Step 1, Stabilize and Simplify, learners complete simple tasks, engaging their current interlanguage. In Step 2, Automatize, complexity is increased along resource-dispersing variables to encourage quicker access to learners' interlanguage. In Step 3, Restructure and Complexify, complexity is raised along both resource-dispersing and resource-directing variables so that learners' interlanguage systems are destabilized and restructured, thus promoting more complex interlanguage (Robinson, 2010). Therefore, the SSARC model offers clear pedagogical implications for task-based syllabus design by proposing a predetermined sequence that promotes interlanguage development, as shifts in task complexity induce gradual shifts in interlanguage throughout the task sequence.

Review of Empirical Studies: Effects of Task Sequencing on L2 Production
To test Robinson's SSARC Model, several studies investigating the effect of task complexity and sequencing on L2 development have been conducted on L2 oral and written production by raising task complexity along resource-directing and/or resource-dispersing variables over different sequencing orders.
Studies testing the SSARC model by manipulating only resource-directing variables have reported a range of findings. Levkina and Gilabert (2014) tested the effect of different sequences (simple-complex, complex-simple, and randomized) complexified along two variables (± spatial reasoning; ± perspective-taking) on learners' retention of spatial expressions over time. Immediate post-tests revealed that complex-to-simple sequencing led to greater short-term retention of spatial expressions, while delayed post-tests showed that simple-complex sequencing resulted in greater long-term retention. Baralt (2014) examined the effects of four simple-complex sequences (SSC, SCS, CSC, CCS) on L2 oral and written production by raising complexity along ± reasoning demands. Results showed that sequences with more complex tasks (CCS and CSC) generated increased learning opportunities and greater L2 development. Malicka (2014, 2018) tested the effects of different sequences (simple-to-complex, randomized, or individual tasks) on L2 oral production by raising complexity along ± reasoning demands and ± few elements. Both of her studies reported that simple-to-complex task sequencing led to improved speech rate, accuracy, and structural complexity. While the studies above indicated that simple-to-complex sequencing is effective, the SSARC model was not fully tested since only resource-directing variables were manipulated.
Only two studies have examined the effect of task sequencing by manipulating both resource-directing and resource-dispersing factors according to the SSARC model's proposed order. Lambert and Robinson (2014) explored the effects of simple-complex and randomized task sequencing on L2 written production, modifying both resource-directing (± few elements; ± reasoning demands) and resource-dispersing variables (± planning; ± prior knowledge; ± number of steps; ± multi-tasking). Findings indicated the simple-to-complex task sequence led to greater overall long-term benefits. More recently, Allaw and McDonough (2019) tested the effect of simple-complex versus complex-simple sequences on L2 written production by manipulating both resource-directing (± spatial reasoning) and resource-dispersing variables (± task structure). While both sequences yielded increased lexical diversity, grammatical accuracy, and fluency, the simple-complex sequence led to greater overall performance and long-term improvement. This provides strong support for Robinson's SSARC model, as the model's proposed simple-complex sequence yielded the greatest performance gains.

The Present Study
The current study was motivated by two gaps identified in the TBLT literature: First, several studies have explored the effects of task sequencing on L2 oral production (Malicka, 2014, 2018), acquisition (Baralt, 2014), and interaction (Kim & Payant, 2014), but the limited number of studies investigating the impacts of task sequencing on L2 writing yielded contradictory results (Allaw & McDonough, 2019; Lambert & Robinson, 2014; Levkina & Gilabert, 2014). Second, while some TBLT studies have explored task sequencing by manipulating either resource-directing or resource-dispersing factors, only two task sequencing studies have simultaneously manipulated both of these task complexity factors as proposed in the SSARC model (Allaw & McDonough, 2019; Lambert & Robinson, 2014), producing different results. Thus, further exploration is warranted to provide additional empirical evidence for the SSARC model. To address the aforementioned gaps, the current study seeks to answer the following question: What is the effect of simple-complex task sequencing on the syntactic complexity, accuracy, lexical complexity, and fluency of L2 written production when compared to individual task performance?

Participants and Context
This study included a sample of 90 undergraduate learners (46 women and 44 men) enrolled in a fifteen-week, advanced multilingual writing course at a large university in the United States. They were recruited from five different classes taught by two different instructors using a communicative-based syllabus (Ellis, 2003) and employing similar teaching materials. Participants were aged 19 to 21 years (M = 20.6, SD = 8.61) and had lived in English-speaking countries for approximately three years (M = 2.6, SD = .8). The participants came from a variety of L1 backgrounds. Half of the participants were Hispanic (N = 45) while the rest were from China (N = 8), South Korea (N = 7), France (N = 6), , Holland (N = 2), and Nigeria (N = 1). They had learned English as an L2 for 9 to 12 years (M = 10.3, SD = 4.28) at the time of data collection. Participants received nearly 39 hours of formal classroom teaching before data collection and all had upper-intermediate proficiency levels, based on their TOEFL-iBT scores in the 17-23 range and performance on a writing placement test administered by the university annually. After signing informed consent forms, participants were randomly divided into two groups to either perform written tasks in a simple-complex sequence or complete the same tasks individually.

Procedure
Prior to the main experiment, a pilot study was conducted to determine the time needed for performing different versions of a written decision-making task and to validate the assumptions about task complexity manipulations. Thus, 30 participants first performed a simple decision-making task with 10 minutes of planning time, then the same task without planning time, and finally a complex version of the task without planning time. After performing each task, the participants completed the task complexity questionnaire to validate the task complexity manipulations. The time each participant spent on different versions of the task was recorded by the researcher, for which descriptive statistics are presented in Table 2. As shown in Table 2, there was a gradual pattern of increasing lengths of time spent on the written tasks as cognitive complexity increased, such that the simple task with planning time was the shortest task to perform, followed by the simple task without planning, and the complex task without planning time as the longest. To ensure that pre-task planning was properly operationalized, the average amount of time spent by participants to perform the tasks was used as the time limit in the main writing experiment.
After conducting the pilot study, the researcher visited regularly scheduled multilingual writing classes and randomly assigned participants to two groups to conduct the main experiment: 1. the simple-complex sequencing group (n = 30), and 2. the individual task group (n = 60). Following the SSARC model of task sequencing and the times set in the pilot study, the first group performed the written tasks in the simple-to-complex order with 5-minute intervals; participants performed the simple task in 17.2 minutes with 10 minutes of planning time, then the same task in 23.8 minutes without planning time, and finally the complex one in 30.2 minutes without planning time. Similar to Malicka's (2018) study, the 60 participants in the individual task group were subdivided into three groups of equal size (n = 20); each subgroup completed only one task under the same task conditions as in the sequencing group. Thus, the difference between the two groups is that in the simple-complex group, all participants performed the tasks successively, while in the individual task group, participants in each subgroup completed only one task at a pre-determined cognitive complexity level.

Writing Tasks
In line with the principles of the SSARC model, different versions of a decision-making task varying in terms of inherent cognitive complexity were created and sequenced in the simple-complex order. The versions of the writing task, manipulated along resource-directing and resource-dispersing factors, were different in terms of cognitive complexity by decreasing or increasing the number of task elements (a resource-directing factor) and providing or removing pre-task planning time (a resource-dispersing factor). The decision-making task included specific descriptions of different job candidates who applied for a software engineering position at a well-known company; the number of candidates varied in different versions of the written task. Table 3 summarizes the sequenced writing task in light of the SSARC model. In Stage I, participants were given four job candidates' application dossiers, and based on the information given, had to decide which two candidates would be the most qualified for the company's software engineering position. Before writing, they had 10 minutes of planning time to inspect each candidate's application dossiers and write some planning notes, allowing participants to prepare for the task performance and focus on content, language, and organization. More importantly, the provision of planning time was hypothesized to mitigate the online processing load during writing and free up attentional resources for focusing on different aspects of the task; furthermore, the task consisted of only four elements, placing a lower demand on their working memory. The combination of a simple version of the task and planning time was intended to simplify input corresponding to the first stage in the SSARC model.
In Stage II, the number of task elements remained unchanged and participants again decided which two out of four candidates would be top-tier candidates for the software engineering position. Following the second stage in the SSARC model, in which tasks should be cognitively demanding on resource-dispersing factors (Robinson, 2010), planning time was removed to increase the cognitive complexity on this version of the task, in order to improve automatization of the writing process. Thus, participants had to produce similar ideas as in the simple task, while engaging in planning and writing at the same time. Compared to the simpler task, this task was considered more cognitively demanding since participants wrote without planning time.
In Stage III, participants were given six job candidates' application dossiers and had to decide which two would be the most qualified candidates for the company's open position. The increase in the number of elements along with the removal of planning time bolstered the complexity of the writing task such that participants had to simultaneously process and analyze six candidates, then plan and organize their arguments before finally expressing their new ideas. This increase in cognitive complexity corresponds to stage three in the SSARC model postulating that tasks should be cognitively demanding along both resource-directing and resource-dispersing factors, connecting learners to novel linguistic forms, and pushing them to complexify their interlanguage (Robinson, 2010).
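The three task versions described above can be written down as a small data structure, which makes the sequencing logic easy to inspect. The following is a minimal sketch, not part of the study's materials: the class name `TaskVersion`, the field names, and the `follows_ssarc` check are hypothetical, while the candidate counts, planning times, and writing time limits are the values reported for the sequencing group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskVersion:
    stage: str
    candidates: int          # number of elements (resource-directing factor)
    planning_minutes: int    # pre-task planning time (resource-dispersing factor)
    writing_minutes: float   # time limit derived from the pilot study


# Values as reported for the simple-complex sequencing group.
SEQUENCE = [
    TaskVersion("I: Stabilize and Simplify", 4, 10, 17.2),
    TaskVersion("II: Automatize", 4, 0, 23.8),
    TaskVersion("III: Restructure and Complexify", 6, 0, 30.2),
]


def follows_ssarc(seq):
    """Hypothetical check: planning time is removed first (resource-dispersing),
    and only afterwards is the number of elements raised (resource-directing)."""
    first, second, third = seq
    dispersing_raised_first = (
        second.planning_minutes < first.planning_minutes
        and second.candidates == first.candidates
    )
    directing_raised_last = (
        third.candidates > second.candidates
        and third.planning_minutes <= second.planning_minutes
    )
    return dispersing_raised_first and directing_raised_last


if __name__ == "__main__":
    print(follows_ssarc(SEQUENCE))
```

Reversing the list, which would correspond to a complex-simple order, fails the check because the first step no longer removes planning time while holding the number of elements constant.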

Validation of Cognitive Task Complexity Assumptions
TBLT researchers have called for examining the validity of task complexity assumptions to confirm that tasks designed to be cognitively complex actually result in varying levels of cognitive complexity for learners (Norris, 2010; Révész, 2014). In response, numerous studies have used various subjective and objective techniques to verify the assumptions about the impacts of task manipulations on task complexity, as summarized in Table 4.
Before the main experiment, a pilot study with 30 students and 10 teachers was conducted to validate the speculated differences in cognitive complexity levels between the written tasks via two different analytical, subjective techniques: learner self-ratings and expert judgments. To examine learner perceptions and attitudes about the perceived difficulty, stress, and cognitive load of the tasks, a nine-point Likert-scale questionnaire adapted from Lee's (2018) original questionnaire was used to elicit participants' responses to 11 items representing different categories. In addition, all participants were requested to answer four open-ended questions: 1. Did you notice any difference between performance of the writing task with the provision of planning time versus no provision of planning time? 2. Which one was more difficult to perform and why? 3. In your opinion, was there any difference between the two writing tasks in terms of difficulty? 4. Which of the two versions of the written task was more difficult to perform and why? As presented in Table 5, the descriptive results demonstrated a steady increase in participants' self-ratings of perceived difficulty, mental effort, and stress as cognitive complexity intensified. In line with the predictions made, the overall ratings for the simple versions of the task were lower than those for the complex task.
In addition, the simple task with planning time obtained lower means than the version without planning time in terms of the three affective variables. Regarding participants' responses to the follow-up questions, one student wrote:
"The preparation time before writing was really accommodating and beneficial because it helped me gather my thoughts create a clear outline and write with more comfort and confidence. But I had a hard time performing the second task without planning time and it became much harder while doing the third task including more job candidates. Compared to the third task, the second one was less difficult because I already carried it out and had some familiarity with the task performance."
To see how experts judge the cognitive load of tasks, 10 university instructors with considerable experience teaching L2 writing courses were asked to rate the difficulty of tasks on a Likert scale originally adapted from Robinson's (2001) questionnaire. They were also provided with open-ended questions regarding the overall perceived difficulty and mental effort required to complete the tasks. Their responses to the questionnaire items were analyzed by one rater and the open-ended responses were examined by two raters. The results are presented in Table 6. As expected, students' and instructors' ratings were consistent: the complex task was rated higher than the simple task, and the simple task with no planning time was rated higher than the simple task with planning time. One expert wrote:
"In my opinion, the final task was the difficult one because it would require learners to engage in processing and analyzing in more depth. In addition, they would have to choose two out of six candidates with strong applications and support their choices with evidence. However, I would perceive the first task aligned with planning time as the least difficult task because learners can prepare their written plans beforehand and use them while writing. The same version of the task without planning is the medium task which inevitably requires more mental effort and time to process and perform due to the absence of planning time."

CALF Measures
Following Norris and Ortega (2009), who argue for assessing syntactic complexity multidimensionally while warning against using redundant measures which result in a multicollinearity effect, four different measures were utilized to gauge different subconstructs of complexity: 1. the mean length of T-unit (MLT) as a general measure of syntactic complexity, calculated by dividing the total number of words by the total number of T-units in a text, 2. the ratio of dependent clauses to T-units (DC/T) as a measure of subordination complexity, calculated by dividing the total number of dependent clauses by the total number of T-units in a text, 3. the number of complex nominals per T-unit (CN/T), calculated by dividing the total number of complex nominals by the total number of T-units, and 4. the number of complex nominals per clause (CN/C), calculated by dividing the total number of complex nominals by the total number of clauses (Lu, 2010). The last two measures (CN/T and CN/C) were chosen to examine phrasal complexity in participants' written production. In our analysis, the T-unit was used rather than the C-unit or AS-unit because the nature of written tasks was monologic (Foster, Tonkyn, & Wigglesworth, 2000). It is defined as "one main clause plus any subordinate clause or non-clausal structure that is attached to or embedded in it" (Hunt, 1970, p. 4).
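As a worked illustration of the four ratios, the sketch below computes them from counts that are assumed to have been obtained already, whether by hand coding or by an analyzer. It takes CN/T as complex nominals per T-unit and CN/C as complex nominals per clause, as the measure names indicate. The function name, parameter names, and the example counts are hypothetical and are not data from the study.

```python
def syntactic_complexity(words, t_units, clauses, dependent_clauses, complex_nominals):
    """Return the four syntactic complexity ratios for one text.

    All arguments are raw counts for a single text (hypothetical inputs).
    """
    if t_units <= 0 or clauses <= 0:
        raise ValueError("a text must contain at least one T-unit and one clause")
    return {
        "MLT": words / t_units,               # mean length of T-unit
        "DC/T": dependent_clauses / t_units,  # dependent clauses per T-unit
        "CN/T": complex_nominals / t_units,   # complex nominals per T-unit
        "CN/C": complex_nominals / clauses,   # complex nominals per clause
    }


if __name__ == "__main__":
    # Made-up counts for one essay: 300 words, 20 T-units, 40 clauses,
    # 16 dependent clauses, 30 complex nominals.
    for name, value in syntactic_complexity(300, 20, 40, 16, 30).items():
        print(f"{name}: {value:.2f}")
```

Because a T-unit contains at least one clause, CN/T can never be smaller than CN/C for the same text, which offers a quick sanity check on coded counts.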
In line with Ellis and Yuan (2004), the accuracy of L2 written production was examined by calculating the ratio of error-free clauses to the total number of clauses and the ratio of correct verb forms to the total number of verbs used in each text. To calculate error-free clauses, participants' written production was divided into clauses and lexical, morphological, and syntactic errors were identified and marked. Any unmarked clause was considered error-free. For each participant, the proportion of error-free clauses was regarded as the resulting score. Given that the error-free clauses metric is a holistic measure of accuracy, correct verb forms were used as a specific measure. For each participant, the proportion of verbs free of tense, aspect, modality, or agreement errors was used as a score of analysis.
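The two accuracy scores are simple proportions, which the following sketch makes explicit. It assumes each clause has already been coded as containing an error or not, and each verb as correct or not; the function name and the example codings are hypothetical.

```python
def accuracy_scores(clause_has_error, verb_is_correct):
    """Return (proportion of error-free clauses, proportion of correct verb forms).

    clause_has_error: one boolean per clause, True if any lexical,
        morphological, or syntactic error was marked in it.
    verb_is_correct: one boolean per verb, True if it is free of tense,
        aspect, modality, and agreement errors.
    """
    if not clause_has_error or not verb_is_correct:
        raise ValueError("need at least one clause and one verb")
    efc = sum(1 for has_error in clause_has_error if not has_error) / len(clause_has_error)
    cvf = sum(1 for correct in verb_is_correct if correct) / len(verb_is_correct)
    return efc, cvf


if __name__ == "__main__":
    # Made-up coding for a short text: 4 clauses (1 with an error), 5 verbs (1 incorrect).
    efc, cvf = accuracy_scores([False, True, False, False], [True, True, False, True, True])
    print(f"EFC = {efc:.2f}, CVF = {cvf:.2f}")
```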
Lexical complexity was gauged with respect to diversity and sophistication. The measure of textual lexical diversity (MTLD) was used by computing "the mean length of sequential word strings in a text that maintain a given TTR value" (Mazgutova & Kormos, 2015, p. 5). MTLD was used rather than other vocabulary diversity metrics, specifically mean segmental type-token ratio (MSTTR), because it is least affected by text length (Mazgutova & Kormos, 2015; McCarthy & Jarvis, 2010) and therefore is considered a more reliable indicator of vocabulary diversity. Additionally, there is a growing demand for using lexical sophistication metrics in measuring L2 written production (Johnson, 2017). In response, the log frequency of content words was utilized to examine the lexical sophistication of texts and was calculated as the mean log frequency of content words, with frequencies drawn from the CELEX database (McNamara et al., 2014). There were two main reasons for the selection of the log frequency of content words: 1. compared to frequency band measures, the log frequency of content words can better represent large and small improvements in the participants' written production due to inclusion of the frequency counts from a large corpus (Kyle & Crossley, 2015); and 2. this metric measures lexical sophistication with a higher degree of reliability in comparison to the raw frequency of content words (Kormos, 2011). Finally, fluency was calculated by counting the number of words produced within a set time (Abrams, 2019). This product-based measure was selected for two reasons: 1. it has ecological validity such that teachers can use it in curriculum-based assessment, and 2. it allows comparability of the results with the findings of past studies.
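To make the quoted definition of lexical diversity concrete, the sketch below implements one common reading of the MTLD procedure: the text is scanned word by word, a "factor" is counted each time the running type-token ratio falls to a threshold, a partial factor is credited for the leftover words, and the result is averaged over a forward and a backward pass. The threshold of 0.72 is an assumption taken from the value usually associated with McCarthy and Jarvis (2010); the present text does not state which value or tool was used, and published implementations differ in details such as whether the comparison is strict. Lowercasing the tokens is likewise an assumption.

```python
def _mtld_pass(tokens, threshold):
    """One directional pass: text length divided by the number of factors."""
    factors = 0.0
    types = set()
    length = 0
    for token in tokens:
        length += 1
        types.add(token)
        if len(types) / length <= threshold:
            # TTR has dropped to the threshold: close this factor and restart.
            factors += 1.0
            types.clear()
            length = 0
    if length > 0:
        # Credit the leftover segment with a partial factor.
        ttr = len(types) / length
        factors += (1.0 - ttr) / (1.0 - threshold)
    return len(tokens) / factors if factors > 0 else float("inf")


def mtld(tokens, threshold=0.72):
    """Mean of the forward and backward passes (threshold is an assumed default)."""
    tokens = [t.lower() for t in tokens]
    if not tokens:
        raise ValueError("need at least one token")
    forward = _mtld_pass(tokens, threshold)
    backward = _mtld_pass(tokens[::-1], threshold)
    return (forward + backward) / 2


if __name__ == "__main__":
    sample = "the committee chose the candidate because the candidate had the experience"
    print(round(mtld(sample.split()), 2))
```

A text with no repeated words never reaches the threshold, so this sketch returns infinity for it; very short texts are therefore poorly suited to the measure.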

Statistical Analysis
In the analyses of written performances, we used nine different measures to examine the effects of task sequencing as an independent variable on different constructs of L2 written production as dependent variables. First, means and standard deviations were calculated for all the variables in the different groups. Then, data sets were directly imported into R version 3.6.1 (R Development Core Team, 2019) to check the normality assumption through normal Q-Q plots and the Kolmogorov-Smirnov test. Given that six out of nine variables violated the normality assumption, nonparametric statistics (i.e., the Mann-Whitney U test) were used to answer the research question. The Mann-Whitney U test was performed to detect significant differences between the simple-complex sequencing group and the individual task group. The level of significance for this study was set at an alpha level of 0.05. For the Mann-Whitney U test, Cohen's d was employed to measure effect sizes. Following Plonsky and Oswald's (2014) benchmarks, d values of .40, .70, and 1.00 were considered small, medium, and large, respectively.
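The group comparison can be illustrated with a short, dependency-free sketch. The analyses reported here were run in R; the Python code below is not the authors' script, only an illustration of the quantities involved: the Mann-Whitney U statistic, Cohen's d with a pooled standard deviation, and the benchmark labels given above. The function names and the example scores are hypothetical, and the p-value computation is omitted.

```python
from math import sqrt
from statistics import mean, stdev


def mann_whitney_u(x, y):
    """U statistic for sample x: pairs where x exceeds y, with ties counted as 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled = sqrt(((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2))
    return (mean(x) - mean(y)) / pooled


def label_effect(d):
    """Label |d| against the .40 / .70 / 1.00 benchmarks."""
    size = abs(d)
    if size >= 1.00:
        return "large"
    if size >= 0.70:
        return "medium"
    if size >= 0.40:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Made-up scores for two small groups on one measure.
    sequencing = [3, 4, 5]
    individual = [1, 2, 3]
    d = cohens_d(sequencing, individual)
    print(mann_whitney_u(sequencing, individual), round(d, 2), label_effect(d))
```

In practice a statistics package would also supply the p-value; for example, SciPy provides `scipy.stats.mannwhitneyu` for this test.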

Effects on Syntactic Complexity
The descriptive statistics for the measures of general complexity (MLT), subordination complexity (DC/T), and phrasal complexity (CN/T and CN/C) are presented for the simple-complex sequencing group and the individual task group in Figure 1. The inferential statistics revealed significant differences in performance between the two groups with respect to MLT in the simple task with planning time, MLT and DC/T in the simple task without planning time, and DC/T in the complex task. The participants in the simple-complex sequencing group produced more complex structures than their counterparts in the individual task group on the two simple tasks with and without planning (simple task + planning, p < .001, and simple task − planning, p < .001). In both cases, the d scores indicated large effect sizes, corroborating a substantial difference between the two groups' performance in the case of MLT. Similar results were also found between the simple-complex sequencing group and the individual task group in terms of complex subordination in the simple task without planning time (p < .001) and the complex task (p < .001) with large effect sizes (d = 1.08 and 1.03, respectively). Table 7 displays the performance of the simple-complex sequencing group as opposed to the individual task group for complexity subconstructs.

Effects on Accuracy
The descriptive statistics revealed that the simple-complex sequencing group yielded higher mean scores than the individual task group regarding the two accuracy measures in the complex task, but the opposite was found for the simple task without planning time. In the case of the simple task with planning time, the simple-complex sequencing group produced a higher mean score than the individual task group on correct verb forms; however, the latter had a higher mean score on error-free clauses (see Figure 2). The results from the Mann-Whitney U test revealed that, whereas the individual task group outscored the simple-complex sequencing group regarding error-free clauses and correct verb forms in the simple tasks with and without planning time, the comparisons did not reach statistical significance. The d scores showed small effect sizes between the two groups' performance in the simple task with planning time (d = .2 for EFCs and d = .17 for CVFs) and in the simple task without planning time (d = .21 for EFCs and d = .12 for CVFs). Nonetheless, the proportion of error-free clauses and correct verb forms in the simple-complex sequencing group was higher than in the individual task group in the complex task, and the comparisons were statistically significant with large effect sizes (p < .001, d = .95 in the case of EFCs and p = .001, d = .86 in the case of CVFs). The inferential statistics are summarized in Table 8.

Figure 2 Mean Scores for EFCs and CVFs
Note. IND = individual task group, SCS = simple-complex sequencing group

Effects on Lexical Complexity
The descriptive statistics showed the simple-complex sequencing group produced higher mean scores than the individual task group with respect to lexical diversity (MTLD) and sophistication (WRDFRQmc) in the three writing tasks (see Figure 3). The comparisons between the two groups' performance reached statistical significance only in the case of MTLD with a medium effect size in the simple task (p = .004, d = .60). In addition, significant comparisons were found between the simple-complex sequencing group and the individual task group regarding MTLD and WRDFRQmc in the complex task. The d scores indicated a medium effect size for the simple-complex sequencing group compared to the individual task group (d = .66 for MTLD and d = .63 for WRDFRQmc). Table 9 presents the inferential statistics for the two measures of lexical complexity.

Figure 3 Mean Scores for MTLD and WRDFRQmc
Note. IND = individual task group, SCS = simple-complex sequencing group

Effects on Fluency
The descriptive statistics showed that participants in the simple-complex sequencing group produced more words within the designated time limit in the simple task with planning time and in the complex task when compared to the individual task group. Nevertheless, the latter generated more words in the simple task without planning time than the former (see Figure 4). As can be observed in Table 10, the results revealed that the comparison between the two groups was statistically significant only in the case of the simple task with planning time (p = .001), and this was reflected by a large effect size (d = .87). However, no comparisons between the two groups were found to be statistically significant with respect to fluency in the simple task without planning time (p = .350) and in the complex task (p = .320).
Overall, these results revealed that the simple-complex sequencing group produced greater syntactic complexity at the general level (simple tasks with and without planning time) and at the subordination level (simple task without planning and complex task); wrote more error-free clauses and correct verb forms (complex task); exhibited more diverse and sophisticated vocabulary (simple task with planning and complex task); and finally, wrote faster and generated more words within a set time (simple task with planning time) as compared to the individual task group.

Discussion
The major thrust of this study was to explore the effects of task sequencing on multilingual learners' written production in accordance with the SSARC model. Given that past studies tested the role of the SSARC model in L2 oral production (Baralt, 2014; Malicka, 2014, 2018), finding that the simple-to-complex order led to significant gains in L2 production over the short or long term, the current study examined the effectiveness of the SSARC model in L2 written production using a simple-complex task design. The results demonstrated that simple-complex task sequencing favoured syntactic and lexical complexity, promoted accuracy, and assisted fluency, providing empirical evidence supporting the theoretical claim of the SSARC model regarding the beneficial role of simple-complex task sequencing for L2 written production. With respect to syntactic complexity, whereas production of complex structures gradually decreased in both the simple-complex sequencing and individual performance groups over the sequence, the former produced both a significantly higher mean length of T-unit in the sequence's first two tasks, characterizing general complexity, and significantly more subordination, recognized as the most indicative source of complexification at the intermediate level, in the last two tasks in the sequence. Compared to the individual task group, the simple-complex sequencing group also produced more complex nominal structures throughout the sequence, manifesting greater phrasal-level complexification as a result of increased cognitive complexity, although not at a statistically significant level. These results partially corroborate Malicka (2014, 2018), who also found that simple-complex sequencing increased syntactic complexity at the clausal level, but conversely reported decreases in subordination.
Furthermore, the results revealed that in the simple task with planning time, only general syntactic complexity was significantly fostered in the simple-complex sequencing group. As reported previously (Abdi Tabari, 2020; Ellis & Yuan, 2004; Farahani & Meraji, 2011; Rostamian et al., 2018), pre-task planning provided learners with the opportunity to plan, linguistically encode their messages, and produce more complex writing. Increases in general complexity and subordination can also be explained by the SSARC model, which postulates that removing pre-task planning time and increasing cognitive complexity induced learners to exhibit higher levels of syntactic complexity because their exposure to the simple task created more scaffolded opportunities for rehearsal, serving as a preparatory mechanism to complexify their writing at a deeper level (Malicka, 2014). Nevertheless, increases in the number of elements did not prompt learners to stretch their syntactic resources in the complex task to produce significantly more complex structures at the phrasal level, possibly due to their proficiency level. Overall, these results partially support the Cognition Hypothesis (Robinson, 2003), which predicts that increasing task complexity along resource-directing factors can push learners to extend their existing L2 repertoire to meet task demands.
Regarding L2 writing accuracy, the results revealed that on both measures of this construct, correctness of linguistic forms steadily decreased in both groups' performance, with the simple task with planning time generating the most accurate forms. Nevertheless, significant differences were observed only in the complex task, where the simple-complex group was more accurate. These results suggest that when tasks are performed in the simple-to-complex order, learners can produce more correct clauses and verbs and improve the accuracy of their written production as compared to tasks performed in isolation. These results echo findings of past oral- and written-performance task sequencing studies (Allaw & McDonough, 2019; Malicka, 2014, 2018) reporting increases in grammatical accuracy under the simple-complex task condition. Following the SSARC model, the simple task with planning time may provide learners with the opportunity to direct more attention to linguistic forms, activate their monitoring behaviour, and rehearse target-like structures. In the simple task without planning time and the complex task, they can recall those structures, be cognizant of problematic linguistic areas, and avoid errors even though pre-task planning time is removed. Consequently, less deviation from target-like structures should occur on the less complex and complex versions of the task. These results support Robinson's Cognition Hypothesis, which postulates that complex tasks can lead to more accurate language if complexification occurs along resource-directing factors. However, it is important to note that the results of this study provide only preliminary evidence confirming the prediction of dual increases in complexity and accuracy.
Concerning lexical complexity, the results showed that lexical diversity gradually increased in both groups' performance and that the complex task triggered the highest lexical diversity. Regarding lexical sophistication, a different pattern was found for the two groups: the simple-complex sequencing group produced the most sophisticated vocabulary in the complex task, whereas the individual task group generated the most sophisticated lexical items in the simple task with planning time. These results support previous findings that increased task complexity induces greater lexical diversity in both oral and written performance (Abdi Tabari, 2020; Frear & Bitchener, 2015; Kuiken & Vedder, 2007, 2008; Levkina & Gilabert, 2012; Ong & Zhang, 2010; Rahimi, 2018), as well as Allaw and McDonough's (2019) findings, providing further confirmation of the SSARC model's claim that increasing cognitive complexity in the simple-complex sequence induces greater lexical complexity. The simple task helped learners focus on simplified input and stabilize recently learned lexical items, while also creating prerequisite conditions for the subsequent complexification of their lexical production. The less complex task, which removed pre-task planning, encouraged greater independence in using a wider range of lexical items and fostered consolidation and automatization. Finally, the complex task, which increased the number of elements, aided the learners in complexifying their lexical production over the simple-to-complex sequence.
Regarding L2 writing fluency, both groups generated the highest number of words at the fastest speed in the simple task with planning time. Notably, under the individual task performance condition, the proportion of words produced and writing fluency decreased steadily as cognitive complexity increased, while task performance under the simple-complex condition displayed a U-shaped pattern in writing fluency, demonstrating the multidimensional, rather than strictly linear, nature of written production even over a short-term sequence (Malicka, 2014). Furthermore, the participants produced more words and demonstrated greater fluency in the simple-to-complex order than in isolation. Increased fluency in the simple task with planning time corroborates the findings of previous studies (Farahani & Meraji, 2011; Levkina & Gilabert, 2012; Rahimi, 2018; Rostamian et al., 2018) and supports Robinson's (2005) claim that the role of resource-dispersing factors (e.g., planning time) is to facilitate production under time pressure, assist access to learners' existing L2 knowledge, promote automaticity of the interlanguage system, and build fluency. Moreover, these results confirm the prediction of the Cognition Hypothesis that simpler tasks will improve production fluency (Baralt, Gilabert, & Robinson, 2014). Additionally, it is necessary to compare our results with the findings of Allaw and McDonough's (2019) study to better understand the short- and long-term effects of task sequencing on L2 writing fluency. While our results reveal a non-linear pattern in writing fluency over a short-term sequence, Allaw and McDonough (2019) reported that writing fluency is promoted as a result of increasing cognitive task complexity in the long term. Although the two studies manipulated tasks along resource-directing and resource-dispersing factors and employed the same measure to gauge L2 writing fluency, they differ in terms of (1) the type of written task, (2) the types of resource-directing and resource-dispersing factors manipulated, (3) the explicit instructions participants received, and (4) the time intervals between the simple, less complex, and complex tasks. These differences may account for the diverging results between the two studies.

Implications and Future Directions
The current study offers theoretical and pedagogical implications for TBLT researchers and L2 teachers. Theoretically, our findings partially support the predictions of the Cognition Hypothesis and the SSARC model, revealing the progressive and variable nature of the effects of task sequencing when manipulated along resource-directing and resource-dispersing factors in a short-term simple-to-complex sequence. Increases in task complexity along the resource-directing factor induced learners to surpass their L2 knowledge base to meet the demands of the task, thus increasing the complexity of L2 written production. Conversely, the resource-dispersing factors facilitated access to existing L2 knowledge, promoting automatization and fluency. To help learners achieve more balanced CALF performance in writing, increasing these factors alone would not necessarily extend learners' L2 knowledge base and increase task complexity unless these factors are specifically integrated into simpler tasks in the sequence. It should be noted that only a short sequence was tested, and more research is warranted to examine how specific combinations of resource-directing and resource-dispersing factors could be integrated into a longer sequence for use in a classroom syllabus. In line with the SSARC model, our findings also suggest that simple tasks designed for stabilization and complex tasks aimed at automatization should be implemented before subsequent restructuring and complexification can occur. This simple-to-complex sequence can provide learners with more scaffolded opportunities to practice and consolidate newly learned lexical and grammatical structures in their writing, promoting complexification of their production. However, this claim still requires more fine-grained evidence to flesh out the proposals of the SSARC model in the long term and to disclose what happens after the processes of restructuring and complexification have occurred.
Pedagogically, the SSARC model associated with the Cognition Hypothesis offers several useful implications for classroom practice and syllabus design. Following the SSARC model, teachers can design lesson plans that include a set of tasks with varying levels of cognitive complexity and provide multiple practice opportunities for learners to perform written tasks while benefiting from extra preparatory options such as pre-task planning time. At the pre-task stage, L2 learners with varying writing proficiency levels can experience less pressure and gain mental and linguistic preparedness for subsequent tasks due to scaffolding opportunities, improving focus on the content of their writing and increasing their speed of production. In the following stage, teachers can prompt learners to repeat the same task to facilitate greater autonomy, gradually increasing their independent performance as extra preparation is removed. Thus, learners may be more involved in rehearsal, the extension of their L2 repertoire, and the development of complex structures, gaining prerequisite readiness for performing more sophisticated written tasks. In the post-task stage of writing, teachers can design more sophisticated resource-directing tasks, which stretch learners' linguistic resources, encourage them to face the challenges of performing these complex tasks, increase their ability to take risks, and push their interlanguage to its limits to use more innovative language. Teachers can also adjust the complexity level of tasks by manipulating different resource-directing and resource-dispersing factors; however, such task sequencing decisions should be made based on learners' needs, considering their background knowledge, proficiency levels, motivation, and readiness. In addition, teachers are cautioned against designing a task sequence with different cognitive complexity levels in a vacuum. They should ensure that tasks with different instructional demands match relevant areas of their syllabus and promote learners' existing L2 knowledge over time. Finally, teachers with limited teaching experience may struggle to understand and interpret research-supported ideas, manipulate and sequence tasks, and regularly and effectively implement them in classroom settings. To address this problem, teachers should receive ongoing training, support, and hands-on practice to understand the principles of the Triadic Componential Framework and the SSARC model, make sound task-sequencing decisions, and operationalize classroom tasks with some degree of confidence.
This study has several limitations that warrant consideration in future studies. As task sequencing was operationalized along both resource-directing and resource-dispersing factors, increasing the number of elements or the length of pre-task planning time could lead to different task sequencing effects on the CALF of L2 written production. Additionally, this study examined the effects of only two task complexity factors (± number of task elements and ± planning time) on L2 writing. Future studies can extend this line of research by investigating the effects of manipulating other resource-directing and resource-dispersing factors on L2 written production. Furthermore, only subjective techniques were employed to verify the assumptions regarding the impact of task manipulations on task complexity. Future studies could use objective techniques such as eye-tracking or dual-task methodology to validate task difficulty assumptions in the L2 writing context. Finally, this study recruited participants at the upper-intermediate proficiency level; examining advanced-level participants might yield more significant results and reveal relationships between proficiency and task sequencing effects.