TITLE: Adaptively Scaffolding Cognitive Engagement with Batch Constrained Deep Q-Networks

ABSTRACT: Scaffolding student engagement is a central challenge in adaptive learning environments. The ICAP framework defines levels of cognitive engagement with a learning activity in terms of four engagement modes (Interactive, Constructive, Active, and Passive), and it predicts that increased cognitive engagement will yield improved learning. However, a key open question is how best to translate ICAP theory into the design of adaptive scaffolding in adaptive learning environments. Specifically, should scaffolds be designed to require the highest levels of cognitive engagement (i.e., Interactive and Constructive modes) with every instance of feedback or knowledge component? To answer this question, we investigate a data-driven pedagogical modeling framework based on batch-constrained deep Q-networks, a type of deep reinforcement learning (RL) method, to induce policies for delivering ICAP-inspired scaffolding in adaptive learning environments. The policies are trained with log data from 487 learners as they interacted with an adaptive learning environment that provided ICAP-inspired feedback and remediation. Results suggest that adaptive scaffolding policies induced with batch-constrained deep Q-networks outperform heuristic policies that strictly follow the ICAP model without RL-based tailoring. The findings demonstrate the utility of deep RL for tailoring scaffolding for learner cognitive engagement.

AUTHORS: Fahmid Morshed Fahid, Jonathan Rowe, Randall Spain, Benjamin Goldberg, Robert Pokorny, James Lester

NOTE: Presented at the workshop as part of the ENCORE track. The paper has been accepted at the 22nd Conference on Artificial Intelligence in Education (AIED21).

URL: https://link.springer.com/chapter/10.1007/978-3-030-78292-4_10