TITLE: Differential Segmentation of Categorical Log Data from Open-Ended Learning Environments

ABSTRACT: Open-ended learning environments afford students opportunities to explore, manipulate, and test concepts, and the log data these environments generate can be leveraged to provide students with feedback and support. However, in open-ended contexts, student log data is often noisy, and identifying periods of meaningful activity is difficult. This paper introduces a new sequence mining method to overcome this challenge. The Differential Segmentation of Categorical Sequences (DiSCS) algorithm finds segments within a sequence of actions that are maximally or near-maximally different from their immediate neighbors. Segments are then clustered to reveal common periods of student activity. We examine the performance of this method under a variety of conditions to determine how well DiSCS can identify where different states of simulated activity start and end. We report that, when provided with only the observed actions, DiSCS identifies the hidden states of simulated student activity with strong to very strong associations. This performance is robust across a variety of contexts, including those where observed actions are noisy or common to multiple states. We discuss the implications and limitations of using this method for reinforcement learning in open-ended learning environments.

AUTHORS: James Bywater, Mark Floryan, Jennifer Chiu

NOTE: Presented in the workshop as part of the ENCORE track. This paper is from the AIED 2021 conference and can be accessed at the following link: https://link.springer.com/chapter/10.1007/978-3-030-78292-4_8
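
To make the segmentation idea in the abstract concrete, here is a minimal Python sketch of splitting a categorical action sequence at boundaries where neighboring segments differ most. It is not the authors' DiSCS implementation. The difference measure (a Pearson chi-square statistic on action counts), the greedy recursive splitting, the function names `chi_square`, `best_split`, and `segment`, and the `min_len` and `threshold` parameters are all assumptions made for illustration. The clustering step is not covered.

```python
from collections import Counter


def chi_square(left, right):
    """Pearson chi-square statistic for the 2 x K table of action counts
    (one row per side of a candidate boundary). Assumed difference measure."""
    lc, rc = Counter(left), Counter(right)
    n_l, n_r = len(left), len(right)
    n = n_l + n_r
    stat = 0.0
    for action in set(lc) | set(rc):
        col = lc[action] + rc[action]
        for observed, row in ((lc[action], n_l), (rc[action], n_r)):
            expected = row * col / n
            stat += (observed - expected) ** 2 / expected
    return stat


def best_split(actions, min_len=2):
    """Return (index, score) of the boundary whose two sides differ most,
    or (None, 0.0) if no boundary separates the sides at all."""
    best = (None, 0.0)
    for i in range(min_len, len(actions) - min_len + 1):
        score = chi_square(actions[:i], actions[i:])
        if score > best[1]:
            best = (i, score)
    return best


def segment(actions, min_len=2, threshold=6.0):
    """Greedy recursive segmentation: keep splitting while the best boundary
    scores at least `threshold` (an arbitrary illustrative cutoff).
    Returns the sorted list of boundary indices."""
    idx, score = best_split(actions, min_len)
    if idx is None or score < threshold:
        return []
    left = segment(actions[:idx], min_len, threshold)
    right = [idx + b for b in segment(actions[idx:], min_len, threshold)]
    return left + [idx] + right


# Hypothetical log: three blocks of activity, each dominated by one action.
log = list("AAAAAABBBBBBCCCCCC")
boundaries = segment(log)
```

On this noise-free toy sequence, the sketch places boundaries where the dominant action changes. Real log data would need a more careful choice of difference measure and stopping rule than the fixed cutoff assumed here.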