Measuring the State of Open Science in Transportation Using Large Language Models - Open science monitoring initiative

“Open science initiatives have strengthened scientific integrity and accelerated research progress across many fields, but the state of their practice within transportation research remains under-investigated. Key features of open science, defined here as data and code availability, are difficult to extract due to the inherent complexity of the field. Previous work has either been limited to small-scale studies due to the labor-intensive nature of manual analysis or has relied on large-scale bibliometric approaches that sacrifice contextual richness. This paper introduces an automatic and scalable feature-extraction pipeline to measure data and code availability in transportation research. (…) Open science represents a comprehensive vision integrating a wide array of concepts. However, using such broad terminology, which encompasses concepts ranging from open access publishing to general knowledge sharing, introduces unnecessary ambiguity to this study and the associated feature extraction. Therefore, we narrow the scope of open science practices to focus specifically on data and code availability connected with a paper. To align with established open science monitoring principles, we adopt clear, transparent decision rules as detailed below.”