Document Type
Article
Publication Date
1-1-2025
Abstract
World literature plays a key role in understanding the global diversity of human storytelling. However, datasets suitable for large-scale cross-cultural analysis remain limited. Responding to the increasing digitization of literary texts and the need for more diverse and multilingual resources, we introduce Mini Worldlit, a manually curated dataset of 1,192 works of contemporary fiction from 13 countries, representing nine languages across five continents. Mini Worldlit employs consistent cross-cultural selection criteria, overseen by scholarly experts, to ensure geographic, linguistic, and stylistic coherence. The dataset provides a foundation for future comparative studies of global literary cultures, offering a template for cross-cultural sampling. Our methodology pairs geographic boundaries with linguistic communities, enabling a structured exploration of world literature. This dataset is designed to facilitate a comparative approach to understanding literature and support the growing field of multilingual digital humanities.
Publication Source (Journal or Book title)
Journal of Open Humanities Data
Recommended Citation
Piper, A., Orhero, M., Bamman, D., Peksoy, E., Han, C., Rastogi, P., Bjerring-Hansen, J., Rasmussen, S., Long, H., Smeets, R., Marienberg-Milikowsky, I., Stuart, A., McEnaney, T., & Thomsen, M. (2025). Mini Worldlit: A Dataset of Contemporary Fiction from 13 Countries, Nine Languages, and Five Continents. Journal of Open Humanities Data, 11. https://doi.org/10.5334/johd.248