Foundations of Computational Stylometry

A project at IJP PAN that uses computational methods to identify hidden patterns and similarities in texts.

I work as a postdoc on this project (also known as FoCS), which is run by Maciej Eder in the Computational Stylistics Group of the Methodology Department at the Institute of Polish Language. Our goal is to understand the differences between texts and the forces that shape them, such as authorship, genre, and style. We frequently publish together with Joanna Byszuk, Michał Woźniak, and Ben Nagy, plus many other friends!

You can find more information on the official page.

References

Byszuk, Joanna, Michał Woźniak, Mike Kestemont, Albert Leśniak, Wojciech Łukasik, Artjoms Šeļa, and Maciej Eder. 2020. “Detecting Direct Speech in 19th-Century Novels.” In LREC 2020. https://aclanthology.org/2020.lt4hala-1.15.
Idziak, Jan, Artjoms Šeļa, Michał Woźniak, Albert Leśniak, Joanna Byszuk, and Maciej Eder. 2021. “Scalable Handwritten Text Recognition System for Lexicographic Sources of Under-Resourced Languages and Alphabets.” In Computational Science – ICCS 2021, edited by Maciej Paszynski, Dieter Kranzlmüller, Valeria V. Krzhizhanovskaya, Jack J. Dongarra, and Peter M. A. Sloot, 137–50. Lecture Notes in Computer Science. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-77961-0_13.
Šeļa, Artjoms. 2021. “Differences, Distances and Fingerprints: The Fundamentals of Stylometry and Multivariate Text Analysis [Erinevused, Kaugused Ja Sõrmejäljed: Stilomeetria Ja Mitmemõõtmelise Tekstianalüüsi Alused].” Keel Ja Kirjandus, no. 8-9: 696–718. https://doi.org/10.54013/kk764a3.
Šeļa, Artjoms, Ben Nagy, Joanna Byszuk, Laura Hernández-Lorenzo, Botond Szemes, and Maciej Eder. 2023. “From Stage to Page: Language Independent Bootstrap Measures of Distinctiveness in Fictional Speech.” arXiv. https://arxiv.org/abs/2301.05659.