Publications (Updating in progress: February 2021)

Marina Santini (PhD)

Senior Research Scientist, Computational Linguist, Data Analyst, Machine Learning Practitioner

RISE, Research Institutes of Sweden. Division: Digital Systems, Dept: Prototyping Societies, Unit: Digital Health.
DIVA (Digitala Vetenskapliga Arkivet)

Santini M. and Shih M.-C. Exploring the Potential of an Extensible Domain-Specific Web Corpus for “Layfication”: The Case of Cross-Lingual Classification. International Journal of Cyber-Physical Systems (IJCPS) Volume 2, Issue 1. (PDF).

Jerdhaf O., Santini M., Lundberg P., Karlsson A. and Jönsson A. (2020). Implant Terms: Focused Terminology Extraction with Swedish BERT - Preliminary Results. Eighth Swedish Language Technology Conference (SLTC2020), 25–27 November 2020. (Long Abstract - Workshop website)

Blomqvist E., Alirezaie M. and Santini M. (2020). Towards Causal Knowledge Graphs - Position Paper. KDH 2020: 5th International Workshop on Knowledge Discovery in Healthcare Data, held in conjunction with ECAI 2020, Santiago de Compostela, Spain. (Paper - Workshop website)

Santini M. and Jönsson A. (2020). Pinning Down Text Complexity: An Exploratory Study on the Registers of the Stockholm-Umeå Corpus (SUC). Register Studies 2:2. John Benjamins Publishing. Journal article. (Preprint - Companion Website)

Santini M., Jönsson A. and Rennes E. (2020). Visualizing Facets of Text Complexity across Registers. Workshop LREC 2020 – READI (Tools and Resources to Empower People with REAding DIfficulties). Poster paper (recommended version); Workshop Proceedings; Workshop READI at LREC 2020 (Programme).


Santini M., Danielsson B. and Jönsson A. (2019). Comparing the Performance of Feature Representations for the Categorization of the Easy-to-Read Variety vs Standard Language. NoDaLiDa 2019, September 30 - October 2, 2019, Turku, Finland. (Conference paper, presentation, companion website).

Santini M., Danielsson B. and Jönsson A. (2019). Introducing the Notion of ‘Contrast’ Features for Language Technology. In International Conference on Database and Expert Systems Applications (pp. 189-198). Springer, Cham. (Workshop paper, presentation, proceedings).

Santini M., Strandqvist W. and Jönsson A. (2019). Profiling specialized web corpus qualities: A progress report on "Domainhood". Argentinian Journal of Applied Linguistics, 7(1). (Journal article, AJAL Journal).

Santini, M., Jönsson, A., Strandqvist, W., Cederblad, G., Nyström, M., Alirezaie, M., Lind, L., Blomqvist, E., Lindén, M. and Kristoffersson, A. (2019). Designing an Extensible Domain-Specific Web Corpus for “Layfication”: A Case Study in eCare at Home. In Cyber-Physical Systems for Social Applications (pp. 98-155). IGI Global. (Book chapter, book).

Santini M., Strandqvist W. and Jönsson A. (2018). Profiling Domain Specificity of Specialized Web Corpora using Burstiness. Explorations and Open Issues. SLTC2018 - Swedish Language Technology Conference 2018, 7-9 November 2018, Stockholm, Sweden. (Conference poster paper, poster).

Santini M., Strandqvist W., Nyström M., Alirezaie M. and Jönsson A. (2018). Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity in Web Corpora. In International Conference on Database and Expert Systems Applications (pp. 207-217). Springer, Cham. (TIR workshop paper, presentation).

Strandqvist W., Santini M., Lind L. and Jönsson A. (2018). Towards a Quality Assessment of Web Corpora for Language Technology Applications. In: Read T., Montaner S. and Sedano B. (2018). Technological Innovation for Specialized Linguistic Domains: Languages for Digital Lives and Cultures. Proceedings of TISLID’18. Éditions universitaires européennes. (Abstract, paper, presentation).

Santini M. and Jönsson A. (2017). E-care@home: Towards a better communication between patients and doctors using Language Technology. Medicinteknikdagarna 2017, Västerås, Sweden, 10-11 October 2017. (Oral presentation).

Santini M., Jönsson A., Nyström M. and Alirezaie M. (2017). A Web Corpus for eCare: Collection, Lay Annotation and Learning. First Results. Proceedings of LTA'17, FedCSIS 2017, Prague. (Paper, presentation).

Falkenjack J., Santini M. and Jönsson A. (2016). An Exploratory Study on Genre Classification using Readability Features. The Sixth Swedish Language Technology Conference (SLTC), Umeå University, Umeå, Sweden, 17-18 November, 2016. (Paper, poster).

Santini M. (2006). Interpreting Genre Evolution on the Web: Preliminary Results. In: Proceedings of the Workshop on New Text – Wikis and blogs and other dynamic text sources, held in conjunction with EACL06, Trento, Italy. (Paper).