Center for Philosophy of Artificial Intelligence (CPAI)

Publications

2025

Søgaard, A., & Stamatiou, F. (2025). Mele’s Digital Zygote: Developer Responsibility for Neural Networks. Science and Engineering Ethics, 31(6), 40.

Schiller, S. R., Signorelli, C. M., & Stamatiou, F. (2025, October). The Intercepted Self: How Generative AI Challenges the Dynamics of the Relational Self. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (Vol. 8, No. 3, pp. 2284-2291).

Stamatiou, F., & Karakonstantis, X. (2024). Should you trust your voice assistant? It’s complicated, but no. In Proceedings of the 14th Scandinavian Conference on Artificial Intelligence (SCAI 2024) (pp. 131-137).

Oldenburg, Ninell; Dhar, Ruchira; Søgaard, Anders. 2026. Realist and Pluralist Conceptions of Intelligence and Their Implications on AI (Alignment) Research. The Alignment Track of The 40th Annual AAAI Conference on Artificial Intelligence (AAAI). Singapore, Singapore.

Chen, Xinyi; Yuan, Yifei; Li, Jiaang; Belongie, Serge; de Rijke, Marten; Søgaard, Anders. 2025. What if Othello-Playing Language Models Could See? Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025. Suzhou, China.

Grünbaum, Thor; Søgaard, Anders. 2025. Large language models and linguistic understanding. Artificial Intelligence, Humans and the Law. Oxford: Taylor & Francis.

Dhar, Ruchira; Brandl, Stephanie; Oldenburg, Ninell; Søgaard, Anders. 2025. Beyond Technocratic XAI: The Who, What and How in Explanation Design. Proceedings of the 8th AAAI/ACM Conference on AI, Ethics, and Society. Madrid, Spain.

Oldenburg, N., & Papyshev, G. (2025, October). The Stories We Govern By: AI, Risk, and the Power of Imaginaries. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (Vol. 8, No. 2, pp. 1939-1950).

Søgaard, Anders. 2025. Do Language Models Have Semantics? On the Five Standard Positions. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL). Vienna, Austria.

Fierro, Constanza; Foroutan, Negar; Elliott, Desmond; Søgaard, Anders. 2025. How Do Multilingual Language Models Remember Facts? Findings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL). Vienna, Austria.

Oldenburg, Ninell; Søgaard, Anders. 2025. Navigating the Informativeness-Compression Trade-Off in XAI. AI & Ethics.

Bangsgaard, Alberte Romme; Ryelund, Cecilia Kløve; Nilsson, Mathilde Marie Lind; Søgaard, Anders. 2025. Digital Friends and Empathy Blindness. Open Philosophy.

Yuan, Yifei; Søgaard, Anders. 2025. Revisiting the Othello World Model Hypothesis. ICLR 2025 World Models Workshop. Singapore, Singapore.

Karamolegkou, Antonia; Schiller Hansen, Sandrine; Christopoulou, Ariadni; Stamatiou, Filippos; Lauscher, Anne; Søgaard, Anders. 2025. Ethical Concern Identification in NLP: A Corpus of ACL Anthology Ethics Statements. Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL). Albuquerque, New Mexico.

2024

Søgaard, Anders. 2024. Externalist XAI? Theoria.

Søgaard, Anders. 2024. Is Unsupervised Learning Somehow Truer? Minds and Machines 34(4): 43.

Peng, Qiwei; Søgaard, Anders. 2024. Concept Space Alignment in Multilingual LLMs. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024. Miami, Florida.

Fierro, Constanza; Dhar, Ruchira; Stamatiou, Filippos; Garneau, Nicolas; Søgaard, Anders. 2024. Defining Knowledge: Bridging Epistemology and Large Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024. Miami, Florida.

Li, Jiaang; Kementchedjhieva, Yova; Fierro, Constanza; Søgaard, Anders. 2024. Do Vision and Language Models Share Concepts? A Vector Space Alignment Study. Transactions of the Association for Computational Linguistics (TACL) 12: 1232-1249.

Dhar, Ruchira; Søgaard, Anders. 2024. From Words to Worlds: Compositionality for Cognitive Architectures. ICML 2024 Workshop on LLMs and Cognition. Vienna, Austria.

Li, Jiaang; Karamolegkou, Antonia; Kementchedjhieva, Yova; Abdou, Mostafa; Lehmann, Sune; Søgaard, Anders. 2024. Structural Similarities Between Language Models and Neural Response Measurements. NeurIPS 2023 Workshop on Symmetry and Geometry in Neural Representations. New Orleans, LA.

Søgaard, Anders. 2024. On the Opacity of Deep Neural Networks. Canadian Journal of Philosophy 53(3): 224-239.

Søgaard, Anders. 2024. Identity Theory and Falsifiability. Acta Analytica.

Schiller, Sandrine; Søgaard, Anders. 2024. The Challenge of Generative AI Optimized for Engagement. Robophilosophy Conference 2024. Aarhus, Denmark.

Smidt, Mathilde; Anegaard, Olivia; Søgaard, Anders. 2024. How Good Are We at Assessing the Trustworthiness of LLMs? Robophilosophy Conference 2024. Aarhus, Denmark.

van Zee, Anna Katrine; van Zee, Marc; Søgaard, Anders. 2024. Group Fairness in Multilingual Speech Recognition Models. Findings of the North American Chapter of the Association for Computational Linguistics (NAACL). Mexico City, Mexico.

Søgaard, Anders; Kappel, Klemens; Grünbaum, Thor. 2024. On Hedden’s Proof that Machine Learning Fairness Metrics are Flawed. Inquiry.

2023

Søgaard, Anders. 2023. Can Machines Be Trustworthy? AI & Ethics.

Karamolegkou, Antonia; Li, Jiaang; Zhou, Li; Søgaard, Anders. 2023. Copyright Violations and Large Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023. Singapore, Singapore.