
Joni Salminen Posts

The Cross-Sectionality Problem in Machine Learning Benchmarking Datasets

(Written with Claude-3.5-Sonnet. I have checked and edited the content. -Joni) Summary of the Issue The holdout paradigm, commonly implemented as a train-test split, is a fundamental technique in machine learning for assessing model performance. However, when applied to cross-sectional data, it can lead to significant challenges in terms of model generalizability. Cross-sectional data, by its nature, provides a snapshot…

The Agency Theory of Chat-Based AI Personas: Insights from Economics

Written in collaboration with Claude-3.5-Sonnet. All information checked by human author. Background Because of the utility and value of large language models (LLMs) in tasks supporting persona creation and users’ interaction with personas, chat-based AI personas are becoming increasingly prevalent. However, these personas present unique challenges that can be understood through the lens of agency theory (also known as the…

The illusion of similarity (in academic research papers)

The illusion of similarity: two papers (A and B) look the same. They have the same structure. They have the same number of words. Both use fluent, nearly flawless English. Both reference the same number of papers. They have the same number of tables and figures. For a layperson, both these papers look the same. Yet, one of them is the worst horsesh*t you’ve ever…

List of Research Superpowers (Especially for PhD Students)

Observed myself citing various “superpowers” to my PhD students in occasional emails. So, thought of writing these down (the list might get updated). Currently, I can think of ten superpowers [updated: Sep 1, 2024] for researchers: Being specific is one #researchsuperpower (but it applies well beyond research). Be bold in your writing. I mean, literally, bold your contributions. Just…

What are the real benefits of human educators vs. ChatGPT?

Hi! Saw some interesting claims in this presentation about teacher strengths relative to AI: https://sway.cloud.microsoft/CqKZjoSsoTvWoTMB?ref=Link I’m assessing them one by one below. (Btw, if you’re interested in GenAI for bachelor’s level education, take a look at my tips and experiences.) “Teacher Strengths in Relation to language model ChatGPT:” “Artificial intelligence is an efficient tool that can support teachers in many…

Thoughts on integrating Generative AI in university education (specifically at bachelor’s level): Good and bad use cases

Sharing some thoughts, based on a discussion with colleagues, on how generative AI (GenAI) should and should not be integrated into teaching at bachelor’s degree level. (Also, since you’re here, you might be interested in this post about human educators’ benefits relative to AI.) First, in my opinion, teachers should have the final authority to decide if and how AI…

Rediscovering Digital Divide with GenAI

With digital tools, we keep rediscovering the same truth: they increase the productivity (performance) differences between people. Prolific people become more prolific, and non-prolific people become relatively worse off. Consider a simple example from the previous generation: ad platforms. Compare Copywriter A (mediocre) and Copywriter B (skilled). The skilled copywriter can draw a huge performance gain compared to the…

Different personas representing AI attitudes

Different personas representing AI attitudes: All these types have their own reasons and backgrounds for touting the message that they tout. I personally prefer the silent doers, who are less vocal and therefore harder to find. I prefer them because learning from them helps me apply lessons to my job.