How to Handle AI in Education?

In this post, I’m discussing the use of AI — specifically LLMs, or large language models, also known as generative AI, of which OpenAI’s GPT models (including ChatGPT) are prominent examples. I will first present a case for why detecting and preventing the use of such models in education is not feasible. Then, I’ll propose ways of instructing students to use AI properly in their learning. Finally, I’ll discuss the implications for different student types.

[UPDATE: September 29, 2023 — adding three tips for students.]

[UPDATE: June 22, 2023 — adding some pointers at the end of this article about the use of ChatGPT and other LLMs in thesis work.]

1. Three Tips for Students

Here are three tips for students on using AI for classwork:

  1. Use common sense, as in, ask yourself: “Is the way I’m using this tool ethical?” We all have a moral compass and an intuitive sense of wrong and right.
  2. Declare — meaning, write up exactly and precisely how you used the AI tool(s). This activity in itself will help in the process of determining “is this okay?”. If, in the course of writing, you realize things like “okay, this could lead to misinformation” or “I didn’t do anything to validate the information” or “I didn’t manually edit the text afterwards but just copy-pasted”, those are all red flags. So, making the use explicit and transparent helps you as much as others in determining whether the use was appropriate. See an example of AI use declaration.
  3. Ask! Whenever in doubt, immediately ask your teacher: “I used/am planning to use ChatGPT like this, is it okay?”. Many of the rules and practices are currently taking shape, so we can create them together with students and teachers.

2. Why Detection and Prevention Fails

There might be some indicative cues as to whether a piece of text was written by a student or by GPT, such as the lack of grammar mistakes (GPT writes near-perfect language, students often don’t :), the lack of aphorisms and metaphors (people use them often, but GPT rarely does), and so on.

However, the detection and prevention method seems fundamentally flawed for at least the following reasons:

  • there’ll be a non-trivial number of false positives and false negatives, which means there’ll be a need for additional verification, and that verification is likely to involve guesswork (i.e., ultimately we don’t know whether a text was written by an AI, a human, or a combination of the two),
  • as soon as such cues become public information, students will start circumventing them, resulting in a game of cat and mouse, and
  • the hybrid use of GPT is particularly difficult to detect, as relatively minor edits to a text can significantly change its appearance and thus fool both algorithmic and manual detection.

The result is that AI-generated text is often indistinguishable from human-written text and undetectable by algorithms. There are some “cues”, but we cannot rely on them to argue with 100% certainty whether or not a student used generative AI. We can ask, and the student can deny, and there is little more we can do. The conclusion is that prohibiting the use of AI is difficult or impossible for this reason alone. Nor do we want to prohibit its use entirely, because using AI is quickly becoming a professional skill that we should teach.

3. How to Instruct Students to Use AI

Based on the above reasoning, my take is that students should be allowed to use GPT (as we cannot prevent them from using it, and they’re likely to use it in their jobs anyway), but we must teach its ethical use — that is,

  • declare its use instead of pretending the text was 100% created by you,
  • explain how you used it in detail (the prompts, the editing process, etc.), and
  • verify all facts presented by GPT, as it has a tendency to hallucinate (verification should be done using credible sources such as academic research articles; government, institute, and industry reports; and statistical authorities). By facts, I mean primarily numerical information — based on my own experience, GPT’s concept definitions tend to be accurate. See how to triangulate AI-generated information.

Concerning the GPT models, we must bear in mind the instrumentality maxim: technology is just a tool. A bad student will use it in a bad way; a good student will use it in a good way. While we cannot remove the badness from this system, we can take some measures to tilt it in favor of the good, such as encouraging ethical behaviors and penalizing unethical ones. The bottom line is that “Zero GPT” just isn’t a realistic policy option, just as “Zero Wikipedia” (or “Zero Google”) wasn’t.

4. Using ChatGPT and Other LLMs in Thesis Work

The academic year 2023–2024 is rapidly approaching, and I find there are still no good solutions to the problem of students writing their bachelor’s or master’s theses with ChatGPT:

  • Turnitin has a detector, but it is not reliable enough to make a determination.
  • There are some other detectors developed by researchers and commercial parties; they are also not reliable enough to make a determination.
  • There are also ways to try to detect this new form of plagiarism manually: e.g., by spotting cues like “too perfect” text (ChatGPT writes grammatically better text than students!) and certain ways of expression that ChatGPT applies (e.g., it uses a lot of lists and very few metaphors). However, these cues are fallible, and in any case the student can simply deny using ChatGPT, and we cannot argue against that without proper evidence.

So, how to tackle this issue? Here are some approaches I’m currently considering:

First, ask students to sign a pledge at the beginning of the course:

“This pledge contains two promises: (a) I promise to declare my use of ChatGPT or similar tools — also in the case that I didn’t use them. If I did use them as part of the writing process, I’ll explain in detail how I used them, how I verified the information they gave me (via triangulation based on academic sources; see instructions here: [link]), and how I edited the computer-generated text to make it mine. Also, (b) I promise that I won’t use the computer-generated text as is, but will verify its content and edit it according to the guidelines provided by Joni: [link].”

So, the point is not to ban the use of ChatGPT (because we cannot enforce such a ban) but to allow its use, provided that the student uses it ethically and declares its use. In this process, we educators must provide guidelines: specific instructions on how to use ChatGPT in a responsible way and how to declare its use.

Second, I’m thinking of asking students to “write less, tabulate more”. That is, when they review the articles, I want to see their Excel files containing structured data based on information from the articles. While ChatGPT can help in this process, it is not (at least yet) able to carry out all the steps, including (a) identifying relevant literature, (b) defining a coding scheme that’s relevant to the research questions, and (c) extracting information from the articles according to that scheme. Again, it can help in these steps, but the student needs to invest considerable creative effort in managing the whole process. So, they need to engage in proper research activities, which is the point of doing a thesis. A minimal sketch of such a coding sheet follows below.
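To make this concrete, here is a minimal sketch of what such a coding sheet could look like, written in Python with pandas. The column names and the example entry are hypothetical; each student would adapt the scheme to their own research questions.

```python
# A minimal sketch of a literature-review coding sheet (hypothetical columns).
# Requires: pandas and openpyxl (for the Excel export).
import pandas as pd

# Each row codes one article; each column is one item in the coding scheme.
columns = [
    "article_id",       # citation key or DOI
    "context",          # e.g., industry, country, sample
    "method",           # e.g., survey, experiment, case study
    "key_finding",      # one-sentence summary in the student's own words
    "relevance_to_rq",  # how the finding relates to the thesis research questions
]

# A fictional entry, for illustration only.
rows = [
    [
        "doe2022",
        "B2B e-commerce, Finland",
        "survey (n = 214)",
        "Trust mediates the effect of personalization on purchase intent.",
        "Supports RQ2 on trust formation.",
    ],
]

coding_sheet = pd.DataFrame(rows, columns=columns)
coding_sheet.to_excel("literature_coding.xlsx", index=False)
```

The value is not in the file format but in the structured extraction: filling each cell forces the student to make an analytical decision that ChatGPT cannot reliably make for them.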

In a way, writing a thesis in the ChatGPT era might become more demanding, because it’s no longer adequate to just produce “text” — instead, students need to demonstrate their thinking process in other ways as well. So, ChatGPT can actually raise the bar for theses and make them better. This is exactly how a good tool should affect an activity: it should make the activity easier to complete while maintaining high quality.

And, from a fairness perspective, this new requirement can be met with or without using ChatGPT, just like other requirements in professional knowledge work.

5. Implications of AI for Different Student Types

Let us take this apart a bit more by considering four student types:

  • A good student = one who wants to learn and do a good job passing their courses/degree.
  • A talented student = one who has good intentions and is good at learning.
  • A poor student = one who has good intentions (is a good student) but struggles to learn for one reason or another.
  • A bad student = one who doesn’t want to learn but just wants to pass a course/degree with minimum effort.

The implications of AI for the different types are the interesting part. I am not too interested in the bad students. They might be viewed as out of scope: in some cases, the attitude is what it is and cannot be changed, and we cannot force learning. In other cases, it might be possible to convert a bad student into a good student, but I don’t see GPT as relevant for this (cf. the instrumentality maxim).

For the good students, we need to explain how to use GPT in a good way, so that they know how to do so and can use it to learn more efficiently. My hypothesis is that GPT supports the learning of talented students, as they can use it to amplify their already good learning strategies. However, I am not sure about the implications for poor students; whatever those implications may be, these students also need guidance.

In terms of skills that educators would need to pass on to their students, at least two readily come to mind:

  • The ability to ask questions: for GPT to be useful, one needs to ask it the right questions. A “right question” is one that supports learning. Learning a topic requires coming up with *many* questions that become progressively more advanced. So, the student needs to be able to craft progressively more difficult questions in order to increase his or her knowledge (in between, the student obviously needs to read and reflect on the answers). This skill relies equally on formulating the substance of the question (i.e., what is actually being asked?) and on phrasing the question (i.e., how is the question being asked?). Both factors affect the response quality. For example, a student who wants to know about the history of AI could learn that there is a concept called “AI winter” and then ask the AI to explain this concept. But there have in fact been two AI winters, so the sequence and formulation of the questions can leave gaps in the student’s learning. Thus, “prompting strategies” and “prompt engineering” are relevant skills here (see the sketch after this list).
  • The ability to evaluate the quality of answers: once the student receives an answer, they need to be able to assess its quality. What does quality mean? At least two criteria: veracity, so that the answer is true or correct, and comprehensiveness, so that the answer contains the necessary information to satisfy the information request. A third criterion could be connection – i.e., the answer introduces related concepts that let the student increase their learning by formulating new questions to the AI based on those concepts.
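To illustrate the questioning skill, here is a minimal sketch of a progressive questioning session, using the AI-winter example from above and the (2023-era) openai Python library. The model name and the question sequence are illustrative assumptions, not a prescribed method.

```python
# A minimal sketch of progressive questioning (assumes the 2023-era openai
# library; the model name and the questions are illustrative).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Questions escalate in difficulty; the student reads and reflects on each
# answer before moving on to the next one.
questions = [
    "What is meant by an 'AI winter'?",
    "How many AI winters have there been, and what triggered each one?",
    "How did research funding and priorities shift after each AI winter?",
]

history = []
for question in questions:
    history.append({"role": "user", "content": question})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=history,  # keep the whole dialogue so answers build on each other
    )
    answer = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    print(f"Q: {question}\nA: {answer}\n")
    # At this point, the student should apply the evaluation criteria above:
    # veracity (is the answer correct?), comprehensiveness (is anything
    # missing?), and connection (which new concepts could seed a follow-up?).
```

The escalating sequence matters more than the code: each answer should surface concepts (such as the second AI winter) that the student turns into the next, more advanced question.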

In terms of overall learning (i.e., ensuring that students master what they are expected to master to get a degree), the optimal mix would be going back to controlled exams for some courses (their use has been diminishing over time; this might reverse some of that change), while teaching the correct use of GPT in others.

6. Conclusion

AI is coming into education (or, rather, it’s already here). Educators cannot prevent the use of AI models in learning. Instead, they should make sure such models are used ethically and in a way that supports students’ learning.

Acknowledgments

Thanks to Mikko Piippo for the LinkedIn discussion that inspired this post 🙂 The tips were invented during a lesson with Bachelor’s thesis students at the University of Vaasa.