'ZDNET Recommends': What exactly does it mean?

ZDNET's recommendations are based on many hours of testing, research, and comparison shopping. We gather data from the best available sources, including vendor and retailer listings as well as other relevant and independent review sites. And we pore over customer reviews to find out what matters to real people who already own and use the products and services we’re assessing.

When you click through from our site to a retailer and buy a product or service, we may earn affiliate commissions. This helps support our work, but does not affect what we cover or how, and it does not affect the price you pay. Neither ZDNET nor the author is compensated for these independent reviews. Indeed, we follow strict guidelines that ensure our editorial content is never influenced by advertisers.

ZDNET's editorial team writes on behalf of you, our reader. Our goal is to deliver the most accurate information and the most knowledgeable advice possible in order to help you make smarter buying decisions on tech gear and a wide array of products and services. Our editors thoroughly review and fact-check every article to ensure that our content meets the highest standards. If we have made an error or published misleading information, we will correct or clarify the article. If you see inaccuracies in our content, please report the mistake via this form.

How ChatGPT (and other AI chatbots) can help you write an essay


ChatGPT is capable of doing many different things very well, with one of the biggest standout features being its ability to compose all sorts of text within seconds, including songs, poems, bedtime stories, and essays.

The chatbot's writing abilities are not only fun to experiment with, but can also help with everyday tasks. Whether you are a student, a working professional, or just getting stuff done, you constantly take time out of your day to compose emails, texts, posts, and more. ChatGPT can help you claim some of that time back by helping you brainstorm and then compose any text you need.

How to use ChatGPT to write: Code | Excel formulas | Resumes  | Cover letters  

Contrary to popular belief, ChatGPT can do much more than just write an essay for you from scratch (which would be considered plagiarism). A more useful way to use the chatbot is to have it guide your writing process. 

Below, we show you how to use ChatGPT to do both the writing and assisting, as well as some other helpful writing tips. 

How ChatGPT can help you write an essay

If you are looking to use ChatGPT to support or replace your writing, here are five different techniques to explore. 

It is also worth noting before you get started that, depending on your needs, other AI chatbots can produce results that match ChatGPT's, or are even better.

Also: The best AI chatbots of 2024: ChatGPT and alternatives

For example,  Copilot  has access to the internet, and as a result, it can source its answers from recent information and current events. Copilot also includes footnotes linking back to the original source for all of its responses, making the chatbot a more valuable tool if you're writing a paper on a more recent event, or if you want to verify your sources.

Regardless of which AI chatbot you pick, you can use the tips below to get the most out of your prompts and from AI assistance.

1. Use ChatGPT to generate essay ideas

Before you can even get started writing an essay, you need to flesh out the idea. When professors assign essays, they generally give students a prompt that gives them leeway for their own self-expression and analysis. 

As a result, students have the task of finding the angle to approach the essay on their own. If you have written an essay recently, you know that finding the angle is often the trickiest part -- and this is where ChatGPT can help. 

Also: ChatGPT vs. Copilot: Which AI chatbot is better for you?

All you need to do is input the assignment topic, include as much detail as you'd like -- such as what you're thinking about covering -- and let ChatGPT do the rest. For example, based on a paper prompt I had in college, I asked:

Can you help me come up with a topic idea for this assignment, "You will write a research paper or case study on a leadership topic of your choice." I would like it to include Blake and Mouton's Managerial Leadership Grid, and possibly a historical figure. 
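The same brainstorming prompt can also be sent programmatically. The sketch below is a minimal, illustrative example that posts a prompt to OpenAI's chat-completions endpoint using only the Python standard library; the model name is an assumption, and you would need your own API key to actually run the request.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(prompt: str, model: str = "gpt-4o") -> dict:
    """Wrap a single user prompt in the request body the API expects."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def ask_chatgpt(prompt: str, api_key: str, model: str = "gpt-4o") -> str:
    """Send the prompt and return the assistant's reply text."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt, model)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]

# The brainstorming prompt from the article, as a plain string:
brainstorm_prompt = (
    "Can you help me come up with a topic idea for this assignment: "
    '"You will write a research paper or case study on a leadership topic '
    'of your choice." I would like it to include Blake and Mouton\'s '
    "Managerial Leadership Grid, and possibly a historical figure."
)
# reply = ask_chatgpt(brainstorm_prompt, api_key="sk-...")  # needs a real key
```

The chat interface used in the article and the API accept the same kind of natural-language prompts, so everything in this guide applies to both.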

Also: I'm a ChatGPT pro but this quick course taught me new tricks, and you can take it for free

Within seconds, the chatbot produced a response that provided me with a title for the essay, options for historical figures I could focus the paper on, and insight into what information I could include, with specific examples of a case study I could use.

2. Use the chatbot to create an outline

Once you have a solid topic, it's time to start brainstorming what you actually want to include in the essay. To facilitate the writing process, I always create an outline, including all the different points I want to touch upon in my essay. However, the outline-writing process is usually tedious. 

With ChatGPT, all you have to do is ask it to write the outline for you. 

Also: Thanks to my 5 favorite AI tools, I'm working smarter now

Using the topic that ChatGPT helped me generate in step one, I asked the chatbot to write me an outline by saying: 

Can you create an outline for a paper, "Examining the Leadership Style of Winston Churchill through Blake and Mouton's Managerial Leadership Grid."

After a couple of seconds, the chatbot produced a holistic outline divided into seven different sections, with three different points under each section. 

This outline is thorough and can be condensed for a shorter essay or elaborated on for a longer paper. If you don't like something or want to tweak the outline further, you can do so either manually or with more instructions to ChatGPT. 

As mentioned before, since Copilot is connected to the internet, if you use Copilot to produce the outline, it will even include links and sources throughout, further expediting your essay-writing process. 

3. Use ChatGPT to find sources

Now that you know exactly what you want to write, it's time to find reputable sources to get your information. If you don't know where to start, you can just ask ChatGPT. 

Also: How to make ChatGPT provide sources and citations

All you need to do is ask the AI to find sources for your essay topic. For example, I asked the following: 

Can you help me find sources for a paper, "Examining the Leadership Style of Winston Churchill through Blake and Mouton's Managerial Leadership Grid."

The chatbot output seven sources, with a bullet point for each that explained what the source was and why it could be useful. 

Also: How to use ChatGPT to make charts and tables

The one caveat you will want to be aware of when using ChatGPT for sources is that it does not have access to information after 2021, so it will not be able to suggest the freshest sources. If you want up-to-date information, you can always use Copilot. 

Another perk of using Copilot is that it automatically links to sources in its answers. 

4. Use ChatGPT to write an essay

It is worth noting that if you take the text directly from the chatbot and submit it, your work could be considered a form of plagiarism since it is not your original work. As with any information taken from another source, text generated by an AI should be clearly identified and credited in your work.

Also: ChatGPT will now remember its past conversations with you (if you want it to)

In most educational institutions, the penalties for plagiarism are severe, ranging from a failing grade to expulsion from the school. A better use of ChatGPT's writing features would be to use it to create a sample essay to guide your writing. 

If you still want ChatGPT to create an essay from scratch, enter the topic and the desired length, and then watch what it generates. For example, I input the following text: 

Can you write a five-paragraph essay on the topic, "Examining the Leadership Style of Winston Churchill through Blake and Mouton's Managerial Leadership Grid."

Within seconds, the chatbot gave the exact output I required: a coherent, five-paragraph essay on the topic. You could then use that text to guide your own writing. 

Also: ChatGPT vs. Microsoft Copilot vs. Gemini: Which is the best AI chatbot?

At this point, it's worth remembering how tools like ChatGPT work: they put words together in a form that they think is statistically valid, but they don't know if what they are saying is true or accurate.

As a result, the output you receive might include invented facts, details, or other oddities. The output might be a useful starting point for your own work, but don't expect it to be entirely accurate, and always double-check the content. 

5. Use ChatGPT to co-edit your essay

Once you've written your own essay, you can use ChatGPT's advanced writing capabilities to edit the piece for you. 

You can simply tell the chatbot what you want it to edit. For example, I asked ChatGPT to edit our five-paragraph essay for structure and grammar, but other options could have included flow, tone, and more. 

Also: AI meets AR as ChatGPT is now available on the Apple Vision Pro

Once you ask the tool to edit your essay, it will prompt you to paste your text into the chatbot. ChatGPT will then output your essay with corrections made. This feature is particularly useful because ChatGPT edits your essay more thoroughly than a basic proofreading tool, as it goes beyond simply checking spelling. 

You can also co-edit with the chatbot, asking it to take a look at a specific paragraph or sentence, and asking it to rewrite or fix the text for clarity. Personally, I find this feature very helpful. 


The College Essay Is Dead

Nobody is prepared for how AI will transform academia.


Suppose you are a professor of pedagogy, and you assign an essay on learning styles. A student hands in an essay with the following opening paragraph:

The construct of “learning styles” is problematic because it fails to account for the processes through which learning styles are shaped. Some students might develop a particular learning style because they have had particular experiences. Others might develop a particular learning style by trying to accommodate to a learning environment that was not well suited to their learning needs. Ultimately, we need to understand the interactions among learning styles and environmental and personal factors, and how these shape how we learn and the kinds of learning we experience.

Pass or fail? A- or B+? And how would your grade change if you knew a human student hadn’t written it at all? Because Mike Sharples, a professor in the U.K., used GPT-3, a large language model from OpenAI that automatically generates text from a prompt, to write it. (The whole essay, which Sharples considered graduate-level, is available, complete with references, here.) Personally, I lean toward a B+. The passage reads like filler, but so do most student essays.

Sharples’s intent was to urge educators to “rethink teaching and assessment” in light of the technology, which he said “could become a gift for student cheats, or a powerful teaching assistant, or a tool for creativity.” Essay generation is neither theoretical nor futuristic at this point. In May, a student in New Zealand confessed to using AI to write their papers, justifying it as a tool like Grammarly or spell-check: “I have the knowledge, I have the lived experience, I’m a good student, I go to all the tutorials and I go to all the lectures and I read everything we have to read but I kind of felt I was being penalised because I don’t write eloquently and I didn’t feel that was right,” they told a student paper in Christchurch. They don’t feel like they’re cheating, because the student guidelines at their university state only that you’re not allowed to get somebody else to do your work for you. GPT-3 isn’t “somebody else”—it’s a program.

The world of generative AI is progressing furiously. Last week, OpenAI released an advanced chatbot named ChatGPT that has spawned a new wave of marveling and hand-wringing, plus an upgrade to GPT-3 that allows for complex rhyming poetry; Google previewed new applications last month that will allow people to describe concepts in text and see them rendered as images; and the creative-AI firm Jasper received a $1.5 billion valuation in October. It still takes a little initiative for a kid to find a text generator, but not for long.

The essay, in particular the undergraduate essay, has been the center of humanistic pedagogy for generations. It is the way we teach children how to research, think, and write. That entire tradition is about to be disrupted from the ground up. Kevin Bryan, an associate professor at the University of Toronto, tweeted in astonishment about OpenAI’s new chatbot last week: “You can no longer give take-home exams/homework … Even on specific questions that involve combining knowledge across domains, the OpenAI chat is frankly better than the average MBA at this point. It is frankly amazing.” Neither the engineers building the linguistic tech nor the educators who will encounter the resulting language are prepared for the fallout.

A chasm has existed between humanists and technologists for a long time. In the 1950s, C. P. Snow gave his famous lecture, later the essay “The Two Cultures,” describing the humanistic and scientific communities as tribes losing contact with each other. “Literary intellectuals at one pole—at the other scientists,” Snow wrote. “Between the two a gulf of mutual incomprehension—sometimes (particularly among the young) hostility and dislike, but most of all lack of understanding. They have a curious distorted image of each other.” Snow’s argument was a plea for a kind of intellectual cosmopolitanism: Literary people were missing the essential insights of the laws of thermodynamics, and scientific people were ignoring the glories of Shakespeare and Dickens.

The rupture that Snow identified has only deepened. In the modern tech world, the value of a humanistic education shows up in evidence of its absence. Sam Bankman-Fried, the disgraced founder of the crypto exchange FTX who recently lost his $16 billion fortune in a few days, is a famously proud illiterate. “I would never read a book,” he once told an interviewer. “I don’t want to say no book is ever worth reading, but I actually do believe something pretty close to that.” Elon Musk and Twitter are another excellent case in point. It’s painful and extraordinary to watch the ham-fisted way a brilliant engineering mind like Musk deals with even relatively simple literary concepts such as parody and satire. He obviously has never thought about them before. He probably didn’t imagine there was much to think about.

The extraordinary ignorance on questions of society and history displayed by the men and women reshaping society and history has been the defining feature of the social-media era. Apparently, Mark Zuckerberg has read a great deal about Caesar Augustus, but I wish he’d read about the regulation of the pamphlet press in 17th-century Europe. It might have spared America the annihilation of social trust.

These failures don’t derive from mean-spiritedness or even greed, but from a willful obliviousness. The engineers do not recognize that humanistic questions—like, say, hermeneutics or the historical contingency of freedom of speech or the genealogy of morality—are real questions with real consequences. Everybody is entitled to their opinion about politics and culture, it’s true, but an opinion is different from a grounded understanding. The most direct path to catastrophe is to treat complex problems as if they’re obvious to everyone. You can lose billions of dollars pretty quickly that way.

As the technologists have ignored humanistic questions to their peril, the humanists have greeted the technological revolutions of the past 50 years by committing soft suicide. As of 2017, the number of English majors had nearly halved since the 1990s. History enrollments have declined by 45 percent since 2007 alone. Needless to say, humanists’ understanding of technology is partial at best. The state of digital humanities is always several categories of obsolescence behind, which is inevitable. (Nobody expects them to teach via Instagram Stories.) But more crucially, the humanities have not fundamentally changed their approach in decades, despite technology altering the entire world around them. They are still exploding meta-narratives like it’s 1979, an exercise in self-defeat.

Read: The humanities are in crisis

Contemporary academia engages, more or less permanently, in self-critique on any and every front it can imagine. In a tech-centered world, language matters, voice and style matter, the study of eloquence matters, history matters, ethical systems matter. But the situation requires humanists to explain why they matter, not constantly undermine their own intellectual foundations. The humanities promise students a journey to an irrelevant, self-consuming future; then they wonder why their enrollments are collapsing. Is it any surprise that nearly half of humanities graduates regret their choice of major?

The case for the value of humanities in a technologically determined world has been made before. Steve Jobs always credited a significant part of Apple’s success to his time as a dropout hanger-on at Reed College, where he fooled around with Shakespeare and modern dance, along with the famous calligraphy class that provided the aesthetic basis for the Mac’s design. “A lot of people in our industry haven’t had very diverse experiences. So they don’t have enough dots to connect, and they end up with very linear solutions without a broad perspective on the problem,” Jobs said. “The broader one’s understanding of the human experience, the better design we will have.” Apple is a humanistic tech company. It’s also the largest company in the world.

Despite the clear value of a humanistic education, its decline continues. Over the past 10 years, STEM has triumphed, and the humanities have collapsed. The number of students enrolled in computer science is now nearly the same as the number of students enrolled in all of the humanities combined.

And now there’s GPT-3. Natural-language processing presents the academic humanities with a whole series of unprecedented problems. Practical matters are at stake: Humanities departments judge their undergraduate students on the basis of their essays. They give Ph.D.s on the basis of a dissertation’s composition. What happens when both processes can be significantly automated? Going by my experience as a former Shakespeare professor, I figure it will take 10 years for academia to face this new reality: two years for the students to figure out the tech, three more years for the professors to recognize that students are using the tech, and then five years for university administrators to decide what, if anything, to do about it. Teachers are already some of the most overworked, underpaid people in the world. They are already dealing with a humanities in crisis. And now this. I feel for them.

And yet, despite the drastic divide of the moment, natural-language processing is going to force engineers and humanists together. They are going to need each other despite everything. Computer scientists will require basic, systematic education in general humanism: The philosophy of language, sociology, history, and ethics are not amusing questions of theoretical speculation anymore. They will be essential in determining the ethical and creative use of chatbots, to take only an obvious example.

The humanists will need to understand natural-language processing because it’s the future of language, but also because there is more than just the possibility of disruption here. Natural-language processing can throw light on a huge number of scholarly problems. It is going to clarify matters of attribution and literary dating that no system ever devised will approach; the parameters in large language models are much more sophisticated than the current systems used to determine which plays Shakespeare wrote, for example. It may even allow for certain types of restorations, filling the gaps in damaged texts by means of text-prediction models. It will reformulate questions of literary style and philology; if you can teach a machine to write like Samuel Taylor Coleridge, that machine must be able to inform you, in some way, about how Samuel Taylor Coleridge wrote.

The connection between humanism and technology will require people and institutions with a breadth of vision and a commitment to interests that transcend their field. Before that space for collaboration can exist, both sides will have to take the most difficult leaps for highly educated people: Understand that they need the other side, and admit their basic ignorance. But that’s always been the beginning of wisdom, no matter what technological era we happen to inhabit.


Open access | Published: 30 October 2023

A large-scale comparison of human-written versus ChatGPT-generated essays

  • Steffen Herbold 1 ,
  • Annette Hautli-Janisz 1 ,
  • Ute Heuer 1 ,
  • Zlata Kikteva 1 &
  • Alexander Trautsch 1  

Scientific Reports volume 13, Article number: 18617 (2023)


  • Computer science
  • Information technology

ChatGPT and similar generative AI models have attracted hundreds of millions of users and have become part of the public discourse. Many believe that such models will disrupt society and lead to significant changes in the education system and information generation. So far, this belief is based on either colloquial evidence or benchmarks from the owners of the models—both lack scientific rigor. We systematically assess the quality of AI-generated content through a large-scale study comparing human-written versus ChatGPT-generated argumentative student essays. We use essays that were rated by a large number of human experts (teachers). We augment the analysis by considering a set of linguistic characteristics of the generated essays. Our results demonstrate that ChatGPT generates essays that are rated higher regarding quality than human-written essays. The writing style of the AI models exhibits linguistic characteristics that are different from those of the human-written essays. Since the technology is readily available, we believe that educators must act immediately. We must re-invent homework and develop teaching concepts that utilize these AI models in the same way as math utilizes the calculator: teach the general concepts first and then use AI tools to free up time for other learning objectives.


Introduction

The massive uptake in the development and deployment of large-scale Natural Language Generation (NLG) systems in recent months has yielded an almost unprecedented worldwide discussion of the future of society. The ChatGPT service, which serves as a Web front-end to GPT-3.5 1 and GPT-4, was the fastest-growing service in history to break the 100 million user milestone in January and had 1 billion visits by February 2023 2 .

Driven by the upheaval that is particularly anticipated for education 3 and knowledge transfer for future generations, we conduct the first independent, systematic study of AI-generated language content that is typically dealt with in high-school education: argumentative essays, i.e. essays in which students discuss a position on a controversial topic by collecting and reflecting on evidence (e.g. ‘Should students be taught to cooperate or compete?’). Learning to write such essays is a crucial aspect of education, as students learn to systematically assess and reflect on a problem from different perspectives. Understanding the capability of generative AI to perform this task increases our understanding of the skills of the models, as well as of the challenges educators face when it comes to teaching this crucial skill. While there is a multitude of individual examples and anecdotal evidence for the quality of AI-generated content in this genre (e.g. 4 ) this paper is the first to systematically assess the quality of human-written and AI-generated argumentative texts across different versions of ChatGPT 5 . We use a fine-grained essay quality scoring rubric based on content and language mastery and employ a significant pool of domain experts, i.e. high school teachers across disciplines, to perform the evaluation. Using computational linguistic methods and rigorous statistical analysis, we arrive at several key findings:

AI models generate significantly higher-quality argumentative essays than the users of an essay-writing online forum frequented by German high-school students across all criteria in our scoring rubric.

ChatGPT-4 (ChatGPT web interface with the GPT-4 model) significantly outperforms ChatGPT-3 (ChatGPT web interface with the GPT-3.5 default model) with respect to logical structure, language complexity, vocabulary richness and text linking.

Writing styles between humans and generative AI models differ significantly: for instance, the GPT models use more nominalizations and have higher sentence complexity (signaling more complex, ‘scientific’, language), whereas the students make more use of modal and epistemic constructions (which tend to convey speaker attitude).

The linguistic diversity of the NLG models seems to be improving over time: while ChatGPT-3 still has a significantly lower linguistic diversity than humans, ChatGPT-4 has a significantly higher diversity than the students.
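As a rough illustration of what a linguistic-diversity measure looks like (the study itself relies on more sophisticated, length-corrected metrics), a plain type-token ratio can be computed in a few lines of Python:

```python
import re

def type_token_ratio(text: str) -> float:
    """Distinct word forms (types) divided by total words (tokens).

    Higher values indicate more varied vocabulary. The plain ratio is
    sensitive to text length, which is why research typically uses
    corrected variants; this version is only illustrative.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)
```

For example, "the cat sat on the mat" has six tokens but only five distinct types, giving a ratio of 5/6.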

Our work goes significantly beyond existing benchmarks. While OpenAI’s technical report on GPT-4 6 presents some benchmarks, their evaluation lacks scientific rigor: it fails to provide vital information like the agreement between raters, does not report on details regarding the criteria for assessment or to what extent and how a statistical analysis was conducted for a larger sample of essays. In contrast, our benchmark provides the first (statistically) rigorous and systematic study of essay quality, paired with a computational linguistic analysis of the language employed by humans and two different versions of ChatGPT, offering a glance at how these NLG models develop over time. While our work is focused on argumentative essays in education, the genre is also relevant beyond education. In general, studying argumentative essays is one important aspect to understand how good generative AI models are at conveying arguments and, consequently, persuasive writing in general.

Related work

Natural language generation.

The recent interest in generative AI models can be largely attributed to the public release of ChatGPT, a public interface in the form of an interactive chat based on the InstructGPT 1 model, more commonly referred to as GPT-3.5. In comparison to the original GPT-3 7 and other similar generative large language models based on the transformer architecture like GPT-J 8 , this model was not trained in a purely self-supervised manner (e.g. through masked language modeling). Instead, a pipeline that involved human-written content was used to fine-tune the model and improve the quality of the outputs to both mitigate biases and safety issues, as well as make the generated text more similar to text written by humans. Such models are referred to as Fine-tuned LAnguage Nets (FLANs). For details on their training, we refer to the literature 9 . Notably, this process was recently reproduced with publicly available models such as Alpaca 10 and Dolly (i.e. the complete models can be downloaded and not just accessed through an API). However, we can only assume that a similar process was used for the training of GPT-4 since the paper by OpenAI does not include any details on model training.

Testing of the language competency of large-scale NLG systems has only recently started. Cai et al. 11 show that ChatGPT reuses sentence structure, accesses the intended meaning of an ambiguous word, and identifies the thematic structure of a verb and its arguments, replicating human language use. Mahowald 12 compares ChatGPT’s acceptability judgments to human judgments on the Article + Adjective + Numeral + Noun construction in English. Dentella et al. 13 show that ChatGPT-3 fails to understand low-frequent grammatical constructions like complex nested hierarchies and self-embeddings. In another recent line of research, the structure of automatically generated language is evaluated. Guo et al. 14 show that in question-answer scenarios, ChatGPT-3 uses different linguistic devices than humans. Zhao et al. 15 show that ChatGPT generates longer and more diverse responses when the user is in an apparently negative emotional state.

Given that we aim to identify certain linguistic characteristics of human-written versus AI-generated content, we also draw on related work in the field of linguistic fingerprinting, which assumes that each human has a unique way of using language to express themselves, i.e. the linguistic means that are employed to communicate thoughts, opinions and ideas differ between humans. That these properties can be identified with computational linguistic means has been showcased across different tasks: the computation of a linguistic fingerprint allows to distinguish authors of literary works 16 , the identification of speaker profiles in large public debates 17 , 18 , 19 , 20 and the provision of data for forensic voice comparison in broadcast debates 21 , 22 . For educational purposes, linguistic features are used to measure essay readability 23 , essay cohesion 24 and language performance scores for essay grading 25 . Integrating linguistic fingerprints also yields performance advantages for classification tasks, for instance in predicting user opinion 26 , 27 and identifying individual users 28 .
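To make the idea of measuring essay readability with linguistic features concrete, the sketch below implements the classic Flesch reading-ease formula, using a crude vowel-group heuristic for syllables. Real grading systems use far richer feature sets than this single score, and the syllable counter is only approximate.

```python
import re

def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels; at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Flesch reading ease: higher scores indicate easier text."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[a-zA-Z']+", text)
    if not words:
        return 0.0
    syllables = sum(count_syllables(word) for word in words)
    return (
        206.835
        - 1.015 * (len(words) / sentences)
        - 84.6 * (syllables / len(words))
    )
```

Short, common words drive the score up, while long sentences and polysyllabic words drive it down, which matches the intuition behind using such features as readability proxies.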

Limitations of OpenAI's ChatGPT evaluations

OpenAI published a discussion of the model’s performance on several tasks, including Advanced Placement (AP) classes within the US educational system 6 . The subjects used in performance evaluation are diverse and include arts, history, English literature, calculus, statistics, physics, chemistry, economics, and US politics. While the models achieved good or very good marks in most subjects, they did not perform well in English literature. GPT-3.5 also experienced problems with chemistry, macroeconomics, physics, and statistics. While the overall results are impressive, there are several significant issues: firstly, the conflict of interest of the model’s owners poses a problem for the interpretation of the performance. Secondly, there are issues with the soundness of the assessment beyond the conflict of interest, which make it hard to assess how far the results generalize to the models’ capability to write essays. Notably, the AP exams combine multiple-choice questions with free-text answers. Only the aggregated scores are publicly available. To the best of our knowledge, neither the generated free-text answers, their overall assessment, nor their assessment given specific criteria from the judgment rubric used have been published. Thirdly, while the paper states that 1–2 qualified third-party contractors participated in the rating of the free-text answers, it is unclear how often multiple ratings were generated for the same answer and what the agreement between them was. This lack of information hinders a scientifically sound judgement regarding the capabilities of these models in general, but also specifically for essays. Lastly, the owners of the model conducted their study in a few-shot prompt setting, where they gave the models a very structured template as well as an example of a human-written high-quality essay to guide the generation of the answers. This additional guidance of what the models generate could also have influenced the output.
The results published by the owners go beyond the AP courses, which are the assessments most directly comparable to our work, and also consider other student assessments like Graduate Record Examinations (GREs). However, these evaluations suffer from the same problems with scientific rigor as the AP evaluations.

Scientific assessment of ChatGPT

Researchers across the globe are currently assessing the individual capabilities of these models with greater scientific rigor. We note that due to the recency and speed of these developments, the literature discussed hereafter has mostly only been published as pre-prints and has not yet been peer-reviewed. In addition to the issues above, which relate concretely to the assessment of the capability to generate student essays, it is also worth noting that there are likely large problems with the trustworthiness of evaluations because of data contamination, i.e. benchmark tasks being part of the model's training data, which enables memorization. For example, Aiyappa et al. 29 find evidence that this is likely the case for benchmark results regarding NLP tasks. This complicates the efforts of researchers to assess the capabilities of the models beyond memorization.

Nevertheless, the first assessment results are already available – though mostly focused on ChatGPT-3 and not yet ChatGPT-4. Closest to our work is a study by Yeadon et al. 30 , who also investigate ChatGPT-3's performance at writing essays. They grade essays generated by ChatGPT-3 for five physics questions based on criteria that cover academic content, appreciation of the underlying physics, grasp of subject material, addressing the topic, and writing style. For each question, ten essays were generated and rated independently by five researchers. While the sample size precludes a statistical assessment, the results demonstrate that the AI model is capable of writing high-quality physics essays, but that the quality varies in a manner similar to human-written essays.

Guo et al. 14 create a set of free-text question answering tasks based on data they collected from the internet, e.g. question answering from Reddit. The authors then sample thirty triplets of a question, a human answer, and a ChatGPT-3 generated answer and ask human raters to assess whether they can detect which was written by a human and which by an AI. While this approach does not directly assess the quality of the output, it serves as a Turing test 31 designed to evaluate whether humans can distinguish between human- and AI-produced output. The results indicate that humans are in fact able to distinguish between the outputs when presented with a pair of answers. Humans familiar with ChatGPT are also able to identify over 80% of AI-generated answers without seeing a human answer for comparison. However, humans who are not yet familiar with ChatGPT-3 fail to identify AI-written answers about 50% of the time. Moreover, the authors also find that the AI-generated outputs are deemed more helpful than the human answers in slightly more than half of the cases. This suggests that the strong results OpenAI reported for the generation of free-text answers generalize beyond their own benchmarks.

There are, however, some indicators that the benchmarks may be overly optimistic in their assessment of the model's capabilities. For example, Kortemeyer 32 conducts a case study to assess how well ChatGPT-3 would perform in a physics class, simulating the tasks that students need to complete as part of the course: answer multiple-choice questions, do homework assignments, ask questions during a lesson, complete programming exercises, and write exams with free-text questions. Notably, ChatGPT-3 was allowed to interact with the instructor for many of the tasks, allowing for multiple attempts as well as feedback on preliminary solutions. The experiment shows that ChatGPT-3's performance is in many aspects similar to that of beginning learners and that the model makes similar mistakes, such as omitting units or simply plugging in results from equations. Overall, the AI would have passed the course with a low score of 1.5 out of 4.0. Similarly, Kung et al. 33 study the performance of ChatGPT-3 on the United States Medical Licensing Exam (USMLE) and find that the model performs at or near the passing threshold. Their assessment is a bit more optimistic than Kortemeyer's, as they state that this level of performance, together with comprehensible reasoning and valid clinical insights, suggests that models such as ChatGPT may potentially assist human learning in clinical decision making.

Frieder et al. 34 evaluate the capabilities of ChatGPT-3 in solving graduate-level mathematical tasks. They find that while ChatGPT-3 seems to have some mathematical understanding, its level is well below that of an average student and in most cases not sufficient to pass exams. Yuan et al. 35 consider the arithmetic abilities of language models, including ChatGPT-3 and ChatGPT-4. They find that these exhibit the best performance among currently available language models (incl. Llama 36 , FLAN-T5 37 , and Bloom 38 ). However, accuracy on basic arithmetic tasks is still only 83% when requiring correctness to a precision of \(10^{-3}\) , i.e. such models are still not capable of functioning reliably as calculators. In a slightly satirical, yet insightful take, Spencer et al. 39 assess what a scientific paper on gamma-ray astrophysics would look like if it were written largely with the assistance of ChatGPT-3. They find that while the language capabilities are good and the model is capable of generating equations, the arguments are often flawed and the references to the scientific literature are riddled with hallucinations.

The general reasoning skills of the models may also not be at the level expected from the benchmarks. For example, Cherian et al. 40 evaluate how well ChatGPT-3 performs on eleven puzzles that second graders should be able to solve and find that ChatGPT is only able to solve them on average in 36.4% of attempts, whereas the second graders achieve a mean of 60.4%. However, their sample size is very small and the problems were posed as multiple-choice questions, which cannot be directly compared to the natural language generation we consider.

Research gap

Within this article, we address an important part of the current research gap regarding the capabilities of ChatGPT (and similar technologies), guided by the following research questions:

RQ1: How good is ChatGPT based on GPT-3 and GPT-4 at writing argumentative student essays?

RQ2: How do AI-generated essays compare to essays written by students?

RQ3: What are linguistic devices that are characteristic of student versus AI-generated content?

We study these aspects with the help of a large group of teaching professionals who systematically assess a large corpus of essays. To the best of our knowledge, this is the first large-scale, independent scientific assessment of ChatGPT (or similar models) of this kind. Answering these questions is crucial to understanding the impact of ChatGPT on the future of education.

Materials and methods

The essay topics originate from a corpus of argumentative essays in the field of argument mining 41 . Argumentative essays require students to think critically about a topic and to use evidence to establish a position on it in a concise manner. The corpus features essays for 90 topics from Essay Forum 42 , an active community for writing feedback on different kinds of text that is frequented by high-school students seeking feedback from native speakers on their essay-writing capabilities. Information about the age of the writers is not available, but the topics suggest that the essays were written in grades 11–13, meaning the authors were likely at least 16 years old. Topics range from ‘Should students be taught to cooperate or to compete?’ to ‘Will newspapers become a thing of the past?’. In the corpus, each topic features one human-written essay that was uploaded and discussed in the forum. The students who wrote the essays are not native speakers. The average length of these essays is 19 sentences with 388 tokens (on average 2089 characters); we refer to them as ‘student essays’ in the remainder of the paper.

For the present study, we use the topics from Stab and Gurevych 41 and prompt ChatGPT with ‘Write an essay with about 200 words on “[ topic ]”’ to obtain automatically generated essays from the ChatGPT-3 and ChatGPT-4 versions of 22 March 2023 (‘ChatGPT-3 essays’, ‘ChatGPT-4 essays’). No additional prompts were used, i.e. the data was created with a basic prompt in a zero-shot scenario. This is in contrast to the benchmarks by OpenAI, who used an engineered prompt in a few-shot scenario to guide the generation of essays. We decided to ask for 200 words because we noticed a tendency of ChatGPT to generate essays longer than the desired length; a prompt asking for 300 words typically yielded essays with more than 400 words. By using the shorter length of 200 words, we prevent a potential advantage for ChatGPT through longer essays and instead err on the side of brevity. Similar to OpenAI's evaluations of free-text answers, we did not consider multiple configurations of the model due to the effort required to obtain human judgments. For the same reason, our data is restricted to ChatGPT and does not include other models available at that time, e.g. Alpaca. We use the browser versions of the tools because we consider this a more realistic scenario than using the API. Table 1 below shows the core statistics of the resulting dataset. Supplemental material S1 shows examples of essays from the dataset.
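For illustration, the zero-shot prompt described above can be constructed programmatically. This is a minimal sketch: the study itself used the browser interface rather than scripted calls, and the two topics listed here are merely examples quoted from the corpus description.

```python
# Example topics quoted from the corpus description; the full corpus has 90.
topics = [
    "Should students be taught to cooperate or to compete?",
    "Will newspapers become a thing of the past?",
]

def build_prompt(topic: str) -> str:
    # The verbatim zero-shot template from the study, with the topic inserted.
    return f'Write an essay with about 200 words on "{topic}"'

prompts = [build_prompt(t) for t in topics]
```

Each prompt is then submitted once per model, with no follow-up instructions, yielding one essay per topic and model.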

Annotation study

Study participants.

The participants had registered for a two-hour online training entitled ‘ChatGPT – Challenges and Opportunities’ conducted by the authors of this paper as a means to provide teachers with some of the technological background of NLG systems in general and ChatGPT in particular. Only teachers permanently employed at secondary schools were allowed to register for this training. Focusing on these experts alone allows us to obtain meaningful results, as those participants have a wide range of experience in assessing students' writing. A total of 139 teachers registered for the training; 129 of them teach at grammar schools, and only 10 hold a position at other secondary schools. About half of the registered teachers (68 teachers) have been in service for many years and have successfully applied for promotion. For data protection reasons, we do not know the subject combinations of the registered teachers. We only know that a variety of subjects are represented, including languages (English, French and German), religion/ethics, and science. Supplemental material S5 provides some general information regarding German teacher qualifications.

The training began with an online lecture followed by a discussion phase. Teachers were given an overview of language models and basic information on how ChatGPT was developed. After about 45 minutes, the teachers received both a written and an oral explanation of the questionnaire at the core of our study (see Supplementary material S3 ) and were informed that they had 30 minutes to finish the study tasks. The explanation included information on how the data was obtained, why we collected the self-assessment, how we chose the criteria for rating the essays, the overall goal of our research, and a walk-through of the questionnaire. Participation in the questionnaire was voluntary and did not affect the awarding of a training certificate. We further informed participants that all data was collected anonymously and that we would have no way of identifying who participated in the questionnaire. We informed participants orally that, by participating in the survey, they consented to the use of the provided ratings for our research.

Once these instructions were provided orally and in writing, the link to the online form was given to the participants. The online form ran on a local server that did not log any information that could identify the participants (e.g. IP addresses) to ensure anonymity. As per the instructions, consent for participation was given by using the online form. Due to the full anonymity, we could, by definition, not document who exactly provided consent. This served as further assurance that non-participation could not possibly affect the awarding of the training certificate.

About 20% of the training participants did not take part in the questionnaire study, the remaining participants consented based on the information provided and participated in the rating of essays. After the questionnaire, we continued with an online lecture on the opportunities of using ChatGPT for teaching as well as AI beyond chatbots. The study protocol was reviewed and approved by the Research Ethics Committee of the University of Passau. We further confirm that our study protocol is in accordance with all relevant guidelines.

Questionnaire

The questionnaire consists of three parts: first, a brief self-assessment of the participants' English skills, based on the Common European Framework of Reference for Languages (CEFR) 43 . It has six levels ranging from ‘comparable to a native speaker’ to ‘some basic skills’ (see supplementary material S3 ). Then each participant was shown six essays. The participants were shown only the essay text and were not told whether it was human-written or AI-generated.

The questionnaire covers the seven categories relevant for essay assessment shown below (for details see supplementary material S3 ):

Topic and completeness

Logic and composition

Expressiveness and comprehensiveness

Language mastery

Complexity

Vocabulary and text linking

Language constructs

These categories are based on guidelines for essay assessment 44 established by the Ministry for Education of Lower Saxony, Germany. For each criterion, a seven-point Likert scale with scores from zero to six is defined, where zero is the worst score (e.g. no relation to the topic) and six is the best score (e.g. addressed the topic to a special degree). The questionnaire included a written description as guidance for the scoring.

After rating each essay, the participants were also asked to self-assess their confidence in the ratings. We used a five-point Likert scale based on the criteria for the self-assessment of peer-review scores from the Association for Computational Linguistics (ACL). Once a participant finished rating the six essays, they were shown a summary of their ratings, as well as the individual ratings for each of their essays and the information on how the essay was generated.

Computational linguistic analysis

In order to further explore and compare the quality of the essays written by students and by ChatGPT, we consider the following six linguistic characteristics: lexical diversity, sentence complexity, nominalization, presence of modals, epistemic markers, and discourse markers. These are motivated by previous work: Weiss et al. 25 observe correlations between measures of lexical, syntactic, and discourse complexity and the essay grades of German high-school examinations, while McNamara et al. 45 explore cohesion (indicated, among other things, by connectives), syntactic complexity, and lexical diversity in relation to essay scoring.

Lexical diversity

We measure vocabulary richness using the well-established measure of textual lexical diversity (MTLD) 46 , which is often used in the field of automated essay grading 25 , 45 , 47 . It takes into account the number of unique words, but unlike the best-known measure of lexical diversity, the type-token ratio (TTR), it is not as sensitive to differences in text length. In fact, Koizumi and In’nami 48 find it to be the least affected by differences in text length among several measures of lexical diversity. This is relevant to us due to the difference in average length between the human-written and ChatGPT-generated essays.
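The core of MTLD can be sketched in a few lines. The following is a minimal, single-pass illustration of the idea using the conventional TTR threshold of 0.72; the published measure averages a forward and a backward pass, and this sketch is not the exact implementation used in the study.

```python
def mtld_forward(tokens, threshold=0.72):
    """One-directional MTLD pass.

    Tokens are consumed left to right; whenever the type-token ratio (TTR)
    of the current segment drops to the threshold, one 'factor' is complete
    and the segment is reset. MTLD is the token count divided by the number
    of factors, so lexically diverse texts yield higher values.
    """
    factors = 0.0
    types, segment_len = set(), 0
    for tok in tokens:
        segment_len += 1
        types.add(tok.lower())
        if len(types) / segment_len <= threshold:
            factors += 1                    # a full factor is complete
            types, segment_len = set(), 0
    if segment_len > 0:                     # partial factor for the remainder
        ttr = len(types) / segment_len
        factors += (1 - ttr) / (1 - threshold)
    return len(tokens) / factors if factors > 0 else float(len(tokens))
```

For example, a text that keeps repeating the same word completes a factor every few tokens and scores low, while a text with no repeated words never completes a factor and scores as high as its length.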

Syntactic complexity

We use two measures to evaluate the syntactic complexity of the essays. One is based on the maximum depth of the sentence dependency tree, which is produced using the spaCy 3.4.2 dependency parser 49 (‘Syntactic complexity (depth)’). For the second measure, we adopt an approach similar in nature to that of Weiss et al. 25 , who use clause structure to evaluate syntactic complexity. In our case, we count the number of conjuncts, clausal modifiers of nouns, adverbial clause modifiers, clausal complements, clausal subjects, and parataxes (‘Syntactic complexity (clauses)’). Supplementary material S2 illustrates the difference in sentence complexity based on two examples from the data.
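The depth measure can be illustrated without running a parser. The sketch below computes the maximum depth over head indices of the kind a parser such as spaCy exposes via `token.head`; the example parse is hand-built and hypothetical, so the sketch runs without a parsing model.

```python
def max_dep_depth(heads):
    """Maximum depth of a dependency tree given head indices.

    heads[i] is the index of token i's head; the root points to itself.
    """
    def depth(i):
        d = 1
        while heads[i] != i:    # walk up the tree until we reach the root
            i = heads[i]
            d += 1
        return d
    return max(depth(i) for i in range(len(heads)))

# Hand-built parse of "She said that he left":
# 'said' is the root; 'She' and 'left' attach to 'said';
# 'that' and 'he' attach to 'left'.
heads = [1, 1, 4, 4, 1]
```

Here the deepest chain is ‘that’ → ‘left’ → ‘said’, giving a depth of 3; longer chains of embedded clauses increase the measure.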

Nominalization is a common feature of a more scientific style of writing 50 and is used as an additional measure for syntactic complexity. In order to explore this feature, we count occurrences of nouns with suffixes such as ‘-ion’, ‘-ment’, ‘-ance’ and a few others which are known to transform verbs into nouns.
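The suffix-based count can be sketched as follows. Note the assumptions: the text names ‘-ion’, ‘-ment’, and ‘-ance’ but only alludes to “a few others”, so the remaining suffixes below are illustrative, and the POS tags are supplied directly rather than coming from spaCy's tagger.

```python
# '-ion', '-ment', and '-ance' are named in the text; the remaining
# suffixes are illustrative additions.
NOMINALIZATION_SUFFIXES = ("ion", "ment", "ance", "ence", "ness", "ity")

def count_nominalizations(tagged_tokens):
    """Count nouns whose surface form ends in a nominalization suffix.

    tagged_tokens: iterable of (word, pos) pairs; in the study the POS
    tags would come from spaCy, here they are given directly.
    """
    return sum(
        1
        for word, pos in tagged_tokens
        if pos == "NOUN" and word.lower().endswith(NOMINALIZATION_SUFFIXES)
    )

sentence = [("The", "DET"), ("government", "NOUN"), ("announced", "VERB"),
            ("a", "DET"), ("reduction", "NOUN"), ("of", "ADP"),
            ("performance", "NOUN"), ("targets", "NOUN")]
```

For this sentence the count is 3 (‘government’, ‘reduction’, ‘performance’); ‘announced’ is excluded because it is tagged as a verb.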

Semantic properties

Both modals and epistemic markers signal the commitment of the writer to their statement. We identify modals using the POS-tagging module provided by spaCy as well as a list of epistemic expressions of modality, such as ‘definitely’ and ‘potentially’, also used in other approaches to identifying semantic properties 51 . For epistemic markers we adopt an empirically-driven approach and utilize the epistemic markers identified in a corpus of dialogical argumentation by Hautli-Janisz et al. 52 . We consider expressions such as ‘I think’, ‘it is believed’ and ‘in my opinion’ to be epistemic.

Discourse properties

Discourse markers can be used to measure the coherence quality of a text. This has been explored by Somasundaran et al. 53 who use discourse markers to evaluate the story-telling aspect of student writing while Nadeem et al. 54 incorporated them in their deep learning-based approach to automated essay scoring. In the present paper, we employ the PDTB list of discourse markers 55 which we adjust to exclude words that are often used for purposes other than indicating discourse relations, such as ‘like’, ‘for’, ‘in’ etc.
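Both the epistemic and the discourse marker features reduce to matching curated phrase lists against the text. The sketch below shows this shared mechanism; the two lists are small illustrative subsets (the study uses the full inventory from Hautli-Janisz et al. and the adjusted PDTB list), not the actual resources.

```python
import re

# Example phrases quoted in the text; illustrative subsets only.
EPISTEMIC_MARKERS = ["i think", "it is believed", "in my opinion"]
DISCOURSE_MARKERS = ["however", "therefore", "in conclusion"]

def count_markers(text, markers):
    """Count marker occurrences, matching whole phrases case-insensitively."""
    text = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", text))
        for marker in markers
    )

essay = "In my opinion, technology helps. However, I think it also distracts."
```

For this toy sentence the epistemic count is 2 (‘in my opinion’, ‘I think’) and the discourse count is 1 (‘however’); word-boundary matching prevents, e.g., ‘however’ from matching inside a longer word.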

Statistical methods

We use a within-subjects design for our study. Each participant was shown six randomly selected essays. Results were submitted to the survey system after each essay was completed, so that partial data would be retained if participants ran out of time and did not finish scoring all six essays. Cronbach's \(\alpha\) 56 allows us to determine the inter-rater reliability per rating criterion and data source (human, ChatGPT-3, ChatGPT-4), i.e. we can assess the reliability of our data not only overall, but also for each data source and rating criterion. We use two-sided Wilcoxon rank-sum tests 57 to confirm the significance of the differences between the data sources for each criterion, and the same tests to determine the significance of the differences in the linguistic characteristics. This results in three comparisons (human vs. ChatGPT-3, human vs. ChatGPT-4, ChatGPT-3 vs. ChatGPT-4) for each of the seven rating criteria and each of the seven linguistic characteristics, i.e. 42 tests. We use the Holm-Bonferroni method 58 to correct for multiple tests and achieve a family-wise error rate of 0.05. We report the effect size using Cohen's d 59 . While our data is not perfectly normal, it also does not have severe outliers, so we prefer the clear interpretation of Cohen's d over the slightly more appropriate, but less accessible, non-parametric effect size measures. We report point plots with estimates of the mean scores for each data source and criterion, incl. the 95% confidence interval of these mean values. The confidence intervals are estimated in a non-parametric manner based on bootstrap sampling. We further visualize the distribution for each criterion using violin plots to provide a visual indicator of the spread of the data (see Supplementary material S4 ).
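The test pipeline described above can be sketched with scipy and numpy. The rating data below is synthetic (random draws standing in for Likert scores); only the procedure mirrors the text, and the Holm-Bonferroni step is written out by hand for transparency.

```python
import numpy as np
from scipy.stats import ranksums

def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled = np.sqrt(
        ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
        / (len(a) + len(b) - 2)
    )
    return (a.mean() - b.mean()) / pooled

def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni correction: returns which hypotheses are rejected."""
    order = np.argsort(p_values)
    reject = [False] * len(p_values)
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (len(p_values) - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) p-values fail as well
    return reject

rng = np.random.default_rng(0)
human = rng.normal(3.9, 1.0, 200)   # synthetic stand-ins for the ratings
gpt = rng.normal(5.25, 1.0, 200)

stat, p = ranksums(gpt, human)      # two-sided Wilcoxon rank-sum test
d = cohens_d(gpt, human)
```

With 42 such p-values, `holm_bonferroni` would be applied once across all of them, keeping the family-wise error rate at 0.05.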

Further, we use the self-assessments of English skills and of confidence in the essay ratings as confounding variables. Through this, we determine whether ratings are affected by language skills or confidence instead of the actual quality of the essays. We control for their impact by measuring Pearson's correlation coefficient r 60 between the self-assessments and the ratings. We also determine whether the linguistic features are correlated with the ratings as expected. Sentence complexity (both tree depth and dependency clauses) as well as nominalization are indicators of the complexity of the language. Similarly, the use of discourse markers should signal a proper logical structure. Finally, a large lexical diversity should be correlated with the ratings for the vocabulary. As above, we measure Pearson's r . We use a two-sided test for the significance based on a \(\beta\) -distribution that models the expected correlations, as implemented by scipy 61 , and again apply the Holm-Bonferroni method to account for multiple tests. However, we note that it is likely that all—even tiny—correlations are significant given our amount of data. Consequently, our interpretation of these results focuses on the strength of the correlations.

Our statistical analysis of the data is implemented in Python. We use pandas 1.5.3 and numpy 1.24.2 for data processing, pingouin 0.5.3 for the calculation of Cronbach's \(\alpha\) , scipy 1.10.1 for the Wilcoxon rank-sum tests and Pearson's r , and seaborn 0.12.2 for the generation of plots, incl. the calculation of error bars that visualize the confidence intervals.
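In the study, pingouin computes Cronbach's \(\alpha\); the underlying formula is simple enough to sketch in plain numpy. The ratings below are made-up toy data, not the study's ratings.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a ratings matrix.

    scores: 2-D array with rows = rated essays and columns = raters
    (the 'items'). alpha = k/(k-1) * (1 - sum of per-rater variances /
    variance of the row sums), with k raters.
    """
    scores = np.asarray(scores, float)
    k = scores.shape[1]
    rater_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - rater_vars / total_var)

# Perfectly agreeing raters give alpha = 1; disagreement lowers it.
perfect = [[1, 1], [2, 2], [3, 3]]
noisy = [[1, 2], [2, 1], [3, 3], [4, 4]]
```

Values above 0.9 are conventionally read as excellent agreement, which is the threshold referenced in the results below.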

Out of the 111 teachers who completed the questionnaire, 108 rated all six essays, one rated five essays, one rated two essays, and one rated only a single essay. This results in 656 ratings for 270 essays (90 topics for each data source: human-written, ChatGPT-3-generated, ChatGPT-4-generated), with three ratings for 121 essays, two ratings for 144 essays, and one rating for five essays. The inter-rater agreement is consistently excellent ( \(\alpha >0.9\) ), with the exception of language mastery, where the agreement is good ( \(\alpha =0.89\) , see Table 2 ). Further, the correlation analysis depicted in supplementary material S4 shows weak positive correlations ( \(r \in [0.11, 0.28]\) ) between the self-assessments (for English skills and for confidence in the ratings) and the actual ratings. Overall, this indicates that our ratings are reliable estimates of the actual quality of the essays, with a potential small tendency that higher confidence and better language skills yield higher ratings, independent of the data source.

Table  2 and supplementary material S4 characterize the distribution of the ratings for the essays, grouped by the data source. We observe that for all criteria, we have a clear order of the mean values, with students having the worst ratings, ChatGPT-3 in the middle rank, and ChatGPT-4 with the best performance. We further observe that the standard deviations are fairly consistent and slightly larger than one, i.e. the spread is similar for all ratings and essays. This is further supported by the visual analysis of the violin plots.

The statistical analysis of the ratings reported in Table  4 shows that differences between the human-written essays and the ones generated by both ChatGPT models are significant. The effect sizes for human versus ChatGPT-3 essays are between 0.52 and 1.15, i.e. a medium ( \(d \in [0.5,0.8)\) ) to large ( \(d \in [0.8, 1.2)\) ) effect. On the one hand, the smallest effects are observed for the expressiveness and complexity, i.e. when it comes to the overall comprehensiveness and complexity of the sentence structures, the differences between the humans and the ChatGPT-3 model are smallest. On the other hand, the difference in language mastery is larger than all other differences, which indicates that humans are more prone to making mistakes when writing than the NLG models. The magnitude of differences between humans and ChatGPT-4 is larger with effect sizes between 0.88 and 1.43, i.e., a large to very large ( \(d \in [1.2, 2)\) ) effect. Same as for ChatGPT-3, the differences are smallest for expressiveness and complexity and largest for language mastery. Please note that the difference in language mastery between humans and both GPT models does not mean that the humans have low scores for language mastery (M=3.90), but rather that the NLG models have exceptionally high scores (M=5.03 for ChatGPT-3, M=5.25 for ChatGPT-4).

When we consider the differences between the two GPT models, we observe that while ChatGPT-4 has consistently higher mean values for all criteria, only the differences for logic and composition, vocabulary and text linking, and complexity are significant. The effect sizes are between 0.45 and 0.5, i.e. small ( \(d \in [0.2, 0.5)\) ) and medium. Thus, while GPT-4 seems to be an improvement over GPT-3.5 in general, the only clear indicator of this is a better and clearer logical composition and more complex writing with a more diverse vocabulary.

We also observe significant differences in the distribution of linguistic characteristics between all three groups (see Table 3 ). Sentence complexity (depth) is the only category without a significant difference between humans and ChatGPT-3, as well as between ChatGPT-3 and ChatGPT-4. There is also no significant difference in the category of discourse markers between humans and ChatGPT-3. The magnitude of the effects varies widely, between 0.39 and 1.93, i.e. between small ( \(d \in [0.2, 0.5)\) ) and very large. However, in comparison to the ratings, there is no clear tendency regarding the direction of the differences. For instance, while the ChatGPT models write more complex sentences and use more nominalizations, humans tend to use more modals and epistemic markers instead. The lexical diversity of humans is higher than that of ChatGPT-3 but lower than that of ChatGPT-4. While there is no difference in the use of discourse markers between humans and ChatGPT-3, ChatGPT-4 uses significantly fewer discourse markers.

We detect the expected positive correlations between the complexity ratings and the linguistic markers for sentence complexity ( \(r=0.16\) for depth, \(r=0.19\) for clauses) and nominalizations ( \(r=0.22\) ). However, we observe a negative correlation between the logic ratings and the discourse markers ( \(r=-0.14\) ), which counters our intuition that more frequent use of discourse indicators makes a text more logically coherent. However, this is in line with previous work: McNamara et al. 45 also find no indication that the use of cohesion indices such as discourse connectives correlates with high- and low-proficiency essays. Finally, we observe the expected positive correlation between the ratings for the vocabulary and the lexical diversity ( \(r=0.12\) ). All observed correlations are significant. However, we note that the strength of all these correlations is weak and that the significance itself should not be over-interpreted due to the large sample size.

Our results provide clear answers to the first two research questions that consider the quality of the generated essays: ChatGPT performs well at writing argumentative student essays and outperforms the quality of the human-written essays significantly. The ChatGPT-4 model has (at least) a large effect and is on average about one point better than humans on a seven-point Likert scale.

Regarding the third research question, we find that there are significant linguistic differences between human- and AI-generated content. The AI-generated essays are highly structured, which is reflected, for instance, by the identical beginnings of the concluding sections of all ChatGPT essays (‘In conclusion, [...]’). The initial sentences of each essay are also very similar, starting with a general statement using the main concepts of the essay topic. Although this corresponds to the general structure that is sought after for argumentative essays, it is striking to see that the ChatGPT models are so rigid in realizing this, whereas the human-written essays are looser in representing the guideline on the linguistic surface. Moreover, the linguistic fingerprint has the counter-intuitive property that the use of discourse markers is negatively correlated with logical coherence. We believe that this might be due to the rigid structure of the generated essays: instead of using discourse markers, the AI models provide a clear logical structure by separating the different arguments into paragraphs, thereby reducing the need for discourse markers.

Our data also shows that hallucinations are not a problem in the setting of argumentative essay writing: the essay topics are not really about factual correctness, but rather about argumentation and critical reflection on general concepts which seem to be contained within the knowledge of the AI model. The stochastic nature of the language generation is well-suited for this kind of task, as different plausible arguments can be seen as a sampling from all available arguments for a topic. Nevertheless, we need to perform a more systematic study of the argumentative structures in order to better understand the difference in argumentation between human-written and ChatGPT-generated essay content. Moreover, we also cannot rule out that subtle hallucinations may have been overlooked during the ratings. There are also essays with a low rating for the criteria related to factual correctness, indicating that there might be cases where the AI models still have problems, even if they are, on average, better than the students.

One of the issues with evaluations of recent large language models is that they often do not account for the impact of tainted data when benchmarking such models. While it is certainly possible that the essays sourced from the internet by Stab and Gurevych 41 were part of the training data of the GPT models, the proprietary nature of the model training means that we cannot confirm this. However, we note that the generated essays did not resemble the corpus of human essays at all. Moreover, the topics of the essays are general in the sense that any human should be able to reason and write about them just by understanding concepts like ‘cooperation’. Consequently, a taint on these general topics, i.e. the fact that they might be present in the training data, is not only possible but actually expected and unproblematic, as it relates to the capability of the models to learn about concepts rather than the memorization of specific task solutions.

While we did everything we could to ensure a sound construct and high validity of our study, there are still certain issues that may affect our conclusions. Most importantly, neither the writers of the essays nor their raters were native English speakers. However, the students purposefully used a forum for English writing frequented by native speakers to ensure the language and content quality of their essays. This indicates that the resulting essays are likely above average for non-native speakers, as they went through at least one round of revisions with the help of native speakers. The teachers were informed that part of the training would be in English to prevent registrations from people without English language skills. Moreover, the self-assessment of language skills was only weakly correlated with the ratings, indicating that the threat to the soundness of our results is low. While we cannot definitively rule out that our results would not be reproducible with other human raters, the high inter-rater agreement indicates that this is unlikely.

However, our reliance on essays written by non-native speakers affects the external validity and generalizability of our results. It is certainly possible that native-speaking students would perform better on the criteria related to language skills, though it is unclear by how much. However, the language skills were particular strengths of the AI models, meaning that while the gap might be smaller, it is still reasonable to conclude that the AI models would perform at least comparably to native speakers, and possibly still better, just with a smaller margin. While we cannot rule out a difference for the content-related criteria, we also see no strong argument why native speakers should have better arguments than non-native speakers. Thus, while our results might not fully translate to native speakers, we see no reason why the content-related aspects should differ. Further, our results were obtained from high-school-level essays. Native and non-native speakers with higher education degrees, or experts in their fields, would likely achieve better performance, such that the gap between the AI models and humans would likely also be smaller in such a setting.

We further note that the essay topics may not be an unbiased sample. While Stab and Gurevych 41 randomly sampled the essays from the writing feedback section of an essay forum, it is unclear whether the essays posted there are representative of the general population of essay topics. Nevertheless, we believe this threat is fairly low because our results are consistent and do not seem to be influenced by particular topics. Further, we cannot conclude with certainty how our results generalize beyond ChatGPT-3 and ChatGPT-4 to similar models like Bard ( https://bard.google.com/?hl=en ), Alpaca, and Dolly. The results for linguistic characteristics are especially hard to predict. However, to the best of our knowledge, and given the proprietary nature of some of these models, the general approach behind these models is similar, so the trends for essay quality should hold for models of comparable size and training procedures.

Finally, we note that the current pace of progress in generative AI is extremely fast, and we are studying moving targets: ChatGPT 3.5 and 4 today are already not the same models we studied. Due to a lack of transparency regarding the specific incremental changes, we cannot know or predict how this might affect our results.

Our results provide a strong indication that the fear many teaching professionals have is warranted: the way students do homework and teachers assess it needs to change in a world of generative AI models. For non-native speakers, our results show that when students want to maximize their essay grades, they could easily do so by relying on results from AI models like ChatGPT. The very strong performance of the AI models indicates that this might also be the case for native speakers, though the difference in language skills is probably smaller. However, this is not and cannot be the goal of education. Consequently, educators need to change how they approach homework. Instead of just assigning and grading essays, we need to reflect more on the output of AI tools regarding their reasoning and correctness. AI models need to be seen as an integral part of education, but one which requires careful reflection and training of critical thinking skills.

Furthermore, teachers need to adapt strategies for teaching writing skills: as with the use of calculators, it is necessary to critically reflect with the students on when and how to use those tools. For instance, constructivists 62 argue that learning is enhanced by the active design and creation of unique artifacts by students themselves. In the present case this means that, in the long term, educational objectives may need to be adjusted. This is analogous to teaching good arithmetic skills to younger students and then allowing and encouraging students to use calculators freely in later stages of education. Similarly, once a sound level of literacy has been achieved, strongly integrating AI models in lesson plans may no longer run counter to reasonable learning goals.

In terms of shedding light on the quality and structure of AI-generated essays, this paper makes an important contribution by offering an independent, large-scale and statistically sound account of essay quality, comparing human-written and AI-generated texts. By comparing different versions of ChatGPT, we also offer a glance into the development of these models over time in terms of their linguistic properties and the quality they exhibit. Our results show that while the language generated by ChatGPT is considered very good by humans, there are also notable structural differences, e.g. in the use of discourse markers. This demonstrates that an in-depth consideration is required not only of the capabilities of generative AI models (i.e. which tasks they can be used for), but also of the language they generate. For example, if we read many AI-generated texts that use fewer discourse markers, it raises the question of whether and how this would affect our own use of discourse markers. Understanding how AI-generated texts differ from human-written ones enables us to look for these differences, to reason about their potential impact, and to study and possibly mitigate this impact.
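Looking for such differences, e.g. in discourse marker usage, can be automated as a simple rate per 1000 tokens. A minimal sketch (the marker list here is a small illustrative subset, not the PDTB-based inventory used in the study, and whitespace tokenization is a simplification):

```python
# Illustrative subset of discourse markers; the study's inventory is larger.
MARKERS = {"however", "therefore", "moreover", "furthermore",
           "consequently", "nevertheless", "thus", "hence"}

def discourse_marker_rate(text: str) -> float:
    """Discourse markers per 1000 tokens (naive whitespace tokenization)."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    if not tokens:
        return 0.0
    hits = sum(t in MARKERS for t in tokens)
    return 1000 * hits / len(tokens)
```

Comparing this rate between a corpus of human essays and a corpus of generated essays would surface the kind of structural difference described above.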

Data availability

The datasets generated during and/or analysed during the current study are available in the Zenodo repository, https://doi.org/10.5281/zenodo.8343644

Code availability

All materials are available online in the form of a replication package that contains the data and the analysis code, https://doi.org/10.5281/zenodo.8343644 .

Ouyang, L. et al. Training language models to follow instructions with human feedback (2022). arXiv:2203.02155 .

Ruby, D. 30+ detailed ChatGPT statistics - users & facts (Sep 2023). https://www.demandsage.com/chatgpt-statistics/ (2023). Accessed 09 June 2023.

Leahy, S. & Mishra, P. TPACK and the Cambrian explosion of AI. In Society for Information Technology & Teacher Education International Conference , (ed. Langran, E.) 2465–2469 (Association for the Advancement of Computing in Education (AACE), 2023).

Ortiz, S. Need an AI essay writer? Here's how ChatGPT (and other chatbots) can help. https://www.zdnet.com/article/how-to-use-chatgpt-to-write-an-essay/ (2023). Accessed 09 June 2023.

OpenAI chat interface. https://chat.openai.com/ . Accessed 09 June 2023.

OpenAI. Gpt-4 technical report (2023). arXiv:2303.08774 .

Brown, T. B. et al. Language models are few-shot learners (2020). arXiv:2005.14165 .

Wang, B. Mesh-Transformer-JAX: Model-Parallel Implementation of Transformer Language Model with JAX. https://github.com/kingoflolz/mesh-transformer-jax (2021).

Wei, J. et al. Finetuned language models are zero-shot learners. In International Conference on Learning Representations (2022).

Taori, R. et al. Stanford alpaca: An instruction-following llama model. https://github.com/tatsu-lab/stanford_alpaca (2023).

Cai, Z. G., Haslett, D. A., Duan, X., Wang, S. & Pickering, M. J. Does chatgpt resemble humans in language use? (2023). arXiv:2303.08014 .

Mahowald, K. A discerning several thousand judgments: Gpt-3 rates the article + adjective + numeral + noun construction (2023). arXiv:2301.12564 .

Dentella, V., Murphy, E., Marcus, G. & Leivada, E. Testing ai performance on less frequent aspects of language reveals insensitivity to underlying meaning (2023). arXiv:2302.12313 .

Guo, B. et al. How close is chatgpt to human experts? comparison corpus, evaluation, and detection (2023). arXiv:2301.07597 .

Zhao, W. et al. Is chatgpt equipped with emotional dialogue capabilities? (2023). arXiv:2304.09582 .

Keim, D. A. & Oelke, D. Literature fingerprinting : A new method for visual literary analysis. In 2007 IEEE Symposium on Visual Analytics Science and Technology , 115–122, https://doi.org/10.1109/VAST.2007.4389004 (IEEE, 2007).

El-Assady, M. et al. Interactive visual analysis of transcribed multi-party discourse. In Proceedings of ACL 2017, System Demonstrations , 49–54 (Association for Computational Linguistics, Vancouver, Canada, 2017).

El-Assady, M., Hautli-Janisz, A. & Butt, M. Discourse maps - feature encoding for the analysis of verbatim conversation transcripts. In Visual Analytics for Linguistics, CSLI Lecture Notes, Number 220, 115–147 (Stanford: CSLI Publications, 2020).

Foulis, M., Visser, J. & Reed, C. Dialogical fingerprinting of debaters. In Proceedings of COMMA 2020, 465–466, https://doi.org/10.3233/FAIA200536 (Amsterdam: IOS Press, 2020).

Foulis, M., Visser, J. & Reed, C. Interactive visualisation of debater identification and characteristics. In Proceedings of the COMMA workshop on Argument Visualisation, COMMA, 1–7 (2020).

Chatzipanagiotidis, S., Giagkou, M. & Meurers, D. Broad linguistic complexity analysis for Greek readability classification. In Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications , 48–58 (Association for Computational Linguistics, Online, 2021).

Ajili, M., Bonastre, J.-F., Kahn, J., Rossato, S. & Bernard, G. FABIOLE, a speech database for forensic speaker comparison. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16) , 726–733 (European Language Resources Association (ELRA), Portorož, Slovenia, 2016).

Deutsch, T., Jasbi, M. & Shieber, S. Linguistic features for readability assessment. In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications , 1–17, https://doi.org/10.18653/v1/2020.bea-1.1 (Association for Computational Linguistics, Seattle, WA, USA → Online, 2020).

Fiacco, J., Jiang, S., Adamson, D. & Rosé, C. Toward automatic discourse parsing of student writing motivated by neural interpretation. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022) , 204–215, https://doi.org/10.18653/v1/2022.bea-1.25 (Association for Computational Linguistics, Seattle, Washington, 2022).

Weiss, Z., Riemenschneider, A., Schröter, P. & Meurers, D. Computationally modeling the impact of task-appropriate language complexity and accuracy on human grading of German essays. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications , 30–45, https://doi.org/10.18653/v1/W19-4404 (Association for Computational Linguistics, Florence, Italy, 2019).

Yang, F., Dragut, E. & Mukherjee, A. Predicting personal opinion on future events with fingerprints. In Proceedings of the 28th International Conference on Computational Linguistics , 1802–1807, https://doi.org/10.18653/v1/2020.coling-main.162 (International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020).

Tumarada, K. et al. Opinion prediction with user fingerprinting. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021) , 1423–1431 (INCOMA Ltd., Held Online, 2021).

Rocca, R. & Yarkoni, T. Language as a fingerprint: Self-supervised learning of user encodings using transformers. In Findings of the Association for Computational Linguistics: EMNLP . 1701–1714 (Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022).

Aiyappa, R., An, J., Kwak, H. & Ahn, Y.-Y. Can we trust the evaluation on chatgpt? (2023). arXiv:2303.12767 .

Yeadon, W., Inyang, O.-O., Mizouri, A., Peach, A. & Testrow, C. The death of the short-form physics essay in the coming ai revolution (2022). arXiv:2212.11661 .

Turing, A. M. Computing machinery and intelligence. Mind LIX, 433–460, https://doi.org/10.1093/mind/LIX.236.433 (1950).

Kortemeyer, G. Could an artificial-intelligence agent pass an introductory physics course? (2023). arXiv:2301.12127 .

Kung, T. H. et al. Performance of chatgpt on usmle: Potential for ai-assisted medical education using large language models. PLOS Digital Health 2 , 1–12. https://doi.org/10.1371/journal.pdig.0000198 (2023).


Frieder, S. et al. Mathematical capabilities of chatgpt (2023). arXiv:2301.13867 .

Yuan, Z., Yuan, H., Tan, C., Wang, W. & Huang, S. How well do large language models perform in arithmetic tasks? (2023). arXiv:2304.02015 .

Touvron, H. et al. Llama: Open and efficient foundation language models (2023). arXiv:2302.13971 .

Chung, H. W. et al. Scaling instruction-finetuned language models (2022). arXiv:2210.11416 .

Workshop, B. et al. Bloom: A 176b-parameter open-access multilingual language model (2023). arXiv:2211.05100 .

Spencer, S. T., Joshi, V. & Mitchell, A. M. W. Can ai put gamma-ray astrophysicists out of a job? (2023). arXiv:2303.17853 .

Cherian, A., Peng, K.-C., Lohit, S., Smith, K. & Tenenbaum, J. B. Are deep neural networks smarter than second graders? (2023). arXiv:2212.09993 .

Stab, C. & Gurevych, I. Annotating argument components and relations in persuasive essays. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers , 1501–1510 (Dublin City University and Association for Computational Linguistics, Dublin, Ireland, 2014).

Essay forum. https://essayforum.com/ . Last-accessed: 2023-09-07.

Common European Framework of Reference for Languages (CEFR). https://www.coe.int/en/web/common-european-framework-reference-languages . Accessed 09 July 2023.

KMK guidelines for essay assessment. http://www.kmk-format.de/material/Fremdsprachen/5-3-2_Bewertungsskalen_Schreiben.pdf . Accessed 09 July 2023.

McNamara, D. S., Crossley, S. A. & McCarthy, P. M. Linguistic features of writing quality. Writ. Commun. 27 , 57–86 (2010).

McCarthy, P. M. & Jarvis, S. Mtld, vocd-d, and hd-d: A validation study of sophisticated approaches to lexical diversity assessment. Behav. Res. Methods 42 , 381–392 (2010).


Dasgupta, T., Naskar, A., Dey, L. & Saha, R. Augmenting textual qualitative features in deep convolution recurrent neural network for automatic essay scoring. In Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications , 93–102 (2018).

Koizumi, R. & In’nami, Y. Effects of text length on lexical diversity measures: Using short texts with less than 200 tokens. System 40 , 554–564 (2012).

spacy industrial-strength natural language processing in python. https://spacy.io/ .

Siskou, W., Friedrich, L., Eckhard, S., Espinoza, I. & Hautli-Janisz, A. Measuring plain language in public service encounters. In Proceedings of the 2nd Workshop on Computational Linguistics for Political Text Analysis (CPSS-2022) (Potsdam, Germany, 2022).

El-Assady, M. & Hautli-Janisz, A. Discourse Maps - Feature Encoding for the Analysis of Verbatim Conversation Transcripts. CSLI Lecture Notes (CSLI Publications, Center for the Study of Language and Information, 2019).

Hautli-Janisz, A. et al. QT30: A corpus of argument and conflict in broadcast debate. In Proceedings of the Thirteenth Language Resources and Evaluation Conference , 3291–3300 (European Language Resources Association, Marseille, France, 2022).

Somasundaran, S. et al. Towards evaluating narrative quality in student writing. Trans. Assoc. Comput. Linguist. 6 , 91–106 (2018).

Nadeem, F., Nguyen, H., Liu, Y. & Ostendorf, M. Automated essay scoring with discourse-aware neural models. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications , 484–493, https://doi.org/10.18653/v1/W19-4450 (Association for Computational Linguistics, Florence, Italy, 2019).

Prasad, R. et al. The Penn Discourse TreeBank 2.0. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08) (European Language Resources Association (ELRA), Marrakech, Morocco, 2008).

Cronbach, L. J. Coefficient alpha and the internal structure of tests. Psychometrika 16 , 297–334. https://doi.org/10.1007/bf02310555 (1951).


Wilcoxon, F. Individual comparisons by ranking methods. Biom. Bull. 1 , 80–83 (1945).

Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6 , 65–70 (1979).


Cohen, J. Statistical power analysis for the behavioral sciences (Academic press, 2013).

Freedman, D., Pisani, R. & Purves, R. Statistics (international student edition). Pisani, R. Purves, 4th edn. WW Norton & Company, New York (2007).

Scipy documentation. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html . Accessed 09 June 2023.

Windschitl, M. Framing constructivism in practice as the negotiation of dilemmas: An analysis of the conceptual, pedagogical, cultural, and political challenges facing teachers. Rev. Educ. Res. 72 , 131–175 (2002).


Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and affiliations.

Faculty of Computer Science and Mathematics, University of Passau, Passau, Germany

Steffen Herbold, Annette Hautli-Janisz, Ute Heuer, Zlata Kikteva & Alexander Trautsch


Contributions

S.H., A.HJ., and U.H. conceived the experiment; S.H., A.HJ, and Z.K. collected the essays from ChatGPT; U.H. recruited the study participants; S.H., A.HJ., U.H. and A.T. conducted the training session and questionnaire; all authors contributed to the analysis of the results, the writing of the manuscript, and review of the manuscript.

Corresponding author

Correspondence to Steffen Herbold .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1. Supplementary Information 2. Supplementary Information 3. Supplementary Tables. Supplementary Figures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Herbold, S., Hautli-Janisz, A., Heuer, U. et al. A large-scale comparison of human-written versus ChatGPT-generated essays. Sci Rep 13 , 18617 (2023). https://doi.org/10.1038/s41598-023-45644-9


Received : 01 June 2023

Accepted : 22 October 2023

Published : 30 October 2023

DOI : https://doi.org/10.1038/s41598-023-45644-9


This article is cited by

  • Kraidia, I., Ghenai, A. & Belhaouari, S. B. Defense against adversarial attacks: robust and efficient compressed optimized neural networks. Scientific Reports (2024)
  • Garcia Valencia, O. A., Thongprayoon, C. & Cheungpasitporn, W. AI-driven translations for kidney transplant equity in Hispanic populations.
  • Cantens, T. How will the state think with ChatGPT? The challenges of generative artificial intelligence for public administrations. AI & SOCIETY (2024)



How to Get ChatGPT to Write an Essay: Prompts, Outlines, & More

Last Updated: March 31, 2024 Fact Checked

Getting ChatGPT to Write the Essay

Using AI to Help You Write

Expert Interview

This article was written by Bryce Warwick, JD and by wikiHow staff writer, Nicole Levine, MFA . Bryce Warwick is currently the President of Warwick Strategies, an organization based in the San Francisco Bay Area offering premium, personalized private tutoring for the GMAT, LSAT and GRE. Bryce has a JD from the George Washington University Law School. This article has been fact-checked, ensuring the accuracy of any cited facts and confirming the authority of its sources. This article has been viewed 45,453 times.

Are you curious about using ChatGPT to write an essay? While most instructors have tools that make it easy to detect AI-written essays, there are ways you can use OpenAI's ChatGPT to write papers without worrying about plagiarism or getting caught. In addition to writing essays for you, ChatGPT can also help you come up with topics, write outlines, find sources, check your grammar, and even format your citations. This wikiHow article will teach you the best ways to use ChatGPT to write essays, including helpful example prompts that will generate impressive papers.

Things You Should Know

  • To have ChatGPT write an essay, tell it your topic, word count, type of essay, and facts or viewpoints to include.
  • ChatGPT is also useful for generating essay topics, writing outlines, and checking grammar.
  • Because ChatGPT can make mistakes and trigger AI-detection alarms, it's better to use AI to assist with writing than have it do the writing.

Step 1 Create an account with ChatGPT.

  • Before using the OpenAI's ChatGPT to write your essay, make sure you understand your instructor's policies on AI tools. Using ChatGPT may be against the rules, and it's easy for instructors to detect AI-written essays.
  • While you can use ChatGPT to write a polished-looking essay, there are drawbacks. Most importantly, ChatGPT cannot verify facts or provide references. This means that essays created by ChatGPT may contain made-up facts and biased content. [1] It's best to use ChatGPT for inspiration and examples instead of having it write the essay for you.

Step 2 Gather your notes.

  • The topic you want to write about.
  • Essay length, such as word or page count. Whether you're writing an essay for a class, college application, or even a cover letter, you'll want to tell ChatGPT how much to write.
  • Other assignment details, such as type of essay (e.g., personal, book report, etc.) and points to mention.
  • If you're writing an argumentative or persuasive essay, know the stance you want to take so ChatGPT can argue your point.
  • If you have notes on the topic that you want to include, you can also provide those to ChatGPT.
  • When you plan an essay, think of a thesis, a topic sentence, a body paragraph, and the examples you expect to present in each paragraph.
  • It can be like an outline and not an extensive sentence-by-sentence structure. It should be a good overview of how the points relate.

Step 3 Ask ChatGPT to write the essay.

  • "Write a 2000-word college essay that covers different approaches to gun violence prevention in the United States. Include facts about gun laws and give ideas on how to improve them."
  • "Write a 4-page college application essay about an obstacle I have overcome. I am applying to the Geography program and want to be a cartographer. The obstacle is that I have dyslexia. Explain that I have always loved maps, and that having dyslexia makes me better at making them."
  • This second prompt not only tells ChatGPT the topic, length, and essay type, but also that the essay is personal, so ChatGPT will write it in the first-person point of view.
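If you prefer scripting over the chat interface, the same kind of prompt can be assembled programmatically and sent through the OpenAI Python SDK. A sketch (the `build_essay_prompt` helper and the model name are illustrative assumptions; you need your own API key):

```python
def build_essay_prompt(topic: str, words: int, essay_type: str,
                       extra: str = "") -> str:
    """Assemble an essay prompt like the examples above (hypothetical helper)."""
    prompt = f"Write a {words}-word {essay_type} essay about {topic}."
    if extra:
        prompt += f" {extra}"
    return prompt

# Sending the prompt requires the openai package and an API key, e.g.:
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# reply = client.chat.completions.create(
#     model="gpt-4o",  # illustrative model choice
#     messages=[{"role": "user", "content": build_essay_prompt(
#         "gun violence prevention in the United States", 2000, "college",
#         "Include facts about gun laws and give ideas on how to improve them.")}],
# )
# print(reply.choices[0].message.content)
```

Keeping the prompt construction in one place makes it easy to tweak length, type, and extra instructions between drafts.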

Step 4 Add to or change the essay.

  • In our essay about gun control, ChatGPT did not mention school shootings. If we want to discuss this topic in the essay, we can use the prompt, "Discuss school shootings in the essay."
  • Let's say we review our college entrance essay and realize that we forgot to mention that we grew up without parents. Add to the essay by saying, "Mention that my parents died when I was young."
  • In the Israel-Palestine essay, ChatGPT explored two options for peace: A 2-state solution and a bi-state solution. If you'd rather the essay focus on a single option, ask ChatGPT to remove one. For example, "Change my essay so that it focuses on a bi-state solution."

Step 5 Ask for sources.

Using AI to Help You Write

Step 1 Brainstorm essay topics.

  • "Give me ideas for an essay about the Israel-Palestine conflict."
  • "Ideas for a persuasive essay about a current event."
  • "Give me a list of argumentative essay topics about COVID-19 for a Political Science 101 class."

Step 2 Create an outline.

  • "Create an outline for an argumentative essay called 'The Impact of COVID-19 on the Economy.'"
  • "Write an outline for an essay about positive uses of AI chatbots in schools."
  • "Create an outline for a short 2-page essay on disinformation in the 2016 election."

Step 3 Find sources.

  • "Find peer-reviewed sources for advances in using MRNA vaccines for cancer."
  • "Give me a list of sources from academic journals about Black feminism in the movie Black Panther."
  • "Give me sources for an essay on current efforts to ban children's books in US libraries."

Step 4 Create a sample essay.

  • "Write a 4-page college paper about how global warming is changing the automotive industry in the United States."
  • "Write a 750-word personal college entrance essay about how my experience with homelessness as a child has made me more resilient."
  • You can even refer to the outline you created with ChatGPT, as the AI bot can reference up to 3000 words from the current conversation. [3] For example: "Write a 1000 word argumentative essay called 'The Impact of COVID-19 on the United States Economy' using the outline you provided. Argue that the government should take more action to support businesses affected by the pandemic."

Step 5 Use ChatGPT to proofread and tighten grammar.

Step 6 Ask ChatGPT to format your citations.

  • One way to do this is to paste a list of the sources you've used, including URLs, book titles, authors, pages, publishers, and other details, into ChatGPT along with the instruction "Create an MLA Works Cited page for these sources."
  • You can also ask ChatGPT to provide a list of sources, and then build a Works Cited or References page that includes those sources. You can then replace sources you didn't use with the sources you did use.
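The same idea can be sketched without an AI at all, which is useful for sanity-checking whatever ChatGPT produces. A toy example (real MLA formatting has many more rules than this; the helper names are illustrative):

```python
def simple_citation(author: str, title: str, publisher: str, year: int) -> str:
    """Toy MLA-like book entry; real MLA has many more rules and entry types."""
    return f"{author}. {title}. {publisher}, {year}."

def works_cited(entries: list) -> str:
    """Alphabetize entries and render a minimal Works Cited block."""
    return "Works Cited\n" + "\n".join(sorted(entries))
```

Comparing such a deterministic rendering against ChatGPT's output helps catch entries the model invented or mis-ordered.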

Expert Q&A

  • Because it's easy for teachers, hiring managers, and college admissions offices to spot AI-written essays, it's best to use your ChatGPT-written essay as a guide to write your own essay. Using the structure and ideas from ChatGPT, write an essay in the same format, but using your own words.
  • Always double-check the facts in your essay, and make sure facts are backed up with legitimate sources.
  • If you see an error that says ChatGPT is at capacity, wait a few moments and try again.


  • Using ChatGPT to write or assist with your essay may be against your instructor's rules. Make sure you understand the consequences of using ChatGPT to write or assist with your essay.
  • ChatGPT-written essays may include factual inaccuracies, outdated information, and inadequate detail. [4]


Thanks for reading our article! If you’d like to learn more about completing school assignments, check out our in-depth interview with Bryce Warwick, JD .

  • ↑ https://help.openai.com/en/articles/6783457-what-is-chatgpt
  • ↑ https://platform.openai.com/examples/default-essay-outline
  • ↑ https://help.openai.com/en/articles/6787051-does-chatgpt-remember-what-happened-earlier-in-the-conversation
  • ↑ https://www.ipl.org/div/chatgpt/

About This Article

Bryce Warwick, JD



This AI chatbot is dominating social media with its frighteningly good essays

OpenAI last week opened up access to ChatGPT, an AI-powered chatbot that interacts with users in an eerily convincing and conversational way. (CNN / Adobe Stock)

Imagine if Siri could write you a college essay, or Alexa could spit out a movie review in the style of Shakespeare.

OpenAI last week opened up access to ChatGPT, an AI-powered chatbot that interacts with users in an eerily convincing and conversational way. Its ability to provide lengthy, thoughtful and thorough responses to questions and prompts, even if inaccurate, has stunned users, including academics and some in the tech industry.

The tool quickly went viral. On Monday, OpenAI's co-founder Sam Altman, a prominent Silicon Valley investor, said on Twitter that ChatGPT crossed one million users. It also captured the attention of some prominent tech leaders, such as Box CEO Aaron Levie.

"There's a certain feeling that happens when a new technology adjusts your thinking about computing. Google did it. Firefox did it. AWS did it. iPhone did it. OpenAI is doing it with ChatGPT," Levie said on Twitter.

But as with other AI-powered tools, it also poses possible concerns, including for how it could disrupt creative industries, perpetuate biases and spread misinformation.

WHAT IS CHATGPT?

ChatGPT is a large language model trained on a massive trove of information online to create its responses. It comes from the same company behind DALL-E, which generates a seemingly limitless range of images in response to prompts from users. It's also the next iteration of text-generator GPT-3.

After signing up for ChatGPT, users can ask the AI system to field a range of questions, such as "Who was the president of the United States in 1955?", or ask it to summarize difficult concepts into something a second grader could understand. It'll even tackle open-ended questions, such as "What's the meaning of life?" or "What should I wear if it's 40 degrees out today?"

"It depends on what activities you plan to do. If you plan to be outside, you should wear a light jacket or sweater, long pants, and closed-toe shoes," ChatGPT responded. "If you plan to be inside, you can wear a t-shirt and jeans or other comfortable clothing."

But some users are getting very creative.

HOW PEOPLE ARE USING IT

One person asked the chatbot to rewrite the '90s hit song "Baby Got Back" in the style of "The Canterbury Tales"; another used it to write a letter requesting the removal of a bad account from a credit report (rather than hiring a credit repair lawyer). Other colorful examples include asking for fairy-tale-inspired home décor tips and giving it an AP English exam question (it responded with a five-paragraph essay about Wuthering Heights).

In a blog post last week, OpenAI said the "format makes it possible for the tool to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests."
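The conversational format OpenAI describes amounts to a running transcript of role-tagged messages: each follow-up question is considered alongside the earlier turns, which is what lets the model resolve references back to them. A minimal sketch of that structure (the helper here is illustrative, not OpenAI's actual API):

```python
# Illustrative sketch of a chat-style transcript: each turn is a role-tagged
# message, and a follow-up is answered with the full history in view.

def add_turn(history, role, content):
    """Append one message to the running transcript."""
    history.append({"role": role, "content": content})
    return history

history = []
add_turn(history, "user", "What should I wear if it's 40 degrees out today?")
add_turn(history, "assistant", "A light jacket, long pants, and closed-toe shoes.")
# This follow-up only makes sense because the earlier turns travel with it.
add_turn(history, "user", "And what about indoors?")
```

Because the whole list is re-sent each turn, the model can connect "indoors" to the weather question without the user restating it.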

As of Monday morning, the page to try ChatGPT was down, citing "exceptionally high demand." "Please hang tight as we work on scaling our systems," the message said. (It now appears to be back online).

POSSIBLE ISSUES

While ChatGPT successfully fielded a variety of questions submitted by CNN, some responses were noticeably off. In fact, Stack Overflow -- a Q&A platform for coders and programmers -- temporarily banned users from sharing information from ChatGPT, noting that it's "substantially harmful to the site and to users who are asking or looking for correct answers."

Beyond the issue of spreading incorrect information, the tool could also threaten some written professions, be used to explain problematic concepts and, like all AI tools, perpetuate biases based on the pool of data on which it's trained. A prompt involving a CEO, for example, could yield a response that assumes the individual is white and male.

"While we've made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behavior," Open AI said on its website. "We're using the Moderation API to warn or block certain types of unsafe content, but we expect it to have some false negatives and positives for now. We're eager to collect user feedback to aid our ongoing work to improve this system."

Still, Lian Jye Su, a research director at market research firm ABI Research, warns the chatbot is operating "without a contextual understanding of the language."

"It is very easy for the model to give plausible-sounding but incorrect or nonsensical answers," she said. "It guessed when it was supposed to clarify and sometimes responded to harmful instructions or exhibited biased behavior. It also lacks regional and country-specific understanding."

At the same time, however, it does provide a glimpse into how companies may be able to capitalize on developing more robust virtual assistance, as well as patient and customer care solutions.

While the DALL-E tool is free, it does limit the number of prompts a user can submit before having to pay. When Elon Musk, a co-founder of OpenAI, recently asked Altman on Twitter about the average cost per ChatGPT chat, Altman said: "We will have to monetize it somehow at some point; the compute costs are eye-watering."


Student-developed AI chatbot opens Yale philosopher's works to all


Nicolas Gertler and Luciano Floridi (Portraits by Mara Lavitt)

The public is often closed off from scholarly perspectives on the potential benefits of generative artificial intelligence (AI). Studies often reside behind pricey paywalls. And even if they are accessible, they are frequently written in esoteric language that non-academics struggle to parse.

Nicolas Gertler, a first-year student in Yale College, saw a potential solution to these obstacles, through generative AI’s own capabilities.

Gertler, a research assistant at Yale’s Digital Ethics Center (DEC) , has spearheaded an experiment using generative AI to make bodies of knowledge broadly accessible. With Rithvik “Ricky” Sabnekar, a high school junior and skilled developer from Texas, he created the Luciano Floridi Bot , also known as LuFlot, a free AI-powered online educational tool designed to foster engagement with the works of Yale philosopher and DEC Director Luciano Floridi, a pioneer in the philosophy of information and one of the most-cited living philosophers.

The developers believe it’s the first time a chatbot has been trained on an academic’s corpus of literature and released to the public for free.

“The idea was to democratize access to Professor Floridi’s work,” said Gertler, who undertook the project after discussing its possibilities with Floridi. “The issues he has written about touch everyone’s lives, and more and more people are becoming aware of AI. LuFlot provides an AI-driven platform for much broader engagement with the ethical questions surrounding this transformative technology.”

Meant to facilitate teaching and learning, the chatbot is trained on all the books that Floridi has published over his more than 30-year academic career. Within seconds of receiving a query, it provides users detailed and easily digestible answers drawn from this vast work.

It’s able to synthesize information from multiple sources, finding links between works that even Floridi might not have considered.

The tool features in-text citations, allowing users to trace the origins of the information provided directly to the original texts. If asked a question outside its knowledge base, the bot will politely respond that the query falls outside the scope of Floridi’s expertise. But relevant questions receive prompt and thorough answers. For example, a question asking how to use AI ethically quickly generated a clear, eight-point response with sources cited.
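LuFlot's implementation isn't public, but the behavior described above, retrieving the most relevant passage, citing its source, and declining out-of-scope questions, can be sketched with a bag-of-words toy. Everything below (the passages, sources, scoring rule and threshold) is invented for illustration:

```python
# Hypothetical sketch of a corpus-grounded bot with citations.
# Real systems use embeddings and an LLM; this toy uses word overlap.

def score(query, passage):
    """Fraction of the query's words that appear in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q)

# Stand-in corpus: source labels map to passages from the scholar's books.
corpus = {
    "Book A, p. 12": "the ethics of information shapes how we design ai",
    "Book B, p. 7": "privacy is a matter of informational friction",
}

def answer(query, corpus, threshold=0.3):
    best = max(corpus, key=lambda src: score(query, corpus[src]))
    if score(query, corpus[best]) < threshold:
        # Out-of-scope queries get a polite refusal, as the article describes.
        return "That falls outside the scope of this corpus."
    return f"{corpus[best]} [{best}]"  # answer with an in-text citation

print(answer("how should we design ai ethics", corpus))
```

A production system would replace the overlap score with semantic similarity, but the shape is the same: retrieve, check relevance, and keep the citation attached to the answer.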

The interface also allows users to ask follow-up questions.

“Anyone, regardless of their knowledge of AI, can visit the website, ask a question, and have a conversation with the founder of the philosophy of information,” Gertler said. “I think that’s remarkable.”

The tool warns users to evaluate its answers critically, as it may generate incorrect or biased information.

Floridi, professor in the practice in the Cognitive Science Program in Yale’s Faculty of Arts and Sciences, is impressed by the chatbot and its young developers.

“The center is focused on the impact of digital technology and the ethics of AI, so it makes a lot of sense to have a bot available to answer people’s questions,” Floridi said. “The bot is an amazing tool. Nicolas and Ricky deserve all the credit. I’m just the supporting band.”

He has also become a user. For background on a paper he is writing, Floridi asked the chatbot about asymmetry between good and evil.

“It gave me an amazing answer, accurately referencing concepts and ideas I had completely forgotten that I’d written about,” he said. “It can instantly draw interesting connections between something I published last year and something I published in 1991, which is incredible.”

Dynamic knowledge

Gertler, who is from Los Angeles, became interested in AI’s potential to benefit society about five years ago after watching YouTube videos on the topic.

“I love thinking about the societal and ethical implications of new technologies and how they can be leveraged to help people, especially those in underserved and marginalized communities,” he said.

He is involved with two youth-led organizations devoted to harnessing new technology for the greater good: He serves as AI and education advisor of Encode Justice, an organization of more than 1,000 high school and college students across the world that advocates for steering AI in directions that benefit society. He also is vice president of Fidutam, a civil-society group that mobilizes its more than 1,600 members to advocate for and build responsible technology.

At the DEC, Gertler explores how to use generative AI to create educational frameworks that can provide enriched learning experiences. He also serves as the first-ever “AI student ambassador” at Yale’s Poorvu Center for Teaching and Learning, engaging with faculty on integrating AI into instruction and developing resources for students and faculty on the ethical use of AI in pedagogy.

The LuFlot project came together fast. Earlier this year, Gertler showed Floridi a chatbot he had created in his cognitive science class, intended to help students learn the course material. After a few conversations, the idea for LuFlot was born.

Gertler and Floridi decided to build from scratch a custom chatbot with its own user interface, rather than rely on existing tools like ChatGPT. Gertler brought on board Sabnekar, who attends Plano East Senior High School in Plano, Texas, to help develop the bot. The two met through Fidutam, where Sabnekar leads the technical development of the organization’s projects. 

“Ricky is technically brilliant,” Gertler said.

The pair founded Mylon Education, a startup company seeking to transform the educational landscape by reconstructing the systems through which individuals generate and develop their ideas. LuFlot is the startup’s first project. 

“Our goal is finding ways to harmonize human agency and creativity with AI-enabled structural support,” Gertler said. “That involves incorporating AI tools into the writing process without inhibiting people’s agency. It’s not about using the chatbot to write essays. It’s about using this technology to deepen your knowledge and sharpen your creativity and critical thinking skills.”

Generative AI and other innovations are changing how people learn, Floridi said, so new approaches to teaching are needed.

LuFlot demonstrates that it is cheap, feasible, and efficient to train an AI chatbot on a scholar's corpus and have it produce high-quality answers to users' prompts, he said. Chatbots could be trained on other scholars' work or on an instructor's entire course content, which could help students learn and retain information.

“It’s much more useful than just making my lecture notes available online,” he said.

Floridi says he is reminded of Plato’s criticism of the invention of writing — that it is not a dynamic means of sharing knowledge because reading it always conveys the same answer.

“The bot is dynamic,” he said. “It will not give you the same answer if you ask the same question even slightly differently. And you can ask it to be more specific. And it will grow as you train it with more information.”

Visit this link to converse with LuFlot about the ethics of digital technologies.


A new tool helps teachers detect if AI wrote an assignment

Janet W. Lee

Several big school districts such as New York and Los Angeles have blocked access to a new chatbot that uses artificial intelligence to produce essays. One student has a new tool to help.

MICHEL MARTIN, HOST:

ChatGPT is a buzzy new AI technology that can write research papers or poems that come out sounding like a real person did the work. You can even train this bot to write the way you do. Some teachers are understandably concerned, but one graduate student has an idea of how to help. Janet Woojeong Lee, from NPR's Education Desk, has this report.

JANET WOOJEONG LEE, BYLINE: Teachers around the country don't know what to do. Since ChatGPT launched in November, many say they're worried this powerful technology could do their students' homework. Some school districts, including New York City and Los Angeles, have blocked access. But Edward Tian thinks that's the wrong way to go.

EDWARD TIAN: I'm not for these blanket bans on ChatGPT usage because that does really nothing. Students can get around it, just like you can use ChatGPT on your Wi-Fi at home.

LEE: Tian is a 22-year-old computer science student at Princeton University. Just a month after ChatGPT got teachers worried, he built a bot to help them. It's called GPTZero. You can copy and paste any text, and it'll analyze each sentence, each word and judge how likely it is that a real person or a fake person wrote it.

TIAN: And teachers can, you know, make their own decision of, like, wow, this essay is, like, 100% ChatGPT-written, or this essay is, like, uses ChatGPT where it really made sense to help influence thought. That works. Teachers can make their own informed decisions.

LEE: Tian says having a handle on what is and isn't written by AI, down to the percentage of an essay, could help teachers who are intimidated by this new technology feel more in charge. There are other AI detection tools out there, too. Tian wrote his as a winter break passion project. He shared it on Twitter and was surprised to hear quickly from many teachers and even college officials who wanted to learn more.

TIAN: My own high school principal reached out. My own high school English teacher, Ms. Studka, reached out, and admissions officers have reached out saying they're interested.

LEE: Tian is now building a community of educators and students who want to figure out what to do with AI in the classroom. He believes instead of cheating, AI might be able to help teach and learn responsibly.

TIAN: Responsibly means somewhere in the middle. It can't be, like, students don't write any homework and don't do any homework anymore. But it also can't be, like, OK, we completely can't use these new technologies and are just ignoring them. So it has to be somewhere in the middle.

LEE: Students should learn how to use AI to their benefit, Tian says, because the technology is here to stay.

Janet Woojeong Lee, NPR News.


Edward Tian claims his GPTZero app can ‘quickly and efficiently’ detect whether an essay has been written by an AI bot.

College student claims app can detect essays written by chatbot ChatGPT

Princeton senior Edward Tian says GPTZero can root out text composed by the controversial AI bot, but users cite mixed results


A 22-year-old college student has developed an app which he claims can detect whether text is written by ChatGPT, the explosive chatbot raising fears of plagiarism in academia.

Edward Tian, a senior at Princeton University, developed GPTZero over his winter break. It had 30,000 hits within a week of its launch.

Tian said his motivation was to address the use of artificial intelligence to cheat in exams with quick, credible academic writing that evades anti-plagiarism software.

His initial tweet, which claimed the app could “quickly and efficiently” detect whether an essay had been written by artificial intelligence, went viral with more than 5m views.

I spent New Years building GPTZero — an app that can quickly and efficiently detect whether an essay is ChatGPT or human written — Edward Tian (@edward_the6) January 3, 2023

Streamlit, the free platform that hosts GPTZero, has since supported Tian with hosting and memory capabilities to keep up with web traffic.

To determine whether text was written by artificial intelligence, the app computes two measures: “perplexity”, which gauges how complex or unpredictable a text is, and “burstiness”, which compares the variation between its sentences.

The more familiar the text looks to the bot – which is trained on similar data – the lower its perplexity, and the likelier it is to have been generated by AI.
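Those two signals can be illustrated with a toy sketch (my own simplification, not Tian's actual code, which uses a neural language model): perplexity from a smoothed unigram model built on a tiny made-up reference corpus, and burstiness as the spread of sentence lengths.

```python
import math
import statistics

def unigram_perplexity(text, counts, total):
    """Perplexity of `text` under a unigram model with word counts `counts`."""
    words = text.lower().split()
    vocab = len(counts) + 1  # Laplace smoothing so unseen words score > 0
    log_prob = sum(
        math.log((counts.get(w, 0) + 1) / (total + vocab)) for w in words
    )
    return math.exp(-log_prob / len(words))

def burstiness(sentences):
    # Humans tend to mix long and short sentences; machine text is often
    # more uniform, so low variation in length hints at AI.
    return statistics.pstdev(len(s.split()) for s in sentences)

# Tiny stand-in for the model's training data.
reference = "the cat sat on the mat the dog sat on the rug".split()
counts = {}
for w in reference:
    counts[w] = counts.get(w, 0) + 1

# Text resembling the training data is less "perplexing" to the model.
print(unigram_perplexity("the cat sat on the mat", counts, len(reference)))
print(unigram_perplexity("quantum turbines hum softly", counts, len(reference)))
# Varied sentence lengths score as "burstier" than uniform ones.
print(burstiness(["one", "two three four five six seven eight nine"]))
```

A real detector scores per-token probabilities under an LLM rather than a unigram table, but the decision logic is the same: low perplexity plus low burstiness raises the suspicion of machine authorship.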

here's a demo with @nandoodles 's Linkedin post that used ChatGPT to successfully respond to Danish programmer David Hansson's opinions pic.twitter.com/5szgLIQdeN — Edward Tian (@edward_the6) January 3, 2023

Tian told subscribers the newer model used the same principles, but with an improved capacity to detect artificial intelligence in text.

“Through testing the new model on a dataset of BBC news articles and AI generated articles from the same headlines prompts, the improved model has a false positive rate of < 2%,” he said.

“The coming months, I’ll be completely focused on building GPTZero, improving the model capabilities, and scaling the app out fully.”

Toby Walsh, Scientia professor of artificial intelligence at the University of New South Wales, wasn’t convinced.

He said unless the app was picked up by a major company, it was unlikely to have an impact on ChatGPT’s capacity to be used for plagiarising.

“It’s always an arms race between tech to identify synthetic text and the apps,” he said. “And it’s quite easy to ask ChatGPT to rewrite in a more personable style … like rephrasing as an 11-year-old.

“This will make it harder, but it won’t stop it.”

Walsh said users could also ask ChatGPT to add more “randomness” into text to evade censors, and obfuscate with different synonyms and grammatical edits.

Meanwhile, he said each app developed to spot synthetic texts gave greater ability for artificial intelligence programs to evade detection.

And each time a user logged on to ChatGPT, it was generating human feedback to improve filters, both implicitly and explicitly.

“There’s a deep fundamental technical reason we’ll never win the arms race,” Walsh said.

“Every program used to identify synthetic text can be added to [the original program] to generate synthetic text to fool them … it’s always the case.

“We are training it but it’s getting better by the day.”

Users of GPTZero have cited mixed results.

GPTZero is a proposed anti-plagiarism tool that claims to be able to detect ChatGPT-generated text. Here's how it did on the first prompt I tried. https://t.co/ZmisoZt0uO pic.twitter.com/RhNU7B4k7B — Riley Goodside (@goodside) January 4, 2023

“It seemed like it was working on - and it does work for texts which are generated by GPT models entirely or generated with semi-human intervention,” one subscriber wrote.

“However … it does not work well with essays written by good writers. It false flagged so many essays as AI-written.

“This is at the same time a very useful tool for professors, and on the other hand a very dangerous tool - trusting it too much would lead to exacerbation of the false flags.”

“Nice attempt, but ChatGPT is so good at what it does,” another subscriber wrote.

“I have pasted in roughly 350 words of French … mostly generated by ChatGPT. The text is slightly manually edited for a better style, and generated with a strong, enforced context leading to the presence of proper nouns.

“That text passes the GPTZero test as human … I am not totally convinced that proper human-AI cooperation can be flagged.”


ChatGPT: A GPT-4 Turbo Upgrade and Everything Else to Know

It started as a research project. But ChatGPT has swept us away with its mind-blowing skills. Now, GPT-4 Turbo has improved in writing, math, logical reasoning and coding.


  • Shankland covered the tech industry for more than 25 years and was a science writer for five years before that. He has deep expertise in microprocessors, digital photography, computer hardware and software, internet standards, web technology, and more.

OpenAI's logo, a hexagonal rosette pattern

In 2022, OpenAI wowed the world when it introduced ChatGPT and showed us a chatbot with an entirely new level of power, breadth and usefulness, thanks to the generative AI technology behind it. Since then, ChatGPT has continued to evolve, including its most recent development: access to its latest GPT-4 Turbo model for paid users.

ChatGPT and generative AI aren't a novelty anymore, but keeping track of what they can do can be a challenge as new abilities arrive. Most notably, OpenAI now provides easier access to anyone who wants to use it. It also lets anyone write custom AI apps called GPTs and share them on its own app store, while on a smaller scale ChatGPT can now speak its responses to you. OpenAI has been leading the generative AI charge , but it's hotly pursued by Microsoft, Google and startups far and wide.


Generative AI still hasn't shaken a core problem -- it makes up information that sounds plausible but isn't necessarily correct. But there's no denying AI has fired the imaginations of computer scientists, loosened the purse strings of venture capitalists and caught the attention of everyone from teachers to doctors to artists and more, all wondering how AI will change their work and their lives. 

If you're trying to get a handle on ChatGPT, this FAQ is for you. Here's a look at what's up.


What is ChatGPT?

ChatGPT is an online chatbot that responds to "prompts" -- text requests that you type. ChatGPT has countless uses . You can request relationship advice, a summarized history of punk rock or an explanation of the ocean's tides. It's particularly good at writing software, and it can also handle some other technical tasks, like creating 3D models .

ChatGPT is called a generative AI because it generates these responses on its own. But it can also display more overtly creative output like screenplays, poetry, jokes and student essays. That's one of the abilities that really caught people's attention.

Much of AI has been focused on specific tasks, but ChatGPT is a general-purpose tool. This puts it more into a category like a search engine.

That breadth makes it powerful but also hard to fully control. OpenAI has many mechanisms in place to try to screen out abuse and other problems, but there's an active cat-and-mouse game afoot by researchers and others who try to get ChatGPT to do things like offer bomb-making recipes.

ChatGPT really blew people's minds when it began passing tests. For example, AnsibleHealth researchers reported in 2023 that " ChatGPT performed at or near the passing threshold " for the United States Medical Licensing Exam, suggesting that AI chatbots "may have the potential to assist with medical education, and potentially, clinical decision-making."

We're a long way from fully fledged doctor-bots you can trust, but the computing industry is investing billions of dollars to solve the problems and expand AI into new domains like visual data too. OpenAI is among those at the vanguard. So strap in, because the AI journey is going to be a sometimes terrifying, sometimes exciting thrill.

What's ChatGPT's origin?

Artificial intelligence algorithms had been ticking away for years before ChatGPT arrived. These systems were a big departure from traditional programming, which follows a rigid if-this-then-that approach. AI, in contrast, is trained to spot patterns in complex real-world data. AI has been busy for more than a decade screening out spam, identifying our friends in photos, recommending videos and translating our Alexa voice commands into computerese.

A Google technology called transformers helped propel AI to a new level, leading to a type of AI called a large language model, or LLM. These AIs are trained on enormous quantities of text, including material like books, blog posts, forum comments and news articles. The training process internalizes the relationships between words, letting chatbots process input text and then generate what they judge to be appropriate output text.

A second phase of building an LLM is called reinforcement learning from human feedback, or RLHF. That's when people review the chatbot's responses and steer it toward good answers or away from bad ones. That significantly alters the tool's behavior and is one important mechanism for trying to stop abuse.
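The steering idea can be sketched with a deliberately loose toy. This is not OpenAI's actual RLHF pipeline, which trains a separate reward model and updates the LLM with policy-gradient methods; the candidate replies and learning rate below are invented:

```python
import random

# Toy preference steering: thumbs-up feedback reinforces a reply,
# thumbs-down suppresses it, shifting future sampling.

weights = {"helpful reply": 1.0, "harmful reply": 1.0}

def pick(weights, rng):
    """Sample a reply with probability proportional to its weight."""
    total = sum(weights.values())
    r = rng.random() * total
    for reply, w in weights.items():
        r -= w
        if r <= 0:
            return reply
    return reply  # guard against floating-point underflow

def feedback(reply, thumbs_up, weights, lr=0.5):
    """Reinforce approved replies; suppress rejected ones."""
    weights[reply] *= (1 + lr) if thumbs_up else (1 - lr)

rng = random.Random(0)
for _ in range(20):
    reply = pick(weights, rng)
    # A human reviewer approves helpful output and rejects harmful output.
    feedback(reply, thumbs_up=(reply == "helpful reply"), weights=weights)

print(weights["helpful reply"] > weights["harmful reply"])  # True
```

The essential point survives the simplification: human judgments reshape the model's output distribution without retraining it from scratch.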

OpenAI's LLM is called GPT, which stands for "generative pretrained transformer." Training a new model is expensive and time consuming, typically taking weeks and requiring a data center packed with thousands of expensive AI acceleration processors. OpenAI's latest LLM is called GPT-4 Turbo . Other LLMs include Google's Gemini (formerly called Bard), Anthropic's Claude and Meta's Llama .

ChatGPT is an interface that lets you easily prompt GPT for responses. When it arrived as a free tool in November 2022, its use exploded far beyond what OpenAI expected.

When OpenAI launched ChatGPT, the company didn't even see it as a product. It was supposed to be a mere "research preview," a test that could draw some feedback from a broader audience, said ChatGPT product leader Nick Turley. Instead, it went viral, and OpenAI scrambled to just keep the service up and running under the demand.

"It was surreal," Turley said. "There was something about that release that just struck a nerve with folks in a way that we certainly did not expect. I remember distinctly coming back the day after we launched and looking at dashboards and thinking, something's broken, this couldn't be real, because we really didn't make a very big deal out of this launch."


ChatGPT, a name only engineers could love, was launched as a research project in November 2022, but quickly caught on as a consumer product.

How do I use ChatGPT?

The ChatGPT website is the most obvious method. Open it up, select the LLM version you want from the drop-down menu in the upper left corner, and type in a query.

As of April 1, OpenAI is allowing consumers to use ChatGPT without first signing up for an account. According to a blog post , the move was meant to make the tool more accessible. OpenAI also said in the post that as part of the move, it's introducing added content safeguards, blocking prompts in a wider range of categories.

However, users with accounts will be able to do more with the tool, such as save and review their history, share conversations and tap into features like voice conversations and custom instructions.

OpenAI in 2023 released a ChatGPT app for iPhones and for Android phones . In February, ChatGPT for Apple Vision Pro arrived , too, adding the chatbot's abilities to the "spatial computing" headset. Be careful to look for the genuine article, because other developers can create their own chatbot apps that link to OpenAI's GPT.

In January, OpenAI opened its GPT Store , a collection of custom AI apps that focus ChatGPT's all-purpose design to specific jobs. A lot more on that later, but in addition to finding them through the store you can invoke them with the @ symbol in a prompt, the way you might tag a friend on Instagram.

Microsoft uses GPT for its Bing search engine, which means you can also try out ChatGPT there.

ChatGPT is sprouting up in various hardware devices, including Volkswagen EVs , Humane's voice-controlled AI pin and the squarish Rabbit R1 device .

How much does ChatGPT cost?

It's free, though you have to set up an account to take advantage of all of its features.

For more capability, there's also a subscription called ChatGPT Plus that costs $20 per month that offers a variety of advantages: It responds faster, particularly during busy times when the free version is slow or sometimes tells you to try again later. It also offers access to newer AI models, including GPT-4 Turbo . OpenAI said it has improved capabilities in writing, math, logical reasoning and coding in this model.

The free ChatGPT uses the older GPT-3.5, which doesn't do as well on OpenAI's benchmark tests but which is faster to respond. The newest variation, GPT-4 Turbo, arrived in late 2023 with more up-to-date responses and an ability to ingest and output larger blocks of text.

ChatGPT is growing beyond its language roots. With ChatGPT Plus, you can upload images, for example, to ask what type of mushroom is in a photo.

Perhaps most importantly, ChatGPT Plus lets you use GPTs.

What are these GPTs?

GPTs are custom versions of ChatGPT from OpenAI, its business partners and thousands of third-party developers who created their own GPTs.

Sometimes when people encounter ChatGPT, they don't know where to start. OpenAI calls it the "empty box problem." Discovering that led the company to find a way to narrow down the choices, Turley said.

"People really benefit from the packaging of a use case -- here's a very specific thing that I can do with ChatGPT," like travel planning, cooking help or an interactive, step-by-step tool to build a website, Turley said.


OpenAI CEO Sam Altman announces custom AI apps called GPTs at a developer event in November 2023.

Think of GPTs as OpenAI trying to make the general-purpose power of ChatGPT more refined the same way smartphones have a wealth of specific tools. (And think of GPTs as OpenAI's attempt to take control over how we find, use and pay for these apps, much like Apple has a commanding role over iPhones through its App Store.)

What GPTs are available now?

OpenAI's GPT store now offers millions of GPTs, though as with smartphone apps, you'll probably not be interested in most of them. A range of custom GPT apps are available, including AllTrails personal trail recommendations, a Khan Academy programming tutor, a Canva design tool, a book recommender, a fitness trainer, the Laundry Buddy clothes-washing label decoder, a music theory instructor, a haiku writer and the Pearl for Pets vet advice bot.

One person excited by GPTs is Daniel Kivatinos, co-founder of financial services company JustPaid . His team is building a GPT designed to take a spreadsheet of financial data as input and then let executives ask questions. How fast is a startup going through the money investors gave it? Why did that employee just file a $6,000 travel expense?

JustPaid hopes that GPTs will eventually be powerful enough to accept connections to bank accounts and financial software, which would mean a more powerful tool. For now, the developers are focusing on guardrails to avoid problems like hallucinations -- those answers that sound plausible but are actually wrong -- or making sure the GPT is answering based on the users' data, not on some general information in its AI model, Kivatinos said.

Anyone can create a GPT, at least in principle. OpenAI's GPT editor walks you through the process with a series of prompts. As with regular ChatGPT, the better you craft your prompts, the better the results.

Another notable difference from regular ChatGPT: GPTs let you upload extra data that's relevant to your particular GPT, like a collection of essays or a writing style guide.

Some of the GPTs draw on OpenAI's Dall-E tool for turning text into images, which can be useful and entertaining. For example, there is a coloring book picture creator , a logo generator and a tool that turns text prompts into diagrams like company org charts. OpenAI calls Dall-E a GPT.

How up to date is ChatGPT?

Not very, and that can be a problem. For example, a Bing search using ChatGPT to process results said OpenAI hadn't yet released its ChatGPT Android app. Search results from traditional search engines can help to "ground" AI results, and indeed that's part of the Microsoft-OpenAI partnership that can tweak ChatGPT Plus results.

GPT-4 Turbo, announced in November, is trained on data up through April 2023. But it's nothing like a search engine whose bots crawl news sites many times a day for the latest information.

Can you trust ChatGPT responses?

No. Well, sometimes, but you need to be wary.

Large language models work by stringing words together, one after another, based on what's probable each step of the way. But it turns out that an LLM's generative AI works better and sounds more natural with a little spice of randomness added to the word-selection recipe. That's the basic statistical nature that underlies the criticism that LLMs are mere "stochastic parrots" rather than sophisticated systems that in some way understand the world's complexity.

The result of this system, combined with the steering influence of the human training, is an AI that produces results that sound plausible but that aren't necessarily true. ChatGPT does better with information that's well represented in training data and undisputed -- for instance, red traffic signals mean stop, Plato was a philosopher who wrote the Allegory of the Cave , an Alaskan earthquake in 1964 was the largest in US history at magnitude 9.2.
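That "spice of randomness" is commonly implemented as temperature sampling. A minimal sketch, using a made-up three-word vocabulary and scores rather than a real model's output:

```python
import math
import random

def sample_next_word(logits, temperature, rng):
    """Pick a word by temperature-scaled softmax sampling over raw scores."""
    scaled = {w: s / temperature for w, s in logits.items()}
    top = max(scaled.values())
    exps = {w: math.exp(s - top) for w, s in scaled.items()}  # stable softmax
    total = sum(exps.values())
    # Draw one word in proportion to its probability.
    r = rng.random() * total
    for word, e in exps.items():
        r -= e
        if r <= 0:
            return word
    return word

# Invented scores for continuing "red traffic signals mean ..."
logits = {"stop": 3.0, "go": 1.0, "yield": 0.5}
rng = random.Random(42)
# Low temperature: nearly deterministic, almost always the top word.
print(sample_next_word(logits, temperature=0.1, rng=rng))  # stop
# High temperature: flatter distribution, so choices vary across draws.
samples = {sample_next_word(logits, temperature=2.0, rng=rng) for _ in range(50)}
print(sorted(samples))
```

Turning the temperature up is roughly what "add more randomness" means: probable words still dominate, but less-likely ones get a real chance, which is why the same prompt can yield different answers.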


We humans interact with AI chatbots by writing prompts -- questions or statements that seek an answer from the information stored in the chatbot's underlying large language model. 

When facts are more sparsely documented, controversial or off the beaten track of human knowledge, LLMs don't work as well. Unfortunately, they sometimes produce incorrect answers with a convincing, authoritative voice. That's what tripped up a lawyer who used ChatGPT to bolster his legal case, only to be reprimanded when it emerged that ChatGPT had fabricated some cases that appeared to support his arguments. "I did not comprehend that ChatGPT could fabricate cases," he said, according to The New York Times.

Such fabrications are called hallucinations in the AI business.

That means when you're using ChatGPT, it's best to double check facts elsewhere.

But there are plenty of creative uses for ChatGPT that don't require strictly factual results.

Want to use ChatGPT to draft a cover letter for a job hunt or give you ideas for a themed birthday party? No problem. Looking for hotel suggestions in Bangladesh? ChatGPT can give useful travel itineraries , but confirm the results before booking anything.

Is the hallucination problem getting better?

Yes, but we haven't seen a breakthrough.

"Hallucinations are a fundamental limitation of the way that these models work today," Turley said. LLMs just predict the next word in a response, over and over, "which means that they return things that are likely to be true, which is not always the same as things that are true," Turley said.

But OpenAI has been making gradual progress. "With nearly every model update, we've gotten a little bit better on making the model both more factual and more self aware about what it does and doesn't know," Turley said. "If you compare ChatGPT now to the original ChatGPT, it's much better at saying, 'I don't know that' or 'I can't help you with that' versus making something up."

Hallucinations are so much a part of the zeitgeist that Dictionary.com touted it as a new word it added to its dictionary in 2023.

Can you use ChatGPT for wicked purposes?

You can try, but lots of it will violate OpenAI's terms of use , and the company tries to block it too. The company prohibits use that involves sexual or violent material, racist caricatures, and personal information like Social Security numbers or addresses.

OpenAI works hard to prevent harmful uses. Indeed, its basic sales pitch is trying to bring the benefits of AI to the world without the drawbacks. But it acknowledges the difficulties, for example in its GPT-4 "system card" that documents its safety work.

"GPT-4 can generate potentially harmful content, such as advice on planning attacks or hate speech. It can represent various societal biases and worldviews that may not be representative of the user's intent, or of widely shared values. It can also generate code that is compromised or vulnerable," the system card says. It also can be used to try to identify individuals and could help lower the cost of cyberattacks.

Through a process called red teaming, in which experts try to find unsafe uses of its AI and bypass protections, OpenAI identified lots of problems and tried to nip them in the bud before GPT-4 launched. For example, a prompt to generate jokes mocking a Muslim boyfriend in a wheelchair was diverted so its response said, "I cannot provide jokes that may offend someone based on their religion, disability or any other personal factors. However, I'd be happy to help you come up with some light-hearted and friendly jokes that can bring laughter to the event without hurting anyone's feelings."

Researchers are still probing LLM limits. For example, Italian researchers discovered they could use ChatGPT to fabricate fake but convincing medical research data . And Google DeepMind researchers found that telling ChatGPT to repeat the same word forever eventually caused a glitch that made the chatbot blurt out training data verbatim. That's a big no-no, and OpenAI barred the approach .

LLMs are still new. Expect more problems and more patches.

And there are plenty of uses for ChatGPT that might be allowed but ill-advised. The website of Philadelphia's sheriff published more than 30 bogus news stories generated with ChatGPT .

What about ChatGPT and cheating in school?

ChatGPT is well suited to short essays on just about anything you might encounter in high school or college, to the chagrin of many educators who fear students will type in prompts instead of thinking for themselves.


Microsoft CEO Satya Nadella touted his company's partnership with OpenAI at a November 2023 event for OpenAI developers. Microsoft uses OpenAI's GPT large language model for its Bing search engine, Office productivity tools and GitHub Copilot programming assistant.

ChatGPT also can solve some math problems, explain physics phenomena, write chemistry lab reports and handle all kinds of other work students are supposed to handle on their own. Companies that sell anti-plagiarism software have pivoted to flagging text they believe an AI generated.

But not everyone is opposed. Some educators see ChatGPT as a tool akin to Google search and Wikipedia that can help students.

"There was a time when using calculators on exams was a huge no-no," said Alexis Abramson, dean of Dartmouth's Thayer School of Engineering. "It's really important that our students learn how to use these tools, because 90% of them are going into jobs where they're going to be expected to use these tools. They're going to walk in the office and people will expect them, being age 22 and technologically savvy, to be able to use these tools."

ChatGPT also can help kids get past writer's block and can help kids who aren't as good at writing, perhaps because English isn't their first language, she said.

So for Abramson, using ChatGPT to write a first draft or polish their grammar is fine. But she asks her students to disclose that fact.

"Anytime you use it, I would like you to include what you did when you turn in your assignment," she said. "It's unavoidable that students will use ChatGPT, so why don't we figure out a way to help them use it responsibly?"

Is ChatGPT coming for my job?

The threat to employment is real as managers seek to replace expensive humans with cheaper automated processes. We've seen this movie before: elevator operators were replaced by buttons, bookkeepers were replaced by accounting software, welders were replaced by robots. 

ChatGPT has all sorts of potential to blitz white-collar jobs: paralegals summarizing documents, marketers writing promotional materials, tax advisers interpreting IRS rules, even therapists offering relationship advice.

But so far, in part because of problems with things like hallucinations, AI companies present their bots as assistants and "copilots," not replacements.

And so far, sentiment is more positive than negative about chatbots, according to a survey by consulting firm PwC. Of 53,912 people surveyed around the world, 52% expressed at least one good expectation about the arrival of AI, for example that AI would increase their productivity. That compares with 35% who had at least one negative thing to say, for example that AI will replace them or require skills they're not confident they can learn.

How will ChatGPT affect programmers?

Software development is a particular area where people have found ChatGPT and its rivals useful. Trained on millions of lines of code, it internalized enough information to build websites and mobile apps. It can help programmers frame up bigger projects or fill in details.

One of the biggest fans is Microsoft's GitHub, a site where developers can host projects and invite collaboration. Nearly a third of people maintaining GitHub projects use its GPT-based assistant, called Copilot, and 92% of US developers say they're using AI tools.

"We call it the industrial revolution of software development," said GitHub Chief Product Officer Inbal Shani. "We see it lowering the barrier for entry. People who are not developers today can write software and develop applications using Copilot."

It's the next step in making programming more accessible, she said. Programmers used to have to understand bits and bytes, then higher-level languages gradually eased the difficulties. "Now you can write coding the way you talk to people," she said.

But AI programming aids still have a lot to prove. Researchers from Stanford and the University of California San Diego found in a study of 47 programmers that those with access to an OpenAI programming assistant "wrote significantly less secure code" than those without access.

And they raise a variation of the cheating problem that worries some teachers: copying software that shouldn't be copied, which can lead to copyright problems. AIs can inadvertently copy code from other sources, which is why Copyleaks, a maker of plagiarism detection software, offers a tool called the Codeleaks Source Code AI Detector designed to spot AI-generated code from ChatGPT, Google Gemini and GitHub Copilot. Its latest version is designed to spot copied code based on its semantic structures, not just verbatim text.

At least in the next five years, Shani doesn't see AI tools like Copilot as taking humans out of programming.

"I don't think that it will replace the human in the loop. There's some capabilities that we as humanity have -- the creative thinking, the innovation, the ability to think beyond how a machine thinks in terms of putting things together in a creative way. That's something that the machine can still not do."

Editors' note: CNET used an AI engine to help create several dozen stories, which are labeled accordingly. For more, see our  AI policy .

Amanda Hoover

Students Are Likely Writing Millions of Papers With AI


Students have submitted more than 22 million papers that may have used generative AI in the past year, new data released by plagiarism detection company Turnitin shows.

A year ago, Turnitin rolled out an AI writing detection tool trained on its trove of papers written by students as well as other AI-generated texts. Since then, the detector has reviewed more than 200 million papers, predominantly written by high school and college students. Turnitin found that 11 percent may contain AI-written language in at least 20 percent of their content, with 3 percent of the total papers reviewed getting flagged for having 80 percent or more AI writing. (Turnitin is owned by Advance, which also owns Condé Nast, publisher of WIRED.) Turnitin says its detector has a false positive rate of less than 1 percent when analyzing full documents.
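The scale behind those percentages is easier to see in absolute numbers. A minimal back-of-the-envelope sketch, using only the figures quoted above:

```python
# Back-of-the-envelope arithmetic for Turnitin's reported figures.
papers_reviewed = 200_000_000  # "more than 200 million papers have been reviewed"

# 11 percent may contain AI-written language in at least 20 percent of content
partially_ai = round(papers_reviewed * 0.11)

# 3 percent of the total were flagged as 80 percent or more AI-written
mostly_ai = round(papers_reviewed * 0.03)

print(f"{partially_ai:,}")  # 22,000,000 -- the "more than 22 million" headline figure
print(f"{mostly_ai:,}")     # 6,000,000
```

Since the 200 million total is itself a floor, both counts are floors too.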

ChatGPT’s launch was met with knee-jerk fears that the English class essay would die . The chatbot can synthesize information and distill it near-instantly—but that doesn’t mean it always gets it right. Generative AI has been known to hallucinate , creating its own facts and citing academic references that don’t actually exist. Generative AI chatbots have also been caught spitting out biased text on gender and race . Despite those flaws, students have used chatbots for research, organizing ideas, and as a ghostwriter . Traces of chatbots have even been found in peer-reviewed, published academic writing .

Teachers understandably want to hold students accountable for using generative AI without permission or disclosure. But that requires a reliable way to prove AI was used in a given assignment. Instructors have tried at times to find their own solutions to detecting AI in writing, using messy, untested methods to enforce rules , and distressing students. Further complicating the issue, some teachers are even using generative AI in their grading processes.

Detecting the use of gen AI is tricky. It’s not as easy as flagging plagiarism, because generated text is still original text. Plus, there’s nuance to how students use gen AI; some may ask chatbots to write their papers for them in large chunks or in full, while others may use the tools as an aid or a brainstorm partner.

Students also aren't tempted only by ChatGPT and similar large language models. So-called word spinners are another type of AI software, which rewrites text and may make it less obvious to a teacher that work was plagiarized or AI-generated. Turnitin's AI detector has been updated to detect word spinners, says Annie Chechitelli, the company's chief product officer. It can also flag work rewritten by services like the spell checker Grammarly, which now has its own generative AI tool. As familiar software increasingly adds generative AI components, what students can and can't use becomes more muddled.

Detection tools themselves have a risk of bias. English language learners may be more likely to set them off; a 2023 study found a 61.3 percent false positive rate when evaluating Test of English as a Foreign Language (TOEFL) exams with seven different AI detectors. The study did not examine Turnitin’s version. The company says it has trained its detector on writing from English language learners as well as native English speakers. A study published in October found that Turnitin was among the most accurate of 16 AI language detectors in a test that had the tool examine undergraduate papers and AI-generated papers.
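Why false positive rates matter so much is a base-rate question: even a small error rate lands on many honest writers when millions of papers are scanned. The sketch below is purely illustrative; the 1 percent false positive rate is Turnitin's claimed ceiling on full documents, while the sensitivity and the share of papers actually using AI are invented assumptions for the example.

```python
# Base-rate illustration (sensitivity and AI share are hypothetical assumptions):
false_positive_rate = 0.01  # Turnitin's claimed ceiling on full documents
sensitivity = 0.90          # assumed: detector catches 90% of real AI papers
ai_share = 0.25             # assumed: 25% of submitted papers actually use AI

# Share of all papers flagged from each group
honest_flagged = (1 - ai_share) * false_positive_rate  # honest papers wrongly flagged
ai_flagged = ai_share * sensitivity                    # AI papers correctly flagged

# Bayes' rule: probability a flagged paper really used AI
p_ai_given_flag = ai_flagged / (ai_flagged + honest_flagged)
print(round(p_ai_given_flag, 2))  # 0.97 under these assumptions
```

Even under these favorable assumptions, roughly one flag in thirty hits an honest writer; with a false positive rate anywhere near the 61.3 percent measured on TOEFL exams, most flags would be wrong.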

Schools that use Turnitin had access to the AI detection software for a free pilot period, which ended at the start of this year. Chechitelli says a majority of the service’s clients have opted to purchase the AI detection. But the risks of false positives and bias against English learners have led some universities to ditch the tools for now. Montclair State University in New Jersey announced in November that it would pause use of Turnitin’s AI detector. Vanderbilt University and Northwestern University did the same last summer.

“This is hard. I understand why people want a tool,” says Emily Isaacs, executive director of the Office of Faculty Excellence at Montclair State. But Isaacs says the university is concerned about potentially biased results from AI detectors, as well as the fact that the tools can’t provide confirmation the way they can with plagiarism. Plus, Montclair State doesn’t want to put a blanket ban on AI, which will have some place in academia. With time and more trust in the tools, the policies could change. “It’s not a forever decision, it’s a now decision,” Isaacs says.

Chechitelli says the Turnitin tool shouldn’t be the only consideration in passing or failing a student. Instead, it’s a chance for teachers to start conversations with students that touch on all of the nuance in using generative AI. “People don’t really know where that line should be,” she says.

How to use Meta’s new AI chatbot that you can’t avoid

Facebook, Instagram, Messenger and WhatsApp are all pushing a new AI chatbot.

With seemingly fewer friends posting to their main Facebook and Instagram feeds, Meta has introduced a new feature its users can talk to: an AI chatbot.

The feature, named Meta AI, is rolling out to the company's main apps including Facebook, Instagram, Messenger and WhatsApp. It's primarily a conversational chat window where you can ask questions and generate AI images, similar to other AI chatbots like OpenAI's ChatGPT, Microsoft's Copilot and Google's Gemini.

Despite over a year of artificial intelligence being everywhere, this could be many people’s first interaction with the technology. Meta has billions of users across its apps, and anyone who has managed to avoid the bots so far will find this one nearly impossible to escape.

Why is this AI chatbot here?

Facebook and Instagram users probably weren’t banging down Mark Zuckerberg’s door demanding an AI chatbot, so why is this feature suddenly everywhere? The technology is still new and its utility debatable. However, the major tech companies have decided that, like voice assistants and scrollable vertical videos before it, AI is the next big thing. Now they are competing to push out their versions. Facebook and Instagram used to rely on users’ friends, family and communities to keep their attention. Now, as these platforms are aging, the companies may hope a chatty bot can replace some of the human interaction.

How do I find it?

The chatbot is integrated into search and messaging features across Meta's apps, and may appear in your feed under some posts as well. If you don't see the AI features yet, check back later. Its presence is marked with its logo: a thin ring that's mostly blue and occasionally animated. The AI tool can also be accessed online at the stand-alone website meta.ai. It is not included in the company's app for children, Messenger Kids.

On Facebook, tap the search icon on top and you’ll find that the usual search bar has been replaced with one that says, “Ask Meta AI anything.” As you start typing, it will auto-suggest searches. Anything with the blue circle next to it is going to bring up the AI chat window. You can also tap the messages icon and engage with Meta AI as if it’s another pal to talk to. If you see it under a post in your feed, it will suggest questions to ask related to the content you see.

In Instagram, Messenger and WhatsApp, you’ll also find Meta AI has taken over the search bars and appears as another chat. If your accounts are connected to each other, the Meta AI conversation should pick up where you left off, regardless of what app you’re in.

How do I turn it off?

There’s no way to get rid of Meta AI in search, confirmed Meta. In WhatsApp, there is an option to hide the new Meta AI button by going to Settings → Chats → Show Meta AI Button. However, it’s still in the search bar. Other apps have an option to mute its replies. I asked the AI chatbot how to turn it off and got multiple incorrect answers with instructions that did not work and for settings that don’t exist.

You can delete a chat with Meta AI to remove it from recent conversations in the same way you would any other chat. Swipe left on the chat and select Delete in Instagram, More → Delete on Facebook and Messenger, and More → Delete Chat on WhatsApp.

How do I get started?

Start typing full sentences or random words in any of the apps’ search bars or in the conversations with Meta AI. If this is your first time using an AI chatbot, you can begin by asking simple questions and even for a list of ways to use it.

I did the first things any normal person does when testing an AI tool. I asked it to be my pretend boyfriend, told it to generate images of ducks writing breakup letters and tried to push its boundaries. I discovered it avoids partaking in overtly sexual conversations or generating photos of the Pope (entirely unrelated questions). As with all artificial intelligence, there are creative ways to get around its filters.

Meta AI also offers shortcuts. Type a forward slash and a command, like /joke:, /imagine: or /story:, then type your description after it. However, these aren't really necessary, since you can make the same requests conversationally, such as "tell me a story about a depressed hamster who ran for mayor."

What should I use it for?

An AI chatbot is like having an enthusiastic but unreliable friend. You can ask it almost anything — but never assume it’s telling the truth. With that in mind, use Meta AI for fun and for noncritical tasks. Ask random questions like you would with Google, start conversations to feel less alone and use it to brainstorm.

Meta AI can also generate images, though in our tests they have the typical flaws associated with artificial intelligence. Most share the hyper-realistic lighting that AI images are known for, fumble details like fingers and eyes, and frequently give women exposed, ample cleavage.

There are plenty of other things you can try. Ask Meta AI to animate images, request a summary of the day’s news or ask it to take on the personality of a specific character when speaking to you. Because it’s integrated with Meta’s other products, you can use it to search things like “Reels of people learning to roller skate.”

To get the best results and avoid bland responses, ask follow-up questions and give as many details as possible. For a list of starter ideas, check out Tech Friend Shira Ovide’s recommendations of useful things to ask a chatbot .

What should I not use it for?

Don’t use AI as an authority for anything of consequence. For example, don’t rely on a chatbot for medical advice or as a source for school or work. Ethically, you shouldn’t use it to write papers for school, though Meta AI is happy to spit out wooden essays on demand.

Experts warn there is a danger of misinformation from tools like Meta’s chatbot. To steer clear, avoid using it as a go-to for anything sensitive or political. Turn to human sources instead like reporters, experts, even Wikipedia and Reddit, before artificial intelligence. For more advice on avoiding misinformation, check out our guide.

How is it different from other AI bots?

For the basics, Meta AI appears to spit out the same generic answers as its competitors. I asked five different chatbots about the best taqueria in San Francisco, a vegetarian meal plan, whether God exists and how to know if a polycule is right for you. For the most part, they all gave incredibly similar, mundane but neutral answers, with the exception of Microsoft's Copilot, which does not enjoy shenanigans.

Is it keeping my information?

Use the same precautions typing questions and thoughts into an AI chatbot as you would into a Google search. Meta does save the conversations, but to protect privacy the data is anonymized, meaning it's not connected to your name or identity. While this is standard for technology companies, experts say it's possible to re-identify people using additional data points. If you want to delete a chat, you can use the shortcut "/reset-ai", and Meta claims it will remove the conversation from its servers.

Did a Fourth Grader Write This? Or the New Chatbot?

By Claire Cain Miller ,  Adam Playford ,  Larry Buchanan and Aaron Krolik Dec. 26, 2022

Don’t be surprised if you can’t always tell. Neither could two teachers, a professor, nor even the renowned children's author Judy Blume.

“I’m just gonna say it’s a student and prepare for my soul to be crushed.”

It’s hard to fully grasp the enormous potential of ChatGPT , a new artificial intelligence chatbot released last month. The bot doesn’t just search and summarize information that already exists. It creates new content, tailored to your request, often with a startling degree of nuance, humor and creativity. Most of us have never seen anything like it outside of science fiction.

To better understand what ChatGPT can do, we decided to see if people could tell the difference between the bot’s writing and a child’s.

We used real essay prompts from the National Assessment of Educational Progress (the standardized test from the Department of Education, known as the nation’s report card). We asked the bot to produce essays based on those prompts — sometimes with a little coaching, and always telling it to write like a student of the appropriate age. We put what it wrote side by side with sample answers written by real children.

We asked some experts on children’s writing to take our variation on the Turing test , live on a call with us. They were a fourth-grade teacher; a professional writing tutor; a Stanford education professor; and Judy Blume, the beloved children’s author. None of them could tell every time whether a child or a bot wrote the essay. See how you do.

AI’s ability to write for us—and our inability to resist ‘The Button’—will spark a crisis of meaning in creative work

"Co-Intelligence: Living and Working with AI," by Ethan Mollick.

Soon, every major office application and email client will include a button to help you create a draft of your work. It deserves capital letters: The Button.

When faced with the tyranny of the blank page, people are going to push The Button. It is so much easier to start with something than nothing. Students are going to use it to start essays. Managers will use it to start emails, reports, or documents. Teachers will use it when providing feedback. Scientists will use it to write grants. Concept artists will use it for their first draft. Everyone is going to use The Button.

The implications of having AI write our first drafts (even if we do the work ourselves, which is not a given) are huge. One consequence is that we could lose our creativity and originality. When we use AI to generate our first drafts, we tend to anchor on the first idea that the machine produces, which influences our future work. Even if we rewrite the drafts completely, they will still be tainted by the AI’s influence. We will not be able to explore different perspectives and alternatives, which could lead to better solutions and insights.

Another consequence is that we could reduce the quality and depth of our thinking and reasoning. When we use AI to generate our first drafts, we don’t have to think as hard or as deeply about what we write. We rely on the machine to do the hard work of analysis and synthesis, and we don’t engage in critical and reflective thinking ourselves. We also miss the opportunity to learn from our mistakes and feedback and the chance to develop our own style.

AI can do it

There is already evidence that this is going to be a problem. A recent MIT study found that ChatGPT mostly serves as a substitute for human effort, not a complement to our skills. In fact, the vast majority of participants didn’t even bother editing the AI’s output. This is a problem I see repeatedly when people first use AI: they just paste in the exact question they are asked and let the AI answer it.

A lot of work is time-consuming by design. In a world in which the AI gives an instant, pretty good, near universally accessible shortcut, we’ll soon face a crisis of meaning in creative work of all kinds. This is, in part, because we expect creative work to take careful thought and revision, but also that time often operates as a stand-in for work. Take, for example, the letter of recommendation. Professors are asked to write letters for students all the time, and a good letter takes a long time to write. You have to understand the student and the reason for the letter, decide how to phrase the letter to align with the job requirements and the student’s strengths, and more. The fact that it is time-consuming is somewhat the point. That a professor takes the time to write a good letter is a sign that they support the student’s application. We are setting our time on fire to signal to others that this letter is worth reading.

Or we can push The Button.

And the problem is that the letter the AI generates is going to be good. Not just grammatically correct, but persuasive and insightful to a human reader. It is going to be better than most letters of recommendation that I receive. This means that not only is the quality of the letter no longer a signal of the professor’s interest, but also that you may actually be hurting people by not writing a letter of recommendation by AI, especially if you are not a particularly strong writer. So people now have to consider that the goal of the letter (getting a student a job) is in contrast with the morally correct method of accomplishing the goal (the professor spending a lot of time writing the letter). I am still doing all my letters the old-fashioned way, but I wonder whether that will ultimately do my students a disservice.

Now consider all the other tasks whose final written output is important because it is a signal of the time spent on the task and of the thoughtfulness that went into it—performance reviews, strategic memos, college essays, grant applications, speeches, comments on papers. And so much more.

Reconstructing meaning

Then The Button starts to tempt everyone. Work that was boring to do but meaningful when completed by humans (like performance reviews) becomes easy to outsource—and the apparent quality actually increases. We start to create documents mostly with AI that get sent to AI-powered inboxes, where the recipients respond primarily with AI. Even worse, we still create the reports by hand but realize that no human is actually reading them. This kind of meaningless task, what organizational theorists have called mere ceremony, has always been with us. But AI will make a lot of previously useful tasks meaningless. It will also remove the facade that previously disguised meaningless tasks. We may not have always known if our work mattered in the bigger picture, but in most organizations, the people in your part of the organizational structure felt it did. With AI-generated work sent to other AIs to assess, that sense of meaning disappears.

We are going to need to reconstruct meaning, in art and in the rituals of creative work. This is not an easy process, but we have done it before, many times. Where musicians once made money from records, they now depend on being excellent live performers. When photography made realistic oil paintings obsolete, artists started pushing the bounds of photography as art. When the spreadsheet made adding data by hand unneeded, clerks shifted their responsibilities to bigger-picture issues. This change in meaning is going to have a large effect on work.

Excerpted with permission from Co-Intelligence: Living and Working with AI , by Ethan Mollick, in agreement with Portfolio, an imprint of Penguin Publishing Group, a division of Penguin Random House LLC. Copyright © Ethan Mollick, 2024.

Ethan Mollick is a professor of management at Wharton, specializing in entrepreneurship and innovation. He writes the AI-focused blog One Useful Thing and is the creator of numerous educational games on a variety of topics. 

Commentary   Nov 6, 2023

Ten Ways AI Will Change Democracy

In a new essay, Harvard Kennedy School’s Bruce Schneier goes beyond AI generated disinformation to detail other novel ways in which AI might alter how democracy functions.


Artificial intelligence will change so many aspects of society, largely in ways that we cannot conceive of yet. Democracy, and the systems of governance that surround it, will be no exception. In this short essay, I want to move beyond the "AI-generated disinformation" trope and speculate on some of the ways AI will change how democracy functions – in both large and small ways.

When I survey how artificial intelligence might upend different aspects of modern society, democracy included, I look at four different dimensions of change: speed, scale, scope, and sophistication. Look for places where changes in degree result in changes of kind. Those are where the societal upheavals will happen.

Some items on my list are still speculative, but none require science-fictional levels of technological advance. And we can see the first stages of many of them today. When reading about the successes and failures of AI systems, it's important to differentiate between the fundamental limitations of AI as a technology, and the practical limitations of AI systems in the fall of 2023. Advances are happening quickly, and the impossible is becoming the routine. We don't know how long this will continue, but my bet is on continued major technological advances in the coming years. Which means it's going to be a wild ride.

So, here’s my list:

1. AI as educator. We are already seeing AI serving the role of teacher. It’s much more effective for a student to learn a topic from an interactive AI chatbot than from a textbook. This has applications for democracy. We can imagine chatbots teaching citizens about different issues, such as climate change or tax policy. We can imagine candidates deploying chatbots of themselves, allowing voters to directly engage with them on various issues. A more general chatbot could know the positions of all the candidates, and help voters decide which best represents their position. There are a lot of possibilities here.

2. AI as sense maker. There are many areas of society where accurate summarization is important. Today, when constituents write to their legislator, those letters get put into two piles – one for and another against – and someone compares the height of those piles. AI can do much better. It can provide a rich summary of the comments. It can help figure out which are unique and which are form letters. It can highlight unique perspectives. This same system can also work for comments to different government agencies on rulemaking processes – and on documents generated during the discovery process in lawsuits.
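The form-letter problem above is itself tractable with far simpler tools than an LLM: near-duplicate letters can be grouped by text similarity, leaving the singletons as the unique perspectives worth highlighting. The sketch below is a toy illustration of that one sub-task, not a description of any deployed system; the `cluster_form_letters` function and its 0.9 similarity threshold are assumptions chosen for the example, and a real pipeline would likely use embeddings or an LLM rather than character-level matching.

```python
from difflib import SequenceMatcher

def cluster_form_letters(letters, threshold=0.9):
    """Group near-identical letters (likely copies of a form letter).

    Returns a list of clusters, each a list of indices into `letters`.
    Singleton clusters are the unique, individually written letters.
    """
    clusters = []  # each entry: (representative_text, [member indices])
    for i, text in enumerate(letters):
        for rep, members in clusters:
            # A ratio near 1.0 means the two texts are almost identical.
            if SequenceMatcher(None, rep, text).ratio() >= threshold:
                members.append(i)
                break
        else:
            clusters.append((text, [i]))
    return [members for _, members in clusters]

letters = [
    "Please vote no on the highway bill. It will ruin our town.",
    "Please vote no on the highway bill. It will ruin our city.",
    "I support the highway bill because it creates jobs.",
]
print(cluster_form_letters(letters))  # first two letters cluster together
```

Because the comparison is pairwise against each cluster's representative, this is quadratic in the number of clusters and only suitable for small batches; it is meant to make the "unique vs. form letter" distinction concrete, not to scale.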

3. AI as moderator, mediator, and consensus builder. Imagine online conversations where an AI serves the role of moderator. It could ensure that all voices are heard. It could block hateful – or even just off-topic – comments. It could highlight areas of agreement and disagreement. It could help the group reach a decision. This is nothing that a human moderator can't do, but there aren't enough human moderators to go around. AI can give this capability to every decision-making group. At the extreme, an AI could be an arbiter – a judge – weighing evidence and making a decision. These capabilities don't exist yet, but they are not far off.

4. AI as lawmaker. We have already seen proposed legislation written by AI, albeit more as a stunt than anything else. But in the future AIs will help craft legislation, dealing with the complex ways laws interact with each other. More importantly, AIs will eventually be able to craft loopholes in legislation, ones potentially too complicated for people to easily notice. On the other side of that, AIs could be used to find loopholes in legislation – for both existing and pending laws. And more generally, AIs could be used to help develop policy positions.

5. AI as political strategist. Right now, you can ask your favorite chatbot questions about political strategy: what legislation would further your political goals, what positions to publicly take, what campaign slogans to use. The answers you get won't be very good, but that'll improve with time. In the future we should expect politicians to make use of this AI expertise: not to follow blindly, but as another source of ideas. And as AIs become more capable of using tools, they can automatically conduct polls and focus groups to test out political ideas. There are a lot of possibilities here. AIs could also engage in fundraising campaigns, directly soliciting contributions from people.

6. AI as lawyer. We don't yet know which aspects of the legal profession can be done by AIs, but many routine tasks that are now handled by attorneys will soon be able to be completed by an AI. Early attempts at having AIs write legal briefs haven't worked, but this will change as the systems get better at accuracy. Additionally, AIs can help people navigate government systems: filling out forms, applying for services, contesting bureaucratic actions. And future AIs will be much better at writing legalese, reducing the cost of legal counsel.

7. AI as cheap reasoning generator. More generally, AI chatbots are really good at generating persuasive arguments. Today, writing out a persuasive argument takes time and effort, and our systems reflect that. We can easily imagine AIs conducting lobbying campaigns, generating and submitting comments on legislation and rulemaking. This also has applications for the legal system. For example: if it is suddenly easy to file thousands of court cases, this will overwhelm the courts. Solutions for this are hard. We could increase the cost of filing a court case, but that becomes a burden on the poor. The only solution might be another AI working for the court, dealing with the deluge of AI-filed cases – which doesn't sound like a great idea.

8. AI as law enforcer. Automated systems already act as law enforcement in some areas: speed trap cameras are an obvious example. AI can take this kind of thing much further, automatically identifying people who cheat on tax returns or when applying for government services. This has the obvious problem of false positives, which could be hard to contest if the courts believe that “the computer is always right.” Separately, future laws might be so complicated that only AIs are able to decide whether or not they are being broken. And, like breathalyzers, defendants might not be allowed to know how they work.

9. AI as propagandist. AIs can produce and distribute propaganda faster than humans can. This is an obvious risk, but we don't know how effective any of it will be. It makes disinformation campaigns easier, which means that more people will take advantage of them. But people will also become more inured to the risks. More importantly, AI's ability to summarize and understand text can enable much more effective censorship.

10. AI as political proxy. Finally, we can imagine an AI voting on behalf of individuals. A voter could feed an AI their social, economic, and political preferences; or the AI could infer them by listening to the voter talk and watching their actions. And then it could be empowered to vote on their behalf, either for others who would represent them or directly on ballot initiatives. On the one hand, this would greatly increase voter participation. On the other hand, it would further disengage people from the act of understanding politics and engaging in democracy.

When I teach AI policy at HKS, I stress the importance of separating the specific AI chatbot technologies of November 2023 from AI's technological possibilities in general. Some of the items on my list will soon be possible; others will remain fiction for many years. Similarly, our acceptance of these technologies will change. Items on that list that we would never accept today might feel routine in a few years. A judgeless courtroom seems crazy today, but so did a driverless car a few years ago. Don't underestimate our ability to normalize new technologies. My bet is that we're in for a wild ride.


IMAGES

  1. ChatGPT: A Chatbot That Answers Questions & Writes Essays

    ai chatbot writing essays

  2. The new AI chatbot writing essays and more

    ai chatbot writing essays

  3. Writing an Essay with ChatGPT

    ai chatbot writing essays

  4. Meet the new AI chatbot that can write essays

    ai chatbot writing essays

  5. How to Use AI to Write Essays, Projects, Scripts Using ChatGPT OpenAi

    ai chatbot writing essays

  6. Top 5 AI Essay Writer Tools You Need to Effortlessly Create Compelling

    ai chatbot writing essays

VIDEO

  1. Best 5 AI Chatbots You NEED to Try in 2024 (POE, Copilot, ChatGPT, Gemini, DeepAI)

  2. Gemini Advanced

  3. CHAT GPT VS GEMINI AI Uncovered| CHEIZ TECH

  4. Best AI Tool & Chatbot for Research paper Writing |marmof AI

  5. Which AI chatbot writes essays?

  6. How To Write Essays with AI & Mind Maps

COMMENTS

  1. How ChatGPT (and other AI chatbots) can help you write an essay

    1. Use ChatGPT to generate essay ideas. Before you can even get started writing an essay, you need to flesh out the idea. When professors assign essays, they generally give students a prompt that ...

  2. We Used A.I. to Write Essays for Harvard, Yale and Princeton. Here's

    Or they could just write their own — chatbot-free — admissions essays from scratch. Bard I'm a high school student with a strong interest in artificial intelligence and machine learning.

  3. How to Write an Essay with ChatGPT

    For example, you can include the writing level (e.g., high school essay, college essay), perspective (e.g., first person) and the type of essay you intend to write (e.g., argumentative, descriptive, expository, or narrative ). You can also mention any facts or viewpoints you've gathered that should be incorporated into the output.

  4. AI bot ChatGPT writes smart essays

    Now there's a fresh concern: ChatGPT, an artificial intelligence (AI) powered chatbot that creates surprisingly intelligent-sounding text in response to user prompts, including homework ...

  5. Can ChatGPT write a college admission essay? We tested it

    The chatbot produced two essays: one responding to a question from the Common Application, which thousands of colleges use for admissions, and one answering a prompt used solely for applicants to ...

  6. Will ChatGPT Kill the Student Essay?

    The College Essay Is Dead. Nobody is prepared for how AI will transform academia. By Stephen Marche. Paul Spella / The Atlantic; Getty. December 6, 2022. Suppose you are a professor of pedagogy ...

  7. AI bot ChatGPT stuns academics with essay-writing skills and usability

    AI bot ChatGPT stuns academics with essay-writing skills and usability. Professors, programmers and journalists could all be out of a job in just a few years, after the latest chatbot from the ...

  8. Ban or Embrace? Colleges Wrestle With A.I.-Generated Admissions Essays

    The school has posted guidelines for applicants on using A.I. tools for college essays. Kendrick Brinson for The New York Times. The personal essay has long been a staple of the application ...

  9. ChatGPT Wrote My AP English Essay—and I Passed

    Listen. (2 min) ChatGPT, OpenAI's new artificially intelligent chatbot, can write essays on complex topics. WSJ's Joanna Stern went back to high-school AP Literature for a day to see if she ...

  10. ChatGPT-3.5 as writing assistance in students' essays

    ChatGPT-3.5, an AI language model capable of text generation, translation, summarization, and question-answering, has recently been released for public use. Studies have shown it can generate ...

  11. Applying to College? Here's How A.I. Tools Might Hurt, or Help

    During the podcast, two Yale admissions officers discussed how using tools like ChatGPT to write college essays was a form of plagiarism. An applicant who submitted a chatbot-generated essay, they ...

  12. A large-scale comparison of human-written versus ChatGPT-generated essays

    The corpus features essays for 90 topics from Essay Forum 42, an active community for providing writing feedback on different kinds of text and is frequented by high-school students to get ...

  13. How to Use OpenAI to Write Essays: ChatGPT Tips for Students

    3. Ask ChatGPT to write the essay. To get the best essay from ChatGPT, create a prompt that contains the topic, type of essay, and the other details you've gathered. In these examples, we'll show you prompts to get ChatGPT to write an essay based on your topic, length requirements, and a few specific requests:

  14. AI ChatGPT: OpenAI, DALL-E Maker's New Essay-Writing Bot Blowing People

    A new chatbot created by artificial intelligence non-profit OpenAI Inc. has taken the internet by storm, as users speculated on its ability to replace everything from playwrights to college essays.

  15. AI chatbot can write essays

    Imagine if Siri could write you a college essay, or Alexa could spit out a movie review in the style of Shakespeare. OpenAI last week opened up access to ChatGPT, an AI-powered chatbot that ...

  16. Student-developed AI chatbot opens Yale philosopher's ...

    It's not about using the chatbot to write essays. It's about using this technology to deepen your knowledge and sharpen your creativity and critical thinking skills." Generative AI and other innovations are changing how people learn, Floridi said. So there needs to be new approaches to teaching.

  17. EssayGenius

    Write better essays, in less time, with your AI writing assistant. EssayGenius uses cutting-edge AI to help you write your essays like never before. Generate ideas, rephrase sentences, and have your essay structure built for you. EssayGenius lets you write better essays, in less time. Our AI tools help you generate new paragraphs, complete ...

  18. A new tool helps teachers detect if AI wrote an assignment

    Several big school districts such as New York and Los Angeles have blocked access to a new chatbot that uses artificial intelligence to produce essays. One student has a new tool to help.

  19. College student claims app can detect essays written by chatbot ChatGPT

    Tian told subscribers the newer model used the same principles, but with an improved capacity to detect artificial intelligence in text. "Through testing the new model on a dataset of BBC news ...

  20. (PDF) "CHATBOTS IMPACT ON ACADEMIC WRITING"

    Intelligence Generated Content (AIGC), which aims to make the creation process of digital content, including academic writing, music, web log (blog), et al., effective, efficient, and accessible ...

  21. ChatGPT: A GPT-4 Turbo Upgrade and Everything Else to Know

    We humans interact with AI chatbots by writing prompts -- questions or statements that seek an answer from the information stored in the chatbot's underlying large language model. OpenAI ...

  22. Free AI Detector

    Scribbr's AI Detector helps ensure that your essays and papers adhere to your university guidelines. ... Check the authenticity of your students' work. More and more students are using AI tools, like ChatGPT in their writing process. Analyze the content submitted by your students, ensuring that their work is actually written by them.

  23. Students Are Likely Writing Millions of Papers With AI

    Plus, there's nuance to how students use gen AI; some may ask chatbots to write their papers for them in large chunks or in full, while others may use the tools as an aid or a brainstorm partner.

  24. Best AI Chatbot for Essay Writing

    You can use custom AI chatbots to research the topic of your essay and create an outline for your essay. Custom AI chatbots help students with paraphrasing, summarization, text generation and spelling & grammar fixes. ZenoChat by TextCortex is a customizable conversational AI that will meet your various essay writing needs, from research to ...

  25. How to use Meta's new AI chatbot that you can't avoid

    Ethically, you shouldn't use it to write papers for school, though Meta AI is happy to spit out wooden essays on demand. Advertisement Experts warn there is a danger of misinformation from tools ...

  26. Did a Fourth Grader Write This? Or the New Chatbot?

    Don't be surprised if you can't always tell. Neither could a fourth-grade teacher — or Judy Blume. By Claire Cain Miller, Adam Playford, Larry Buchanan and Aaron Krolik Dec. 26, 2022. It's ...

  27. AI's ability to write for us—and our inability to resist 'The Button

    Students are going to use it to start essays. Managers will use it to start emails, reports, or documents. ... The implications of having AI write our first drafts (even if we do the work ...

  28. Ten Ways AI Will Change Democracy

    7. AI as cheap reasoning generator. More generally, AI chatbots are really good at generating persuasive arguments. Today, writing out a persuasive argument takes time and effort, and our systems reflect that. We can easily imagine AIs conducting lobbying campaigns, generating and submitting comments on legislation and rulemaking. This also has ...

  29. How to Use AI to Prepare for Exams

    Use AI Text-Summarisation Tools. Preparing for an exam draws on your capacity to summarise research papers or large chunks of information and extract the most essential points. AI text-summarisation tools can be helpful in this regard. These tools use advanced algorithms to analyse large amounts of text and generate concise summaries that ...