"I'll never look at a prompt the same way again"
Preliminary findings from an empirical study of an AI literacy intervention
Abolitionists. Boosters. Clankers. Doomers. Generative AI has spawned a veritable abecedarium of value-laden metaphors and terminology to convey our orientation to this emerging technology. As an AI realist set adrift, I’ve landed on a metaphor of my own:
Generative AI is just the latest iteration of computing input-output devices.
It’s in this spirit that I approach AI literacy work, not as a dramatic departure from other library research instruction, but as an emerging modality in which researchers seek, retrieve, evaluate, synthesize, and apply information.
Like its forebears—command line searching (I trained on DIALOG in grad school!), search operators, structured Boolean strings, limits, facets, and natural language searching—prompting is the latest in a lineage of querying techniques that researchers can employ to discover and retrieve information.
But at least until “Let’s ask chat”1 achieves the cultural ubiquity of “Google it,” generative AI users are going to need direct instruction on prompting strategies. Prompting is perhaps the key procedural knowledge, or skill, of [generative] AI literacy, complementing declarative knowledge of how large language models are trained and generate output, and dispositional ethics about whether, when, and how to use generative AI in the first place.

Early on, I adopted Leo Lo’s CLEAR Framework for prompt design instruction. I appreciate that it attends to Adaptiveness and Reflection, which encourage generative AI users to evaluate output and iteratively refine their prompting strategies.
However, I got the feeling that students just weren’t getting it—that they needed more direct instruction on what it means to be Concise, Logical, and Explicit in prompt design. The PROMPT Design Framework addresses this need, scaffolding down from the CLEAR Framework by specifying the elements of an effective prompt: Persona, Requirements, Organization, Medium, Purpose, and Tone.
PROMPT quickly gained traction with students and colleagues alike, until the fateful day that one of my liaison faculty asked:
How do you know if it works?
So began our research collaboration and a mixed methods, pre/post study of the PROMPT Design Framework as an AI literacy intervention.
Empirical Evaluation of an AI Literacy Intervention
As co-investigators, Dr. Ada Leung2 and I synthesized research methods from existing empirical studies3 of prompt engineering as an element of AI literacy into a mixed methods pre/post intervention study. Data collection is organized in two stages, comprising a survey instrument and focus groups.
Stage 1: Survey
During Stage 1 of the study, consenting participants recruited from a 300-level undergraduate marketing course completed a demographic questionnaire, conducted a marketing research task using Microsoft Copilot (enterprise version) and uploaded their prompts and output into the survey instrument, and answered AI literacy self-assessment questions. The research task directed participants to
Research consumer demographic information of people living in Berks County, PA for an electric vehicle (EV) that potentially appeals to the people living in the area. Develop an elevator speech (1-minute presentation) to highlight the demographic trends of the area that make EV an attractive/unattractive option for the residents of Berks County.
Intervention: Teaching to the PROMPT
In the intervention phase, students learned the PROMPT Design Framework and its constituent elements:
Persona: Assign the AI a role
Requirements: Specify inclusion and exclusion criteria and other parameters for output
Organization: Describe the structure of output (bulleted or numbered list, data table, alphabetical / chronological / etc.)
Medium: Describe the format or file type of the output
Purpose: Explain the rhetorical purpose and intended audience for the output, and
Tone: Describe the tenor, feeling, or ‘voice’ of the output.
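As an illustrative sketch only—this helper is not part of the study materials, and all names and wording in it are invented—the six PROMPT elements can be treated as slots in a template that assembles a complete prompt:

```python
# Hypothetical sketch: assembling the six PROMPT Design Framework elements
# into a single prompt string. Element names come from the framework;
# the template wording and example values are invented for illustration.

def build_prompt(persona, requirements, organization, medium, purpose, tone):
    """Join the six PROMPT elements into one instruction for a chatbot."""
    return " ".join([
        f"Act as {persona}.",                       # Persona: assign the AI a role
        f"Requirements: {requirements}.",           # inclusion/exclusion criteria
        f"Organize the output as {organization}.",  # structure of the output
        f"Deliver it as {medium}.",                 # format or file type
        f"The purpose is {purpose}.",               # rhetorical purpose and audience
        f"Use a {tone} tone.",                      # tenor, feeling, or 'voice'
    ])

# Example values loosely modeled on the study's EV research task:
example = build_prompt(
    persona="a marketing research analyst",
    requirements="use only recent demographic data for Berks County, PA",
    organization="a bulleted list of key trends",
    medium="a one-minute elevator speech script",
    purpose="to persuade local residents that EVs fit their needs",
    tone="confident and conversational",
)
print(example)
```

The point of the sketch is simply that each framework element contributes one discrete, inspectable clause to the final prompt, which makes it easy to see which elements a given prompt uses and which it omits.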
I further explained that a student perspective inspired PROMPT: Kanika Gupta, an international undergraduate computer science student, wrote for the Connect Happy Valley newsletter about how she was using generative AI to refine her pitches and win pitch contests. She attributed her success to prompt engineering and provided an example of a ‘good’ prompt:
“Imagine you’re a book expert. Quickly tell me what George Orwell’s ‘1984’ is all about and why it’s a big deal in literature.”
This is the inspiration behind Persona in the PROMPT framework.
During the intervention phase of the study, we dissected Kanika Gupta’s ‘good prompt’ to see the PROMPT Design Framework in action. We can see that she assigns the AI a Persona by prompting, “Imagine you’re a book expert.” Next, she establishes some Requirements by prompting, “Quickly tell me what George Orwell’s ‘1984’ is all about.” She also describes a Purpose, to understand “why it’s a big deal in literature.”
In this example, Kanika doesn’t explicitly use the Organization, Medium, or Tone elements of the PROMPT Framework. However, we can imagine ways that Kanika could use these elements to refine her prompt even more. For example, she could prompt the AI to Organize its summary by chapter, theme, character, or some other literary element. She could have asked for the summary in a format like a brief video or a concept map visualization. Finally, I find that many college students have mastered the Tone element of the PROMPT Framework by instructing AI to write “like a college student” or “like a first-year undergraduate.”
In the intervention phase of the study, we also contextualized prompt design within the AI ethics landscape. Specifically, we argued that prompt design can mitigate some AI fabrication and bias by specifying requirements that increase the relevance, quality, and usefulness of output. Additionally, we argued that the process of reflecting on one’s information need for purposes of prompt design also primes one’s mind for critical thinking and evaluating the output. And while end users’ locus of control is small, we know that most of the environmental impact of LLMs is created in their inference phase, when they are actively used to generate output; consequently, we suggest that crafting effective prompts can generate better quality output in fewer attempts, reducing the number of inference calls required.
Post-Intervention: Return to Copilot
After learning the PROMPT framework, students repeated the marketing research task using Copilot, uploaded their post-intervention prompts and output into the survey instrument, and repeated the AI literacy self-assessment. This post-intervention data collection enables us to compare students’ prompting behavior and subjective AI literacy self-assessment before and after learning the PROMPT framework, and to determine whether the PROMPT framework affects students’ prompting effectiveness or AI self-efficacy.
Stage 2: Focus Groups
Following the survey-based research session, a subset of participants was recruited to participate in focus groups. Focus group participants composed written responses to questions in a workbook and engaged in a facilitated discussion. (We were particularly excited to ‘flip the script’ and invite students to grade the PROMPT Framework in a report card format as part of the focus group!)
A total of 47 students consented to participate in the research study, generating a total of 47 valid survey responses and 73 prompt-output artifacts (45 pre-intervention and 28 post-intervention). Additionally, 13 participants engaged in one of two hour-long focus group sessions.
What We’ve Learned (So Far)
While data analysis is still underway, we presented some encouraging preliminary findings at the LOEX Fall Focus 2025 conference. Among these were the results of an initial round of open, axial, and selective coding of the two focus group transcripts. At 8,400 words apiece, the two transcripts amounted to nearly 17,000 words and yielded 763 total open codes and 254 unique codes. Further axial and selective coding surfaced four primary themes and associated subthemes that provide insight into students’ characterizations of generative AI, orientations to generative AI, interests in AI literacy instruction, and concerns about AI.
Characterization of AI
Within the Characterization of AI theme, approximately half of codes describe AI as an Information Concierge – an agent that delivers personalized, point-of-need information services. Participants described generative AI as
“like Google, but more personalized,”
“almost like a person that always knows the answer,” and
“somebody who’s, like, well versed in most topics,”
and disclosed uses ranging from finding the highest-rated local mechanic at the lowest price point to recipe design and cooking advice.
The next most frequent characterization was AI as a Medium – a tool or implement used in their own creative work. 22% of open codes in the Characterization of AI theme describe AI as a Medium, including explanations like
“I need to generate a lot of data sets and a lot of charts,”
“you could use it if you would need help writing an email,” and
“such as creating videos, pictures, management advice.”
Within the AI as a Medium subtheme, students were explicit about AI lacking the capacity to think or create:
“it still doesn’t think and create new ideas.”
16% of Characterization of AI codes describe AI as a Bot, using terminology related to computation and machine learning:
“just a very large computer that processes information and provides output depending on what you put in,” and
“a cumulative database of every bias, thought, research, idea and argument humans have ever put on the internet.”
Approximately 1 in 10 codes characterize AI in terms related to Artificial Intimacy, such as a therapist, virtual friend, or digital lover.
Orientation to AI
Under the Orientation to AI theme, two-thirds of codes refer to Prompting as Strategic Exploration – the use of AI to explore a broad range of personal, academic, and professional interests, with participants often employing trial-and-error or free web resources to teach themselves prompting techniques. Many describe their independent learning strategies for prompting generative AI tools:
“I just played around with it a lot,”
“trial and error, testing out different prompts, testing how specific you should get,” and
“maybe I need a little bit more of this sort of information, so I’ll be more specific and ask it to cater towards that.”
15% of codes described experiencing secondhand AI antipathy, often from professors—“teachers are very much like, ‘I do not want you touching it’”—while 8% expressed firsthand AI antipathy—“It refused to cooperate with me.” A small number of codes exhibited AI apathy or AI realism, or identified participants as AI natives.
AI Literacy
When discussing AI literacy, participants primarily focused on research-related procedural knowledge. About 27% of codes identified prompting as a skill of interest, though this could be an artifact of the study itself. Nearly 5% of codes indicated a preference for instruction on synthesizing and citing AI output, and 4% wanted further discussion of academic integrity and classroom guidelines. A small number of codes observed how AI literacy could influence participants’ dispositions toward AI use.
Small numbers of codes also discussed participants’ preferences for AI literacy pedagogy, especially active learning approaches, and resources they reference when independently learning AI skills.
(An artifact of the study, the PROMPT Framework accounts for 49% of codes in the AI literacy category.)
AI Concerns
In the AI Concerns category, the highest concentration of codes related to personal responsibility and user agency at 19%:
“a facet of AI that we don’t really talk about is our personal responsibility as, like, end users,”
[AI is] “negating what kind of responsibility we’re supposed to be doing as human beings,” and
“it’s not the AI’s fault. It’s your fault. You’re letting it take advantage of you.”
Green AI and the environmental impact of generative AI follows closely at 16%:
“it kind of shocks me looking to see how much energy ChatGPT, or, like, AI uses.”
Cognitive decline and AI dependency account for another 16% of codes:
“because of AI use, people are not using, like, their own minds to figure things out. There’s cognitive deficiency,”
“the loss of critical thinking skills, like, if you don’t practice the muscle, it kind of goes away,”
“why can’t I think about this? Oh, wait, because I said, ‘let’s ask chat.’”
AI fabrication is present in 14% of codes—“the amount of hallucinations that it produced in one sitting was astronomical”—while satisficing and information overload are present in 11% of codes:
“you can kind of get by with minimal effort,”
“it’s definitely like a copy and paste type deal,” and
“I think that creates insane laziness.”
Job displacement is present in 8% of codes, but participants also discussed ‘student displacement’: the idea that improper or excessive use of AI for academic purposes is interfering with their learning experience and their relationships with peers and instructors, and making college a poor investment of time and money:
“For homework, I’m just gonna toss an online generator because I don’t have time to deal with this.”
“You’re paying to be here if you don’t want to take advantage of that, like, that’s fine.”
“We’re paying thousands of dollars for college, and what are we doing? We’re wasting our time.”
Finally, social harms of AI account for almost 7% of codes in the AI Concerns category.
The waiting is the hardest part
We handed off participants’ anonymized prompt-output artifacts to independent raters, who will score prompt efficacy and efficiency using a rubric adapted from Lo’s CLEAR Framework. In the coming weeks, my collaborator will aggregate the research team’s analysis of prompting quality, based on rubric scores and a simple count of PROMPT elements evident in pre- and post-intervention prompts, and calculate interrater reliability metrics. Together, these analyses will test the hypothesis that learning the PROMPT Design Framework can help people be more effective and efficient generative AI users.
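The piece doesn’t specify which interrater reliability metric the team will use; as a sketch under that assumption, Cohen’s kappa is one common choice for two raters assigning categorical rubric scores to the same artifacts. The rubric levels below are hypothetical:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters who
    assigned a categorical score (e.g., a rubric level) to the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: proportion of items where the two raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b.get(c, 0) / n) for c in freq_a)
    if p_e == 1:  # both raters constant and identical: perfect agreement
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical rubric scores (1 = weak prompt, 3 = strong prompt) from two raters:
kappa = cohens_kappa([3, 2, 3, 1, 2, 3], [3, 2, 2, 1, 2, 3])  # 17/23 ≈ 0.74
```

Kappa discounts the agreement two raters would reach by guessing from their own score distributions, which is why it is preferred over raw percent agreement for rubric scoring of this kind.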
When it comes to answering Ada’s question—“How do you know if it works?”—we’ll follow the evidence to the extent that it supports claims of statistically significant improvements in participants’ prompting behavior and AI self-efficacy.
But as teachers, sometimes we get by on believing that the seeds of inquiry we sow today might take more time to blossom into observable information behaviors than a single research session will allow. Sometimes it’s the human significance, and not the statistical significance, that matters. So listening back to the focus group recordings, I heard all I needed to hear when a student said:
“I think I’ll never look at a prompt again the same way. Now, when I type in, I’m going to think about it and think, ‘Okay, what am I telling this to explain? How am I, how is it going to explain this?’”
On this point, PROMPT works.
Acknowledgements
Undergraduate research assistants: Janet Ruiz and Jeilyn Tineo.
Focus group facilitator: Alexandria Chisholm.
Sponsor: This project was supported in part by a Penn State University Libraries Faculty Organization Research Grant.
Research ethics: This study was pre-reviewed and deemed exempt by Penn State University Human Research Protection Program (Institutional Review Board STUDY00027385).
AI use: Otter.ai was used to transcribe focus group recordings. AI was not used for coding, qualitative analysis, or writing. (That’s my craft, which I relish as any artisan does. Why would I relinquish it to the machine?)
To promote viewpoint diversity, Heterodoxy in the Stacks invites constructive dissent and disagreement in the form of comments and guest posts. While articles published on Heterodoxy in the Stacks are not peer-reviewed, all posts and comments must model the HxA Way. Content is attributed to the individual contributor(s).
To submit an article for Heterodoxy in the Stacks, submit the Heterodoxy in the Stacks Guest Submission form in the format of a Microsoft Word document, PDF, or a Google Doc. Unless otherwise requested, posts will include the author’s name and the commenting feature will be on. We understand that sharing diverse viewpoints can be risky, both professionally and personally, so anonymous and pseudonymous posts are allowed.
Please see “About” to see what kinds of subject matter are appropriate for submissions.
Thank you for joining the conversation!
Participants used this phrase in a focus group to describe prompting ChatGPT to answer a casual, everyday life question.
Associate professor of marketing, Penn State Berks.
Gastineau, J. (2024). Exploring the impact of generative AI prompt engineering in higher education: A study in undergraduate and graduate business analytics courses [Doctoral dissertation, University of Arkansas]. ScholarWorks@UARK. https://scholarworks.uark.edu/etd/5551/
Guha, A., Grewal, D., & Atlas, S. (2024). Generative AI and marketing education: What the future holds. Journal of Marketing Education, 46(1), 6-17. https://doi.org/10.1177/02734753231215436
Knoth, N., Tolzin, A., Janson, A., & Leimeister, J. M. (2024). AI literacy and its implications for prompt engineering strategies. Computers and Education: Artificial Intelligence, 6, article no. 100225. https://doi.org/10.1016/j.caeai.2024.100225
Rossi, V. (2024). Inquiring the ‘oracle’. An empirical study on how artificial intelligence literacy and prompt engineering influence the use of LLMs and GAI in higher education [Master’s thesis, Università Ca’Foscari Venezia]. UNITesi. https://unitesi.unive.it/handle/20.500.14247/23630
Wang, Y. Y., & Chuang, Y. W. (2024). Artificial intelligence self-efficacy: Scale development and validation. Education and Information Technologies, 29, 4785–4808. https://doi.org/10.1007/s10639-023-12015-w
Woo, D. J., Wang, D., Yung, T., & Guo, K. (2024). Effects of a prompt engineering intervention on undergraduate students’ AI self-efficacy, AI knowledge and prompt engineering ability: A mixed methods study. https://doi.org/10.48550/arXiv.2408.07302

Very interesting, Sarah. In thinking of prompt engineering, I'm reminded of "question formulation" as we used to describe it, in formulating research questions and mapping them into the information resources. Not exactly the same, but maybe similar.
For example, PICO in evidence-based medical research:
https://www.nlm.nih.gov/oet/ed/pubmed/pubmed_in_ebp/02-100.html
(Patient, Intervention, Comparison, Outcome)
I'm even thinking of "research question analysis" which came from engineering and which was adapted in library instruction for a while in the 1980s, before "search strategy formulation" (that is, key terms + Boolean operators) took over.
Question analysis is discussed in "Learning the Library" (Sharon Hogan)--
Scope of problem/question
Formats needed
Geographic scope
Time frame associated with topic
Depth of information needed,
etc.
Teaching a thought process on the front end of searching and encouraging reflection points throughout is maybe a throughline.