GenAI and Integrity: The Arguments to Reshape the Thesis Assessment Structure

It is well documented that writing-based assessments are increasingly vulnerable to academic integrity issues, intensified by both contract cheating and the rise of Generative AI (GenAI) tools. Initially, research seemed shielded from GenAI-based risks, but recent findings from my team indicate that even research-based projects are now at significant risk. One area of concern in engineering education is the final-year capstone project or thesis. At many institutions, a thesis places substantial assessment weighting on the final report, reflecting the student's ability to document and explain the technical background and the adventure they embarked on. Given the technical capabilities students are expected to demonstrate, many have questioned whether this heavy focus on assessing writing still serves its purpose. I have personally witnessed the growing use of GenAI in students' writing of such projects; it is easiest to spot when students do not understand what was written, or the limitations of the GenAI that wrote it.


At the University of Wollongong (UOW), I have been involved in discussions on developing a new assessment structure for the engineering thesis to address these risks. While we have yet to finalise or approve any changes, we are at a stage where broader dialogue is crucial. Sharing our initial thoughts will help us gather feedback and offer insights to others facing similar challenges. After all, we are all navigating this evolving landscape together, and the AAIEEC promotes collaboration across the academic community.


Key Questions for Reimagining Assessment

Some of the central questions we have been wrestling with include:

  • Is the current emphasis on large written reports still appropriate in the GenAI era? Students still need to know how to conduct research, write coherently, and demonstrate ethical practice. But are traditional written assessments still the best way to evaluate these skills?

  • What core learning outcomes are we assessing through written components of thesis projects, and do they need reevaluation?

  • Should we permit the use of AI in thesis projects to reflect future-facing professional practices?

  • Should there be a stronger focus on the demonstration aspects of thesis projects? Does demonstration improve the validity of assuring technical capability?

  • What role should supervisors play in the overall student assessment?


Proposed Modifications to the Assessment Structure

After much debate, we have identified several potential modifications to improve integrity and align assessments with future professional demands.


1. Embracing AI as a Tool for Learning

The idea of allowing students to use AI freely was one of the most controversial topics. On one hand, we want to help students develop GenAI skills ethically rather than risk them seeking out questionable online resources. Currently, there is no reliable AI detection tool, so instead of policing its use, why not incorporate it into the learning process? To illustrate this, I wrote this blog post at speed, careless about the words and phrases I used. Grammarly went into stress-out mode, covering my page with so much underlining that the writing was almost unreadable. I then ran it through ChatGPT-4 for editorial guidance, which still required proofreading and evaluative judgment. Many edits to the ChatGPT output were needed, but it sped up the process, turning about four hours of careful deliberation into a 30-minute task. An example of working more productively?

If an AI detection tool were used, it would likely flag a high match, but is this process truly unethical if my original thoughts and evaluative judgment were the foundation? All the AI did was tidy up and reshape my thoughts and words. I still needed to reread it and make edits, and many edits I did make, including bringing back informal words and structures that give some feeling that this blog was written by a human. That is one thing I hate about Grammarly: it is so aggressive about its rules that it takes all personality out of writing.


In professional contexts, engineers are expected to leverage AI to increase productivity, producing reports, proposals, and designs more efficiently. Employers will value employees who can use AI effectively and add value beyond what AI generates.


Counterarguments: However, a concern arises: what if students use AI as a shortcut, undermining learning? As mentioned, I have already seen this happen first-hand. While AI can enhance productivity for those who know what they want to say, it can also be detrimental to students who are still developing critical thinking and writing skills. Balancing these risks is essential. At this stage, most faculty members are comfortable permitting AI for editing, with the caveat that we must maintain integrity in key learning outcomes. How can we ensure AI is used as a tool for growth rather than as a shortcut in the spirit of "Ps get degrees"? The only way is to embed a number of integrity flags into the assessment design.


2. Increased Weight for Supervisor Evaluation

In a GenAI world, it is vital to focus on skills that AI cannot replicate. As a laboratory guru, I really love emphasising psychomotor and affective skills. To encourage a more holistic set of skills, we have proposed increasing the weight of supervisor assessments. Supervisors would assign marks based on the student's ability to meet regular milestones, demonstrate understanding during meetings, and incorporate feedback throughout the project.


This system creates safeguards against last-minute work and allows supervisors to assess not only the written report but also a diverse range of student competencies. It also gets students into the habit of working as a professional would. A large discrepancy between a student's written report and the ability they demonstrate through regular progress would signal a potential integrity issue.


Counterarguments: Some may argue that relying heavily on supervisors could introduce bias or subjectivity into the grading process. However, we believe regular touchpoints will allow for more comprehensive feedback than students currently receive and a more accurate reflection of their abilities, ensuring that the written work aligns with the student's progress and understanding. Of note, I already supervise using this method; the proposal would simply make it the standard and increase its weighting.


3. Greater Focus on Presentations and Demonstrations

While AI can help students convert their written thesis into a beautiful PowerPoint presentation with speaker notes at the click of a button, students still need to prove that they can present that information to their peers. Presentations and demonstrations are inherently human skills, and AI tools like ChatGPT cannot replace a student's ability to present and explain their work. By increasing the weighting of these components, we can better assess communication and other competencies across the cognitive, psychomotor, and affective domains. Through demonstration, we get a clearer picture of a student's technical capability. As engineers, we want students to build, code and simulate, and demonstration allows us to better ensure that those capabilities are present. This shift also provides an additional layer of integrity checking, as it is far more difficult for a student to fake understanding in a live demonstration than in a written report.


We propose that presentations be marked by non-supervisors to provide an unbiased assessment and offer a point of comparison with the written report and supervisor evaluation. This creates multiple checks and balances. If one of the three pillars shows an imbalance, quality assurance processes can be put in place for confirmation.
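To make these checks and balances concrete, here is a minimal sketch in Python of how a discrepancy flag across the three pillars might work. The component names, marks, and the 20-point tolerance are hypothetical illustrations of mine, not UOW policy, and any flag would prompt human quality assurance review rather than an automatic decision.

    from itertools import combinations

    def flag_for_review(marks, tolerance=20.0):
        """Return the pairs of components whose marks (out of 100) diverge beyond the tolerance."""
        flags = []
        for (name_a, mark_a), (name_b, mark_b) in combinations(marks.items(), 2):
            if abs(mark_a - mark_b) > tolerance:
                flags.append(f"{name_a} vs {name_b}: {abs(mark_a - mark_b):.0f}-point gap")
        return flags

    # Hypothetical student: a polished report that far outpaces live performance.
    marks = {"supervisor": 55, "presentation": 58, "report": 92}
    for flag in flag_for_review(marks):
        print("Quality assurance review:", flag)

On these hypothetical marks, the supervisor and presentation pillars agree while the report sits far above both, so two flags are raised; exactly the imbalance that would prompt a closer look at the written work.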


Counterarguments: A potential concern is the increased pressure on students who may not excel in oral presentations. However, these skills are critical in the engineering profession. While students may expect their work to focus on the technical, most of an engineer's time is generally spent communicating. Therefore, it is essential to prepare students for real-world expectations.


4. Reduced Emphasis on the Written Report

As we shift weight towards presentations, demonstrations, and supervisor evaluations, the focus on the written report will naturally decrease. However, writing remains a critical skill, especially in engineering, where clearly communicating technical work is essential. With the use of AI tools permitted, we propose raising the standard for written work: students will need to engage deeply with the technical content and exercise strong evaluative judgment in their reports.


Reducing the emphasis on writing may help reduce the risks of academic integrity breaches. With less weight on a single, easily manipulated component, students will be encouraged to focus on future-facing skills that are more aligned with modern engineering practice.


Conclusion: A Holistic Approach to Capstone Assessment

The proposed changes aim to create a more balanced assessment structure that reflects the competencies engineers need in the workforce while safeguarding academic integrity. By distributing assessment weight across multiple components (supervisor observations, presentations, demonstrations, and written work), we can more accurately evaluate a student's capabilities. Even if GenAI or contract cheating were involved, students would still need a deep understanding of their work to navigate the safeguards in place. This design shifts the focus from policing cheating to assuring the validity of the learning outcomes, making the thesis an activity that showcases a student's holistic engineering capability.


An overview of the proposed structure is provided below.

[Figure: overview of the proposed thesis assessment structure]


This conversation is far from over, and we encourage feedback on these ideas. What strengths or weaknesses do you see in this proposed approach? Join the discussion below or on LinkedIn.


Sasha Nikolic

President, AAIEEC

University of Wollongong

