GPT detectors can be biased against non-native English writers
Date:
July 10, 2023
Source:
Cell Press
Summary:
Researchers show that computer programs commonly used to determine
if a text was written by artificial intelligence tend to falsely
label articles written by non-native English speakers as
AI-generated. The researchers caution against using such AI text
detectors because they are unreliable, which could have negative
impacts on individuals including students and those applying
for jobs.
==========================================================================
FULL STORY
==========================================================================
In a peer-reviewed opinion paper published July 10 in the journal
Patterns, researchers show that computer programs commonly used to
determine if a text was written by artificial intelligence tend to
falsely label articles written by non-native English speakers as
AI-generated. The researchers caution against using such AI text
detectors because they are unreliable, which could have negative
impacts on individuals including students and those applying for jobs.
"Our current recommendation is that we should be extremely careful about
and maybe try to avoid using these detectors as much as possible,"
says senior author James Zou, of Stanford University. "It can have
significant consequences if these detectors are used to review things like
job applications, college entrance essays or high school assignments."
AI tools like OpenAI's ChatGPT chatbot can compose essays, solve science
and math problems, and produce computer code. Educators across the
U.S. are increasingly concerned about the use of AI in students' work,
and many of them have started using GPT detectors to screen students'
assignments. These detectors are platforms that claim to be able to
identify whether a text was generated by AI, but their reliability and
effectiveness remain untested.
Zou and his team put seven popular GPT detectors to the test. They ran
91 English essays, written by non-native English speakers for the Test
of English as a Foreign Language (TOEFL), a widely recognized English
proficiency test, through the detectors. These platforms incorrectly
labeled more than half of the essays as AI-generated, with one detector
flagging nearly 98% of these essays as written by AI. In comparison,
the detectors were able to correctly classify more than 90% of essays
written by eighth-grade students from the U.S. as human-generated.
Zou explains that the algorithms of these detectors work by evaluating
text perplexity, which is how surprising the word choice is in an
essay. "If you use common English words, the detectors will give
a low perplexity score, meaning my essay is likely to be flagged as
AI-generated. If you use complex and fancier words, then it's more
likely to be classified as human written by the algorithms," he says. This
is because large language models like ChatGPT are trained to generate
text with low perplexity to better simulate how an average human talks,
Zou adds.
As a result, simpler word choices adopted by non-native English writers
would make them more vulnerable to being tagged as using AI.
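
The perplexity idea described above can be illustrated with a small
sketch. This is not the method used by any of the detectors in the
study, which are not described in detail here. It is a toy unigram model
with add-one smoothing, and the reference corpus, the sample sentences,
the function names and the threshold of 10.0 are all hypothetical,
chosen only to show how common wording yields lower perplexity than
rarer wording.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Build an add-one-smoothed unigram probability function (toy model)."""
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens)
    vocab_size = len(counts) + 1  # one extra slot reserved for unseen words

    def prob(token):
        # Counter returns 0 for unseen tokens, so they get probability 1/(total + vocab_size).
        return (counts[token] + 1) / (total + vocab_size)

    return prob


def perplexity(tokens, prob):
    """Perplexity = exp of the average negative log-probability per token."""
    if not tokens:
        raise ValueError("need at least one token")
    avg_nll = -sum(math.log(prob(t)) for t in tokens) / len(tokens)
    return math.exp(avg_nll)


def flag_as_ai(ppl, threshold):
    """Hypothetical decision rule: low perplexity is treated as 'AI-like'."""
    return ppl < threshold


# Hypothetical reference corpus and sample texts (not data from the study).
corpus = "the cat sat on the mat and the dog sat on the rug".split()
prob = train_unigram(corpus)

common = "the cat sat on the mat".split()               # common word choices
rare = "the feline reposed upon the tapestry".split()   # rarer word choices

ppl_common = perplexity(common, prob)
ppl_rare = perplexity(rare, prob)

print(f"common wording: perplexity {ppl_common:.2f}, flagged: {flag_as_ai(ppl_common, 10.0)}")
print(f"rarer wording:  perplexity {ppl_rare:.2f}, flagged: {flag_as_ai(ppl_rare, 10.0)}")
```

Under these assumptions the sentence built from common words scores a
lower perplexity and falls below the threshold, while the sentence with
rarer vocabulary does not. A rule of this kind judges word choice rather
than authorship, which is consistent with the bias Zou describes.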
The team then put the human-written TOEFL essays into ChatGPT and
prompted it to edit the text using more sophisticated language,
including replacing simple words with more complex vocabulary. The GPT
detectors tagged these AI-edited essays as human-written.
"We should be very cautious about using any of these detectors
in classroom settings, because there's still a lot of biases, and
they're easy to fool with just the minimum amount of prompt design,"
Zou says. Using GPT detectors could also have implications beyond the
education sector. For example, search engines like Google devalue
AI-generated content, which may inadvertently silence non-native
English writers.
While AI tools can have positive impacts on student learning, GPT
detectors should be further enhanced and evaluated before being put into
use. Zou says that training these algorithms with more diverse types of
writing could be one way to improve these detectors.
Story Source: Materials provided by Cell Press. Note: Content may be
edited for style and length.
==========================================================================
Journal Reference:
1. Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou. GPT
   detectors are biased against non-native English writers. Patterns,
   2023; 100779. DOI: 10.1016/j.patter.2023.100779
==========================================================================
Link to news story:
https://www.sciencedaily.com/releases/2023/07/230710113921.htm