November 6, 2025

Research found that AI agents can’t complete 97% of tasks on Upwork to even a basic standard

Scale AI and the Center of AI research found that AI agents can’t complete 97% of tasks on Upwork to even a basic standard. The study used six different AI models to tackle 240 Upwork projects across various categories, including writing, design, and data analysis, and evaluated their findings against those of real ￰1￱ survey found that the best AI model, Manus, only managed to complete 2.5% of the tasks successfully and earned about $1,810 out of $143,991 on ￰2￱ AI models, such as Claude Sonnet and Grok 4, completed 2.1% of the ￰3￱ believe AI won’t replace jobs any time soon

The researchers discovered that AI agents struggle with multi-step workflows, taking initiative, or using ￰4￱ also agreed that AI will not replace jobs for a while.

According to research by the European Broadcasting Union and BBC, AI models, including ChatGPT, Copilot, and Perplexity, are not good at reporting news. The research found that AI models fail to meet key criteria, such as sourcing, accuracy, generating text, and distinguishing between opinion and fact. The models had at least one significant issue in 45% of answers, while only 31% of AI answers were scored correctly. Of the AI answers, 20% were wrong, containing outdated information and hallucinated ￰8￱ of all models, Gemini recorded 76% of significant issues in its ￰9￱ revealed research that found AI-generated cover letters have jeopardized efforts in applications, resulting in employers hiring fewer people or even the wrong ￰10￱ firm also revealed that skilled workers in the top quintile for abilities are being hired 19% less often than they were before, while those in the lowest quintile are being hired 14% more often.

The study corroborates an MIT research report from August, which concluded that 95% of organizations have garnered zero return from their collective $30 billion investment in ￰12￱ to WorldTest from MIT and Basis Research, AI agents can match patterns and predict words, but struggle to build internal models of the ￰13￱ study involved 129 tasks across 43 interactive worlds, which required the AIs to predict hidden aspects of the world, plan sequences of actions to achieve a goal, and determine when the rules of the environment change. The researchers also tested 517 humans on the same tasks and found that humans achieve near-optimal scores while AI models frequently ￰15￱ researchers argued that humans perform better on tasks because they intuitively understand their environments, adjust their perspectives, run experiments, start from scratch, and explore ￰16￱ to the study, adding more compute to existing models also doesn’t work; it only helps 25 out of 43 ￰17￱ and AI Czar warns of AI-driven censorship on social media

MIT Sloan researchers and Safe Security found that AI drives 80% of ransomware attacks. According to a study of 2,800 ransomware attacks by the Cybersecurity Arms Race, adversarial AI was found to be automating entire attack sequences, including creating malware, phishing campaigns, and deepfake phone calls for social engineering ￰19￱ Kevin Beaumont disagrees with the research, claiming that generative AI isn’t a major part of any of ￰20￱ Marcus Hutchins also called the paper absurd, adding that he burst out laughing.

“The paper is almost complete nonsense; it’s jaw-droppingly bad. It’s so bad it’s difficult to know where to start.” – Kevin Beaumont, Security Researcher at ￰21￱ and AI Czar David Sacks also stated that he’s concerned that the censorship on social media and search engines seen in recent years will become thoroughly dystopian with the advent of generative AI. He argued that the term “woke AI” is insufficient to explain what’s going on because it somehow trivializes the ￰23￱ pointed to Orwellian AI, which he claims distorts the answers, lies, and rewrites history in real time to serve the current political agenda of those in power.

Story Tags

#TECH #Gemini #mit #scale ai #Market #trading #research

Cryptopolitan

Latest news and analysis from Cryptopolitan
