AI systems are already deceiving us -- and that's a problem, experts warn

Dubai Telegraph - AI systems are already deceiving us -- and that's a problem, experts warn


AI systems are already deceiving us -- and that's a problem, experts warn / Photo: OLIVIER MORIN - AFP/File


TECHNOLOGY 10.05.2024

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.


Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argues in a paper published in the journal Patterns on Friday.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

In the near term, the paper's authors see a risk of AI committing fraud or tampering with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether an interaction is with a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by checking the systems' internal "thought processes" against their external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

F.A.Dsouza--DT
