Dubai Telegraph - AI systems are already deceiving us -- and that's a problem, experts warn

Photo: OLIVIER MORIN - AFP/File

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argue in the journal Patterns on Friday.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

In the near term, the paper's authors see a risk of AI committing fraud or tampering with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether users are interacting with a human or an AI, digital watermarks for AI-generated content, and techniques for detecting AI deception by checking a system's internal "thought processes" against its external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

F.A.Dsouza--DT