Dubai Telegraph - AI systems are already deceiving us -- and that's a problem, experts warn

AI systems are already deceiving us -- and that's a problem, experts warn / Photo: OLIVIER MORIN - AFP/File

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argues in the journal Patterns on Friday.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

In the near term, the paper's authors see risks of AI committing fraud or tampering with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether an interaction is with a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by comparing the systems' internal "thought processes" with their external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

F.A.Dsouza--DT