3 days ago
LW - GPT-4 Plugs In by Zvi
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: GPT-4 Plugs In, published by Zvi on March 27, 2023 on LessWrong.
GPT-4 Right Out of the Box
In some ways, the best and worst thing about GPT-4 was its cutoff date of September 2021. After that date it had no access to new information, and it had no ability to interact with the world beyond its chat window.
As a practical matter, that meant that a wide range of use cases didn’t work. GPT-4 would lack the proper context. In terms of much of mundane utility, this would often destroy most or all of the value proposition. For many practical purposes, at least for now, I instead use Perplexity or Bard.
That’s all going to change. GPT-4 is going to get a huge upgrade soon in the form of plug-ins(announcement).
That means that GPT-4 is going to be able to browse the internet directly, and also use a variety of websites, with the first plug-ins coming from Expedia, FiscalNote, Instacart, KAYAK, Klarna, Milo, OpenTable, Shopify, Slack, Speak, Zapier and Wolfram. Wolfram means Can Do Math. There’s also a Python code interpreter.
Already that’s some exciting stuff. Zapier gives it access to your data to put your stuff into context.
Also there’s over 80 secret plug-ins already, that can be revealed by removing a specific parameter from an API call. And you can use them, there are only client-side checks stopping you. Sounds secure.
We continue to build everything related to AI in Python, almost as if we want to die, get our data stolen and generally not notice that the code is bugged and full of errors. Also there’s that other little issue that happened recently. Might want to proceed with caution.
(I happen to think they’re also downplaying the other risks even more, but hey.)
Perhaps this wasn’t thought out as carefully as we might like. Which raises a question.
So, About That Commitment To Safety and Moving Slowly And All That
That’s all obviously super cool and super useful. Very exciting. What about safety?
Well, we start off by having everyone to write their instructions in plain English and let the AIs figure it out, because in addition to being super fast, that’s the way to write secure code that does what everyone wants that is properly unit tested and fails gracefully and doesn’t lead to any doom loops whatsoever.
(Gross’ link here.)
Also one of the apps in the initial batch is Zapier, which essentially hooks GPT up to all your private information and accounts and lets it do whatever it wants. Sounds safe.
No, no, that’s not fair, they are concerned about safety, look, concern, right here.
At the same time, there’s a risk that plugins could increase safety challenges by taking harmful or unintended actions, increasing the capabilities of bad actors who would defraud, mislead, or abuse others.
I mean, yes, you don’t say, having the ability to access the internet directly and interface with APIs does seem like the least safe possible option. I mean, I get why you’d do that, there’s tons of value here, but let’s not kid ourselves. So, what’s the deal?
We’ve performed red-teaming exercises, both internally and with external collaborators, that have revealed a number of possible concerning scenarios. For example, our red teamers discovered ways for plugins—if released without safeguards—to perform sophisticated prompt injection, send fraudulent and spam emails, bypass safety restrictions, or misuse information sent to the plugin.
We’re using these findings to inform safety-by-design mitigations that restrict risky plugin behaviors and improve transparency of how and when they’re operating as part of the user experience.
This does not sound like ‘the red teams reported no problems,’ it sounds like ‘the red teams found tons of problems, while checking for the wrong kind of problem, and we’re trying to mitigate as best we can.’
Better than nothing. Not super comforting. What ha...