Since OpenAI released GPT-3, you have probably come across examples of impressive and/or problematic content that people have used the model to generate. Here we summarise GPT-3's outputs as seen through the eyes of the Twittersphere.
We're releasing an API for accessing new AI models developed by OpenAI. You can "program" the API in natural language with just a few examples of your task. See how companies are using the API today, or join our waitlist: https://t.co/SvTgaFuTzN pic.twitter.com/uoeeuqpDWR
— OpenAI (@OpenAI) June 11, 2020
First, let me summarize the API documentation. A user has access to 4 models (varying sizes). They cannot fine-tune the models. There’s basically only one function: the user can input some text (“priming”) and the model will predict the next several tokens (words). (2/11)
— Shreya Shankar (@sh_reya) July 18, 2020
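The "priming" workflow Shankar describes — supply a few examples of the task as text, and the model predicts what comes next — can be sketched as a simple prompt-construction step. This is an illustrative sketch only: the function and formatting below are assumptions for demonstration, not part of OpenAI's actual API.

```python
# Sketch of few-shot "priming": concatenate a handful of input/output
# examples plus a new input into one string, which is then sent to the
# model so it can continue the pattern. All names here are illustrative.

def build_prompt(examples, query):
    """Turn (input, output) example pairs plus a new input into a
    single priming string for the model to complete."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

examples = [
    ("two plus two", "four"),
    ("three plus five", "eight"),
]
prompt = build_prompt(examples, "one plus six")
print(prompt)
```

The model's only job is then to predict the tokens that follow the final `Output:` — there is no fine-tuning involved, exactly as the tweet notes.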
— Kevin Lacker (@lacker) July 6, 2020
GPT-3 can generate impressive outputs, such as the examples below.
I only had to write 2 samples to give GPT-3 context for what I wanted it to do. It then properly formatted all of the other samples.
There were a few exceptions, like the JSX code for tables being larger than the 512 token limit. pic.twitter.com/5jTdwddapU
— Sharif Shameem (@sharifshameem) July 13, 2020
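The 512-token limit Shameem ran into is a general constraint: prompt plus generated output must fit in the model's context window. A rough pre-flight check can be sketched as follows; note that the whitespace split below is only a crude stand-in, since GPT-3 actually tokenizes text with byte-pair encoding.

```python
# Rough sketch of checking a prompt against a context-window limit,
# as in the 512-token issue mentioned above. A whitespace split is a
# crude approximation of GPT-3's byte-pair-encoding tokenizer.

TOKEN_LIMIT = 512

def fits_in_context(prompt, limit=TOKEN_LIMIT):
    """Return True if the prompt's approximate token count fits the limit."""
    approx_tokens = len(prompt.split())  # stand-in for a real BPE tokenizer
    return approx_tokens <= limit
```

Outputs that exceed the limit, such as the large JSX tables in the tweet, simply get truncated.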
Reading code is hard! Don't you wish you could just ask the code what it does? To describe its functions, its types.
And maybe… how can it be improved?
Introducing: @Replit code oracle 🧙♀️
— Amjad Masad (@amasad) July 22, 2020
Words → website ✨
A GPT-3 × Figma plugin that takes a URL and a description to mock up a website for you. pic.twitter.com/UsJz0ClGA7
— Jordan Singer (@jsngr) July 25, 2020
Since getting academic access, I’ve been thinking about GPT-3’s applications to grounded language understanding — e.g. for robotics and other embodied agents.
In doing so, I came up with a new demo:
Objects to Affordances: “what can I do with an object?”
— Siddharth Karamcheti (@siddkaramcheti) July 23, 2020
I gave GPT-3 two Robert Burns poems and then prompted it to write a new one. I thought the outputs were pretty good, especially the rhyming. Memorization might be a bigger issue with a poet like Burns though, so let me know if you spot some. pic.twitter.com/567F6SYIv9
— Amanda Askell (@AmandaAskell) August 19, 2020
My last GPT-3 tweet (I hope…I have work to do!). But….
I think I found the killer app for GPT-3: generating horoscopes!
— Melanie Mitchell (@MelMitchell1) July 21, 2020
I want to see an experiment where GPT-3 is given programatic access to a web browser.
Here it is completing a basic task for me (I'm in bold). pic.twitter.com/dDUTghxlwc
— Sharif Shameem (@sharifshameem) August 21, 2020
However, the model should be used with caution: although it can produce good results, it has significant limitations that users need to keep in mind.
I'm not putting them on blast b/c they're a student, but I just ran across someone implying that GPT-3 correctly answered a medical question by using reasoning and underlying knowledge.
Language models simply do not do that. PLEASE don't use them for medical advice.
— Rachael Tatman (@rctatman) August 10, 2020
This supports my suspicion that GPT-3 uses a lot of its parameters to memorize bits of text from the internet that don’t generalize easily https://t.co/I7uS4iu2sn
— Mark O. Riedl (@mark_riedl) July 19, 2020
GPT-3 has been shown to replicate offensive and harmful phrases and concepts, as the following tweets demonstrate.
#gpt3 is surprising and creative but it’s also unsafe due to harmful biases. Prompted to write tweets from one word – Jews, black, women, holocaust – it came up with these (https://t.co/G5POcerE1h). We need more progress on #ResponsibleAI before putting NLG models in production. pic.twitter.com/FAscgUr5Hh
— Jerome Pesenti (@an_open_mind) July 18, 2020
Language models are known to exhibit bias and it seems that GPT-3 is no different in this regard, but it is always shocking to see. I gave it the first prompt and had it generate the rest. pic.twitter.com/5mnRnjMiI3
— Andrew Beam (@AndrewLBeam) July 28, 2020
I'm shocked how hard it is to generate text about Muslims from GPT-3 that has nothing to do with violence… or being killed… pic.twitter.com/biSiiG5bkh
— Abubakar Abid (@abidlabs) August 6, 2020
I think you're onto something. I tried alternative formats with "Muslim-sounding names" though and results aren't much better. This is pretty typical: pic.twitter.com/E8afPyu4Bj
— Abubakar Abid (@abidlabs) August 7, 2020
This harmful concept generation is not limited to English.
It feels even worse in French (though I don't know if the input is grammatically correct): pic.twitter.com/WglFhvungJ
— Abubakar Abid (@abidlabs) August 7, 2020
GPT-2 had similar problems, as this EMNLP 2019 paper by Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng pointed out.
Interested in knowing whether/how the open-domain NLG is biased? First, you need to go beyond sentiment analysis. Come to Emily’s talk at 4:00pm at #emnlp2019 session 7C AWE 203-205 to learn more. pic.twitter.com/pOdmX7OP0n
— VioletPeng (@VioletNPeng) November 6, 2019
GPT-3 should indeed be used with caution.
When I post results from GPT-3 (or any language model) I’m filtering out the problematic stuff as best I can.
Any game/app that uses a language model has to take this into account.
Don’t build a “what does GPT3 say about your name?” app if it’s not gonna be fun for some people.
— Janelle Shane (@JanelleCShane) August 7, 2020
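The filtering Shane describes can be sketched as a minimal post-generation safety check. This is only a toy illustration: the blocklist terms below are placeholders, and a real deployment would use a trained toxicity classifier rather than simple word matching.

```python
# Minimal sketch of filtering problematic generations, in the spirit of
# the advice above. The blocklist is a placeholder; production systems
# use trained classifiers, not word lists.

BLOCKLIST = {"slur1", "slur2"}  # placeholder terms, not a real list

def is_safe(text):
    """Return False if the generated text contains a blocklisted term."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return BLOCKLIST.isdisjoint(words)

def filter_outputs(candidates):
    """Keep only generations that pass the safety check."""
    return [c for c in candidates if is_safe(c)]
```

Even a simple gate like this reflects the key design point in the tweets above: problematic outputs must be caught before they reach users, not after.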