Free Republic
Browse · Search
General/Chat
Topics · Post Article

To: sonova

I looked up his blog, and he wrote an interesting, if long, post on how they could watermark text. The gist is that the sequence of word choices could be steered so that, probabilistically, it would only have come from ChatGPT.

This is his explanation:

How does it work? For GPT, every input and output is a string of tokens, which could be words but also punctuation marks, parts of words, or more—there are about 100,000 tokens in total. At its core, GPT is constantly generating a probability distribution over the next token to generate, conditional on the string of previous tokens. After the neural net generates the distribution, the OpenAI server then actually samples a token according to that distribution—or some modified version of the distribution, depending on a parameter called “temperature.” As long as the temperature is nonzero, though, there will usually be some randomness in the choice of the next token: you could run over and over with the same prompt, and get a different completion (i.e., string of output tokens) each time.
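(An aside from me, not from his post: a toy Python sketch of the sampling step he describes. The function name, the made-up scores, and the plain softmax with temperature are my own assumptions for illustration; I have no idea what OpenAI's server code actually looks like.)

```python
import math
import random

def sample_next_token(logits, temperature, rng):
    """Pick a token index from raw scores (logits) at a given temperature."""
    if temperature == 0:
        # Temperature zero: always take the single most likely token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Divide by the temperature, then turn the scores into softmax weights.
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    weights = [math.exp(x - top) for x in scaled]  # subtract max for stability
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical scores for a three-token vocabulary.
toy_logits = [2.0, 1.0, 0.5]
rng = random.Random(0)
draws = [sample_next_token(toy_logits, 1.0, rng) for _ in range(200)]
```

With a nonzero temperature the draws come out as a mix of tokens, which is the "different completion each time" behavior. At temperature zero the same token comes back every time.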

So then to watermark, instead of selecting the next token randomly, the idea will be to select it pseudorandomly, using a cryptographic pseudorandom function, whose key is known only to OpenAI. That won’t make any detectable difference to the end user, assuming the end user can’t distinguish the pseudorandom numbers from truly random ones. But now you can choose a pseudorandom function that secretly biases a certain score—a sum over a certain function g evaluated at each n-gram (sequence of n consecutive tokens), for some small n—which score you can also compute if you know the key for this pseudorandom function.
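(Another aside from me: a rough Python sketch of one construction that fits this description. The quoted excerpt does not spell out the selection rule or the function g, so the rule used here, picking the token that maximizes r^(1/p), and the choice g = ln(1/(1-r)) are my assumptions, as are HMAC-SHA256 as the pseudorandom function, the key, and the toy 64-token "model". Treat it as an illustration, not as OpenAI's method.)

```python
import hashlib
import hmac
import math
import random

def prf_unit(key, context, token):
    """Keyed pseudorandom number strictly between 0 and 1 for one n-gram."""
    msg = ",".join(str(t) for t in (*context, token)).encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    x = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (x + 0.5) / 2**52

def watermarked_choice(key, context, probs):
    """Pick the token maximizing r ** (1 / p), compared in log form."""
    best, best_val = None, -math.inf
    for token, p in enumerate(probs):
        if p <= 0:
            continue
        val = math.log(prf_unit(key, context, token)) / p
        if val > best_val:
            best, best_val = token, val
    return best

def generate(key, prompt, probs, length, n):
    """Extend the prompt, keying each choice on the previous n-1 tokens."""
    tokens = list(prompt)
    for _ in range(length):
        context = tokens[len(tokens) - (n - 1):]
        tokens.append(watermarked_choice(key, context, probs))
    return tokens

def score(key, tokens, n):
    """Sum of g = ln(1 / (1 - r)) over every n-gram in the text."""
    total = 0.0
    for i in range(n - 1, len(tokens)):
        context = tokens[i - n + 1:i]
        total += -math.log(1.0 - prf_unit(key, context, tokens[i]))
    return total

# Toy setup: all values below are made up for the demonstration.
KEY = b"hypothetical-secret-key"
N = 4
weights = [1 + (i % 4) for i in range(64)]
probs = [w / sum(weights) for w in weights]
prompt = [0, 1, 2]

marked = generate(KEY, prompt, probs, 200, N)
plain = prompt + random.Random(0).choices(range(64), weights=probs, k=200)

marked_score = score(KEY, marked, N)
plain_score = score(KEY, plain, N)
```

Anyone holding the key can recompute the score: ordinary text should average about 1 per token under this g, while the watermarked text should come out well above that. Without the key, the r values look like plain random numbers, so there is nothing to measure.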


31 posted on 01/27/2023 8:30:59 AM PST by Wayne07


To: Wayne07

Yeah, that.


32 posted on 01/28/2023 7:50:10 AM PST by sonova (That's what I always say sometimes.)
