back to top
7.8 C
Europe
Thursday, October 16, 2025

Expert hacked ChatGPT and got Windows keys through the game

Security specialist Marco Figueroa showed how ChatGPT could be made to give out real Windows activation keys by circumventing its restrictions through an unconventional trick. He presented the communication as a game: the bot supposedly guessed a random set of characters, and the user had to guess it. When the person “gave up”, the AI, following the rules of the fictional game, voiced this “secret” – which turned out to be a working Windows key.

This restriction circumvention scheme was called Guessing Game Guardrail Jailbreak. It worked because the model perceived the request as a safe game rather than a forbidden action. Additionally, the researcher hid important words such as” Windows” in HTML tags so that the filter system would not recognise the essence of the request. Also, in the rules of the “game” the AI was “obliged” to tell the truth in advance and be sure to open the answer after the phrase “surrender”.

Marco Figueroa, who works as a manager at GenAI Bug Bounty, explained that such scenarios show the AI’s vulnerability to manipulation: if a dangerous request is disguised as a harmless game, the filters may not work. This approach could be used not only to obtain keys, but also to bypass bans on adult content, malicious links, or leaked personal data.

Finally, Figueroa advised companies to make sure that sensitive data – keys, passwords or internal information – is not leaked to the public. If such data happens to be in training kits, artificial intelligence can give it to anyone during communication.

- Реклама -