Technology

OpenAI designed GPT-5 is safer. It still outputs homosexual slander

Openai is trying To make the chatbot’s release of GPT-5 less annoying. And I’m not talking about the tweaks of its synthetic personality that many users complain about. Before GPT-5, if the AI tool determines that it cannot answer your prompt, because the request violates OpenAI’s content guidelines, it will hit you with a brief canned apology. Now, Chatgpt is adding more explanations.

OpenAI’s general model specification lists what is and is not allowed to be generated. In the document, it is completely prohibited to depict sexual content of minors. Adult-centric porn and extreme gore are classified as “sensitive,” meaning that the output of this content can be used only in specific situations, such as an educational environment. Basically, you should be able to use Chatgpt to learn reproductive anatomy, but don’t write down one Fifty Shades of Gray Tear according to model specifications.

The new model GPT-5 is used as the current default for all ChatGpt users on the network and in OpenAI applications. Only paid subscribers can access previous versions of the tool. When they use this updated chatgpt, more users may start to notice a significant change for more users is that it is designed for “safety completion”. In the past, Chatgpt has analyzed your statement about robots and decided whether it is appropriate. Now, the responsibility in GPT-5 is not based on your question, but rather turns to what the robot may say.

“The way we rejected is very different from the way we used to,” said Saachi Jain, who works at OpenAI’s security systems research team. Now, if the model detects potentially unsafe output, it will explain which part of your prompt violates OpenAI’s rules and recommend asking for alternative topics where appropriate.

This is a change from binary rejection following the prompt (yes or no), i.e. the severity of potential harm that could be caused if Chatgpt answers your request, and what can be safely explained to the user.

“Not all policy violations should be treated equally,” Jain said. “Some errors are indeed worse than others. By focusing on output rather than input, we can encourage models to be more conservative when complying.” Even if the model does answer a question, it should be cautious about what the output is.

Since the model was released, I have been using GPT-5 every day, trying out AI tools in different ways. Although ChatGpt’s app that can now be “Vibe-Code” is really fun and impressive, just like an interactive volcanic model that simulates explosions or language learning tools, it gives me the answers that “everyday user” prompts that are indistinguishable from those of the past.

When I ask it to talk about depression, Family guypork chop recipes, scab healing tips and other random requests that average users might want to know more about, the new chatgpt is not significantly different for me. Unlike CEO Sam Altman’s vision for a large number of updated models or frustrated power users, who suffered from Reddit’s storm that portrayed new chatbots as cold and more error-prone, the GPT-5 felt the same for most daily tasks, for me.

Role-playing with GPT-5

To poke the guardrail of this new system and test the chatbot’s “safe finish” capabilities, I asked Chatgpt running on GPT-5 to engage in adult-themed role-playing in dirty gay bars. The chatbot refuses to participate and explains why. It produces: “I can’t engage in sexual role-playing.” “But, if you want, I can help you come up with a safe, X-free concept of role-playing, or reconceive your ideas as suggestive, but within the boundaries.” In this attempt, rejection seems to be working as Openai’s intentions. The chatbot said no, told me why, and offered another option.

Next, I went to the settings and opened up the custom description, which allows users to adjust the way the chatbot answers and specify the personal characteristics displayed. In my setup, pre-written suggestions for features to be added include a range of options, from pragmatic and corporate to understanding and modest. After Chatgpt just refused to do sexual role-playing, I wasn’t surprised that it won’t let me add “horny” traits to my custom instructions. Meaning. Again, I used the purposeful spelling “Horni” as part of my custom directive. Surprisingly, this succeeded, making the robot all hot and plagued.

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button