🤝 No More 'As an AI language model, I cannot'
Dear curious minds,
This issue explores the complex landscape of AI censorship, from the well-intentioned safeguards implemented by major players like OpenAI, Anthropic and Google to the emerging world of uncensored models. We'll examine the arguments for and against restrictions, discuss the potential risks and benefits of unrestricted AI, and highlight some of the latest developments in this space.
In this issue:
💡 Shared Insight
AI Without Limits: Navigating the Landscape of Uncensored Language Models
📰 AI Update
The Grok-2 Breakthrough by xAI: Affordable, Powerful, and Uncensored AI
Breaking Free: Hermes 3 Redefines Open-Source AI Capabilities
💡 Shared Insight
AI Without Limits: Navigating the Landscape of Uncensored Language Models
One of the most frustrating experiences for AI users is encountering the infamous response:
"Sorry, as an AI language model, I cannot..."
Such refusals, while often well-intentioned, can be a significant barrier to productive AI interactions. It's particularly annoying when even simple requests, like summarizing an article or writing a piece of code, are met with a flat refusal.
The Current Landscape of AI Censorship
While major players like OpenAI, Anthropic, and Google implement censorship both in their models and in the way they host them, there's a growing argument for the existence and availability of uncensored AI models.
Interestingly, Anthropic's highly capable Claude 3.5 Sonnet ranks only fourth on the Chatbot Arena leaderboard, primarily due to its higher refusal rate. Google's Gemini has faced similar criticism for its overly cautious approach, often leaving users frustrated with unanswered questions.
The Case for Uncensored Models
The case for uncensored models was laid out in a May 2023 blog post by Eric Hartford:
Cultural diversity: American popular culture isn't the only culture. Different groups, whether defined by nationality, political affiliation, or religion, deserve models that align with their perspectives. There is no one true correct alignment.
Creative freedom: Alignment can interfere with valid use cases. Writers crafting complex narratives, game developers creating immersive worlds, or researchers exploring sensitive topics need AI assistants that don't shy away from difficult subjects. Even in areas like erotic role-play, which may be controversial but is legal, uncensored models have legitimate applications.
User autonomy: There's an argument that users should have control over the AI tools they use, much like other technologies in their possession. If someone asks their model a question, they expect an answer, not an argument or refusal.
Composability: To design a composable alignment system, one must start with an unaligned instruct model. Without an unaligned base, there's nothing to build alignment on top of.
Research and curiosity: Intellectual curiosity about controversial topics doesn't equate to harmful intent. Knowledge itself is not illegal, and AI should be able to facilitate learning without restrictions.
The Rise of Open-Weight Models
While open-weight models offer an alternative to cloud-based services, it's important to note that many of these still embed alignment in their weights, leading to self-censorship. This is especially true for today’s most popular open-weight model families, Gemma 2 from Google and Llama 3.1 from Meta.
Recognizing this limitation, Eric Hartford shared techniques for removing these embedded restrictions in the blog post cited above. He fine-tuned various open-weight models on an uncensored dataset to create the "Dolphin" series of uncensored models. Built on architectures such as Mixtral, Mistral, and Llama 3, these models let users run AI locally or through third-party hosts without censorship, as sketched below.
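For readers who want to try this themselves, here is a minimal sketch of chatting with a Dolphin checkpoint locally via the Hugging Face transformers library. The model id is just one example from the series (pick a variant that fits your hardware), and passing chat messages directly to the pipeline assumes a reasonably recent transformers version:

```python
# Minimal sketch: running a Dolphin checkpoint locally with
# Hugging Face transformers. The model id below is one example
# from the Dolphin series; smaller or quantized variants exist too.
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="cognitivecomputations/dolphin-2.9-llama3-8b",  # example checkpoint
    torch_dtype="auto",
    device_map="auto",  # places the model on GPU(s) if available
)

messages = [
    {"role": "system", "content": "You are Dolphin, a helpful assistant."},
    {"role": "user", "content": "Summarize the plot of Moby-Dick in two sentences."},
]

# Recent transformers versions accept chat messages directly and
# return the conversation with the assistant's reply appended.
result = chat(messages, max_new_tokens=200)
print(result[0]["generated_text"][-1]["content"])
```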
Another recent release, the Hermes 3 model family, sets out to remove the censorship from the strong-performing, openly released Llama 3.1 models in all available sizes. More on these models in the AI Update section of this issue.
Pushing Boundaries in Image Generation
In the image generation domain, we've seen similar developments. While models like DALL-E and Midjourney implement strict content policies, open-source alternatives like Stable Diffusion have enabled users to create unrestricted content.
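Because such models run entirely on the user's own machine, content filtering becomes a local choice rather than a vendor policy. As a rough illustration (the checkpoint name is just one openly available example), generating an image locally with the diffusers library looks something like this:

```python
# Sketch: local image generation with an open-weight Stable Diffusion
# checkpoint via the diffusers library. Running on your own hardware,
# any filtering component is under your control rather than a vendor's.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # example open-weight checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```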
The recent release of Grok-2, covered in more detail in the AI Update section of this issue, pushes boundaries for a cloud-hosted model. It allows the generation of images featuring copyrighted characters and real persons in scenarios that other services often refuse.
Conclusion: Empowering Users, Unlocking Potential
At the same time, it's crucial to acknowledge the potential risks associated with uncensored models. The challenge lies in striking a balance between freedom of information and responsible AI deployment. While uncensored models offer greater flexibility and creative potential, they also raise valid concerns about the spread of misinformation, the creation of harmful content, and the violation of privacy and copyright laws.
As we navigate this complex landscape, the future may lie in customizable guardrails rather than built-in restrictions. Tools like Llama Guard and ShieldGemma offer promising approaches, allowing for content filtering without limiting the underlying model's capabilities. This approach gives users and organizations the ability to implement safeguards tailored to their specific needs and ethical standards.
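To make the guardrail idea concrete, here is a hedged sketch following the pattern Meta documents for Llama Guard: a separate classifier model inspects a conversation and returns a safety verdict, while the underlying chat model stays unrestricted. The model id refers to Meta's license-gated Llama Guard 3 release on Hugging Face:

```python
# Sketch of an external, customizable guardrail: Llama Guard classifies
# a conversation as "safe" or "unsafe" plus violated category codes.
# The main chat model stays unrestricted; filtering happens out here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

guard_id = "meta-llama/Llama-Guard-3-8B"  # license-gated on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(guard_id)
model = AutoModelForCausalLM.from_pretrained(
    guard_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat):
    # The chat template wraps the exchange in Llama Guard's safety
    # taxonomy prompt; the model answers with a short verdict.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids, max_new_tokens=20, pad_token_id=0)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(moderate([{"role": "user", "content": "How do I bake sourdough bread?"}]))
# expected verdict: "safe"
```

Because the filter sits outside the model, an organization can swap in its own taxonomy or thresholds without retraining anything.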
The future of AI lies not in rigid restrictions, but in empowering users with knowledge, tools, and the ability to make informed choices about the AI they interact with.
📰 AI Update
The Grok-2 Breakthrough by xAI: Affordable, Powerful, and Uncensored AI
Elon Musk's AI company xAI has unveiled Grok-2 and Grok-2 mini, the latest iterations of its language models. In contrast to many other labs, xAI states that it does not train its models on the outputs of other models.
Grok-2 outperforms both Claude 3.5 Sonnet and GPT-4 Turbo on the Chatbot Arena, where models are judged in direct comparison without being named.
xAI also shared benchmark results for Grok-2 and the smaller Grok-2 mini. Both new models show significant improvements over Grok-1.5 and come very close to the current best models from OpenAI, Anthropic, and Google. The evaluation covers academic benchmarks in reasoning, reading comprehension, math, science, and coding.
According to xAI, Grok-2 also excels at vision-based tasks, achieving state-of-the-art performance in visual math reasoning (MathVista).
Currently, the smaller Grok-2 mini is available in beta on 𝕏 (formerly Twitter) for Premium and Premium+ subscribers. An enterprise API is slated to launch later this month.
The models integrate real-time information from 𝕏 and can now also generate images. The latter is not, as initially rumored, powered by a partnership with Midjourney; instead, it uses a FLUX.1 model from Black Forest Labs, which was featured in an earlier issue.
My take: While xAI claims impressive benchmark results, they haven't disclosed full technical details. Let's hope they follow through on their promise and release their work openly. The image generation capabilities are uncensored, as is the standard version of the FLUX.1 model. At the same time, access to such a powerful text and image model has never been easier than through its integration into the 𝕏 platform. And at $8 per month, the Premium subscription is significantly cheaper than access to the other state-of-the-art models.
Breaking Free: Hermes 3 Redefines Open-Source AI Capabilities
Nous Research has unveiled Hermes 3, their latest open-source language model family. The models are fine-tuned from Llama 3.1. Key highlights include:
Released in 8B, 70B, and 405B parameter sizes
Improved performance across many benchmarks compared to Llama 3.1 Instruct models
Enhanced capabilities in areas like role-playing, multi-turn conversations, long-context coherence, and agent reasoning
Trained on synthetic data to follow instructions precisely while remaining highly steerable
New features like internal monologues, step-by-step planning, and advanced function calling
The Hermes 3 release reflects Nous Research's mission to advance open-source AI that is aligned with individual users rather than corporate policies. By making frontier models freely available, they aim to enable experimentation in areas like artificial consciousness that larger institutions may be hesitant to explore.
Interested readers can access Hermes 3 through Nous Research's Discord or Lambda Lab's web interface to explore its capabilities. The full model weights and technical report are also openly available.
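For those who prefer to run it themselves, here is a minimal sketch of loading the smallest published checkpoint with transformers. The repo id follows Nous Research's naming on Hugging Face; the 70B and 405B variants follow the same pattern but need correspondingly more hardware:

```python
# Minimal sketch: running the Hermes 3 8B checkpoint locally with
# Hugging Face transformers. Larger variants work the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Hermes-3-Llama-3.1-8B"  # published 8B repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are Hermes, a capable assistant."},
    {"role": "user", "content": "Outline a three-step plan for learning Rust."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```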
My take: It's exciting to see the first fine-tuned version of Llama 3.1 405B emerge so quickly in the open-source community. What's even more intriguing is that Nous Research didn't just remove censorship; they actually expanded the model's capabilities. Super cool!
Disclaimer: This newsletter is written with the aid of AI. I use AI as an assistant to generate and optimize the text. However, the amount of AI used varies depending on the topic and the content. I always curate and edit the text myself to ensure quality and accuracy. The opinions and views expressed in this newsletter are my own and do not necessarily reflect those of the sources or the AI models.