🚀 The rise of mobile AI: Always with you
AI in Your Hands and on Your Face: The Next Wave of Innovation
Dear curious minds,
The future of AI is rapidly approaching, and it's looking more personal and portable than ever before. In this week's issue, we'll explore how AI is making its way into our pockets and onto our faces, transforming our everyday devices into powerful assistants.
In this issue:
💡 Shared Insight
The Future of AI: It's in Your Pocket and on Your Face
📰 AI Update
AI-powered Podcast Player Snipd Adds Support for YouTube and Audio Uploads
Anthropic Closes the Platform Gap: Claude Arrives on Android to Join iOS and Web
OpenAI's GPT-4o mini: Small Size, Big Intelligence
Mistral AI's NeMo: Pushing the Boundaries of Open Language Models
📚 Media Recommendation
The Unaligned Newsletter Explores AI's Potential in Developing Countries
💡 Shared Insight
The Future of AI: It's in Your Pocket and on Your Face
Imagine that you have a super-smart AI assistant with you all the time, ready to help whenever you need it. Well, that future isn't as far off as you might think! The world of generative AI is quickly moving from big, power-hungry cloud servers right into our mobile devices.
Let's start with the device you're probably holding right now - your smartphone. It's already a powerful little computer, and it's getting smarter every day. Companies are working hard to shrink down the massive generative AI models, making them small enough to run right on your phone. These compact models are getting impressively close to matching their bigger cloud-based cousins in performance.
But why stop at phones? Remember those futuristic smart glasses from movies? They're becoming a reality, with Meta's smart glasses giving us a sneak peek at what's possible. Imagine AI that can see what you see and hear what you hear, offering real-time assistance all day.
Apple's jumping on this bandwagon too. At their WWDC event in June, they showed off a vision of the iPhone as your personal AI hub. The idea is to keep most of your data on your device, only sending the really difficult requests to the cloud AI models. It's a win for both speed and privacy!
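To picture how such a hybrid setup works, here's a minimal sketch of the general on-device-first pattern: a small local model answers most requests, and only the hard ones go to the cloud. The function names and the confidence threshold are my own illustrative assumptions, not Apple's actual implementation.

```python
# Illustrative sketch of an on-device-first AI pipeline (not Apple's actual code).
# A compact local model handles most requests; only hard ones leave the device.

def run_local_model(request: str) -> tuple[str, float]:
    """Stand-in for a small on-device model; returns (reply, confidence)."""
    confidence = 0.9 if len(request) < 80 else 0.4  # toy heuristic
    return f"local answer to: {request}", confidence

def run_cloud_model(request: str) -> str:
    """Stand-in for a large cloud-hosted model, used only as a fallback."""
    return f"cloud answer to: {request}"

def answer(request: str, threshold: float = 0.8) -> str:
    reply, confidence = run_local_model(request)
    if confidence >= threshold:
        return reply                    # data stays on the device
    return run_cloud_model(request)     # only hard requests leave the device

print(answer("What's on my calendar today?"))
```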
Speaking of smaller but smarter AIs, check out this insight from AI expert Andrej Karpathy, which he shared just after the release of GPT-4o mini (covered later in the AI Update section of this newsletter):
LLM model size competition is intensifying… backwards! My bet is that we'll see models that "think" very well and reliably that are very very small. [full post on 𝕏]
β Andrej Karpathy
The bottom line? AI is about to become a much bigger part of our everyday lives, right there on the devices we already carry around, whether it's your familiar smartphone or upcoming smart glasses. Get ready for a world where your personal AI assistant is always just a tap away!
📰 AI Update
AI-powered Podcast Player Snipd Adds Support for YouTube and Audio Uploads
Snipd, a popular AI-powered podcast player, lets you create bookmarks while listening with a press of the rewind button. The big plus is that these bookmarks are automatically transcribed. After you finish a podcast, you get an email summary of the passages you highlighted, and you can export them to your note-taking app.
Thanks to a recent feature update, premium users can now upload any audio file or YouTube video directly to the app, expanding Snipd's utility to various forms of audio content.
This new functionality allows users to easily capture and review important information from audiobooks, lectures, presentations, and more.
Uploaded content remains private and is not searchable or discoverable by other users.
My take: This update is a game-changer for Snipd, transforming it from a podcast-specific tool into a versatile audio learning platform. The ability to highlight and save key moments from audiobooks and YouTube videos as easily as from podcasts is incredibly valuable. I'm particularly excited about the YouTube integration - being able to listen to videos in the background without a YouTube Premium subscription is a huge plus. As someone who's been using Snipd almost daily for podcast listening, I'm very excited to explore how these new capabilities can enhance my learning from a wider range of audio content. It's impressive to see how Snipd continues to innovate in the AI-powered audio space.
Anthropic Closes the Platform Gap: Claude Arrives on Android to Join iOS and Web
Anthropic has released a new Android app for Claude, their advanced AI assistant. This move brings Claude's capabilities, including the powerful Claude 3.5 Sonnet model, to Android users. The app is free to download and use, accessible with all Claude subscription plans, including Free, Pro and Team.
Key features of the Claude Android app include:
Multi-platform support: Users can seamlessly continue conversations across web, iOS, and now Android devices
Vision capabilities: Take photos or upload images for real-time AI analysis
To try Claude on your Android device, you can download the app from the Google Play Store.
My take: While it's great to see Claude finally available on Android, the app appears rather basic in its current form. Notable limitations include the inability to copy or edit prompts, and Claude's Artifacts and Projects features are not yet supported in the Android version. Additionally, unlike the ChatGPT and Copilot mobile apps, Claude for Android lacks voice chat for hands-free interactions. For now, if you want to use Claude on your Android phone, my recommendation is to stick with the mobile browser.
OpenAI's GPT-4o mini: Small Size, Big Intelligence
OpenAI has unveiled GPT-4o mini, their most cost-efficient small model to date. This new model aims to make AI more accessible and affordable for a wider range of applications.
GPT-4o mini scores an impressive 82% on MMLU (Massive Multitask Language Understanding). Across many benchmarks, its performance on reasoning, math, and coding tasks is not far behind GPT-4o and ahead of the comparable small models from Google and Anthropic.
Pricing is set at 15 cents per million input tokens and 60 cents per million output tokens, making it significantly more affordable than the state-of-the-art models and over 60% cheaper than GPT-3.5 Turbo.
So far, the model supports only text inputs and outputs, with a context window of 128K tokens and up to 16K output tokens per request.
Support for the full set of modalities - text, image, video, and audio inputs and outputs - is announced for a later stage.
The model is now available via the API and for ChatGPT Free, Plus, and Team users, with availability for ChatGPT Enterprise announced to follow soon.
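If you want to try the new model programmatically, here's a minimal sketch using the OpenAI Python SDK that estimates what a single request costs based on the prices quoted above. It assumes the model identifier "gpt-4o-mini" and an OPENAI_API_KEY set in your environment.

```python
# Minimal sketch: calling GPT-4o mini via the OpenAI Python SDK and estimating
# the cost of one request from the announced prices ($0.15 / $0.60 per 1M tokens).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize the benefits of small LLMs in one sentence."}],
)

usage = response.usage
cost = usage.prompt_tokens * 0.15 / 1_000_000 + usage.completion_tokens * 0.60 / 1_000_000
print(response.choices[0].message.content)
print(f"{usage.prompt_tokens} input / {usage.completion_tokens} output tokens, ~${cost:.6f}")
```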
My take: This release represents a significant step forward for everyone who uses OpenAI's models in an automated setup via API, as it drastically reduces costs while maintaining high performance. The affordability of GPT-4o mini opens up new possibilities for developers and businesses to integrate powerful AI capabilities into their applications at a reasonable price. This could potentially accelerate AI adoption across various industries and lead to more innovative AI-powered solutions in the near future.
For private usage via ChatGPT, this new model is likely not a big deal. You get responses a bit faster, but you will probably only use it when you run into the rate limits of the larger GPT-4o model.
Mistral AI's NeMo: Pushing the Boundaries of Open Language Models
Mistral AI has released Mistral NeMo, an impressive new 12B parameter language model developed in collaboration with NVIDIA.
Some key features of Mistral NeMo include:
Large 128k token context window, allowing it to process very long inputs
State-of-the-art performance for its size on reasoning, knowledge, and coding tasks
Strong multilingual capabilities across over 100 languages
New efficient "Tekken" tokenizer that compresses text and code better than previous tokenizers
Available as both a base model and instruction-tuned version
Released under the permissive Apache 2.0 license for research and commercial use
Notably, Mistral NeMo outperforms the current best models of similar size, such as Gemma 2 9B and Llama 3 8B, on many benchmarks. On top of that, it offers a significantly larger context window of 128k tokens, compared to only 8k tokens for those models.
Weights for both the base and the instruction-tuned models are hosted on HuggingFace.
To run the models locally, the new tokenizer still needs to be integrated into established tools like Ollama and LM Studio. However, you can already test the model in the cloud via a HuggingFace space.
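If you want to experiment with the weights yourself, here's a minimal sketch using the transformers library. The repository name is my assumption based on the release - check the HuggingFace model card for the exact ID and the required transformers version - and a 12B model in bfloat16 needs a GPU with roughly 24 GB of memory. For a proper chat-style prompt you would normally apply the model's chat template, but a raw prompt is enough to sanity-check the setup.

```python
# Minimal sketch: running the instruction-tuned Mistral NeMo weights with transformers.
# The model ID below is an assumption; verify it on the HuggingFace model card.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-Nemo-Instruct-2407",  # assumed repository ID
    torch_dtype=torch.bfloat16,
    device_map="auto",  # roughly 24 GB of GPU memory needed in bf16
)

prompt = "Explain in two sentences why a 128k token context window is useful."
output = generator(prompt, max_new_tokens=128, do_sample=False)
print(output[0]["generated_text"])
```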
My take: With its combination of strong performance, large context window, and multilingual capabilities, Mistral NeMo represents an exciting new option for researchers and companies looking to leverage open-source language AI.
📚 Media Recommendation
The Unaligned Newsletter Explores AI's Potential in Developing Countries
The latest issue of Robert Scoble and Irena Cronin's Unaligned newsletter explores AI's potential to catalyze transformation in developing countries.
It provides a comprehensive overview of AI applications in crucial sectors like healthcare, agriculture, education, and governance in emerging economies.
The newsletter discusses how AI could help developing nations leapfrog traditional development stages, particularly relevant as more powerful AI becomes available on mobile devices.
It addresses the challenges facing AI implementation in developing countries, including infrastructure limitations and data availability issues.
The authors explore how AI could bridge educational gaps and democratize learning in resource-constrained environments, aligning with our focus on accessible AI.
My take: The potential for AI in developing countries is enormous. As highlighted in this issue, there's a clear trend towards smarter yet more compact AI models. Since the low-end mobile devices widely used in developing nations can't run AI locally, this advancement is crucial for driving down the cost of accessing cloud-based AI services. And because text-based interactions with language models require only minimal data transmission, small, affordable cloud-based models could be transformative even where connectivity is limited. These developments will unlock the power of generative AI for people in developing countries, enabling significant progress across various sectors.
Disclaimer: This newsletter is written with the aid of AI. I use AI as an assistant to generate and optimize the text. However, the amount of AI used varies depending on the topic and the content. I always curate and edit the text myself to ensure quality and accuracy. The opinions and views expressed in this newsletter are my own and do not necessarily reflect those of the sources or the AI models.