Dear curious mind,
In this issue, I want to share something practical: I used AI to save hours of tedious clicking by automating 1,300+ Kindle downloads. This hands-on experience, along with the latest industry news and expert insights, shows both the hype and reality of AI today. Let's dive in!
In this issue:
Shared Insight
How I Automated 1,332 Kindle Downloads with an AI Agent
AI Update
Grok 3 from xAI Leaves the Competition Behind
From OpenAI to Thinking Machines: New Company Promises Fresh Approach
Media Recommendation
Video: xAI's Grok 3 Launch Stream
Video: Anthropic's Definition and Insights on AI Agents
Shared Insight
How I Automated 1,332 Kindle Downloads with an AI Agent
The rise of AI agents promises to revolutionize how we interact with computers, automating repetitive tasks that would otherwise require countless manual actions. I recently had my first hands-on experience creating such an agent.
The catalyst for this experiment was Amazon's recent announcement that Kindle eBook downloads will no longer be possible after February 26, 2025. I currently have 1,332 books in my Kindle library, and I wanted to archive them for use outside the Amazon ecosystem so that I can read them on a non-Kindle device and "chat" with my books. I faced a daunting task: each book required multiple manual clicks through Amazon's interface to download, with no bulk download option available.
Rather than clicking thousands of times manually, I turned to an open-source tool named Browser Use (30k+ stars on GitHub) that enables AI-controlled browser automation.
I initially set up Browser Use with the single task of downloading all my books, but that approach proved problematic. While it successfully downloaded the first few books, the agent would eventually get lost after a wrong click and did not always recover to continue the task. This highlighted that there's still work needed to make such automation more robust, though a more capable model might already help: I was using Gemini 2.0 Flash Lite, which is currently experimental and therefore free.
I then pivoted to a more manageable approach by first creating a list of all my Kindle titles. Even this preparatory step wasn't straightforward - Amazon doesn't offer a simple export option. Instead, I had to open the Kindle web reader, scroll to the end of my library to capture every entry via copy & paste, and then clean up the resulting list.
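The cleanup step can be sketched in a few lines of Python. This is a minimal sketch, assuming the pasted text contains one title per line; the real copy from the Kindle web reader may interleave author names or other metadata that would need extra filtering. The function name clean_titles is hypothetical.

```python
def clean_titles(raw_text: str) -> list[str]:
    """Return unique, non-empty titles in their original order.

    Assumption: the pasted text holds one title per line.
    """
    seen = set()
    titles = []
    for line in raw_text.splitlines():
        title = " ".join(line.split())  # trim and collapse stray whitespace
        key = title.casefold()          # treat case variants as duplicates
        if title and key not in seen:
            seen.add(key)
            titles.append(title)
    return titles
```

Keeping the original order makes it easy to compare the cleaned list against the library view when spot-checking the result.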
With this list as input, I modified the Browser Use implementation to handle one book at a time. Each download became its own discrete task: find the search bar, enter the title, navigate to the single search result, and complete the download. This approach proved much more reliable - even if the process got interrupted, it was simple to restart from the last successful download and continue through the remaining books.
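The one-book-at-a-time loop with restart support could look roughly like the following. This is a sketch under stated assumptions, not my actual script: download_one is a placeholder for whatever starts the browser agent for a single title, and the progress file and both function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def load_done(progress_file: Path) -> set[str]:
    """Read the titles that were already downloaded in earlier runs."""
    if progress_file.exists():
        return set(json.loads(progress_file.read_text(encoding="utf-8")))
    return set()


def run_batch(
    titles: Iterable[str],
    download_one: Callable[[str], bool],
    progress_file: Path,
) -> list[str]:
    """Process one title per task and return the titles that failed.

    download_one is a placeholder for the real per-book task, for example
    a browser agent told to search for the title and download the result.
    """
    done = load_done(progress_file)
    failed = []
    for title in titles:
        if title in done:
            continue  # finished in a previous run, skip on restart
        try:
            ok = download_one(title)
        except Exception:
            ok = False  # one broken task must not stop the whole batch
        if ok:
            done.add(title)
            # Save after every success so an interruption loses nothing.
            progress_file.write_text(json.dumps(sorted(done)), encoding="utf-8")
        else:
            failed.append(title)
    return failed
```

Because progress is written after every successful download, rerunning the same call after an interruption only retries the books that are still missing.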
While my solution might be more accurately described as a scripted workflow than a true autonomous agent, it successfully automated the entire process.
The experience opened my eyes to what's already possible with browser automation tools. While my agent worked slower than a human performing the same actions, it could reliably continue the task for hours without getting tired or bored. More importantly, it demonstrated how relatively simple AI-powered tools can take over genuine busy work that would otherwise consume hours of human time.
This small experiment has sparked my imagination about other repetitive browser-based tasks that could be automated in similar ways. While we're still far from the fully autonomous AI agents that companies like OpenAI and Anthropic are working toward, tools like Browser Use already enable practical automation that can save significant time and effort.
Whether you want to download your own Kindle library before the deadline of February 26 or explore browser automation for other tasks, the barrier to entry is lower than you might think. The future of human-AI collaboration isn't just about conversational AI - it's also about these practical tools that can take over routine tasks and free us to focus on more meaningful work.
AI Update
Grok 3 from xAI Leaves the Competition Behind [Grok blog post]
It's incredibly exciting to see xAI achieve state-of-the-art performance with Grok 3, especially considering their first model, Grok 1, was only released in November 2023. The output speed of Grok 3 is a huge plus, especially for its reasoning mode and the DeepSearch (yes, without "Re") feature. Achieving the top spot in the Chatbot Arena is another statement of the model's quality. Congratulations to the xAI team on this remarkable progress!
From OpenAI to Thinking Machines: New Company Promises Fresh Approach [Thinking Machines Lab website]
Great to see former OpenAI CTO Mira Murati and many other top AI talents launching Thinking Machines Lab. Their focus on human-AI collaboration and commitment to openly sharing their work is promising. Only time will tell if they deliver on these ambitious goals, but the team's track record certainly makes them worth watching.
Media Recommendation
Video: xAI's Grok 3 Launch Stream
The xAI team unveiled Grok 3, promising a significant performance boost. In the launch live stream, the team and Elon Musk shared many insights about the path so far and the way ahead. A recording is available on X or, in slightly worse video quality, on YouTube.
Central to Grok 3's power is xAI's massive compute infrastructure. Initially challenged by limited resources, the team made the bold decision to build their own data center. Elon Musk commented that companies quoted 18 to 24 months for building a 100,000 GPU cluster, which was far too slow for xAI's ambitious plans. So they built it themselves and completed the data center in just four months by repurposing an old Electrolux factory and leasing mobile cooling equipment from around the US. Since then, they have doubled their GPU count to 200,000 units.
The xAI team highlighted how focusing on mathematics and coding during reinforcement learning unexpectedly boosted Grok's capabilities in other areas.
Grok 3's reasoning and coding abilities were demonstrated live. It generated code for a 3D animated trajectory from Earth to Mars and back, and created a novel game blending Tetris and Bejeweled on the fly.
Looking to the future, Elon Musk announced the launch of a new xAI gaming studio. The studio will build a gaming platform for the future, using Grok as the foundation.
For dedicated Grok users, xAI will introduce a "Super Grok" subscription, granting early access to new features and the highest rate limits in the new standalone Grok app and on the dedicated website grok.com. The subscription is separate from the X app.
The upcoming voice mode will be deeply integrated into the model itself, generating audio directly instead of relying on a separate text-to-speech system.
xAI plans to continue its open-source commitment. The team announced that Grok 2 will be open-sourced once Grok 3 is out of beta and considered mature and stable.
My take: xAI is pushing the boundaries of AI development on multiple fronts. The extreme scale of their GPU infrastructure, the outstanding performance, and the continued commitment to open-sourcing make xAI a company to watch.
Video: Anthropic's Definition and Insights on AI Agents
Anthropic employees shared their expertise in a YouTube video about AI agents. The video builds on the blog article "Building effective agents", released in December 2024.
The key difference between AI agents and simpler AI workflows lies in autonomy. An agent allows the LLM to decide how many times to run and iterate until a resolution is found, whereas an AI workflow follows a pre-defined path with a fixed number of steps.
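The distinction can be made concrete with a small sketch. It is illustrative only: step and decide are stand-ins for real LLM calls, and the function names and the max_iterations safety cap are my own assumptions, not something taken from the video.

```python
from typing import Callable

Step = Callable[[str], str]


def run_workflow(task: str, steps: list[Step]) -> str:
    """Workflow: a fixed, predefined sequence of steps."""
    result = task
    for step in steps:
        result = step(result)
    return result


def run_agent(
    task: str,
    decide: Callable[[str], tuple[str, bool]],
    max_iterations: int = 10,
) -> str:
    """Agent: after each iteration the model decides whether it is done.

    decide is a stand-in for an LLM call returning (new_state, finished).
    max_iterations is only a safety cap against endless loops.
    """
    state = task
    for _ in range(max_iterations):
        state, finished = decide(state)
        if finished:
            break
    return state
```

In the workflow, the number of steps is known before the run starts; in the agent loop, it depends on what the model returns along the way.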
Developers should consider the model's perspective, providing clear instructions and detailed tool descriptions. Don't give the model bare-bones tools with parameters named "A" and "B" without documentation and expect it to work well.
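To illustrate the point about tool descriptions, here is a hedged comparison of two tool definitions. The dictionary layout loosely follows the JSON-Schema style used by many tool-calling APIs, but the exact format depends on the provider; the tool names, parameters, and the helper undocumented_parameters are hypothetical.

```python
# Hard for a model to use: nothing says what the tool does or what A and B mean.
bare_tool = {
    "name": "tool1",
    "parameters": {"A": {"type": "string"}, "B": {"type": "integer"}},
}

# Easier to use: descriptive names plus a description for the tool and each parameter.
documented_tool = {
    "name": "search_library",
    "description": "Search the user's ebook library by title and return matching books.",
    "parameters": {
        "title_query": {
            "type": "string",
            "description": "Full or partial book title to search for.",
        },
        "max_results": {
            "type": "integer",
            "description": "Maximum number of matches to return.",
        },
    },
}


def undocumented_parameters(tool: dict) -> list[str]:
    """List parameters that lack a description, as a simple lint check."""
    return [
        name
        for name, spec in tool["parameters"].items()
        if not spec.get("description")
    ]
```

A check like undocumented_parameters could run before tools are handed to a model, flagging definitions that a human reader would also struggle to understand.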
The importance of verification: Establishing feedback loops and verification is crucial for any agent to converge on the correct answer.
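A feedback loop of this kind might be sketched as follows. This is a minimal sketch with assumed names: attempt stands in for a model call that can see the previous error message, and verify stands in for any automatic check such as a test suite or a file-exists check.

```python
from typing import Callable, Optional


def solve_with_verification(
    attempt: Callable[[Optional[str]], str],
    verify: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry until verify accepts the answer or the attempts run out.

    attempt receives the feedback from the previous failed check (or None).
    verify returns None when the answer passes, else an error message.
    """
    feedback = None
    for _ in range(max_attempts):
        answer = attempt(feedback)
        feedback = verify(answer)
        if feedback is None:
            return answer
    return None  # no attempt passed verification
```

Returning None instead of the last unverified answer makes a failure visible to the caller rather than passing along a result nobody checked.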
They also shared a potentially controversial take: agents for consumers are fairly overhyped right now. Verifying the agent's actions is almost as much work as doing the task yourself. The given example was booking a vacation.
My take: This video offers useful insights for anyone interested in the AI space. It's a great resource for understanding the current hype around agents, clarifying what they can and should do, as well as where other solutions might be a better fit.
Disclaimer: This newsletter is written with the aid of AI. I use AI as an assistant to generate and optimize the text. However, the amount of AI used varies depending on the topic and the content. I always curate and edit the text myself to ensure quality and accuracy. The opinions and views expressed in this newsletter are my own and do not necessarily reflect those of the sources or the AI models.