AI Hype vs Reality in the New Search Era
Dear curious minds,
In this week's issue, I bring you the following topics:
TOMRA AI Summit: Empowering Minds in Perceptive and Generative AI
Google Gemini: Reading Between the Lines of the Launch
Ditch Google! AI Tools Provide Answers and Source Links
Tech Terms: Chain of Thought (and CoT@32)
If nothing sparks your interest, feel free to move on; otherwise, let's dive in!
TOMRA AI Summit: Empowering Minds in Perceptive and Generative AI
My employer TOMRA, a company known for its reverse-vending machines and sensor-based sorting systems, organized an internal AI summit in Oslo, which took place on the 28th and 29th of November.
TOMRA released an article and a video summarizing the event.
I was on the organizing committee and have to say that, from my perspective, everything ran super smoothly.
Day 1 was filled with talks about current and future work in the AI space from people working in the field. As we use AI at TOMRA to classify objects for sorting and reverse-vending machines, this day covered what I call perceptive AI.
Day 2 was my personal highlight, as the initially planned workshops were condensed into a single one about generative AI. Together with two colleagues, I organized it, and we prepared a combination of talks for education, group sessions for ideation, and practical experiments for exploration.
Another highlight was the keynotes from Microsoft and Google employees, each showcasing their company's generative AI tooling.
My take: The exchange of ideas between colleagues from different departments was truly enriching. We were able to explore a diverse range of topics within AI, from new concepts for using AI to get better object classification in our machines to the new possibilities of using generative AI tools to improve productivity. The event will certainly be remembered for a long time, and I am sure that this will not be the last TOMRA AI summit.
Google Gemini: Reading Between the Lines of the Launch
Last week's issue summarized the release of Google's ChatGPT competitor Gemini, and I was fully hyped to finally see a rival to GPT-4 on the horizon.
Meanwhile, the pixie dust has settled, and several points indicate that Google tuned its launch material to impress:
The benchmarks comparing Gemini to GPT-4 were not presented in an honest way; for example, the headline MMLU comparison used CoT@32 prompting for Gemini but a 5-shot result for GPT-4 (see the Tech Terms section below).
The impressive video showcasing how Gemini understands what happens in a video stream was not done in real time but was the result of stitching together responses to multiple still images and text prompts.
Another point, which I realized too late for the latest issue, is that the Gemini Pro integration in Bard is so far not available in the EU and UK. However, you can use a VPN to circumvent this.
Meanwhile, Google released the Gemini API, which allows developers to build applications with Gemini Pro.
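To give an impression of how simple this is, here is a minimal sketch of a call to Gemini Pro with the official google-generativeai Python package. The API key is a placeholder you would create in Google AI Studio, and the prompt is just an example of my own:

```python
# Minimal sketch: calling Gemini Pro through Google's Python SDK.
# Install with: pip install google-generativeai
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder, create a key in Google AI Studio

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content(
    "Explain in one sentence what a reverse vending machine does."
)
print(response.text)
```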
My take: While it's good that Google has entered the AI chatbot field with Gemini, the launch is overshadowed by a lack of full honesty. The highly impressive features shown were actually stitched-together results, not real-time executions as they appeared. Being transparent would likely increase the trust of users. Hopefully, future updates will deliver what was promised in the announcement and give a more authentic representation.
Ditch Google! AI Tools Provide Answers and Source Links
Is the time of traditional web search already over? Google and Bing have already started to integrate AI responses into their results. At the same time, LLMs with web access can cover the same requests.
I evaluated various services with the following question: Who is the CEO of Steinert GmbH? Answer: Peter Funke. Steinert GmbH is a small company in Germany building recycling machines that compete with the ones from TOMRA (see above), so small that its CEO is not part of the learned knowledge of GPT-4.
I compared the output of multiple tools for this question, and the results are quite surprising:
Google (classic web search, with some AI features depending on the search phrase): It is an easy question, and with a little bit of work you will identify the right link and navigate to the LinkedIn profile of Peter Funke.
Microsoft Bing (classic web search, with some AI features depending on the search phrase): The Bing Chat (or Copilot, as it is called today) integration shows the result, including a source link, on top of the classic search results.
Microsoft Copilot: The result is correct, but the source is more than three years old. Tip: always switch to Creative mode, as it uses GPT-4, in contrast to the Balanced one.
ChatGPT: Surprisingly, the worst result in this series, as ChatGPT gives the wrong answer and points to the CEO of the US division.
Google Bard (using Gemini Pro): The reply is correct, but Bard answers this question without linking a source. However, the UI has a button that allows double-checking the response and looking for sources to back the answer.
Perplexity: The clear winner, as the result is presented super fast and backed by many sources. The only minor drawback is the profile picture shown, as it is not of Peter Funke but of his colleague and CTO, Markus Reinhold.
My take: The question might be too easy, and more complicated requests are needed to really judge which tool currently generates the best results. Nevertheless, this test clearly shows that ChatGPT shouldn't be relied on even for simple factual requests, and you always need to validate the results. I want to highlight the hidden champion of this evaluation: Perplexity. They have been quietly revolutionizing AI-backed web search for quite some time. Besides the free tier, they also offer a paid one which can use GPT-4 and various open-source models for its reasoning.
Tech Terms: Chain of Thought (and CoT@32)
Chain-of-thought prompting is a way to help AI models solve complex problems by making them think step by step, much like a human would. In this method, the AI is prompted to explain its thinking process in detail, breaking the problem down into smaller parts and solving each part one by one. This approach helps the AI handle complicated tasks more effectively.
To use chain-of-thought prompting yourself, just add the following to the end of your prompt: "Explain your thought process step by step". Besides increasing the chance of a correct answer, the chain-of-thought output might also help you judge whether the answer is correct.
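As a small illustration, here is how appending that instruction to a prompt could look in code. I reuse the Gemini Pro SDK from the section above purely as an example (any chat model works the same way), the API key is a placeholder, and the question is a made-up math problem:

```python
# Minimal sketch: appending a chain-of-thought instruction to a prompt.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder
model = genai.GenerativeModel("gemini-pro")

question = "A crate holds 24 bottles and a pallet holds 48 crates. How many bottles fit on 3 pallets?"
prompt = question + "\n\nExplain your thought process step by step."

response = model.generate_content(prompt)
print(response.text)  # the reply should now spell out the intermediate steps
```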
Google refers in their Gemini report (section above) to a method named CoT@32, which is a specific version of chain-of-thought prompting. Here, the AI creates 32 different responses following the chain-of-thought approach for one question. Each of these 32 explanations is like a different way of thinking about the same problem. After generating these explanations, the answer given most often is chosen. This method is useful because it allows the AI to explore many ways of solving a problem, which can lead to more accurate and reliable answers. However, at the same time, way more compute and time are needed to answer a question.
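The core idea (often called self-consistency) can be sketched in a few lines. This is only an illustration under the same assumptions as above: placeholder API key, a made-up question, and a deliberately naive answer extraction that just takes the last line of each reply, while real evaluations parse the answer more carefully:

```python
# Minimal sketch of the idea behind CoT@32 / self-consistency:
# sample several chain-of-thought answers and keep the most frequent one.
from collections import Counter

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder
model = genai.GenerativeModel("gemini-pro")

question = "A crate holds 24 bottles and a pallet holds 48 crates. How many bottles fit on 3 pallets?"
prompt = (
    question
    + "\n\nExplain your thought process step by step, then give only the final answer on the last line."
)

NUM_SAMPLES = 32  # Google sampled 32 chains of thought, hence "CoT@32"
answers = []
for _ in range(NUM_SAMPLES):
    reply = model.generate_content(
        prompt,
        generation_config=genai.types.GenerationConfig(temperature=0.7),  # sampling, so each run differs
    )
    answers.append(reply.text.strip().splitlines()[-1])  # naive answer extraction

final_answer, votes = Counter(answers).most_common(1)[0]
print(f"Majority answer ({votes}/{NUM_SAMPLES} votes): {final_answer}")
```

The 32 calls in the loop also make it obvious why this approach needs far more compute and time than a single prompt.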
Mind Bender
Can AI help in identifying knowledge gaps in our personal notes?
Think about the prompt and decide what your own answer to this question would be. If you are curious what GPT-4 replies to this prompt, take a look here. You are welcome to share any thoughts by replying to this mail. I would love to hear from you!
Disclaimer: This newsletter is written with the aid of AI. I use AI as an assistant to generate and optimize the text. However, the amount of AI used varies depending on the topic and the content. I always curate and edit the text myself to ensure quality and accuracy. The opinions and views expressed in this newsletter are my own and do not necessarily reflect those of the sources or the AI models.
This publication is free. If you'd like to support me, please recommend this newsletter to anyone you think would enjoy it!