🤝 Equip Your Family Against AI Fake Calls

Jan 25, 2024

Dear curious minds,

The generative AI space is evolving fast and the niche application of AI audio can pose a real threat, which you will learn about today.

In this week’s issue, I bring to you the following topics:

- Zuckerberg Reveals: Llama 3 in Training, Aims for Open Source AGI
- The New Era of AI Deepfake Calls: Are You Prepared?
- Unleash Creativity: Top 3 Free AI Art Apps

If nothing sparks your interest, feel free to move on. Otherwise, let us dive in!

🤖💰 Zuckerberg Reveals: Llama 3 in Training, Aims for Open Source AGI

Mark Zuckerberg confirms in an Instagram post that Meta is focusing on AI and aims to build artificial general intelligence (AGI). They plan to open source it responsibly for widespread use. Meta's two AI research divisions, FAIR and GenAI, are being brought closer together to support this vision.
Post from Mark Zuckerberg about Meta’s AI efforts. [source]
The first two versions of Llama changed the landscape of AI, making powerful language models accessible on consumer-grade hardware. Llama 2, in particular, was open source and performed exceptionally. Currently, Meta is training Llama 3.
Meta is building an extensive infrastructure and plans to deploy 350k of Nvidia's current flagship H100 GPUs this year. Including other GPUs, this will result in almost 600k H100 equivalents of compute.

The costs are massive: at a price of $30k per H100, buying 350k of them already amounts to 10.5 billion dollars. GPT-4 estimated the cost of running one H100 for a month at slightly above $100, which adds up to more than 35 million dollars per month for 350k H100 GPUs.

# Monthly cost estimate for running one H100 = 106.9 USD (by GPT-4)

# Constants
gpu_cost = 30000  # Cost of one H100 GPU
power_consumption_watt = 350  # Power consumption in watts
electricity_cost_per_kWh = 0.13  # Average cost in USD
operational_hours_per_month = 24 * 30  # 24 hours a day for 30 days
cooling_infrastructure_multiplier = 1.5  # Cooling and infrastructure estimated at 1.5x the power cost (added on top of it)
maintenance_annual_percentage = 0.01  # 1% annual maintenance cost

# Calculations
# Monthly power cost
monthly_power_cost = (power_consumption_watt / 1000) * electricity_cost_per_kWh * operational_hours_per_month

# Monthly cooling and infrastructure cost
monthly_cooling_infrastructure_cost = monthly_power_cost * cooling_infrastructure_multiplier

# Monthly maintenance cost
monthly_maintenance_cost = (gpu_cost * maintenance_annual_percentage) / 12

# Total monthly cost
total_monthly_cost = monthly_power_cost + monthly_cooling_infrastructure_cost + monthly_maintenance_cost
print(f"Total monthly cost per H100: {total_monthly_cost:.1f} USD")
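To double-check the fleet-wide totals, the per-GPU figures can be scaled to the 350k H100 GPUs Meta plans to deploy. This is only a rough sketch: the constant names are my own, and the inputs are simply the price and the GPT-4 estimate quoted above, not official numbers from Meta.

```python
# Assumed inputs, taken from the figures quoted in this issue
GPU_COUNT = 350_000                # planned number of H100 GPUs
GPU_PRICE_USD = 30_000             # assumed purchase price per H100
MONTHLY_COST_PER_GPU_USD = 106.9   # GPT-4's monthly running cost estimate

# One-time purchase cost for the whole fleet
purchase_cost_usd = GPU_COUNT * GPU_PRICE_USD

# Recurring monthly running cost for the whole fleet
fleet_monthly_cost_usd = GPU_COUNT * MONTHLY_COST_PER_GPU_USD

print(f"Purchase cost: {purchase_cost_usd / 1e9:.1f} billion USD")
print(f"Monthly running cost: {fleet_monthly_cost_usd / 1e6:.1f} million USD")
```

Under these assumptions, the purchase comes to 10.5 billion dollars and the running costs to roughly 37.4 million dollars per month.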

Zuckerberg also highlighted the progress in developing new AI-centric computing devices like Ray Ban Meta smart glasses.
My take: This announcement by Zuckerberg is a clear indicator of Meta's strategic shift from the creation of the Metaverse towards AI, at least in the short run, since you could argue that the advances in AI will support the realization of the Metaverse. The focus on training Llama 3, combined with the massive investment in compute infrastructure, demonstrates Meta's commitment to leading in AI. The open-source approach aligns with my vision for democratized AI, in contrast to the scenario where the most advanced AI is controlled by one or a few companies.

🤖📞 The New Era of AI Deepfake Calls: Are You Prepared?

A recent incident involved an AI-generated robocall impersonating President Joe Biden. This call, aimed at voter suppression, was almost certainly created with AI, according to experts. Despite sounding like Biden, the tone was unnatural, a common tell in today’s AI-generated audio.
As of today, AI voice generation tools are easily accessible. They require minimal audio samples to replicate voices, posing a significant risk for misinformation and impersonation.
ElevenLabs, so far the most advanced AI voice technology company, recently secured $80m in Series B funding. This investment will further strengthen their position in voice AI research and product development.
As AI voice technology is constantly improving and identifying deepfakes is increasingly challenging, the Australian professor and AI educator Jeremy Nguyen raised the following question on 𝕏:
Have you had the talk with your parents: —what to do when "you" call them up and say you're stuck in a foreign airport and need them to wire "you" some money?
To prepare for the next generation of AI deepfake calls, you should think about two-factor authentication to identify yourself and others. This means that instead of relying only on the voice of whoever calls, you ask questions that only the real person can answer. The safest way to do so is to define a passphrase that is only used in a real emergency.
Sketch visualizing the two-factor authentication to identify persons. [source]
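For readers who like to think in code, the two-factor idea can be expressed as a tiny check. This is only an illustrative sketch: the function names and the example passphrase are made up, and in real life the "check" happens in a conversation, not in software. The point is that a familiar voice alone never counts as proof.

```python
import hmac


def normalize(phrase: str) -> bytes:
    """Ignore case and extra whitespace so small slips do not matter."""
    return " ".join(phrase.casefold().split()).encode("utf-8")


def caller_is_verified(voice_sounds_familiar: bool,
                       spoken_phrase: str,
                       agreed_phrase: str) -> bool:
    """Both factors must hold: a familiar voice AND the agreed passphrase."""
    phrase_matches = hmac.compare_digest(
        normalize(spoken_phrase), normalize(agreed_phrase)
    )
    return voice_sounds_familiar and phrase_matches


# Hypothetical family passphrase, agreed on in person beforehand
AGREED = "purple giraffe"

print(caller_is_verified(True, "Purple Giraffe", AGREED))   # voice and phrase match
print(caller_is_verified(True, "I forgot it", AGREED))      # cloned voice, no phrase
```

A cloned voice passes the first factor easily, so the passphrase is what carries the decision.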
My take: The advancements in AI voice technology, especially by companies like ElevenLabs, are impressive yet alarming. They highlight the urgent need for enhanced privacy and security measures. The fake Biden call is a stark reminder of how AI can be misused, and from my point of view, it will not take long until it becomes impossible even for AI audio detection tools to distinguish real from fake voices. You should prepare yourself and your family by agreeing on codewords that are only used in emergencies.

🎨✨ Unleash Creativity: Top 3 Free AI Art Apps

If you want to create AI artworks, Midjourney is likely still the best option, but it has no free tier and requires a subscription. If you only occasionally generate an image or two, there are alternatives that also create visually pleasing results and can be used for free.
To test and compare the results of various applications, I used the framework named Dynamic Prompting from @LinusEkenstam which contains six elements to create the prompt stated below:
- Subject - The focus of the image
- Acting - What the subject is doing/feeling/emoting
- Style - Features of the subject including clothing, ethnicity, eye color, hair color, etc.
- Place - Location of the image, also the year or era can be added to impact style
- Time of Day - Exactly as it sounds, but also helpful to tweak the lighting
- Photographic Equipment - The fancy details of film, lens and camera settings that offer fine control
A portrait of a beautiful red-headed woman with freckles on her face, in the winter park at dawn, uhd image, hyper-realistic, half face shot
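The six elements can also be assembled programmatically, which is handy when you want to swap out a single element and compare results. The following is a minimal sketch and not part of the original framework: the class name, the field names and my assignment of the prompt fragments to the elements are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class DynamicPrompt:
    """The six Dynamic Prompting elements; empty elements are skipped."""
    subject: str
    acting: str = ""
    style: str = ""
    place: str = ""
    time_of_day: str = ""
    photographic_equipment: str = ""

    def render(self) -> str:
        # Join the non-empty elements in their declared order
        parts = [getattr(self, f.name).strip() for f in fields(self)]
        return ", ".join(p for p in parts if p)


prompt = DynamicPrompt(
    subject="A portrait of a beautiful red-headed woman",
    style="with freckles on her face",
    place="in the winter park",
    time_of_day="at dawn",
    photographic_equipment="uhd image, hyper-realistic, half face shot",
)
print(prompt.render())
```

The rendered text is close to the prompt above. Changing only `time_of_day` or `place` then gives you a controlled way to see how each element influences the image.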
Microsoft Designer uses the DALL-E 3 model from OpenAI, which can also be used as part of the paid ChatGPT Plus subscription. Both versions generate only one interpretation of your prompt and do not offer beginner-friendly options to alter or continue working with your generation. However, Microsoft Designer includes tools that let you remove objects or the image background, as well as expand the latter.
Example of an image generation in Microsoft Designer.
The result of Adobe Firefly looks good at first glance, but the gaze direction appears inconsistent in most generations, so the generated persons appear to squint. This is almost absent in the upper right creation, which I really like. Overall, the sign-up and the free image generations (up to 100 per month) with Adobe Firefly were a good and smooth experience. The sidebar offers many options that I did not touch for my generation, as they were mostly already covered by the prompt itself. In contrast to Microsoft Designer, you get four interpretations of your prompt, which can be a big advantage. The options in the right sidebar are also a plus, as they help you experiment without expert knowledge or a lot of experience.
Example of an image generation in Adobe Firefly, resulting in four interpretations.
Leonardo AI is a tool I have used since its early days, when it focused on generating assets for computer games. It builds on the freely available Stable Diffusion models but also adds fine-tuned model versions. In the free tier, you currently get 150 tokens per day. However, depending on the mode, one generation, which results in four interpretations, can already cost up to 21 tokens, so at that rate the daily allowance covers about seven generations. Besides image generation, Leonardo offers tools to upscale your generations and extend the images, and it even has a live generation mode that updates the generated image while you edit your prompt.
Example of three image generations in Leonardo AI, each resulting in four interpretations.
My take: By offering these powerful tools for free, Microsoft, Adobe and Leonardo make it possible for anyone to explore their artistic potential, regardless of their background or experience level.

Disclaimer: This newsletter is written with the aid of AI. I use AI as an assistant to generate and optimize the text. However, the amount of AI used varies depending on the topic and the content. I always curate and edit the text myself to ensure quality and accuracy. The opinions and views expressed in this newsletter are my own and do not necessarily reflect those of the sources or the AI models.

This publication is free. If you would like to support me, please recommend this newsletter to anyone you think would enjoy it!

Aidful News
