🤖 Generative AI Redefines Video Creation
Dear curious minds,
Welcome to the ultimate newsletter for those interested in Artificial Intelligence (AI) and Personal Knowledge Management (PKM).
In contrast to previous issues, this one includes many videos, added as links to YouTube. There's a chance the owners may remove them at some point, which would be unfortunate, because without these videos the advancements highlighted here are far less impressive.
In this week's issue, I bring to you the following topics:
Images in Milliseconds: SDXL Turbo
Text-to-Video: AI's Next Creative Frontier
Up-to-Date and Factual: PPLX Online LLMs
If nothing sparks your interest, feel free to move on; otherwise, let's dive in!
🖼️⚡ Images in Milliseconds: SDXL Turbo
SDXL Turbo by Stability AI introduces a groundbreaking one-step image generation technology.
It uses Adversarial Diffusion Distillation (ADD) for high-quality, real-time image synthesis, reducing the number of sampling steps needed for image creation from 50 to just one.
SDXL Turbo is exceptionally fast, generating a 512x512 image in 207ms on an A100 GPU.
You can try a beta demo on Clipdrop or download the model from Hugging Face.
Available for personal, non-commercial use. For commercial use, contact Stability AI directly.
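If you want to try it locally, here is a minimal sketch using Hugging Face's diffusers library. I'm assuming the stabilityai/sdxl-turbo checkpoint and a CUDA GPU; the prompt is just an example. Thanks to ADD, a single sampling step without classifier-free guidance is enough:

```python
# Minimal SDXL Turbo sketch (assumptions: diffusers + torch installed,
# a CUDA GPU, and the stabilityai/sdxl-turbo checkpoint from Hugging Face).
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
)
pipe.to("cuda")

# One sampling step and no classifier-free guidance (guidance_scale=0.0),
# which is the intended setting for the distilled model.
image = pipe(
    prompt="a watercolor sketch of a lighthouse at dawn",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("lighthouse.png")
```

Because each generation takes only a fraction of a second, you can put this in a loop or behind a simple UI and tweak the prompt interactively.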
My take: The impressive speed and quality of SDXL Turbo's image generation are a game-changer. It empowers us to iterate on our visual creations quickly, refining our ideas until they precisely match our vision. We're no longer settling for 'nearly right' images; we can keep evolving a concept until it is the perfect representation of what's on our minds.
📝🎥 Text-to-Video: AI's Next Creative Frontier
This month, November 2023, has seen remarkable progress in AI's ability to transform text into video, with several exciting releases in this cutting-edge technology.
RunwayML's Gen-2 received a rather quiet release (X post) on November 2, which brought major improvements in the fidelity and consistency of its video results.
On November 16, Meta AI released their text-to-video model named Emu Video, alongside an image manipulation tool named Emu Edit. So far, you can only try it with a limited set of demo examples.
Stability AI released their first text-to-video AI, Stable Video Diffusion, on November 21 under a research-only open-source license (see the sketch after this list).
Pika AI released their Pika 1.0 on November 29 with the ability to edit videos in various styles like 3D animation, anime, and cinematic.
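Of these releases, Stable Video Diffusion is the one you can already run yourself. One caveat: the released checkpoints animate an input image, so a text-to-video workflow typically generates a still first (for example with SDXL Turbo above). Here is a minimal sketch with the diffusers library, assuming the stabilityai/stable-video-diffusion-img2vid-xt checkpoint and a GPU with plenty of VRAM:

```python
# Minimal Stable Video Diffusion sketch (assumptions: diffusers + torch,
# a large-VRAM GPU, and the stable-video-diffusion-img2vid-xt checkpoint).
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# The model animates a conditioning image, e.g. one created with SDXL Turbo.
image = load_image("lighthouse.png").resize((1024, 576))

frames = pipe(image, decode_chunk_size=8).frames[0]
export_to_video(frames, "lighthouse.mp4", fps=7)
```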
My take: These advancements by RunwayML, Meta, Pika and Stability AI signify a major leap in AI-driven video creation. The potential for AI in video creation is vast, offering exciting opportunities for creativity, efficiency, and new forms of storytelling. As we move forward, it's essential to navigate these developments with an eye towards responsible and privacy-conscious use of AI.
🌐🔎 Up-to-Date and Factual: PPLX Online LLMs
Perplexity AI introduces two new models, pplx-7b-online and pplx-70b-online, designed to deliver helpful, up-to-date, and factual responses. These models are available through the pplx-api and their LLM playground, Perplexity Labs. By leveraging knowledge from the internet, they can respond to time-sensitive queries with current information. This involves building on top of open-source base models, in-house search technology, and regular fine-tuning for enhanced performance.
Perplexity's evaluation of these models focused on helpfulness, factuality, and freshness. The models were tested against realistic use cases, with human evaluators preferring their responses over gpt-3.5 and llama2-70b-chat.
The pplx-api, which hosts these models, is transitioning from beta to public release, with a new usage-based pricing structure and infrastructure built for fast inference.
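If you want to experiment with the online models yourself, here is a minimal sketch of a request against the pplx-api. I'm assuming the OpenAI-style chat-completions endpoint that Perplexity documents and an API key exported as PPLX_API_KEY; the question is just an example:

```python
# Minimal pplx-api sketch (assumptions: the requests library, an API key in
# the PPLX_API_KEY environment variable, and the OpenAI-compatible
# chat-completions endpoint documented by Perplexity).
import os
import requests

resp = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"},
    json={
        "model": "pplx-70b-online",
        "messages": [
            {"role": "user", "content": "What happened in AI this week?"}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```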
My take: ChatGPT with browsing already showed us that drawing on up-to-date and accurate information greatly improves the answers to many questions. Kudos to Perplexity AI for launching a competitor that is blazing fast. However, the presented comparison might be somewhat skewed, as they didn't include a GPT-4 model with web-search capabilities, which could have offered a different perspective. I tried their models in the playground linked above, and the biggest pain point is the lack of direct links to sources in the responses, which undermines the trustworthiness of the information provided. This should disappear once the new models are integrated into their main site.
Disclaimer: This newsletter is written with the aid of AI. I use AI as an assistant to generate and optimize the text. However, the amount of AI used varies depending on the topic and the content. I always curate and edit the text myself to ensure quality and accuracy. The opinions and views expressed in this newsletter are my own and do not necessarily reflect those of the sources or the AI models.