Hasdx and Stable Diffusion: Comparing Two AI Image Generation Models

Generating realistic images from text prompts is an exceptionally useful capability enabled by recent advances in AI. In this post, we'll compare two of the top text-to-image models available today — hasdx and Stable Diffusion — to better understand their strengths, differences, and ideal use cases.

First, some background. Both hasdx and Stable Diffusion leverage deep learning techniques to generate images that closely match the text descriptions provided by the user. This makes them invaluable for creators, designers, and businesses that want to quickly ideate visual concepts, create prototyping assets, or produce custom images and media.

Comparing LLMs for Chat Applications: LLaMA v2 Chat vs. Vicuna

AI language models have revolutionized the field of natural language processing, enabling a wide range of applications such as chatbots, text generation, and language translation. In this blog post, we will explore two powerful AI models: LLaMA 13b-v2-Chat and Vicuna-13b. Both are fine-tuned language models that excel at chat completion and have been trained on vast amounts of textual data. By comparing and understanding these models, we can leverage their capabilities to solve a variety of real-world problems.

Introducing LLaMA 13b-v2-Chat and Vicuna-13b

The LLaMA 13b-v2-Chat model, published by a16z-infra, is a 13-billion-parameter language model fine-tuned for chat completions. It provides accurate and contextually relevant responses to user queries, making it ideal for interactive conversational applications, and it can understand and generate remarkably human-like text.
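Before comparing the two models, it helps to know that Llama 2 chat variants expect their input wrapped in a specific prompt template of `[INST]` and `<<SYS>>` tags. Here is a minimal formatting helper, assuming the standard single-turn Llama 2 chat template (check the model's page for the exact format your deployment expects):

```python
# Minimal sketch of the Llama 2 single-turn chat template. This is an
# illustration of the commonly documented format, not an official helper.

def format_llama2_prompt(user_message: str,
                         system_prompt: str = "You are a helpful assistant.") -> str:
    """Wrap a system prompt and user message in the Llama 2 chat template."""
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"
```

The formatted string is what you pass as the model's prompt input; some hosted deployments apply this template for you, so check before double-wrapping.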

Hyper-Realistic Text-to-Speech: Comparing Tortoise and Bark for Voice Synthesis

Text-to-speech (TTS) technology has seen rapid advances thanks to recent improvements in deep learning and generative modeling. Two models leading the pack are Bark and Tortoise TTS. Both leverage cutting-edge techniques like transformers and diffusion models to synthesize amazingly natural-sounding speech from text.

For engineers and founders building speech-enabled products, choosing the right TTS model is now a complex endeavor, given the capabilities of these new systems. While Bark and Tortoise have similar end goals, their underlying approaches differ significantly.

A Complete Guide to Turning Text Into Audio With Audio-LDM

In today's rapidly evolving digital landscape, AI models have emerged as powerful tools that enable us to create remarkable things. One such impressive feat is text-to-audio generation, where we can transform written words into captivating audio experiences. This breakthrough technology opens up a world of possibilities, allowing you to turn a sentence like "two starships are fighting in space with laser cannons" into a realistic sound effect instantly.

In this guide, we will explore the capabilities of the cutting-edge AI model known as audio-ldm. Ranked 152 on AIModels.fyi, audio-ldm harnesses latent diffusion models to provide high-quality text-to-audio generation. So, let's embark on this exciting journey!
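To preview where we're headed, here is a minimal sketch of assembling a text-to-audio request for audio-ldm via the Replicate Python client. The model slug and input field names below are assumptions for illustration; confirm them against the input schema on the model's page:

```python
# Sketch of a text-to-audio request payload for audio-ldm. The field
# names (text, duration, guidance_scale) are illustrative assumptions;
# verify them in the model's input schema.

def build_audio_ldm_input(text: str,
                          duration: float = 5.0,
                          guidance_scale: float = 2.5) -> dict:
    """Assemble the input payload for a text-to-audio generation request."""
    return {
        "text": text,
        "duration": duration,            # length of the generated clip, in seconds
        "guidance_scale": guidance_scale,  # higher values follow the text more closely
    }

# Live call (requires `pip install replicate` and a REPLICATE_API_TOKEN
# environment variable); the slug is an assumption, verify it on the page:
#
#   import replicate
#   audio = replicate.run(
#       "haoheliu/audio-ldm",
#       input=build_audio_ldm_input("two starships are fighting in space with laser cannons"),
#   )
```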

How to Turn Images Into Prompts With the Img2Prompt AI Model: A Step-by-Step Guide

Have you ever come across a stunning image and wished you could instantly generate a captivating text prompt that matches its style? Look no further. In this guide, we'll explore an incredible AI model called "img2prompt," which generates approximate text prompts that align with the style of any given image. Whether you're an artist, a writer, or simply looking to explore the creative possibilities of AI, this model will change the way you approach image-to-text generation.

To kick things off, let's take a closer look at the img2prompt model on AIModels.fyi and understand how we can utilize this powerful tool to bring our imaginative ideas to life.
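As a quick taste of how simple the request is, here is a hedged sketch of an image-to-prompt call via the Replicate Python client. img2prompt takes a single image input, which the client accepts as either a public URL or an open file handle; the model slug and field name are assumptions to verify on the model's page:

```python
# Sketch of an img2prompt request payload. The single "image" field is
# an illustrative assumption; check the model's input schema.

def build_img2prompt_input(image) -> dict:
    """Payload for img2prompt: `image` may be a URL string or a file object."""
    return {"image": image}

# Live call (requires `pip install replicate` and REPLICATE_API_TOKEN);
# the slug is an assumption, verify it on the model's page:
#
#   import replicate
#   with open("my_image.png", "rb") as f:
#       prompt_text = replicate.run("methexis-inc/img2prompt",
#                                   input=build_img2prompt_input(f))
#   print(prompt_text)  # an approximate prompt matching the image's style
```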

Cleaning Up AI-Generated Images With CodeFormer: A Complete Guide

Sometimes, AI-generated photos come out a little bit… wonky. Maybe they’re low quality, or perhaps there are weird artifacts that make the image look less than perfect. But fear not! CodeFormer is here to save the day, helping you fix up those images in no time. In this guide, I’ll introduce you to the CodeFormer model, show you how it works, and explain how to use it to fix up a slightly warped AI-generated photo. I’ll walk you through the exact steps I used to clean up the weird image I got from another AI model, shown below:

This dude looks terrible and scary. Original generation from Arcane-Diffusion.

This image came from the arcane-diffusion model, which I was using for another blog post. I’ll show you how you can use the same workflow I followed to clean up your own generated images and even upscale them so they look better. I’ll do this walkthrough using the Replicate Python SDK, but Replicate also supports many other languages.
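To give a flavor of that workflow, here is a minimal sketch of the CodeFormer call with the Replicate Python SDK. The model slug and parameter names are assumptions for illustration; confirm them on the model's Replicate page:

```python
# Sketch of a CodeFormer face-restoration request. The parameter names
# (codeformer_fidelity, upscale) are illustrative assumptions; verify
# them in the model's input schema.

def build_codeformer_input(image, fidelity: float = 0.7, upscale: int = 2) -> dict:
    """Payload for a face-restoration request. Lower fidelity lets the
    model repair more aggressively; higher fidelity stays truer to the
    original face."""
    return {
        "image": image,                   # URL string or open file handle
        "codeformer_fidelity": fidelity,
        "upscale": upscale,               # also upscales the cleaned image
    }

# Live call (requires `pip install replicate` and REPLICATE_API_TOKEN);
# the slug is an assumption, verify it on the model's page:
#
#   import replicate
#   restored = replicate.run(
#       "sczhou/codeformer",
#       input=build_codeformer_input("https://example.com/warped-face.png"),
#   )
```

The output is typically a URL to the restored image, so you can feed it straight into a downstream upscaler or save it directly, which is exactly the chaining I'll walk through below.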