🎹 Music + 📖 Fiction + 📣 Marketing

Using Apple Books AI Based Digital Narration Tool For Audiobooks

When I was self-publishing novels with Amazon Kindle Direct Publishing, some people asked if I’d ever have audiobooks created. I always laughed: at a typical production cost of $6,000 to $10,000, it was way beyond my budget. Of course, I really did want to create an audiobook.

I even dabbled with the idea of recording it myself in a padded closet at home, but didn’t want to deal with listening to my voice or investing in the proper microphone. Plus, my house is always loud with kids and cats, and slamming doors and cabinets. I’d have to record at 3AM nightly to get it recorded in silence. (Of course, there is AI denoising software, but…)

Years ago, we routinely rented Harry Potter audiobooks on CD from the Abington Township Library. My son loved to listen to them on long travels. They were fantastic, as different talents represented unique characters for the dialogue parts. It really brought the novels to life.

Fiverr and ElevenLabs To The Rescue?

Be Home By Dinner Audiobook

As I explored Fiverr.com, I realized some freelancers could produce the novel for about $5,000, but it was still too expensive. Also, the work of creating a character list of how to pronounce the many names and settings seemed challenging to do over a Zoom meeting or many emails.

ElevenLabs.io, which I’m really fond of, was a potential solution. But the word count would require the $330 / month plan. It would also require respelling uncertain words phonetically. For example, a pizzeria called Rosario’s would need to be re-typed as Rose-air-ee-ohs so that the AI tool could pronounce the word correctly.
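Respelling by hand across a whole novel is tedious, so here is a minimal Python sketch of how it could be automated before pasting text into a text-to-speech tool. Everything in it is an assumption for illustration: the RESPELLINGS table and the apply_respellings function are hypothetical names, not part of ElevenLabs or any other product, and the respellings themselves are just the examples from this post.

```python
import re

# Hypothetical respelling table: the entries are illustrative examples only.
# Build your own from your character names and settings.
RESPELLINGS = {
    "Rosario's": "Rose-air-ee-ohs",
    "Kova": "Koe-Vah",
}


def apply_respellings(text: str, table: dict[str, str]) -> str:
    """Replace each whole-word match with its phonetic respelling."""
    # Normalize curly apostrophes so "Rosario’s" and "Rosario's" both match.
    text = text.replace("\u2019", "'")
    # Try longer names first so a short name never wins inside a longer one.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)


if __name__ == "__main__":
    print(apply_respellings("Kova walked into Rosario’s.", RESPELLINGS))
```

Keeping the table in one place also means every occurrence of a name gets the same respelling, and the original manuscript stays untouched because the function returns a new string.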

A Free Solution?

For ebook syndication, I use Draft2Digital, which I recently discovered is an approved partner of Apple Books, and allows authors to generate audiobooks using Apple Books AI digital narration for FREE! All that you have to do is pick out the ideal voice to represent your novel and let the AI tool spend a few weeks with your ebook. It’s currently restricted to categories of romance, fiction, mystery and thriller, or science fiction and fantasy.

When I received notification that my novel Be Home By Dinner was published on Apple Books, I smiled because I knew that there was no way in hell that the AI tool could possibly have nailed the dozens of character names, locations, 1980s pop culture references and more. I wondered how it made a leap of faith in determining pronunciations. Why was I even participating in this? Was I just helping the Apple Books AI tool get smarter at my own expense?

On initial listen, I enjoyed the character voice that I had chosen. He was suitable for suspense, which is the genre of the novel. But… the “audiobooks without the overhead” definitely had its fair share of issues.

What Fell Apart With The AI Produced Audiobook

Character Names

I figured this would happen. But some character names, like Kova (the antagonist) took on a different pronunciation at different parts of the book. Sometimes it was Koo-Vah. Other times it was Kah-Vah. And sometimes it was the correct Koe-Vah. The name kept morphing, as if the AI narrator couldn’t agree on what to call this character, which I found odd since it’s a simple 4-letter name.

Author Name

Yes, even my name was butchered. Instead of stating Franke with a silent “E”, they included a hard “E”, like Frankie Goes To Hollywood. I felt like I was back in high school during roll call with a new teacher.

What I Miss About A Human Narrator

Mouth Noises

Yes, it sounds weird and gross, but I missed sounds of the human element. Fake breath noises are not part of the AI equation yet, let alone lip smacking or air sneaking through teeth. The AI voice is a bit dry and sterile, with a clockwork tempo. At times you want to rattle the robot and have it take a shot of whiskey to loosen up and expand its range.

Be Home By Dinner Audio Book

Ambient Sounds

The AI voice is precise with perfect audio levels. But I miss the sounds of the room, like pages being turned or a glass of water being put down on a wooden table. The impurities of recordings are often the most endearing. The singer, Sting, accidentally sat on the piano during the recording of the song “Roxanne”, for example. The clang of piano keys was recorded, and The Police kept that in the song. I remember listening to a lot of the Beat Generation authors perform readings and hearing the cigarette exhalations and ice cubes tinkling in glasses, cars whizzing by, or uproarious laughter of someone nearby. It was more vulnerable and electric.

Lively Dialogue

When I read a novel, I create a dialogue voice for each character. I imagine most people do. It just happens naturally to help break up the reading. With AI narration, the voice adjusts a bit with a conversation between two people, but it sounds like a screenplay read by someone vaguely interested in auditioning for a part in a film adaptation. The emphasis is not as strong, especially for highly emotional scenes of distress, even with multiple exclamation marks or ALL CAPS.

Correctly Pronounced Words

For heteronyms, the AI tool seemed to work based on a coin toss. For example, the word “tearing” was supposed to be pronounced like “eyes tearing up”, but it was pronounced like “tearing up a piece of paper”. The correct context was picked up by the AI tool sometimes, but not always.

Onomatopoeia can be a bit of a train wreck. For example, the “psss psss psss” cat call sounds resulted in the narration spelling out each instance of these phrases. I laughed hard on that one. “Shhhh” was known though.
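One way to catch words like these before handing a manuscript to a narration tool is a quick scan for likely trouble spots. The following is a minimal Python sketch under stated assumptions: the HETERONYMS watch list is a small hypothetical sample, the triple-letter check is only a rough guess at onomatopoeia, and flag_risky_words is an illustrative name that has nothing to do with Apple Books or Draft2Digital.

```python
import re

# Hypothetical watch list of common English heteronyms; extend it for your book.
HETERONYMS = {"tear", "tearing", "read", "lead", "wind", "wound", "bow", "bass", "live"}

# Rough heuristic: the same letter three times in a row ("psss", "shhhh").
TRIPLE_LETTER = re.compile(r"([a-z])\1\1")


def flag_risky_words(text):
    """Return (line_number, word, reason) for each word worth a listen."""
    flags = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in re.finditer(r"[A-Za-z]+", line):
            word = match.group(0)
            lowered = word.lower()
            if lowered in HETERONYMS:
                flags.append((line_no, word, "heteronym"))
            elif TRIPLE_LETTER.search(lowered):
                flags.append((line_no, word, "possible onomatopoeia"))
    return flags


if __name__ == "__main__":
    sample = "Her eyes were tearing up.\nPsss psss psss, she called. Shhhh!"
    for line_no, word, reason in flag_risky_words(sample):
        print(f"line {line_no}: {word} ({reason})")
```

The scan cannot tell which pronunciation is intended. It only produces a checklist of spots to listen to once the narration comes back.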

Review Process?

An audio file that could be annotated would be the simplest solution, with a section that allowed authors to spell out the pronunciations of misspoken words. The file could be updated and the process automated until a green “Approved!” button is pressed. Maybe in the future?

Regardless of issues, I’m excited for the opportunity to have an audiobook at the ready for Be Home By Dinner. Check it out on Apple Books.

Looking to publish your own audiobook? Here’s how you can get started with your own Apple Books audiobook.

Creating Voice Narration With ElevenLabs Synthesis

What is ElevenLabs?

ElevenLabs specializes in creating natural-sounding speech synthesis and text-to-speech software, using AI and deep learning. 

The simple-to-use web software allows you to enter your script and select from over two dozen voice personalities. Each voice has separate settings that can adjust Stability (more variable to more stable), Similarity (low to high) and Style Exaggeration (none to exaggerated). Nudging each of these settings a bit can produce very different outcomes. Sometimes words are over-enunciated or extended in length, and sometimes a nervous chuckle gets thrown in. But the results are highly realistic and ideal for short passages such as a voicemail greeting or video narration.

ElevenLabs

Over the holidays, there was a spot-on “Santa” character that was as boisterous and jolly as you can get.

Voice Options

The Voice Library provides user-adjusted voices that you can add to your default voices. Descriptions really help with finding the most suitable voice. For example: “Middle Aged Man With British Accent”. Tags feature attributes such as wise, calm, sassy, formal, intense, modulated and pleasant, making it easier to find the ideal voice.

There’s also a multilingual speech model that’s able to generate life-like speech in 29 languages.

How Much Does ElevenLabs Cost?

Five different plans are available, starting at $0 and maxing out at $330 per month. The higher tiers offer more features such as better quality, voice cloning, analytics and support.

What Does It Sound Like?

Here’s an example of the “back cover” description for BLIZZARD 96 using the character “Charlotte”:

Utilizing Ideogram AI To Create Custom Stock Photography

Sometimes you need a custom stock photo image. The stock image websites, like Unsplash, Pixabay and Pexels are excellent free sources. But when you have something super specific in mind, Ideogram.ai is one of many tools that could do the job for you.

Here are some leasing consultants giving tours to prospects, created with Ideogram:

What Is Ideogram?

Unlike many AI content creation sites, Ideogram.ai has a minimalist UI, with just a small question mark on the bottom right that links out to a few pages. It works like a social media account and feels very similar to another famous social media site that ends in “gram”. You can choose whether you want your creations to be private or public, gain followers, “like” other posts and remix creations from other accounts.

With your Google or Apple account, you can create an Ideogram account and then “Describe what you want to here”. From there, you can fine tune your aspect ratio and add stylistic effects such as:

  • Photo
  • Illustration
  • 3D Render
  • Typography
  • Cinematic
  • Poster
  • Dark Fantasy
  • Graffiti
  • Architecture
  • Conceptual Art
  • Ukiyo-e (I had to look this one up: It’s a Japanese art genre that includes woodblock prints and paintings from the 17th to 19th centuries.)
  • and more!

Creating A Fictional Apartment Video With Runway

A fictional apartment community teaser made with Runway. The fire pits are extreme, and dog heads morph into cats, but it’s an aesthetic that keeps getting refined. Just watch out for those expanding donuts.

Yes, that’s supposed to be a dog on that guy’s lap. It’s getting better, but pets seem to be on the eerie side so far.

What Is Runway?

“Runway was founded by artists on a mission to bring the unlimited creative potential of AI to everyone, everywhere with anything to say. Beyond our innovative technology and creative tools, we also strive to create platforms and initiatives that will empower and celebrate the next generation of storytellers.”

from RunwayML.com

Check out a free trial version of Runway and you can create these types of content, and more, based on descriptive queries:

  • Video to Video
  • Text / Image to Video
  • Generative Audio
  • Background Removals
  • Text to Image
  • Image to Image
  • Infinite Image

There are many other video effects tools as well.

Baron Ryan – TikTok Creator

Baron Ryan is an entertaining TikTok creator who is breaking boundaries on what you can achieve with one person and an iPhone. His sketches seem inspired by the Wes Anderson cinematic aesthetic, and he packs social commentary with humor that will make you ponder.

This video of him having a contemplative chat with his reflection in a train window is a great example:

I sometimes close my eyes and imagine a hotel and behind every door are all the people I could have been and lives I could have lived, infinite beginnings.

I’m already nostalgic for people I’ll never be.

But in the end, you gotta have mercy on yourself, even if you don’t do everything you’ve set out to do in life.

At first you want your dreams to be real. But you realize that some dreams are nicer as dreams. They can be there and you open those doors and live those lives anytime you like on the other side.

Besides, a memory and a fantasy both live in the head.

Baron Ryan
@americanbaron

All the people we could have been. Life could have turned out any which way and every which way comes with its own cocktail of inconveniences. The people we could have been look at us and imagine they could have been us. #loop #life #nostalgiacore

The Creative Act: A Way Of Being

If you’re in a rut with any creative endeavor, or need a new vantage point, this book is an excellent compass. It is packed with micro-chapters on every possible step of the journey to creating a song, painting, novel, design, web site, or anything else. It will help you realize that those eureka moments in the shower, the long gazes at inanimate objects, and the stumped lack of progress are all just part of the process. Great read, void of fluff, straight to the point: The Creative Act: A Way Of Being by Rick Rubin

The Creative Act: A Way Of Being, By Rick Rubin

Perplexity.ai vs. Google SGE

Perplexity.ai, touted as “where knowledge begins”, is a research engine that utilizes AI and natural language predictive text to answer questions. Similar to Google SGE, it cites sources (with direct links) and has a “follow up” chat section. Unlike Google though, there is a paid Pro version that allows you to tap into a “smarter AI and more Pro Search” and generate images for $20 / month. (I haven’t tried that.)

What’s welcome about Perplexity.ai is the clean interface and lack of ads. With your “ask anything” prompt, you can then choose a “Focus” setting of:

  • All (Search across the entire internet)
  • Academic (Search published academic papers)
  • Writing
  • Wolfram | Alpha (Computational knowledge engine)
  • YouTube
  • Reddit (Search for discussions and opinions)

Another cool tool is the Library, which stores all of your sources and allows you to tap back into them for the answers. You can also group your searches into Collections and then create Secret or Shareable links to them.

If you’re looking for a unique break from Google and want to give your searches a stronger collective purpose, the Perplexity.ai free version is worth a spin.

Fender Rhodes on Perplexity.ai

Using DALL-E To Create Graphics You’d Never Visualize In Photoshop

The theme of this year’s Pennsylvania Apartment Association APARTogether conference was AI. With the event taking place at Valley Forge Casino Resort, I figured it would be a great opportunity to use OpenAI’s DALL-E imagery and show George Washington leading a state-wide community of apartments, infused with a gambling theme!

Included with ChatGPT 4.0, DALL-E is one of many “text-to-image” AI tools that bring imaginative visions to life. Before these types of tools, if you asked me to create an illustrative graphic with a Founding Father standing amid a bustling array of apartment towers, I’d likely juggle several stock art photos, vectorize them and try to match the color, levels and saturation of each layer so that they looked cohesive.

Speaking of gambling, text-to-image tools feel like a slot machine at times, with your query as the bet and the results sometimes paying out fantastic rewards. Even if you don’t like the results, they often provide inspiration to create something even better.

Pennsylvania Apartment Association at Valley Forge Casino
Pennsylvania Apartment Association at Valley Forge Casino, near King of Prussia
Pennsylvania Apartment Association at Valley Forge Casino with Philadelphia in the background

Making Art In the Age Of Content

If you consider yourself an artist in any capacity, this video is for you: “Making Art In The Age Of Content”

Some highlights for me:

➡️ “The most modern form of art: Create something to serve the algorithms in an attempt to make it go viral.”

➡️ “There is no way to have a daily process that’s repeatable with guaranteed results, but these platforms encourage that and they force you to have higher output and open up the chances that something is going to take off.”

➡️ “The democratization of self expression is what turned everything into a competition.”

➡️ “No matter how much you put your heart and soul into something, it’s just as disposable as everything else on any given platform.”

➡️ “The algorithms exist, even though it’s not their intention, to continue to perpetuate a divisiveness between real artists and content creators.”

Death Of The Follower & The Future Of Creativity On The Web

This was a great presentation from Patreon CEO Jack Conte at SXSW. He speaks of striving for deeper connections as opposed to more connections, and of the relentless, chaotic pursuit of chasing algorithms within social channels. His history of social media at the beginning is fascinating on its own, especially if you lived through it in the late 90s and early 00s. So much has changed for the creative artist who is looking to promote their works.

The extreme “death of the follower” seems true on Facebook and Instagram. But on TikTok, the ability to toggle between followers and “For You” allows for two unique feeds offering the best of both worlds, and presents a powerful outlet for artists. Their works can seep into the “For You” feed, where well-suited content can potentially transcend their existing followers.

Focusing on gaining followers only can be a zero-sum game. What percentage of your followers are truly the ones that purchase and promote your brand, as opposed to giving the obligatory occasional “like”?

© 2024 Carl Franke