I Created a Podcast Episode with A.I.

Updated: Oct 5, 2021



What if I told you you could type a script, have a computer read it, and within seconds, have the perfect voiceover in any accent you want... for free?


Would it sound natural? Real? Could it pass the uncanny valley? Or would it all just be a big waste of time?


Jump to the bottom to watch/listen to the full podcast episode video (And yes, I used AI to make that too)!

In a recent episode of the TDGR Podcast, we let you be the judge. Let's just say that when it comes to the implications of A.I. in digital marketing strategy, stay tuned, it's about to get deep!


Advancements in A.I.

With massive advancements in AI, text-to-speech technology has vastly improved in recent years. In some cases, it has improved to the point that voices produced by computers are virtually indistinguishable from human ones. This technology is by no means perfect, but it's pretty darn close.


So, what does this mean for content creators? As anyone who knows me, has read a past blog post, or has heard an episode of my podcast can tell you, I love to test new technologies before I recommend them or their applications to my clients. I'm very much a "practice what you preach" kind of guy.


With that in mind, I wanted to test the capabilities and practical applications of the technology by seeing if I could produce a full podcast episode using nothing more than text-to-speech technology. More importantly, I wanted to see if the episode would be something that people would actually listen to, or if the end product would sound like the podcast equivalent of nails on a chalkboard.


There are many reasons that someone might want to utilize this kind of technology. Perhaps you're a great writer, but not a big fan of having your voice recorded. You know you need content, but are not sure how to go about creating it. Maybe you need voiceover work for an explainer or marketing video on YouTube. Maybe you'd love to start a podcast, but don't have proper recording equipment and operate on a tight budget. I've often said that many blogs would make great podcast segments or even full episodes!


The practical applications for content creation and repurposing content really are endless.

I was able to create a true test of the effectiveness of the technology. I wanted to try different AI voices, inflections, accents, and more, and blend them seamlessly into one podcast segment for an episode. In fact, I wanted to make sure the entire episode, with the exception of my professional intro by Steve Zarro, was 100% created with text-to-speech.


Artistic image of an android to represent the power of AI to imitate humans

All of this was done simply by using a free tool at voicegenerator.io. Now, please note that voicegenerator.io is not a sponsor of this episode or my blog.


That said, I was able to create an engaging podcast episode rather quickly, simply by going to the website, typing out my script, choosing one of the 13 English voices provided, and hitting play. Each time I hit play, the segment was temporarily saved into a small "playlist". I then used my phone to record the audio directly from my laptop's speakers.


Some of the text samples fared better than others, but overall, I have to say, I was impressed with the application, especially for a free tool.

From this simple exercise in testing new technology, I was able to produce multiple pieces of valuable content, including a podcast episode, a blog, snippets for sharing on social media, video content, and more.


Like any tool, the power is not in the tool itself but in how it's used.


Let's go back to the 13 voices option for a moment. I should be more specific. The 13 voices are only the English options. In total, I counted about 75 different combinations of language, accent, and even gender.


This is an important factor because knowing your target audience is critical for any marketing effort. People respond best to others who remind them of themselves.


The human voice has played a key role in powerfully connecting with others for generations. From radio shows of yesteryear to podcasts, Clubhouse, Twitter Spaces, and other audio social platforms, voice marketing has continued to grow in popularity. This trend has shown no signs of slowing, so you can see exactly why voice generation and text-to-voice applications can be attractive.


With the ease of copy and paste into platforms like voicegenerator.io, I can see the appeal. I can't tell you how many times I have written a blog and thought, "God, this would make an awesome podcast episode... if only I had the time to record it!" Who knows, perhaps in future episodes of the TDGR Podcast I'll do just that!

A CGI image of a creepy droid

A word of warning, however...

The reason that voice is such a powerful medium is that it fosters connection. There is a uniqueness in every human voice that may never, and perhaps should never, be replicated.


People are starving for connection. It's in our nature to crave a sense of belonging, feel included, and be a part of a community, or something bigger. We want to connect with others. It's why story branding is so powerful!


By relying too heavily on new technology, we risk losing what makes us unique: our voice.

I believe that people seek to find a genuine connection to others, to that spark that makes your unique voice, in this case quite literally, yours.


Is simplification of the process worth risking the alienation of your audience?


Believe it or not, there may be a solution to this too. What if you could type out your script, and have the voice produced by AI be yours? With a product called Descript, you can.


I wish I could provide an example here, but, unfortunately, that feature requires a paid, higher-tier plan. At its core, Descript does the opposite of what we have discussed so far: it turns voice into text.


With Descript, you upload a voice sample or previous recording, and the AI transcribes it for you. This technology is great if you would like to create closed captions for a YouTube or Vimeo video or turn a podcast episode into a blog.


The product comes in a variety of pricing options including Free, Creator, Pro, and Enterprise. These options give access to unlimited voice-to-text, screen capture options, and even audio and video editing options.


The higher-priced options give access to something you may find particularly interesting, if not downright scary.


Super freaky image of a robot with a human face and hands

Your voice, automated.

At the Pro level and above, starting at $24 per month, you get access to a feature called Overdub. Overdub essentially produces AI voice deepfakes of nearly any voice... including yours.


That means that once you upload samples of your voice reading various scripts, the program can turn almost anything you type into voiceover using your voice. Obviously, the more samples you provide, the better it sounds, but this might be the easiest and most efficient way to turn blogs into podcasts and more. The free version even comes with a trial of Overdub with a 1,000-word vocabulary.


The Pro level also gives access to better audio transcription features, including AI technology that removes filler words such as "like," "um," and "you know," as well as up to 30 minutes of audiograms and other features.


So is it worth it?

You tell me. You can check it out for yourself at www.descript.com. How would you use this technology? What are your thoughts?


The only thing I can say for certain is that this technology is already in use.

Thanks to its ease of use and low production cost, AI voice technology is already being applied to many types of voice content that you're likely consuming, from podcasts to radio and voiceover ads. The future is now, and it's an exciting time for trendsetting and the application of technology!



 
Want to see if AI has an application in your marketing strategy? Let's talk!

