If you’re ready to add video streaming to your app, you need to understand transcoding, the technology that makes it a reality.
In this article, you will learn what transcoding is, what it means for livestreaming, and how you can get started building decentralized video streaming with Livepeer Studio.
Transcoding is the process of decompressing (or decoding) an encoded video file and recompressing it into a new format so that viewers with slower internet connections, limited bandwidth, and differently sized devices can watch a video at the highest quality possible without buffering.
Thanks to video transcoding, you get multiple versions of the same video in different sizes (i.e., resolutions) and qualities (i.e., bitrates, frame rates), each of which is optimized for viewers with different internet speeds and devices. This ensures that your video is viewable by anyone in the world with an internet connection, regardless of their location and viewing device.
To understand how transcoding works, we first need to take a step back and talk about encoding.
Encoding is the process in which your computer turns an uncompressed audio or video file into a compressed format that is small enough to store, transmit, and play back properly. Your computer compresses the video file with the help of an algorithm known as a “codec” (a portmanteau of the words “coder/decoder”).
As the name suggests, the codec is responsible for both encoding and decoding a video file — the former to compress it, the latter to watch it.
Encoding is the first step of a video streaming process. Once you have finished encoding your video file, you then need to make it compatible with every viewer's device and internet speed.
That's where transcoding comes into play.
To put it simply, you need video transcoding to reach more viewers with your content. Streaming a video directly from a video recording device without transcoding would prevent viewers with slow internet connections and low bandwidths from viewing the video properly.
Imagine you’re streaming a video using the MPEG2 format — common in broadcast television — with a resolution of 3840×2160 pixels (i.e., 4K UHD) with a bitrate of 13 Mbps.
Without transcoding it, viewers with slow internet connections and small devices wouldn't be able to load the video properly. Maybe they'd watch a few seconds uninterrupted at 4K, but soon after, they would have to wait a few seconds to load more video. The slower the connection, the more time it'd take, something that would ruin their viewing experience.
With the help of transcoding software, you'd simultaneously create different renditions of your video, each with its own bitrate, resolution, and frame rate, so that every viewer gets an uninterrupted stream that best suits their internet speed, bandwidth, and device size.
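As a rough sketch of how a set of renditions serves different viewers, the Python below defines a hypothetical rendition ladder and picks the highest-bitrate rendition that fits a viewer's measured bandwidth. The `Rendition` class, the ladder values, and the 80% headroom factor are illustrative assumptions, not Livepeer Studio defaults or the logic of any particular player.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int
    height: int
    bitrate_kbps: int


# Hypothetical ladder; the numbers are illustrative only.
LADDER = [
    Rendition("2160p", 3840, 2160, 13000),
    Rendition("1080p", 1920, 1080, 5000),
    Rendition("720p", 1280, 720, 2700),
    Rendition("360p", 640, 360, 800),
]


def pick_rendition(bandwidth_kbps: float, headroom: float = 0.8) -> Rendition:
    """Return the highest-bitrate rendition that fits within a share of the bandwidth.

    Falls back to the lowest-bitrate rendition when nothing fits.
    """
    budget = bandwidth_kbps * headroom
    candidates = [r for r in LADDER if r.bitrate_kbps <= budget]
    if not candidates:
        return min(LADDER, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)


print(pick_rendition(4000).name)   # 720p
print(pick_rendition(20000).name)  # 2160p
print(pick_rendition(500).name)    # 360p
```

A viewer on a 4 Mbps connection gets the 720p rendition here rather than stalling on the 4K source.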
Transcoding provides a win-win situation for publishers and viewers because:
Transcoding is a GPU-intensive process that changes several aspects of a video asset’s format at once. Three related but less resource-intensive processes that still benefit publishers and viewers are:
Transrating, which re-encodes a video at a lower bitrate without changing its resolution or format.
Transsizing, which reduces a video’s frame size without changing its content. For example, it takes a video with a 4K Ultra-HD resolution (3840×2160 pixels) and resizes it to 1080p (1920×1080 pixels) and 720p (1280×720 pixels).
Transmuxing, which repackages video and audio streams into a new container format without changing their content, bitrate, or size. For example, you may take a video in a Flash (.flv) format that uses the H.264 video codec and AAC audio codec and repackage it into a .mov file.
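To make the difference between the three processes concrete, here is a minimal sketch that builds an FFmpeg command line for each one. It assumes FFmpeg is installed with the `libx264` encoder and that the source video is already H.264; the file names and the helper function names are hypothetical. The functions only build argument lists, so nothing is executed until you pass one to `subprocess.run`.

```python
import shlex
from typing import List


def transrate_cmd(src: str, dst: str, video_bitrate: str = "2M") -> List[str]:
    """Re-encode the video at a lower target bitrate; resolution is unchanged."""
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-b:v", video_bitrate,
            "-c:a", "copy", dst]


def transsize_cmd(src: str, dst: str, height: int = 720) -> List[str]:
    """Scale to the given height; -2 keeps the aspect ratio with an even width."""
    return ["ffmpeg", "-i", src, "-vf", f"scale=-2:{height}",
            "-c:v", "libx264", "-c:a", "copy", dst]


def transmux_cmd(src: str, dst: str) -> List[str]:
    """Copy the existing streams into a new container without re-encoding."""
    return ["ffmpeg", "-i", src, "-c", "copy", dst]


for cmd in (
    transrate_cmd("input.mp4", "low_bitrate.mp4"),
    transsize_cmd("input.mp4", "720p.mp4"),
    transmux_cmd("input.flv", "output.mov"),
):
    print(shlex.join(cmd))
```

Only the transmuxing command avoids re-encoding (`-c copy`), which is why it is the cheapest of the three.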
When it comes to streaming video online, you will have to decide which qualities and sizes you want to transcode to. The higher the quality and the larger the size, the more it will cost you.
But to do that, you first need to understand the three factors that affect a video's quality and size: the frame rate, the resolution, and the bitrate.
The frame rate is the number of images (or frames) that appear in one second of your video. The higher the frame rate, the smoother your video looks, but the larger the file size and the more storage you need for the video.
The movie and gaming industries typically use 24 fps and 30 fps respectively, while high-definition (HD) video often uses 60 fps, the maximum that many monitors and TVs can display.
The resolution is a measurement that represents the number of horizontal lines a video has from top to bottom. It's measured in pixels (px), the single "points" that make up a picture. Some of the most common video resolutions are:
A video with a 720p resolution is made up of 720 rows of pixels and 1280 columns. For comparison, a video with a 360p resolution has half as many rows and half as many columns (640×360), which is a quarter of the pixels. Consequently, the latter will look noticeably less sharp than the former and should be viewed on a smaller screen to achieve the same perceived video quality.
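The arithmetic behind that comparison can be checked in a few lines. This sketch assumes standard 16:9 frames, so 720p is 1280×720 and 360p is 640×360.

```python
def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in one frame."""
    return width * height


hd = pixel_count(1280, 720)  # 720p frame
sd = pixel_count(640, 360)   # 360p frame

print(hd, sd, hd / sd)  # 921600 230400 4.0
```

Halving both dimensions leaves a quarter of the pixels per frame.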
The bitrate represents the number of bits (the smallest unit of computer data) that are processed in a specific amount of time. The average bitrate of a video file is its size in bytes multiplied by eight, divided by the playback time of the recording in seconds.
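The formula above is easy to express in code. The sketch below computes the average bitrate of a file; the 202.5 MB, 10-minute example is a made-up input chosen only to give a round result.

```python
def average_bitrate_bps(size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate in bits per second: bytes * 8 / seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return size_bytes * 8 / duration_seconds


# Hypothetical file: 202,500,000 bytes played back over 10 minutes.
bps = average_bitrate_bps(202_500_000, 600)
print(bps / 1_000_000, "Mbps")  # 2.7 Mbps
```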
Because bitrate follows from the file size, it is affected by the resolution, the frame rate, and the codec. For example, a YouTube video at 720p and 30 fps using the H.264 codec would have an average bitrate of 2.7 Mbps. The same video at 60 fps would have an average bitrate of 4.1 Mbps.
A higher bitrate will generally mean higher video quality and a larger file size, though the quality will depend on which codec is used in the video transcoding process. Newer codecs, like HEVC (H.265), have gotten much better at keeping bitrates low, which makes videos smaller and easier to transfer over the internet while keeping the video quality high.
One of the hardest problems to avoid when streaming content online is latency. Put simply, latency is the delay (or “lag”) between the moment the server that streams the content, generally a CDN, sends it and the moment viewers receive it.
A minor amount of latency should be expected in any transmission, because the time it takes for data to travel across the web adds up to a few seconds. But when the latency is high enough that the viewer is aware of the delay between what they view and what’s happening in real life, it can ruin the streaming experience.
The goal for your video workflow is to minimize latency while keeping your reach intact. With that goal in mind, how do you actually achieve it?
Traditionally, broadcasters handled transcoding by sending their video to an onsite or centralized media server that handled the process. Each rendition would be packaged into the delivery format and sent to the CDN edge, where the viewer would load the video to watch on their device.
The problem with this process is that once the broadcaster scales up the number of streams, it can quickly run into performance bottlenecks with the media server and transcoding system.
One way to overcome the scalability issue is to run multiple concurrent media servers and load-balance between them. But without proper management, one of the transcoding processes can fail in the middle of a stream, causing significant delays.
Newer solutions that leverage a decentralized transcoding network, like the one Livepeer Studio uses, have overhauled the way ingest servers handle incoming video streams.
First, the ingest servers break down the video into segments based on incoming keyframe intervals, which are then sent to a network of decentralized transcoding clusters. Each cluster can transcode over 100 video streams simultaneously, reducing the latency compared to traditional cloud networks.
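The segmentation step can be illustrated with a toy model. The sketch below represents a stream as a list of frame types and starts a new segment at every keyframe (`"I"`). It only illustrates the idea of cutting on keyframe boundaries and is not Livepeer's actual implementation, which works on encoded video data rather than letters.

```python
from typing import List


def split_on_keyframes(frames: List[str]) -> List[List[str]]:
    """Group frame types into segments, each starting at a keyframe ("I").

    Frames that arrive before the first keyframe form their own leading segment.
    """
    segments: List[List[str]] = []
    for frame in frames:
        if frame == "I" or not segments:
            segments.append([])
        segments[-1].append(frame)
    return segments


stream = list("IPPPIPPIPPP")
for index, segment in enumerate(split_on_keyframes(stream)):
    print(index, "".join(segment))
# 0 IPPP
# 1 IPP
# 2 IPPP
```

Because every segment begins with a keyframe, each one can be decoded on its own, which is what allows segments to be handed to different transcoders.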
Second, the ingest servers monitor the latency and performance of each transcoding cluster to make sure it never runs close to its maximum capacity, switching to a different cluster when performance starts to slow down.
Each transcoding cluster has layers of redundancy and failover, so if a failure happens, the work is quickly retried within the cluster. Should a performance bottleneck arise, such as congestion in a region, a different independent transcoder cluster can continue the work instantly.
Such an approach allows each ingest server to handle several streams simultaneously. Video streamers can scale their reach without fearing any bottlenecks in the middle of a broadcast.
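A simplified way to picture the cluster selection described above is shown below. The `Cluster` class, the 80% utilization ceiling, and the cluster names are hypothetical; the real network's scheduling logic is not described in this article, so treat this only as a sketch of the general idea of routing work away from unhealthy or nearly full clusters.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    name: str
    active_streams: int
    capacity: int
    healthy: bool = True

    @property
    def utilization(self) -> float:
        return self.active_streams / self.capacity


def choose_cluster(clusters: List[Cluster],
                   max_utilization: float = 0.8) -> Optional[Cluster]:
    """Pick the least-loaded healthy cluster below the utilization ceiling.

    Returns None when no cluster qualifies.
    """
    eligible = [c for c in clusters
                if c.healthy and c.utilization < max_utilization]
    return min(eligible, key=lambda c: c.utilization, default=None)


clusters = [
    Cluster("eu-1", active_streams=95, capacity=100),                 # near capacity
    Cluster("us-1", active_streams=40, capacity=100),
    Cluster("ap-1", active_streams=10, capacity=100, healthy=False),  # failed
]
chosen = choose_cluster(clusters)
print(chosen.name if chosen else "no cluster available")  # us-1
```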
If you want to transcode your live broadcasts on your own, you will need to set up a software solution (like FFmpeg) on a computer with a powerful GPU that can handle the incoming RTMP feed and adapt it to each resolution, frame rate, and bitrate. This entire process can quickly become expensive, challenging, and time-consuming.
A much more common approach is to use an Infrastructure as a Service (IaaS) solution like AWS or Cloudflare to transcode your content at scale. Although they can be highly adaptable and price-competitive, they require technical knowledge to implement.
Alternatively, Software as a Service (SaaS) solutions like Wowza or Dacast offer fixed plans based on the amount of content you need to stream. These companies are ideal for clients that need a simpler solution, despite charging higher hourly prices than their IaaS competitors.
Livepeer Studio offers a powerful alternative, with a unique approach that leverages the Livepeer network, an international network of distributed computers, to provide reliable video transcoding at prices 10x cheaper than public cloud transcoding providers.
Cloud services like AWS charge as much as $3 per hour of transcoding. However, these prices can vary depending on the resolution you wish to stream: the higher the resolution, the more you will pay.
In contrast, Livepeer Studio charges $0.30 per hour (or $0.005 per minute) of video ingested. We offer such low transcoding costs thanks to the use of our global network of mining farms (known as “orchestrators”). These orchestrators offer their unutilized mining equipment (such as their GPUs, CPUs, and bandwidth) to ingest and transcode video content in exchange for ETH and LPT.
Unlike AWS, we don't charge different prices for transcoding to different resolutions. Whether you want to offer a 4K resolution or 360p, the price will be the same, so all of your viewers around the world can enjoy a flawless experience.
The transcoders are rewarded for their work while video streamers receive their ingested and transcoded content seamlessly.
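To see what the prices quoted above mean for a real workload, here is a small cost comparison. The two rates come from this article ($0.005 per minute for Livepeer Studio and an upper-end $3 per hour for cloud transcoding); the 100-hour monthly volume is a made-up example, and actual cloud pricing varies by provider and resolution.

```python
LIVEPEER_USD_PER_MINUTE = 0.005  # rate quoted in this article
CLOUD_USD_PER_HOUR = 3.00        # upper-end cloud figure quoted in this article


def livepeer_cost(minutes: float) -> float:
    """Cost in USD of ingesting the given number of minutes."""
    return minutes * LIVEPEER_USD_PER_MINUTE


def cloud_cost(minutes: float) -> float:
    """Cost in USD at the quoted upper-end hourly cloud rate."""
    return minutes / 60 * CLOUD_USD_PER_HOUR


minutes = 100 * 60  # hypothetical workload: 100 hours of video per month
print(f"Livepeer Studio: ${livepeer_cost(minutes):.2f}")  # Livepeer Studio: $30.00
print(f"Cloud (upper end): ${cloud_cost(minutes):.2f}")   # Cloud (upper end): $300.00
```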
Transcoding is the process that allows content creators to stream video to any viewer, no matter their internet connection, device, or location. Without it, people wouldn't be able to watch your videos without suffering significant delays and quality issues.
The higher the frame rate, resolution, and bitrate at which you stream your video, the more expensive and complex your transcoding will be. Make sure to analyze these factors when you decide to use video streaming on your site.
Although traditional transcoding solutions can fulfill your needs, newer solutions like the one Livepeer Studio offers can provide the same results without sacrificing scalability, speed, or cost-effectiveness.