Adaptive Bitrate Estimation using Reinforcement Learning
Adaptive Bitrate (ABR) algorithms are central to delivering a good video streaming experience, and reinforcement learning has proved extremely useful in this domain. In this study, we propose the NANCY-FFE model, which uses an Asynchronous Advantage Actor-Critic (A3C) network with Follow then Forage Exploration to augment the entropy-based exploration in the Pensieve model. The model was trained on the 3G/HSDPA dataset and compared with six other ABR models by testing on the 3G/HSDPA and FCC trace datasets using three different Quality of Experience (QoE) metrics. On two of the three metrics, it showed an increase of up to 12.44%. It also takes less time to train. This is a promising advance in state-of-the-art reinforcement learning based adaptive bitrate algorithms.
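
To make the idea of a QoE metric concrete, here is a minimal sketch of a linear QoE score of the kind commonly used to evaluate ABR algorithms: total bitrate, minus a penalty for rebuffering time, minus a penalty for bitrate switches between consecutive chunks. The abstract does not say which three metrics were used, so the function name `linear_qoe`, its parameters, and the default weights are illustrative assumptions, not the study's actual definitions.

```python
from typing import Sequence


def linear_qoe(
    bitrates_mbps: Sequence[float],
    rebuffer_s: Sequence[float],
    rebuf_penalty: float = 4.3,   # assumed weight, not taken from the study
    smooth_penalty: float = 1.0,  # assumed weight, not taken from the study
) -> float:
    """Score one playback session: reward bitrate, penalize stalls and switches."""
    if len(bitrates_mbps) != len(rebuffer_s):
        raise ValueError("need one rebuffer time per video chunk")

    # Reward: sum of the bitrates chosen for each chunk.
    quality = sum(bitrates_mbps)

    # Penalty: total time the player spent stalled.
    rebuffering = rebuf_penalty * sum(rebuffer_s)

    # Penalty: magnitude of bitrate changes between consecutive chunks.
    switches = smooth_penalty * sum(
        abs(nxt - cur) for cur, nxt in zip(bitrates_mbps, bitrates_mbps[1:])
    )

    return quality - rebuffering - switches
```

For example, a three-chunk session at 1, 2, and 2 Mbps with a single 0.5 s stall and `rebuf_penalty=4.0` scores 5 - 2 - 1 = 2. An RL-based ABR agent would typically receive the per-chunk version of such a score as its reward.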