Fast Intra Mode Decision for HEVC

Ban Doan[0000−0003−0900−6284] and Andrey Tropchenko[000−0001−9812−7947]

ITMO University, Saint Petersburg, 197101, Russian Federation
bandoan@itmo.ru, aatropchenko@itmo.ru

Abstract. To achieve higher coding performance than previous video coding standards, High-Efficiency Video Coding (HEVC) adopts an intra prediction method with 35 modes, which incurs heavy computational complexity. To reduce this complexity, we analyzed how often each mode is selected and propose a scheme with two rough mode decision (RMD) stages: the first stage tests a customized set of modes, and the second stage evaluates a maximum of 4 additional modes. Compared to the default encoding scheme in the HEVC test model HM-16.20, experimental results show that the proposed method reduces encoding time by up to 22.74% with negligible loss of coding efficiency.

Keywords: HEVC/H.265 · Video compression · Intra prediction · Mode decision · Rate-distortion optimization.

1 Introduction

In recent years, there has been growing interest in services related to the transmission and storage of high- and ultra-high-definition video. The video coding standard H.264/Advanced Video Coding (AVC) [1], published in 2003, has been unable to meet those requirements, and the HEVC [2] video coding standard was introduced as one solution to the problem.

Mainly due to its new coding tools and flexible data structures, HEVC provides a significant improvement in compression efficiency over its predecessor H.264, especially when operating on high-resolution video content [3, 4]. Like older video compression technologies, HEVC is based on a hybrid block-based coding scheme that combines intra- and inter-frame prediction with transform coding of the residual data.

HEVC contains several elements that improve the efficiency of intra prediction over earlier solutions.
HEVC supports a total of 35 intra prediction modes: Planar, DC, and 33 angular modes, as presented in Figure 1, which help represent different textures and object edge directions more precisely [5]. Due to the significantly increased number of intra modes, more techniques are required to encode the mode efficiently, one of which is to divide the frame into segments called coding units (CU), prediction units (PU), and transform units (TU). The encoder needs to try all the combinations of CU, PU, and TU in the rate-distortion optimization (RDO) process to find the best mode with the lowest rate-distortion (RD) cost [5]. Such a process is very time-consuming.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Fig. 1. Intra prediction modes in HEVC

While an increase in the number of intra prediction modes can provide substantial performance gains, it also makes the RDO process more complex. To reduce the computational load of intra prediction, the official HM software [6] uses a fast encoding algorithm [5, 7, 10] with two phases that combine the RMD and RDO processes. First, all 35 modes are evaluated with respect to a cost function, and the N modes with the minimum cost $J_{\mathrm{SATD}}$ are selected as the most promising candidate modes:

$$J_{\mathrm{SATD}} = \mathrm{SATD} + \lambda_{\mathrm{pred}} \times R_{\mathrm{pred}} \qquad (1)$$

where SATD is the sum of the absolute values of the Hadamard-transformed residual signal for a PU, $\lambda_{\mathrm{pred}}$ is a Lagrange multiplier, and $R_{\mathrm{pred}}$ is the number of bits for the prediction mode. The number N depends on the PU size: N is set to {8, 8, 3, 3, 3} for 4 × 4, 8 × 8, 16 × 16, 32 × 32, and 64 × 64 PUs, respectively.

In the second step, the three most probable modes (MPM), which are derived from the intra modes of the left and top neighboring PUs [5], are added to the list of candidates [8, 9].
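
The first phase described above, that is, equation (1) followed by keeping the N cheapest modes, can be sketched as follows. This is only an illustration and not the HM implementation: the function names are hypothetical, the SATD is computed for a 4 × 4 block without the normalization an encoder would typically apply, and the per-mode costs are assumed to be given.

```python
# Illustrative sketch of the RMD cost in equation (1); all names are hypothetical.

# 4x4 Hadamard matrix (symmetric, entries +1/-1).
H4 = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]

# Number of RMD candidates N kept per PU size, as listed in the text.
NUM_CANDIDATES = {4: 8, 8: 8, 16: 3, 32: 3, 64: 3}


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def satd_4x4(original, prediction):
    """Sum of absolute Hadamard-transformed differences (unnormalized)."""
    residual = [[o - p for o, p in zip(orow, prow)]
                for orow, prow in zip(original, prediction)]
    # H4 is symmetric, so H4 * R * H4 equals H4 * R * H4^T.
    transformed = matmul(matmul(H4, residual), H4)
    return sum(abs(v) for row in transformed for v in row)


def j_satd(satd, lambda_pred, r_pred):
    """Equation (1): J_SATD = SATD + lambda_pred * R_pred."""
    return satd + lambda_pred * r_pred


def select_rmd_candidates(costs, pu_size):
    """Return the N modes with the lowest J_SATD (ties broken by mode index)."""
    n = NUM_CANDIDATES[pu_size]
    return sorted(costs, key=lambda mode: (costs[mode], mode))[:n]


# A flat residual of 1 concentrates all energy in the DC coefficient.
orig = [[6] * 4 for _ in range(4)]
pred = [[5] * 4 for _ in range(4)]
print(satd_4x4(orig, pred))  # 16
```

In an encoder the `costs` dictionary would hold one J_SATD value per tested mode, and the list returned by `select_rmd_candidates` would then be extended with the MPMs.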
The full RD costs, computed with the reconstructed residual signal used in the actual encoding process, are compared among those (N + 3) modes, and the prediction mode with the minimum RD cost is selected as the final prediction mode. The RD cost $J_{\mathrm{RDO}}$ for each intra mode is computed by:

$$J_{\mathrm{RDO}} = \mathrm{SSE} + \lambda_{\mathrm{pred}} \times R_{\mathrm{total}} \qquad (2)$$

where SSE is the sum of the squared errors between the original CU and the reconstructed CU, and $R_{\mathrm{total}}$ is the total number of bits used for encoding with this mode.

In this way, the RDO process has to check a maximum of only 11 modes instead of all 35, so the computational load is reduced. However, the complexity is still high: in the RMD step all 35 modes require a cost calculation, and the number of modes passed to RDO is still large. In this paper, a fast intra mode decision is proposed to further reduce the complexity of HEVC intra coding while maintaining the RD performance.

2 Analysis of mode selection probability

In principle, the 35 intra modes play the same role and could be expected to be chosen with equal probability. However, an analysis of videos of various categories gives a different picture. To gather statistics on frequently chosen modes, the HEVC reference software HM-16.20 was used to encode a set of video sequences of different classes and resolutions. Statistical results for the test sequences in class B and for the sequence "PeopleOnStreet" are shown in Tables 1 and 2, with the four most frequent modes shown in bold.

Table 1. Average distribution of intra prediction modes for B-class test sequences.
| Mode | Frequency, % | Mode | Frequency, % | Mode | Frequency, % |
|---|---|---|---|---|---|
| **Planar** | **23.31** | 12 | 1.95 | 24 | 2.02 |
| **DC** | **13.31** | 13 | 1.65 | 25 | 2.75 |
| 2 | 1.01 | 14 | 1.54 | **26** | **8.45** |
| 3 | 0.89 | 15 | 1.35 | 27 | 2.32 |
| 4 | 0.89 | 16 | 1.13 | 28 | 1.74 |
| 5 | 1.18 | 17 | 1.25 | 29 | 1.55 |
| 6 | 2.00 | 18 | 1.23 | 30 | 1.33 |
| 7 | 2.02 | 19 | 1.27 | 31 | 1.00 |
| 8 | 2.13 | 20 | 1.22 | 32 | 0.92 |
| 9 | 2.97 | 21 | 1.32 | 33 | 0.98 |
| **10** | **6.05** | 22 | 1.45 | 34 | 0.98 |
| 11 | 3.11 | 23 | 1.74 | | |

Figure 2 shows diagrams of the probability (P) with which specific prediction modes are chosen when encoding the test video sequences. The statistics show that the Planar and DC prediction modes are the most likely for all video sequences. In addition, the horizontal and vertical modes (Angular10 and Angular26) are much more likely to be the best choice than the other angular modes.

Table 2. Distribution of intra prediction modes for test sequence "PeopleOnStreet".

| Mode | Frequency, % | Mode | Frequency, % | Mode | Frequency, % |
|---|---|---|---|---|---|
| **Planar** | **20.27** | 12 | 1.05 | 24 | 3.09 |
| **DC** | **12.11** | 13 | 1.24 | 25 | 2.62 |
| 2 | 1.61 | 14 | 1.83 | **26** | **7.72** |
| 3 | 1.35 | 15 | 1.06 | 27 | 2.24 |
| 4 | 1.70 | 16 | 0.81 | 28 | 2.35 |
| 5 | 2.62 | 17 | 0.91 | 29 | 1.94 |
| 6 | 3.77 | 18 | 0.78 | 30 | 1.61 |
| 7 | 2.68 | 19 | 0.97 | 31 | 1.11 |
| 8 | 2.22 | 20 | 0.98 | 32 | 0.94 |
| 9 | 3.85 | 21 | 1.23 | 33 | 1.21 |
| **10** | **4.69** | 22 | 1.83 | 34 | 1.57 |
| 11 | 1.27 | 23 | 2.75 | | |

Fig. 2. Mode selection probability.

There has been significant work on speeding up the intra mode decision process. Based on the above analysis, this paper proposes a scheme that reduces the number of intra modes tested in RMD and hence increases the encoding speed. The scheme is described below.

3 Fast mode selection

Instead of testing all 35 modes, we prefer to test only the most probable ones. For this purpose, a set of modes was created that includes DC, Planar, and the angular modes (2 + 4i), where 0 ≤ i ≤ 8. As a result, only 11 modes are tested in the RMD process to find the modes with the lowest cost. We call the two best modes FM (first mode) and SM (second mode). A flexible second step (RMD2) is then added after the first. The input to RMD2 depends on which modes FM and SM are.
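
The first-stage mode set and the choice of FM and SM can be sketched as below. This is a minimal illustration under stated assumptions: Planar and DC are numbered 0 and 1 as in the HEVC mode numbering used in this paper, while `cost_of` stands for a hypothetical callback that returns the RMD cost of a mode, and the tie-breaking by mode index is an assumption as well.

```python
# Illustrative sketch of the first RMD stage; function names are hypothetical.

PLANAR, DC = 0, 1


def initial_mode_set():
    """Planar, DC and the angular modes 2 + 4i for 0 <= i <= 8 (11 modes)."""
    return [PLANAR, DC] + [2 + 4 * i for i in range(9)]


def pick_fm_sm(cost_of):
    """Rank the 11 modes by cost and return (FM, SM), the two cheapest.

    cost_of is an assumed callback mapping a mode index to its RMD cost;
    ties are broken by the smaller mode index.
    """
    ranked = sorted(initial_mode_set(), key=lambda mode: (cost_of(mode), mode))
    return ranked[0], ranked[1]


print(initial_mode_set())  # [0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34]
```

Note that the set contains the horizontal (10) and vertical (26) modes, which the statistics in Section 2 identify as the most frequent angular modes.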
The general scheme is shown in Figure 3, and the mode selection algorithm for the second step is as follows:

- For PU 16 × 16, 32 × 32 and 64 × 64: if FM is not the Planar, DC or vertical mode (0, 1, 26), the second step RMD2 is performed for the 4 modes adjacent to FM (FM − 2, FM − 1, FM + 1, FM + 2), after which the list of candidates is updated. Otherwise, RMD2 is skipped.
- For PU 8 × 8 and 4 × 4: if both FM and SM are Planar or DC modes, the encoder adds the MPM modes (if not already included in the candidate list) and performs the RD cost calculation step. In other cases, RMD2 checks the modes FM − 2, FM − 1, FM + 1, FM + 2 (if FM is an angular mode) or SM − 2, SM − 1, SM + 1, SM + 2 (if FM is DC or Planar).

Fig. 3. Block diagram of the proposed algorithm

According to the proposed scheme, the second stage RMD2 evaluates a maximum of 4 more modes surrounding FM or SM. The minimum and maximum numbers of checked modes are therefore 11 and 15, respectively, compared with the 35 modes checked by the default RMD, which reduces the computational load. After that, the RDO process is performed with the MPM modes added to the candidate list.

4 Experimental results

The proposed scheme has been implemented on top of the HEVC reference software HM-16.20. A set of standard video sequences in five classes covering a wide range of resolutions and use cases (see Table 3) was tested using the All Intra-Main configuration and four values of the quantization parameter (QP): 22, 27, 32 and 37, as specified in [11].

Table 3. Test sequences.
| Class | Resolution | Sequence | Frame Rate |
|---|---|---|---|
| A | 2560 × 1600 | Traffic | 30 Hz |
| A | 2560 × 1600 | PeopleOnStreet | 30 Hz |
| B | 1920 × 1080 | Kimono | 24 Hz |
| B | 1920 × 1080 | ParkScene | 24 Hz |
| B | 1920 × 1080 | Cactus | 50 Hz |
| B | 1920 × 1080 | BasketballDrive | 50 Hz |
| B | 1920 × 1080 | BQTerrace | 60 Hz |
| C | 832 × 480 | BasketballDrill | 50 Hz |
| C | 832 × 480 | BQMall | 60 Hz |
| C | 832 × 480 | PartyScene | 50 Hz |
| C | 832 × 480 | RaceHorses | 30 Hz |
| D | 416 × 240 | BasketballPass | 50 Hz |
| D | 416 × 240 | BQSquare | 60 Hz |
| D | 416 × 240 | BlowingBubbles | 50 Hz |
| D | 416 × 240 | RaceHorses | 30 Hz |
| E | 1280 × 720 | FourPeople | 60 Hz |
| E | 1280 × 720 | Johnny | 60 Hz |
| E | 1280 × 720 | KristenAndSara | 60 Hz |

To evaluate the efficiency of the algorithm, comparisons were made in terms of the Bjontegaard peak signal-to-noise ratio (BD-PSNR), the Bjontegaard bitrate (BD-Bitrate) [12], and the time saving ∆T (%):

$$\Delta T = \frac{T_{\mathrm{HM-16.20}} - T_{\mathrm{prop}}}{T_{\mathrm{HM-16.20}}} \qquad (3)$$

where $T_{\mathrm{HM-16.20}}$ denotes the encoding time of the default HM-16.20 and $T_{\mathrm{prop}}$ denotes the encoding time of the proposed algorithm; ∆T is reported as a percentage.

Table 4 shows that, compared to HM-16.20, the proposal achieves an average time saving of up to 22.74% (the class D average). This is because it reduces the number of candidate modes for both the RMD and RDO processes compared to the original HM software. This saving comes with a slight decrease in bitrate and negligible degradation in video quality (see Table 5). Results differ between sequences because of their different levels of detail and complexity. The RD curves of the proposed algorithm and the HM for some sequences are shown in Figures 4 and 5. It can be seen that the proposed algorithm achieves almost the same PSNR at different bitrates.

Table 4. Encoding time reduction compared to HM-16.20, ∆T (%).

| Sequence | QP=22 | QP=27 | QP=32 | QP=37 | Average (sequence) | Average (class) |
|---|---|---|---|---|---|---|
| Traffic | 10.11 | 12.09 | 15.41 | 21.69 | 14.83 | 13.49 (A) |
| PeopleOnStreet | 14.36 | 12.30 | 10.53 | 11.45 | 12.16 | |
| Kimono | 33.81 | 40.26 | 13.24 | 15.36 | 25.67 | 16.77 (B) |
| ParkScene | 34.31 | 12.82 | 17.96 | 14.29 | 19.85 | |
| Cactus | 16.84 | 12.09 | 14.03 | 12.44 | 13.85 | |
| BasketballDrive | 9.83 | 10.99 | 12.03 | 10.05 | 10.73 | |
| BQTerrace | 13.15 | 11.86 | 12.76 | 17.27 | 13.76 | |
| BasketballDrill | 34.33 | 34.30 | 37.50 | 39.13 | 36.32 | 16.74 (C) |
| BQMall | 9.06 | 12.91 | 10.05 | 13.06 | 11.27 | |
| PartyScene | 7.80 | 8.39 | 7.75 | 8.63 | 8.14 | |
| RaceHorses | 8.54 | 10.07 | 10.19 | 16.07 | 11.22 | |
| BasketballPass | 22.22 | 22.36 | 27.79 | 24.24 | 24.15 | 22.74 (D) |
| BQSquare | 27.57 | 16.37 | 29.89 | 36.34 | 27.54 | |
| BlowingBubbles | 33.93 | 28.95 | 27.29 | 26.03 | 29.05 | |
| RaceHorses | 10.86 | 9.32 | 8.94 | 11.74 | 10.22 | |
| FourPeople | 11.56 | 10.30 | 14.82 | 11.57 | 12.06 | 12.58 (E) |
| Johnny | 11.77 | 14.90 | 13.61 | 11.72 | 13.00 | |
| KristenAndSara | 13.30 | 11.65 | 12.43 | 13.32 | 12.68 | |

Table 5. Coding efficiency comparisons between proposed intra prediction and HM-16.20 software.

| Class | BD-PSNR (dB), QP=22 | BD-PSNR (dB), QP=27 | BD-PSNR (dB), QP=32 | BD-PSNR (dB), QP=37 | BD-Bitrate (%), QP=22 | BD-Bitrate (%), QP=27 | BD-Bitrate (%), QP=32 | BD-Bitrate (%), QP=37 |
|---|---|---|---|---|---|---|---|---|
| A | 0.026 | 0.024 | 0.027 | 0.033 | -0.39 | -0.72 | -1.09 | -1.36 |
| B | 0.018 | 0.015 | 0.019 | 0.017 | -0.28 | -0.69 | -1.02 | -1.50 |
| C | 0.039 | 0.041 | 0.044 | 0.041 | -0.74 | -0.89 | -1.15 | -1.46 |
| D | 0.046 | 0.047 | 0.049 | 0.051 | -0.85 | -1.16 | -1.59 | -2.03 |
| E | 0.024 | 0.028 | 0.040 | 0.037 | -1.20 | -1.91 | -2.50 | -3.39 |
| Average | 0.031 | 0.031 | 0.036 | 0.036 | -0.69 | -1.07 | -1.47 | -1.95 |

Fig. 4. RD-curves for some test sequences (panels (a) to (d))

Fig. 5. RD-curve for sequence "Kimono" around QP=27

5 Conclusion

By restricting the mode selection in the RMD process, the number of modes to be tested has been significantly reduced. The proposed scheme requires less computation, while the coding performance remains almost at the same level as that of the original HEVC encoder. The scheme could be combined with an algorithm that optimizes the CU partitioning process.

References

1. Wiegand, T., Sullivan, G.J., Bjontegaard, G., Luthra, A.: Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology 13(7), 560–576 (2003) https://doi.org/10.1109/TCSVT.2003.815165
2. Sullivan, G.J., Ohm, J.-R., Han, W.-J., Wiegand, T.: Overview of the high efficiency video coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology 22(12), 1649–1668 (2012) https://doi.org/10.1109/TCSVT.2012.2221191
3. Vanne, J., Viitanen, M., Hamalainen, T.D., Hallapuro, A.: Comparative Rate-Distortion-Complexity Analysis of HEVC and AVC Video Codecs. IEEE Transactions on Circuits and Systems for Video Technology 22(12), 1885–1898 (2012) https://doi.org/10.1109/tcsvt.2012.2223013
4. Ohm, J.-R., Sullivan, G.J., Schwarz, H., Tan, T.K., Wiegand, T.: Comparison of the Coding Efficiency of Video Coding Standards—Including High Efficiency Video Coding (HEVC). IEEE Transactions on Circuits and Systems for Video Technology 22(12), 1669–1684 (2012) https://doi.org/10.1109/TCSVT.2012.2221192
5. Lainema, J., Bossen, F., Han, W.-J., Min, J., Ugur, K.: Intra Coding of the HEVC Standard. IEEE Transactions on Circuits and Systems for Video Technology 22(12), 1792–1802 (2012) https://doi.org/10.1109/TCSVT.2012.2221525
6. HEVC reference software, https://hevc.hhi.fraunhofer.de/. Last accessed Nov 2019
7. Hosseini, E., Pakdaman, F., Hashemi, M.R., Ghanbari, M.: A computationally scalable fast intra coding scheme for HEVC video encoder. Multimed Tools Appl. 78(9), 11607–11630 (2019) https://doi.org/10.1007/s11042-018-6713-y
8. Zhao, L., Zhang, L., Ma, S., Zhao, D.: Fast mode decision algorithm for intra prediction in HEVC. 2011 Visual Communications and Image Processing (VCIP) https://doi.org/10.1109/VCIP.2011.6115979
9. Zhao, L., Fan, X., Ma, S., Zhao, D.: Fast intra-encoding algorithm for High Efficiency Video Coding. Signal Processing: Image Communication 29(9), 935–944 (2014) https://doi.org/10.1016/j.image.2014.06.008
10. Piao, Y., Min, J., Chen, J.: Encoder improvement of unified intra prediction. Document JCTVC-C207, JCT-VC. Guangzhou, CN (2010)
11. Bossen, F.: Common test conditions and software reference configurations. Document JCTVC-L1100, JCT-VC. Geneva, CH (2013)
12. Bjontegaard, G.: Calculation of average PSNR differences between RD-curves. Document VCEG-M33, ITU-T. Austin, Texas (2001)