Voice Quality of Codecs (VoIP)

3.9
Mean opinion score (MOS) is used to measure the voice quality performance of a compression scheme or of an end-to-end voice call. Several methods, such as the E-model [ITU-T-G.107 (2005)], listening-based perception tests, and perceptual evaluation of speech quality (PESQ), are used for MOS reporting [URL (DSLAII), URL (Opticom-PESQ)]. To present MOS in this topic, PESQ and the rating (R) factor are used. More details on voice quality measuring procedures are given in topic 20.
PESQ is an active measurement conducted mainly on instruments. In a PESQ-based measurement, the instrument sends a reference waveform from one end and compares the distortions at the other end with the original. PESQ is an external measure. The PESQ and R-factor are mapped to MOS. PESQ-based MOS differs from R-factor MOS.
MOS is presented in the range of 1 to 5, but it is limited to 4.5 in actual usage. The PESQ-based MOS of G.711 is 4.3 to 4.4. Worldwide G.711 in PSTN is treated as delivering toll voice quality. MOS of greater than or equal to 4.0 is considered as toll voice quality. To achieve toll quality in VoIP, G.711 and G.726 at higher bit rates and wideband codecs have to be used.
In the narrowband case, G.729 and G.729A are popularly used. G.729A gives a PESQ MOS of 3.75 to 3.8. The main G.729 codec gives better quality at about 3.9, but it requires more processing. These are not toll (MOS more than 4.0) quality codecs, but they are accepted worldwide with voice quality broadly rated as very close to satisfactory and acceptable levels. Codec G.729.1 is for wideband voice applications. The listening perceptions of wideband codecs exceed G.711 PSTN quality. Higher quality than ideal G.711 may be limited to a MOS of 4.5. Wideband MOS extensions to represent beyond 4.5 are not documented clearly at this stage. Refer to the latest ITU recommendations for the updates on wideband MOS.
An overview of codec quality comparisons with an R-factor of the E-model and corresponding calculated MOS is given here. The R-factor connects quantitatively many parameters that influence voice quality. The R-factor calculation makes use of several parameters broadly classified under delays, echo, noise, phone characteristics, packet, and signal transmission characteristics. In this section, the R-factor is considered purely from the codec-dependent impairment factor (Ie); other dependencies are given in topic 20. For narrowband speech, the R-factor ranges from 0 to 100, with 100 being the MOS equivalent of 4.5. In telecommunications, a G.711-based full digital ISDN system gives a highest R of 93.2 (initially it was 94.2 and was amended to 93.2) and the corresponding MOS is approximately 4.4. Linear 16-bit samples give a MOS of 4.5. User satisfaction levels with R-factor and MOS are given in Table 3.5. MOS is calculated from the R-factor as
$$
\mathrm{MOS} =
\begin{cases}
1, & R < 0 \\
1 + 0.035\,R + R\,(R - 60)(100 - R)\,(7 \times 10^{-6}), & 0 \le R \le 100 \\
4.5, & R > 100
\end{cases}
$$
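The R-to-MOS mapping above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `r_to_mos` is hypothetical, and applying the polynomial for R between 0 and 1 is an assumption, because the text states the polynomial range only as R = 1 to 100.

```python
def r_to_mos(r: float) -> float:
    """Estimate MOS from an E-model R-factor on the narrowband scale.

    Assumption: the polynomial is applied over 0 <= R <= 100; values
    below 0 are clamped to MOS 1 and values above 100 to MOS 4.5.
    """
    if r < 0:
        return 1.0
    if r > 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# Spot-check a few R values that also appear in Table 3.5.
for r in (93.2, 90, 80, 70):
    print(f"R = {r:5.1f} -> MOS = {r_to_mos(r):.2f}")
```

The printed values can be compared against Table 3.5; for example, R = 80 evaluates to a MOS of about 4.02, the lower edge of toll quality.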

Table 3.5. Relation among R-Factor, MOS, and User Satisfaction

| | R-Factor and MOS for R = 70 to 99 | | | | | | | | | | User Satisfaction Level |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R | 99 | 98 | 97 | 96 | 95 | 94 | 93 | 92 | 91 | 90 | R of 90 to 93 is very satisfactory PSTN quality; R of >94 is better than PSTN quality |
| MOS | 4.49 | 4.48 | 4.47 | 4.46 | 4.44 | 4.42 | 4.41 | 4.38 | 4.36 | 4.34 | |
| R | 89 | 88 | 87 | 86 | 85 | 84 | 83 | 82 | 81 | 80 | Satisfied; R > 80 and MOS > 4.0 is referred to as toll quality |
| MOS | 4.31 | 4.29 | 4.26 | 4.23 | 4.2 | 4.17 | 4.13 | 4.1 | 4.06 | 4.02 | |
| R | 79 | 78 | 77 | 76 | 75 | 74 | 73 | 72 | 71 | 70 | Some users dissatisfied |
| MOS | 3.99 | 3.95 | 3.91 | 3.86 | 3.82 | 3.78 | 3.73 | 3.69 | 3.64 | 3.6 | |

Considering only codec compression, under ideal conditions the R-factor is expressed as R = R0 – Ie, where R0 = 93.2 and Ie is the codec-specific impairment factor. The Ie, R, and corresponding R-based MOS mapping for different codecs are given in Table 3.6. The quality of wideband codecs exceeds that of G.711, so their rating is more than 93.2. Hence, the R-factor scale is extended to a maximum value of 129, which accommodates both narrowband and wideband codecs on the same scale. For this reason, a different R0 of 129 is used [ITU-T-G.107 (2006), ITU-T-G.113 (2006), Alexander (2006)]. An R-factor of 129 is obtained with direct wideband 16-bit linear samples at 16-kHz sampling. To get the same R-factor for a narrowband codec on the wideband R scale, a new equipment impairment parameter, Ie,wb, is introduced. The Ie,wb values for narrowband codecs are chosen to reproduce the narrowband results. In the original narrowband R-factor, G.711 has Ie = 0. On the wideband scale, Ie,wb for the same G.711 is 35.8. The resulting R is the same in both cases (93.2 – 0 = 129 – 35.8 = 93.2). In general, Ie,wb = Ie + 35.8.
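As a rough check of the two scales, the relations R = R0 – Ie and Ie,wb = Ie + 35.8 can be evaluated side by side. The sketch below uses the constants stated in the text; the function and variable names are hypothetical, and the Ie values are the ones listed in Table 3.6. Note that the table appears to list Ie,wb rounded to whole numbers, whereas this sketch uses the 35.8 offset directly.

```python
R0_NARROWBAND = 93.2  # R0 on the narrowband scale (from the text)
R0_WIDEBAND = 129.0   # R0 on the extended wideband scale (from the text)
IE_WB_OFFSET = 35.8   # Ie,wb = Ie + 35.8 (from the text)


def r_narrowband(ie: float) -> float:
    """R-factor on the narrowband scale, codec impairment only."""
    return R0_NARROWBAND - ie


def ie_to_ie_wb(ie: float) -> float:
    """Convert a narrowband Ie to its wideband-scale equivalent Ie,wb."""
    return ie + IE_WB_OFFSET


def r_wideband(ie_wb: float) -> float:
    """R-factor on the extended wideband scale, codec impairment only."""
    return R0_WIDEBAND - ie_wb


# Narrowband codecs: both scales should give the same R.
for codec, ie in {"G.711": 0, "G.729": 10, "G.729A": 11}.items():
    r_nb = r_narrowband(ie)
    r_wb = r_wideband(ie_to_ie_wb(ie))
    print(f"{codec}: R(narrowband) = {r_nb:.1f}, R(wideband scale) = {r_wb:.1f}")

# A wideband codec is rated directly from its Ie,wb (13 for G.722 in Table 3.6).
print(f"G.722: R(wideband scale) = {r_wideband(13):.1f}")
```

For each narrowband codec the two printed R-values agree, which is the purpose of the 35.8 offset.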
Combined narrowband and wideband R-values are listed in Table 3.6, with the narrowband and wideband values given in separate columns for clarity. For narrowband codecs, the resulting R-factor is the same on both scales. From the table, it is clear that wideband performs better than narrowband. In narrowband, G.711 gives the best quality. Based on the R-factor comparison, the wideband G.722 gives an R-factor of 116, which represents much higher quality than G.711 at an R-factor of 93.2. Refer to later revisions of recommendations/standards for possible updates on the MOS scale for wideband and on proper connectivity in MOS mapping. In Table 3.6, a MOS of 4.1 appears against the G.729A codec. After considering several end-to-end transmission contributions, this level falls below toll quality. The PESQ MOS of G.729A is 3.75 to 3.85, which is below toll quality, but the codec is widely accepted in deployments.

Table 3.6. Narrowband and Wideband R-Values for Different Codecs

| Codec | Bit Rate (kbps) | MOS from R | Narrowband Ie (R0 = 93.2) | Narrowband R = 93.2 – Ie | Wideband Ie,wb (R0 = 129) | Wideband R = 129 – Ie,wb | Remarks |
|---|---|---|---|---|---|---|---|
| G.711 | 64 | 4.4 | 0 | 93.2 | 36 | 93.2 | PSTN quality |
| G.729E | 11.8 | 4.3 | 4 | 89.2 | 40 | 89.2 | |
| G.726 | 32 | 4.23 | 7 | 86.2 | 43 | 86.2 | |
| G.728 | 16 | 4.23 | 7 | 86.2 | 43 | 86.2 | |
| G.729 | 8 | 4.13 | 10 | 83.2 | 46 | 83.2 | |
| G.729A | 8 | 4.1 | 11 | 82.2 | 47 | 82.2 | Commonly used in VoIP |
| G.723.1 MP-MLQ | 6.3 | 3.95 | 15 | 78.2 | 51 | 78.2 | |
| G.722 | 64 | 4.5* | | | 13 | 116* | Exceeding G.711 rating |

*While preparing this topic, MOS mapping was not available for R > 100; hence, the highest MOS given is (4.5).
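The user-satisfaction bands of Table 3.5 can be applied to the codec R-values of Table 3.6 with a small lookup. This is only a sketch: the function name `satisfaction_band` is hypothetical, and the exact band edges (in particular how values between 93 and 94, or below 70, are treated) are assumptions, since Table 3.5 does not define them precisely.

```python
def satisfaction_band(r: float) -> str:
    """Return the Table 3.5 user-satisfaction band for an R-factor.

    Assumed band edges: >94, 90-94, 80-90, 70-80; anything below 70
    is outside the range covered by Table 3.5.
    """
    if r > 94:
        return "better than PSTN quality"
    if r >= 90:
        return "very satisfactory (PSTN quality)"
    if r >= 80:
        return "satisfied (toll quality)"
    if r >= 70:
        return "some users dissatisfied"
    return "below the range covered by Table 3.5"


# Codec-only R-values taken from Table 3.6.
codec_r = {"G.711": 93.2, "G.729A": 82.2, "G.723.1 MP-MLQ": 78.2, "G.722": 116}
for codec, r in codec_r.items():
    print(f"{codec}: R = {r} -> {satisfaction_band(r)}")
```

These bands reflect codec impairment alone. As noted for G.729A, end-to-end transmission contributions can move a codec that sits just above R = 80 below toll quality.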
3.9.1

Discussion on Wideband Codecs Voice Quality

The ITU wideband codecs considered at this stage for VoIP are G.722, G.729.1, and G.722.2. Wideband codecs are rated on an extended R-factor scale of 129. In Table 3.6, the G.722 rating is given as 116. The rating of G.722.2 is stated as 128 based on [ITU-T-G.107 (2006), ITU-T-G.113 (2006)]. With the highest rating being 129, G.722.2 at 128 seems to provide higher quality. G.729.1 is still under study, and it is expected to appear in later revisions of ITU documents. While writing this topic, ITU Study Group 12 (SG12) was actively evaluating wideband codecs and arriving at various Ie,wb values, R-factors, and MOS mappings. At this stage, the results from auditory tests and from instrument measurements with the modified WB-PESQ vary in estimating Ie,wb. Among codecs G.722, G.722.2, and G.729.1, the SG12 results show several different combinations. Most tests convey that G.722.2 at 23.05 kbps is of higher quality than G.722 and G.729.1. The results for G.722 and G.729.1 are very close: some tests reveal that G.722 performs better than G.729.1, and some combinations reveal the opposite. In general, G.722.2 requires five times more processing than does G.722, and G.729.1 requires three to four times more processing than does G.722. Hence, G.722 is used in most early wideband products. G.729.1 is considered where backward compatibility with narrowband has to be maintained. Codec G.722.2 may be considered to interoperate with wireless infrastructure enhancements and media convergence as well as to present a higher quality supported product.
