litating new work practices such as ‘virtual desking’ where staff arrive at work and use any available desk.
However, the greatest benefits will undoubtedly come in the shape of ever more sophisticated multimedia applications. The increasing power of even the most basic desktop PCs to run multimedia applications such as video conferencing and integrated voice/email will drive the continued development of Computer Telephony Integration (CTI) applications such as those deployed by web-enabled call centres.
In simple terms, any system which transports voice across a data network employs packet voice technologies. Analogue voice signals are digitised and the resultant digital stream is converted into standard packets. Voice packets appear to a network as ‘data’ and as such can be treated like any normal data packet, i.e. switched across companies’ LANs, routed across WAN links or sent out over the Internet.
From a historical perspective, telephony systems started out using analogue signals to transmit voice from end to end. Today, analogue signals are only used from the customer site to the telco’s local exchange, where the signal is digitised and transmitted throughout the PSTN as a digital signal. Since the human voice frequency range used in telephony lies below 4kHz, and classical sampling theory (the Nyquist theorem) requires a sampling rate of at least twice the highest frequency, an accurate representation requires sampling the analogue voice signal 8000 times per second. Each sample is then quantised to 8 bits, which provides enough resolution, giving 8000 × 8 = 64,000 bps, the standard transmission rate for digital voice. As a voice signal only requires 64kbps, it is relatively easy to employ Time Division Multiplexing (TDM) techniques to combine multiple voice channels for transmission over a single high speed digital link (a ‘trunk’ in telco parlance).
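The arithmetic behind the 64kbps figure can be written out as a minimal sketch. The function and constant names below are illustrative only and do not come from any telephony library.

```python
# Illustrative sketch: PCM bit rate for telephone-quality voice,
# using the figures quoted in the text (4 kHz voice band, 8-bit samples).

MAX_VOICE_FREQ_HZ = 4_000   # upper bound of the telephony voice band
BITS_PER_SAMPLE = 8         # resolution of each digitised sample


def pcm_bit_rate(max_freq_hz: int, bits_per_sample: int) -> int:
    """Return the bit rate in bps when sampling at twice the highest frequency."""
    sample_rate_hz = 2 * max_freq_hz        # Nyquist rate: 2 x 4000 = 8000 samples/s
    return sample_rate_hz * bits_per_sample


rate_bps = pcm_bit_rate(MAX_VOICE_FREQ_HZ, BITS_PER_SAMPLE)
print(rate_bps)  # 64000
```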
Such a trunk is the standard interface used in digital PBXs and is commonly referred to in Europe as an ‘E1’ (2.048 Mbps, carrying 30 voice channels) and in the USA as a ‘T1’ (1.544 Mbps, carrying 24 voice channels). Most offices with digital PBXs employ an E1/T1 trunk to their local exchange.
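The E1 and T1 line rates follow directly from multiplexing 64kbps channels. The sketch below shows the arithmetic. The framing details (two extra timeslots on an E1, one framing bit per frame on a T1) are standard properties of these trunks but are not stated in the text above, and the function names are illustrative.

```python
# Illustrative sketch: how the E1 and T1 line rates arise from 64 kbps channels.

CHANNEL_BPS = 64_000        # one digitised voice channel
FRAMES_PER_SECOND = 8_000   # one TDM frame per voice sample period


def e1_rate_bps() -> int:
    """E1: 32 timeslots of 64 kbps, 30 for voice plus 2 for framing and signalling."""
    return 32 * CHANNEL_BPS


def t1_rate_bps() -> int:
    """T1: 24 voice channels plus one framing bit per frame."""
    return 24 * CHANNEL_BPS + 1 * FRAMES_PER_SECOND


print(e1_rate_bps())  # 2048000
print(t1_rate_bps())  # 1544000
```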
The digitising function described above is performed by a codec (COder/DECoder), and the different codec standards are given G.7xx numbers by the International Telecommunication Union (ITU). G.711 is the standard 64kbps telephony codec, but there are various compression schemes which reduce the number of bps required, thus enabling the use of lower bandwidth links to carry voice traffic. These compression schemes are also described by G.7xx standards and are summarised in the table below:

| Codec Type | Bit Rate (kbps) | Processor Usage | Voice Quality | Delay |
|---|---|---|---|---|
| G.711 | 64 | — | Very good | Negligible |
| G.726 | 40/32/24/16 | 8MIPS | Good (40) to poor (16) | Very low |
| G.729 | 8 | 30MIPS | Good | Low |
| G.729A | 8 | 20MIPS | Fair | Low |
| G.723 | 6.4/5.3 | 20MIPS | Good (6.4) to fair (5.3) | High |
| G.723.1 | 6.4/5.3 | 20MIPS | Good to fair | High |
| G.728 | 16 | 40MIPS | Good | Low |
Without going into too much detail, what can be deduced from this table is that although compression schemes reduce the bit rate, they can result in inferior voice quality, higher delay and increased processor usage. In general, however, this trade-off between voice quality and bandwidth is tolerated because compression schemes offer considerable bandwidth optimisation and cost savings.
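To make the bandwidth side of this trade-off concrete, the following sketch compares the codec bit rates from the table against G.711. It deliberately ignores packet header overhead. The 64kbps link size, the choice of the 32kbps rate for G.726 and the 5.3kbps rate for G.723, and the helper names are all assumptions made for illustration.

```python
# Illustrative sketch: bandwidth saving per codec, using bit rates from the table.
# Packet header overhead is ignored, so real-world figures will be lower.

CODEC_KBPS = {
    "G.711": 64,
    "G.726": 32,    # one of the four G.726 rates listed in the table
    "G.728": 16,
    "G.729": 8,
    "G.723": 5.3,   # the lower of the two G.723 rates listed
}

LINK_KBPS = 64      # assumed link capacity: one uncompressed voice channel


def compression_ratio(codec_kbps: float) -> float:
    """Ratio of the G.711 bit rate to the given codec bit rate."""
    return CODEC_KBPS["G.711"] / codec_kbps


def calls_per_link(codec_kbps: float, link_kbps: float = LINK_KBPS) -> int:
    """Number of whole calls that fit on the link at the given codec rate."""
    return int(link_kbps // codec_kbps)


for name, kbps in CODEC_KBPS.items():
    print(f"{name}: {compression_ratio(kbps):.1f}x, {calls_per_link(kbps)} call(s)")
```

On these assumptions a single 64kbps channel that carries one G.711 call could carry eight G.729 calls, which is the kind of saving that makes the quality trade-off acceptable.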
Clearly, digitised voice signals can be treated in the same way as digital data and converted to packets for transmission over a packet switching network. Before discussing the major packet voice technology, Voice over IP (VoIP), it is worth looking briefly at the two other prominent technologies, Voice over Frame Relay and Voice over ATM:
Frame Relay (FR) is a packet switched WAN protocol widely employed today to interconnect company LANs. FR is based on the older X.25 WAN technology but, thanks to improvements in digital line quality, has removed the requirement for error protection/correction and therefore offers much improved speed and efficiency over X.25. Companies can purchase FR services from providers for point-to-point or point-to-multipoint connections, and these offer a viable alternative to expensive leased line services. Voice over Frame Relay (VoFR) is intended for the transmission of voice over WAN links and hence does not scale to the desktop. Nevertheless, many enterprises employ VoFR for the savings made on long distance branch-to-branch calls. Historically, VoFR solutions have been proprietary to each vendor, but the establishment of the FRF.11 standard for call setup, coding types and packet formats will allow future interoperability between vendors’ products.
Asynchronous Transfer Mode (ATM) is a high-speed backbone technology which offers inherently superior Quality of Service (QoS) features and is ideally suited to the transmission of voice and video. As with VoFR, Voice over ATM (VoA) is primarily for transmission over WAN links and does not scale to the desktop.
Internet Protocol (IP) is the most widely used protocol today and is ubiquitous throughout LANs, campus networks, enterprise intranets and the Internet. Its popularity makes IP the unifying protocol for telephony solutions. Companies with existing LAN/WAN infrastructures running IP will find it easy to implement VoIP. Suitable solutions may scale from a purely internal IP telephony system right through to enterprise-wide systems employing WAN links.
As IP is a connectionless protocol, it normally works in conjunction with the connection-oriented Transmission Control Protocol (TCP) to ensure a guaranteed delivery service. However, although this works smoothly with data, where any packet not delivered is simply re-transmitted, it does not work with real-time applications such as voice: a re-transmitted packet arrives too late to be played in its proper place, and any word received out of sequence within the structure of a sentence will result in a garbled message.
Consequently, a new standard was required to cope with the real-time video and voice applications which have become increasingly popular. The H.323 standard is one such standard and provides a foundation for audio, video and data communications across IP based networks. By conforming to H.323 standards, multimedia products and applications from different vendors can interoperate across IP based networks, including the Internet. A comparison between the ITU H.323 Standard and the ISO Protocol Layers is shown below:
| ISO Layer | Protocols |
|---|---|
| Presentation/Session | H.323, SIP, H.245, H.225, RTCP |
| Transport | RTP, UDP |
| Network | IP, RSVP, WFQ |
| Data Link | PPP, Frame Relay, ATM etc |
VoIP follows the ITU H.323 standard rather than mapping directly onto the traditional ISO protocol layers (the OSI seven-layer model). Although VoIP uses TCP to carry the signalling channels, the real-time audio streams are carried by the Real-time Transport Protocol (RTP) specified by H.323. RTP runs over the connectionless UDP protocol since it has lower delay than TCP and, as previously explained, retransmissions are pointless. VoIP is the most popular implementation of packet voice and it is the increasing prevalence of H.323 based desktop applications that will drive it to even greater acceptance. It also constitutes the primary focus of this paper.
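As a rough illustration of how an audio frame is wrapped for transport, the sketch below builds and parses the fixed 12-byte RTP header using only the Python standard library. The header layout and the payload type value 0 for G.711 mu-law are standard RTP conventions. The 20 ms packetisation interval (160 bytes of G.711 audio), the SSRC value and the function names are assumptions chosen for illustration and are not taken from the text.

```python
# Illustrative sketch: packing and parsing the fixed 12-byte RTP header.
import struct

RTP_VERSION = 2
PT_PCMU = 0                 # static RTP payload type for G.711 mu-law
RTP_HEADER_FORMAT = "!BBHII"  # network byte order: 1 + 1 + 2 + 4 + 4 = 12 bytes


def build_rtp_packet(seq: int, timestamp: int, ssrc: int, payload: bytes,
                     payload_type: int = PT_PCMU, marker: bool = False) -> bytes:
    """Prefix an audio payload with a minimal RTP header (no CSRCs, no extension)."""
    byte0 = RTP_VERSION << 6                              # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)    # M bit + payload type
    header = struct.pack(RTP_HEADER_FORMAT, byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(RTP_HEADER_FORMAT, packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Assumed framing: 20 ms of G.711 audio = 160 samples of 8 bits = 160 bytes.
frame = bytes(160)
packet = build_rtp_packet(seq=7, timestamp=160, ssrc=0x1234ABCD, payload=frame)
print(len(packet))  # 172
```

The sequence number and timestamp are what allow a receiver to detect lost or reordered packets and play audio at the right moment without asking for retransmission. In practice such a packet would be handed to a UDP socket for sending. That step is omitted here so the sketch has no network dependencies.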
A typical definition of Quality of Service (QoS) is: ‘a mechanism for defining absolute and relative network performance requirements for the various streams of traffic on a network’.
It is of paramount importance to VoIP that QoS can be guaranteed from end-to-end. The two main network problems that