8.1 Signals look like noise!
One of the most important practical questions which arises when we are designing or using an information transmission or processing system is, "What is the capacity of this system, i.e. how much information can it transmit or process in a given time?" We formed a rough idea of how to answer this question in an earlier section of this set of webpages. We can now go on to obtain a better-defined answer by deriving Shannon's Equation. This equation allows us to determine precisely the information carrying capacity of any signal channel.

Consider a signal which is being efficiently communicated (i.e. no redundancy) in the form of a time-dependent analog voltage, $V(t)$. The pattern of voltage variations during a specific time interval, T, allows a receiver to identify which one of a possible set of messages has actually been sent. At any two moments, $t_1$ & $t_2$, during a message the voltage will be $V(t_1)$ & $V(t_2)$.

Using the idea of intersymbol influence we can say that, since there is no redundancy, the values of $V(t_1)$ & $V(t_2)$ will appear to be independent of one another provided that the moments $t_1$ and $t_2$ are far enough apart to be worth sampling separately. In effect, we can't tell what one of the values is just from knowing the other. Of course, for any specific message, both $V(t_1)$ and $V(t_2)$ are determined in advance by the content of that particular message. But the receiver can't know which of all the possible messages has arrived until it has arrived. If the receiver did know in advance which voltage pattern was to be transmitted then the message itself wouldn't provide any new information, i.e. the receiver wouldn't know any more after its arrival than before. This leads us to the remarkable conclusion that a signal which is efficiently communicating information will vary from moment to moment in an unpredictable, apparently random, manner. An efficient signal looks very much like random noise!

This, of course, is why random noise can produce errors in a received message. The statistical properties of an efficiently signalled message are similar to those of random noise. If the signal and noise were obviously different the receiver could easily separate the noise from the signal and avoid making any errors.
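The idea that removing redundancy makes a signal 'noise-like' can be illustrated with a small sketch. It is not part of the original argument: it assumes that a general-purpose compressor (Python's `zlib`) is a fair stand-in for 'efficient coding', and the vocabulary and message below are invented purely for illustration. A redundant message compresses well; once the redundancy has been squeezed out, the result has almost no predictable structure left, so a second attempt at compression gains essentially nothing.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means more redundancy)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical redundant message: words drawn from a tiny, made-up vocabulary.
rng = random.Random(1)
vocabulary = ["signal", "noise", "channel", "power",
              "voltage", "message", "sample", "band"]
redundant = " ".join(rng.choice(vocabulary) for _ in range(20000)).encode("ascii")

# Squeezing out the redundancy gives an 'efficient' form of the same message.
efficient = zlib.compress(redundant, 9)

print(f"redundant message: ratio {compression_ratio(redundant):.2f}")
print(f"efficient message: ratio {compression_ratio(efficient):.2f}")
```

The first ratio should come out well below 1 and the second close to 1: the efficient form is about as incompressible as random bytes would be.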

To detect and correct errors we therefore have to make the real signal less ‘noise-like’. This is what we're doing when we use parity bits to add redundancy to a signal. The redundancy produces predictable relationships between different sections of the signal pattern. Although this reduces the system's information carrying efficiency it helps us distinguish signal details from random noise. Here, however, we're interested in discovering the maximum possible information carrying capacity of a system. So we have to avoid any redundancy and allow the signal to have the ‘unpredictable’ qualities which make it statistically similar to random noise.
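As a minimal sketch of the parity idea mentioned above (the function names and the 7-bit data word are assumptions chosen for illustration), a single even-parity bit makes the transmitted word predictable in one respect: it must always contain an even number of 1s. That predictability is what lets the receiver notice a single flipped bit, at the cost of sending 8 bits to carry 7 bits of information.

```python
def add_even_parity(bits):
    """Append one parity bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """Return True if the word still contains an even number of 1s."""
    return sum(word) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1]    # 7 information bits (hypothetical message)
word = add_even_parity(data)    # 8 transmitted bits
corrupted = word.copy()
corrupted[2] ^= 1               # noise flips one bit

print(parity_ok(word))          # True
print(parity_ok(corrupted))     # False
print(len(data) / len(word))    # 0.875, the fraction of bits carrying information
```

A single parity bit only detects an odd number of bit errors; it cannot say which bit was wrong, so correcting errors needs more redundancy than this.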

The amount of noise present in a given system can be represented in terms of its mean noise power

$$N = \frac{V_N^2}{R} \qquad (8.1)$$

where R is the characteristic impedance of the channel or system and $V_N$ is the rms noise voltage. In a similar manner we can represent a typical message in terms of its average signal power

$$S = \frac{V_S^2}{R} \qquad (8.2)$$

where $V_S$ is the signal's rms voltage.

A real signal must have a finite power. Hence for a given set of possible messages there must be some maximum possible power level. This means that the rms signal voltage is limited to some range. It also means that the instantaneous signal voltage must be limited and can't be beyond some specific range, $\pm V_{peak}$. A similar argument must also be true for noise. Since we are assuming that the signal system is efficient we can expect the signal and noise to have similar statistical properties. This implies that if we watched the signal or noise for a long while we'd find that their level fluctuations had the same peak/rms voltage ratio. We can therefore say that, during a typical message, the noise voltage fluctuations will be confined to some range

$$\pm \eta V_N \qquad (8.3)$$

where the form factor, $\eta$, (the ratio of peak to rms levels, more usually called the crest factor) can be defined from the signal's properties as

$$\eta = \frac{V_{peak}}{V_S} \qquad (8.4)$$
When transmitting signals in the presence of noise we should try to ensure that S is as large as possible so as to minimise the effects of the noise. We can therefore expect that an efficient information transmission system will ensure that, for every typical message, S is almost equal to some maximum value, $P_{max}$. This implies that in such a system, most messages will have a similar power level. Ideally, every message should have the same, maximum possible, power level. In fact we can turn this argument on its head and say that only messages with mean powers similar to this maximum are ‘typical’. Those which have much lower powers are unusual, i.e. rare.

8.2 Shannon's Equation.
The signal and noise are uncorrelated; that is, they are not related in any way which would let us predict one of them from the other. The total power obtained, $P_T$, when combining these uncorrelated, apparently randomly varying quantities is given by

$$P_T = S + N \qquad (8.5)$$

i.e. the typical combined rms voltage, $V_T$, will be such that

$$V_T^2 = V_S^2 + V_N^2 \qquad (8.6)$$

Since the signal and noise are statistically similar their combination will have the same form factor value as the signal or noise taken by itself. We can therefore expect that the combined signal and noise will generally be confined to a voltage range $\pm \eta V_T$.
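The power-addition rule for uncorrelated signal and noise can be checked numerically. The sketch below is only an illustration under stated assumptions: the signal and noise are modelled as independent zero-mean Gaussian samples (a convenient choice, not something the argument requires), and the rms levels of 3 V and 1 V are arbitrary. The mean-square voltage of the sum should come out very close to the sum of the two mean-square voltages, which is the same statement as adding the powers when both are measured across the same impedance R.

```python
import math
import random


def rms(values):
    """Root-mean-square level of a sequence of voltage samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


rng = random.Random(0)
count = 100_000
v_s_rms, v_n_rms = 3.0, 1.0   # assumed rms levels in volts

# Independent draws, so the two sequences are uncorrelated.
signal = [rng.gauss(0.0, v_s_rms) for _ in range(count)]
noise = [rng.gauss(0.0, v_n_rms) for _ in range(count)]
combined = [s + n for s, n in zip(signal, noise)]

measured = rms(combined) ** 2                     # mean-square of the sum
predicted = rms(signal) ** 2 + rms(noise) ** 2    # sum of the mean-squares

print(f"measured  V_T^2 = {measured:.3f}")
print(f"predicted V_T^2 = {predicted:.3f}")
```

If the two sequences were correlated (for example, if `noise` were a scaled copy of `signal`) the cross term would not average away and the two numbers would differ.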

Consider now dividing this range into $2^b$ bands of equal size. (i.e. each of these bands will cover $\Delta V = 2 \eta V_T / 2^b$.) To provide a different label for each band we require $2^b$ symbols or numbers. We can therefore always indicate which band the voltage level occupies at any moment in terms of a b-bit binary number. In effect, this process is another way of describing what happens when we take digital samples with a b-bit analog to digital convertor working over a total range $2 \eta V_T$.

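There is no real point in choosing a value for b which is so large that $\Delta V$ is smaller than $2 \eta V_N$. This is because the noise will simply tend to randomise the actual voltage by this amount, making any extra bits meaningless. As a result the maximum number of bits of information we can obtain regarding the level at any moment will be given by

$$2^b = \frac{V_T}{V_N} \qquad (8.7)$$

i.e.

$$2^b = \sqrt{\frac{V_N^2 + V_S^2}{V_N^2}} = \sqrt{1 + \frac{S}{N}} \qquad (8.8)$$

which can be rearranged to produce

$$b = \log_2 \left\{ \left( 1 + \frac{S}{N} \right)^{1/2} \right\} \qquad (8.9)$$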
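If we make M, b-bit measurements of the level in a time, T, then the total number of bits of information collected will be

$$H = M b = M \log_2 \left\{ \left( 1 + \frac{S}{N} \right)^{1/2} \right\} \qquad (8.10)$$

This means that the information transmission rate, I, bits per unit time, will be

$$I = \frac{H}{T} = \frac{M}{T} \log_2 \left\{ \left( 1 + \frac{S}{N} \right)^{1/2} \right\} \qquad (8.11)$$

From the Sampling Theorem we can say that, for a channel of bandwidth, B, the highest practical sampling rate, $M/T$, at which we can make independent measurements or samples of a signal will be

$$\frac{M}{T} = 2B \qquad (8.12)$$

Combining expressions 8.11 & 8.12 we can therefore conclude that the maximum information transmission rate, C, will be

$$C = 2B \log_2 \left\{ \left( 1 + \frac{S}{N} \right)^{1/2} \right\} = B \log_2 \left\{ 1 + \frac{S}{N} \right\} \qquad (8.13)$$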
This expression represents the maximum possible rate of information transmission through a given channel or system. The maximum rate at which we can transmit information is set by the bandwidth, the signal level, and the noise level. C is therefore called the channel's information carrying Capacity. Expression 8.13 is called Shannon's Equation after the first person to derive it.
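To get a feel for the numbers, the result can be evaluated directly. The sketch below assumes the standard form of Shannon's Equation, C = B log2(1 + S/N), with half of that logarithm as the number of useful bits per sample. The function names, the 3 kHz bandwidth, and the power ratios are illustrative assumptions rather than values from the text.

```python
import math


def bits_per_sample(signal_power, noise_power):
    """Useful bits per independent sample: half of log2(1 + S/N)."""
    return 0.5 * math.log2(1.0 + signal_power / noise_power)


def capacity(bandwidth_hz, signal_power, noise_power):
    """Channel capacity in bits per second: B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1.0 + signal_power / noise_power)


# S/N = 15 gives 1 + S/N = 16, so each sample is worth 2 bits.
print(bits_per_sample(15.0, 1.0))             # 2.0

# Hypothetical 3 kHz channel with S/N = 1000 (i.e. 30 dB).
print(round(capacity(3000.0, 1000.0, 1.0)))   # 29902 bits per second
```

Because S/N sits inside a logarithm, doubling the bandwidth doubles the capacity, whereas doubling an already large signal-to-noise ratio adds only about one extra bit per second for each hertz of bandwidth.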

Content and pages maintained by: Jim Lesurf (jcgl@st-and.ac.uk)
using HTMLEdit and TechWriter on a RISCOS machine.
University of St. Andrews, St Andrews, Fife KY16 9SS, Scotland.