Sound chips in 8-bit computers

Part 2. POKEY sound chip

For the readers of this article (actually of the whole FLOP magazine), it will be no surprise that all Atari 8-bit computers are typical representatives of home computers that used a special integrated circuit to synthesize both sound effects and music. For sound synthesis (and other important functions described at the end of this text), Atari developed an integrated circuit named POKEY. The name is derived from two words: POtentiometer and KEYboard. This is interesting, because sound synthesis and control of serial communication are arguably the more important functions of the chip.

The POKEY chip is used not only in home microcomputers, but also in certain gaming consoles and arcade machines manufactured by Atari. Moreover, production of these machines represented a major portion of the company's revenue. The POKEY chip is a hybrid integrated circuit that contains both a digital portion (used for sound synthesis) and an analog portion. The analog portion allows (relatively slow) conversion of an analog signal to 8-bit samples. Atari used this feature to connect the paddle controllers to their computers and gaming consoles.

As mentioned in the opening paragraph, the POKEY integrated circuit is used (apart from other functions) for sound synthesis. Sound is generated using four sound channels, so 4-voice polyphony is possible. If needed, two channels can be paired to create one channel that allows more precise control over the frequency of the sound. In essence, two 8-bit counters/divisors are merged into one 16-bit counter. It is also possible to change the source of the input clock signal - see below. Nine 8-bit control registers are used to control the sound subsystem: AUDF1 - AUDF4, AUDC1 - AUDC4 and the common control register AUDCTL.

Each sound channel produces a rectangular signal with an amplitude that can be set to one of 16 possible levels. In the control registers, only four bits are dedicated to specifying the amplitude of each channel. A logical 0 on the input results in zero voltage on the output of a sound channel. A logical 1 is converted to one of the 16 voltage levels (as in the TIA and almost all other PSGs).
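To make the four amplitude bits more concrete, here is a minimal sketch of how a value for one of the AUDC registers could be assembled. The text above only states that four bits hold the amplitude; placing them in the low nibble, with a "volume only" flag in bit 4 and the distortion selection in bits 5-7, follows the commonly documented register layout and should be treated as an assumption here. The helper name make_audc and its default (pure tone) are hypothetical.

```python
def make_audc(volume, distortion=0b101, volume_only=False):
    """Pack a value for an AUDCx register (assumed layout, see lead-in).

    bits 0-3: volume (0..15)
    bit  4  : volume-only mode (output follows the volume bits directly)
    bits 5-7: distortion / noise selection (0b101 assumed to be a pure tone)
    """
    if not 0 <= volume <= 15:
        raise ValueError("volume must fit into four bits (0..15)")
    if not 0 <= distortion <= 7:
        raise ValueError("distortion must fit into three bits (0..7)")
    return (distortion << 5) | (int(volume_only) << 4) | volume


# A pure tone at half volume.
print(hex(make_audc(8)))
```

Whatever the upper bits are set to, the amplitude can never exceed 15, which is where the limit of 16 levels comes from.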

The dependency between the selected level and the output voltage isn't precisely linear. The generated signal is also not exactly rectangular (this is noticeable, for example, on an oscilloscope). To this day, this property is exploited when creating triangular and sawtooth signals (with significant help from the CPU, precise timing, and possibly the high-pass filter mentioned below).

The CPU clock signal is conveyed to the input of the sound system. The signal has a frequency of approximately 1.79 MHz (in computers designed for the NTSC TV standard) or 1.77 MHz (in computers designed for the PAL TV standard). For further use, the signal is divided by the constants 28 and 114, so two additional signals are available. These signals have frequencies of approximately 63 kHz and 16 kHz. The control registers allow selecting which of the signals will be used to drive the sound channels (obviously not all options can be selected at once, see below).
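The arithmetic behind the two derived clock signals can be checked with a few lines of Python. This is only a sketch using the rounded CPU clock values quoted above, so the results are approximate; the names CPU_CLOCK_HZ and derived_clocks are made up for this example.

```python
# Approximate CPU clock frequencies in Hz, as quoted in the text.
CPU_CLOCK_HZ = {"NTSC": 1_790_000, "PAL": 1_770_000}

# Division constants mentioned in the text.
DIVIDERS = (28, 114)


def derived_clocks(cpu_hz):
    """Return the two slower clock frequencies derived from the CPU clock."""
    return tuple(cpu_hz / d for d in DIVIDERS)


for standard, cpu_hz in CPU_CLOCK_HZ.items():
    fast, slow = derived_clocks(cpu_hz)
    print(f"{standard}: {fast / 1000:.1f} kHz and {slow / 1000:.1f} kHz")
```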

The sound frequency in each sound channel is controlled by a 1:N divisor. The divisor is internally implemented as a counter (the POKEY circuit also uses counters to detect key presses, generate pseudo-random numbers, perform A/D conversion for the paddle controllers and communicate over SIO). If the sound channels are not paired, the counter performs a simple function: one of the three clock signals mentioned above is conveyed to its input, and with every cycle the value of the counter is decreased by one.

When a counter underflows, a logical 1 appears on its output and its value is reset to a user-specified value (stored in the matching control register AUDF1 - AUDF4). The output of the counter is used in other circuits. For example, when generating a pure tone, the logical 1 on the output of the counter toggles a flip-flop (so the frequency is divided by two once more).
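The behaviour described above can be sketched as a small simulation: a down-counter that is reloaded from AUDF on underflow and a flip-flop that is toggled by each underflow. The exact cycle on which the reload happens is an assumption of this model; with it, the tone frequency works out to clock / (2 * (AUDF + 1)). The real chip reportedly needs a few extra cycles when clocked directly from the CPU clock, which this sketch ignores. The function names are hypothetical.

```python
def simulate_channel(audf, clock_ticks):
    """Simulate one 8-bit divisor feeding a toggle flip-flop.

    Returns a list with the flip-flop output (0 or 1) after each clock tick.
    """
    counter = audf
    flip_flop = 0
    output = []
    for _ in range(clock_ticks):
        if counter == 0:
            # Underflow: reload the counter and toggle the flip-flop.
            counter = audf
            flip_flop ^= 1
        else:
            counter -= 1
        output.append(flip_flop)
    return output


def tone_frequency(clock_hz, audf):
    """Frequency of the square wave under the model above."""
    return clock_hz / (2 * (audf + 1))


print(simulate_channel(3, 16))
print(tone_frequency(64_000, 3))
```

With AUDF set to 3, the flip-flop changes state every four ticks, so one full period of the square wave takes eight ticks of the input clock.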

By setting the control register of the POKEY chip, you can configure one of the three basic combinations of the sound channels:

  1. Four independent sound channels. The frequency of each channel is given by an 8-bit divisor that has one of the clock signals (ca. 16 kHz or ca. 63 kHz) on its input. This configuration is used by Atari BASIC (the well-known SOUND a, b, c, d statement). In this mode, one can generate tones with a range of 4 octaves. By choosing a different clock signal, it is possible to shift the whole scale of the 256 distinct tones.
  2. Two independent sound channels. The frequency of each channel is given by a 16-bit counter created by pairing two 8-bit ones. In this case, the CPU clock frequency can also be conveyed directly to the divisor. This means that one can choose between three input signals (ca. 16 kHz, ca. 63 kHz and 1.77 or 1.79 MHz depending on the TV standard). The resulting tone scale extends beyond the limits of human hearing (into infrasound and ultrasound) and beyond what amplifiers can reproduce.
  3. One sound channel controlled by a 16-bit divisor and two channels controlled by 8-bit divisors. This configuration was used by many musicians, who dedicated the finely controlled channel to a bass instrument and the two remaining channels to percussion (sometimes sampled) and to the main melody, which is often played at higher frequencies.

Note: When two sound channels are paired, the number of voices that can play at the same time decreases from 4 to 3 or 2. On the other hand, the precision of the counter is increased, because the input frequency can be divided by values from 1 to 2^16. In this case, one of the higher clock frequencies (63 kHz or the CPU clock) is usually selected as the input; otherwise, dividing by too high a value would produce infrasound on the output of the sound channels. Infrasound with a rectangular waveform would be heard only as "pops" of the speaker membrane whenever the amplitude changes steeply.
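To see why the higher clocks are preferred for paired channels, the following sketch estimates the lowest and highest tone for a given input clock and divisor width. It assumes division ratios from 1 to 2^bits as stated in the note, followed by the extra division by two in the flip-flop, and it uses rounded clock values, so the numbers are only indicative.

```python
def divider_range(clock_hz, bits):
    """Return (lowest, highest) tone frequency in Hz for a divisor of given width."""
    max_ratio = 2 ** bits              # division ratios run from 1 to 2**bits
    highest = clock_hz / 2             # ratio 1, then divided by two in the flip-flop
    lowest = clock_hz / (2 * max_ratio)
    return lowest, highest


for name, clock_hz in (("16 kHz", 15_700), ("63 kHz", 63_900), ("CPU", 1_790_000)):
    for bits in (8, 16):
        low, high = divider_range(clock_hz, bits)
        print(f"{name:>6} clock, {bits:2}-bit: {low:10.2f} Hz .. {high:10.1f} Hz")
```

With the slowest clock and a 16-bit divisor, the lowest tone falls well below 1 Hz, which is the infrasound case described in the note.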

Now let me describe the subsystem for generating noise or distorted periodic signals. The subsystem is based on the LFSRs described in the previous part, i.e. shift registers with a feedback loop (two selected bits are returned through a XOR gate to the input of the register). LFSRs are sometimes called polynomial counters. The POKEY circuit contains three polynomial counters in total (4-bit, 5-bit and 17-bit). The output of these polynomial counters can control a selected sound channel. All four sound channels share the same polynomial counters; however, each sound channel can be set to a different frequency, so the resulting sound will be different. When the 17-bit polynomial counter is used (see below), the period after which the same sequence repeats is so long that the counter can be considered a generator of random impulses that create white noise.

The principle of controlling a sound channel with a polynomial counter is actually very simple. The control is performed by a simple logic gate and a D flip-flop. The polynomial counters change their output value very quickly (as they are driven directly by the CPU clock, i.e. 1.79 MHz in an NTSC computer or 1.77 MHz in a PAL computer), but the maximum frequency on the output of a sound channel is limited to the frequency obtained from the 1:N divisor. This is because of the connection to the logic gate and the flip-flop. In other words, the value on the output of the flip-flop cannot change with a frequency higher than the frequency conveyed to its CLK input.
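The limiting effect of the flip-flop can be illustrated with a simplified model. In the sketch below, a small 4-bit shift register (standing in for any of the polynomial counters) advances on every CPU cycle, while the D flip-flop latches its output only when the 1:N divisor emits a pulse. The exact gate wiring inside the chip is not modelled, and all names are hypothetical.

```python
def lfsr4_bits():
    """Endless stream of bits from a 4-bit shift register with XOR feedback."""
    state = 0b0001
    while True:
        yield (state >> 3) & 1                        # last bit is the output
        new_bit = ((state >> 2) ^ (state >> 3)) & 1   # third bit XOR last bit
        state = ((state << 1) | new_bit) & 0b1111


def gated_channel(divide_by, cpu_cycles):
    """Output of a D flip-flop that samples the poly counter on divisor pulses."""
    poly = lfsr4_bits()
    q = 0
    out = []
    for cycle in range(1, cpu_cycles + 1):
        poly_bit = next(poly)            # polynomial counter runs at full CPU speed
        if cycle % divide_by == 0:       # pulse from the 1:N divisor
            q = poly_bit                 # flip-flop latches the current poly bit
        out.append(q)
    return out


def count_transitions(samples):
    """Number of level changes in a list of samples."""
    return sum(a != b for a, b in zip(samples, samples[1:]))


signal = gated_channel(divide_by=7, cpu_cycles=700)
print(count_transitions(signal), "transitions in 700 CPU cycles")
```

No matter how quickly the shift register changes, the output can change at most once per divisor pulse, i.e. at most 100 times in this run.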

In fact, it is a very elegant solution: the designers of the POKEY chip made do with just three polynomial counters shared by all four sound channels. We encounter the same minimalism in the high-pass filter described below.

The polynomial counter, i.e. the generator of pseudo-random noise, is a shift register driven by an external clock signal. With every clock cycle, the value in the register is shifted: four, five, seventeen, or (in a special case) nine bits are moved by one binary digit. A binary value obtained from a XOR gate, whose inputs are connected to the third and the last bit of the shift register, is conveyed through a feedback loop to the input of the shift register, i.e. the value on the output of the gate is written to the first bit.

If we consider the logical function represented by the XOR gate, it follows that after the POKEY chip is initialized, the shift register may hold any non-zero value: starting from such a state, it is quickly populated with pseudo-random data that repeat periodically. The length of the period depends on the bit width of the shift register itself.

If the register is n bits wide, the period is equal to 2^n - 1 cycles, as the remaining state - all zeroes - constitutes a separate (and unimportant) cycle. The last bit of the shift register represents its final output, i.e. a sequence of pseudo-random binary values that we can hear after it has been processed by the POKEY circuit and the amplitude has been set.
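The period of 2^n - 1 cycles can be verified with a short simulation. The sketch follows the description above literally (XOR of the third and the last bit written to the first bit) and checks only the 4-bit and 5-bit registers; whether the longer registers in the real chip use exactly the same tap positions is not assumed here. The function names are made up for this example.

```python
def lfsr_step(state, width):
    """Advance the shift register by one clock.

    Bit 0 of `state` is the first bit (input side), bit width-1 is the last
    bit (output). Returns (new_state, output_bit).
    """
    output_bit = (state >> (width - 1)) & 1
    third_bit = (state >> 2) & 1
    feedback = third_bit ^ output_bit
    mask = (1 << width) - 1
    new_state = ((state << 1) | feedback) & mask
    return new_state, output_bit


def lfsr_period(width, seed=1):
    """Count the clock cycles until the register returns to its seed value."""
    state, steps = seed, 0
    while True:
        state, _ = lfsr_step(state, width)
        steps += 1
        if state == seed:
            return steps


for width in (4, 5):
    print(f"{width}-bit register: period {lfsr_period(width)}, "
          f"expected {2 ** width - 1}")
```

Starting from the all-zero state, lfsr_step keeps returning zero forever, which is the separate one-state cycle mentioned above.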

While the 4-bit and 5-bit polynomial counters generate pseudo-random sequences that repeat relatively quickly (one can emulate the sound of aircraft engines, for example), the sequence of the 17-bit polynomial counter is long enough to sound random. This counter can be reconfigured so that its width is decreased to 9 bits (this probably has to do with an effort to keep up with the sound capabilities of the TIA chip). The reconfiguration is simple: the feedback loop is connected to the 8th bit instead of the 1st bit.

The last interesting function of the POKEY chip related to sound generation is the possibility to add a so-called "high-pass filter" to the processing chain. We shouldn't be misled by the name, as it is nothing like a real analogue or digital high-pass filter, but "only" a D flip-flop augmented with a gate. The output of one channel is conveyed to the D (data) input. The output of another channel is conveyed to the CLK (clock) input of the flip-flop. Moreover, the output of the first channel is combined through a XOR gate with the Q output of the D flip-flop. What does this mean? Every edge that occurs on channel 1 inverts the output of the whole circuit, while every edge that occurs on channel 2 sets the output to zero (because the same values then appear on both inputs of the XOR gate).

The result is a signal whose waveform somewhat resembles the result of PWM (pulse-width modulation) when the frequencies of both channels are close to each other. And that, apart from other tricks, is the purpose of the high-pass filter.
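The behaviour of this "filter" is easy to try out in a simulation. The sketch below feeds two square waves with slightly different periods into a D flip-flop and a XOR gate as described above. It assumes that the flip-flop latches on rising edges of the second channel only; the real chip may clock it differently, so this is an illustration of the principle, not a cycle-exact model.

```python
def square_wave(half_period, ticks):
    """Square wave that starts at 0 and toggles every `half_period` ticks."""
    return [(t // half_period) & 1 for t in range(ticks)]


def high_pass(ch1, ch2):
    """Combine two channels with a D flip-flop and a XOR gate.

    ch1 drives the D input and one XOR input, ch2 drives the CLK input.
    """
    q = 0
    prev_clk = 0
    out = []
    for data, clk in zip(ch1, ch2):
        if clk and not prev_clk:     # rising edge on CLK: latch the D input
            q = data
        prev_clk = clk
        out.append(data ^ q)         # XOR of channel 1 with the latched value
    return out


ch1 = square_wave(10, 200)
ch2 = square_wave(11, 200)
print("".join(str(bit) for bit in high_pass(ch1, ch2)))
```

In the printed sequence, the runs of ones become gradually longer as the two waves drift apart, which is the PWM-like effect mentioned above.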

If there is a need (and also sufficient CPU power), it is possible to switch the rectangular signal generation off completely and control only the amplitude on the output of each sound channel. This way, it is possible to play back sampled sounds. Given the properties of the POKEY chip, only 16 volume levels are available, with a dynamic range of just 24 dB (compare with CD-Audio using 16-bit sampling with a dynamic range of 96 dB and SID using 8-bit sampling with a dynamic range of ca. 48 dB, roughly equivalent to tape recording).
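The quoted dynamic ranges follow from the usual rule of thumb of roughly 6 dB per bit of resolution. A minimal sketch of the calculation, assuming ideal linear quantization (which, as noted earlier, the POKEY output only approximates):

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear samples with the given bit depth."""
    return 20 * math.log10(2 ** bits)


for bits in (4, 8, 16):
    print(f"{bits:2}-bit samples: {dynamic_range_db(bits):.1f} dB")
```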

In fact, if necessary, digitized sound can be played back in all channels together. The number of volume levels (and the dynamic range) is slightly increased. The sum of intensities is not linear but logarithmic. Sampled speech was used in Atari versions of the Ghostbusters or Berzerk games. Sampled percussion instruments are more common.

Apart from sound synthesis, the POKEY chip has been used for reading the keyboard (native support for up to 64 keys, with the Control, Shift and Break keys processed specially), for communication with peripherals connected via the serial port (the well-known SIO interface - Serial Input/Output), as a pseudo-random number generator, and as a timer (three audio channels could be switched to timer functions, which was used when working with the data recorder).

It was the combination of various functions in one 40-pin integrated circuit that allowed reducing the total number of circuits in all 8-bit Atari computers. That resulted in (to a greater extent than today) a relatively low total price of a computer or console, a reduced failure rate etc. Just for clarification: in the classic 8-bit Atari computers, there were four "big" circuits with 40 pins - the MOS 6502 CPU, the ANTIC graphics coprocessor, the CTIA and later GTIA graphics chips, and last but not least the versatile POKEY chip.

Pavel Tišnovský
2018