HRT Music Streamer: some explanations from the manufacturer about asynchronous drift
Asynchronous transfer protocol. For detailed reading on the subject, I would recommend the USB forum's paper entitled "Universal Serial Bus Device Class Definition for Audio Devices" which is available at:
http://www.usb.org/developers/devclass_docs/audio10.pdf
In direct reply to the question "...but how can this new master clock control a computer's bus timing...": this is the misunderstanding that needs to be corrected. The device (Streamer) does not control the host's (computer's) bus timing; the computer runs completely independently of the Streamer. What is controlled is the size of the data packets that the host (computer) sends to the device (Streamer).
The term asynchronous ('not synchronous') means that the clocks within the host (computer) and the device (Streamer, in this case) are not synchronized at all. Both are free-running, and while they may be close in frequency (or just as easily may not be), neither has any fundamental control over the other. The clock generated by the host is typically a very poor one, as it is subject to many sources of modulation from both hardware and software; using it in any way virtually guarantees high levels of jitter. The asynchronous approach allows both ends to operate independently: via a communication mechanism, the host polls the device (Streamer) at a predetermined interval for a 'feedback' value that the device calculates and supplies. The host then modifies its data payload, sending more or fewer samples in subsequent data packet(s). This allows the data rates to match on average, yet frees the device from the poor clock performance of the host. The actual mechanism for calculating and supplying this feedback data (and for handling host errors) is complex and well beyond what can be conveyed in this limited space, but at the surface level the concept is very easy to understand.
Let me give an example.
Say that the host clock is intended to supply audio at 44,100 Hz (the CD data rate) but, due to its very poor nature, is actually running at an average frequency of 44,099 Hz. In this same example, the device is generating an accurate, and more importantly low-jitter, clock of exactly 44,100 Hz for driving its audio circuits. The host's clock is thus 0.0022676% slower than it should be, and this 1 Hz difference would quickly cause the two ends to drift apart: after just a few thousand frames the data packets would become corrupt, a complete loss of function would result, and the output would be described as completely random noise, not audio. But since the data comes in packets and the number of audio samples in each packet is under software control, the device sends the host (in response to a request from the host) a value that describes the difference between the two data rates. This value is called the feedback value.
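The arithmetic behind this example can be checked in a few lines (the numbers are taken directly from the text above; variable names are illustrative):

```python
# Worked numbers from the example: a host averaging 44,099 Hz against
# a device clock of exactly 44,100 Hz.
host_rate = 44_099      # samples per second the host actually delivers
device_rate = 44_100    # samples per second the device consumes

# How far off the host clock is, as a percentage of the nominal rate.
error_pct = (device_rate - host_rate) / device_rate * 100
print(f"{error_pct:.7f}%")        # 0.0022676% slow

# Without feedback, the device falls behind by one sample every second,
# so a buffer holding N samples of fill underflows after roughly N seconds.
deficit_per_second = device_rate - host_rate
print(deficit_per_second)          # 1
```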
In a streaming audio application the data is sent in packets that arrive once every millisecond (1/1000 of a second, i.e. 1,000 data packets per second). Each packet would need to contain 44.1 samples, but only whole samples can be sent, so the host sends 44 samples for 9 frames (a total of 396 samples) and then, on the 10th frame, sends 45 samples. This makes a total of 441 samples in 10 frames, and the process repeats over the next 10 frames. Since in my example the host is running slower than it should, the device's buffer would eventually underflow (run out of data) and the whole system would crash; but with the device calculating the host's error, it can request that an occasional data packet carry an additional sample, keeping the two ends in relative synchronization while still allowing their clocks to run completely independently. As long as the buffer neither underflows (runs out of data) nor overflows (runs out of space), the system works perfectly.
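The 44/45 packet pattern described above falls out of a simple fractional accumulator. This is only an illustrative sketch, not HRT's actual firmware; exact rational arithmetic is used so that the long-run average is exactly 44,100 samples per second:

```python
from fractions import Fraction

def packet_sizes(rate_hz, frames_per_second, n_frames):
    """Split a fractional samples-per-frame target into whole-sample
    packets whose running average matches the target exactly."""
    per_frame = Fraction(rate_hz, frames_per_second)   # 44100/1000 = 44.1
    sizes, acc = [], Fraction(0)
    for _ in range(n_frames):
        acc += per_frame
        whole = int(acc)        # send only the whole samples accrued so far
        acc -= whole
        sizes.append(whole)
    return sizes

sizes = packet_sizes(44_100, 1_000, 10)
print(sizes)        # [44, 44, 44, 44, 44, 44, 44, 44, 44, 45]
print(sum(sizes))   # 441 samples in 10 ms, i.e. 44,100 per second
```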
For a situation where the host is running faster than the device, the device sends feedback data that reduces the number of samples in the occasional frame. Again, the system stays intact via the feedback mechanism.
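Both cases can be seen in a toy simulation. This is only a sketch of the idea (the real USB Audio Class feedback value is a fixed-point number defined in the specification, not this crude "one extra or one fewer sample" rule), but it shows the device buffer staying bounded once feedback is applied, in either direction:

```python
DEVICE_RATE = 44_100      # samples/s the device consumes (its own clock)
TARGET_FILL = 100         # samples of buffer fill the device aims to hold

def simulate(host_rate, seconds, with_feedback):
    """One loop step = one 1 ms frame. Returns (min, max) buffer fill."""
    fill = TARGET_FILL
    lo = hi = fill
    host_acc = dev_acc = 0.0
    for _ in range(seconds * 1_000):
        host_acc += host_rate / 1_000.0       # samples accrued at the host
        sent = int(host_acc)
        host_acc -= sent
        if with_feedback:                     # crude stand-in for feedback:
            if fill < TARGET_FILL:
                sent += 1                     # request one extra sample
            elif fill > TARGET_FILL:
                sent -= 1                     # request one fewer sample
        fill += sent
        dev_acc += DEVICE_RATE / 1_000.0      # samples consumed by the device
        used = int(dev_acc)
        dev_acc -= used
        fill -= used
        lo, hi = min(lo, fill), max(hi, fill)
    return lo, hi

print(simulate(44_099, 60, with_feedback=False))  # fill drains ~1 sample/s
print(simulate(44_099, 60, with_feedback=True))   # slow host: held near 100
print(simulate(44_101, 60, with_feedback=True))   # fast host: also held near 100
```

Without feedback, the buffer loses roughly one sample of fill per second and will eventually underflow; with feedback, the fill oscillates within a few samples of its target regardless of which clock is faster.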
Consider the situation where the host generates a very poor clock interval (due to task handling, hardware design, or, as is nearly always the case, a combination of both), so that each sample arrives on time, early, or late. This is exactly what happens in practice, and the result is classic jitter. Since the device (Streamer) is independent of this, there is zero impact on conversion performance: our device operates only from its own locally generated clock. The jitter level is typically (with most computers) a factor of 100 to 1000 times lower for an asynchronous approach than for any of the other protocols available to an audio streaming (isochronous, or constant-interval) task.
Hopefully, this brief explanation gives one adequate information to understand how an asynchronous transfer protocol can be completely independent of the jitter that would otherwise result in any computer-sourced audio application. What should be obvious is that S/PDIF is also subject to this same poor clock performance of the host (computer), as there is no feedback mechanism and the device must extract the clock from the data itself (which adds even more jitter). Compared to asynchronous USB, S/PDIF can be tens of thousands of times lower in performance, which by contrast makes it a very poor choice for anything even approaching high-performance audio.
Kevin Halverson-HRT