almikel Posted January 10, 2016 What is this 'auditory time resolution'? My googling leads to some fairly abstract research on hearing impairment, and similar results around perception of gaps in low frequencies. https://www.google.com.au/search?q=auditory+time+resolution http://link.springer.com/chapter/10.1007/978-3-642-70622-6_9 My basic understanding is that the linked video states our brain can determine time cues as short as 7 µs, even though our hearing only works up to 20 kHz. 7 µs equates to a frequency of about 142 kHz (T = 1/f). If that's true, it still doesn't mean we need to sample at 2 x 142 kHz to capture those cues, because no transducer in the chain can deal with that anyway, and no musical instrument has anything like the rise time of a 142 kHz signal. I'm fascinated by the claim that our brains can resolve to that level (7 µs) while we can only hear to 20 kHz or so - I wonder why evolution determined that we needed such fast time resolution, but our hearing only needed to go to 20 kHz? cheers Mike
Nada Posted January 10, 2016 The crucial info, to me, was that humans have an auditory time resolution of ~7 µs (see 11:52 in the original video). A 1 kHz signal takes 1,000 µs to complete a single cycle. If recorded at 48 kHz sampling, it will only be sampled every 20.8 µs. So the original signal could change up to 3 times over a 21 µs period, and that change might be completely lost in a 48 kHz recording. A transient that doubled in volume then returned to baseline 20 µs later might be completely lost in a CD (or 48 kHz) recording. Sure, I'll accept that a full cycle of 1 kHz can be perfectly recreated via a CD-quality recording, but then you are talking about a measurement time that is 142 times longer than our hearing can resolve in the time domain. Am I missing something? The start of a tone is captured exactly even when the sampling interval is much longer than the timing cue in question. You might be labouring under the assumption that a sample only captures the maximum amplitude of a wave? The timing accuracy depends on the clock jitter of the ADC/DAC, and that's reliably under one nanosecond - over a thousand times better than needed for temporal cueing.
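For reference, the arithmetic behind the numbers traded in these two posts - the ~20.8 µs sample period, the 142 kHz equivalent of 7 µs, and the "142 times longer" ratio - checks out. A quick pure-Python sanity check, nothing more:

```python
# Sanity-check the figures quoted in the posts above (pure Python arithmetic).
fs = 48_000.0                  # sample rate, Hz
sample_period_us = 1e6 / fs    # time between samples, in microseconds
print(sample_period_us)        # ~20.8 us, as stated

t_res = 7e-6                   # claimed auditory time resolution, 7 us
f_equiv = 1.0 / t_res          # frequency whose full period T = 1/f is 7 us
print(f_equiv / 1000)          # ~142.9 kHz, i.e. the '142 kHz' figure

ratio = 1000.0 / 7.0           # one 1 kHz cycle (1000 us) vs the 7 us claim
print(ratio)                   # ~142.9, the '142 times longer' figure
```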
Guest rmpfyf Posted January 10, 2016 (edited) The crucial info, to me, was that humans have an auditory time resolution of ~7 µs (see 11:52 in the original video). A 1 kHz signal takes 1,000 µs to complete a single cycle. If recorded at 48 kHz sampling, it will only be sampled every 20.8 µs. So the original signal could change up to 3 times over a 21 µs period and that change might be completely lost in a 48 kHz recording. A transient that doubled in volume then returned to baseline 20 µs later might be completely lost in a CD (or 48 kHz) recording. That statistic, which is central to the video shown, is complete BS. Honestly. An auditory temporal resolution of 7 µs would mean 192 kHz (after Nyquist) isn't going to cut it either. We'd need 2 x 142 kHz (I see this has been posted as I type). What does "auditory temporal resolution" even mean? Does it mean "I heard a chirp"? At what frequency? At what age? Against what noise floor? Does it mean "if we can discern a change at 7 µs, we can discern changes at a time-base rate of 7 µs"? That's crap, because it would imply a frequency response way, way in excess of anything we can hear. How is it consistently measured? Show me a study that justifies 7 µs beyond a blanket statement, and we'll talk about it. We've got a bloke on the web presenting it as fact. It's not. Whether the wave changes a billion times between adjacent points in the scenario you suggest or not doesn't matter - it's frequency content you cannot hear. Your ears have zero frequency response here. Assuming you've got the audio system to deliver it, a young puppy might show signs of some toe-tapping PRAT or flee the scene, but you will hear no difference. There is spectral content in audio generated above 20 kHz - a cymbal crash is a good example. Also, whilst we view content in frequency spectra, we don't listen to spectra; we listen to waveforms in the time domain.
Spectra are simply convenient data for visualisation, and are consistent with the means by which we digitise discrete samples in a time history. Could the rise of a given waveform sound different? A more significant acceleration of air? Could it even be generated (everything beyond the DAC) at reasonable amplitudes? Questionable. A cymbal - about the hardest thing one can smack to generate a noise you'd want to listen to - doesn't generate anything over 50 kHz, so what's with 192 kHz for playback at any rate? Can your system replicate even that? That's the most relevant argument from a 'can we hear the difference in content reconstruction' perspective. The rest is filtering. The best thing those extra samples do is provide a greater number of samples from which to linearise a signal with Fourier methods, which (for the frequencies we care about) provides a greater statistical probability of reducing jitter in reproduction where there's phase noise in the methods employed for reconstruction. And you can hear this. Then again, you could just spend money on decent re-clocking and be happy. It's simply easier to ram a more time-resolute sample through a DAC than it is to find a super-low-phase-noise oscillator. The oscillator is harder to find, but not impossible. Edited January 10, 2016 by rmpfyf
JDWest Posted January 10, 2016 These guys at least attempt to put sampling speed and temporal resolution in context with the limitations of other equipment: http://www.yamahaproaudio.com/global/en/training_support/selftraining/audio_quality/chapter5/09_temporal_resolution/ I'd guess we evolved such good time resolution because it relates to survival; determining the direction of a sound via small timing differences in arrival at the left vs. right ear. Not sure why people keep talking as if sample rate only relates to such high frequencies (above Nyquist). Clearly, it also relates to temporal information contained in much lower frequencies. At a guess, I'd suggest our temporal resolution is better at higher frequencies than at lower frequencies. It seems the video used 1 kHz as its reference (based on the maths).
davewantsmoore Posted January 10, 2016 Now where I think the disagreement starts is the following: Academically we are supposed to gain extra benefit from higher sampling rates and larger bandwidth. But what happens in practice? In particular, given the first dot point, that we can't hear frequencies above 21 kHz? Some people will argue that it is a given that the audible limit automatically implies we can't ever reap the benefit of the extra information in the frequency domain and time domain - what we hear from higher-sampled music is exactly the same as what we hear from CD. Others may propose that we can't reap the benefit of extra information in the frequency domain (we still can't say 'I can hear that 37 kHz sound'), BUT we can reap the benefit of faster rise times in the time domain and sharper notes. This is the basis of the MQA proposal. They claim our sensitivity to timing is far finer than our sensitivity to frequency. .... and some say that while we CAN hear very short durations ..... we don't actually need higher sampling rates to reproduce those short time scales (for signals of lower frequency than half the sampling rate). This is the basis of the MQA proposal. They claim our sensitivity to timing is far finer than our sensitivity to frequency. This is actually quite a good point to go further into. Meridian don't claim that themselves..... they refer to others who carried out this research and made this claim (and nobody is arguing with that). What Meridian claim is that: Providing the sampling kernel is not too extended and that any subsequent quantization is properly dithered, then transient events can be accurately located in time [15]. However, higher sample rates do allow shorter details to be captured, improve dither convergence, and enable encoding kernels that provide much less uncertainty of an event's duration [14]. Breaking that down....
In the paper [15] (Lipshitz), it says: One often misunderstood aspect of sampled-data systems is the question of their time resolution - can they resolve details that occur between samples, such as a time impulse or step? To show that the time resolution is in fact infinitely fine for signals band-limited in conformity with the sampling theorem, and is completely independent of precisely where the samples happen to fall with respect to the time waveform, we shall now present some computed examples..... ie. signals can be accurately located in time, no matter what sampling rate is used. The next parts - allowing shorter details to be captured, and more accurate event duration - are just code for "high frequencies". The only "shorter details", beyond what can be captured by a certain system, are frequencies higher than half the sampling rate. ie. these "shorter details" are not audible. The other bit IS a good reason to use higher sampling rates: dither.... and also making the design of high-performance filters a lot easier.... but to take advantage of these, we don't need the audio distributed to us at high rates, as we can raise the sampling rate ourselves.
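The Lipshitz point quoted above can be sketched numerically: place a band-limited pulse with its true peak *between* samples, then recover the peak time from the samples to far better than one sample period. This is a hedged pure-Python illustration (the pulse, the peak position 100.3 and the search grid are all invented for the demo, not taken from the paper), using Whittaker-Shannon interpolation:

```python
import math

def sinc(x):
    # Normalised sinc, the ideal band-limited interpolation kernel.
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

# Band-limited impulse whose true peak falls between samples, at n = 100.3.
t0 = 100.3
samples = [sinc(n - t0) for n in range(200)]

def reconstruct(t):
    # Whittaker-Shannon reconstruction from the Nyquist-rate samples.
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

# Search for the peak on a grid 1000x finer than the sample period.
grid = [99 + k / 1000 for k in range(2001)]
est = max(grid, key=reconstruct)
print(est)   # ~100.3: sub-sample event timing from plain Nyquist-rate samples
```

The recovered peak lands on 100.3 to within the grid step, i.e. timing resolution is not limited to the sample period, exactly as the paper argues.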
almikel Posted January 10, 2016 I'm fascinated by the claim that our brains can resolve to that level (7 µs), but we can only hear to 20 kHz or so - I wonder why evolution determined that we needed such fast time resolution, but our hearing only needed to go to 20 kHz?? cheers Mike Off topic, but wondering at a philosophical level why we might need temporal resolution down to 7 µs - I just did some googling on a human's "reaction time", and it popped up drag racing sites of all things. A human's reaction time is in the vicinity of 0.21 s or 210 ms (which, coupled with the vehicle's reaction time, is why as a drag racer you hit the go button on the last yellow light, not the green light - who'd a thunk that? OK - ALL the drag racers amongst us have known that forever). Maybe the temporal resolution down to 7 µs gives the brain time to process some directional information, to make us react (after 0.21 s) and go in the right direction (away from the danger)??? Back to audio - sorry Mike
Newman Posted January 10, 2016 Not sure why people keep talking as if sample rate only relates to such high frequencies (above Nyquist). Clearly, it also relates to temporal information contained in much lower frequencies. No, it doesn't. As explained by others a few posts up. Any digital recording - I don't care if its sampling rate is 1 kHz - has timing accuracy better than a nanosecond. That's 1/7000th of the human perception limit under discussion.
JDWest Posted January 10, 2016 Haven't got time atm to read it or look for other citations of microsecond resolution: http://boson.physics.sc.edu/~kunchur/papers/Temporal-resolution-by-bandwidth-restriction--Kunchur.pdf
Nada Posted January 10, 2016 (edited) My basic understanding is that the linked video states our brain can determine time cues as short as 7 µs, even though our hearing only works up to 20 kHz. 7 µs equates to a frequency of about 142 kHz (T = 1/f). If that's true, it still doesn't mean we need to sample at 2 x 142 kHz to capture those cues, because no transducer in the chain can deal with that anyway, and no musical instrument has anything like the rise time of a 142 kHz signal. I'm fascinated by the claim that our brains can resolve to that level (7 µs), but we can only hear to 20 kHz or so - I wonder why evolution determined that we needed such fast time resolution, but our hearing only needed to go to 20 kHz?? cheers Mike Good question. Imagine you're bush-walking with snakes about; you hear a rustle and jump out of the way. Your brain unconsciously processes the spatial location. It does that in a different brain area to pitch discrimination, which suggests it has an important survival role. It does it very fast, saving your life. Later you will probably realise it was just a gecko rustling at your feet, but it was on your right and you instinctively jumped to the left. The brain does that, probably, by comparing the phase delay between your ears. To do that, the tone's pitch has to be quite low. Why? Well, the half wavelength has to be greater than the ear separation to register a phase delay unambiguously. If your ears are 200 mm apart, that allows a wavelength of 400 mm; sound travelling at about 333 m/s then gives a frequency below about 832 Hz. As phase delay is our most sensitive spatial cue and can only work properly below say 1000 Hz - or to be generous, 2000 Hz - it suggests ultrasound over 20 kHz is totally redundant. Sorry, I'm being ridiculous. We middle-aged males are unlikely to perceive above 15 kHz, consciously or unconsciously. It doesn't matter anyway. There's nothing melodically useful above 5 kHz - just harmonics giving a bit of "air". Edited January 10, 2016 by Nada
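The interaural phase-cue limit sketched in that post is a one-liner to check. The 200 mm ear spacing and 333 m/s figures are the post's own assumptions (the usual textbook speed of sound is closer to 343 m/s, which would push the limit to roughly 857 Hz):

```python
# Back-of-envelope check of the interaural phase-delay limit from the post.
ear_spacing = 0.200                   # metres, as assumed in the post
c = 333.0                             # m/s, the post's figure (~343 is typical)
wavelength_limit = 2 * ear_spacing    # phase is unambiguous while lambda/2 > spacing
f_limit = c / wavelength_limit
print(f_limit)   # ~832 Hz: phase-delay localisation only works below about this
```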
davewantsmoore Posted January 10, 2016 I'll go further and suggest (somewhat contentiously) that it's easier to make a modern signal chain sound good with hi-def material Are you thinking of any ways which couldn't be handled by oversampling?
Nada Posted January 10, 2016 Haven't got time atm to read it or look for other citations of microsecond resolution http://boson.physics.sc.edu/~kunchur/papers/Temporal-resolution-by-bandwidth-restriction--Kunchur.pdf That paper uses special experimental conditions to find the limit of auditory template phase discrimination. 5 µs is stunningly quick. The rate at which nerve cells can fire is much slower. It shows how amazing life and evolution are. Just note that in real life we can't hear down to 5 µs without special conditions, and we need to be wearing closed headphones.
almikel Posted January 10, 2016 No, it doesn't. As explained by others a few posts up. Any digital recording - I don't care if its sampling rate is 1 kHz - has timing accuracy better than a nanosecond. That's 1/7000th of the human perception limit under discussion. Hi Newman (and any others), So I think I've got a reasonable understanding of Nyquist, Shannon and information theory, but help me out here... If somehow the brain can resolve timing cues down to 7 µs, even though the ear/brain interface can only hear up to 20 kHz or so, could those timing cues (down to 7 µs), if present in an analogue signal, be replicated in redbook? I "think" the answer is yes, provided you haven't dropped into the noise floor - but my head is getting confused now... If you had one drum hit followed by another drum hit 7 µs later, would redbook capture that? (not saying you could hear it, or speakers could produce it). cheers Mike
davewantsmoore Posted January 10, 2016 Now, if we were to sample a more realistic music waveform (but complex enough to have some ultra-high-frequency content), the academic knowledge gained above is still applicable. That means the 192 kHz sampling will still give us a quicker rise time and a closer resemblance to the music waveform being sampled. (note - I am not suggesting we can actually hear this; it's academic at this point) This is correct (I know what you mean) .... but could confuse people a little. (If you already realise, then cool .... this comment is for anyone else.) If you have very high frequency content ..... then sampling at a "high" rate will capture the signal. Sampling at a "low" rate won't capture the signal correctly. For both the high sampling rate and the low sampling rate to be able to successfully capture the signal ..... first you need to filter out the high frequencies ..... and once you filter out the high frequencies, sampling at a "high" rate and a "low" rate produces identical results. The point is that.... if you have a signal to sample ..... then anything above a certain sample rate will work correctly. Below a certain sampling rate won't. Increasing the sample rate doesn't make it do a "better" (ie. more accurate) job.
davewantsmoore Posted January 10, 2016 Am I missing something? Yes! A transient that doubled in volume then returned to baseline 20 µs later might be completely lost in a CD (or 48 kHz) recording. Yes. It would.... but what IS a signal which rises and falls in 7 µs? .... It's a very, very high frequency. To sample audio at a 48 kHz sampling rate..... you have to filter out all the signal above 24 kHz first (otherwise it doesn't work). So.... saying that "a transient that doubled in volume then returned to baseline 20 µs later might be completely lost" .... is the same as saying that we can hear frequencies over ~20 kHz (which we can't). Hope that helps. A secondary question then normally comes ... saying, OK, if we just consider audible frequencies (ie. < 20 kHz) ..... then what about where THEY start and stop in time? .... The (incorrect) thinking being that if we don't have a small enough sample period, then we can't accurately define where in time a signal starts and stops. .... but a signal can start/stop in between sample instants, and for academic purposes we have infinite time resolution. The only thing a finite sample rate does is dictate the limit for how steeply a signal can rise (ie. how high in frequency it is).
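That last point - the band limit, not the sample rate, sets how steeply a signal can rise - can be put in numbers. The step response of an ideal brick-wall low-pass with bandwidth B is s(t) = 1/2 + Si(2πBt)/π (Si is the sine integral), and its 10-90% rise time works out to roughly 0.44/B, about 22 µs for a 20 kHz band, regardless of sample rate. A rough pure-Python sketch (the crude quadrature and grid are purely illustrative):

```python
import math

B = 20_000.0   # audio band limit, Hz

def Si(x, steps=2000):
    # Sine integral Si(x) by simple midpoint quadrature (crude but adequate here).
    if x == 0:
        return 0.0
    if x < 0:
        return -Si(-x, steps)
    h = x / steps
    return sum(math.sin((k + 0.5) * h) / ((k + 0.5) * h) * h for k in range(steps))

def step(t):
    # Step response of an ideal brick-wall low-pass band-limited to B.
    return 0.5 + Si(2 * math.pi * B * t) / math.pi

# Find the 10% and 90% crossings on a fine time grid, -50 us .. +50 us.
ts = [(-50 + 0.05 * k) * 1e-6 for k in range(2001)]
t10 = next(t for t in ts if step(t) >= 0.1)
t90 = next(t for t in ts if step(t) >= 0.9)
print((t90 - t10) * 1e6)   # ~22 us: the band limit alone sets the fastest edge
```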
davewantsmoore Posted January 10, 2016 Posted January 10, 2016 Maybe the temporal resolution to 7uS gives the brain time to process some directional information Yes. I've seen this mentioned as an expected reason.
davewantsmoore Posted January 10, 2016 Not sure why people keep talking as if sample rate only relates to such high frequencies (above Nyquist). Clearly, it also relates to temporal information contained in much lower frequencies. No, it doesn't. Proponents of high-res seem to be telling us that it does (or at least craftily implying it) ..... but it isn't so. Meridian, in their MQA article, even reference a paper which says just that. Lipshitz, S. P., Vanderkooy, J. - Pulse Code Modulation: An Overview One often misunderstood aspect of sampled-data systems is the question of their time resolution - can they resolve details that occur between samples, such as a time impulse or step? To show that the time resolution is in fact infinitely fine for signals band-limited in conformity with the sampling theorem, and is completely independent of precisely where the samples happen to fall with respect to the time waveform, we shall now present some computed examples.....
Nada Posted January 10, 2016 Time to reveal the real villain that kills imaging in stereo sound playback: phase delays. Digital filters have them, so 44.1 kHz down-sampled from 192 kHz by the studio will have phase aberrations. Is that a sufficient reason to want 192 kHz files? Maybe not, as phase delays in most speaker crossovers and drivers will dominate. For those with ESLs, MQA might be interesting in years to come....
davewantsmoore Posted January 10, 2016 If you had one drum hit followed by another drum hit 7 µs later, would redbook capture that? (not saying you could hear it, or speakers could produce it). Yes... but. Once you take away the high-frequency content, the two drum hits don't look like most people would imagine (ie. not like "impulse, gap, impulse") .... because they last a LOT longer than 7 µs; you just get two waveforms beginning 7 µs apart, which are mixed together.
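The "yes" part is easy to demonstrate: band-limit a stand-in "hit", make a copy delayed by 7 µs, and the 44.1 kHz sample values already differ, so the offset is encoded even though it is far smaller than the 22.7 µs sample period. A hedged sketch - the pulse shape, times and amplitudes are invented for the demo:

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

fs = 44_100.0

def hit(t, t0):
    # A stand-in 'drum hit': an impulse band-limited to 20 kHz, centred at t0.
    return sinc(2 * 20_000 * (t - t0))

# The same hit twice: once at t0 = 5 ms, once delayed by 7 microseconds.
a = [hit(n / fs, 0.005) for n in range(441)]
b = [hit(n / fs, 0.005 + 7e-6) for n in range(441)]

# The 7 us shift changes the captured sample values, so the delay survives
# sampling even though adjacent samples are ~22.7 us apart.
max_diff = max(abs(x - y) for x, y in zip(a, b))
print(max_diff > 0.05)   # True: the two streams are clearly distinguishable
```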
Guest rmpfyf Posted January 10, 2016 (edited) Are you thinking of any ways which couldn't be handled by oversampling? Oversampling brings about its own challenges so far as audiophile reproduction goes - ideally we're seeking the least filtering possible. Big sample rates allow more flexibility here. Edited January 10, 2016 by rmpfyf
Guest rmpfyf Posted January 10, 2016 (edited) Time to reveal the real villain that kills imaging in stereo sound playback: phase delays. Digital filters have them, so 44.1 kHz down-sampled from 192 kHz by the studio will have phase aberrations. Is that a sufficient reason to want 192 kHz files? Maybe not, as phase delays in most speaker crossovers and drivers will dominate. For those with ESLs, MQA might be interesting in years to come.... Yup. Group delay, phase response - it's not just amplitude response (which can be made very flat). This is at the heart of my earlier points - big sample rates allow more flexibility to this end: softer filters within audible frequencies. And you're 100% correct about additional frequency distortions downstream. In the frequency domain these can be multiplied throughout the signal chain, with the end result being the frequency response (amplitude and phase) of your entire system and transmission path at whatever point you listen to it. Play back at higher rates and a system can be employed with softer filter distortions in the audible range. But you could achieve the same - even better - with a calibrated microphone and a copy of DRC... I alluded to this earlier. Give it a shot; the difference is night and day, particularly when phase response is corrected over two channels to the same listening location, the net effect of which is sharpness, clarity, awesome. DRC is free, so's experimentation. Edited January 10, 2016 by rmpfyf
BradC Posted January 11, 2016 I reckon the main benefit of MQA is the inclusion of the apodized inverse A/D and D/A filters. The use of higher sampling rates probably just makes this filter more effective. They could probably produce a plugin that performs this function without any new format, but they can't sell that as easily. And you would only have a generic solution, not the 'authenticated' one that knows the A/D that was used in the recording (though this only needs metadata).
davewantsmoore Posted January 11, 2016 Time to reveal the real villain that kills imaging in stereo sound playback. Phase delays. Digital filters have them so 44.1 down-sampled from 192 by the studio will have phase aberrations That isn't necessarily true. When the amount of time it takes to do the decimation (down-sampling) doesn't matter.... then filters with no phase issues can be used.
davewantsmoore Posted January 11, 2016 (edited) Oversampling brings about its own challenges so far as audiophile reproduction goes - ideally we're seeking the least filtering possible. Big sample rates allow more flexibility here. ? What is the difference between receiving your audio directly from the studio with a sampling rate of 352.8 kHz .... and receiving it with a sampling rate of 44.1 kHz, then oversampling it 8x (to 352.8 kHz)? ie. How do "big sample rates allow more flexibility"? (in a way that oversampling can't provide an equivalent) Yup. Group delay, phase response, it's not just amplitude response (which can be made very flat). This is at the heart of my earlier points - big sample rates allow more flexibility to this end. Softer filters within audible frequencies. Phase response can also be made flat.... either with computationally heavy filters, or oversampling, or both. (sorry, if I've misunderstood something, please elaborate) Edited January 11, 2016 by davewantsmoore
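On "phase response can also be made flat": any FIR filter with symmetric coefficients has exactly linear phase, i.e. it delays every frequency by the same (N-1)/2 samples, so there is no frequency-dependent phase distortion at all. A small pure-Python sketch, assuming an arbitrary windowed-sinc design (the length and cutoff are made up for the demo):

```python
import cmath
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

# Symmetric windowed-sinc low-pass FIR. Symmetry alone guarantees linear phase.
N = 101          # filter length (arbitrary; odd gives an integer group delay)
fc = 0.2         # cutoff as a fraction of the sample rate (arbitrary)
delay = (N - 1) / 2
h = [2 * fc * sinc(2 * fc * (n - delay)) *
     (0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)))   # Hamming window
     for n in range(N)]

def H(f):
    # Frequency response at frequency f (as a fraction of the sample rate).
    return sum(h[n] * cmath.exp(-2j * math.pi * f * n) for n in range(N))

# After removing the constant (N-1)/2-sample delay, the residual phase in the
# passband is zero: a pure time shift, with no phase warp across frequency.
resids = [abs(cmath.phase(H(f) * cmath.exp(2j * math.pi * f * delay)))
          for f in (0.02, 0.05, 0.10, 0.15)]
print(max(resids) < 1e-9)       # True at every passband frequency tested
print(round(abs(H(0.05)), 2))   # ~1.0 passband gain
```

The trade-off Dave alludes to is that such filters are pure delay plus convolution, so flat phase costs latency and computation rather than being impossible.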
weirving Posted January 11, 2016 Watch the videos in the links provided again. Pls Dave I liken this to the car argument: electric is obviously going to be better, but there will be some that will dogmatically defend fossil-fuel ICE to the bitter end, even though it is inferior. Poor analogy. One of the main arguments back in the day against higher sampling rates was the cost in both hard drive space and CPU time. Nowadays hard drive space is cheap as air and processors are much, much faster. I still haven't made up my mind about whether 192 kHz is necessary for true high fidelity, but technical and cost limitations are no longer barriers to its wide use. As for electric versus ICE cars, while it's true that electric in some form is likely the way of the future, there are still major limitations to the technology which prevent universal adoption. To wit: I will buy an electric car when I can afford one; today they are nearly twice the cost of a similarly sized and appointed ICE. I will consider an electric car when I can drive 300 miles or more per single charge and then recharge at a recharging station in 15 minutes or less (about the length of a gasoline refuel stop). The electrics of today are feasible for local commuting only. For long road trips they are still laughably impractical. In short, unlike with so-called high-res audio, cost and technical factors are still major barriers to mass adoption.
Guest rmpfyf Posted January 11, 2016 (edited) @@davewantsmoore, it's really down to how 'hi' is your 'hi-fi', which isn't necessarily representative of a linear progression of 'better'. The example you give ('what's the difference') is indifferent (assuming effects owing to system design, e.g. a more intensive oscillator etc., are inaudible) with respect to generating a high-sample-rate time history. To get 44.1 kHz from the master, though, there is resampling with inherent filtering, and practically, in either case, the LP filter post conversion will change. It's remastering, not lossless compression. Let's be careful not to confuse oversampling in the ADC, mastering and DAC processes. Oversampling exists for a number of reasons - initially to reduce noise, and beyond this to allow easier filtering implementations. Without it a sharp and typically expensive filter is needed. With it, a simpler analogue filter can be applied. It'll have a much softer slope and different phase effects. On a sufficiently resolute system and to a sufficiently trained ear, this will sound different (the audiophile's 'black hole' - the more you learn, the more you know, the more you spend...). Your suggestion negates the reason oversampling came into being and typically exists in digital-to-analogue conversion: the R2R vs delta-sigma DAC trade-off. The former recreates a waveform very accurately in one step, requires very little post-conversion filtering (reconstruction, not mastering, here), does away with charge capacitance issues and the like in the output stage, and is very expensive because the resistor ladder requires precision manufacture. The latter takes an inherently noisy signal and relies on the degree to which it can be cleaned up. We are ultimately speaking of trading resolution for accuracy in a paradigm sense, and refining the solution from there. This is at the core of much of hi-res playback.
An accurate, uber-hi-res system is difficult to implement, particularly from a filtering perspective, and a system that is both accurate and resolute beyond the range of human hearing offers nothing of tangible value. This is not why we have hi-res! It's just the easiest way to sell it. We're not going to see YouTube clips of men with beards saying 'buy MQA, because unless you're a dedicated Redbook nut, the rest of your audio system is probably an inherent compromise that we'll make audibly easier' (though it's a broad truth). Yes, delta-sigma DACs don't all need to be single-bit - some are hybrid R2R - but there's a mega filter at the end of it by definition, there's a significant amount of noise created by definition, and there's a significant oscillator that needs to be employed to make it tick. So we oversample a lot. Take the AK4490 used in the Klein - 256x oversampling, 8x filter. Its designer's benchmark is an R2R DAC. When was oversampling kicked off? Around 1980. As a reproduction technique, oversampling is still under development. It will certainly continue to improve, just as it's certain that R2R chip production is dead. There are some things that are just inherently difficult to avoid in delta-sigma designs - reproducing a sound just above the noise floor, for instance. In an R2R design, silence is 0 V; in a delta-sigma design, even if oversampled many times at higher bit depth and time resolution, the end result is not 0 V. Granted, we don't spend money to listen to complete silence, but it's a corner case that does tangibly contribute to the way we listen to music. If signal processing - not filtering - then you can dial up whatever phase and amplitude correction combination you like. A high-speed, high-transistor-count chip is probably a noise source you don't want near your DAC. Do it at the source or on the source material - see my earlier note on running DRC on a PC, or just using it to convert (linearise) source material before playback.
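The delta-sigma idea being debated above can be caricatured in a few lines: a 1-bit quantiser with error feedback (noise transfer function 1 - z^-1) turns the input into a crude ±1 bitstream, but pushes the quantisation error toward high frequencies, so a simple decimation filter recovers the in-band signal. This is a toy sketch of the principle only, not any real chip's modulator; the signal, window length and oversampling figures are invented:

```python
import math

def shape(x):
    # First-order noise shaper, error-feedback form (NTF = 1 - z^-1):
    # the previous quantisation error is fed back before re-quantising.
    out, e = [], 0.0
    for s in x:
        u = s + e                      # add back the previous error
        y = 1.0 if u >= 0 else -1.0    # 1-bit quantiser
        e = u - y                      # error carried into the next sample
        out.append(y)
    return out

n = 8192
sig = [0.5 * math.sin(2 * math.pi * 4 * i / n) for i in range(n)]  # slow tone
bits = shape(sig)   # a stream of +/-1 values only

# Crude decimation filter: a 64-sample moving average of the raw bitstream
# already lands close to the original signal, because the shaped noise
# telescopes away in the average.
w = 64
avg = [sum(bits[i:i + w]) / w for i in range(n - w)]
err = max(abs(avg[i] - sig[i + w // 2]) for i in range(len(avg)))
print(err < 0.1)   # True: recovered signal error is far below the +/-1 steps
```

A real converter uses higher-order loops, multi-bit quantisers and proper decimation filters, which is exactly where the "significant filtering and noise by definition" point above comes from.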
Signal processing is inherently different as a process to filtering, though DSP can be used to implement filtering. It's uncommon. Edited January 11, 2016 by rmpfyf