On the Measurement and Audibility of Headphone Break-in

The Experiment
A couple of days before I left for CanJam at the Rocky Mountain Audio Fest, I stuck a brand new pair of AKG Q701 on my Head Acoustics measurement head, positioned it properly in the measurement chamber, and closed the door. Then I waited a couple of days for the ear pads to settle in. Before heading out the door for the 700-mile drive to Denver, I pushed the button to start the last break-in test.

For the next 300 hours or so, the headphones played pink noise at 90 dB SPL for an hour, rested silently for 10 minutes to cool down, and then underwent a battery of tests including frequency response, THD+noise vs. frequency, intermodulation distortion spectra, and impulse response. The system glitched a couple of times and had to be restarted, so the test was really done in three runs of roughly 100 hours each; there was a one-day break between the first two, and a break of a couple of hours between the middle and last data sets. (For some dumb reason, Excel would choke at about 110 hours. Argh!)

A total of about 1.7 million individual measurements were made over the 330 hours.

After the test was completed and the three chunks of data were assembled into one spreadsheet, I sent it off to a couple of folks for a look-see. Arnaud was a real trooper and did a bunch of number crunching. It took him numerous tries to get the data to do his bidding, but after persistent effort, he managed to produce some movies of the data changing over time. Let's have a look ....

Frequency Response
Here's Arnaud's video of the frequency response over the 330 hours of testing. The first bit of the video toggles between the beginning and the end FR, and then the video goes through all 330 hours of measurement.

Here's what I see in this data:

  • At about 2kHz I see a dip that kind of breathes back and forth. We'll see this action in a more pronounced way in the cumulative spectral decay plots. I believe what we're seeing here is the day/night temperature cycle. We did see some similar artifacts in previous experiments.
  • I see the data jump around sometimes. I'm not sure where this is coming from, but I think Arnaud might have been calculating FR from impulse response data. Because the system sets the level for the measurement using a maximum length sequence signal, which sounds much like pink noise, it might be that the signal level of this test is not set reliably at the same number due to the noisiness of the signal.
  • If you assume the "breathing" and the glitches are not from break-in, there's not much left in the data that might point in that direction. If break-in happens and is measurable, it doesn't seem to be visible in the frequency response data.
As the methods used by Arnaud couldn't yield low frequency FR information, and many people claim break-in improves bass response, I created this plot that shows the raw FR data for the left and right channels at the start and the end of the test.


I've offset the right channel down 5dB for clarity. I breathed a sigh of both relief and angst looking at these data. It's fairly obvious that:

  • The system is performing quite reliably, as the start and finish plots of both channels overlay each other almost perfectly over roughly 12 days of measurements. I'm very happy to see the gear looking so stable.
  • There is virtually no change in FR data over the 330 hours. If break-in exists, it's not affecting basic frequency response.
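
One way to put a number on "virtually no change" is to subtract the start curve from the end curve and report the largest difference in the audio band. The Python sketch below is illustrative only: the function name, the band limits, and the synthetic curves are assumptions made for the example, not the actual measurement data or the analysis used for the plot.

```python
import numpy as np

def max_fr_deviation_db(freqs_hz, start_db, end_db, f_lo=20.0, f_hi=20000.0):
    """Return (largest absolute dB difference, frequency where it occurs)."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    diff = np.asarray(end_db, dtype=float) - np.asarray(start_db, dtype=float)
    band = (freqs_hz >= f_lo) & (freqs_hz <= f_hi)   # restrict to the audio band
    abs_diff = np.abs(diff[band])
    idx = int(np.argmax(abs_diff))
    return float(abs_diff[idx]), float(freqs_hz[band][idx])

# Synthetic stand-in curves (NOT real Q701 data): identical except one point.
freqs = np.logspace(np.log10(20.0), np.log10(20000.0), 200)
start = -3.0 * np.log10(freqs / 1000.0) ** 2
end = start.copy()
end[100] += 0.3          # pretend one frequency bin moved by 0.3 dB

dev_db, at_hz = max_fr_deviation_db(freqs, start, end)
print(f"largest change: {dev_db:.2f} dB at {at_hz:.0f} Hz")
```

With real data, the two inputs would simply be the first and last rows of the spreadsheet for one channel.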

THD+Noise vs. Frequency
Having measured THD+noise vs. frequency for this test, I plotted the data over time.


This test is very sensitive to sound in the measurement environment. If a large truck happens to be driving down the street outside my home during the test, it could very easily cause a blip in the measurement. We can see some random blips here and there, but no observable change that trends over time. Bottom line: I see nothing here to indicate break-in.
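
For anyone who wants to check for a trend numerically rather than by eye, here is a hedged sketch of one approach that tolerates truck-sized blips: take the median of all pairwise slopes (the Theil-Sen estimator), which a handful of outliers barely moves. The THD+noise values below are invented for illustration, and this is not the method used to make the plot above.

```python
import numpy as np

def theil_sen_slope(hours, values):
    """Median of all pairwise slopes; robust to a few outlier 'blips'."""
    hours = np.asarray(hours, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(hours), k=1)      # every pair with i < j
    slopes = (values[j] - values[i]) / (hours[j] - hours[i])
    return float(np.median(slopes))

# Synthetic example: flat 0.5% THD+N with small noise and three large blips.
rng = np.random.default_rng(0)
hours = np.arange(330.0)
thd_flat = 0.5 + 0.01 * rng.standard_normal(hours.size)
thd_flat[[40, 170, 290]] += 1.0                  # simulated passing trucks

# Same data with a deliberate drift of 0.001 percentage points per hour.
thd_drift = thd_flat + 0.001 * hours

slope_flat = theil_sen_slope(hours, thd_flat)
slope_drift = theil_sen_slope(hours, thd_drift)
print(f"flat: {slope_flat:.6f} %/h, drifting: {slope_drift:.6f} %/h")
```

A slope near zero on the flat series and near 0.001 on the drifting one is what "no observable change that trends over time" would look like as a number.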

I guess we're going to have to look elsewhere. Fortunately, Arnaud also made movies of cumulative spectral decay. Let's have a look ...


flatmap's picture

Here's a thought. Consider listening to music while a mosquito buzzes nearby. The listening experience is much less enjoyable even though the amplitude of the mosquito's sound is small. Ridding yourself of the mosquito may produce a large subjective change in listening pleasure.

This suggests that a small measurable change can still be significant subjectively.

inarc's picture

Thank you and Arnaud for this excellent work!

My personal take on break-in is that it does exist in some physical form or another ("everything flows") but that it is too subtle to be subjectively significant, if even audible.
The kind of break-in that is mostly talked about happens on the receiver's end.

Tyll Hertsens's picture
... the kind of break-in that is mostly talked about is mythic keyboard enthusigasm.
MacedonianHero's picture

Fantastic write up and investigation! Thanks Tyll! :)

KikassAssassin's picture

It would be interesting to see the same test done on a headphone that isn't considered to benefit much from break-in, and see how it compares to this. One thought I have about this is that if the differences people hear with break-in mostly happen in their heads, and the K701 is widely considered to change significantly with break-in when the measurements show otherwise, I wonder if this might mean that there's some aspect of the K701's sound that just isn't very pleasant, and it takes more time than usual for people's brains to adjust to it.

arnaud's picture

Hi Tyll,

Many thanks for this interesting article! I am sorry to hear you will not repeat the test with another headphone as I find it fascinating!

On the sudden jumps:
> All the headphone responses are normalized by the maximum amplitude for the equalized response (so around 150Hz it seems for this headphone) so, apart from a sudden change in the frequency content of the excitation, the jumps in the response shouldn't be related to the MLS sequence
> On the other hand, since the FRF and CSD graphs are extracted from single impulse response data (not your typical average of 5 or so positions), they are sensitive to the headphone position
> I can't recall, but didn't you say you had to reseat the headphones one time or something (for instance the jump between 104 and 105 hours would seem like an effect of reseating).


Tyll Hertsens's picture
Well ... I said I'd likely not be doing more. I'm a geek ... ya never know what might get my attention. But I do have ideas for other things that need looking at.

"I can't recall, but didn't you say you had to reseat the headphones one time or something (for instance the jump between 104 and 105 hours would seem like an effect of reseating)."

The headphones weren't reseated between the series after 100+ hours; I just restarted the test, the door was never opened. I was careful to leave the door closed and not even enter the room unless I had to.

arnaud's picture

Thanks for the clarification on the experiment Tyll. The sudden jumps are quite puzzling and I still can't imagine they're related to the excitation signal. One thing that came back to my mind is that the input signal is (fortunately) taken out of the equation when measuring the impulse response by MLS. The impulse response is obtained from the cross-correlation between the input signal into the amp or headphone and the signal from the microphone.
So, as long as some noise is injected over the audio range, you'll get the same frequency response function (assuming the system measured doesn't change) even if the input signal varies a bit in level and other frequency characteristics.
The puzzling part is that the jumps don't occur at fixed intervals (every 50 hours +/- 15 hours) but they all do the same thing (increase level by up to 2dB, the higher the frequency, the more visible it is). Another puzzling bit is it would appear all the responses after the jump are similar, so it's like resetting the response to about what it was at the last jump...
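
The level-independence argument above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the measurement software: it uses Gaussian noise rather than a true maximum length sequence, circular convolution rather than a real acoustic path, and a made-up four-tap impulse response. The function names are hypothetical.

```python
import numpy as np

def estimate_frf(x, y):
    """Cross-spectrum of input and output divided by the input auto-spectrum."""
    X = np.fft.rfft(x)
    Y = np.fft.rfft(y)
    return (Y * np.conj(X)) / (X * np.conj(X)).real

def simulate_system(x, h):
    """Circular convolution of excitation x with impulse response h."""
    n = len(x)
    return np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(h, n=n), n=n)

rng = np.random.default_rng(1)
n = 4096
h = np.zeros(n)
h[:4] = [1.0, 0.5, -0.25, 0.1]           # made-up system, not a headphone

x_quiet = rng.standard_normal(n)
x_loud = 2.0 * rng.standard_normal(n)    # different noise, about 6 dB louder

H_quiet = estimate_frf(x_quiet, simulate_system(x_quiet, h))
H_loud = estimate_frf(x_loud, simulate_system(x_loud, h))

# Difference between the two estimated magnitude responses, in dB.
max_diff_db = float(np.max(np.abs(
    20 * np.log10(np.abs(H_quiet)) - 20 * np.log10(np.abs(H_loud)))))
h_recovered = np.fft.irfft(H_loud, n=n)[:4]
print(f"max difference between estimates: {max_diff_db:.2e} dB")
```

In this idealized, noise-free setup the two estimates agree to within rounding error even though the excitations differ in level and content, which is the reason a level-setting error alone would not be expected to shift the response.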

The one thing that makes sense is that the variations are more pronounced at high frequencies since it's where the response is most sensitive to slight perturbations in the system.

You know your test rig better than anyone, but I wonder if you could get more representative results by performing standard 5-position average tests every 50 hours of burn-in or so and comparing them against another pair of the same headphone, which would also be measured every 50 hours but not used in between tests.
With such a test, you would not only address the question of the measurability of burn-in effects but also show whether any variation observed can be attributed just as much to natural uncertainty in the measurements (temperature, spurious noise and vibration, ....) as to burn-in. BTW, you also wouldn't have to monopolize the test rig for the whole duration of the experiment in that scenario and you could make sure you control some of the environment variables (temperature, exterior noise/vibration...).
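
A rough sketch of how the comparison proposed above might be scored, assuming each pair yields one averaged dB response per checkpoint. Everything here is hypothetical: the function name, the threshold of three standard deviations, and the numbers, which are invented purely to show the bookkeeping.

```python
import numpy as np

def net_change_vs_noise(burned_db, control_db, k=3.0):
    """Compare drift of the burned-in pair against the unused control pair.

    Both inputs are shaped (checkpoints, frequencies). Returns the net change
    (burned drift minus control drift), the control pair's checkpoint-to-
    checkpoint spread, and a flag where the net change exceeds k * spread.
    """
    burned_db = np.asarray(burned_db, dtype=float)
    control_db = np.asarray(control_db, dtype=float)
    burned_drift = burned_db[-1] - burned_db[0]
    control_drift = control_db[-1] - control_db[0]
    net_change = burned_drift - control_drift
    noise = control_db.std(axis=0, ddof=1)       # measurement repeatability
    return net_change, noise, np.abs(net_change) > k * noise

# Invented data: 4 checkpoints x 3 frequency bins, in dB.
control = np.array([[0.0, 0.0, 0.0],
                    [0.1, -0.1, 0.1],
                    [-0.1, 0.1, -0.1],
                    [0.0, 0.0, 0.0]])
burned = control.copy()
burned[:, 2] += np.array([0.0, 0.3, 0.7, 1.0])   # only the third bin drifts

net_change, noise, flagged = net_change_vs_noise(burned, control)
print(net_change, noise.round(3), flagged)
```

Only the third bin is flagged, because its 1 dB change is large compared with the roughly 0.08 dB spread of the control pair.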

Considering some people often try to dismiss measurements when a headphone is objectively criticised or large variations are observed between several pairs of the same model, addressing the level of uncertainty in the measurement would be extremely useful to do (not just for the fun of it but simply to give more authority to the results presented on this site).

End of long rumbling post ;)

jeckyll's picture

This is why I come to this site. I've personally experienced new headphones sounding very closed in and 'disappointing', only to find that after a bit of time (not hundreds of hours) I enjoyed them much more.

It's cool to see an attempt to find at least a bit of scientific basis for this type of phenomenon :)

Jazz Casual's picture

Wait 'til the mythbusters over in the Head-Fi sound science forum see this! A link to this article should be posted in every forum thread where someone inanely posts "it needs at least (insert number here:) ... hours of burn-in before it really begins to open up."

I suppose I fall into the burn-in sceptic camp. I've never noticed any discernible change in the sound of my headphones over time, and that's not through want of trying. The results from Tyll's testing give me no reason to alter my view at this stage.

dalethorn's picture

Generally I'm agnostic on break-in, but I would like stronger assurance that these high-quality headphones were not played on a test bench before shipping. I have read many articles stating that many electronics manufacturers play their products for a while as part of their automated QA. It costs them little, since it's automated.

My second thought, based on much reading of different products and websites, is that Grado is widely regarded as a significant break-in candidate, especially on the lower models, and even by Grado themselves. I don't recall any other manufacturers claiming that break-in is needed.

Alfred's picture

Thanks for the great article Tyll!

I've found that just playing noise for breaking/burning in (whatevas) didn't work well for me. I use a variety of songs (noise would be used too, but only as a portion of it) and plug the cans into a powerful amp for breaking/burning in.

And yes even the lower cans need 100 hours at least.

IMHO I also feel that the higher cans need like a thousand hours to really open up.

Jeff Graw's picture

It's obvious that all of these changes are slight, and each one by itself is probably more or less not perceptible.

Perhaps the whole is greater than the sum of the parts though?

thune's picture

Nice article with sensible advice.

One point: For loudspeaker woofers, break-in is known to lower the resonance frequency and slightly change the woofer TS parameters. (With the woofer spec being for the broken-in condition.) A simple impedance sweep before and after break-in shows this change.

One potential experiment: take a dynamic headphone (or several) with a low frequency resonance characteristic (not the Q701 or planars), and show an impedance sweep before and after a burn-in period. This experiment might provide insight into claims of "better bass" after burn-in.
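
As a hedged illustration of what the proposed experiment would compare, the sketch below locates the low-frequency impedance peak in two sweeps and reports the shift. The impedance curve is a toy resonance shape with invented values (60 ohm baseline, 40 ohm peak), and the function names are hypothetical; real sweeps would come from a measurement.

```python
import numpy as np

def resonance_hz(freqs_hz, impedance_ohm, f_max=500.0):
    """Frequency of the impedance magnitude peak below f_max."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    impedance_ohm = np.asarray(impedance_ohm, dtype=float)
    band = freqs_hz <= f_max
    return float(freqs_hz[band][np.argmax(impedance_ohm[band])])

def toy_impedance(freqs_hz, fs_hz, r_dc=60.0, r_peak=40.0, q=2.0):
    """Toy resonance bump on a flat baseline (illustrative, not a driver model)."""
    detune = freqs_hz / fs_hz - fs_hz / freqs_hz
    return r_dc + r_peak / np.sqrt(1.0 + (q * detune) ** 2)

freqs = np.arange(10.0, 1000.0, 1.0)
before = toy_impedance(freqs, fs_hz=100.0)   # invented "new" resonance
after = toy_impedance(freqs, fs_hz=95.0)     # invented "broken-in" resonance

shift_hz = resonance_hz(freqs, after) - resonance_hz(freqs, before)
print(f"resonance moved by {shift_hz:+.1f} Hz")
```

A negative shift would correspond to the lowered resonance frequency described above for loudspeaker woofers.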

Tyll Hertsens's picture
Thanks for the tip, I'll keep it in mind.
Milton's picture

I found your results very interesting! Will your findings affect how you review headphones in the future? In other words, have you concluded that it would be a waste of time to break in headphones before reviewing them? Or will you take the "chicken soup" point-of-view (i.e. Maybe it doesn't help, but it couldn't hurt.) Thanks again for posting these results!


Tyll Hertsens's picture
I've got a burn-in station where I can run in headphones 24/7. I always give stuff a quick listen when I get it to get a first read on whether I like it or not, then I drop it in burn-in til I want to listen to it again or other cans come in and push it off the burn-in station. I'll usually make sure a headphone has a hundred hours on it or so before I seriously listen to it. I want to give the cans their best shot.
CarlSeibert's picture

>The miracle is in your head ... not in the headphones.

Yup. That's the biggest part of the system. I would say, you're not imagining the effects that your mind and body introduce to the system, but actually, the end product of a music system IS imagination, so, well, I won't. ;-)

This Wall Street Journal story on placebo effect is interesting. (Executive summary: In cases where your brain is part of the equation, "placebos" can be so effective I'd hesitate to call them that. It's not just that you THINK you can see or hear better, you actually can.)


It's interesting that break-in may center on IM effects. It seems like a few things that we are having trouble measuring now come home to IM. When we can measure IM effects better, lots of stuff might come into better focus. Remember when all "competently designed amplifiers sounded the same" because they measured the same? Well, they never did sound the same. They don't measure the same anymore, either.

Kudos to Tyll for being the voice (and ears) of reason.

The Monkey's picture
One of the things that fascinates me about the burn-in "debate" is the defensiveness with which the placebo effect is dismissed by the true believers. I simply do not understand the "YOU DON'T KNOW WHAT I HEAR BUT I DO AND I HEAR IT LALALALA!" Regardless of one's position on burn-in, I would like to believe that people are willing to bring more intellectual curiosity to the table. I know that's a tall order. I suspect that the "I BELIEVE IT CUZ ITS REALZ CUZ I HEAR IT" crowd's fear of the placebo effect is a combination of, among other things: (a) deep insecurity; (b) a fear of the inner-workings of their own minds (i.e., a pre-disposition against any notion of the subconscious); (c) a lack of critical thinking; and (d) a fundamental lack of understanding about the difference between a fact and an opinion. I suspect that Tyll's groundbreaking work will do little to even open the minds of the true believers, and that's unfortunate because they seem to be taking over certain sites dedicated to this hobby. I suppose, in some ways, that's indicative of a broader national "discourse," but that's a subject for another day.
Tyll Hertsens's picture
... and rightly so, of course.

And, my dear friend, that's exactly why I wrote three articles and spent so much time on the subject: to try to put this topic into some sort of rational context --- pour a little cool, clear water on the heads-a-fire crowd.

The Monkey's picture
I think you've accomplished what you set out to do and then some. Case in point, the intermodulation distortion data should not be ignored, even by a cynical old monkey like me. There are too many people whose ears I respect in this hobby who swear that break-in exists for me to just dismiss it, though I have remained skeptical. In some ways, your data provide fodder for both sides, but your data coupled with your sensible conclusions more helpfully provide a path forward for perhaps actually resolving this debate. Well done!
Jazz Casual's picture

The headphone listening experience is inherently subjective. It is unlikely that "burn-in" believers will renounce their faith in the face of headphone measurements that are ultimately inconclusive. The "debate" should continue ad infinitum regardless of the data, as battles between belief and "science" do. What never ceases to astound me is our capacity (I'm referring to blokes here) to obsess and vehemently argue over the minutiae of life. It's almost endearing. ;-)

Tyll Hertsens's picture
Thanks Carl. You're absolutely right, we're running around with a personal experiencing device that won't be matched in this century tucked neatly in our noggin. It's capable of doing all sorts of cool stuff ... unmeasurable stuff, but cool stuff nonetheless.

People think too much. I think we should more often just relax and experience things.

dalethorn's picture

Are we all going to assume what we assume on the basis of the AKG only? And without knowing if they had break-in at the manufacturer? Seems like someone would have that knowledge or at least not make assumptions based on not knowing.

Tyll Hertsens's picture
Or are we thinking too much about this?
dalethorn's picture

I don't know if anyone is thinking *too much* about break-in, but they sure as heck are thinking a *lot* about break-in.

zobel's picture

Thanks Tyll for your work at trying to make quantifiable measurements that tell us something real about headphone performance. The frequency response graphs look really wacky on the top end, and are probably not how we perceive those frequencies, but I am glad you stuck to a single response correction in all your tests so we can make some sort of general comparison between headphones. As you know, Sennheiser makes headphones for audiologists to use to measure people's hearing, and those must be very flat in frequency response. Also, I'm sure that you know that all the best headphones have a fairly flat and smooth frequency response, as we perceive them. It would be cool if we could measure and chart what the cans actually are doing, or can we? I think your measurement system just needs recalibration on the high end to reflect what we actually hear. Would FFT be more accurate?
I've never noticed much difference in a "broken in" set of cans over new ones. I have noticed much more difference between different runs of the same model, built in different locations.
I love hearing your reviews of all the cans, and from the ones I've heard, I always agree with your ears. Thanks.

halcyon's picture

Very good work, as always. Such a pleasure to read your measurements and comments.

Two things of note here, obvious to Tyll, but not perhaps to everybody else.


1) Human hearing is much more sensitive to intermodulation distortion, even down to 0.5 dB differences (or below, not enough data exists on this). This is highly dependent on the frequency distribution of the IMD artifacts. Lower Q resonances, as they cover more of the critical bands, are more audible, even at lower absolute amplitude than a higher Q artifact.

Also, the frequency of the IMD artifact is of high importance and at typical playback volumes, the upper range presence region IMD artifacts are most audible due to diminished masking effects from increased hearing sensitivity at that region.

2) The biggest break-in is in human hearing, with its constantly adaptive and compressive/expansive capacity. It is neither a fixed nor a linear instrument. Any sound you feed it long enough will become the new baseline against which other things are *relatively* measured.

This psychological break-in has at least two phases. The first is an almost instantaneous effect that starts a few seconds after the echoic memory fades and gets reinforced after instant repeated simultaneous plays. This is the reason why a lot of the long-duration A/B blind switching trials succeed during the first 3-7 listens and then decline rapidly towards pure random variation. It is unclear - to me at least - whether this happens at the cochlear level or at the cortical level - or perhaps even somewhere in between.

The second break-in is (probably) more a function of cortical excitation changes, where the action potentials from continued exposure to certain repeated signals get lowered and the initial higher level of excitation is diminished. In layman's terms: you may have heard of the cat with electrodes implanted in her auditory center to produce a constant phantom "beeeep" sound. The cat goes berserk for a while until the brain adjusts and baselines this signal as normal. Then as the electrodes are removed along with the baselined signal, the cat goes berserk again for a while, until the cortical response is baselined.

A similar thing happens in a human brain. It cannot be unlearned, it cannot be avoided, it happens to all of us, regardless of how trained our ears are or how much experience or book knowledge we have. That's why all gear tends to sound different after hundreds of hours of repeated acoustic exposure to it. For better or worse.

In summary: does audible break-in exist?

Of course it does! It would be silly to deny human hearing, along with its adaptive properties (see point 2), as the fundamental measurement instrument of this test. Human hearing does adapt, thus so does the audible signal, thus there is break-in. With everything you listen to repeatedly: a Stradivarius, that new headphone amplifier, or a jack-hammer.

However, whether there are absolutely measurable non-psychoadaptive signal changes attributable to 'break-in effects' of headphones, AND whether these changes are actually psychoacoustically perceptible, AND to whom of us (this skill is partially trainable), well, let me say that the jury is still out on that one for me.

Further measurements on IMD resonance at low Q-levels in the upper presence-region which is the most sensitive part in equal loudness contours might turn out something interesting.

So, until the next time!

Mkubota1's picture

Thank you so much for doing this. It will serve as a good point of reference when discussing the topic. As the saying goes, "Extraordinary claims require extraordinary proof." Until an equally rigorous demonstration is done showing the contrary, in my mind there is little need to struggle with this any further. Thanks again (Arnaud!).

borispmchan's picture

Hi Tyll,

I'm slightly surprised by your findings because I think I've witnessed break-in on my Ultimate Ears Super.fi 3.

I bought it around 2 or 3 years ago, and when I took it right out of the box for a listen, it was pretty unbearable. The bass wasn't quite there and they sounded quite sterile indeed. I wasn't happy with them and just left them plugged into my computer for breaking in. Around a week later, I gave them another shot and they sounded pretty nice indeed. Far from sterile, with some real punch and deep bass. I'm not sure about the K701; actually I tend to believe they've been tested and somehow broken in before being shipped. I may be wrong though. Anyway, any measurement of break-in effects on balanced armature IEMs would be nice.


Grizzled Geezer's picture

...when I reviewed headphones for Stereophile, I noticed that their sound changed as they "broke in". So I ran all 'phones overnight with noise, at a fairly high level.

Conventional dynamic headphones showed only a slight change. But planar headphones -- both electrostatic and "orthodynamic" -- showed much more noticeable changes. As interesting as these tests are, it would be even more interesting to see how a pair of STAX 'phones change. (Does anyone else make 'stats? Sennheiser?)

There's another headphone issue that needs research. It seems that the brain "adjusts" to headphone colorations -- reducing their subjective effects -- in a way that does not occur with speakers. (That's been my experience, anyway.)

jimmyjames's picture

Audio folks have often said about component break in that it's not the component that's breaking in, it's our ears. Hmmm?

Also, from KikassAssassin: "I wonder if this might mean that there's some aspect of the K701's sound that just isn't very pleasant, and it takes more time than usual for people's brains to adjust to it." --

Couldn't have said it better myself except that my brain never adjusted to the 701's sound. Argh!

Dick Emery's picture

I have read your comparisons with keen interest, and what caught my eye was how you mentioned that the characteristics changed throughout a 24-hour period and could possibly be attributed to temperature changes throughout the day. That got me thinking. For testing purposes the cans are put on a dummy head. Something that is not alive and does not emit any real heat of its own.

Humans are made of flesh and bone and have heat variances from one moment to the next. When you place a set of headphones onto your head, your own body heat permeates into the body of the cans and ultimately the drivers and electrical components.

My theory based on your measurements is that body heat can have a dramatic effect on the overall response. The temperature variances on the artificial head were probably slight. Whereas on a human head I would imagine those self same variances would be rather more extreme.

What do you think?

Tyll Hertsens's picture
But they'll certainly change a bit as they warm up next to your head. It's all pretty small potatoes, I think.
aphocus's picture

I read, I think on NwAvGuy's blog, about very high predictability in people preferring audio that is as little as 0.1dB louder, with no other alterations. Mind you, this affects the entire sound, not something 60dB below peak. At 115 hours there's a sudden drop of almost 1dB; this is probably the point at which the change would be most audible.
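
For scale, the two level differences mentioned above can be converted from decibels to sound pressure ratios with the standard relation ratio = 10^(dB/20). The short Python sketch below is only a worked conversion; the function name is made up for the example.

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a sound pressure (amplitude) ratio."""
    return 10.0 ** (db / 20.0)

ratio_small = db_to_amplitude_ratio(0.1)   # the 0.1 dB preference figure
ratio_jump = db_to_amplitude_ratio(1.0)    # the roughly 1 dB jump

print(f"0.1 dB -> x{ratio_small:.4f}, 1 dB -> x{ratio_jump:.4f}")
```

So 0.1 dB is roughly a 1.2% change in sound pressure, while 1 dB is roughly 12%.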

The other idea would be that the IMD is audible but very subtle, maybe to the point of almost inaudible, and when it reaches a certain critical point, it becomes inaudible, and the human auditory processing centers no longer have to filter it out, requiring less effort on the brain's part to extract meaningful information. People with, say, an auditory processing disorder could possibly sense that as a "night and day" change.