True silence is usually not an issue, but there can be other reasons to record the silent room, like capturing impulse response data, correcting DC offset, or getting a noise profile for noise reduction.
In other words: it's mostly used as a reference, rather than for the reason given in the post.
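For what it's worth, the DC offset and noise floor part is easy to sketch. This is only a toy illustration, assuming the silent-room recording is already loaded as a list of float samples; the function names and the sample values are made up for the example, and real noise reduction tools typically build a per-frequency noise profile rather than a single number.

```python
import math

def dc_offset(samples):
    # DC offset is the average sample value; ideally it is zero.
    return sum(samples) / len(samples)

def remove_dc(samples, offset):
    # Re-center the waveform around zero.
    return [s - offset for s in samples]

def rms(samples):
    # Root-mean-square level, used here as a crude noise-floor estimate.
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Hypothetical "silent room" samples with a small constant bias.
room_tone = [0.02, 0.04, 0.02, 0.04]

offset = dc_offset(room_tone)            # about 0.03
centered = remove_dc(room_tone, offset)  # now averages to roughly zero
noise_floor = rms(centered)              # about 0.01
```

The same offset could then be subtracted from the takes recorded with that setup, which is one reason a reference recording of the "silent" room is handy.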
Their metaphor still works, though. The length of the wild sound tells me the OP might be talking about an older process, from before digital noise reduction was as common as it is now; these days less than half a minute is enough. The idea that a "silent" room has a recognizable, unique sound, or even that a recording setup has a unique sound such as internal mic noise, is still valid for the metaphor: something that is perceptible to humans but hard to describe fully, because the many variations that exist are generally so subtle.
It's like water and sound: humans can tell hot liquid from cold just by the sound of it being poured or moved. Actually explaining the difference in words takes more complicated language, but you basically know the difference when you hear it.
Since actual silence is very rare (Edit: on Earth, before one considers the vacuum of space) and requires tech to purposefully create, one can assume they just mean "a room where no one is talking", which weirdly is itself a more antiquated definition of silence.
'Silence' is a highly context-dependent word, with many social, physical, and metaphorical uses, each of which shifts depending on your intent.
Three versions of the word are running through the recordist's mind at this point: silence as in holding your tongue and keeping still, silence as an artifact captured as 'room tone', and silence as the absence of unwanted electromagnetic signals in the toolset.
If you want to be fussy about usage of the word, you really have to pin down the intent of both speaker and audience.
To be fair, a simple word like 'set' is similar in complexity of usage. 'Silence', however, carries a lot of baggage wherever it is used.
Such is the case with most things in language. We really are translating thought through an imperfect medium to reassemble it on the other side in someone else's brain. Linguistics being what it is, that imperfection leads most of the time to pedantry. The idea of "silence" as an absence of sound translates very differently when you start looking at sound with the technological equivalent of a magnifying glass versus just the naked ear.
Well, sure, most lexicon discussions can turn into hair-splitting, but I would like to make a special case for the word 'silence' as a term with more than the average amount of emotional weight and semantic specificity.
Its use is often quite subjective while myopically considered obvious by the user, because it is intrinsically confusing. It can be very politically loaded when referring to reticence or censorship. In physics it's a problem of absolutes, and in psychology a phenomenological conundrum.
Then there's Zen, Sufism, and mystic shit like that. Rabbit holes, and silence all the way down.
To the naked ear, silence is always relative to a previous soundscape, since even at the quietest moments you will still hear your heartbeat, breath, digestion. It's neurophysiology and psychology and philosophy and more when talking about silence to the naked ear, all using different definitions of the word.
Yes, it's fascinating for sure, but a "scientifically silent room" is a very different phenomenon from someone watching a show in their living room, where there is already noise and reverberation, on a TV that isn't producing fake silence.
You can already turn the volume down if you want to experience the non-phenomenon of a TV not producing noise. It will not upset you, I promise.
People get disoriented in silent rooms because of the lack of reverb response and/or the lack of any sensory input at all.
Ambient noise is added to an audio signal for other reasons. With proper mixing and envelope control you will not experience cutting in and out of silence. However, audio production is a lot more complicated than just cutting audio together. It makes sense to create or simulate an entire "environment" to throw audio at, sort of like priming a canvas before painting.
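To make the "priming a canvas" idea concrete, here is a rough sketch of laying a clip over a continuous room-tone bed with short fades. It assumes audio as plain lists of float samples and uses simple linear fades; the function names and numbers are invented for the illustration, and actual DAWs offer much more elaborate envelope shapes.

```python
def apply_fades(clip, fade_len):
    # Linear fade-in and fade-out so the clip has no hard edges.
    n = len(clip)
    fade_len = min(fade_len, n // 2)
    out = list(clip)
    for i in range(fade_len):
        gain = i / fade_len
        out[i] *= gain          # fade in from the start
        out[n - 1 - i] *= gain  # fade out toward the end
    return out

def mix_over_bed(bed, clip, start, fade_len):
    # The bed (room tone) plays throughout; the faded clip is added on top.
    out = list(bed)
    for i, sample in enumerate(apply_fades(clip, fade_len)):
        if 0 <= start + i < len(out):
            out[start + i] += sample
    return out

# Hypothetical values: a constant low-level bed and a short, loud clip.
bed = [0.01] * 10
clip = [1.0] * 6
mixed = mix_over_bed(bed, clip, start=2, fade_len=2)
```

Because the bed never stops, the gaps around the clip sit at the room-tone level instead of dropping to digital zero, which is the "cutting in and out of silence" effect the mix is meant to avoid.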