For about half the call, the person on the other side is saying nothing, so there is no reason for the program to listen to the silence on their end and then transmit that non-information all the way to you. So the people who invented VoIP came up with something called "voice activity detection." When the other person isn't talking, the program detects that and stops transmitting, saving a ton of bandwidth. Then in comes the comfort noise, which convinces you the conversation's still happening and saves you from a call that audibly cuts in and out.
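To make the detect-then-stop-transmitting trick concrete, here is a toy sketch in Python. It is not how Skype or any real codec does it: it assumes the crudest possible detector (is the frame loud enough?), and every name and number in it (`FRAME_SIZE`, `ENERGY_THRESHOLD`, `COMFORT_LEVEL`, `sender`, `receiver`) is made up for illustration. Real systems use much smarter detection and shape the noise to match the caller's actual background.

```python
import math
import random

# All constants below are illustrative assumptions, not values from any real codec.
FRAME_SIZE = 160          # e.g. 20 ms of audio at 8 kHz
ENERGY_THRESHOLD = 0.02   # RMS level below which a frame counts as silence
COMFORT_LEVEL = 0.005     # peak amplitude of the fake background hiss


def rms(frame):
    """Root-mean-square loudness of one frame of samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def sender(frames, threshold=ENERGY_THRESHOLD):
    """Voice activity detection: pass a frame along only if it is loud enough.

    Quiet frames become None, standing in for "nothing was transmitted".
    """
    for frame in frames:
        yield frame if rms(frame) >= threshold else None


def receiver(packets, rng, level=COMFORT_LEVEL, frame_size=FRAME_SIZE):
    """Play real frames unchanged and fill the gaps with quiet random noise."""
    for packet in packets:
        if packet is None:
            yield [rng.uniform(-level, level) for _ in range(frame_size)]
        else:
            yield packet


# A fake call: two frames of "speech" (a 440 Hz tone) around two near-silent frames.
speech = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(FRAME_SIZE)]
quiet = [0.001] * FRAME_SIZE
call = [speech, quiet, quiet, speech]

packets = list(sender(call))
sent = sum(p is not None for p in packets)
heard = list(receiver(packets, random.Random(0)))
print(f"sent {sent} of {len(call)} frames; listener heard {len(heard)}")
```

The sender skips the quiet frames entirely, yet the listener still gets an unbroken stream of audio, because the receiving end invents the hiss locally instead of having it shipped across the network.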
When Skype was first being put together, engineers tinkered with a bunch of different levels of comfort noise to figure out which worked best. And while bandwidth for carrying audio may not be quite so scarce today as it was a decade or two ago, cutting bandwidth in half has got to offer some savings when multiplied across millions of users, so new versions of Skype still use the system. For now, programs don't also fill silences with simulated breathing, the crinkling of chip bags, or the occasional unscheduled toilet flush, but we trust they'll add all those in time.