What is the most difficult problem that you have fixed in Linux?

  • I manage a machine that runs both media transcodes and some video game servers.

    The video game servers have to run in real-time, or very close to it. Otherwise players using them suffer noticeable lag.

    Achieving this while an ffmpeg process was running seemed completely impossible, no matter what I did to limit ffmpeg's use of CPU time. Even at the lowest priority, it impacted the game server processes running at top priority. Even limited to one thread, it still affected things.

    I couldn't understand the problem. There was enough CPU time to go around to do both things, and the transcode wasn't even time sensitive, while the game server was, so why couldn't the Linux kernel just figure it out and schedule things in a way that made sense?

    So, for the first time I read up on how computers actually handle processes, multi-tasking and CPU scheduling.

    ffmpeg is an application that uses ALL available CPU time until a task is done. A CPU core can only do one thing at a time; it just switches between tasks really fast, and each context switch takes time too. So I came to the conclusion that, with the system operating at zero processing headroom, those switches were making it fall behind on the video game processes. The scheduler wasn't smart enough to keep a real-time process on schedule in the face of ffmpeg, which would occupy ALL available cycles.

    I learned the solution was core pinning: manually setting processes to run only on certain cores of the CPU. I set ffmpeg to use only one core, since it doesn't matter how fast it completes. And I set the game processes to use all but that one core, so they don't accidentally end up queueing for CPU time on a core that doesn't have the headroom to run the task within a reasonable time.

    This has completely solved the problem, as the game processes and ffmpeg no longer wait for CPU cycles in the same queue.
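
For readers who want to try the same split, here is a minimal sketch of core pinning from Python on Linux. The comment does not say which tool was actually used, so this is only one possible approach. The helper names `split_cpus` and `pin_process`, and the choice to reserve the highest-numbered core for the transcode, are assumptions made for illustration. `os.sched_setaffinity` and `os.sched_getaffinity` are standard-library calls available on Linux; the affinity is applied per thread, which is why the sketch walks `/proc/<pid>/task`.

```python
import os


def split_cpus(available):
    """Split CPUs into (transcode_cpus, game_cpus).

    Assumption for illustration: the highest-numbered CPU is reserved
    for the transcode and every other CPU goes to the game servers.
    """
    cpus = sorted(available)
    if len(cpus) < 2:
        raise ValueError("need at least two CPUs to separate the workloads")
    return {cpus[-1]}, set(cpus[:-1])


def pin_process(pid, cpus):
    """Restrict every existing thread of `pid` to the given CPUs (Linux only).

    Affinity is a per-thread setting, so each thread ID listed under
    /proc/<pid>/task is pinned individually. Threads created later
    inherit the affinity of the thread that creates them.
    """
    for tid in os.listdir(f"/proc/{pid}/task"):
        os.sched_setaffinity(int(tid), cpus)


# Hypothetical usage; the PIDs are placeholders, not from the original post:
#   transcode_cpus, game_cpus = split_cpus(os.sched_getaffinity(0))
#   pin_process(ffmpeg_pid, transcode_cpus)
#   pin_process(game_server_pid, game_cpus)
```

Pinning another user's process typically requires root or the `CAP_SYS_NICE` capability, and the pinning does not survive a restart of the process, so it has to be reapplied (or set at launch) each time.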
