On 14 Jan 2003, Paul Millar <paulm@astro.gla.ac.uk> wrote:
I've just tried distcc and it's wonderful!
Thanks. :-)
I haven't benchmarked it yet, but subjectively (from watching the nodes' load-avr) distcc seems to give *much* more even loading than just running OpenMOSIX. If anyone has more than one machine at their disposal, I'd strongly recommend investigating distcc.
On Thu, 9 Jan 2003, Martin Pool wrote:
I haven't tried it myself, but I would have expected the short, intense jobs generated by compilation to be a problem [for OpenMOSIX].
Yes, I agree. But, I think OM still has a role to play. Running OM underneath distcc should help improve the mean performance (in a heterogeneous cluster). Whenever a faster node is unloaded whilst a slower node is busy compiling (and this situation lasts for any length of time) OM should migrate that process to the faster node, speeding up compilation. That might occur just before linking, for example.
The argument against it is this: compiler processes have a large working set (>20MB, say, though it varies), do a lot of IO, and only run for a few seconds. On a 100Mbps network, migrating a running process will take a few seconds, by which time many other processes may have started and stopped, so the load pattern may be very different. I think it may be difficult for OpenMOSIX to react fast enough to handle the condition you describe.
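As a rough back-of-envelope check of that estimate, here is a minimal sketch in Python. The function name `migration_seconds` and the `efficiency` parameter are made up for illustration, and it assumes the whole working set has to cross the wire and that 1 MB is 10^6 bytes. It only gives a lower bound on the copy time; it is not a measurement of what OpenMOSIX actually does.

```python
def migration_seconds(working_set_mb: float, link_mbps: float,
                      efficiency: float = 1.0) -> float:
    """Lower bound on the time to copy a process image over the network.

    working_set_mb: size of the process working set in megabytes (assumed).
    link_mbps:      nominal link speed in megabits per second.
    efficiency:     fraction of the nominal speed actually achieved
                    (hypothetical knob; 1.0 means a perfect link).
    """
    megabits = working_set_mb * 8  # 8 bits per byte
    return megabits / (link_mbps * efficiency)


# The figures from the message: a 20MB working set on a 100Mbps network.
print(migration_seconds(20, 100))
```

With an ideal link this comes to 1.6 seconds for the transfer alone, which is already comparable to the run time of a compile job that lasts only a few seconds. Real throughput below the nominal rate, or a larger working set, pushes it higher.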
I have not personally benchmarked OpenMOSIX for this, so you should take the above with a pinch of salt.
I'd very much like to work with somebody with a >8 machine cluster to run distcc benchmarks and comparisons to SSI clusters.
The "grainy" nature of the workload makes it difficult for distcc to schedule optimally, although it should improve somewhat in the next few months.
This paper describes good results using MOSIX for software building:
http://www.mosix.cs.huji.ac.il/ftps/usenix.ps.gz
Though they did spend USD $390,000, which is more than I can manage. :-)
I guess the improvement will depend strongly on the composition of the cluster and the code you're compiling. The improvement (from running OM) might be marginal in certain cases, but I don't think it would make things worse.
Well, it may use network bandwidth and CPU cycles for migration or overhead that might be better spent on either cc or distcc.
It's great that there is good free single-image clustering software. The point of distcc is just that you can distribute the particular task of compilation with a much simpler and less intrusive program.