Ever since Steve Jobs first unveiled the next version of OS X, dubbed “Snow Leopard,” the internet has been abuzz with excitement and wondering about the supposed “evolutionary” qualities of OS X 10.6. One of the most-hyped improvements is the promised revamp of the SMP capabilities of OS X, with a “breakthrough” in SMP performance.
The technology behind the SMP improvements in OS X Snow Leopard is codenamed “Grand Central,” which Apple describes best:
“Grand Central,” a new set of technologies built into Snow Leopard, brings unrivaled support for multicore systems to Mac OS X. More cores, not faster clock speeds, drive performance increases in today’s processors. Grand Central takes full advantage by making all of Mac OS X multicore aware and optimizing it for allocating tasks across multiple cores and processors. Grand Central also makes it much easier for developers to create programs that squeeze every last drop of power from multicore systems.
Our guess is that these SMP “breakthroughs” are going to be delivered in two blows:
- Improvements to the OS X kernel intended to boost multi-threading & multi-tasking performance and distribute load across multiple CPU cores more intelligently.
- An SDK (perhaps as improvements to Xcode) that allows developers to more easily write multi-threaded code, handle forking, and balance load on a per-core basis.
The first feature is what’s exciting – we believe there’s a good chance Apple will be using some form or other of FreeBSD’s ULE scheduler in OS X.
There isn’t much info available on what scheduler(s) OS X is currently using as of 10.5 (the only question we could find on the topic remains unanswered). But OS X has its roots firmly planted in the *nix world, and it’s possible to make some educated guesses on the topic. The XNU Kernel that OS X uses is a mesh of the Mach Kernel and large portions of the FreeBSD project, and OS X uses the Mach kernel’s scheduler – or at least it did back when OS X was first launched.
The FreeBSD project has long been working on an alternative scheduler intended to replace the default and aging 4BSD scheduler: the ULE scheduler. ULE is now scheduled to become the default scheduler in the upcoming FreeBSD 7.1 release. ULE has shown significant improvements in multi-core environments, and was designed from the ground up to provide increased SMP scalability. Most important is ULE’s overhauled support for per-processor queuing of tasks and the ability to set CPU affinity per-processor-per-thread.
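To make “per-processor queuing” and “CPU affinity” a little more concrete, here is a toy Python model of the general idea. It is only a sketch: the `ToyScheduler` class and its `enqueue` and `steal` methods are names we made up for illustration, and the placement rule (pick the least-loaded allowed CPU, let an idle CPU pull work from the busiest one) is a simplification of ours, not ULE’s actual algorithm or code.

```python
from collections import deque


class ToyScheduler:
    """Toy model of per-CPU run queues with optional thread affinity.

    Illustration only: this is NOT FreeBSD's ULE algorithm.
    """

    def __init__(self, ncpus):
        # One run queue per CPU instead of a single global queue.
        self.queues = [deque() for _ in range(ncpus)]

    def enqueue(self, thread, affinity=None):
        """Place a thread on the least-loaded CPU it is allowed to use."""
        # A pinned thread may only go to CPUs in its affinity set;
        # an unpinned thread may go anywhere.
        allowed = sorted(affinity) if affinity else range(len(self.queues))
        # Lowest CPU index wins ties.
        cpu = min(allowed, key=lambda c: len(self.queues[c]))
        self.queues[cpu].append((thread, frozenset(affinity or ())))
        return cpu

    def steal(self, idle_cpu):
        """Let an idle CPU pull one eligible thread from the busiest queue."""
        busiest = max(range(len(self.queues)), key=lambda c: len(self.queues[c]))
        if busiest == idle_cpu:
            return None
        for item in list(self.queues[busiest]):
            thread, affinity = item
            # Respect affinity: pinned threads only move to allowed CPUs.
            if not affinity or idle_cpu in affinity:
                self.queues[busiest].remove(item)
                self.queues[idle_cpu].append(item)
                return thread
        return None


sched = ToyScheduler(ncpus=2)
sched.enqueue("ui-thread")                   # unpinned: goes to the emptiest CPU
sched.enqueue("audio-thread", affinity={1})  # pinned: may only run on CPU 1
```

The point of the per-CPU queues is that each processor mostly works from its own list and only touches another processor’s queue when it runs dry, which is where the SMP scalability is supposed to come from.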
If Apple were to implement a form of the ULE scheduler in OS X 10.6, Snow Leopard would be a formidable OS indeed. Using ULE guarantees huge performance benefits for multi-threaded applications, and would help address the second point listed above: the SMT affinity options provided in ULE would make creating an SDK intended to allow developers to use multiple cores efficiently and evenly quite easy. OS X has always been close to the FreeBSD project, and something like this is a natural fit for an OS looking for improvements to SMP/SMT performance.
Of course any time Apple offers a feature, it has a twist of its own. In this case, it’s OpenCL – a technology Apple says will allow developers to use the GPU as a number-crunching processor right from the usual code without much effort. This lies squarely in ULE’s playing field, since the ULE scheduler was designed with full support for load-balancing and threading across processors of varying performance, clock speeds, and fortes – which isn’t something that other schedulers can do. That would make OpenCL simply a matter of interfacing with the ULE scheduler and adding the GPU to the list of CPU cores available for the ULE thread scheduler to take advantage of.
The bottom line is, the history of OS X and the XNU Kernel, the features promised in Snow Leopard, and the design and architecture of the ULE scheduler all point to a high likelihood of Apple using a redesigned thread scheduler in OS X 10.6 that is either an implementation of the ULE scheduler or at least based around it. And if this is the case, OS X 10.6 will be one heck of a powerhouse.
It’s about time. The threading model for their BSD-on-Mach (XNU) is lousy.
In the NYT blog, Jobs describes it as a “breakthrough.” For their platform, maybe. He makes it sound as though the rest of the industry has ignored this issue, which is ridiculous (but the usual RDF in action around those that don’t follow this field closely). If anything, it’s *Apple* that is late to this party. The truth is that they *have* to revamp OS X to remain competitive as CPU performance is now increased by adding cores instead of MHz.
How are they supporting the multi-core architecture? How are they balancing CPU vs GPU performance issues? What role does a developer play here? I’m sure a lot of developers will have to learn the parallel programming paradigm, or will that be taken care of by the task scheduler, meaning we still write sequential code and the optimization is handled by the OS?
Chandan:
We’re still pretty far away from operating systems and hardware that can convert sequential, single-threaded code into multi-threaded code that can be executed on multiple cores (many universities have projects along those lines, but it’s no easy task), and it’s highly unlikely that’s what Apple has in mind.
Most likely Apple will be adding some sort of easy multi-threading functionality to Xcode akin to what’s available in C# and the .NET Framework right now; that is to say, a line of code to start a new thread, a line of code to synchronize threads, etc. But developers still have to decide where those lines will go.
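As a rough illustration of what “a line of code to start a new thread, a line of code to synchronize threads” looks like in practice, here is a minimal sketch in Python using its standard `threading` module. This is not Apple’s API (nothing has been published yet), and `parallel_sum` is a hypothetical helper we wrote just for this example.

```python
import threading


def parallel_sum(numbers, nworkers=4):
    """Sum a list by splitting the work across worker threads."""
    partials = [0] * nworkers

    def worker(i):
        # Each worker sums its own strided slice and writes to its own slot,
        # so no two threads ever write to the same location.
        partials[i] = sum(numbers[i::nworkers])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(nworkers)]
    for t in threads:
        t.start()  # the "line of code to start a new thread"
    for t in threads:
        t.join()   # the "line of code to synchronize threads"
    return sum(partials)


total = parallel_sum(list(range(1000)))
```

Note that the developer still had to decide how to cut the work up (the strided slices) and where to wait for the results; the library only supplies the start and join calls. Also keep in mind that in standard CPython the global interpreter lock keeps pure-Python code like this from actually running on several cores at once, so treat it as a picture of the code’s shape rather than a speedup.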
The balancing of CPU vs GPU performance is mentioned in the article as one of the more-compelling reasons behind our belief that Apple will be using ULE or something similar – it’s the only scheduler out there (to the best of our knowledge) that can handle multiple cores of different capabilities and speeds on a single computer. If Apple embraces ULE, then they don’t even need to worry about that question – the new thread scheduler would take care of it for them.
You mean that OS task schedulers have to evolve to take care of load balancing, so programmers have to take care of it themselves for now. We know there are runtimes like the .NET CLR that have some features, besides the Microsoft CCR which manages concurrency, but these do not change how we code an application for performance, and how they do it is also not known! I saw an MSDN mag article about Vista’s Task Scheduler 2.0, which is far more powerful than the original, which has been around since Windows 98. Any idea if the Linux task scheduler can scale to the multi-core era with new methods?
>I saw an MSDN mag article about Vista’s Task Scheduler 2.0, which is far more powerful than the original, which has been around since Windows 98.
Windows 98 == DOS plus a nice gui
WinNT, W2k, WinXP, Vista are a completely different line of development
So you see even with Vista the scheduler scales like crap. And so does Mac OS X, because it’s designed for the desktop.
—
Rather nonsense, you cannot exchange a scheduler in a _completely_ different kernel. You have to take care of your OS-facilities and this would be a matter of some years.
More 64 bit for example, the kernel of Mac OS X _is_ 32 bit. So if you’re using more than 4G of memory you have to use something similar to PAE. Last but not least, SCHED_ULE in FreeBSD 7 isn’t the best for UP-systems and you will gain the most performance with many cores only in database and server-area. Remember? Mac OS X _is_ an OS for the desktop.
So maybe this is a nice dream of some Apple-users, but in reality it’s just a lack of knowledge about the operating system of choice.
Oliver, no one is suggesting to cut ‘n paste the code, obviously changes will be made during the porting procedure.
Moreover, the OS X kernel is available in 64-bit flavor:
http://www.apple.com/macosx/technology/64bit.html
The ULE scheduler was designed for FreeBSD and obviously may need some tweaks for it to become a viable replacement for the OS X scheduler currently in use in Leopard & Co., but it’s not as drastic a change as you’re suggesting.
FreeBSD isn’t a server OS – it’s an OS for everyone and everything. It’s the fastest *nix I’ve used to date as a desktop (though I freely admit the experience was subjective as I hadn’t been given the chance to benchmark it) and everyone will testify to its performance on the server-field of course.
Just to clarify here regarding the Task Scheduler in Windows–that’s completely separate from the OS scheduler. The Task Scheduler in Windows is essentially cron.
Also, as for the scalability of the Vista-era NT kernel, I disagree. On the most recent Top500 list, Windows scored the highest efficiency rating on the list for x86 hardware:
http://blogs.technet.com/windowsserver/archive/2008/06/18/windows-hpc-deputs-in-the-top-25-fastest-supercomputers-in-the-world-what-more-do-i-need-to-say.aspx
@bluvg, Oliver:
You guys may want to take a look at this comment on another article of ours for some really interesting info on the NT kernel:
http://neosmart.net/blog/2008/shipping-seven-is-a-fraud/#comment-160264