An absolutely impressive achievement. I echo sentiments seen in other HN threads recently -- "I'd never do in-place FS upgrades!" -- yet Apple pulled it off on billions of devices with zero fanfare. Folks have scores of stories about btrfs/ZFS/etc. issues; I don't recall anything similar for APFS.
The impressive part about the APFS migration was that they performed dry runs on earlier iOS updates beforehand to measure success.
> We actually had this process running for earlier iOS updates … where when you updated to 10.1 or 10.2, we were trial migrating your whole file system … consistency checking it … reporting back to us whether the upgrade was 100 percent clean, then rolling it back.
If you read the story you will hear about a test rollout before the actual install.
I was bitten by this and it was an awful couple of days of head scratching with Apple Support.
My story is that I upgraded my mail server, but the install hung and never completed. I ended up with a hard drive with no files yet completely full. `diskutil` and Disk Utility reported everything as good, but there was a partition that could not be seen. I forget the eventual resolution; probably starting all over again with a completely fresh install.
I never understood what had happened until I saw a talk or read an interview with the APFS creator. During the talk he spoke of how they did a complete re-format to APFS, verified everything, and then reverted to HFS+ for the install.
This is all from memory but I think I have the details mostly correct. The whole process was technically impressive but not without hiccups.
There are a ton of features in ZFS that would have been very nice to have, like streaming backup. That could have enabled a Time Machine 2.0 with enormous efficiency/speed gains.
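For anyone who hasn't seen it, "streaming backup" here means ZFS send/receive: snapshots are serialized as a stream and replayed on another pool, and incrementals ship only the blocks changed between snapshots. A rough sketch (pool, dataset, and host names are all made up):

```shell
# Initial full replication of a dataset to a remote pool
zfs snapshot tank/home@monday
zfs send tank/home@monday | ssh backuphost zfs receive backup/home

# Later: send only the blocks changed since @monday
zfs snapshot tank/home@tuesday
zfs send -i @monday tank/home@tuesday | ssh backuphost zfs receive backup/home
```

Because the incremental stream is computed from block birth times rather than by walking the directory tree, this is the efficiency gain a hypothetical Time Machine 2.0 could have inherited.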
Ehh, when you control the hardware and software, an in-place filesystem upgrade isn’t that big of a deal. Especially when you can store everything in the cloud and pull it back down again if need be.
Meanwhile, I’m expanding my ZFS single drives to mirrors and wondering why I can’t do the same with APFS. Apple should have embraced ZFS (and almost did, just like they almost embraced Ruby instead of Swift), but Apple is absolutely the worst offender when it comes to NIH syndrome.
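The single-drive-to-mirror expansion mentioned above is a one-liner in ZFS; device and pool names below are hypothetical:

```shell
# Attach a second disk to what is currently a single-disk vdev;
# the vdev becomes a mirror and a resilver starts automatically.
zpool attach tank ada0 ada1

# Status now shows "mirror-0" containing both disks
zpool status tank
```

APFS has no equivalent: there is no way to turn an existing APFS volume into a self-healing mirror after the fact.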
To clarify, that would have been Oracle's CDDL license at the time. Not trusting Oracle should be the default position of anyone with any sense in the industry.
> Apple abandoned the ZFS adoption process because of concerns over Sun's CDDL licence, not because of NIH syndrome.[0][1]
The CDDL is/was the license that is part of the public source distribution.
Given that Sun/Oracle owned the code, there was nothing stopping them from relicensing it to Apple under something different with a standalone contract that would have included things like support and escrow. From the zfs-discuss post by Bonwick:
>> Apple can currently just take the ZFS CDDL code and incorporate it (like they did with DTrace), but it may be that they wanted a "private license" from Sun (with appropriate technical support and indemnification), and the two entities couldn't come to mutually agreeable terms.
> I cannot disclose details, but that is the essence of it.
CDDL was not the problem (as evidenced by DTrace); the terms of the 'private license' were, as Apple wanted more than just the basic functionality it could have gotten by importing the code.
> The APFS engineers I talked to cited strong ECC protection within Apple storage devices. Both NAND flash SSDs and magnetic media HDDs use redundant data to detect and correct errors. The Apple engineers contend that Apple devices basically don't return bogus data. NAND uses extra data, e.g. 128 bytes per 4KB page, so that errors can be corrected and detected. (For reference, ZFS uses a fixed size 32 byte checksum for blocks ranging from 512 bytes to megabytes. That's small by comparison, but bear in mind that the SSD's ECC is required for the expected analog variances within the media.) The devices have a bit error rate that's low enough to expect no errors over the device's lifetime.
Dominic Giampaolo wrote BeFS, Spotlight, and now APFS.
In my 15-ish years running ZFS at home, the only time I’ve had corruption was when there were also noticeable hardware issues (cables, drives, enclosures). ZFS made them easy to deal with, but wouldn’t have helped if I hadn’t already been running RAIDZ or mirrors.
I’ve not looked recently, but in the past ZFS was extremely RAM-hungry and relatively CPU-expensive, not necessarily something optimized for mobile devices or battery life.
ZFS expects to have a huge cache (the ARC), which by default can grow to about half of system memory, separate from the regular file system cache. Advanced features like deduplication need a certain amount of that cache per TB of storage.
For single-disk non-checksummed, non-deduplicated storage, it's a lot of wasted code that a device with a "mere" gigabyte of RAM doesn't need. So APFS hits most of their needs: volume management + journal + better disk layout for SSD.
Dedupe is an optional and often misunderstood feature. It's a solution to a niche problem, not something you should enable just because it exists. Data protection doesn't require lots of memory, but the way it's implemented means you can't back memory-mapped files 1:1 with the file system cache the way you can with a less reliable file system that modifies file content in place without checksumming. ZFS doesn't have to be the memory hog it's made out to be. A lot of the issues with ZFS on 32-bit systems came from the fact that, pre-Meltdown, the kernel heap had to fit into <1GiB of an address space shared with memory-mapped I/O. Often its cache had to be restricted to ~300MiB, and it tried to allocate 128KiB contiguous buffers from that; since it was designed for 64-bit virtual address spaces, there was no support for chaining multiple smaller allocations to back a large block.
The big problem with APFS is that it was designed by people believing in magical hardware that doesn't let the file system observe whole classes of errors...
I don’t think it was ever one or the other. Ruby was hot around 2005, and Apple experimented with language bindings for AppKit (and osascript?). I did a bunch of prototyping using MacRuby and would then rewrite in Objective-C.
Ruby’s object model is relatively similar to Objective-C’s: no direct member access from outside the object, everything is loosely coupled messages on a dynamic runtime, and you could get away with duck typing in some cases in Objective-C using `id` types (at the expense of losing type checks from the compiler).
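To make the parallel concrete, here's a toy Ruby sketch (all names invented) of the duck typing described above: any object answering the message works, much like sending a message through an Objective-C `id` with no compile-time check.

```ruby
# Two unrelated classes that happen to respond to the same message
class Duck
  def quack
    "quack"
  end
end

class Robot
  def quack
    "beep"
  end
end

# Roughly the Ruby analogue of objc's respondsToSelector: check
# followed by a dynamic message send to an `id`-typed receiver.
def make_noise(obj)
  obj.respond_to?(:quack) ? obj.send(:quack) : nil
end

puts make_noise(Duck.new)   # => quack
puts make_noise(Robot.new)  # => beep
```

The compiler never sees a type for `obj`; correctness is purely a runtime property, which is exactly the trade-off the `id` escape hatch gives you in Objective-C.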
I don’t think it was ever intended as the next systems language, though. It was just a convenient binding and an optimized runtime (many of its core types were backed by Core Foundation types, NSString, etc., IIRC).
There was a brief window when it was possible to do some really cool stuff with MacRuby, such as inspecting and even monkey patching core OS X classes. Probably one major reason why Apple killed the project in the first place. Being able to interact with the operating system on a low level through macirb was an absolutely mind blowing experience.
Sadly, those days are long gone, and I just don’t think Swift brings the same kind of flexibility or joy that MacRuby did.
This seems wild. Ruby and Swift couldn't be any further from each other. I struggle to believe that Apple ever seriously believed that Ruby was their ideal language for app development on their platforms.
That’s what MacRuby was. There was a time when you could develop native, compiled Mac applications using Ruby as a first-class language. For a while, MacRuby was looking to be the lightweight alternative to Objective-C. Unfortunately, the project was killed around the same time Apple released and started promoting Swift. I don’t know that there was any definite reason given (not that there ever is) but my understanding at the time was that Apple wanted a language developed in-house rather than adopting something developed somewhere else.
> If the Go issues were distinct I’d imagine they’d choose a different day to disclose/release?
I think it's just a funny coincidence. That's going based on what I know about the OpenSSL one; I don't know anything about the Go one. We'll find out!
In the context of the article "drowning doesn't look like drowning". Drowning a week after you have been submerged in water doesn't "look like drowning". The definition may not be medically accepted but the concept is well documented.
From the original article.
> Dry drowning occurs when, after being submerged in water, a person's vocal cords experience a spasm and close, making it difficult to breathe, said Dr. Mike Patrick, an emergency-medicine physician at Nationwide Children's Hospital in Columbus, Ohio, who was not involved in the boy's care. When this happens, the body's response is to send fluid to the lungs to try to open up the vocal cords. But this can lead to excess fluid in the lungs — a condition called pulmonary edema. Symptoms of dry drowning usually start within an hour after a person is submerged in water, Patrick said.
Related: "The Manual (How to Have a Number One the Easy Way) is a 1988 book by The Timelords (Bill Drummond and Jimmy Cauty), better known as The KLF. It is a step by step guide to achieving a No.1 single with no money or musical skills, and a case study of the duo's UK novelty pop No. 1 'Doctorin' the Tardis'."
A bit of historical context: this paper came out in 2003, the same year NPTL (in RH9) and Linux 2.6.0 (first stable kernel with epoll, 2.5.44 introduced it in 2002) were released.
It is worth noting that kqueue was introduced in FreeBSD 4.1 in 2000, /dev/poll in Solaris 8 at about the same time, and though somewhat different in nature, IOCP and Overlapped I/O were in Windows NT from before even that.
So there was already experience with high performance event driven mechanisms before epoll.
That said, the examples presented in this paper do appear to be Linux-specific.