Sadly I never got to work on this when I was at Apple (interviewed for it though!), but hearing about this a few years ago sort of made me realize something that should have been obvious: there’s not really a difference between a database and a file system.
Fundamentally they do the same thing, and are sort of just optimizations for particular problem-sets. A database is great for data that has proper indexes, a file system is great for much more arbitrary data [1].
If you’re a clever enough engineer, you can define a file system in terms of a database, as evidenced by iCloud. Personally, I have used this knowledge to use Cassandra to store blobs of video for HLS streams. This buys me a lot of Cassandra’s distributed niceties, at the cost of having to sort of reinvent some file system stuff.
[1] I realize that this is very simplified; I am just speaking extremely high level.
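Roughly, the shape of what I mean looks like the sketch below (Python driver; the keyspace, table, and stream names are invented for illustration, not what I actually ran):

    # One row per HLS segment, keyed by stream and sequence number.
    from cassandra.cluster import Cluster

    session = Cluster(["cassandra-1", "cassandra-2"]).connect("media")
    session.execute("""
        CREATE TABLE IF NOT EXISTS hls_segments (
            stream_id  text,
            segment_no int,
            data       blob,
            PRIMARY KEY (stream_id, segment_no)
        )""")

    # Write path: each .ts segment becomes a blob cell.
    insert = session.prepare(
        "INSERT INTO hls_segments (stream_id, segment_no, data) VALUES (?, ?, ?)")
    with open("seg00042.ts", "rb") as f:
        session.execute(insert, ("stream-abc", 42, f.read()))

    # Read path: the HTTP layer rebuilds the playlist and serves segments from rows,
    # which is the "reinventing some file system stuff" part.
    row = session.execute(
        "SELECT data FROM hls_segments WHERE stream_id = %s AND segment_no = %s",
        ("stream-abc", 42)).one()
    segment = row.data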
> there’s not really a difference between a database and a file system. Fundamentally they do the same thing, and are sort of just optimizations for particular problem-sets.
Conceptually that is quite true, though the domain dependencies make a lot of the code end up looking quite different.
But the first true database (pre-relational!) was developed for SABRE, American Airlines' computerized reservation system, in the early 1960s. Before that tickets were issued manually and the physical structure of the desks and filing systems used to make reservations reflected the need!
Unfortunately I can't find the paper I read (back in the mid '80s) on the SABRE database, but I remember that the record size (which is still used today!) was chosen based on the rotational speed of the disk and seek latency. Certainly there was no filesystem -- the concept of a filesystem barely existed, though Multics developed a hierarchical filesystem (intended to be quite database-like, as it happens) around the same time. The database directly manipulated the disk. I don't know when that changed -- perhaps in the 1970s?
Like I said I can't quickly find the paper on the topic, but here's a nontechnical discussion with some cool pictures: https://www.sabre.com/files/Sabre-History.pdf. A search for "American Airlines SABRE database history" finds some interesting articles and a couple of good Wikipedia pages.
I think direct manipulation never went away, but the abstractions that were provided for general use were too useful to pass up for most workloads.
Some kinds of storage, like cloud-scale object storage, use custom HDD firmware and custom on-disk formats instead of filesystems (±2005-era tech). We also have much newer solutions that work on disks directly, like HMR (not to be confused with HAMR or HAMMER2), where the host manages the recording of data on the disk. There are some generally available systems for that, but we also have articles like this: https://blog.westerndigital.com/host-managed-smr-dropbox/ (which mostly focuses on SMR, but this works on CMR too).
As for the record size in the DB vs. disk attributes: that's probably not used like that anymore, but I do know that filesystem chunks/extents/blocks are calculated and grouped to profit from optimal LBA access. If you run ZFS, you can have it auto-detect the ashift, or set it manually, to match the actual on-disk sector size. This was especially relevant when 512e and 4Kn (and the various manufacturers' 'real' and 'soft' implementations) weren't reliable indicators of the best sector-access-size strategies.
I could be wrong, but I seem to recall that Oracle, back when I learned it in school (mid-2000s), supported dropping a database on a raw block device. So it's been around a long time, but would be uncommon in some tech circles.
Yeah, until the mid '00s you would run your DB directly on raw disk devices, both to optimize the use of larger contiguous disk regions (disk drives were slow in those days!) and, crucially, because if/when your server went down hard, any pending OS-buffered writes would result in a corrupted database, lost data, and lengthy rebuilds from logs (generally after having to do a long fsck recovery just to get back into the OS). It wasn't until journaled filesystems became common and battle-tested that you saw databases living in the filesystem proper.
Good old ISAM (Indexed Sequential Access Method) before DASD (Direct Access Storage Device) took over. (Aren't you glad IBM didn't win the "name the things" contest? :-))
I'm going to guess that by "domain dependency" you're talking about how
handle = open("foo.txt");
looks semantically different from
err = db->exec("SELECT * from DIRECTORY where NAME = 'foo.txt';", &result);
So yes, in that regard they certainly "feel" different, although at some point I needed a file system for an application, so I built a wrapper layer for sqlite that basically gave you open/read/write/delete calls and filled in all the other stuff to convert specialized filesystem calls into general-purpose database calls.[1]
The best thing you can say about the way UNIX decided to handle files was that it forced people to either use them as is or make up their own scheme within a file itself (and don't get me started on the hell that is 'holey' files)
[1] In my case the underlying data storage was a NAND flash chip, so the result you got back, which was nominally a FILE* like stdio's, had the direct address on flash of where the bits were. Read-modify-write operations were slow since they effectively copied the file (preserving flash sector write lifetimes).
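For flavor, here's a toy of the same wrapper idea in Python over sqlite3 (the real thing was C against NAND flash; these names are invented):

    import sqlite3

    class BlobFS:
        """open/read/write/delete veneer over a single name->blob table."""
        def __init__(self, path=":memory:"):
            self.db = sqlite3.connect(path)
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, data BLOB)")

        def write(self, name, data):
            # Whole-file replace, i.e. the read-modify-write the flash version did.
            self.db.execute("INSERT OR REPLACE INTO files VALUES (?, ?)", (name, data))
            self.db.commit()

        def read(self, name):
            row = self.db.execute(
                "SELECT data FROM files WHERE name = ?", (name,)).fetchone()
            if row is None:
                raise FileNotFoundError(name)
            return row[0]

        def delete(self, name):
            self.db.execute("DELETE FROM files WHERE name = ?", (name,))
            self.db.commit()

    fs = BlobFS()
    fs.write("foo.txt", b"hello")
    assert fs.read("foo.txt") == b"hello"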
In addition to disks, IBM direct-access storage options available in the middle sixties included a variety of magnetic drum devices and the short-lived, tape-based Data Cell Drive[1].
If you think about it, modern IBM mainframes have a lot of weirdness about their filesystems and the concept of a file. Those machines are very alien for people who grew up on Unix.
SABRE was specifically disk drives, though given the capacity of drives in those days I'm sure tapes were very important (and you see a lot of them in the photos from the link I included)
> there’s not really a difference between a database and a file system
Having written an interface to FoundationDB in preparation to moving my app over to it, I couldn't disagree more.
Even "has proper indexes" is not something we'd agree on. In my case, for example, I am extremely happy with the fact that my indexes are computed in my app code, in my language, and that I am not restricted to some arbitrary concept of database-determined "indexable fields" and "indexable types".
Then there are correct transactions, versionstamps (for both keys and values), streaming large amounts of data, all of that in a distributed database, it's really nothing like a filesystem.
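To make the index point concrete, here's roughly what an app-maintained index looks like with the FDB Python bindings (the key layout is made up for illustration):

    import fdb
    fdb.api_version(630)
    db = fdb.open()

    @fdb.transactional
    def save_user(tr, user_id, email):
        # The record and its index entry are just two key-value writes committed
        # in one transaction; the "index" is whatever key shape the app chooses.
        tr[fdb.tuple.pack(("user", user_id))] = email.encode()
        tr[fdb.tuple.pack(("idx", "email", email, user_id))] = b""

    @fdb.transactional
    def user_ids_for_email(tr, email):
        prefix = fdb.tuple.pack(("idx", "email", email))
        return [fdb.tuple.unpack(kv.key)[-1]
                for kv in tr.get_range_startswith(prefix)]

    save_user(db, "u42", "a@example.com")
    print(user_ids_for_email(db, "a@example.com"))  # -> ['u42']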
I'm interested in having you expand on these thoughts, so I'll play devil's advocate here. I personally don't have strong opinions on the subject.
> has proper indexes
Does it matter where in the code the index lives? Are you arguing that databases don't have proper indexes or that filesystems don't? I'm not sure I'd agree with either argument.
> correct transactions
filesystems and databases have transactions, which one is "incorrect"?
> versionstamps (for both keys and values)
filesystems have timestamps, not sure what a versionstamp is but I suspect it's some domain specific name for a more general concept that both databases and filesystems utilize.
> streaming large amounts of data
many databases stream massive data and filesystems certainly do
> all of that in a distributed database
every major PaaS has some form of distributed filesystem
How do they differ from vector clocks? Just a different implementation of the same thing maybe? Either way, distributed filesystems definitely have the same general concept.
You can instantiate a POSIX compatible FS using database tables, and then mount them using FUSE. From there you can export it via NFS if you wish. You can also export the FS via WebDAV and thus mount it over the network using the WebDAV support built in to Windows or macOS.
If you want to work with the FS transactionally, you have to do that using PL/SQL. POSIX doesn't define APIs for FS transactions, so some other approach is needed.
Because it's stored in the DB you can use all the other features of the RDBMS too like clustering, replication, encryption, compression and if need be, you can maintain indexes over the file content.
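A minimal read-only sketch of the FUSE side, assuming the files live in a simple name/blob table (fusepy and sqlite3 as stand-ins; flat namespace, no permissions model, invented names):

    import errno, stat, sqlite3
    from fuse import FUSE, FuseOSError, Operations

    class DBFS(Operations):
        def __init__(self, dbpath):
            self.db = sqlite3.connect(dbpath, check_same_thread=False)

        def _blob(self, path):
            row = self.db.execute("SELECT data FROM files WHERE name = ?",
                                  (path.lstrip("/"),)).fetchone()
            return None if row is None else row[0]

        def getattr(self, path, fh=None):
            if path == "/":
                return dict(st_mode=stat.S_IFDIR | 0o755, st_nlink=2)
            data = self._blob(path)
            if data is None:
                raise FuseOSError(errno.ENOENT)
            return dict(st_mode=stat.S_IFREG | 0o444, st_nlink=1, st_size=len(data))

        def readdir(self, path, fh):
            return [".", ".."] + [r[0] for r in self.db.execute("SELECT name FROM files")]

        def read(self, path, size, offset, fh):
            return self._blob(path)[offset:offset + size]

    # FUSE(DBFS("files.db"), "/mnt/dbfs", foreground=True)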
Absolutely, I was referring to the cleverness of the engineers that actually made those implementations.
Making a FUSE file system is sort of a bucket list thing I haven’t gotten around to doing yet. Maybe I should hack something together while I am still unemployed…
"there’s not really a difference between a database and a file system"
The BeOS filesystem was basically a database.
But there are a lot of differences between a database and a file system. A better way of thinking about it is that a filesystem is just a specialized database.
From the old-school view, a database is really just a collection of data. An RDBMS = a relational database management system. A filesystem is just another kind of database. Etc., etc.
BeFS wasn't really a database as we'd normally understand it. It had no transactions, for one. It also only understood strings and numbers as datatypes.
It had what was basically a normal UNIX filing system, complete with /dev, /etc and so on, and it had support for indexing extended attributes. Your app was expected to create an index with a specific API at install time, and then after that writes to the indexed xattr would update a special "index directory". The OS could be given a simple query predicate with range and glob matching, and it would answer using the indexes, including a live query.
This was neat, but you could implement the same feature in Linux pretty easily. Nobody ever has, probably because xattrs historically didn't work that well. They don't get transmitted via common network protocols and have a history of getting lost when archiving, although I think these days every archive format in real use supports storing them.
There's also the question of how it interacts with POSIX file permissions. BeOS was an aggressively single user system so just didn't care. On Linux you'd need to think about how files that you can't read are treated in the indexing process.
Multiple devices also poses problems. BeOS simply required that apps create an index on a specific device themselves. If you plugged in a USB drive then files there just wouldn't show up in search unless the files had been created by not only BeOS, but an app you had previously installed. Note that installing an app post-hoc didn't work because creating an index didn't populate it with existing files, even if they had the right xattrs.
And of course it only worked with files. If you had content where the user's conception of a thing didn't map 1:1 to files, it was useless. For example you couldn't index elements within a document this way. Spotlight can index app states and screens, which also obviously BeOS couldn't do.
So there were a lot of limitations to this.
The modern equivalent would be writing a search plugin:
If anything a database is a form of filesystem, as the name filesystem comes from 'file system', a system of organizing files or records. But filesystems officially came after databases, as early databases were designed to make best use of hardware and storage devices to store and retrieve data efficiently, making it easier and faster for computers of the time to use the data. So databases were, effectively, the first filesystems.
But the distinction is pretty small. Both filesystems and databases are just wrappers around a data model. The former is primarily concerned with organizing data on a disk (with respect to space, speed, location and integrity), and the latter is primarily concerned with organizing and querying data (with respect to ease-of-use, speed and integrity).
People today seem to think relational databases were the first and only databases. But there are many types of database: flat, hierarchical, dimensional, network, relational, entity–relationship, graph, object-oriented, object-relational, object-role, star, entity–attribute–value, navigational, document, time-series, semantic, and more.
One of the earliest filesystems, the CP/M filesystem, was basically a flat database. Successive filesystems have taken on other data models, such as hierarchical, network and navigational. Since filesystems are used as a low-level interface to raw data, they didn't need more advanced data models or forms of query. On the other hand, IBM DB2, Hadoop, and Google File System are all forms of database filesystems, combining elements of both databases and filesystems.
"there’s not really a difference between a database and a file system."
It depends on how abstracted you're getting. I sometimes talk about the 30,000 foot view, but in this case, I might stretch the metaphor to say that from Low Earth Orbit, there is indeed not much difference between a database and a file system. In fact, there's not much difference between those things and some function calls. You put some parameters out, you get some stuff back.
From just slightly higher one realizes or remembers, it's all just numbers. You put some numbers into the system and get some other numbers out. Everything is built out of that.
You can build a database out of functions, a file system out of a database, functions out of a file system (albeit one beyond a blob store, think /proc or FUSE rather than ext2), you can mix network streams into any of these, anything you like.
And while it's helpful to be aware of that, at the same time, you are quite into architecture astronautics at that point and you are running low on metaphorical oxygen, and while the odd insight generated from this viewpoint might help here or there, if one wishes to actually build iCloud, one is going to have to come a great deal closer to Earth or one is going to fail.
Still, in the end, it's all just numbers in response to other numbers and the labels we humans put on exactly how the numbers are provided in response to other numbers are still the map and not the territory, even in the world of programming where arguably the map and the territory are as close as they can possibly be and still be in reality.
One can probably say that there exists a level of abstraction where there’s not really a difference between a database and a file system. That's not a lot :-)
And, of course, if you go the other way and get closer where databases and functions are different enough to be considered different things, the filesystem is still a database. It is meant to be a database in every sense of the word.
It's true. One of the projects in my little "Ridiculous Enough To Work" folder is SQLiteOS, which uses a giant SQLite database as the underlying filesystem.
That's the difference - the API; as much as you can store a lot of data in either, SQL is not much like POSIX. The lower-level "distributed" APIs are like OS implementations of the POSIX API.
That's how Amazon made Aurora: move all state onto the object-storage layer, which is also at the end of processing (you go through the LB, then the frontend, then the backend, then the database, and land on disk).
Stateless is basically moving everything to the back.
I'm pretty sure Google is doing the same thing / started with it.
Also, this makes it 'easily' horizontally scalable: as soon as you are able to abstract at the object level, you can scale your underlying infrastructure to just handle 'objects'.
I remember back in the 80s thinking that a file system that was organized like a relational database¹ would be a really wonderful thing. Files could live in multiple places with little difficulty and any sort of metadata could be easily applied to files and queried against.
⸻
1. I had read the original paper on database normalization over the summer and was on a database high at the time. I was young.
The difference is that file systems need a lot of “mechanical sympathy” to account for the many quirks inside syscalls and actual physical disks.
There was a nice video about how it is really hard to implement file systems because disks just don’t do what you expect.
Databases are a layer up and assume that they can at least write a blob somewhere and retrieve it with certain guarantees. Those guarantees are a thousand hacks in the file system implementation.
Most non-trivial databases run on what is essentially their own purpose-built file system, bypassing many (or all) of the OS file services. Doing so is both higher performance and simpler than just going through the OS file system. Normal OS file systems are messy and complex because they are serving several unrelated and conflicting purposes simultaneously. A database file system has a fairly singular purpose and focused mission, and also doesn't have the massive legacy baggage of general purpose file systems, so there are fewer tradeoffs and edge cases to deal with.
The more sophisticated the database kernel, the more the OS is treated like little more than a device driver.
Unfortunately those mechanical sympathies related to spinning disks, and now we have SSDs that have to fake like they are spinning disks for file system compatibility and all the software that expects file systems to behave that way.
>Sadly I never got to work on this when I was at Apple (interviewed for it though!), but hearing about this a few years ago sort of made me realize something that should have been obvious: there’s not really a difference between a database and a file system.
Many years back I came to the realization that a database is just a fancy data structure. I guess a file system is too.
Database queries are a lot more complex than a pattern match search. In addition, grep et al aren’t part of the file system in both the simple sense (they ship separately) and the meaningful sense (filesystems are rarely designed to facilitate them).
> grep et al aren’t part of the file system in both the simple sense (they ship separately)
It seems you are confusing database with database engine or possibly database management system. Querying is not a function of a database.
In fairness, the lazy speaker often says "database" in place of "database engine" and "database management system" to save on the energy of having to make additional sounds when one can infer what is meant by context, but in this case as "database" means database...
> (filesystems are rarely designed to facilitate them)
Facilitating querying is a primary objective of a filesystem database. What do you think "path/to/file.txt" is? That's right. It's a query!
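To spell it out with a toy dirent table (hypothetical schema; sqlite3 only as a stand-in), resolving that path is one lookup per component:

    import sqlite3
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE dirent (parent INTEGER, name TEXT, inode INTEGER)")
    db.executemany("INSERT INTO dirent VALUES (?, ?, ?)",
                   [(1, "path", 2), (2, "to", 3), (3, "file.txt", 4)])

    inode = 1  # root directory
    for name in "path/to/file.txt".split("/"):
        inode = db.execute("SELECT inode FROM dirent WHERE parent = ? AND name = ?",
                           (inode, name)).fetchone()[0]
    print(inode)  # -> 4, i.e. the path *was* the query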
Pedantically, it is the file system that is a type of database. Traditionally, database is the low-level generic term, referring to any type of structured data stored on a computer. File system, also known as the hierarchical database, adds additional specificity, referring to a particular structuring of data. Another common one is the relational database, offering another particular structuring of data.
LDAP and the Windows registry are hierarchical databases, just like a traditional file system, so the “file system = database” makes a lot of sense to me.
> there’s not really a difference between a database and a file system.
That was the promise of WinFS back in the day, which would have been really something had MS managed to bring it to fruition.
I still remember the hype from back then, in my opinion totally justified, too bad that things didn't come to be. I legit think that that project could have changed the face of computing as we know it today.
They tried to adapt SQL Server iirc but it wasn't the right approach for a desktop OS.
The issue with the filesystem-as-database concept is that unless you're doing it as a serverside thing to get RDBMS features for files, it doesn't give you much more power without very serious changes to applications.
The first problem is that databases are most useful when they index things, but files are just binary blobs in arbitrary formats. To index the contents you have to figure out what they are and parse them to derive interesting data. This is not best done by the filesystem itself though - you want it to be asynchronous, running in userspace and (these days) ideally sandboxed. This is expensive and so you don't want to do it on the critical file write path. Nowadays there are tools like Spotlight that do it this way and are useful enough.
If you don't do that then when it comes time to sell your shiny fs-as-a-db feature for upgrade dollars, you have to admit that your db doesn't actually index anything because no apps are changed to use it. Making them do so requires rewriting big parts from scratch. In that era I think the Office format was still essentially just memory dumps of internal data structures, done for efficiency, so making Office store documents as native database tables would have been a huge project and not yielded much benefit over simple text indexing using asynchronous plugins to a userspace search service.
Another problem is that databases aren't always great at indexing into the middle of blobs and changing them. Sometimes db engines want to copy values if you change them, because they're optimised for lots of tiny values (row column values) and not small numbers of huge values. But apps often want to write into the middle of files or append to them.
Yet another problem is that apps are very sensitive to filesystem performance (that's why the fs runs in the kernel to begin with). But databases do more work, so can be slower, which would make everything feel laggy.
So yeah it was a beautiful vision but it didn't work out. Note that operating systems started with databases as their native data storage mechanism in the mainframe era, and that was moved away from, because there are lots of things you want to store that aren't naturally database-y (images, videos, audio, source code etc).
Even now we see many cases where "files are stored in the database" eventually migrates to "we store files on the filesystem and pointers to them in the database". I know at least a few projects that have done that migration at some point.
No. The takeaway is basically that there is no reason for Windows to use a relational database for storing information about files when a hierarchical database does it better for the vast majority of use cases its users encounter.
It is, perhaps, possible another product with a different set of users with different needs could still find value in a relational filesystem, but Microsoft was unable to find that fit.
Reiser argued that if you optimised a filesystem for very tiny files, then many cases where apps invent their own ad-hoc file-systems-in-a-file could be eliminated and apps would become easier to read/write and more composable.
For example, instead of an OpenOffice document being a zip of XMLs, you'd just use a directory of XMLs, and then replace the XMLs with directories of tiny files for the attributes and node contents. Instead of a daemon having a config file, you'd just have a directory of tiny files. He claimed that apps weren't written that way already because filesystems were wasteful when files got too tiny.
Git is an example of a program that uses this technique, to some extent at least (modulo packfiles).
In reality, although that may have contributed, there are other reasons why people bundle data up into individual files. To disaggregate things (which is a good place to start if you want a filesystem-db merge) you also have to solve all those other reasons, which ReiserFS never did and as a project that "only" wanted to reinvent the FS, could not have solved.
Apple hit some of those issues when they tried making iLife documents be NeXT bundles:
1. Filesystem explorers treat files and directories differently for UI purposes. Apple solved it nicely by teaching the Finder to show bundle directories as if they were files unless you right click and select "Show contents". Or rather partly solved ... until you send data to friends using Windows, or Google Drive, or anything other than the Finder.
2. Network protocols like HTTP and MIME only understand files, not directories. In particular there is no standardised serialisation format for a directory beyond zip. Not solved. iLife migrated from bundles to a custom file format partly due to this problem, I think.
3. Operating systems provide much richer APIs for files than directories. You can monitor a file for changes, but if you want to monitor a directory tree, you have to iterate and do it yourself. You can lock a file against changes, but not a directory tree. You can check if a file has been modified by looking at its mtime, but there's no recursive mtime for directory trees. You can update files transactionally by writing to a temporary file and renaming, but you can't atomically replace a directory tree. Etc.
So the ReiserFS concept wasn't fully fleshed out, even if it had been accepted into the kernel. Our foundational APIs and protocols just aren't geared up for it. I've sometimes thought it'd be a neat retirement project one day to build an OS where files and directories are more closely merged as a concept, so files can have sub-files that you can browse into using 'cd' and so on, and those API/protocol gaps are closed. It wouldn't give you a full relational database but it'd be much more feasible to port apps to such an OS than to rewrite everything to use classical database APIs and semantics
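For contrast, the file-level trick mentioned above (write to a temp file, then rename) is just a few lines; there's no directory-tree equivalent of it. A sketch in Python:

    import os, tempfile

    def atomic_write(path, data):
        # Temp file must be on the same filesystem for the rename to be atomic.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())   # durable before the swap
            os.replace(tmp, path)      # atomic swap; readers see old or new, never half
        except BaseException:
            os.unlink(tmp)
            raise

    atomic_write("config.json", b'{"version": 2}')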
>>> 2. Network protocols like HTTP and MIME only understand files
Love when someone says something that makes my brain work!
For the most part you're spot on. HTTP has multipart messages that in theory could be extended to be composite of anything. So we could have those bundles! Oddly we can send to the server with a multipart message (forms)!!
I think that MIME is an interesting slice the OTHER way. You could store versions of the same document in a directory so HTML and JSON and XML OR a video or image in two formats and serve them up based on the MIME request.
Now if we could make one of those a multi part message...
The problem is the case where you want to upload or attach >1 document that's actually a directory. You need a way to signal that the first 3 files are a part of document A, and the next 5 are part of document B, and although you could invent a file name convention to express this nothing understands it. Email clients would show 7 attachments, web server APIs would show 7 files, browsers would need to be patched to let you select bundles in the file picker and then recursively upload them, how progress tracking works would need to change, etc.
And then how do you _download_ them? Browsers don't understand MIME at download time.
None of it is hard to solve. But, nobody ever did, and the value of doing things this new way is usually going to be lower than the value of smooth interop with everyone's different browser/OS/email/server combos.
AFAIK theoretically any database can be built on top of a key value store, and any transactional database on top of a key value store that also has transactions.
TiDB is an example of a distributed SQL on top of a transactional key value store called TiKV.
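A toy illustration of that layering, with a plain dict standing in for the ordered, transactional KV store underneath (TiKV, FoundationDB, and friends); a real layer would do the two writes in one transaction and use a range scan rather than sorting:

    kv = {}

    def put_user(user_id, name):
        kv[f"user/{user_id}"] = name            # primary record
        kv[f"idx/name/{name}/{user_id}"] = ""   # secondary index entry

    def users_named(name):
        prefix = f"idx/name/{name}/"
        # Stand-in for an ordered prefix/range scan on the KV store.
        return [k[len(prefix):] for k in sorted(kv) if k.startswith(prefix)]

    put_user("42", "alice")
    put_user("43", "bob")
    print(users_named("alice"))  # -> ['42']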
Filesystems are hierarchical databases, as opposed to relational databases (relational is usually implicit when people simply say "database", but this wasn't always the case.)
If you look up WinFS (which is a cancelled Windows file system originally intended to ship with Windows Longhorn), its basic principle is exactly that, be a database that happens to work as a file system.
Not sure why exactly it failed, I assume that it just wasn't a suitable idea at the time given that most consumer devices (especially laptops) had very slow traditional hard drives, but in the age of NVMe storage, maybe it would be worth revisiting, assuming that Microsoft is still interested in evolving Windows in meaningful ways outside of better Ad delivery mechanisms.
It did fail in delivering the actual product that was intended, but yeah, they did salvage a lot of it and also AFAIK helped the SQL Server team improve a few things. So it's a bit like Intel's Larrabee (which did technically come out as a product, Xeon Phi) as well: a high-profile R&D project.
I leveraged FoundationDB and RecordLayer to build a transactional catalog system for all our data services at a previous company, and it was honestly just an amazing piece of software. Adding gRPC into the mix for the serving layer felt so natural since schemas / records are defined using Protobuf with RecordLayer.
The only real downside is that the onramp for running FoundationDB at scale is quite a bit higher than a traditional distributed database.
Sounds cool. Any write up on this? How did you approach the design? What was the motivation to use foundation db? How much did you/your team needed to learn while doing it?
No write up, but the main reason was reusing the existing database we were comfortable deploying at the time. We were already using FDB for an online aggregation / mutation store for ad-hoc time-series analytics...albeit, a custom layer that we wrote (not RecordLayer).
When RecordLayer launched, I tested it out by building a catalog system that we could evolve and add new services with a single repository of protobuf schemas.
We gave up on iCloud for file sync; it's broken on dozens of devices, trying to “optimize” storage even when asked not to. Imagine having 4 TB (size doesn’t matter) of mostly empty hard drives and not being “allowed” to keep a file copy offline, because iCloud knows better…
Now Apple is asking all file-sync products like Dropbox to do the same (see the File Provider API), breaking those as well. Really annoying.
With iCloud, Apple indeed handles update conflicts well in Apple Notes.
I have tried to set up Obsidian and other Markdown-based note-taking systems, but the sync broke so often that I had to give up. Apple Notes does handle this pretty well, so I finally moved to Apple Notes.
I can't do without Obsidian now. Its default graph representation of knowledge matches how my scatterbrain works. It has the creature comforts I've come to expect: simple (local) text storage, a fast command/search palette, gobs of integrations (e.g. Excalidraw for my tablet). Watching one of my knowledge vaults evolve is incredibly satisfying.[1]
Obsidian is the only note app that I've stuck with. Notion/Apple Notes/Goodnotes/etc just had excessive pain points. Obsidian "just works" for my brain. Which is a relief, since the productivity app treadmill is exhausting.
Something I really appreciate about Obsidian is that they seem to be keeping the core application constrained and clearly defined. I worried they would adopt plugins into the application and have things kind of bloat out of control, but they've maintained a clear separation (even now with many plugins not working with Obsidian Publish). That can be a hard line to maintain and protect when you have paying customers and they're doing a great job sticking to what they're good at.
My experience is the opposite. I lost data twice with Apple iCloud Notes: once when a major upgrade deleted many of my notes, and in the other case most of my attachments became blank. I'm not getting on that boat ever again.
Honestly, Obsidian with iCloud is so bad, that I'm afraid to pay for Obsidian Sync because half the time the errors and freezing of the Obsidian app seem like they have nothing at all to do with iCloud. It's really hard to tell, because Obsidian doesn't surface any errors, it just randomly freezes and has trouble opening files that should be there.
I'm utterly baffled why my iOS backups can live in Apple's cloud but not my Mac ones.
I honestly expected them to launch it years ago. The fact they still haven't seems to mean they've firmly decided not to for some reason, but I'm totally clueless as to what the reason could be.
Especially when making more money off services is a strategic priority for the company.
It isn't as polished as whatever first-party solution Apple has the potential to develop, but I just use OneDrive to restore my personal data + chezmoi to reprovision my dotfiles and it works pretty well.
About every six months I do a fire drill and completely factory reset my macbook. Takes about 10 minutes for me to go from a fresh device to one that has all my apps, data, and developer tools ready to roll. Only annoying thing you can't really automate is signing into services like OneDrive or Dropbox, but this isn't a problem if you use iCloud Drive.
I'm mildly surprised they haven't, but the reasons seem pretty obvious. Redundancy (in offerings), storage costs, and home network upload speeds.
Redundancy because the thing most people care about backing up is media and important documents, which are likely already stored in iCloud. If you care about Time Machine back ups you probably want your whole filesystem with point-in-time restores. That's a lot more data for Apple to hang onto, for a small segment of its target market. Of course, Apple does have 2TB+ iCloud+ plans, but I would bet that the average iCloud+ subscriber is using nowhere near their limit.
But Apple charges for storage space? Surely people needing more storage is a huge plus for Apple. Maybe they had worries about scaling storage capacity? A company like Apple could certainly figure it out though, so that seems unlikely.
My point is that I'm sure the only way iCloud is profitable or even break-even for Apple is if they rely on over-provisioning storage to users of the more paid plans. I started paying for the 200GB iCloud+ plan, and once my photos exceeded 200GB I ponied up for the 2TB plan. Unless I take up a photography hobby it'll be a long time until I get close to that 2TB, and I'd wager this is what Apple expects. Raising that baseline usage with Time Machine backups would mean it would need to be more expensive for end users, either by making iCloud+ more expensive or rolling out a new subscription product.
But sure then -- just charge more, or a new subscription product as you suggest just for Time Machine. They can even tie pricing to the size of your Mac's disk if they want. They can definitely make the economics work if they choose to.
> it'll be a long time until I get close to that 2TB
I thought the same thing until I realised that, with Family Sharing and a house with teenage kids sending each other embedded videos in iMessage, the time wouldn't be that long...
Suddenly I find myself 1TB in, and desperate to find a fix!
> Of course, Apple does have 2TB+ iCloud+ plans, but I would bet that the average iCloud+ subscriber is using nowhere near their limit.
But that's my point. To sell the 2TB plans to people who are merely on the free 5 GB or paid 50 GB plan.
And yes -- I don't even keep many files on my Mac, it's mostly in the cloud already. But if it gets lost/stolen, I want to restore all my apps and preferences the same way I do with my phone. Which is why I use Time Machine with a NAS, but it's silly to need a NAS at all. I just want to use the cloud.
I agree with you here and that's why I'm mildly surprised they haven't come up with a solution rolled into iCloud yet. Syncing apps and preferences shouldn't be that difficult, but unless they're App Store applications the binaries would take up a lot of space. Most of the apps I care about are from outside of the App Store. AFAIK our iOS backups don't actually back up application binaries.
The way I was looking at it is that Apple has successfully sold iCloud+ 2TB plans to a lot of people who don't need much more than 200GB. If everyone on the 2TB plan used even close to 2TB, I'd bet they'd have to charge me a lot more to make up the provisioning and usage costs of storage.
Wonder if there are economies of scale storing multiple users' backups that may partially contain a lot of the same data. If 10000 separate users' backups contain the same 10GB app binary...
Yeah, I mean Time Machine backs up the entire OS as well.
I would have no problem if Time Machine separated out OS and known signed application packages and basically just stored pointers to standard versions of them, as long as all that detection is done client-side.
There's no reason the backup would need to store anything but the list of those files (that list being encrypted), and then everything unique to me -- my configurations, my files, etc.
I haven't bothered with this in a while, but back in the day, I used to use Carbon Copy Cloner to get a true 1:1 backup. Time Machine was never exactly the same.
> I'm mildly surprised they haven't, but the reasons seem pretty obvious. Redundancy (in offerings), storage costs, and home network upload speeds.
I'd bet that the rigid APIs on iOS also play a huge role here.
Compared to the "anywhere you have permission to `open()` on disk" approach on macOS, iOS developers don't have as many options for where/how to store data. This probably makes backup / restore an order of magnitude simpler / reliable.
At one org, we went for the highest-tier Google Drive plan (with unlimited storage), because we've had this 1% of our internal users who would really, really benefit from having it. We could only go all or nothing (and the lower tier would meet the needs of the 99%), but the cost-benefit of enabling it for everyone was still pretty good.
I suppose Apple is keeping track of these numbers as well (keep in mind they know exactly how much storage each Mac has - because you can't expand it). I am also hoping it's under intensive internal testing; the quality of their software has been going downhill for a while, no power user would ever care if they shipped another broken product.
Even better, external NVMe SSD enclosures over Thunderbolt 3 can reliably read at 2,500 MB/s and write at over 1,500 MB/s. That's faster than internal SSD R/W speeds a few years ago. The newer generation of enclosures coming out claim to use the full bandwidth of USB4, 40 Gbps, and get >3,000 MB/s R/W.
Same here, but the “lots of Cassandra instances” approach isn’t really oriented for continuous versioning. One may notice the availability lags with the current iCloud implementation which sometimes come across as inconsistency.
On an unrelated note, having the original title edited by the system after submission without the OP being notified really annoys me, especially when the title starts with How, Why, or other such terms. It just makes it a little weird to read, and sometimes it breaks the meaning. I once submitted a story and had some people complaining about the title being somehow misleading. By the time I noticed, it was too late to edit the title.
In the HN guidelines, you read: "Otherwise, please use the original title, unless it is misleading or linkbait; don't editorialize."
They have built their own storage engine named Redwood, which has some very FoundationDB-specific optimizations (like prefix compression). Check out the "Storage Servers" section in this doc: https://apple.github.io/foundationdb/architecture.html
This reminds me of years back when I worked in banking. I vaguely recall there was a report system called Hyperion(an?) (IBM?). The system generated a new database for every single report it made. I thought that was kinda crazy at the time but I guess it was ahead of the times.
Someone feel free to correct my memory if needed, I was not the primary person for this system or anything so I could be totally wrong.
Very cool. This is the architecture that inevitably results when you start with boxed, native, desktop software and incrementally move towards cloud-based storage and collaboration. You have to be really good at doing schema changes and version migrations, because they're happening at fantastic scale without administrator intervention: not when you launch, but when each individual customer chooses to use the next version.
Quite different from a SaaS-first approach where it actually makes sense to do "customer id column"-based multi-tenancy and one-migration-at-a-time schema changes that I think most of us at less-than-Apple scales are familiar with.
At least with Cassandra, there are cell-level timestamps which are very useful for doing data migrations while active writes are still incoming.
You can simply mirror the writes to both systems, and then migrate the old data underneath. As long as the data transfer preserves the cell level timestamps, the read path resolves any differences and compaction will eventually clean up any duplicates. (and sstable loads will have the timestamps)
DynamoDB does NOT have cell-level timestamps; I believe they have row-level timestamps. How it does globally replicated data and mutation merges, I have no idea. It seemed like a handwave when they announced it about two or three years ago.
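A sketch of that backfill with the Python driver (table and column names invented); the point is that copied cells keep their original write timestamps, so a newer mirrored write to the same cell still wins:

    from cassandra.cluster import Cluster

    old = Cluster(["old-cluster"]).connect("app")
    new = Cluster(["new-cluster"]).connect("app")

    insert = new.prepare(
        "INSERT INTO events (id, payload) VALUES (?, ?) USING TIMESTAMP ?")

    # A real migration would walk token ranges / pages; a full scan keeps this short.
    for row in old.execute("SELECT id, payload, writetime(payload) AS ts FROM events"):
        new.execute(insert, (row.id, row.payload, row.ts))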
Yeah, I gave up on it trying to sync photos. The apps on desktop and mobile gave no indication of their state while processing files, so after a large upload I was still waiting for replication to occur days later and didn't know if it would ever complete.
It depends on the layer, some of the layers might be able to take advantage of how the data is persisted. For example, if you use avro/protobuf, the decoder will handle it for you. If that's not the case, you would have to implement the migration by yourself. There is a paper[1] on this subject called "Online, asynchronous schema change in F1", which explains how to implement it.
Great! If only I could manage which of my files stay local, and which get offloaded to iCloud I might be impressed. But it seems that iCloud likes to offload recently used files, apps and photos to make room for my massive library of old photos. It frequently makes my iPhone unusable unless I'm on wifi, and then I still have to wait for everything I want to use to re-download from iCloud.
CouchDB implements a DB per user approach. Personally, I've found it much easier to use than an SQL DB for web apps I've made, but I've heard others who've always used SQL say they were frustrated with it.
The thing with SQL databases is that the API they offer is designed for low-latency operation. This is not a big deal (ideal, even!) when the application and database share the same memory space where latency is imperceptible. And when it was originally designed, that was the norm, but at some point someone got the idea that they could expose the same API over the network. The network where latency is higher. That is where things start to fall apart.
It is nothing you cannot overcome with the right hacks (what can't be overcome with the right hacks?), but it is frustrating that the network-based API wasn't designed for high latency use from the start. It didn't need to use the exact same API that was designed for a low-latency environment, but that's what we got. As SQL and web apps typically means MySQL or Postgres, that means you are apt to encounter the API design problems.
Granted, it seems there is renewed interest in SQLite to move the SQL database back to the way it was designed to be used. Which isn't surprising as all things in computing come and go in cycles. Once we round out that cycle and get back to "database on the network", maybe we can get a more well designed API meant for high latency to remove those frustrations.
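The problem in miniature, with sqlite3 standing in (imagine each execute() costing a network round trip instead of an in-process call):

    import sqlite3
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER);
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    """)

    # Chatty: 1 + N statements. Harmless in-process, painful at ~1 ms per round trip.
    posts = db.execute("SELECT id, author_id FROM posts").fetchall()
    names = {pid: db.execute("SELECT name FROM authors WHERE id = ?", (aid,)).fetchone()
             for pid, aid in posts}

    # One round trip: push the join to the server instead.
    rows = db.execute("""
        SELECT posts.id, authors.name
        FROM posts JOIN authors ON authors.id = posts.author_id
    """).fetchall()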
I assume you mean AWS Athena - but no this is quite different from FoundationDB. Athena separates compute from storage (it’s Presto https://prestodb.io/ under the hood). Think of it as an on-demand SQL compute cluster. FoundationDB is a traditional combined storage/compute cluster. The Record Layer does provide some ability to scale-out the higher-level aspects of querying but it’s just a client library, not a separate compute service.
If you’re feeling adventurous, something you can try on the Mac is to trash `~/Library/Application Support/CloudDocs` and then restart the daemons by running `/usr/bin/killall bird cloudd`.
I only used that once, but it fixed all the months of odd syncing I had experienced.
I had a similar issue when I was trying to back up all my iCloud photos to S3 through the PhotoSync app[0]. I had about 600 photos that could not be downloaded from iCloud photos onto my iPhone. I ended up disabling iCloud Photos on the iPhone, then re-enabling it. This did end up making those photos available for download and the sync worked... it was rather nerve wracking though.
Mom had an issue where all her iCloud photos were syncing except the ones she'd taken after they renovated the kitchen. She had photos of everything but the kitchen synced.
I may be misremembering, but I think the deleted status does not sync on purpose unless you have “Messages in iCloud” turned on. On the Mac it’s under System Settings > [Your Name] > iCloud > Show More Apps…
Now sit back and wait for a million attempted interpretations of how this could just be you not accepting it, and how this is how it is supposed to behave. And that marking a message unread and then read again, probably triggering some job that fixes it, is a normal and needed step for this flawless feature to work.
You might also be told that you are supposed to delete those messages on every device and that if you expect it to work automatically then you don’t get it.