Announced back in 2019 at Build and Ignite, Microsoft’s Fluid Framework has finally arrived in the shape of the first public release of its open source repository on GitHub, a release Microsoft promised back in May at Build 2020.
Intended to support real-time, low-latency collaboration applications, Fluid Framework is one of the core technologies that will be part of a next generation of Office applications, providing a way to share content without requiring a complex set of back-end services. It’s not only for Office, though, as it’s a set of technologies you can build into your own code.
What is the Fluid Framework?
The underlying workflow of a Fluid application is simple enough, using a basic event broadcast model to manage messages between clients. It’s a familiar model to anyone building a many-to-many communications platform: data on a client changes, so a notification is sent to the Fluid service along with the change.
That change operation is carried out on data held on the Fluid service, generating a change notification that is then sent to all subscribed clients. Code running on all the clients receives the notification, handling the event and managing changes to local data and user experiences.
Perhaps the key difference between this and other collaboration platforms that tried to solve the same problem is that the server isn’t the source of truth in the network; it’s only the source of change.
Clients are expected to receive events and process them appropriately, in whatever order they’re received, converging on a common state. Instead of a complex server, all Fluid needs to implement is a low-latency message broker.
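The broadcast loop described above can be sketched in a few lines of TypeScript. This is a toy in-memory model rather than the Fluid Framework API: `MessageBroker`, `EditorClient`, and `Change` are names made up for illustration, and the shared data is assumed to be a simple key/value record.

```typescript
// Toy in-memory model of the broadcast loop; this is not the Fluid API.
// MessageBroker, EditorClient, and Change are hypothetical names.

interface Change {
  key: string;
  value: string;
}

class MessageBroker {
  private listeners: Array<(change: Change) => void> = [];

  subscribe(listener: (change: Change) => void): void {
    this.listeners.push(listener);
  }

  // The broker keeps no document of its own. It only relays each change
  // to every subscribed client, including the client that sent it.
  submit(change: Change): void {
    for (const listener of this.listeners) {
      listener(change);
    }
  }
}

class EditorClient {
  readonly state: Record<string, string> = {};
  private broker: MessageBroker;

  constructor(broker: MessageBroker) {
    this.broker = broker;
    broker.subscribe((change) => this.handle(change));
  }

  // A local edit goes to the service as a change notification.
  edit(key: string, value: string): void {
    this.broker.submit({ key, value });
  }

  // Every client runs the same handler on every change it receives.
  private handle(change: Change): void {
    this.state[change.key] = change.value;
  }
}

const broker = new MessageBroker();
const alice = new EditorClient(broker);
const bob = new EditorClient(broker);

alice.edit("title", "Draft");
bob.edit("title", "Final");
alice.edit("author", "Alice");

console.log(JSON.stringify(alice.state));
console.log(JSON.stringify(bob.state));
```

Both clients end up holding the same title and author, even though neither asked the broker for a copy of the document. All the broker contributed was a single ordering of the changes.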
Fluid as distributed application architecture
You could compare this approach to building applications that work with some of Cosmos DB’s less-restrictive consistency models, especially session and bounded staleness, where what’s important is that a client has the right data for a user and that all users eventually see the same view within a set amount of time.
Looking at it this way, it’s clear that in Fluid Framework, Microsoft is building a collaboration model based on modern cloud event-based distributed application models, rather than older, more constrained architectures.
Although the overall system has fewer constraints, it does require that developers treat all events as essential. No messages can be ignored, and all must be processed and applied as they arrive. Otherwise there is no consistency between endpoints, and the network will never converge on a common understanding of the content being processed.
In the case of a word processing document, it’s not essential, say, for an edit that removes a word to be processed at the same time as one that edits it. But it is essential for the underlying data structure to ensure that the document history records the edit even if the edit event is received after the deletion.
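One way to picture this is a document that marks deleted words rather than discarding them, so a late edit still has something to attach to. The sketch below assumes that tombstone-style approach purely for illustration; it is not a description of how Fluid’s own structures work, and `TombstoneDoc` and `WordOp` are hypothetical names.

```typescript
// Illustrative only: a tombstone-style document, not Fluid's implementation.

interface Word {
  id: number;
  text: string;
  deleted: boolean;
}

type WordOp =
  | { kind: "edit"; id: number; text: string }
  | { kind: "delete"; id: number };

class TombstoneDoc {
  readonly history: WordOp[] = [];
  private words: Word[];

  constructor(texts: string[]) {
    this.words = texts.map((text, id) => ({ id, text, deleted: false }));
  }

  apply(op: WordOp): void {
    // Every op is recorded, even one that targets an already-deleted word.
    this.history.push(op);
    for (const word of this.words) {
      if (word.id !== op.id) {
        continue;
      }
      if (op.kind === "delete") {
        word.deleted = true; // keep the word as a tombstone
      } else {
        word.text = op.text; // edits still land on tombstones
      }
    }
  }

  visibleText(): string {
    return this.words
      .filter((word) => !word.deleted)
      .map((word) => word.text)
      .join(" ");
  }
}

const editFirst = new TombstoneDoc(["the", "quick", "fox"]);
editFirst.apply({ kind: "edit", id: 1, text: "slow" });
editFirst.apply({ kind: "delete", id: 1 });

const deleteFirst = new TombstoneDoc(["the", "quick", "fox"]);
deleteFirst.apply({ kind: "delete", id: 1 });
deleteFirst.apply({ kind: "edit", id: 1, text: "slow" });

console.log(editFirst.visibleText());
console.log(deleteFirst.visibleText());
console.log(deleteFirst.history.length);
```

Whichever order the two ops arrive in, both documents show "the fox" and both histories contain the edit as well as the deletion.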
What goes into a Fluid application?
The architecture that underlies Fluid is relatively straightforward. Applications need to implement both a Fluid container and a Fluid loader. The container hosts the Fluid runtime, which is best thought of as an event-driven serverless application, much like you’d find in Azure Functions.
Along with your application logic, it hosts the distributed data structures that manage state and are where your code implements merges.
The Fluid container is managed by the Fluid loader and, as it says on the box, loads containers from the Fluid service. Containers have URLs, like the Web, and are shared by all the clients using the same Fluid app. There should be no difference between the container code running on a desktop PC, on a phone, or in the browser.
This allows you to write your containers once, run them anywhere, and know that documents will be consistent. Again, like well-designed cloud applications, the Fluid loader resolves the URL of a container (much like a service broker client), then connects using a service driver before loading the container with your code.
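The resolve, connect, load sequence can be sketched as three small steps. Everything in the block below is an assumption made for illustration: the `demo://host/documentId` URL shape, the `ServiceDriver` interface, and the `loadContainer` function are hypothetical and do not come from the Fluid Framework packages.

```typescript
// Hypothetical sketch of the load sequence; none of these names are Fluid APIs.

interface ResolvedUrl {
  endpoint: string;
  documentId: string;
}

interface ServiceConnection {
  documentId: string;
  submit(payload: string): void;
}

// A service driver hides how a particular service is reached.
interface ServiceDriver {
  connect(resolved: ResolvedUrl): ServiceConnection;
}

interface DemoContainer {
  documentId: string;
  edit(text: string): void;
}

// Step 1: turn a container URL into an endpoint and a document ID.
// The "demo://host/documentId" shape is an assumption of this sketch.
function resolveContainerUrl(url: string): ResolvedUrl {
  const scheme = "demo://";
  if (url.indexOf(scheme) !== 0) {
    throw new Error("Unsupported container URL: " + url);
  }
  const rest = url.substring(scheme.length);
  const slash = rest.indexOf("/");
  if (slash <= 0 || slash === rest.length - 1) {
    throw new Error("Container URL needs a host and a document ID: " + url);
  }
  return {
    endpoint: rest.substring(0, slash),
    documentId: rest.substring(slash + 1),
  };
}

// Steps 2 and 3: connect through the driver, then hand the connection to
// the container code, which is the same on every device.
function loadContainer(
  url: string,
  driver: ServiceDriver,
  containerCode: (connection: ServiceConnection) => DemoContainer
): DemoContainer {
  const resolved = resolveContainerUrl(url);
  const connection = driver.connect(resolved);
  return containerCode(connection);
}

// An in-memory driver that records what would be sent to the service.
const sentPayloads: string[] = [];
const inMemoryDriver: ServiceDriver = {
  connect(resolved: ResolvedUrl): ServiceConnection {
    return {
      documentId: resolved.documentId,
      submit(payload: string): void {
        sentPayloads.push(resolved.endpoint + "/" + resolved.documentId + ": " + payload);
      },
    };
  },
};

const notes = loadContainer("demo://localhost:7070/meeting-notes", inMemoryDriver, (connection) => ({
  documentId: connection.documentId,
  edit(text: string): void {
    connection.submit(text);
  },
}));

notes.edit("Agenda item 1");
console.log(notes.documentId);
console.log(sentPayloads[0]);
```

Swapping `inMemoryDriver` for a driver that talks to a real service would leave the container code untouched, which is the point of keeping the driver behind an interface.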
The Fluid service is the simplest part of the system. If you’ve worked with financial services applications, it’s recognisable as the familiar message broker. Messages come in from clients containing details of changes to a document, which Fluid calls “operations” or “ops.”
Each op is given a sequential ID and sent to all the clients, which then need to parse its contents and apply it to their documents. There’s no requirement for a server to parse the data in an op. All it needs to do is receive it, timestamp it, and send it on. With ops likely to be encrypted for security, there’s no need to worry about an eavesdropper being able to reassemble a document from Fluid messages.
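A service that only receives, stamps, and relays is small enough to sketch in full. The block below is a hypothetical stand-in, not the Fluid service’s code: `OrderingService`, `SequencedOp`, and the JSON payload format are all assumptions chosen to keep the example short.

```typescript
// Hypothetical ordering service; a sketch, not the Fluid service's code.

interface SequencedOp {
  sequenceNumber: number;
  timestamp: number;
  clientId: string;
  payload: string; // opaque to the service; could just as well be ciphertext
}

class OrderingService {
  private nextSequenceNumber = 1;
  private subscribers: Array<(op: SequencedOp) => void> = [];

  subscribe(subscriber: (op: SequencedOp) => void): void {
    this.subscribers.push(subscriber);
  }

  // Receive, stamp, and relay. The payload is never inspected.
  submit(clientId: string, payload: string): void {
    const op: SequencedOp = {
      sequenceNumber: this.nextSequenceNumber++,
      timestamp: Date.now(),
      clientId,
      payload,
    };
    for (const subscriber of this.subscribers) {
      subscriber(op);
    }
  }
}

// Only the clients know how to read a payload. JSON is used for brevity.
interface TextInsert {
  position: number;
  text: string;
}

function applyInsert(documentText: string, op: SequencedOp): string {
  const insert = JSON.parse(op.payload) as TextInsert;
  return (
    documentText.slice(0, insert.position) +
    insert.text +
    documentText.slice(insert.position)
  );
}

const service = new OrderingService();
const received: SequencedOp[] = [];
let phoneText = "";
let laptopText = "";

service.subscribe((op) => {
  received.push(op);
  phoneText = applyInsert(phoneText, op);
});
service.subscribe((op) => {
  laptopText = applyInsert(laptopText, op);
});

service.submit("phone", JSON.stringify({ position: 0, text: "Fluid" }));
service.submit("laptop", JSON.stringify({ position: 0, text: "Hello " }));

console.log(received.map((op) => op.sequenceNumber));
console.log(phoneText + " | " + laptopText);
```

The service never calls `JSON.parse`; only `applyInsert`, which runs on the clients, does. That separation is what would let the payload be encrypted end to end without the service needing a key.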
Understanding Distributed Data Structures
At the heart of the Fluid container, alongside your code, are its Distributed Data Structures (DDSes). DDSes look like local objects to your code, but they’re not true local objects. Changes can be made over the Fluid service, with ops carrying change events that must be parsed and merged into the appropriate DDSes.
Applications can have multiple DDSes hosting different types of data structure. Different types have different merge strategies, handling how conflicts between different ops are managed.
Many are termed “optimistic,” in that local changes are applied before an op is reflected back from a server. For example, in a shared document, you see your changes as you make them, whereas remote ops are applied as they arrive.
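A minimal sketch of an optimistic, last-writer-wins map shows the idea. It uses a common technique, counting in-flight local writes per key so that an older remote value doesn’t overwrite a newer local one. That is an assumption for illustration and not necessarily how Fluid’s own map is implemented; `OptimisticMap` and `MapOp` are hypothetical names.

```typescript
// Illustrative optimistic map with last-writer-wins merging; not Fluid's code.

interface MapOp {
  clientId: string;
  key: string;
  value: string;
}

class OptimisticMap {
  private data: Record<string, string> = {};
  private pending: Record<string, number> = {};
  private clientId: string;
  private send: (op: MapOp) => void;

  constructor(clientId: string, send: (op: MapOp) => void) {
    this.clientId = clientId;
    this.send = send;
  }

  // Local writes are applied at once, then sent to the service.
  set(key: string, value: string): void {
    this.data[key] = value;
    this.pending[key] = (this.pending[key] || 0) + 1;
    this.send({ clientId: this.clientId, key, value });
  }

  get(key: string): string | undefined {
    return this.data[key];
  }

  // Called for every op in service order, including this client's own ops.
  process(op: MapOp): void {
    const inFlight = this.pending[op.key] || 0;
    if (op.clientId === this.clientId) {
      // Our own op coming back: it was already applied locally.
      this.pending[op.key] = inFlight - 1;
      return;
    }
    if (inFlight > 0) {
      // A local write to this key will be ordered after this remote op,
      // so the local value is the eventual winner. Skip the remote one.
      return;
    }
    this.data[op.key] = op.value;
  }
}

// A plain array stands in for the service's ordered stream of ops.
const opLog: MapOp[] = [];
const sendToLog = (op: MapOp): void => {
  opLog.push(op);
};

const aliceMap = new OptimisticMap("alice", sendToLog);
const bobMap = new OptimisticMap("bob", sendToLog);

aliceMap.set("colour", "red");
bobMap.set("colour", "blue");
const beforeDelivery = [aliceMap.get("colour"), bobMap.get("colour")];

for (const op of opLog) {
  aliceMap.process(op);
  bobMap.process(op);
}
const afterDelivery = [aliceMap.get("colour"), bobMap.get("colour")];

console.log(beforeDelivery, afterDelivery);
```

Before the ops are relayed, each client sees its own value (red for Alice, blue for Bob). Once both have processed the ordered log, both show blue, because Bob’s write was sequenced last.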
More complex DDSes use “consensus-based” structures, where ops are only applied as they are received from the Fluid service. That doesn’t mean that all the clients have agreed on the changes made in the op; rather, they are all applying the same ops in the order that they’re relayed, even if they originate locally.
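The contrast with the optimistic approach is easiest to see in code. In the hypothetical sketch below, claiming a task changes nothing locally; state only moves when the sequenced op comes back from the service. `ConsensusClaims` and `ClaimOp` are made-up names, and the "first claim in the ordered stream wins" rule is an assumption chosen for the example.

```typescript
// Illustrative consensus-style structure; not one of Fluid's own DDSes.

interface ClaimOp {
  clientId: string;
  task: string;
}

class ConsensusClaims {
  private owners: Record<string, string> = {};
  private clientId: string;
  private send: (op: ClaimOp) => void;

  constructor(clientId: string, send: (op: ClaimOp) => void) {
    this.clientId = clientId;
    this.send = send;
  }

  // A claim is only a request. Nothing changes locally yet.
  claim(task: string): void {
    this.send({ clientId: this.clientId, task });
  }

  ownerOf(task: string): string | undefined {
    return this.owners[task];
  }

  // State changes only here, as ops arrive in the service's order.
  process(op: ClaimOp): void {
    if (this.owners[op.task] === undefined) {
      this.owners[op.task] = op.clientId; // first claim in the stream wins
    }
  }
}

// A plain array stands in for the service's ordered stream of ops.
const claimLog: ClaimOp[] = [];
const sendClaim = (op: ClaimOp): void => {
  claimLog.push(op);
};

const aliceClaims = new ConsensusClaims("alice", sendClaim);
const bobClaims = new ConsensusClaims("bob", sendClaim);

aliceClaims.claim("review");
bobClaims.claim("review");
const ownerBeforeDelivery = aliceClaims.ownerOf("review");

for (const op of claimLog) {
  aliceClaims.process(op);
  bobClaims.process(op);
}

console.log(ownerBeforeDelivery);
console.log(aliceClaims.ownerOf("review"), bobClaims.ownerOf("review"));
```

Alice sees no owner at all until her own op has made the round trip. After that, both clients name her as the owner, because hers was the first claim in the relayed order. The cost of that certainty is the latency of a round trip before any change becomes visible.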
DDSes fall into three buckets: key/value data, sequential data, and specialised data structures. There’s a lot to drill down into in these structures, especially how they support different consistency models. The simpler options may seem limiting, but they’re actually a set of foundational data types that can be combined to build more complex structures.
The more specialised structures are perhaps less important, offering tools for sharing digital ink (perhaps more interesting with tablet devices and the new pen-enabled Surface Duo folding phone) and for working with shared text.
Building Fluid applications
Fluid’s Aqueduct library gives you a starting point for your own code. It builds on a DataObject class to host DDSes in a SharedDirectory. Once initialised, DDSes are always available to your code, simplifying accessing and using shared data. The library includes a DataObjectFactory to construct DataObjects, giving you tools to quickly and consistently build new DDSes.
It’s important to note that the models Fluid uses are best suited to small and midsized collaborations. Using the last-writer-wins approach with hundreds of clients connected over international connections could be a problem if lots of edits are being made close together in a document.
But with a relatively small number of users working across a large document that’s being saved to a common store, it’s going to be easy for the Fluid-hosted content to converge on a consistent consensus.
These are still early days for the Fluid Framework, and this initial release of code is not intended as shipping code. That’s to be expected, as it’s very much still under development, with some scenarios still to be delivered. But there’s enough here to get started building trial collaborative applications, looking at how your users might need to share documents and other content.
It’s certainly interesting to see that Fluid supports many more data structures than strings. It’s possible to imagine many more collaborative scenarios than traditional co-editing, from spreadsheets, to simple databases, to drawing, and to editing and annotating images.
What’s perhaps more important is that these scenarios go a lot further than traditional desktop knowledge workers. You could easily build these into mobile applications for frontline workers who traditionally haven’t been given access to collaboration tools, but still need to work together to solve problems.