September 3, 2024

#14 – Matthew Weidner: Architectures for Central Server Collaboration

Transcript

0:00:00 localfirst.fm #14 – Matthew Weidner: Architectures for Central Server Collaboration

0:00:00 And this also feeds into features that you might want to give to your

0:00:02 users, especially in productivity apps.

0:00:04 You want to have that change history where you can see what everyone was doing.

0:00:07 You also want to have undos.

0:00:09 Basically, what you can do for undo is when you create this action or this mutation

0:00:13 describing the high-level intent, you can also tag along with it a mutation saying,

0:00:17 here's how to undo this operation later.

0:00:19 And then you store that somewhere and then you just have a queue somewhere in

0:00:22 your app that's like the action queue.

0:00:24 You can go through that and undo things in a nice way.

0:00:27 Hopefully, users will be happier with this than if you just, you

0:00:31 know, revert states exactly, ignoring collaborators' updates.
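The undo idea described above can be sketched roughly as follows. This is a minimal illustration, not the API of any library mentioned in the episode: the `Todo`, `Mutation`, `Action`, and `ActionQueue` names are hypothetical, and the to-do state is an assumed example. Each action carries both the mutation to apply and a mutation that undoes it later, so undo runs against the current state rather than reverting to an old snapshot.

```typescript
// Hypothetical types for illustration only.
type Todo = { id: string; text: string; done: boolean };
type State = Map<string, Todo>;

// A mutation describes high-level intent.
type Mutation =
  | { kind: "setDone"; id: string; done: boolean }
  | { kind: "rename"; id: string; text: string };

// Each user action tags along a mutation that says how to undo it later.
type Action = { apply: Mutation; undo: Mutation };

function applyMutation(state: State, m: Mutation): void {
  const todo = state.get(m.id);
  if (!todo) return; // The item no longer exists (e.g. a collaborator deleted it).
  if (m.kind === "setDone") {
    todo.done = m.done;
  } else {
    todo.text = m.text;
  }
}

class ActionQueue {
  private state: State;
  private actions: Action[] = [];

  constructor(state: State) {
    this.state = state;
  }

  perform(action: Action): void {
    applyMutation(this.state, action.apply);
    this.actions.push(action);
  }

  // Undo the most recent local action by applying its undo mutation to the
  // current state, which may already include collaborators' updates.
  undoLast(): boolean {
    const action = this.actions.pop();
    if (!action) return false;
    applyMutation(this.state, action.undo);
    return true;
  }
}
```

With this shape, undoing a "check" after a collaborator has renamed the same item only unchecks it; the rename is kept.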

0:00:35 Welcome to the local-first FM podcast.

0:00:37 I'm your host, Johannes Schickling, and I'm a web developer, a

0:00:40 startup founder, and love the craft of software engineering.

0:00:43 For the past few years, I've been on a journey to build a modern, high quality

0:00:47 music app using web technologies.

0:00:49 And in doing so, I've been falling down the rabbit hole of local-first software.

0:00:54 This podcast is your invitation to join me on that journey.

0:00:57 In this episode, I'm speaking to Matthew Weidner, a computer science

0:01:01 PhD student at Carnegie Mellon University, focusing on distributed

0:01:05 systems and local-first software.

0:01:07 Matthew has recently published an extensive blog post about architectures

0:01:12 for central server collaboration, which we explore in depth in this conversation,

0:01:17 comparing different approaches, such as CRDTs and event sourcing.

0:01:21 Before getting started, also a big thank you to Rocicorp and

0:01:24 Expo for supporting this podcast.

0:01:27 And now my interview with Matthew.

0:01:29 Hey, Matthew.

0:01:30 Thank you so much for coming to the show.

0:01:32 How are you doing?

0:01:33 I'm good.

0:01:34 Yeah.

0:01:34 Thanks for inviting me.

0:01:36 Yeah.

0:01:36 Super excited to have you here.

0:01:38 I think our shared friend Geoffrey Litt introduced us, and he and Matt Wonlaw

0:01:44 and a few others have... when you were writing this blog post, the architectures

0:01:48 for central collaboration, all of my friends shared this blog post with me.

0:01:53 And it has since, like, served as a really, really reliable and

0:01:57 good foundation to just provide an orientation around, yeah, how do syncing

0:02:03 systems, et cetera, how do they work?

0:02:06 So this has been the, the initial touch point for me, but would you

0:02:10 mind briefly introducing yourself?

0:02:12 Background: PhD on collaborative apps

0:02:12 Sure.

0:02:13 Yeah.

0:02:13 So I'm Matthew.

0:02:14 I'm a researcher and developer.

0:02:16 I've been thinking about local-first software and, more generally, the problem

0:02:19 of how do we make collaborative software easier to program.

0:02:23 So that's been, I guess, five years of PhD work, and now working full time on a

0:02:28 collaborative app at a small company.

0:02:30 And yeah, the, the question for me has always been, how can we make

0:02:33 building a collaborative app in the style of Google Docs or Figma

0:02:36 as easy as making a smartphone app or a local only desktop app?

0:02:41 Amazing.

0:02:42 I'm curious, what led you, like when you say five years ago, you started working

0:02:46 on this, what led you to, to that point?

0:02:48 What motivated you to, to look into this?

0:02:51 Yeah, so it actually started a little earlier.

0:02:53 So six years ago, I was doing a master's degree at the University of Cambridge.

0:02:57 I had to pick a master's thesis project, and some of the PhD

0:03:00 students talked about what their lab group was doing, the TRVE DATA group,

0:03:03 where they were working on an end-to-end encrypted version of Google Docs.

0:03:06 The idea is that some professions, like lawyers or journalists, want the

0:03:09 collaboration of Google Docs, but they don't trust their data to a third party

0:03:13 where, you know, the employees can look at it or

0:03:14 it's on someone else's servers.

0:03:16 So they wanted this end-to-end encryption where you say only

0:03:18 you and your collaborators can read the unencrypted data.

0:03:22 So I thought this sounded like a really interesting project.

0:03:23 I just joined them for my master's thesis.

0:03:25 Turned out to be working with Alastair Beresford and Martin Kleppmann,

0:03:30 mostly on the cryptography side.

0:03:31 Then after that, I decided that actually the collaboration side

0:03:34 sounded more interesting, and I wanted to work on that for my PhD.

0:03:37 Very interesting.

0:03:38 What did the technology landscape at that point look like?

0:03:41 I mean, today there's like Automerge and quite a few other technologies

0:03:45 that already try to attempt this.

0:03:47 What did the technology landscape back then look like?

0:03:50 So this was before the local-first essay.

0:03:52 I think I actually saw a draft of the local-first essay that

0:03:56 year, now as a master's student.

0:03:57 Automerge, I believe, had started, Yjs had started, but I hadn't

0:04:01 heard of people using it yet.

0:04:03 But yes, people were just getting started using this idea of

0:04:07 collaborative data structures for the web, not necessarily with central

0:04:10 servers. These CRDT libraries were just getting started, and I don't

0:04:15 know if the local-first world had really even started yet at that point.

0:04:19 Right.

0:04:19 Yeah.

0:04:20 I think there are so many people who thought about similar problems

0:04:23 over like decades before then.

0:04:25 There was like CouchDB and PouchDB and like a lot of great minds

0:04:29 already thought about this, but I feel like the real momentum

0:04:32 started with the local-first essay.

0:04:35 So I'm curious, take me through a little bit of like the, the

0:04:37 five years working on that.

0:04:40 What were some of the milestones?

0:04:42 How did you go about starting this in the first place?

0:04:45 Sure.

0:04:45 So the main thing is I was coming at it from a more academic perspective, like

0:04:49 I really have a theory, math background.

0:04:51 So I was looking at the theory of CRDTs, these conflict-free

0:04:54 replicated data types,

0:04:56 where the idea is that it's a data structure that's

0:04:59 copied on multiple devices.

0:05:01 You put your data in it, like your app's data, and then one user can change their

0:05:04 copy of the data whenever they want.

0:05:06 At some point later, you'll sync up in the background and come to

0:05:09 a convergent copy where everyone's looking at the same document again.

0:05:12 This is really designed for the sort of peer-to-peer model where you don't

0:05:15 necessarily have a central authority; it's just everyone updating their own data.

0:05:19 And also this local-first spirit, where you always update the local copy of

0:05:21 your data first, and then you talk to everyone else and say, here's my changes.

0:05:25 So I spent the first year really just reading the papers in that field.

0:05:29 So there's a classic paper by Marc Shapiro et al. from 2011, a lot of papers by

0:05:34 Carlos Baquero and his collaborators. Yeah, just trying to learn what are these

0:05:38 data structures, what can we do with them.

0:05:41 Got it.

0:05:41 And so after that, you started your own implementations of CRDTs.

0:05:46 And was there any sort of reference app that you oriented this around?

0:05:50 Not really.

0:05:51 So there's actually, there's a reference CRDT.

0:05:53 So we started with this paper, which is very theoretical about this way

0:05:56 that you could maybe combine two CRDTs.

0:05:59 So the example we use, which is a bit silly, is if you have

0:06:02 a number that you can add things to, like maybe a bank account balance

0:06:06 you can add to, you can also multiply it if you're applying the interest.

0:06:09 How do you combine these two operations in a single CRDT that can

0:06:12 be updated with either add or multiply?

0:06:15 So then my advisor had this idea, let's implement this in a library.

0:06:18 And that already set some sort of unique design principles,

0:06:23 which is that we're going to assume you're making your own CRDTs.

0:06:26 It's not just a collection of CRDTs we give to you, like map, list, et cetera;

0:06:30 it's actually going to be whatever, and then some way to combine them together.

0:06:34 So that was really the starting point, is that we want to make a place where you

0:06:38 can make your own CRDTs and compose them.

0:06:40 I don't think we really had a specific application in mind at the beginning.

0:06:44 Was that technology ever released or open source or talked about in some way?

0:06:50 Collabs

0:06:50 So we did make an open-source library about it.

0:06:52 It's called Collabs.

0:06:53 So it's written in TypeScript.

0:06:55 We have a documentation site.

0:06:56 I think it's collabs.readthedocs.io.

0:06:59 It's definitely still an academic project.

0:07:01 So it's really about, here are these data structures that you can play

0:07:04 with and you can make your own things.

0:07:06 We do have some basic demo apps, like your basic, you know, text editor.

0:07:11 There's a to-do list sort of thing somewhere.

0:07:13 And then there is an arXiv paper about it that you can read, which goes into

0:07:16 more detail about the system design and why we did things the way we did.

0:07:20 Got it.

0:07:20 And so it sounds like you've really gone super deep on this, mostly

0:07:25 oriented from the CRDT side of things.

0:07:28 But, as you read the papers, as you were working on this, you also got

0:07:33 a better understanding of the larger space and the other approaches.

0:07:37 And I think you got more curious about the other approaches and this

0:07:40 is what you've laid out so clearly and brilliantly in this blog post that will

0:07:45 be linked in the, in the show notes.

0:07:47 And I highly recommend anyone who's listening to read it in depth, if

0:07:51 you're curious about those topics.

0:07:53 So the blog post is called Architectures for Central Server Collaboration, and

0:07:58 it provides a really nice way to think about this, sort of like a

0:08:03 hierarchical structure of: what are the design decisions?

0:08:07 What are the trade-offs?

0:08:08 What are the concerns about the different approaches?

0:08:11 And so I'd love to just go through that step by step.

0:08:17 Architectures for Central Server Collaboration

0:08:17 Sure.

0:08:18 Let's see.

0:08:19 Yeah.

0:08:19 So the idea of this blog post is we're thinking about

0:08:24 real-time collaborative apps.

0:08:25 So these are apps like Google Docs, Figma, Notion, that sort of thing.

0:08:29 And sort of the distinguishing feature of these apps compared to

0:08:32 more traditional web apps is that, you know, when you make a change, it

0:08:36 updates your local copy immediately.

0:08:38 It's not just click a button, go back to the server, get a

0:08:41 new web page and show it to you.

0:08:43 It's click a button and something updates on your own screen

0:08:45 immediately and eventually it'll tell the server what you did.

0:08:48 So this blog post was trying to think about, in general, with these real

0:08:51 time collaborative apps, like, what are we doing in a semantic sense?

0:08:54 Like, what does it mean to be real time collaborative?

0:08:57 And then, what sort of, you know, the high level of how you can implement

0:09:00 that in the most flexible way possible.

0:09:03 And so you've derived a couple of like really nice ways to, to think about

0:09:08 that, like in terms of dimensions and later on you, you can nicely

0:09:12 summarize it in a nice overview table.

0:09:15 Would you mind motivating some of the dimensions that you come up with here?

0:09:20 Sure.

0:09:21 Let's see.

0:09:21 So I guess just for context, my own background is, as I said, thinking

0:09:24 about it from a CRDT perspective.

0:09:26 This is very much the perspective if you have some data structures,

0:09:30 which are usually pretty low level, like maps and lists, and you have

0:09:33 some prescribed operations that you can perform on them, and then it'll

0:09:37 sync it for you under the hood.

0:09:39 And then also in the CRDT model, it's usually not really assuming a central

0:09:43 server, where the central server is doing basically the same thing as the clients.

0:09:47 So what the dimensions are thinking about is, okay, what can we do that's

0:09:51 different from just the CRDT model?

0:09:53 And yeah, there's really three dimensions.

0:09:56 I guess maybe the most interesting one is how you describe

0:10:00 operations on the collaborative state.

0:10:03 So you have sort of the, the database or key value store model, which is,

0:10:06 you have these low level state changes.

0:10:09 Like when I check a box in a to-do list, that's creating a row in a database

0:10:14 that says, you know, to do list checked, true, that sort of thing.

0:10:18 And then there's also this opposite model, which is sort of the more event sourcing

0:10:21 approach where you have these high level operations, sometimes called mutations.

0:10:26 And this is where, when you change the data, you're actually telling the server

0:10:29 exactly what the user's intent was.

0:10:31 You say, the user wants to check this box and make it true, and

0:10:34 then you broadcast that high level intent back to the other users.

0:10:38 And tell them what to do and how to update their state.

0:10:41 And I think this is also like this distinction between the

0:10:45 intent of a mutation and the, the change more directly.

0:10:50 I think this can be, a little bit of a subtle difference for

0:10:55 people who haven't built something with either approaches yet.

0:10:59 But I think, to draw an analogy from the web world, when you're working

0:11:05 with something like Redux, this is where I'm not sure whether you ever

0:11:09 built some, some front end apps with Redux, but this is where you have, for

0:11:12 example, if I remember correctly, the, the concept of an, of an action, which

0:11:17 is basically the idea of an event where you declaratively say like, okay, there

0:11:22 is an action or there is an event for, someone wants to complete this to do.

0:11:29 Then further down the road, there's like a reducer, which then in, for example,

0:11:34 maintains a list of to-dos and maybe kicks it out or maybe, overrides a property

0:11:41 in the to-dos array and says something is done as opposed to the other approach

0:11:46 where you directly mutate the, the state.

0:11:50 Which is, for example, in the web world, where you're using something like MobX, etc.

0:11:55 And so now we're talking here about the equivalent for distributed state,

0:12:00 where CRDTs, I think, give us more... the analogy might be a stretch,

0:12:05 but give us more of an equivalent of something like MobX, where

0:12:09 you mutate the state more directly.

0:12:11 And the CRDT underpinnings nicely make that principled, constrain

0:12:16 you in the right way, and then also distribute the state.

0:12:20 Did I summarize this in the right way?

0:12:23 Yes, good description.

0:12:25 Maybe another way to think about it that's more illustrative

0:12:28 than the to-do list is to think about the video game example.

0:12:31 So for example, in a video game, if you press an arrow key on your keyboard,

0:12:35 the high-level intent is: I want my character to move forward.

0:12:40 And then your game server will interpret that intent.

0:12:42 It'll try to move your character forward, but if there's a wall

0:12:44 in the way, it'll stop you.

0:12:45 And if you step on a pressure plate, it'll do something,

0:12:48 and then ultimately compute the actual state changes, which are

0:12:51 the low-level things like: what coordinates is my player at now?

0:12:55 What is the state of the world in terms of, you know,

0:12:57 doors that are open or closed?

0:12:58 And it'll send those low level state changes back to clients.

0:13:02 So that's another example of this distinction between high

0:13:04 level versus low level intent.
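The video game example can be sketched as below: the client sends a high-level intent, and the server interprets it against its own state and produces low-level state changes. This is an assumed toy model for illustration; the `World`, `Intent`, `StateChange`, and `handleIntent` names and the grid layout are hypothetical and not taken from any real game engine.

```typescript
// Hypothetical world model for illustration only.
type Position = { x: number; y: number };
type World = {
  player: Position;
  walls: Set<string>; // "x,y" cells that block movement
  pressurePlate: Position; // stepping here opens the door
  doorOpen: boolean;
};

// High-level intent sent from the client to the server.
type Intent = { kind: "move"; dx: number; dy: number };

// Low-level state changes sent from the server back to clients.
type StateChange =
  | { key: "player"; value: Position }
  | { key: "doorOpen"; value: boolean };

// The server interprets the intent against its authoritative state and
// returns the resulting low-level changes (possibly none).
function handleIntent(world: World, intent: Intent): StateChange[] {
  const next = { x: world.player.x + intent.dx, y: world.player.y + intent.dy };
  if (world.walls.has(`${next.x},${next.y}`)) {
    return []; // A wall is in the way: the character does not move.
  }

  world.player = next;
  const changes: StateChange[] = [{ key: "player", value: next }];

  const plate = world.pressurePlate;
  if (!world.doorOpen && next.x === plate.x && next.y === plate.y) {
    world.doorOpen = true;
    changes.push({ key: "doorOpen", value: true });
  }
  return changes;
}
```

The client never sends coordinates; it only says "move", and the server decides where the player actually ends up.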

0:13:06 Right, and I think this is now also a really important distinction because in

0:13:11 the Redux or MobX example, it's, like, all of that is happening on the local device.

0:13:18 There's no cheating in that regard, but when you're talking about games,

0:13:22 there can actually be cheating.

0:13:23 And how do you prevent that, particularly in a multiplayer context?

0:13:27 And this is where what you do on the client and what you do on the

0:13:32 server maybe need to be different things, where the server acts more as an authority

0:13:38 and the client rather provides instructions, as opposed to providing

0:13:42 the authoritative source of truth for the actual state of a world.

0:13:47 And so this is where the intent is not equal to the reality

0:13:53 that is coming out of it.

0:13:55 And I think this is nicely illustrated in your article through this game example,

0:14:01 where you can basically send to the server, like, Hey, I want to move forward.

0:14:05 The server knows where you were before, and the server tells you

0:14:09 afterwards, like, now you're here.

0:14:11 The client locally can probably, if everything is in an okay state, has

0:14:16 probably already arrived at the same conclusion, but, at least this way the

0:14:21 client can't override to say the player position is somewhere in an illegal state.

0:14:27 Server-side rebasing

0:14:27 Maybe this sort of transitions into the next point or another

0:14:30 dimension in the article.

0:14:32 Which is, what does the server actually do when it receives

0:14:35 an operation, in particular an operation that's out of date?

0:14:38 So the classic example is if you have a like counter, like a post has

0:14:42 some number of likes on it, if it has six likes and I send a command

0:14:45 to the server that says, I like it, change the number of likes to seven.

0:14:48 But what if someone else also liked the post in the meantime, and their

0:14:52 like made it to the server first?

0:14:54 So now the like count's already seven, I don't want to set it to seven

0:14:56 again, I want to increase it to eight.

0:14:58 And there's a, yeah, so basically there's a few philosophies in how the

0:15:01 server should process this operation so that it still makes sense.

0:15:04 I mean, technically it's legal to keep the original operation as

0:15:08 just set the count to 7, but that's not really what the users expect.

0:15:11 So the one philosophy, sort of the CRDT way, is to say, I'm going to phrase

0:15:15 my operations in such a way that the server will know what I want it to

0:15:19 do, and it'll do the correct thing.

0:15:21 So for a like counter, the classic way is you say, increase the like count by one.

0:15:26 The server can get that, and even if the count has gone up since what you

0:15:29 originally thought it was, it's still going to add one and do the proper thing.

0:15:32 So you're going to end up with eight likes instead of seven.

0:15:34 And sort of the other spirit is the operational transformation spirit.

0:15:38 So this is an older technique for collaborative apps that's used by

0:15:41 Google Docs and was developed in the 90s for the Jupiter collaboration system.

0:15:45 And here the spirit is, the server is going to look at your operation, it's

0:15:48 going to look at all the intervening operations that you didn't know about but

0:15:52 the server has received already, and it's going to use those to sort of compute

0:15:56 what your new intent is supposed to be.

0:15:58 So in this example, you would tell the server, change the like count

0:16:01 to seven, but the server would see that there was an intervening "change

0:16:04 the like count" operation already.

0:16:06 It's going to rewrite your operation as change the like count to eight, and

0:16:09 actually apply that to its state and send that operation to the other users.
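The two philosophies for the like counter can be contrasted in a small sketch. This is a toy model under stated assumptions, not how Google Docs or any CRDT library is implemented: the `LikeServer` class, the `baseVersion` field, and the rewrite rule (shift a stale "set" by the deltas the client had not seen) are all hypothetical, and real operational transformation is considerably more involved.

```typescript
// CRDT-style: the operation is phrased so it stays correct even when stale.
type Increment = { kind: "increment"; by: number };

// OT-style: the client sends an absolute value plus the server version it saw.
type SetCount = { kind: "set"; value: number; baseVersion: number };

class LikeServer {
  count: number;
  // Log of applied changes as deltas; the server version is the log length.
  private deltas: number[] = [];

  constructor(initialCount: number) {
    this.count = initialCount;
  }

  get version(): number {
    return this.deltas.length;
  }

  // CRDT spirit: no rewriting needed, just apply the increment as sent.
  applyIncrement(op: Increment): void {
    this.count += op.by;
    this.deltas.push(op.by);
  }

  // OT spirit: rewrite a stale "set" on top of the intervening operations
  // the client did not know about, then apply and return the rewritten op
  // so it can be broadcast to other users.
  applySet(op: SetCount): SetCount {
    const missed = this.deltas
      .slice(op.baseVersion)
      .reduce((sum, delta) => sum + delta, 0);
    const rewritten: SetCount = {
      kind: "set",
      value: op.value + missed,
      baseVersion: this.version,
    };
    this.deltas.push(rewritten.value - this.count);
    this.count = rewritten.value;
    return rewritten;
  }
}
```

In both cases two users who each saw six likes and each liked the post end up with a count of eight; the difference is whether the client phrases the operation safely or the server rewrites it.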

0:16:13 Got it.

0:16:14 So this is basically about the convergence aspect. And I

0:16:18 suppose where this code is running, this can equally work on the

0:16:22 client as well as on the server.

0:16:25 So this is sort of orthogonal to the, the game example case that we talked

0:16:30 about, which is more about the authority.

0:16:33 Yeah.

0:16:34 Yeah.

0:16:34 So this isn't about how the server

0:16:36 interprets operations from, like, a correctness or permissions perspective.

0:16:40 It's just how does the server handle operations that are sort of

0:16:43 stale, in the sense that the client originally applied them to one state,

0:16:46 but by the time they arrived at the server, the state had updated because

0:16:49 other people were doing things.

0:16:50 Now the server has to figure out what to do.

0:16:52 Yes, this is the server side rebasing.

0:16:55 This is where the server has to rebase your operation, or

0:16:58 the incoming operations, on top of whatever its new state is.

0:17:02 And sort of the analogy is to git rebasing, where you might try to apply

0:17:05 a commit on top of some new commits that weren't there when you first tried it.

0:17:10 Got it.

0:17:11 Okay, so that is one dimension that you've nicely dissected

0:17:15 here in this, in this blog post.

0:17:18 Optimistic Local Updates

0:17:19 So the next one is the optimistic local updates on the client.

0:17:23 So now if we assume there's a central server, everyone's making

0:17:26 these updates, they're sending these operations to the server, the server

0:17:29 knows what the state's supposed to be.

0:17:31 And what you could do is just the traditional web app model.

0:17:34 If I submit an operation to the server, it processes it, it sends me

0:17:38 back the result, and now I get to see it.

0:17:40 So if you think of, you know, a traditional HTML form, you submit your

0:17:43 operation to the server, it gives you a new page back saying what it is.

0:17:46 But with modern apps, we want to do better than that.

0:17:48 We want to say that when I perform an operation on the client, it's going

0:17:52 to update my own state immediately.

0:17:54 And that's an optimistic update because I'm sort of optimistically

0:17:57 assuming that the server is actually going to receive my update.

0:18:00 It's going to process it in the way I expected.

0:18:02 No one else is going to interfere.

0:18:04 This is just a nice property in terms of making the app feel more responsive.

0:18:07 You want to see your key presses immediately.

0:18:08 You want to see that button get checked immediately.

0:18:10 So the question is then, how do we actually do that?

0:18:13 Or, I guess the first question is even, what is the correct answer?

0:18:17 What does it mean to optimistically update my state?

0:18:20 And I guess, yeah, sort of the conclusion I came to that, you know,

0:18:23 people have come to in computer games as well, is that you want to take

0:18:27 the latest state you've received from the server, plus your own optimistic

0:18:32 local operations on top of that.

0:18:34 And that's always what the correct state is.

0:18:36 And even as you receive or perform new operations, you're

0:18:38 just maintaining that state.
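The rule stated above (the displayed state is the latest server state plus your own pending local operations on top) can be sketched like this. It is a minimal illustration with hypothetical names (`OptimisticClient`, `mutate`, `receive`, `view`); it assumes the server tells the client which of its mutations have been processed, which is one possible protocol and not a description of any particular sync engine.

```typescript
// Example mutation type for the like counter (hypothetical).
type Like = { id: number; by: number };

class OptimisticClient<S, M extends { id: number }> {
  private serverState: S;
  private pending: M[] = [];
  private reducer: (state: S, mutation: M) => S;

  constructor(initial: S, reducer: (state: S, mutation: M) => S) {
    this.serverState = initial;
    this.reducer = reducer;
  }

  // A local operation takes effect in view() immediately and stays pending
  // until the server confirms it.
  mutate(mutation: M): void {
    this.pending.push(mutation);
  }

  // The server sends its latest state plus the ids of our mutations that
  // are already reflected in that state.
  receive(serverState: S, confirmedIds: number[]): void {
    this.serverState = serverState;
    this.pending = this.pending.filter((m) => !confirmedIds.includes(m.id));
  }

  // What the UI should show: server state with pending mutations replayed on top.
  view(): S {
    return this.pending.reduce<S>(
      (state, mutation) => this.reducer(state, mutation),
      this.serverState,
    );
  }
}
```

For example, starting from six likes, a local like shows seven right away; if someone else's like arrives first, the view becomes eight, and it stays eight once the server confirms the local like.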

0:18:40 Like from your first dimension, which is about server side rebasing, now it's

0:18:45 a lot of the same ideas, but applied on the client where you need to make

0:18:50 the same trade off decisions again, you might come up with different conclusions

0:18:56 based on the server and based on the client, depending on your use cases.

0:18:59 So that, that is the second dimension.

0:19:03 And then you talk about the form of operations,

0:19:07 so how state is changing based on mutations or based on state changes.

0:19:15 Can you go a little bit more into, into detail here?

0:19:18 Form of operations

0:19:18 Sure.

0:19:18 Yes.

0:19:19 This is what we were talking about at the beginning, where when you check

0:19:22 a box in a to-do list, you want to say, am I updating a row in a database that

0:19:25 doesn't know anything about to-do lists, or am I sending a high-level mutation

0:19:28 that says, like, this user wants to check the to-do list and, you know,

0:19:32 do that action or maybe do something else if that's not valid anymore.

0:19:36 So here we get to choose which form of operations we want.

0:19:38 We want to send these high or low level from the client to the server.

0:19:42 Then once the server updates its state, does it want to send high or

0:19:45 low level changes back to the clients?

0:19:48 Yeah, so the video game example is an interesting one where you actually

0:19:50 make different choices usually.

0:19:52 So usually you'll send the high level operations from clients to the server.

0:19:55 You say, I want to move forward, I want to shoot my crossbow.

0:19:58 And then on the way back from the server to the client, usually it

0:20:01 won't send those actual actions.

0:20:02 It'll just send the results, which are changes to some basic key value store.

0:20:06 But you can also make different choices, like you can say, you

0:20:10 know, Git is an example where it's sort of high level mutations.

0:20:14 You're saying, like, I want to, you know, change this text paragraph in

0:20:17 a specific file, and Git will send those exact operations to every client.

0:20:21 It's not going to interpret them at all on the server and change

0:20:24 them into a low level change.

0:20:26 Whereas if you use something like the Firebase database, that's all low level.

0:20:30 You send low level changes to Google servers.

0:20:32 Where you say, I want to, you know, set this key to this value or I want

0:20:35 to delete this object in the database.

0:20:38 And it's going to send that change back to clients without having any idea what

0:20:41 the keys and values actually represent.

0:20:43 That makes sense.

0:20:44 And so I think this is also nicely drawing a boundary between the more declarative

0:20:51 approaches that you have in mutations that you can reason more clearly about,

0:20:56 like in the context of your domain.

0:20:58 But it also only makes sense in the context of your domain.

0:21:02 Whereas with state changes, this is the appeal of CRDTs:

0:21:06 you just mutate a document, and the underlying mechanics make

0:21:12 sure that the state changes are behaving in a useful way, since, I suppose,

0:21:17 listening to the state changes yourself in your app, that's no fun.

0:21:22 So you really want a system like CRDTs to make sense of that.

0:21:26 So now with those three dimensions, and I'll go through them again, the

0:21:30 server side rebasing, the optimistic updates and the form of operations

0:21:34 like declarative versus state based, now you've combined all of that in a

0:21:39 really nice, classification table where we get a whole bunch of like matrix

0:21:45 cells here with different technologies.

0:21:48 So again, I highly recommend actually reading this and looking at the

0:21:52 beautiful table for yourself, but in the different cells, you've

0:21:56 also filled in a couple of existing technologies and see where they slot in.

0:22:01 So would you mind going through the different technologies and maybe

0:22:07 Classification table

0:22:07 Sure.

0:22:08 So I guess first I can talk about the one cell just near the bottom, right

0:22:11 in the table, if you're looking at it.

0:22:12 Which is the CRDT × CRDT cell.

0:22:18 So this is basically the place where I spent my most time reading

0:22:21 about CRDTs, working on this academic open source library.

0:22:24 And that's where the operations that users send are really these low level state

0:22:29 changes to some sort of magical replicated database, where you update the database,

0:22:33 like normally on your local device, and it promises to do this synchronization in

0:22:37 the background and make sure that everyone converges to the same state eventually

0:22:40 without really caring about what specifically your data or operations are.

0:22:44 So, some prominent examples.

0:22:45 Firebase Realtime Database I think of as an example, also

0:22:49 the CRDT-ish libraries, like Yjs.

0:22:52 Also, yeah, Triplit, InstantDB, those are all sort of in this quadrant

0:22:56 or in this cell, saying that we're going to replicate low-level changes

0:23:00 for you, just like as they are.

0:23:02 Another cell on this table, which is sort of near the bottom left, we

0:23:05 mentioned in the computer game example.

0:23:07 In a computer game, you're going to send these high level actions to the

0:23:10 server, which is going to figure out what to do with them, and then communicate

0:23:14 the state changes back to clients.

0:23:16 That's another interesting cell, both because it's sort of old, like, you know,

0:23:20 this starts with the Half-Life game engine in the 1990s, so people have been

0:23:23 using this technique forever, just not in web apps; it's in computer games.

0:23:28 But more recently, Replicache implements this model as a data sync

0:23:32 layer for web applications, which I know a number of companies are using.

0:23:36 And I found that really inspirational, reading about how Replicache works.

0:23:39 I'm glad to have learned about it.

0:23:41 Right.

0:23:41 And I love how you compare those technologies.

0:23:44 I love both technologies.

0:23:45 With the Half-Life game engine, I spent way too much

0:23:49 time playing various Half-Life game engine games, where it's very, very

0:23:54 intuitive that if you press the W key, that moves you forward.

0:23:59 That's like communicating the intent.

0:24:01 To the server, you don't tell the server like, Oh, I'm at these coordinates.

0:24:04 You just give it like a history of like which keys you pressed

0:24:08 and therefore like how you moved.

0:24:10 And it does some validation of whether all of that is okay.

0:24:13 And it sends you back the location.

0:24:16 And it's the same with Replicache, where you send it a few mutations, and on the

0:24:20 Replicache server, it interprets all of that and sends back to you the state

0:24:25 using the server side knowledge, which might be different than the client side

0:24:29 implementation, so it's the authority.

0:24:31 So that is very clear and very nicely laid out here, where you send the intent,

0:24:36 you send the declarative mutations, and the server sends you back some state

0:24:40 changes, as opposed to what you mentioned before, with CRDT times CRDT,

0:24:46 where both on the client and on the server, you run the same CRDT convergence.

0:24:52 So those two cells are very clear.

0:24:55 Yes, exactly.

0:24:56 And then, yeah, so I guess the remaining cells of the table, they

0:25:00 mostly, they either use state changes in both directions or they use high

0:25:04 level mutations in both directions.

0:25:06 So, let's see.

0:25:07 Two interesting ones.

0:25:08 Automerge and ShareDB.

0:25:10 They're both doing a similar idea to the CRDT libraries, like Yjs, where they're

0:25:15 sending these low level state changes around and making sure everyone converges

0:25:18 to the same state, but they have a different way of doing this internally.

0:25:22 So with Automerge, what you're actually doing is you're performing these state-

0:25:26 based merges. The Automerge library is basically a JSON CRDT, but the way

0:25:30 it works is more like an event sourcing model, where you have this

0:25:35 total order of CRDT style operations.

0:25:38 All clients are going to make sure that they eventually

0:25:40 converge to the same total order.

0:25:42 So everyone will agree what operation 1, operation 2, operation 3, etc.

0:25:46 The state is the result of applying all of these operations in that fixed order.

0:25:50 And if, you know, people do operations concurrently on their different devices

0:25:54 because the network's not working, then we'll just sort those operations

0:25:57 into some order later, make sure everyone agrees on the same order,

0:26:00 and that's giving you your state.
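
The "agree on a total order, then replay" model can be sketched in a few lines of TypeScript. The `Op` shape and the (counter, clientId) ordering rule are assumptions chosen for illustration; they are not Automerge's actual data model or API.

```typescript
// Hypothetical operation format.
interface Op {
  counter: number;  // logical clock value when the op was created
  clientId: string; // tie-breaker so that every client sorts the same way
  key: string;
  value: string;
}

// Deterministic total order: by counter first, then by client id.
function compareOps(a: Op, b: Op): number {
  if (a.counter !== b.counter) return a.counter - b.counter;
  return a.clientId < b.clientId ? -1 : a.clientId > b.clientId ? 1 : 0;
}

// The state is the result of applying all ops in that fixed order.
function materialize(ops: Op[]): Map<string, string> {
  const state = new Map<string, string>();
  for (const op of [...ops].sort(compareOps)) {
    state.set(op.key, op.value);
  }
  return state;
}

const initial: Op = { counter: 1, clientId: "alice", key: "title", value: "Untitled" };
const fromAlice: Op = { counter: 2, clientId: "alice", key: "title", value: "Draft" };
const fromBob: Op = { counter: 2, clientId: "bob", key: "title", value: "Final" };

// Two clients receive the same ops in different arrival orders
// and still compute the same state.
const clientA = materialize([initial, fromAlice, fromBob]);
const clientB = materialize([fromBob, initial, fromAlice]);
console.log(clientA.get("title"), clientB.get("title"));
```

Because every replica sorts before applying, the individual operations do not need to commute; the cost is that a late-arriving operation forces a re-sort and replay.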

0:26:01 Interesting.

0:26:02 So Yjs and Automerge, which I think are the two most popular CRDT

0:26:07 implementations in the web ecosystem, actually do differ in this dimension

0:26:12 of how state changes are implemented,

0:26:15 with, again, Firebase as well as Yjs

0:26:17 following the CRDT approach more strictly and Automerge using server reconciliation.

0:26:23 Is there an example that comes to mind where this would differ in an example app

0:26:27 use case, and where you would choose Automerge or Yjs

0:26:32 intentionally because of this?

0:26:34 I think in terms of the external API,

0:26:37 or what you see as a user of these libraries, it doesn't really differ.

0:26:41 It's more just in terms of the implementation, I guess, in, in this

0:26:45 totally ordered model like Automerge uses, you don't have to worry as much

0:26:48 about getting the math exactly right.

0:26:50 Like, am I sure that these two operations actually do the same thing

0:26:53 if I apply them in different orders, which is this mathematical requirement

0:26:57 that you have to satisfy for CRDTs.
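
The order-independence requirement mentioned here can be checked mechanically for simple cases. The sketch below is an illustration only: `add`, `setTo`, and `commutes` are hypothetical helpers, and the check covers a single start state rather than proving the property in general.

```typescript
type Counter = number;
type CounterOp = (n: Counter) => Counter;

const add = (k: number): CounterOp => (n) => n + k;
const setTo = (k: number): CounterOp => (_n) => k;

// True if applying a then b gives the same result as b then a
// from this particular start state.
function commutes(a: CounterOp, b: CounterOp, start: Counter): boolean {
  return b(a(start)) === a(b(start));
}

// Two increments can be applied in either order.
const incrementsCommute = commutes(add(1), add(2), 0);
// A "set" and an increment cannot: the result depends on which ran last.
const setAndAddCommute = commutes(setTo(10), add(2), 0);
console.log(incrementsCommute, setAndAddCommute);
```

In a totally ordered model, the second pair is unproblematic because everyone applies the two operations in the same agreed order.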

0:26:59 So that makes it a bit easier to be sure of the correctness

0:27:04 of the implementation.

0:27:06 Whereas with the Yjs or CRDT style, if I'm just going to apply my operations

0:27:10 directly, in principle that can be a bit faster because you don't have to

0:27:14 worry about rewinding your total order of operations and then applying a new

0:27:18 thing and walking it forward again.

0:27:21 That said, usually if you're making a collaborative application with CRDTs,

0:27:24 you don't really need to process more than a handful of operations

0:27:28 every second, so it doesn't matter if it takes a little bit longer.

0:27:31 Got it.

0:27:32 Okay.

0:27:32 That, that makes sense.

0:27:33 So in the CRDT approach, wherever I am currently in my state, I can just apply

0:27:38 on top the existing or the new events.

0:27:41 And, with a server side reconciliation approach, this is where depending

0:27:45 on what the new events are, where they sit in terms of the timeline.

0:27:49 I might need to, uh, wind back, apply them, and that might take a little

0:27:54 bit longer, but possibly also makes the implementation a bit easier.

0:27:58 Yeah, I guess just one note.

0:27:59 So, you've been saying server side reconciliation.

0:28:01 Automerge does not actually require a server.

0:28:03 It's a completely decentralized model.

0:28:05 The name is just sort of by analogy to what you would do

0:28:07 if you would have a server.

0:28:09 You would put all the things in the order that the server receives them.

0:28:11 Automerge instead infers a sort of order in a decentralized way.

0:28:15 That makes sense.

0:28:16 So, we've now mostly talked about the state changes side of it.

0:28:21 And we talked about how, optimistically and locally,

0:28:26 the state changes are applied.

0:28:28 But we didn't talk too much about the mutations times mutations quadrant, which

0:28:32 also has a couple of, like, subsections.

0:28:38 Event sourcing

0:28:38 Yeah, so this, this mutations, mutations quadrant, this is sort of the event

0:28:42 sourcing idea where instead of sending around low level changes, we're going

0:28:45 to send around the actual user actions, both from users to the server and

0:28:49 from the server back to other users.

0:28:51 So an example would be like, if you do a find and replace operation, or maybe

0:28:56 you rename a variable in VS code, the operation that you're going to send

0:28:59 to the server actually says, you know, rename this variable from foo to bar.

0:29:03 As opposed to a bunch of low level edits where you go through and change

0:29:06 the actual characters, F O O to B A R, in every place they happen to exist.

0:29:10 So this quadrant is interesting because it gives you a lot more flexibility

0:29:15 in terms of what you can communicate: this really high-level intent, like

0:29:20 code refactors or actions in a computer game, and then the server can interpret

0:29:25 that intent in a reasonable way.

0:29:27 You know, applying permissions. Maybe you can see that someone else has also,

0:29:31 you know, added a new reference to that variable,

0:29:33 so it's going to rename that reference as well.

0:29:36 And you can do this a lot more flexibly, as opposed to if you just see the

0:29:38 low-level operations and sort of have to

0:29:42 guess what intent that corresponded to.
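
To make the rename example concrete, here is a hedged TypeScript sketch of a server applying a high-level mutation to its current copy of a document. The `RenameVariable` shape and `applyRename` are hypothetical, and the regular expression assumes `from` contains only identifier characters.

```typescript
// Hypothetical mutation shape for a high-level "rename variable" intent.
interface RenameVariable {
  type: "renameVariable";
  from: string;
  to: string;
}

// The server applies the intent to whatever the document looks like now.
function applyRename(lines: string[], m: RenameVariable): string[] {
  // \b keeps the rename from touching longer identifiers such as "foobar".
  const pattern = new RegExp(`\\b${m.from}\\b`, "g");
  return lines.map((line) => line.replace(pattern, m.to));
}

// What the renaming client saw when it issued the mutation.
const atRenameTime = ["let foo = 1;", "show(foo);"];

// A collaborator's new line reached the server first.
const onServer = [...atRenameTime, "let foobar = foo + 1;"];

const renamed = applyRename(onServer, { type: "renameVariable", from: "foo", to: "bar" });
console.log(renamed.join("\n"));
```

A list of character-level edits computed on the client would have missed the reference in the collaborator's line; the intent-level mutation picks it up because it is interpreted against the server's state.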

0:29:44 So there's a few systems along these lines.

0:29:47 So one of them, which I link here and which is not as well known, is called Actyx.

0:29:51 It's actually a company in Europe which does, like, IoT coordination in factories.

0:29:58 So if you have some, you know, robots moving around a factory floor, they're

0:30:01 talking to each other over the local network and they might say things like,

0:30:05 oh, someone needs to go pick up this box and move it from point A to point B.

0:30:09 One of the robots can say, "Okay, I'm going to go

0:30:11 pick up this box and move it."

0:30:13 And that way the other robots know not to move it themselves.

0:30:15 And these, these actions or messages, they just get put into a log that

0:30:19 all the devices in the factory see.

0:30:21 And that way they sort of know what's going on, what tasks are

0:30:24 outstanding, that sort of thing.

0:30:26 Right, and I think one very nice benefit of that as well, is that if there's

0:30:31 some real world stuff happening, and whether in a factory a robot has moved,

0:30:37 or you've now like manufactured a new part, or destroyed a certain thing.

0:30:43 Now you have like a real log of those events.

0:30:46 So in case something goes wrong or in case there's an audit, now you have

0:30:50 some hard facts that you can look at.

0:30:52 So it's not just useful for an app and a machine, but it's also useful for human

0:30:57 purposes to understand what has happened.

0:31:00 Exactly.

0:31:01 Yeah.

0:31:01 And this really feeds into the idea of business logic.

0:31:04 You know, in a lot of applications, we have this

0:31:07 business logic that we want to do in terms of, you know, what happens

0:31:10 when a user clicks this button.

0:31:12 And it can often be more complicated than you can express

0:31:15 with simple database changes.

0:31:17 And keeping these actions around lets you really look at what the

0:31:20 business logic was supposed to do and also have the server customize its response.

0:31:25 Like you can check permissions at a very fine grained level.

0:31:28 You can make decisions about, you know, bank balances going below

0:31:31 zero and that sort of thing.

0:31:32 Yeah, sort of tossing to some of Pat Helland's articles, if you've

0:31:35 seen, like, "Building on Quicksand" or "Immutability Changes Everything":

0:31:38 this idea of, you know, accountants don't use erasers, all those ideas.

0:31:43 Yeah, exactly.

0:31:44 And I think for web developers, this is also very intuitive, where if

0:31:48 you build a React app, for example, and you have some complex state

0:31:54 that you express in React's useState,

0:31:56 and now you try to somehow do the right thing based on how the state changes

0:32:02 using some React useEffect, for example.

0:32:05 There, you should use better mechanisms and better foundations

0:32:10 for that, for example, using XState for some state machines, et cetera.

0:32:15 This is where you

0:32:16 very explicitly and declaratively deal with the state changes, as opposed to

0:32:21 trying to somehow reinterpret how some nitty-gritty state

0:32:27 things have changed. Whereas if you just have a beautiful, simple

0:32:30 event that is easy to understand, like, "Okay,

0:32:33 that thing has changed,

0:32:34 the robot has entered this room,"

0:32:37 that's much easier to understand than interpreting the

0:32:40 coordinates of a certain thing.

0:32:43 And this also feeds into features that you might want to give to your

0:32:45 users, especially in productivity apps.

0:32:47 You want to have that change history where you can see what was everyone doing.

0:32:51 You also want to have undos.

0:32:52 Basically what you can do for undo is when you create this action or this mutation

0:32:56 describing the high-level intent, you can also tag along with it a mutation saying,

0:33:01 here's how to undo this operation later.

0:33:03 And then you store that somewhere and then you just have a queue somewhere in

0:33:06 your app that's like the action queue.

0:33:07 You can go through that and undo things in a nice way.

0:33:11 Hopefully, you know, the users will be happier with this than if you just, you know, revert states exactly, ignoring collaborators' updates.
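
The undo approach described here, where each action carries its own inverse mutation, might look roughly like the TypeScript sketch below. The `TodoMutation` and `Action` types and the `perform` and `undoLast` helpers are assumptions for illustration, not taken from any particular library.

```typescript
// Hypothetical mutation types for a to-do style app.
type TodoMutation =
  | { type: "addItem"; id: string; text: string }
  | { type: "removeItem"; id: string };

// An action pairs the intent with the mutation that undoes it later.
interface Action {
  mutation: TodoMutation;
  undo: TodoMutation;
}

type TodoState = Map<string, string>;

function apply(state: TodoState, m: TodoMutation): void {
  if (m.type === "addItem") state.set(m.id, m.text);
  else state.delete(m.id);
}

const todoState: TodoState = new Map();
const actionQueue: Action[] = [];

function perform(action: Action): void {
  apply(todoState, action.mutation);
  actionQueue.push(action);
}

function undoLast(): void {
  const last = actionQueue.pop();
  if (last) apply(todoState, last.undo);
}

perform({
  mutation: { type: "addItem", id: "a", text: "Buy milk" },
  undo: { type: "removeItem", id: "a" },
});
// A collaborator's item arrives in between; it is not in my action queue.
apply(todoState, { type: "addItem", id: "b", text: "Call Sam" });
undoLast();
console.log([...todoState.entries()]);
```

Undoing removes only my own item and leaves the collaborator's item in place, which is the difference from reverting to an earlier snapshot of the whole state.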

0:33:19 Text & list editing

0:33:19 Right.

0:33:20 So, in this quadrant of the event sourcing quadrant here, there's still

0:33:24 a couple of like sub cells, um, how the mutations are applied, namely the

0:33:31 serializable, CRDT ish, and OT ish.

0:33:34 Can you give a little bit of an intuition how they differ in the implementation and

0:33:39 when you would choose one or the other?

0:33:41 Yeah.

0:33:41 So the examples here mostly concern text editing, which is not a coincidence.

0:33:45 So in text editing, when you're doing any sort of collaborative text editing,

0:33:48 like in Google Docs, you have this problem that your operation might say,

0:33:52 you know, "I want to type the word 'world' after 'hello'."

0:33:58 So this message that you're going to send to the server might

0:34:01 say something like "insert world at index five", because, you know, "hello"

0:34:04 is five characters long. But someone else might also edit this word,

0:34:07 "hello",

0:34:09 before your change makes it to the server.

0:34:12 So maybe now "hello there world"

0:34:15 is what you want to happen.

0:34:16 But your edit is still trying to target index 5, so it's going to

0:34:19 go in sort of the wrong place.

0:34:21 You want it to shift over to accommodate edits that have been

0:34:24 before yours in the document.

0:34:26 And of course, this gets worse if you're, like, editing the bottom

0:34:28 of the document, someone else is editing the paragraph on top.

0:34:31 All of your array indices are going to get horribly messed up

0:34:34 by the time they reach the server.

0:34:35 Like, they're not going to be accurate anymore.

0:34:37 So the three choices here are basically different ways to patch up those

0:34:41 indices so that they make sense again.
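
The index problem from the "hello there world" example can be reproduced in a few lines of TypeScript. `insertAt` is a hypothetical helper written for this illustration.

```typescript
function insertAt(text: string, index: number, inserted: string): string {
  return text.slice(0, index) + inserted + text.slice(index);
}

// Both users start from "hello"; I want to append " world", so I target index 5.
const myEdit = { index: 5, inserted: " world" };

// A collaborator's " there" (also typed after "hello") reaches the server first.
const serverText = insertAt("hello", 5, " there");

// Applying my raw index now lands in the middle of the sentence.
const naive = insertAt(serverText, myEdit.index, myEdit.inserted);
console.log(naive);
```

The intended result was "hello there world"; the stale index produces a different string because it refers to a position in a document that no longer exists on the server.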

0:34:43 That makes sense.

0:34:44 And, I think this is a common theme for local-first software is that

0:34:50 there are a couple of like special buckets that deserve special treatment,

0:34:55 namely text editing and also lists.

0:34:58 And those latter two are, I think, also closely related.

0:35:03 So on that note, in the article you also went a bit more in depth

0:35:08 on possible approaches to tame lists in this distributed setting.

0:35:14 Would you mind sharing a little more context about that?

0:35:17 Sure.

0:35:18 Yeah, so lists, as you said, are hard, and they're hard specifically

0:35:22 because of this index problem, where your obvious choice for what operations

0:35:25 you're going to send over the network often doesn't make sense anymore by

0:35:28 the time they reach the server.

0:35:30 Um, and the solutions really fall into two camps.

0:35:33 There's the operational transformation camp, which is used by Google Docs,

0:35:37 which is where you're going to send, you know, index five, that sort of

0:35:41 thing, a raw number, and the server is going to look at this index.

0:35:44 It's going to look at all the intervening operations that arrived

0:35:48 that you didn't know about, but have already reached the server.

0:35:50 And it's going to sort of like walk through those one by one to try to

0:35:54 figure out what index you actually meant.

0:35:56 Because it's going to see, okay, if you inserted something at index five and

0:36:00 three other characters have been inserted before that, I'm going to change it from

0:36:04 five to eight, just adding five and three.
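
The "five becomes eight" rewrite can be sketched as an index transform in TypeScript. This is a deliberately minimal illustration that handles inserts only; `Insert`, `transformInsert`, and `rebase` are hypothetical names, and the tie-breaking rule for equal indices is an assumption. Real operational transformation systems also handle deletes and more.

```typescript
interface Insert { index: number; text: string; }

// Shift `op` to account for one insert that the server applied before it.
function transformInsert(op: Insert, applied: Insert): Insert {
  // Assumption: on a tie (same index) the incoming op shifts to the right.
  if (applied.index <= op.index) {
    return { index: op.index + applied.text.length, text: op.text };
  }
  return op;
}

// Walk through the intervening operations one by one.
function rebase(op: Insert, intervening: Insert[]): Insert {
  return intervening.reduce(transformInsert, op);
}

const mine: Insert = { index: 5, text: "x" };
const intervening: Insert[] = [
  { index: 0, text: "a" },  // three characters inserted before my position
  { index: 1, text: "b" },
  { index: 2, text: "c" },
  { index: 20, text: "z" }, // an insert after my position does not shift it
];

const rebased = rebase(mine, intervening);
console.log(rebased.index);
```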

0:36:06 Got it.

0:36:07 So a very common app use case for this is, let's imagine Notion where

0:36:11 on the left sidebar, you can have your, your favorite, pages pinned

0:36:17 and those you control the order.

0:36:19 So you can move them around or also on a Notion page, all the

0:36:23 blocks you can reorder yourself.

0:36:26 And a very naive approach would be, whenever you reordered something, you send

0:36:32 to the server a full copy of the entire document, and that contains the order.

0:36:37 But that is not very useful in the collaborative setting where

0:36:41 now the merge radius of the entire thing is the document.

0:36:44 And it doesn't really allow for collaboration on a per block level.

0:36:48 And another naive approach would be to send the block and say

0:36:54 like, oh, now I'm at position three.

0:36:57 But something else might've already moved, and in reality it's no

0:37:00 longer position three.

0:37:02 So this is what this is all about and, uh, the different approaches for this.

0:37:06 Figma has written also a really nice blog post about this, how they,

0:37:11 tamed this problem, where I think they call it fractional indexing.

0:37:14 And I think you connected the dots here.

0:37:17 Can you draw a line between the different approaches here, the CRDT-ish approach

0:37:27 Fractional indexing

0:37:27 Yeah.

0:37:27 So the, the OT ish approach, that's what I was describing with, you know, you send

0:37:32 index five to the server, but the server's going to rewrite it to index eight.

0:37:36 So this is really this idea that the server is going to

0:37:39 mutate your operation to try to make it still make sense.

0:37:43 Then the CRDT-ish approach, which is used by fractional indexing and Yjs and

0:37:46 those sort of things, is actually the clients, instead of sending, you know,

0:37:50 index 5 to the server, they're going to rewrite this message in a way so

0:37:54 that it still makes sense, even if it reaches the server a little bit late.

0:37:58 So, for example, you could have, in fractional indexing, you might label

0:38:02 your characters with these decimal numbers instead, where you say, like,

0:38:05 the characters are at 0.1, 0.2, 0.3, etc.

0:38:09 And then if you want to add a new character in between 0.4 and 0.5, you give it the label 0.45.

0:38:16 So this isn't really a list index, it's what they call a fractional index.

0:38:19 And the idea is that this will still go in between the characters at 0.4 and 0.5, even if some other changes happen elsewhere in the list.

0:38:27 Because those other changes don't actually change your fractional index.

0:38:29 You're keeping the characters at the same 0.4, 0.5, 0.6, etc.

0:38:34 Right.

0:38:34 And now the 0.45, this is what you use to derive the real

0:38:40 integer indexes from, by lexicographically ordering it.
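
A minimal TypeScript sketch of fractional indexing as described in this exchange. It uses plain numbers for readability, which is a simplifying assumption: repeated midpoints eventually run out of floating-point precision, so production implementations often use string keys compared lexicographically instead. `Char`, `between`, and `render` are hypothetical names.

```typescript
interface Char { pos: number; value: string; }

// Pick a position strictly between two neighbors (assumes left < right).
function between(left: number, right: number): number {
  return (left + right) / 2;
}

// The visible order, and therefore each integer index, is derived by sorting.
function render(chars: Char[]): string {
  return [...chars].sort((a, b) => a.pos - b.pos).map((c) => c.value).join("");
}

const chars: Char[] = [
  { pos: 0.4, value: "a" },
  { pos: 0.5, value: "c" },
];

// Insert "b" between the characters at 0.4 and 0.5.
chars.push({ pos: between(0.4, 0.5), value: "b" });

// A concurrent edit elsewhere in the list does not disturb that position.
chars.push({ pos: 0.1, value: "_" });

console.log(render(chars));
```

Note that two clients inserting between the same neighbors would compute the same midpoint, which is the collision case that list CRDTs add extra machinery to resolve.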

0:38:45 Got it.

0:38:45 Yeah.

0:38:45 So I'm using the same mechanism inspired by the ideas of like the, the Figma

0:38:51 blog posts, et cetera, for Overtone.

0:38:53 And I'm even using it before I started implementing syncing, just because

0:38:57 I found it to be the easiest way to keep a list ordered in an event-sourced

0:39:03 system, since this is what I'm also already using to circumvent schema

0:39:07 migrations for the, the app I'm building.

0:39:10 So it's, I think it's actually a very simple self contained concept that can

0:39:15 be applied even outside of the scope of a full blown local-first data stack.

0:39:20 Yes, exactly.

0:39:21 Yeah.

0:39:22 It turns out that what

0:39:23 text editing CRDTs are doing is very similar to fractional indexing,

0:39:27 just with some extra changes to solve some bugs, basically, like

0:39:30 what happens if two people try to insert a character at the same place.

0:39:33 Fractional indexing breaks down; CRDTs just have the smallest change

0:39:36 needed to make this not break down.

0:39:38 I agree with your point that this isn't really a collaborative thing.

0:39:41 This is just a general data structures thing.

0:39:43 It's like the way we describe text as an array is sort of flawed, because

0:39:48 array indexes are changing all the time, even though the character is staying

0:39:52 the same and staying in the same place

0:39:54 in an intuitive sense.

0:39:56 So what we really want is an abstraction where the characters keep the same

0:39:59 identifier at all times, whether that's a fractional index or whether it's part of

0:40:03 the list CRDT internals, and then that's how we should represent sequences that

0:40:07 can move around, which is basically any list in a GUI where you can drag something

0:40:11 in between two existing elements.

0:40:13 Combining approaches

0:40:13 That makes a lot of sense.

0:40:14 And what's also so cool about like seeing all of the different options in this

0:40:19 classification table is that you don't have to choose exactly one for your app.

0:40:24 What I'm planning to do for Overtone is mostly follow the event

0:40:28 sourcing idea for collaborative state.

0:40:30 However, in the places where I have

0:40:33 complex, particular problems such as a description text or like a document

0:40:39 text, this is where I most likely will resort to something like Automerge

0:40:43 or Yjs to let those technologies deal with the text editing, the

0:40:49 collaborative text editing use case.

0:40:51 And with that, I think I get the best of both worlds, where I get

0:40:57 all the benefits from event sourcing for the, the more high level data structure

0:41:02 of my app and for the specificness of the text editing, I embed a little CRDT

0:41:08 use case in the broader document use case that I tame with event sourcing.

0:41:14 Do you think that general approach makes sense?

0:41:16 Yes, that's exactly the way to do it.

0:41:18 Yeah, if you look, there's a lot of, you know, blog posts talking

0:41:20 about how CRDTs are complicated or they're hard to implement.

0:41:23 Usually these blog posts are talking specifically about the text editing part.

0:41:27 That's sort of the hard part where you want to let someone else do

0:41:29 it and have their nice battle tested, fuzz tested implementation.

0:41:32 But for other data structures, like if you have, you know, sort of a database table

0:41:37 sort of structure or a map structure, it's easier to make your own sync engine for

0:41:42 that and just drop in an existing library to handle the lists and text editing.

0:41:46 Right.

0:41:46 So it's funny that you came from going super deep on CRDTs to spanning

0:41:52 this broader table of possibilities.

0:41:56 And it seems like now you're actually much more drawn

0:41:59 to the first quadrant around event sourcing. What led

0:42:02 to this interest for you?

0:42:05 Let's see.

0:42:05 So it might just be, you know, the grass is greener on the other side.

0:42:08 I haven't tried to make an app or a library using the

0:42:11 event sourcing approach yet.

0:42:12 So maybe I just don't know what's wrong with it.

0:42:14 But it really started out about a year ago.

0:42:16 I was thinking about version control.

0:42:19 This was around the same time that Ink & Switch was thinking about version

0:42:21 control with their Upwelling essay.

0:42:24 And the idea was like, what if we could do this Git-style model where

0:42:28 you make changes to an app, like a text document or a spreadsheet.

0:42:31 We just put these into linear branches, and then when we merge them, you

0:42:35 copy from one branch to another.

0:42:37 And originally, the idea is we're going to put CRDT operations in these

0:42:40 branches, because that's what I'm familiar with, but I eventually realized like,

0:42:43 actually, because the branches put the operations in a total order anyway, we

0:42:47 don't care about the CRDT correctness properties that say that you can

0:42:51 apply operations in different orders.

0:42:53 So we might as well just use arbitrary operations.

0:42:55 And that unlocks a whole lot of possibilities that would have been

0:42:59 hard to do in a CRDT system, like you can do these rename variable or find

0:43:03 and replace operations, maybe even like a change tone with AI operation.

0:43:07 Just put these in a log, have the log be in a fixed order, and

0:43:11 run the operations in that order.

0:43:12 That makes sense.

0:43:13 So aside from the versioning use case, can you think of how using a CRDT approach

0:43:20 versus an event sourcing approach might be a good or a bad fit for different

0:43:26 categories of apps that you can think of?

0:43:29 Sure.

0:43:29 Yeah, so I think the advantages of a CRDT approach, well first off,

0:43:33 you can do this more database model.

0:43:34 If I'm going to put my data in a magic box that says database, and

0:43:37 it's going to synchronize it for me, I don't have to worry about it.

0:43:39 Whereas if you're doing an event sourcing approach, you have to think

0:43:42 more carefully about what are my mutations that I'm sending around?

0:43:46 How do I process them?

0:43:47 How do I make sure that they still make sense, even if someone else's

0:43:50 mutations reach the server first?

0:43:52 So that's a bit harder.

0:43:54 The other advantage of CRDTs is the efficiency perspective.

0:43:57 CRDTs can implement operations in a very efficient way, so

0:44:01 that you're not going to accidentally say, you know, I'm sending this mutation

0:44:06 to the server that's going to take

0:44:07 an entire second to process and slow everyone down.

0:44:10 It's sort of the general trade-off that CRDTs behave more like a database.

0:44:14 They just work and they're optimized to be fast.

0:44:17 Whereas with an event sourcing model, you get flexibility.

0:44:21 You can send arbitrary mutations around, you can have arbitrary business

0:44:25 logic on the server, it can even differ from the logic on the clients.

0:44:29 Just coming back to the video game example, you have a lot of logic that

0:44:32 the server needs to step through, checking permissions, checking

0:44:34 collisions, that sort of thing.

0:44:36 Which would be hard to do with a CRDT or with a database model.

0:44:40 Event sourcing challenges

0:44:40 So you mentioned that you haven't yet built larger systems with the

0:44:45 event sourcing approach, but I think you've still done a little

0:44:48 bit of research on what might await you in the event sourcing world.

0:44:53 So could you outline a little bit of like the potential concerns

0:44:57 you see on the horizon when going all in on event sourcing?

0:45:01 Yeah, so I guess the main concern always is if you're sending around

0:45:06 this log of events to clients.

0:45:09 And if you're storing this as your single source of truth, then

0:45:13 storing all these events forever, it might take up a lot of space.

0:45:15 If you could imagine a text document, if each text character corresponds to 100

0:45:20 bytes of JSON, then the history of all the events is going to be a hundred times

0:45:25 bigger than the actual text document.

0:45:26 Even if you've since cleared out the entire text document, now it's empty.

0:45:29 You still have all this state.

0:45:30 So that's the main challenge is just how do we store the events efficiently, how

0:45:35 do we maybe compact them, say I don't need these events anymore, I'm going to

0:45:38 throw them away and replace the state, while still making that play nicely

0:45:42 with, you know, clients who have been offline for a month, that sort of thing.

0:45:45 Which sort of mechanisms do you think will mostly help to

0:45:49 overcome some of those issues?

0:45:51 I'm hoping the main mechanism is just to give up: basically say text is

0:45:56 very small, and the main sources of lots of data in your app are

0:46:01 blobs like images or videos, which you can put somewhere else anyway.

0:46:05 And then for the actual events describing the fine-grained changes, just store

0:46:08 them all and it's only going to be a few megabytes per document anyway.

0:46:11 Got it.

0:46:13 Yeah.

0:46:13 And I think on top of that, there's also the compaction use case.

0:46:17 Now that I have a little bit more, insight on, on that

0:46:21 approach with building Overtone.

0:46:23 For example, everything you do within Overtone, whether it's playing

0:46:28 a track, whether it's navigating within the app, whether it's adding a track

0:46:31 to your playlist or following an artist, all of those are events. And adding

0:46:39 a track to a playlist, you do a lot less of that than, for example,

0:46:45 in the background, the app auto-playing the next track, which is also an event.

0:46:52 And another kind of event is if the app tries to authenticate with a music service

0:46:58 such as Spotify to exchange tokens, which it needs to do at least once an hour.

0:47:05 So it does so a little bit ahead of time.

0:47:07 Also, when you reload the app, it needs to do that.

0:47:11 So just by the fact of the app running in the background over time, it racks

0:47:18 up quite a lot of different events.

0:47:21 And I think there the interesting part is that the nature of those events

0:47:25 also allows for different trade-offs.

0:47:28 So me putting a track into a playlist: there's going to be

0:47:33 way fewer of those events,

0:47:35 and it's fine to keep the entire history of this around.

0:47:38 What's also so cool about this is the fact

0:47:41 that having this event allows me to trivially implement a feature like this:

0:47:46 I can hover over the track and I see the information when was it added by

0:47:51 whom was it added to, to the playlist.

0:47:53 It also makes implementing things such as undo much easier, but the other kind

0:47:59 of events, which might be implicit or which might just be a lot more, higher

0:48:05 quantity, what I've seen is that, it's not as crucial to keep those events

0:48:11 around for eternity, but some of those events are then also made irrelevant by

0:48:17 follow up events of the, the same type.

0:48:20 So for example, if your app has authenticated and writes sort of like

0:48:24 an auth state into the database, and

0:48:27 two hours later it has already done so 10 more times,

0:48:31 I don't need to keep the entire history before that, maybe besides for auditing

0:48:35 reasons, so I can just at some point remove the old events, which keeps

0:48:41 an otherwise always growing event log at a, for this given event type

0:48:47 at a much more like constant size, which makes it much more feasible.
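
The per-event-type compaction described here, where a later event of the same type makes earlier ones irrelevant, could be sketched like this in TypeScript. The event names and the `SUPERSEDED_BY_LATEST` set are hypothetical and are not taken from Overtone's code.

```typescript
interface AppEvent { type: string; at: number; payload: string; }

// Hypothetical event types where only the latest occurrence matters.
const SUPERSEDED_BY_LATEST = new Set<string>(["authRefreshed"]);

// Keep every event, except older occurrences of superseded types.
function compact(log: AppEvent[]): AppEvent[] {
  const lastIndex = new Map<string, number>();
  log.forEach((e, i) => {
    if (SUPERSEDED_BY_LATEST.has(e.type)) lastIndex.set(e.type, i);
  });
  return log.filter(
    (e, i) => !SUPERSEDED_BY_LATEST.has(e.type) || lastIndex.get(e.type) === i,
  );
}

const eventLog: AppEvent[] = [
  { type: "authRefreshed", at: 1, payload: "token-1" },
  { type: "trackAddedToPlaylist", at: 2, payload: "track-42" },
  { type: "authRefreshed", at: 3, payload: "token-2" },
  { type: "authRefreshed", at: 4, payload: "token-3" },
];

const compacted = compact(eventLog);
console.log(compacted.map((e) => e.payload));
```

The playlist event keeps its full history, while the auth events stay at a constant size of one no matter how long the app runs.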

0:48:52 Another thing that I, started thinking about is like, what if you have not

0:48:57 just like one event log, but what if you have multiple event logs?

0:49:01 And what if you have, a hierarchy of event logs?

0:49:04 This is something that I also want to think a little bit more about. Let's

0:49:08 say you have a, a tree of, playlists, like a, a folder of playlists.

0:49:13 So you have a, a playlist.

0:49:15 And that playlist could also, possibly be a folder of other playlists.

0:49:20 So now what does the event log exist for?

0:49:23 Does it exist for like, everything in my library?

0:49:26 Or is it broken down to

0:49:30 only giving information about which playlists I have, and then I need to

0:49:34 subscribe to another playlist, but what if that playlist is a folder?

0:49:38 So this hierarchical aspect of it, I think this will keep me busy

0:49:42 for, for a little bit as well.

0:49:43 Do you have thoughts on those problems?

0:49:46 Yeah, I mean, this, the, what you're saying is really interesting.

0:49:48 It makes me think of the problem of ephemeral presence.

0:49:52 So, you know, in Figma, when your collaborators are moving their

0:49:54 mouse cursors around, you can see where they are at any time.

0:49:58 I would imagine Figma is not actually persisting those mouse movements,

0:50:01 it's just sending them over the usual channels so that you can see them live,

0:50:04 but then you forget about these events because they don't matter anymore.

0:50:07 So I wonder if you could maybe do that for a lot of the events that don't

0:50:11 matter as much, or even in a text editor.

0:50:13 So one thing that's really hard with a collaborative text editor is you'd like it

0:50:17 so that whenever you press a key, that key is immediately sent to your collaborators.

0:50:21 But if that actually creates an event that's persisted in the log, then you have

0:50:24 this issue of, you know, 100 times as much storage as key presses, but maybe what

0:50:28 you could say is when you press a key, that's like an ephemeral presence message.

0:50:32 It's not actually stored, it's just sent over the same

0:50:34 channel as the mouse movements.

0:50:36 And this is sort of like an ephemeral mini log that's stacked on top of the actual

0:50:40 event log, and then every 10 seconds or so you send a compacted version of

0:50:44 like the entire sentence that the person typed as a single event, and that's

0:50:47 what's actually stored on the backend.
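
A hedged TypeScript sketch of the "ephemeral mini log" idea floated here: key presses go out live but are not stored, and a periodic flush persists one compacted event. `TypingSession` and its fields are hypothetical, and the two arrays stand in for a real presence channel and a real event log.

```typescript
interface TextInserted { type: "textInserted"; text: string; }

class TypingSession {
  private pending = "";
  readonly broadcasts: string[] = [];      // stand-in for the live presence channel
  readonly persisted: TextInserted[] = []; // stand-in for the durable event log

  // Each key press is shared live but not stored.
  keyPress(ch: string): void {
    this.pending += ch;
    this.broadcasts.push(ch);
  }

  // Called periodically (for example every 10 seconds) or when typing pauses.
  flush(): void {
    if (this.pending.length === 0) return;
    this.persisted.push({ type: "textInserted", text: this.pending });
    this.pending = "";
  }
}

const session = new TypingSession();
"hello".split("").forEach((ch) => session.keyPress(ch));
session.flush();
session.flush(); // nothing pending, so nothing new is persisted

console.log(session.broadcasts.length, session.persisted.map((e) => e.text));
```

Five messages go over the live channel, but only one event reaches the log. What this sketch leaves open is how collaborators reconcile the live key presses with the compacted event when it arrives.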

0:50:49 I wonder if that could help at all, or if this is even possible to implement.

0:50:52 Right.

0:50:53 I've actually implemented a small version of that already,

0:50:57 which I call local only events.

0:50:59 The idea of that is that, there's kind of like hierarchies of syncing as well.

0:51:05 There's syncing just from the main thread to the worker thread, which is

0:51:11 responsible for persisting the data, but also from one tab to another tab.

0:51:18 And those two tabs should in some regards converge, and in

0:51:23 some regards, allow divergence.

0:51:25 So for example, if you have Notion open in two tabs, you want to be able to navigate

0:51:32 to different documents in those different tabs, but if you're in the same document,

0:51:36 you probably want to see the same thing.

0:51:38 So the same applies to a music app.

0:51:41 Maybe in one tab you want to have

0:51:43 the playback of one track, and in the other one you want to not have the same

0:51:48 playback, otherwise you hear it twice,

0:51:50 but you want to maybe work on a playlist.

0:51:53 And so keeping things in sync is important, but I don't want to,

0:51:58 constantly as the playback progresses, have persistent events for this.

0:52:02 So I try to, A, have, like, very deliberately small events.

0:52:08 And the other thing is where I have events that are broadcast around.

0:52:12 But if the app reloads, it doesn't rehydrate from those.

0:52:16 It either catches them midway or it's not important enough

0:52:21 that it shows it, so very similar to the presence feature in Figma.

0:52:25 So I have implemented a first version of this, but I think there can be

0:52:29 use cases where you might want to keep them around for like 10 minutes

0:52:34 or 10 seconds, like you say, and then have a version of compaction.
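
A rough sketch of the "local-only events" idea described above: every event is delivered to all listeners, but only events not marked local-only are written to the log that the app rehydrates from. This is not LiveStore's API. `TabEventBus`, `AppEvent`, and the `localOnly` flag are hypothetical, an in-memory array stands in for durable storage, and plain callbacks stand in for cross-tab messaging.

```typescript
// Hypothetical event shape; `localOnly` marks events that are never persisted.
type AppEvent = { name: string; payload: unknown; localOnly: boolean };

class TabEventBus {
  // Stand-in for durable storage; a real app would use something like IndexedDB.
  private readonly log: AppEvent[] = [];
  // Stand-in for other tabs or threads listening for events.
  private readonly listeners: Array<(event: AppEvent) => void> = [];

  subscribe(listener: (event: AppEvent) => void): void {
    this.listeners.push(listener);
  }

  // Every event is broadcast, but only non-local events reach the log.
  dispatch(event: AppEvent): void {
    if (!event.localOnly) this.log.push(event);
    for (const listener of this.listeners) listener(event);
  }

  // After a reload, state is rebuilt from persisted events only.
  rehydrate(): AppEvent[] {
    return [...this.log];
  }
}
```

With this shape, a playback-progress tick would be dispatched with `localOnly: true` so other tabs can react to it, while a playlist rename would be dispatched with `localOnly: false` and survive a reload.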

0:52:37 I think that that's really interesting.

0:52:40 What you're describing sounds really cool.

0:52:41 I'll be interested to see this code someday.

0:52:47 Local-first ideals are still hard to reach

0:52:47 So you've now been in the local-first space for over five

0:52:50 years, and I'm sure you've seen many technologies come along over time.

0:52:56 I'm curious whether you have certain strong opinions about the local-first

0:53:00 space or the web ecosystem more broadly.

0:53:02 Yes, I guess one.

0:53:04 Well, this isn't really an opinion, but just I'll make an observation that the

0:53:06 local-first movement has really exploded just within the past 12 or 18 months.

0:53:11 Like, starting out five years ago reading CRDT papers and going to CRDT

0:53:15 conferences, it was a much more, you know, mellow academic atmosphere.

0:53:19 But now there's just so many tools popping up, I can't keep

0:53:21 track of them in my browser tabs.

0:53:23 You know, the local-first Discord, all that stuff.

0:53:25 Just a lot more activity.

0:53:26 So it's both exciting and also a bit scary, because now I can't read all

0:53:29 the papers that come out anymore.

0:53:31 Yeah, in terms of opinions, I guess the strong opinion I've had in the

0:53:35 past year or so is that the local-first ideal, I think, is too hard right now.

0:53:41 There's just too many problems we'd have to solve to actually make like

0:53:43 a local-first app where the hosting provider can go away and you'll still be

0:53:47 able to collaborate and keep your data.

0:53:49 So the problem that I've been focusing on for the past year is the narrow

0:53:53 goal, like the baby step, of how do we make traditional central server SaaS

0:53:58 collaboration easier to implement, and maybe a bit easier to deploy.

0:54:02 So that's working on primitives like what you were describing with LiveStore.

0:54:05 We want some way to have events that you send around and persist to IndexedDB,

0:54:10 use a broadcast channel between different tabs, and then eventually send them

0:54:13 to a server that stores them and broadcasts them back to the client.

0:54:16 Just make some really good implementation of that that people can reuse so they

0:54:19 don't have to reinvent it every time.

0:54:22 And I think that'll be

0:54:23 both useful for, you know, developers and also a good stepping stone

0:54:26 towards the eventual goal of we want to get rid of this server and

0:54:29 have our, have our data forever.
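
The primitive described above (persist an event locally, send it to a central server, and have the server store it and broadcast it back) might look roughly like this. It is a sketch under loud assumptions: `InMemoryServer`, `SyncClient`, and `LoggedEvent` are hypothetical names, arrays stand in for IndexedDB and the server's database, and direct method calls stand in for the network and the broadcast channel between tabs.

```typescript
// Hypothetical event shape; `id` lets the server ignore retransmissions.
type LoggedEvent = { id: string; data: string };

class InMemoryServer {
  readonly log: LoggedEvent[] = []; // stand-in for the server's event store
  private readonly clients: SyncClient[] = [];

  connect(client: SyncClient): void {
    this.clients.push(client);
  }

  // Store the event once, then broadcast it to every connected client.
  receive(event: LoggedEvent): void {
    if (this.log.some((e) => e.id === event.id)) return;
    this.log.push(event);
    for (const client of this.clients) client.onServerEvent(event);
  }
}

class SyncClient {
  readonly pending: LoggedEvent[] = []; // stand-in for locally persisted, unconfirmed events
  readonly confirmed: LoggedEvent[] = []; // events in the order the server broadcast them
  private readonly server: InMemoryServer;

  constructor(server: InMemoryServer) {
    this.server = server;
    server.connect(this);
  }

  // Persist locally first, then hand the event to the server.
  emit(event: LoggedEvent): void {
    this.pending.push(event);
    this.server.receive(event);
  }

  // The server's broadcast doubles as the acknowledgement for our own events.
  onServerEvent(event: LoggedEvent): void {
    const index = this.pending.findIndex((e) => e.id === event.id);
    if (index !== -1) this.pending.splice(index, 1);
    this.confirmed.push(event);
  }
}
```

Because every client appends to `confirmed` in the order the server broadcasts, all clients end up with the same sequence. Offline queuing, retries, and rebasing pending events on top of confirmed ones are left out of the sketch.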

0:54:31 I love that observation, and that opinion.

0:54:34 I think that's also one of my key takeaways from talking to many folks

0:54:38 at the local-first conference we had this year in Berlin, where everyone

0:54:42 gets excited about all the goals and all the ideals of local-first, but

0:54:48 going after a few of those already is technically very complicated.

0:54:54 And then going like all the way to making sure that the software still

0:54:58 works if the vendor goes away, etc.

0:55:01 That is, I think, right now achieved by only a very, very small

0:55:07 set of products and technologies.

0:55:09 I hope that in five years from now, it will be table stakes.

0:55:13 But, I think it's a little bit like Maslow's hierarchy of needs.

0:55:17 And, like, here we have the hierarchy of ideals, and we haven't yet

0:55:21 quite made it easy to achieve all of it. Hopefully we'll get closer

0:55:26 to that over the next couple of years.

0:55:29 So those technologies that you've now mentioned, is there anything

0:55:37 List-positions

0:55:37 Let's see.

0:55:37 So the main project I've had recently is it's a library called list-positions.

0:55:42 So you can read about it in my blog post or look at the docs on GitHub.

0:55:45 But it's basically trying to solve this fractional index generalization problem.

0:55:49 You can think of it like a fractional index library that also

0:55:52 implements the extra features that CRDTs have to prevent some bugs.

0:55:57 The idea is that you can use this as a drop in part to do just the text and

0:56:01 list collaboration in some arbitrary data structure. So I built examples on top

0:56:06 of Triplit, ElectricSQL, Replicache.

0:56:09 So these are collaborative data stores that don't talk

0:56:11 about lists or text at all.

0:56:13 They're basically syncing maps or database tables.

0:56:15 And I said here, if we just stick these souped up fractional

0:56:18 indices on top, we can actually do text to rich text collaboration.
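
To make the fractional-index idea concrete, here is a minimal sketch of ordering rows in a synced table by a position key. It is plain fractional indexing with numbers, not the list-positions API: `PositionedRow`, `positionBetween`, and `insertAt` are hypothetical names, and the sketch has exactly the weaknesses a more careful library is meant to address (floating-point precision runs out after many inserts in the same gap, and two users inserting into the same gap concurrently get the same key).

```typescript
// Hypothetical row shape: a table row ordered by a fractional position in (0, 1).
type PositionedRow = { pos: number; value: string };

// Pick a key strictly between two neighbours; null means "start" or "end" of the list.
function positionBetween(left: number | null, right: number | null): number {
  const lo = left ?? 0;
  const hi = right ?? 1;
  if (!(lo < hi)) throw new Error("left must sort strictly before right");
  return (lo + hi) / 2;
}

// Insert a value at a list index by creating a row with a key between its neighbours.
// No existing row is modified, which is why this fits stores that only sync rows.
function insertAt(rows: PositionedRow[], index: number, value: string): PositionedRow[] {
  const sorted = [...rows].sort((a, b) => a.pos - b.pos);
  if (index < 0 || index > sorted.length) throw new RangeError("index out of range");
  const left = index > 0 ? sorted[index - 1].pos : null;
  const right = index < sorted.length ? sorted[index].pos : null;
  const row: PositionedRow = { pos: positionBetween(left, right), value };
  return [...sorted, row].sort((a, b) => a.pos - b.pos);
}
```

Reading the list back is just a sort by `pos`, so any store that syncs maps or database tables can hold the rows without knowing anything about lists or text.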

0:56:23 Outro

0:56:23 Very interesting.

0:56:24 I will check this out.

0:56:25 Maybe I can use it for Overtone.

0:56:27 Maybe I could even integrate it with LiveStore.

0:56:30 I will certainly check this out and we'll put the link in the show notes.

0:56:34 Great.

0:56:34 Matthew, is there anything else you want to share with the audience?

0:56:38 No, I don't think so.

0:56:39 It's been a really good chat.

0:56:40 Thank you so much for sharing all of your knowledge about different

0:56:44 approaches to syncing state.

0:56:46 I think this is the most in depth we've gone on those topics so

0:56:49 far, and it provided a brilliant overview for future conversations.

0:56:53 It has helped me a ton to better understand this, both your blog

0:56:57 posts as well as this conversation.

0:56:59 So thank you so much for taking time today and coming on to chat.

0:57:03 Yeah, thanks so much for having me.

0:57:04 Thank you for listening to the local-first FM podcast.

0:57:07 If you've enjoyed this episode and haven't done so already, please subscribe and

0:57:10 leave a review wherever you're listening.

0:57:12 Please also share this episode with others.

0:57:15 Spreading the word about the podcast is a great way to

0:57:17 support it and to keep it going.

0:57:19 A special thanks again to Rocicorp and Expo for supporting this podcast.

0:57:24 See you next time.