localfirst.fm
September 3, 2024

#14 – Matthew Weidner: Architectures for Central Server Collaboration

Sponsored by Expo and Rocicorp

Transcript

0:00:00 And this also feeds into features that you might want to give to your
0:00:02 users, especially in productivity apps.
0:00:04 You want to have that change history where you can see what was everyone doing.
0:00:07 You also want to have undo.
0:00:09 Basically, what you can do for undo is when you create this action or this mutation
0:00:13 describing the high-level intent, you can also tag along with it a mutation saying,
0:00:17 here's how to undo this operation later.
0:00:19 And then you store that somewhere and then you just have a queue somewhere in
0:00:22 your app that's like the action queue.
0:00:24 You can go through that and undo things in a nice way.
0:00:27 Hopefully, users will be happier with this than if you just, you
0:00:31 know, revert states exactly, ignoring collaborators' updates.
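A minimal TypeScript sketch of the undo idea described above, assuming a hypothetical to-do app. The names `Mutation`, `Action`, `applyMutation`, and `ActionQueue` are illustrative and do not come from any particular library.

```typescript
// Hypothetical high-level mutations for a to-do app.
type Mutation =
  | { type: "setDone"; todoId: string; done: boolean }
  | { type: "rename"; todoId: string; title: string };

interface Todo { title: string; done: boolean }
type State = Record<string, Todo>;

// An action stores the mutation together with the mutation that undoes it.
interface Action { forward: Mutation; undo: Mutation }

function applyMutation(state: State, m: Mutation): void {
  const todo = state[m.todoId];
  if (!todo) return; // the to-do may have been deleted by a collaborator
  if (m.type === "setDone") todo.done = m.done;
  else todo.title = m.title;
}

class ActionQueue {
  private actions: Action[] = [];
  private state: State;
  constructor(state: State) { this.state = state; }

  perform(forward: Mutation): void {
    const todo = this.state[forward.todoId];
    if (!todo) return;
    // Record how to undo this intent before applying it.
    const undo: Mutation = forward.type === "setDone"
      ? { type: "setDone", todoId: forward.todoId, done: todo.done }
      : { type: "rename", todoId: forward.todoId, title: todo.title };
    applyMutation(this.state, forward);
    this.actions.push({ forward, undo });
  }

  undoLast(): void {
    const action = this.actions.pop();
    if (action) applyMutation(this.state, action.undo);
  }
}

// Usage: undoing the check leaves a collaborator's rename in place.
const todos: State = { t1: { title: "Buy milk", done: false } };
const queue = new ActionQueue(todos);
queue.perform({ type: "setDone", todoId: "t1", done: true });
applyMutation(todos, { type: "rename", todoId: "t1", title: "Buy oat milk" }); // collaborator's edit
queue.undoLast(); // t1 is unchecked again, but keeps the new title
```

Because the undo is itself a mutation, it only touches the field the original action changed, rather than restoring a whole earlier snapshot.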
0:00:35 Welcome to the local-first FM podcast.
0:00:37 I'm your host, Johannes Schickling, and I'm a web developer, a
0:00:40 startup founder, and love the craft of software engineering.
0:00:43 For the past few years, I've been on a journey to build a modern, high quality
0:00:47 music app using web technologies.
0:00:49 And in doing so, I've been falling down the rabbit hole of local-first software.
0:00:54 This podcast is your invitation to join me on that journey.
0:00:57 In this episode, I'm speaking to Matthew Weidner, a computer science
0:01:01 PhD student at Carnegie Mellon University, focusing on distributed
0:01:05 systems and local-first software.
0:01:07 Matthew has recently published an extensive blog post about architectures
0:01:12 for central server collaboration, which we explore in depth in this conversation,
0:01:17 comparing different approaches, such as CRDTs and event sourcing.
0:01:21 Before getting started, also a big thank you to Rocicorp and
0:01:24 Expo for supporting this podcast.
0:01:27 And now my interview with Matthew.
0:01:29 Hey, Matthew.
0:01:30 Thank you so much for coming to the show.
0:01:32 How are you doing?
0:01:33 I'm good.
0:01:34 Yeah.
0:01:34 Thanks for inviting me.
0:01:36 Yeah.
0:01:36 Super excited to have you here.
0:01:38 I think our shared friend Geoffrey Litt introduced us, and he, Matt Wonlaw,
0:01:44 and a few others, when you were writing this blog post, the Architectures
0:01:48 for Central Server Collaboration, all of my friends shared this blog post with me.
0:01:53 And it has since, like, served as a really, really reliable and
0:01:57 good foundation to just provide an orientation around, yeah, how do syncing
0:02:03 systems, et cetera, how do they work?
0:02:06 So this has been the, the initial touch point for me, but would you
0:02:10 mind briefly introducing yourself?
0:02:12 Background: PhD on collaborative apps
0:02:12 Sure.
0:02:13 Yeah.
0:02:13 So I'm Matthew.
0:02:14 I'm a researcher and developer.
0:02:16 I've been thinking about local-first software, and more generally the problem
0:02:19 of how we make collaborative software easier to program.
0:02:23 So that's been, I guess, five years of PhD work, and now I'm working full time on a
0:02:28 collaborative app at a small company.
0:02:30 And yeah, the, the question for me has always been, how can we make
0:02:33 building a collaborative app in the style of Google Docs or Figma
0:02:36 as easy as making a smartphone app or a local only desktop app?
0:02:41 Amazing.
0:02:42 I'm curious, what led you, like when you say five years ago, you started working
0:02:46 on this, what led you to, to that point?
0:02:48 What motivated you to, to look into this?
0:02:51 Yeah, so it actually started a little earlier.
0:02:53 So six years ago, I was doing a master's degree at the University of Cambridge.
0:02:57 I had to pick a master's thesis project, and some of the PhD
0:03:00 students talked about what their lab group was doing, the TRVE DATA group,
0:03:03 where they were working on an end to end encrypted version of Google Docs.
0:03:06 The idea is that some professions, like lawyers or journalists, they want the
0:03:09 collaboration of Google Docs, but they don't trust their data to a third party
0:03:13 where, you know, the employees can look at it or
0:03:14 it's on someone else's servers.
0:03:16 So they wanted this end to end encryption where you say only
0:03:18 you and your collaborators can read the unencrypted data.
0:03:22 So I thought this sounded like a really interesting project.
0:03:23 I just joined them for my master's thesis.
0:03:25 Turned out to be working with Alastair Beresford and Martin Kleppmann,
0:03:30 mostly on the cryptography side.
0:03:31 Then after that, I decided that actually the collaboration side
0:03:34 sounded more interesting, and I wanted to work on that for my PhD.
0:03:37 Very interesting.
0:03:38 What did the technology landscape at that point look like?
0:03:41 I mean, today there's like Automerge and quite a few other technologies
0:03:45 that already try to attempt this.
0:03:47 What did the technology landscape back then look like?
0:03:50 So this was before the local-first essay.
0:03:52 I think I actually saw a draft of the local-first essay that
0:03:56 year, as a master's student.
0:03:57 Automerge, I believe, had started, Yjs had started, but I hadn't
0:04:01 heard of people using it yet.
0:04:03 But yes, people were just getting started with this idea of
0:04:07 collaborative data structures for the web, not necessarily with central
0:04:10 servers. These CRDT libraries were just getting started, and I don't
0:04:15 know if the local-first world had really even started yet at that point.
0:04:19 Right.
0:04:19 Yeah.
0:04:20 I think there are so many people who thought about similar problems
0:04:23 over like decades before then.
0:04:25 There was like CouchDB and PouchDB and like a lot of great minds
0:04:29 already thought about this, but I feel like the real momentum
0:04:32 started with the local-first essay.
0:04:35 So I'm curious, take me through a little bit of like the, the
0:04:37 five years working on that.
0:04:40 What were some of the milestones?
0:04:42 How did you go about starting this in the first place?
0:04:45 Sure.
0:04:45 So the main thing is I was coming at it from a more academic perspective, like
0:04:49 I really have a theory math background.
0:04:51 So I was looking at the theory of CRDTs, these
0:04:54 conflict-free replicated data types.
0:04:56 The idea is, sort of, that it's a data structure that's
0:04:59 copied on multiple devices.
0:05:01 You put your data in it, like your app's data, and then one user can change their
0:05:04 copy of the data whenever they want.
0:05:06 At some point later, you'll sync up in the background and come to
0:05:09 a convergent copy where everyone's looking at the same document again.
0:05:12 This is really designed for the sort of peer to peer model where you don't
0:05:15 necessarily have central authority, it's just everyone updating their own data.
0:05:19 And also this local-first spirit, where you always update the local copy of
0:05:21 your data first, and then you talk to everyone else and say, here's my changes.
0:05:25 So I spent the first year really just reading the papers in that field.
0:05:29 So there's a classic paper by Marc Shapiro and coauthors from 2011, a lot of papers by
0:05:34 Carlos Baquero and his collaborators. Yeah, just trying to learn what are these
0:05:38 data structures, what can we do with them.
0:05:41 Got it.
0:05:41 And so after that, you started your own implementations of CRDTs.
0:05:46 And was there any sort of reference app that you oriented this around?
0:05:50 Not really.
0:05:51 So there's actually, there's a reference CRDT.
0:05:53 So we started with this paper, which is very theoretical about this way
0:05:56 that you could maybe combine two CRDTs.
0:05:59 So the example we use, which is a bit silly, is if you have
0:06:02 a number that you can add things to, like maybe a bank account balance
0:06:06 you can add to, you can also multiply to if you're applying the interest.
0:06:09 How do you combine these two operations in a single CRDT that can
0:06:12 be updated with either add or multiply?
0:06:15 So then my advisor had this idea, let's implement this in a library.
0:06:18 And that already set some sort of unique design principles,
0:06:23 which is that we're going to assume you're making your own CRDTs.
0:06:26 It's not just a collection of CRDTs we give to you, like map, list, et cetera.
0:06:30 It's actually going to be whatever you make, and then some way to combine them together.
0:06:34 So that was really the starting point, is that we want to make a place where you
0:06:38 can make your own CRDTs and compose them.
0:06:40 I don't think we really had a specific application in mind at the beginning.
0:06:44 Was that technology ever released or open source or talked about in some way?
0:06:50 Collabs
0:06:50 So we did make an open-source library about it.
0:06:52 It's called Collabs.
0:06:53 So it's written in TypeScript.
0:06:55 We have a documentation site.
0:06:56 I think it's collabs.readthedocs.io.
0:06:59 It's definitely still an academic project.
0:07:01 So it's really about, here are these data structures that you can play
0:07:04 with and you can make your own things.
0:07:06 We do have some basic demo apps, like your basic, you know, text editor.
0:07:11 There's a to-do list sort of thing somewhere.
0:07:13 And then there is an arXiv paper about it that you can read, which goes into
0:07:16 more detail about the system design and why we did things the way we did.
0:07:20 Got it.
0:07:20 And so it sounds like you've really gone super deep on this, mostly
0:07:25 oriented from the CRDT side of things.
0:07:28 But, as you read the papers, as you were working on this, you also got
0:07:33 a better understanding of the larger space and the other approaches.
0:07:37 And I think you got more curious about the other approaches and this
0:07:40 is what you've laid out so clearly and brilliantly in this blog post that will
0:07:45 be linked in the, in the show notes.
0:07:47 And I highly recommend anyone who's listening to read it in depth, if
0:07:51 you're curious about those topics.
0:07:53 So the blog post is called Architectures for Central Server Collaboration, and
0:07:58 it provides a really nice way to think about this, sort of like a
0:08:03 hierarchical structure of what the design decisions are.
0:08:07 What are the trade-offs?
0:08:08 What are the concerns about the different approaches?
0:08:11 And so I'd love to just go through that step by step.
0:08:17 Architectures for Central Server Collaboration
0:08:17 Sure.
0:08:18 Let's see.
0:08:19 Yeah.
0:08:19 So the idea of this blog post is we're thinking about
0:08:24 real-time collaborative apps.
0:08:25 So these are apps like Google Docs, Figma, Notion, that sort of thing.
0:08:29 And sort of the distinguishing feature of these apps compared to
0:08:32 more traditional web apps is that, you know, when you make a change, it
0:08:36 updates your local copy immediately.
0:08:38 It's not just click a button, go back to the server, get a
0:08:41 new web page and show it to you.
0:08:43 It's click a button and something updates on your own screen
0:08:45 immediately and eventually it'll tell the server what you did.
0:08:48 So this blog post was trying to think about, in general, with these real
0:08:51 time collaborative apps, like, what are we doing in a semantic sense?
0:08:54 Like, what does it mean to be real time collaborative?
0:08:57 And then, at a high level, how can you implement
0:09:00 that in the most flexible way possible.
0:09:03 And so you've derived a couple of like really nice ways to, to think about
0:09:08 that, like in terms of dimensions and later on you, you can nicely
0:09:12 summarize it in a nice overview table.
0:09:15 Would you mind motivating some of the dimensions that you came up with here?
0:09:20 Sure.
0:09:21 Let's see.
0:09:21 So I guess just for context, my own background is, as I said, thinking
0:09:24 about it from a CRDT perspective.
0:09:26 This is very much the perspective where you have some data structures,
0:09:30 which are usually pretty low level, like maps and lists, and you have
0:09:33 some prescribed operations that you can perform on them, and then it'll
0:09:37 sync it for you under the hood.
0:09:39 And then also in the CRDT model, it's usually not really assuming a central
0:09:43 server, or the central server is doing basically the same thing as the clients.
0:09:47 So what the dimensions are thinking about is, okay, what can we do that's
0:09:51 different from just the CRDT model?
0:09:53 And, yeah, there's really three dimensions.
0:09:56 I guess maybe the most interesting one is how you describe
0:10:00 operations on the collaborative state.
0:10:03 So you have sort of the, the database or key value store model, which is,
0:10:06 you have these low level state changes.
0:10:09 Like when I check a box in a to-do list, that's creating a row in a database
0:10:14 that says, you know, to do list checked, true, that sort of thing.
0:10:18 And then there's also this opposite model, which is sort of the more event sourcing
0:10:21 approach where you have these high level operations, sometimes called mutations.
0:10:26 And this is where, when you change the data, you're actually telling the server
0:10:29 exactly what the user's intent was.
0:10:31 You say, the user wants to check this box and make it true, and
0:10:34 then you broadcast that high level intent back to the other users.
0:10:38 And tell them what to do and how to update their state.
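To make the two forms of operation concrete, here is a small TypeScript sketch. It assumes a hypothetical to-do app backed by a generic key-value store; `StateChange`, `Mutation`, `applyStateChange`, and `interpret` are made-up names for illustration only.

```typescript
// Low-level state change: the sync layer only sees tables, keys, and values.
interface StateChange { table: string; rowId: string; column: string; value: unknown }

// High-level mutation: names the user's intent; app code decides what it means.
interface Mutation { name: "checkTodo"; args: { todoId: string } }

type Row = Record<string, unknown>;
type Db = Record<string, Row>; // keyed by "table/rowId"

function applyStateChange(db: Db, change: StateChange): void {
  const key = change.table + "/" + change.rowId;
  const row = db[key] ?? {};
  row[change.column] = change.value;
  db[key] = row;
}

// Interpreting a mutation against the current state yields zero or more state changes.
function interpret(db: Db, mutation: Mutation): StateChange[] {
  const key = "todos/" + mutation.args.todoId;
  if (!(key in db)) return []; // the intent is no longer valid, e.g. the to-do was deleted
  return [{ table: "todos", rowId: mutation.args.todoId, column: "checked", value: true }];
}
```

The low-level form can be replicated by a store that knows nothing about to-do lists, while the high-level form keeps the intent available so that whoever interprets it can decide what it should mean against the current state.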
0:10:41 And I think this is also like this distinction between the
0:10:45 intent of a mutation and the, the change more directly.
0:10:50 I think this can be, a little bit of a subtle difference for
0:10:55 people who haven't built something with either approach yet.
0:10:59 But I think, to draw an analogy from the web world, when you're working
0:11:05 with something like Redux, and I'm not sure whether you ever
0:11:09 built some front-end apps with Redux, but this is where you have, for
0:11:12 example, if I remember correctly, the concept of an action, which
0:11:17 is basically the idea of an event where you declaratively say, okay, there
0:11:22 is an action or there is an event for someone wanting to complete this to-do.
0:11:29 Then further down the road, there's a reducer, which then, for example,
0:11:34 maintains a list of to-dos and maybe kicks it out or maybe overrides a property
0:11:41 in the to-dos array and says something is done, as opposed to the other approach
0:11:46 where you directly mutate the state,
0:11:50 which is, for example, in the web world, where you're using something like MobX, etc.
0:11:55 And so now we're talking here about the equivalent for distributed state,
0:12:00 where CRDTs, I think, give us more, and the analogy might be a stretch,
0:12:05 more of an equivalent of something like MobX, where
0:12:09 you mutate the state more directly.
0:12:11 And the CRDT underpinnings nicely make that principled, constrain
0:12:16 you in the right way, and then also distribute the state.
0:12:20 Did I summarize this in the right way?
0:12:23 Yes, good description.
0:12:25 Maybe another way to think about it that's more illustrative
0:12:28 than a to-do list is to think about the video game example.
0:12:31 So for example, in a video game, if you press an arrow key on your keyboard,
0:12:35 sort of the high-level intent is, I want my character to move forward.
0:12:40 And then your game server will interpret that intent.
0:12:42 It'll try to move your character forward, but if there's a wall
0:12:44 in the way, it'll stop you.
0:12:45 And if you step on a pressure plate, it'll do something.
0:12:48 And then it'll ultimately compute the actual state changes, which are
0:12:51 the low-level things, like, what coordinates is my player at now?
0:12:55 What is the state of the world in terms of, you know,
0:12:57 doors that are open or closed?
0:12:58 And it'll send those low level state changes back to clients.
0:13:02 So that's another example of this distinction between high
0:13:04 level versus low level intent.
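The game example can be sketched roughly as follows in TypeScript. This is an illustration under assumed names (`World`, `Intent`, `StateChange`, `handleIntent`) and a toy grid world, not how any real game engine is implemented.

```typescript
interface Position { x: number; y: number }

interface World {
  player: Position;
  walls: string[];                        // blocked cells, encoded as "x,y"
  pressurePlates: Record<string, string>; // cell "x,y" -> id of the door it opens
  openDoors: string[];
}

// High-level intent sent by the client.
type Intent = { type: "move"; dx: number; dy: number };

// Low-level state changes sent back to clients.
type StateChange =
  | { key: "player"; value: Position }
  | { key: "door"; doorId: string; open: boolean };

function cellKey(p: Position): string { return p.x + "," + p.y; }

// The server interprets the intent against its own world state.
function handleIntent(world: World, intent: Intent): StateChange[] {
  const target: Position = { x: world.player.x + intent.dx, y: world.player.y + intent.dy };
  if (world.walls.indexOf(cellKey(target)) !== -1) return []; // wall in the way: nothing changes
  world.player = target;
  const changes: StateChange[] = [{ key: "player", value: target }];
  const doorId = world.pressurePlates[cellKey(target)];
  if (doorId !== undefined && world.openDoors.indexOf(doorId) === -1) {
    world.openDoors.push(doorId); // stepping on a pressure plate opens its door
    changes.push({ key: "door", doorId, open: true });
  }
  return changes;
}
```

The client only ever says "move"; the coordinates and door states it receives back are whatever the server computed.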
0:13:06 Right, and I think this is now also a really important distinction because in
0:13:11 the Redux or MobX example, it's, like, all of that is happening on the local device.
0:13:18 There's no cheating in that regard, but when you're talking about games,
0:13:22 there can actually be cheating.
0:13:23 And how do you prevent that, particularly in a multiplayer context?
0:13:27 And this is where what you do on the client and what you do on the
0:13:32 server maybe need to be different things, where the server acts more as an authority.
0:13:38 And the client rather provides instructions, as opposed to providing
0:13:42 the authoritative source of truth for the actual state of a world.
0:13:47 And so this is where the intent is not equal to the reality
0:13:53 that is coming out of it.
0:13:55 And I think this is nicely illustrated in your article through this game example,
0:14:01 where you can basically send to the server, like, Hey, I want to move forward.
0:14:05 The server knows where you were before, and the server tells you
0:14:09 afterwards, like, now you're here.
0:14:11 The client locally can probably, if everything is in an okay state, has
0:14:16 probably already arrived at the same conclusion, but, at least this way the
0:14:21 client can't override to say the player position is somewhere in an illegal state.
0:14:27 Server-side rebasing
0:14:27 Maybe this sort of transitions into the next point or another
0:14:30 dimension in the article.
0:14:32 Which is, what does the server actually do when it receives
0:14:35 an operation, in particular an operation that's out of date?
0:14:38 So the classic example is if you have a like counter, like a post has
0:14:42 some number of likes on it, if it has six likes and I send a command
0:14:45 to the server that says, I like it, change the number of likes to seven.
0:14:48 But what if someone else also liked the post in the meantime, and their
0:14:52 like made it to the server first?
0:14:54 So now the like count's already seven, I don't want to set it to seven
0:14:56 again, I want to increase it to eight.
0:14:58 And there's a, yeah, so basically there's a few philosophies in how the
0:15:01 server should process this operation so that it still makes sense.
0:15:04 I mean, technically it's legal to keep the original operation as
0:15:08 just set the count to 7, but that's not really what the users expect.
0:15:11 So the one philosophy, sort of the CRDT way, is to say, I'm going to phrase
0:15:15 my operations in such a way that the server will know what I want it to
0:15:19 do, and it'll do the correct thing.
0:15:21 So for a like counter, the classic way is you say, increase the like count by one.
0:15:26 The server can get that, and even if the count has gone up since what you
0:15:29 originally thought it was, it's still going to add one and do the proper thing.
0:15:32 So you're going to end up with eight likes instead of seven.
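A minimal TypeScript sketch of the like-counter example, assuming a hypothetical `LikeOp` type. It contrasts a stale "set" operation with an "increment" operation when another user's like reaches the server first.

```typescript
type LikeOp =
  | { type: "set"; value: number }      // "change the number of likes to seven"
  | { type: "increment"; by: number };  // "increase the like count by one"

function applyLikeOp(count: number, op: LikeOp): number {
  return op.type === "set" ? op.value : count + op.by;
}

// Both users saw six likes; the other user's like reaches the server first.
const afterOtherUser = applyLikeOp(6, { type: "increment", by: 1 });

// Phrased as a stale "set", my like is silently lost.
const withStaleSet = applyLikeOp(afterOtherUser, { type: "set", value: 7 });

// Phrased as an increment, both likes are counted.
const withIncrement = applyLikeOp(afterOtherUser, { type: "increment", by: 1 });
```

Here `withStaleSet` ends up at 7 and `withIncrement` at 8, which is the difference described above.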
0:15:34 And sort of the other spirit is the operational transformation spirit.
0:15:38 So this is an older technique for collaborative apps that's used by
0:15:41 Google Docs and was developed in the 90s for the Jupiter collaboration system.
0:15:45 And here the spirit is, the server is going to look at your operation, it's
0:15:48 going to look at all the intervening operations that you didn't know about but
0:15:52 the server has received already, and it's going to use those to sort of compute
0:15:56 what your new intent is supposed to be.
0:15:58 So in this example, you would tell the server, change the like count
0:16:01 to seven, but the server would see that there was an intervening "change
0:16:04 the like count" operation already.
0:16:06 It's going to rewrite your operation as change the like count to eight, and
0:16:09 actually apply that to its state and send that operation to the other users.
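The rewriting step can be sketched as follows in TypeScript. This is a toy transform in the spirit of operational transformation for the like-count example only, with hypothetical names (`SetCount`, `LikeServer`, `baseVersion`); it is not the algorithm used by Jupiter or Google Docs.

```typescript
// The client says "set the count to `value`", and which server version it had seen.
interface SetCount { type: "setCount"; value: number; baseVersion: number }

class LikeServer {
  // history[v] is the like count after v accepted operations.
  private history: number[];
  constructor(initialCount: number) { this.history = [initialCount]; }

  get version(): number { return this.history.length - 1; }
  get count(): number { return this.history[this.history.length - 1] as number; }

  // Rebase an incoming (possibly stale) operation on top of the current state,
  // apply it, and return the rewritten operation for broadcast to other clients.
  receive(op: SetCount): SetCount {
    const seen = this.history[op.baseVersion];
    if (seen === undefined) throw new Error("unknown base version");
    const intendedDelta = op.value - seen; // what the client meant relative to what it saw
    const rewritten: SetCount = {
      type: "setCount",
      value: this.count + intendedDelta,
      baseVersion: this.version,
    };
    this.history.push(rewritten.value);
    return rewritten;
  }
}

// Usage: two clients both saw six likes (version 0) and both send "set to seven".
const likeServer = new LikeServer(6);
const firstLike = likeServer.receive({ type: "setCount", value: 7, baseVersion: 0 });
const secondLike = likeServer.receive({ type: "setCount", value: 7, baseVersion: 0 });
```

The first operation is applied as sent; the second is rewritten to "set the count to eight" because the server sees an intervening change.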
0:16:13 Got it.
0:16:14 So this is basically about the convergence aspect. And I
0:16:18 suppose where this code is running, this can equally work on the
0:16:22 client as well as on the server.
0:16:25 So this is sort of orthogonal to the, the game example case that we talked
0:16:30 about, which is more about the authority.
0:16:33 Yeah.
0:16:34 Yeah.
0:16:34 So this isn't about how the server
0:16:36 interprets operations from, like, a correctness or permissions perspective.
0:16:40 It's just how the server handles operations that are sort of
0:16:43 stale, in the sense that the client originally applied them to one state,
0:16:46 but by the time they arrived at the server, the state had updated because
0:16:49 other people were doing things.
0:16:50 Now the server has to figure out what to do.
0:16:52 Yes, this is the server side rebasing.
0:16:55 This is where the server has to rebase your operation, or
0:16:58 the incoming operations, on top of whatever its new state is.
0:17:02 And sort of the analogy is to git rebasing, where you might try to apply
0:17:05 a commit on top of some new commits that weren't there when you first tried it.
0:17:10 Got it.
0:17:11 Okay, so that is one dimension that you've nicely dissected
0:17:15 here in this, in this blog post.
0:17:18 Optimistic Local Updates
0:17:19 So the next one is the optimistic local updates on the client.
0:17:23 So now if we assume there's a central server, everyone's making
0:17:26 these updates, they're sending these operations to the server, the server
0:17:29 knows what the state's supposed to be.
0:17:31 And what you could say is just the traditional web app model.
0:17:34 If I submit an operation to the server, it processes it, it sends back, sends me
0:17:38 back the result, and now I get to see it.
0:17:40 So if you think of, you know, a traditional HTML form, you submit your
0:17:43 operation to the server, it gives you a new page back saying what it is.
0:17:46 But with modern apps, we want to do better than that.
0:17:48 We want to say that when I perform an operation on the client, it's going
0:17:52 to update my own state immediately.
0:17:54 And that's an optimistic update because I'm sort of optimistically
0:17:57 assuming that the server is actually going to receive my update.
0:18:00 It's going to process it in the way I expected.
0:18:02 No one else is going to interfere.
0:18:04 This is just a nice property in terms of making the app feel more responsive.
0:18:07 You want to see your key presses immediately.
0:18:08 You want to see that button get checked immediately.
0:18:10 So the question is then, how do we actually do that?
0:18:13 Or, I guess the first question is even, what is the correct answer?
0:18:17 What does it mean to optimistically update my state?
0:18:20 And I guess, yeah, sort of the conclusion I came to that, you know,
0:18:23 people have come to in computer games as well, is that you want to take
0:18:27 the latest state you've received from the server, plus your own optimistic
0:18:32 local operations on top of that.
0:18:34 And that's always what the correct state is.
0:18:36 And even as you receive or perform new operations, you're
0:18:38 just maintaining that state.
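The rule "latest server state plus your own optimistic operations on top" can be sketched in TypeScript like this. `OptimisticClient`, `performLocal`, and `receiveFromServer` are hypothetical names, and the sketch assumes the server reports which of the client's operations it has already included.

```typescript
interface PendingOp<Op> { id: number; op: Op }

class OptimisticClient<State, Op> {
  private serverState: State;
  private pending: PendingOp<Op>[] = [];
  private nextId = 1;
  private applyOp: (state: State, op: Op) => State;

  // applyOp must be pure: it returns a new state and leaves its input untouched.
  constructor(initial: State, applyOp: (state: State, op: Op) => State) {
    this.serverState = initial;
    this.applyOp = applyOp;
  }

  // Queue a local operation; the returned id would accompany it to the server.
  performLocal(op: Op): number {
    const id = this.nextId++;
    this.pending.push({ id, op });
    return id;
  }

  // Accept a new authoritative state and drop the operations it already includes.
  receiveFromServer(state: State, acknowledgedIds: number[]): void {
    this.serverState = state;
    this.pending = this.pending.filter((p) => acknowledgedIds.indexOf(p.id) === -1);
  }

  // What the user sees: server state with the pending local operations replayed on top.
  view(): State {
    let state = this.serverState;
    for (const p of this.pending) state = this.applyOp(state, p.op);
    return state;
  }
}

// Usage with the like counter: the state is a number, operations are increments.
const likeClient = new OptimisticClient<number, number>(6, (count, by) => count + by);
const myLikeId = likeClient.performLocal(1);
const shownImmediately = likeClient.view();
likeClient.receiveFromServer(7, []);          // someone else's like arrived first
const shownAfterOtherLike = likeClient.view();
likeClient.receiveFromServer(8, [myLikeId]);  // server has now included my like
const shownAfterAck = likeClient.view();
```

Recomputing the view from the server state each time keeps the client correct even when other users' operations arrive in between.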
0:18:40 Like from your first dimension, which is about server side rebasing, now it's
0:18:45 a lot of the same ideas, but applied on the client where you need to make
0:18:50 the same trade off decisions again, you might come up with different conclusions
0:18:56 based on the server and based on the client, depending on your use cases.
0:18:59 So that, that is the second dimension.
0:19:03 And then you talk about the form of operations.
0:19:07 So how state is changing, based on mutations or based on state changes.
0:19:15 Can you go a little bit more into, into detail here?
0:19:18 Form of operations
0:19:18 Sure.
0:19:18 Yes.
0:19:19 This is what we were talking about at the beginning, where when you, you check
0:19:22 a box in a to do list, you want to say, Am I updating a row in a database that
0:19:25 doesn't know anything about to do lists, or am I sending a high level mutation
0:19:28 that says, like, this user wants to check the to do list and, you know,
0:19:32 do that action or maybe do something else if that's not valid anymore.
0:19:36 So here we get to choose which form of operations we want.
0:19:38 We want to send these high or low level from the client to the server.
0:19:42 Then once the server updates its state, does it want to send high or
0:19:45 low level changes back to the clients?
0:19:48 Yeah, so the video game example is an interesting one where you actually
0:19:50 make different choices usually.
0:19:52 So usually you'll send the high level operations from clients to the server.
0:19:55 You say, I want to move forward, I want to shoot my crossbow.
0:19:58 And then on the way back from the server to the client, usually it
0:20:01 won't send those actual actions.
0:20:02 It'll just send the results, which are changes to some basic key value store.
0:20:06 But you can also make different choices, like you can say, you
0:20:10 know, Git is an example where it's sort of high level mutations.
0:20:14 You're saying, like, I want to, you know, change this text paragraph in
0:20:17 a specific file, and Git will send those exact operations to every client.
0:20:21 It's not going to interpret them at all on the server and change
0:20:24 them into a low level change.
0:20:26 Whereas if you use something like the Firebase database, that's all low level.
0:20:30 You send low level changes to Google servers.
0:20:32 Where you say, I want to, you know, set this key to this value or I want
0:20:35 to delete this object in the database.
0:20:38 And it's going to send that change back to clients without having any idea what
0:20:41 the keys and values actually represent.
0:20:43 That makes sense.
0:20:44 And so I think this is also nicely drawing a boundary between the more declarative
0:20:51 approaches that you have in mutations that you can reason more clearly about,
0:20:56 like in the context of your domain.
0:20:58 But it also only makes sense in the context of your domain.
0:21:02 Whereas with state changes, this is the appeal of CRDTs.
0:21:06 You just mutate a document, and the underlying mechanics make
0:21:12 sure that the state changes are behaving in a useful way, since I suppose
0:21:17 listening to the state changes yourself in your app, that's no fun.
0:21:22 So you really want a system like CRDTs to make sense of that.
0:21:26 So now with those three dimensions, and I'll go through them again, the
0:21:30 server side rebasing, the optimistic updates and the form of operations
0:21:34 like declarative versus state based, now you've combined all of that in a
0:21:39 really nice, classification table where we get a whole bunch of like matrix
0:21:45 cells here with different technologies.
0:21:48 So, again, I highly recommend actually reading this and looking at the
0:21:52 beautiful table for yourself, but in the different cells, you've
0:21:56 also filled in a couple of existing technologies to show where they slot in.
0:22:01 So would you mind going through the different technologies and maybe
0:22:07 Classification table
0:22:07 Sure.
0:22:08 So I guess first I can talk about the one cell just near the bottom right
0:22:11 in the table, if you're looking at it,
0:22:12 which is the CRDT × CRDT cell.
0:22:18 So this is basically the place where I spent my most time reading
0:22:21 about CRDTs, working on this academic open source library.
0:22:24 And that's where the operations that users send are really these low level state
0:22:29 changes to some sort of magical replicated database, where you update the database,
0:22:33 like normally on your local device, and it promises to do this synchronization in
0:22:37 the background and make sure that everyone converges to the same state eventually,
0:22:40 without really caring about what specifically your data or operations are.
0:22:44 So there are some prominent examples.
0:22:45 Firebase Realtime Database, I think of as an example, also
0:22:49 the CRDT-ish libraries, like Yjs.
0:22:52 Also, yeah, Triplit, InstantDB, those are all sort of in this quadrant,
0:22:56 or in this cell, saying that we're going to replicate low-level changes
0:23:00 for you, just as they are.
0:23:02 Another cell on this table, which is sort of near the bottom left, we
0:23:05 mentioned in the computer game example.
0:23:07 In a computer game, you're going to send these high level actions to the
0:23:10 server, which is going to figure out what to do with them, and then communicate
0:23:14 the state changes back to clients.
0:23:16 That's another interesting cell, both because it's sort of old, like, you know,
0:23:20 this starts with the Half-Life game engine in the 1990s, so people have been
0:23:23 using this technique forever, just not in web apps, it's in computer games.
0:23:28 But more recently, Replicache implements this model as a data sync
0:23:32 layer for web applications, which I know a number of companies are using.
0:23:36 And I found that really inspirational, reading about how Replicache works.
0:23:39 I'm glad to have learned about it.
0:23:41 Right.
0:23:41 And I love like how you compare those technologies.
0:23:44 Both technologies.
0:23:45 I love the Half-Life game engine. I spent way too much
0:23:49 time playing various Half-Life game engine games, where it's very, very
0:23:54 intuitive that if you press the W key, that moves you forward.
0:23:59 That's like communicating the intent.
0:24:01 To the server, you don't tell the server like, Oh, I'm at these coordinates.
0:24:04 You just give it like a history of like which keys you pressed
0:24:08 and therefore like how you moved.
0:24:10 And it does some validation of whether all of that is okay.
0:24:13 And it sends you back the location.
0:24:16 And it's the same with Replicache, where you send it a few mutations, and on the
0:24:20 Replicache server, it interprets all of that and sends back to you the state
0:24:25 using the server side knowledge, which might be different than the client side
0:24:29 implementation, so it's the authority.
0:24:31 So that is very clear and very nicely laid out here, where you send the intent,
0:24:36 you send the declarative mutations, and the server sends you back some state
0:24:40 changes, as opposed to what you mentioned before with CRDT × CRDT,
0:24:46 where both on the client and on the server, you run the same CRDT convergence.
0:24:52 And so those two cells are very clear.
0:24:55 Yes, exactly.
0:24:56 And then, yeah, so I guess the remaining cells of the table, they
0:25:00 mostly, they either use state changes in both directions or they use high
0:25:04 level mutations in both directions.
0:25:06 So, let's see.
0:25:07 Two interesting ones.
0:25:08 Automerge and ShareDB.
0:25:10 They're both doing a similar idea to the CRDT libraries, like YJS, where they're
0:25:15 sending these low level state changes around and making sure everyone converges
0:25:18 to the same state, but they have a different way of doing this internally.
0:25:22 So with Automerge, what you're actually doing is you're performing these state
0:25:26 changes, and Automerge as a library is basically a JSON CRDT, but the way
0:25:30 it works is more like an event sourcing model, where you have this
0:25:35 total order of CRDT style operations.
0:25:38 All clients are going to make sure that they eventually
0:25:40 converge to the same total order.
0:25:42 So everyone will agree on what operation 1, operation 2, operation 3, etc. are.
0:25:46 The state is the result of applying all of these operations in that fixed order.
0:25:50 And if, you know, people do operations concurrently on their different devices
0:25:54 because the network's not working, then we'll just sort those operations
0:25:57 into some order later, make sure everyone agrees on the same order,
0:26:00 and that's giving you your state.
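The total-order idea described here can be sketched in a few lines. This is only an illustration, not Automerge's actual implementation: the `Op` shape, the Lamport-style `counter`, and the `clientId` tie-breaker are all assumptions made for the example.

```typescript
// Hypothetical operation shape: a logical counter plus a client ID as tie-breaker.
type Op = { counter: number; clientId: string; key: string; value: string };

// Sort into one deterministic total order that every replica can compute on its own.
function totalOrder(ops: Op[]): Op[] {
  return ops.slice().sort((a, b) => {
    if (a.counter !== b.counter) return a.counter - b.counter;
    return a.clientId < b.clientId ? -1 : a.clientId > b.clientId ? 1 : 0;
  });
}

// The state is the result of applying every operation in that fixed order.
function materialize(ops: Op[]): Record<string, string> {
  const state: Record<string, string> = {};
  for (const op of totalOrder(ops)) state[op.key] = op.value;
  return state;
}

const fromAlice: Op = { counter: 2, clientId: "alice", key: "title", value: "Draft" };
const fromBob: Op = { counter: 2, clientId: "bob", key: "title", value: "Final" };
const earlier: Op = { counter: 1, clientId: "bob", key: "author", value: "Bob" };

// Two replicas receive the same operations in different network orders.
const replicaA = materialize([fromAlice, fromBob, earlier]);
const replicaB = materialize([earlier, fromBob, fromAlice]);
```

Both replicas end up with the same state because they sort the concurrent operations the same way before applying them, regardless of arrival order.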
0:26:01 Interesting.
0:26:02 So given that Yjs and Automerge, which I think are in the web ecosystem, the, the
0:26:07 two most popular CRDT implementations, they actually do differ in this dimension
0:26:12 of like how state changes are implemented.
0:26:15 Again, Firebase, as well as Yjs,
0:26:17 following more strictly the CRDT approach, and Automerge using server reconciliation.
0:26:23 Is there an example that comes to mind where this, in an example app
0:26:27 use case would differ and where you would use Automerge or Yjs,
0:26:32 intentionally because of this?
0:26:34 I think in terms of the external
0:26:37 API, or what you see as a user of these libraries, it doesn't really differ.
0:26:41 It's more just in terms of the implementation, I guess, in, in this
0:26:45 totally ordered model like Automerge uses, you don't have to worry as much
0:26:48 about getting the math exactly right.
0:26:50 Like, am I sure that these two operations actually do the same thing
0:26:53 if I apply them in different orders, which is this mathematical requirement
0:26:57 that you have to satisfy for CRDTs.
0:26:59 So that makes it a bit easier in terms of the correctness and
0:27:04 sureness of the implementation.
0:27:06 Whereas with the Yjs or CRDT style, if I'm just going to apply my operations
0:27:10 directly, in principle that can be a bit faster because you don't have to
0:27:14 worry about rewinding your total order of operations and then applying a new
0:27:18 thing and walking it forward again.
0:27:21 That said, usually if you're making a collaborative application with CRDTs,
0:27:24 you don't really need to process more than a handful of operations
0:27:28 every second, so it doesn't matter if it takes a little bit longer.
0:27:31 Got it.
0:27:32 Okay.
0:27:32 That, that makes sense.
0:27:33 So in the CRDT approach, wherever I am currently in my state, I can just apply
0:27:38 on top the existing or the new events.
0:27:41 And, with a server side reconciliation approach, this is where depending
0:27:45 on what the new events are, where they sit in terms of the timeline.
0:27:49 I might need to wind back, apply them, and that might take a little
0:27:54 bit longer, but possibly also makes the implementation a bit easier.
0:27:58 Yeah, I guess just one note.
0:27:59 So, you've been saying server side reconciliation.
0:28:01 Automerge does not actually require a server.
0:28:03 It's a completely decentralized model.
0:28:05 The name is just sort of by analogy to what you would do
0:28:07 if you would have a server.
0:28:09 You would put all the things in the order that the server receives them.
0:28:11 Automerge instead infers a sort of order in a decentralized way.
0:28:15 That makes sense.
0:28:16 So, we've now mostly talked about the state changes side of it.
0:28:21 And we talked about how, optimistically and locally,
0:28:26 the state changes are applied.
0:28:28 But we didn't talk too much about the mutations times mutations quadrant, which
0:28:32 also has a couple of, like, subsections.
0:28:38 Event sourcing
0:28:38 Yeah, so this, this mutations, mutations quadrant, this is sort of the event
0:28:42 sourcing idea where instead of sending around low level changes, we're going
0:28:45 to send around the actual user actions, both from users to the server and
0:28:49 from the server back to other users.
0:28:51 So an example would be like, if you do a find and replace operation, or maybe
0:28:56 you rename a variable in VS code, the operation that you're going to send
0:28:59 to the server actually says, you know, rename this variable from foo to bar.
0:29:03 As opposed to a bunch of low level edits where you go through and change
0:29:06 the actual characters, F O O to B A R, in every place they happen to exist.
0:29:10 So this quadrant is interesting because it gives you a lot more flexibility
0:29:15 in terms of what you can communicate: this really high level intent, like
0:29:20 code refactors or actions in a computer game, and then the server can interpret
0:29:25 that intent in a reasonable way.
0:29:27 You know, applying permissions, or maybe it can see that someone else has also,
0:29:31 you know, added a new reference to that variable,
0:29:33 so it's going to rename that reference as well.
0:29:36 And you can do this a lot more flexibly, as opposed to if you just see the low
0:29:38 level operations and sort of have to
0:29:42 guess what intent that corresponded to.
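To make the rename example concrete, here is a minimal sketch of a high-level mutation that the server interprets against its own copy of the document. The `RenameVariable` type and `applyRename` function are hypothetical names for illustration, and matching whole words with a regular expression stands in for real language-aware refactoring.

```typescript
// Hypothetical mutation type carrying the user's intent rather than character edits.
type RenameVariable = { type: "renameVariable"; from: string; to: string };

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// The server applies the intent to whatever the document looks like on arrival.
function applyRename(doc: string, m: RenameVariable): string {
  const wholeWord = new RegExp("\\b" + escapeRegExp(m.from) + "\\b", "g");
  return doc.replace(wholeWord, m.to);
}

// The client issued the rename when it only knew about the first two lines...
const renameMutation: RenameVariable = { type: "renameVariable", from: "foo", to: "bar" };
// ...but a collaborator's third line, with a new reference, reached the server first.
const serverDoc = "let foo = 1;\nprint(foo);\nlog(foo + food);";
const renamedDoc = applyRename(serverDoc, renameMutation);
```

Because the server sees the intent, the collaborator's new reference to `foo` is renamed too, while the unrelated identifier `food` is left alone. A list of low-level character edits computed on the client would have missed that third line.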
0:29:44 So there's a few systems along these lines.
0:29:47 So one of them, which I link here, and which is not as well known, is called Actyx.
0:29:51 It's actually a company in Europe, which does, like, IoT coordination in factories.
0:29:58 So if you have some, you know, robots moving around a factory floor, they're
0:30:01 talking to each other over the local network and they might say things like,
0:30:05 oh, someone needs to go pick up this box and move it from point A to point B.
0:30:09 One of the robots can say, okay, I'm going to go
0:30:11 pick up this box and move it.
0:30:13 And that way the other robots know not to move it themselves.
0:30:15 And these, these actions or messages, they just get put into a log that
0:30:19 all the devices in the factory see.
0:30:21 And that way they sort of know what's going on, what tasks are
0:30:24 outstanding, that sort of thing.
0:30:26 Right, and I think one very nice benefit of that as well, is that if there's
0:30:31 some real world stuff happening, and whether in a factory a robot has moved,
0:30:37 or you've now like manufactured a new part, or destroyed a certain thing.
0:30:43 Now you have like a real log of those events.
0:30:46 So in case something goes wrong or in case there's an audit, now you have
0:30:50 some hard facts that you can look at.
0:30:52 So it's not just useful for an app and a machine, but it's also useful for human
0:30:57 purposes to understand what has happened.
0:31:00 Exactly.
0:31:01 Yeah.
0:31:01 And this really feeds into the idea of business logic.
0:31:04 You know, in a lot of applications, we have this
0:31:07 business logic that we want to do in terms of, you know, what happens
0:31:10 when a user clicks this button.
0:31:12 And it can often be more complicated than you can express
0:31:15 with simple database changes.
0:31:17 And keeping these actions around gives you a really direct look at what the
0:31:20 business logic was supposed to do and also have the server customize its response.
0:31:25 Like you can check permissions at a very fine grained level.
0:31:28 You can make decisions about, you know, bank balances going below
0:31:31 zero and that sort of thing.
0:31:32 Yeah, sort of nodding to some of Pat Helland's articles, if you've
0:31:35 seen, like, "Building on Quicksand" or "Immutability Changes Everything",
0:31:38 this idea of, you know, accountants don't use erasers, all those ideas.
0:31:43 Yeah, exactly.
0:31:44 And I think for web developers, this is also very intuitive, where if
0:31:48 you build a React app, for example, and you have some complex state
0:31:54 that you express in React useState.
0:31:56 And now you try to somehow do the right thing based on how the state changes
0:32:02 using some React useEffect, for example.
0:32:05 There, you should use better mechanisms and better foundations
0:32:10 for that, for example, using XState for some state machines, et cetera.
0:32:15 This is where you
0:32:16 very explicitly and declaratively deal with the state changes as opposed to
0:32:21 like, trying to somehow, reinterpret how some, like, nitty gritty state
0:32:27 things have changed, whereas, like, if you just have a beautiful, simple
0:32:30 event that is easy to understand, okay.
0:32:33 That thing has changed.
0:32:34 The robot has entered this room.
0:32:37 that's much easier to understand than interpreting the
0:32:40 coordinates of a certain thing.
0:32:43 And this also feeds into features that you might want to give to your
0:32:45 users, especially in productivity apps.
0:32:47 You want to have that change history where you can see what was everyone doing.
0:32:51 You also want to have undo.
0:32:52 Basically what you can do for undo is when you create this action or this mutation
0:32:56 describing, the high level intent, you can also tag along with it, a mutation saying,
0:33:01 here's how to undo this operation later.
0:33:03 And then you store that somewhere and then you just have a queue somewhere in
0:33:06 your app that's like the action queue.
0:33:07 You can go through that and undo things in a nice way.
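The undo idea can be sketched as follows: every action is stored together with the mutation that reverses it. All names here (`Mutation`, `Action`, `actionQueue`, the playlist example) are hypothetical and chosen only for illustration.

```typescript
// Hypothetical high-level mutations for a playlist app.
type Mutation =
  | { type: "addTrack"; playlist: string; track: string }
  | { type: "removeTrack"; playlist: string; track: string };

type Playlists = Record<string, string[]>;

function applyMutation(state: Playlists, m: Mutation): Playlists {
  const tracks = state[m.playlist] ?? [];
  switch (m.type) {
    case "addTrack":
      return { ...state, [m.playlist]: [...tracks, m.track] };
    case "removeTrack":
      return { ...state, [m.playlist]: tracks.filter((t) => t !== m.track) };
  }
}

// Each action carries the mutation that undoes it, created at the same time.
type Action = { forward: Mutation; undo: Mutation };
const actionQueue: Action[] = [];

function perform(state: Playlists, action: Action): Playlists {
  actionQueue.push(action);
  return applyMutation(state, action.forward);
}

function undoLast(state: Playlists): Playlists {
  const last = actionQueue.pop();
  return last ? applyMutation(state, last.undo) : state;
}

let playlists: Playlists = { mix: ["a"] };
playlists = perform(playlists, {
  forward: { type: "addTrack", playlist: "mix", track: "b" },
  undo: { type: "removeTrack", playlist: "mix", track: "b" },
});
// A collaborator's update arrives before the local user hits undo.
playlists = applyMutation(playlists, { type: "addTrack", playlist: "mix", track: "c" });
playlists = undoLast(playlists);
```

After the undo, the playlist contains `a` and `c`: only the local user's own action is reversed, and the collaborator's track survives, which a plain revert to the earlier state would not guarantee.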
0:33:11 Hopefully, you know, the users will be happier with this than if you just, you know, revert states exactly, ignoring collaborators' updates.
0:33:19 Text & list editing
0:33:19 Right.
0:33:20 So, in this quadrant of the event sourcing quadrant here, there's still
0:33:24 a couple of like sub cells, um, how the mutations are applied, namely the
0:33:31 serializable, CRDT ish, and OT ish.
0:33:34 Can you give a little bit of an intuition how they differ in the implementation and
0:33:39 when you would choose one or the other?
0:33:41 Yeah.
0:33:41 So the examples here mostly concern text editing, which is not a coincidence.
0:33:45 So in text editing, when you're doing any sort of collaborative text editing,
0:33:48 like in Google Docs, you have this problem that your operation might say, I
0:33:52 want to type, you know, the word hello.
0:33:55 After, you know, maybe I want to type the word world after hello.
0:33:58 So the, this message that you're going to send to the server might
0:34:01 say something like insert world at index five, because you know, hello
0:34:04 is five characters long, but someone else might also edit this word,
0:34:07 hello, before your change makes it to the server.
0:34:12 So maybe now it's like, hello there world
0:34:15 is what you want to happen.
0:34:16 But your edit is still trying to target index 5, so it's going to
0:34:19 go in sort of the wrong place.
0:34:21 You want it to shift over to accommodate edits that have been
0:34:24 before yours in the document.
0:34:26 And of course, this gets worse if you're, like, editing the bottom
0:34:28 of the document, someone else is editing the paragraph on top.
0:34:31 All of your array indices are going to get horribly messed up
0:34:34 by the time they reach the server.
0:34:35 Like, they're not going to be accurate anymore.
0:34:37 So the three choices here are basically different ways to patch up those
0:34:41 indices so that they make sense again.
0:34:43 That makes sense.
0:34:44 And, I think this is a common theme for local-first software is that
0:34:50 there are a couple of like special buckets that deserve special treatment,
0:34:55 namely text editing and also lists.
0:34:58 And those, the, the latter two are, I think also like closely related.
0:35:03 So on that note, in the article, you also went a bit more in depth
0:35:08 on possible approaches to tame lists in this distributed setting.
0:35:14 Would you mind sharing a little more context about that?
0:35:17 Sure.
0:35:18 Yeah, so the list, as you said, it's hard, and it's hard specifically
0:35:22 because of this index problem where your obvious choice for what operations
0:35:25 you're going to send over the network often don't make sense anymore by
0:35:28 the time they reach the server.
0:35:30 Um, and the solutions really fall into two camps.
0:35:33 There's the operational transformation camp, which is used by Google Docs,
0:35:37 which is where you're going to send, you know, index five, that sort of
0:35:41 thing, a raw number, and the server is going to look at this index.
0:35:44 It's going to look at all the intervening operations that arrived
0:35:48 that you didn't know about, but have already reached the server.
0:35:50 And it's going to sort of like walk through those one by one to try to
0:35:54 figure out what index you actually meant.
0:35:56 Because it's going to see, okay, if you inserted something at index five and
0:36:00 three other characters have been inserted before that, I'm going to change it from
0:36:04 five to eight, just adding five and three.
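The index rewriting described here can be sketched for the insert-only case. This is a deliberate simplification under stated assumptions: the `Insert` type and function names are hypothetical, and real operational transformation systems also handle deletes and more careful tie-breaking when two inserts target the same index.

```typescript
type Insert = { index: number; text: string };

// Shift `op` past one insert that the server applied first and the sender had not seen.
function transform(op: Insert, applied: Insert): Insert {
  return applied.index <= op.index
    ? { index: op.index + applied.text.length, text: op.text }
    : op;
}

// Walk through the intervening operations one by one.
function rebase(op: Insert, intervening: Insert[]): Insert {
  return intervening.reduce((current, applied) => transform(current, applied), op);
}

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.index) + op.text + doc.slice(op.index);
}

// The sender saw "hello" and wants " world" at index 5.
const clientOp: Insert = { index: 5, text: " world" };
// Meanwhile the server already applied a three-character insert at the front.
const intervening: Insert[] = [{ index: 0, text: "oh " }];
const serverText = "oh hello";

const rebased = rebase(clientOp, intervening);
const resultText = applyInsert(serverText, rebased);
```

The rebased operation targets index 8 instead of 5, so the text ends up as "oh hello world" rather than landing in the middle of "hello".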
0:36:06 Got it.
0:36:07 So a very common app use case for this is, let's imagine Notion where
0:36:11 on the left sidebar, you can have your, your favorite, pages pinned
0:36:17 and for those you control the order.
0:36:19 So you can move them around or also on a Notion page, all the
0:36:23 blocks you can reorder yourself.
0:36:26 And a very naive approach would be, whenever you reordered something, you send
0:36:32 to the server a full copy of the entire document, and that contains the order.
0:36:37 But that is not very useful in the collaborative setting where
0:36:41 now the merge radius of the entire thing is the document.
0:36:44 And it doesn't really allow for collaboration on a per block level.
0:36:48 And another naive approach would be to send the block and say
0:36:54 like, oh, now I'm at position three.
0:36:57 But something else might've already, moved and it's no
0:37:00 longer in reality position three.
0:37:02 So this is what this is all about and, uh, the different approaches for this.
0:37:06 Figma has written also a really nice blog post about this, how they,
0:37:11 tamed this problem, where I think they call it fractional indexing.
0:37:14 And I think you connected the dots here.
0:37:17 can you, draw a line between the different approaches here, the CRDTish approach
0:37:27 Fractional indexing
0:37:27 Yeah.
0:37:27 So the, the OT ish approach, that's what I was describing with, you know, you send
0:37:32 index five to the server, but the server's going to rewrite it to index eight.
0:37:36 So this is really this idea that the server is going to
0:37:39 mutate your operation to try to make it still make sense.
0:37:43 Then the CRDT ish approach, which is used by fractional indexing and Yjs and
0:37:46 those sort of things, is actually the clients, instead of sending, you know,
0:37:50 index 5 to the server, they're going to rewrite this message in a way so
0:37:54 that it still makes sense, even if it reaches the server a little bit late.
0:37:58 So, for example, you could have, in fractional indexing, you might label
0:38:02 your characters with these decimal numbers instead, where you say, like,
0:38:05 the characters are at 0.1, 0.2, 0.3, etc.
0:38:09 And then if you want to add a new character in between 0.4 and 0.5,
0:38:13 you give it the label 0.45.
0:38:16 So this isn't really a list index, it's what they call a fractional index.
0:38:19 And the idea is that this will still go in between the characters at 0.4 and 0.5,
0:38:22 even if some other changes happen elsewhere in the list.
0:38:27 Because those other changes don't actually change your fractional index.
0:38:29 You're keeping the characters at the same 0.4, 0.5, 0.6, etc.
0:38:34 Right.
0:38:34 And now the 0.45, this is what you use to derive the real
0:38:40 integer indexes from, by lexicographically ordering it.
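A minimal sketch of fractional indexing with the numbers from the conversation follows. It uses plain floating-point numbers for brevity, which is an assumption of this example: production implementations typically use string keys compared lexicographically so that precision never runs out, and they need an extra tie-breaker when two clients pick the same position.

```typescript
type Item = { pos: number; char: string };

// A new position strictly between two neighbours.
function between(left: number, right: number): number {
  return (left + right) / 2;
}

// The visible order (and thus the integer indexes) is derived by sorting on position.
function render(items: Item[]): string {
  return items
    .slice()
    .sort((a, b) => a.pos - b.pos)
    .map((item) => item.char)
    .join("");
}

const items: Item[] = [
  { pos: 0.3, char: "a" },
  { pos: 0.4, char: "b" },
  { pos: 0.5, char: "d" },
];

// One client inserts "c" between the characters at 0.4 and 0.5.
items.push({ pos: between(0.4, 0.5), char: "c" });
// Concurrently, another client inserts "x" earlier in the list.
items.push({ pos: between(0.3, 0.4), char: "x" });

const rendered = render(items);
```

The concurrent insert of `x` does not disturb where `c` lands, because no existing item's position changed: the result reads "axbcd".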
0:38:45 Got it.
0:38:45 Yeah.
0:38:45 So I'm using the same mechanism inspired by the ideas of like the, the Figma
0:38:51 blog posts, et cetera, for Overtone.
0:38:53 And I'm even using it before I started implementing syncing, just because
0:38:57 I found it to be the easiest way to keep a list ordered in an
0:39:03 event-sourced system, since this is what I'm also already using to circumvent schema
0:39:07 migrations for the, the app I'm building.
0:39:10 So it's, I think it's actually a very simple self contained concept that can
0:39:15 be applied even outside of the scope of a full blown local-first data stack.
0:39:20 Yes, exactly.
0:39:21 Yeah.
0:39:22 It turns out that what
0:39:23 text editing CRDTs are doing is very similar to fractional indexing,
0:39:27 just with some extra changes to solve some bugs, basically, like
0:39:30 what happens if two people try to insert a character at the same place.
0:39:33 Fractional indexing breaks down; CRDTs just have the smallest change
0:39:36 needed to make this not break down.
0:39:38 I agree with your point that this isn't really a collaborative thing.
0:39:41 This is just a general data structures thing.
0:39:43 It's like the way we describe text as an array is sort of flawed, because
0:39:48 array indexes are changing all the time, even though the character is staying
0:39:52 the same and staying in the same place in an
0:39:54 intuitive sense.
0:39:56 So what we really want is an abstraction where the characters keep the same
0:39:59 identifier at all times, whether that's a fractional index or whether it's part of
0:40:03 the list CRDT internals, and then that's how we should represent sequences that
0:40:07 can move around, which is basically any list in a GUI where you can drag something
0:40:11 in between two existing elements.
0:40:13 Combining approaches
0:40:13 That makes a lot of sense.
0:40:14 And what's also so cool about like seeing all of the different options in this
0:40:19 classification table is that you don't have to choose exactly one for your app.
0:40:24 What I'm planning to do for Overtone is mostly follow the event
0:40:28 sourcing idea for collaborative state.
0:40:30 However, in the places where I have
0:40:33 complex, particular problems such as a description text or like a document
0:40:39 text, this is where I most likely will resort to something like Automerge
0:40:43 or Yjs to let those technologies deal with the text editing, the
0:40:49 collaborative text editing use case.
0:40:51 And with that, I think I get the best of both worlds, where I get
0:40:57 all the benefits from event sourcing for the, the more high level data structure
0:41:02 of my app and for the specificness of the text editing, I embed a little CRDT
0:41:08 use case in the broader document use case that I tame with event sourcing.
0:41:14 Do you think that general approach makes sense?
0:41:16 Yes, that's exactly the way to do it.
0:41:18 Yeah, if you look, there's a lot of, you know, blog posts talking
0:41:20 about how CRDTs are complicated or they're hard to implement.
0:41:23 Usually these blog posts are talking specifically about the text editing part.
0:41:27 That's sort of the hard part where you want to let someone else do
0:41:29 it and have their nice battle tested, fuzz tested implementation.
0:41:32 But for other data structures, like if you have, you know, sort of a database table
0:41:37 sort of structure or a map structure, it's easier to make your own sync engine for
0:41:42 that and just drop in an existing library to handle the lists and text editing.
0:41:46 Right.
0:41:46 So it's funny that you came from going super deep on CRDTs to spanning
0:41:52 this broader table of possibilities.
0:41:56 And it seems like now you're actually much more drawn
0:41:59 to the first quadrant around event sourcing. What led
0:42:02 to this interest for you?
0:42:05 Let's see.
0:42:05 So it might just be, you know, the grass is greener on the other side.
0:42:08 I haven't tried to make an app or a library using the
0:42:11 event sourcing approach yet.
0:42:12 So maybe I just don't know what's wrong with it.
0:42:14 But it really started out about a year ago.
0:42:16 I was thinking about version control.
0:42:19 This was around the same time that Ink & Switch was thinking about version
0:42:21 control with their Upwelling essay.
0:42:24 And the idea was like, what if we could do this Git style model where
0:42:28 you make changes to an app, like a text document or a spreadsheet.
0:42:31 We just put these into linear branches, and then when we merge them, you
0:42:35 copy from one branch to another.
0:42:37 And originally, the idea is we're going to put CRDT operations in these
0:42:40 branches, because that's what I'm familiar with, but I eventually realized like,
0:42:43 actually, because the branches put the operations in a total order anyway, we
0:42:47 don't care about the CRDT correctness properties that say that you can
0:42:51 apply operations in different orders.
0:42:53 So we might as well just use arbitrary operations.
0:42:55 And that unlocks a whole lot of possibilities that would have been
0:42:59 hard to do in a CRDT system, like you can do these rename variable or find
0:43:03 and replace operations, maybe even like a change tone with AI operation.
0:43:07 Just put these in a log, have the log be in a fixed order, and
0:43:11 run the operations in that order.
0:43:12 That makes sense.
0:43:13 So aside from the versioning use case, can you think how using a CRDT approach
0:43:20 versus an event sourcing approach might be a good or a bad fit for different
0:43:26 categories of apps that you can think of?
0:43:29 Sure.
0:43:29 Yeah, so I think the advantages of a CRDT approach, well first off,
0:43:33 you can do this more database model.
0:43:34 If I'm going to put my data in a magic box that says database, and
0:43:37 it's going to synchronize it for me, I don't have to worry about it.
0:43:39 Whereas if you're doing an event sourcing approach, you have to think
0:43:42 more carefully about what are my mutations that I'm sending around?
0:43:46 How do I process them?
0:43:47 How do I make sure that they still make sense, even if someone else's
0:43:50 mutation reach the server first?
0:43:52 So that's a bit harder.
0:43:54 the other advantage of CRDTs is the efficiency perspective.
0:43:57 You can have, the CRDTs can implement operations in a very efficient way so
0:44:01 that you're not going to accidentally say, you know, I'm sending this mutation
0:44:06 to the server that's going to take
0:44:07 an entire second to process and is going to slow everyone down.
0:44:10 It's sort of the general trade-off that CRDTs behave more like a database.
0:44:14 They just work and they're optimized to be fast.
0:44:17 Whereas with an event sourcing model, you get flexibility.
0:44:21 You can send arbitrary mutations around, you can have arbitrary business
0:44:25 logic on the server, it can even differ from the logic on the clients.
0:44:29 Just coming back to the video game example, you have a lot of logic that
0:44:32 the server needs to step through, checking permissions, checking
0:44:34 collisions, that sort of thing.
0:44:36 Which would be hard to do with a CRDT or with a database model.
0:44:40 Event sourcing challenges
0:44:40 So you mentioned that you haven't yet built larger systems with the
0:44:45 event sourcing approach, but I think you've still done a little
0:44:48 bit of research on what might await you in the event sourcing world.
0:44:53 So could you outline a little bit of like the potential concerns
0:44:57 you see on the horizon when going all in on event sourcing?
0:45:01 Yeah, so I guess the main concern always is if you're sending around
0:45:06 this log of events to clients.
0:45:09 And if you're storing this as your single source of truth, then
0:45:13 storing all these events forever, it might take up a lot of space.
0:45:15 If you could imagine a text document, if each text character corresponds to 100
0:45:20 bytes of JSON, then the history of all the events is going to be a hundred times
0:45:25 bigger than the actual text document.
0:45:26 Even if you've since cleared out the entire text document, now it's empty.
0:45:29 You still have all this state.
0:45:30 So that's the main challenge is just how do we store the events efficiently, how
0:45:35 do we maybe compact them, say I don't need these events anymore, I'm going to
0:45:38 throw them away and replace the state, while still making that play nicely
0:45:42 with, you know, clients who have been offline for a month, that sort of thing.
0:45:45 Which sort of mechanisms do you think will mostly help to
0:45:49 overcome some of those issues?
0:45:51 I'm hoping the main mechanism is just to give up: basically say text is
0:45:56 very small, and the main sources of lots of data in your app are
0:46:01 blobs like images or videos, which you can put somewhere else anyway.
0:46:05 And then for the actual events describing the fine grained changes, just store
0:46:08 them all and it's only going to be a few megabytes per document anyway.
0:46:11 Got it.
0:46:13 Yeah.
0:46:13 And I think on top of that, there's also the compaction use case.
0:46:17 Now that I have a little bit more, insight on, on that
0:46:21 approach with building Overtone.
0:46:23 For example, given that everything you do within Overtone, whether it's playing
0:46:28 a track, whether it's navigating within the app, whether it's adding a track
0:46:31 to your playlist or following an artist, all of those are events. And adding
0:46:39 a track to a playlist, there you do a lot less of those than, for example,
0:46:45 in the background, the app auto playing the next track, which is also an event.
0:46:52 And another kind of event is if the app tries to authenticate with a music service
0:46:58 such as Spotify to exchange tokens, which it needs to do at least once an hour.
0:47:05 So it does so a little bit ahead of time.
0:47:07 So, also when you reload the app, it needs to do that.
0:47:11 So just by the fact of the app running in the background over time, it racks
0:47:18 up quite a lot of different events.
0:47:21 And I think there the interesting part is that
0:47:25 the nature of those events also allows for different trade-offs.
0:47:28 So me putting a track into a playlist, A, there's going to be
0:47:33 like way fewer events of those.
0:47:35 and it's fine to keep the entire history of this around.
0:47:38 What's so cool about this also is the fact
0:47:41 that having this event allows me to trivially implement a feature like this:
0:47:46 I can hover over the track and I see the information of when it was added and by
0:47:51 whom it was added to the playlist.
0:47:53 It also makes implementing things such as undo much easier, but the other kind
0:47:59 of events, which might be implicit or which might just come in much higher
0:48:05 quantity, what I've seen is that it's not as crucial to keep those events
0:48:11 around for eternity, but some of those events are then also made irrelevant by
0:48:17 follow up events of the, the same type.
0:48:20 So for example, if your app has authenticated and overwrites sort of like
0:48:24 an auth state in the database, and
0:48:27 two hours later, it has already done so 10 more times.
0:48:31 I don't need to keep the entire history before that, maybe besides auditing
0:48:35 reasons, so I can just at some point remove the old events, which keeps
0:48:41 an otherwise always growing event log at a, for this given event type
0:48:47 at a much more like constant size, which makes it much more feasible.
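This kind of compaction can be sketched as "keep everything, except that for certain event types only the newest event survives". The event names, the `supersedable` set, and the assumption that the log is already ordered by `seq` are all hypothetical and only serve the illustration.

```typescript
type AppEvent = { seq: number; type: string; payload: string };

// Hypothetical set of event types where a newer event makes older ones irrelevant.
const supersedable = new Set<string>(["authTokenRefreshed"]);

// Assumes `log` is ordered by `seq`. Keeps full history for all other event types.
function compact(log: AppEvent[]): AppEvent[] {
  const latestSeq = new Map<string, number>();
  for (const e of log) {
    if (supersedable.has(e.type)) latestSeq.set(e.type, e.seq);
  }
  return log.filter(
    (e) => !supersedable.has(e.type) || latestSeq.get(e.type) === e.seq
  );
}

const fullLog: AppEvent[] = [
  { seq: 1, type: "authTokenRefreshed", payload: "token-1" },
  { seq: 2, type: "trackAddedToPlaylist", payload: "song-a" },
  { seq: 3, type: "authTokenRefreshed", payload: "token-2" },
  { seq: 4, type: "trackAddedToPlaylist", payload: "song-b" },
  { seq: 5, type: "authTokenRefreshed", payload: "token-3" },
];

const compacted = compact(fullLog);
```

The playlist events keep their full history, which is what makes "added when, by whom" features cheap, while the token refreshes collapse to the single most recent one, so that part of the log stays at a constant size.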
0:48:52 Another thing that I, started thinking about is like, what if you have not
0:48:57 just like one event log, but what if you have multiple event logs?
0:49:01 And what if you have, a hierarchy of event logs?
0:49:04 This is something that I also want to think a little bit more about. Let's
0:49:08 say you have a, a tree of, playlists, like a, a folder of playlists.
0:49:13 So you have a, a playlist.
0:49:15 And that playlist could also, possibly be a folder of other playlists.
0:49:20 So now what does the event log exist for?
0:49:23 Does it exist for like, everything in my library?
0:49:26 Or is it broken down to
0:49:30 only giving information about which playlists I have, and then I need to
0:49:34 subscribe to another playlist, but what if that playlist is a folder?
0:49:38 So this hierarchical aspect of it, I think this will keep me busy
0:49:42 for, for a little bit as well.
0:49:43 Do you have thoughts on those problems?
0:49:46 Yeah, I mean, this, the, what you're saying is really interesting.
0:49:48 It makes me think of the problem of ephemeral presence.
0:49:52 So, you know, in Figma, when your collaborators are moving their
0:49:54 mouse cursors around, you can see where they are at all times.
0:49:58 I would imagine Figma is not actually persisting those mouse movements,
0:50:01 it's just sending them over the usual channels so that you can see them live,
0:50:04 but then you forget about these events because they don't matter anymore.
0:50:07 So I wonder if you could maybe do that for a lot of the events that don't
0:50:11 matter as much, or even in a text editor.
0:50:13 So one thing that's really hard with a collaborative text editor is you'd like it
0:50:17 so that whenever you press a key, that key is immediately sent to your collaborators.
0:50:21 But if that actually creates an event that's persisted in the log, then you have
0:50:24 this issue of, you know, 100 times as much storage as key presses, but maybe what
0:50:28 you could say is when you press a key, that's like an ephemeral presence message.
0:50:32 It's not actually stored, it's just sent over the same
0:50:34 channel as the mouse movements.
0:50:36 And this is sort of like an ephemeral mini log that's stacked on top of the actual
0:50:40 event log, and then every 10 seconds or so you send a compacted version of
0:50:44 like the entire sentence that the person typed as a single event, and that's
0:50:47 what's actually stored on the backend.
0:50:49 I wonder if that could help at all, or if this is even possible to implement.
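One way this idea might look in code is sketched below. The speaker explicitly wonders whether it is feasible, so treat it as a thought experiment: `createTypingBuffer`, `broadcast`, and `persist` are hypothetical names, and in a real app `flush` would be driven by a timer (for example every 10 seconds) rather than called by hand.

```typescript
type PersistedEvent = { type: "textInserted"; text: string };

function createTypingBuffer(
  broadcast: (key: string) => void, // ephemeral channel, nothing is stored
  persist: (e: PersistedEvent) => void // durable event log
) {
  let pending = "";
  return {
    // Every key press is shared live and remembered only in memory.
    keyPressed(key: string): void {
      broadcast(key);
      pending += key;
    },
    // Periodically, the buffered keys become a single persisted event.
    flush(): void {
      if (pending === "") return;
      persist({ type: "textInserted", text: pending });
      pending = "";
    },
  };
}

const seenLive: string[] = [];
const eventLog: PersistedEvent[] = [];
const typingBuffer = createTypingBuffer(
  (key) => seenLive.push(key),
  (e) => eventLog.push(e)
);

"hello".split("").forEach((key) => typingBuffer.keyPressed(key));
typingBuffer.flush();
typingBuffer.flush(); // nothing pending, so nothing is persisted
```

Collaborators see all five key presses as they happen, but the durable log grows by one event instead of five. The sketch leaves open the hard part raised in the conversation: how a client that missed the ephemeral messages reconciles with the compacted event.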
0:50:52 Right.
0:50:53 I've actually implemented a small version of that already,
0:50:57 which I call local only events.
0:50:59 The idea of that is that, there's kind of like hierarchies of syncing as well.
0:51:05 There's like syncing, just from the main thread to the worker thread, which is
0:51:11 responsible for persisting the data, but also from one tab to another tab.
0:51:18 And those two tabs should, in some regards, converge, and in
0:51:23 some regards, allow divergence.
0:51:25 So for example, if you have Notion open in two tabs, you want to be able to navigate
0:51:32 to different documents in those different tabs, but if you're in the same document,
0:51:36 you probably want to see the same thing.
0:51:38 So it's the same that applies to a music app.
0:51:41 Maybe in one tab you want to have
0:51:43 the playback of one track, and in the other one, you want to not have the same
0:51:48 playback, otherwise you hear it twice,
0:51:50 but you want to maybe work on a playlist.
0:51:53 And so keeping things in sync is important, but I don't want to,
0:51:58 constantly as the playback progresses, have persistent events for this.
0:52:02 So I try to, A, have very deliberately small events.
0:52:08 And the other thing is where I have events that are broadcasted around.
0:52:12 But, if the app reloads, it doesn't rehydrate from those.
0:52:16 It either catches them midway or it's not important enough
0:52:21 that it shows it, so it's very similar to the presence feature in Figma.
0:52:25 So I have implemented a first version of this, but I think there can be
0:52:29 use cases where you might want to keep them around for like 10 minutes
0:52:34 or 10 seconds, like you say, and then have a version of compaction.
0:52:37 I think that that's really interesting.
0:52:40 What you're describing sounds really cool.
0:52:41 I'll be interested to see this code someday.
Local-first ideals are still hard to reach
0:52:47 So you've now been in the local-first space for over five
0:52:50 years, and I'm sure you've seen many technologies come along over time.
0:52:56 I'm curious whether you have certain strong opinions about the local-first
0:53:00 space or the web ecosystem more broadly.
0:53:02 Yes, I guess one.
0:53:04 Well, this isn't really an opinion, but just I'll make an observation that the
0:53:06 local-first movement has really exploded just within the past 12 or 18 months.
0:53:11 Like, starting out five years ago reading CRDT papers and going to CRDT
0:53:15 conferences, it was much more, you know, mellow academic atmosphere.
0:53:19 But now there's just so many tools popping up, I can't keep
0:53:21 track of them in my browser tabs.
0:53:23 You know, the local-first Discord, all that stuff.
0:53:25 Just a lot more activity.
0:53:26 So it's both exciting and also a bit scary, because now I can't read all
0:53:29 the papers that come out anymore.
0:53:31 Yeah, in terms of opinions, I guess the strong opinion I've had in the
0:53:35 past year or so is that the local-first ideal, I think, is too hard right now.
0:53:41 There's just too many problems we'd have to solve to actually make like
0:53:43 a local-first app where the hosting provider can go away and you'll still be
0:53:47 able to collaborate and keep your data.
0:53:49 So the problem that I've been focusing on for the past year is the narrow
0:53:53 goal, like the baby step, of how do we make traditional central server SaaS
0:53:58 collaboration easier to implement, and maybe a bit easier to deploy.
0:54:02 So that's working on primitives like what you were describing with LiveStore.
0:54:05 We want some way to have events that you send around and persist to IndexedDB,
0:54:10 send over a broadcast channel between different tabs, and then eventually send
0:54:13 to a server that stores them and broadcasts them back to the client.
0:54:16 Just make some really good implementation of that that people can reuse so they
0:54:19 don't have to reinvent it every time.
0:54:22 And I think that'll be
0:54:23 both useful for, you know, developers and also a good stepping stone
0:54:26 towards the eventual goal of we want to get rid of this server and
0:54:29 have our data forever.
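The pipeline described above can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not a reusable implementation: `LocalStore` stands in for IndexedDB, the tab listeners stand in for a broadcast channel, and `Server` stands in for the central server. All class and method names are hypothetical.

```typescript
interface SyncEvent {
  id: string;
  payload: string;
}

// Stand-in for IndexedDB: an in-memory map keyed by event ID.
class LocalStore {
  private readonly events = new Map<string, SyncEvent>();
  has(id: string): boolean { return this.events.has(id); }
  put(event: SyncEvent): void { this.events.set(event.id, event); }
  all(): SyncEvent[] { return [...this.events.values()]; }
}

// Stand-in for the central server: stores events and broadcasts them back.
class Server {
  private readonly log: SyncEvent[] = [];
  private readonly clients = new Set<Client>();
  connect(client: Client): void { this.clients.add(client); }
  receive(event: SyncEvent): void {
    if (this.log.some((e) => e.id === event.id)) return; // ignore duplicates
    this.log.push(event);
    for (const client of this.clients) client.onServerEvent(event);
  }
}

class Client {
  readonly store = new LocalStore();
  private readonly pending: SyncEvent[] = [];
  private readonly tabListeners: Array<(e: SyncEvent) => void> = [];
  private readonly server: Server;

  constructor(server: Server) {
    this.server = server;
    server.connect(this);
  }

  // Stand-in for other tabs listening on a broadcast channel.
  onTab(listener: (e: SyncEvent) => void): void {
    this.tabListeners.push(listener);
  }

  // Local write: persist, notify other tabs, and queue for the server.
  commit(event: SyncEvent): void {
    this.apply(event);
    this.pending.push(event);
  }

  // Called when a connection is available: push queued events upstream.
  flush(): void {
    while (this.pending.length > 0) this.server.receive(this.pending.shift()!);
  }

  onServerEvent(event: SyncEvent): void {
    this.apply(event);
  }

  // Idempotent: an event echoed back by the server is applied only once.
  private apply(event: SyncEvent): void {
    if (this.store.has(event.id)) return;
    this.store.put(event);
    for (const listener of this.tabListeners) listener(event);
  }
}
```

Because `commit` writes locally before anything reaches the server, the client keeps working offline; `flush` is the only step that needs a connection.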
0:54:31 I love that observation, and that opinion.
0:54:34 I think that's also one of my key takeaways from talking to many folks
0:54:38 at the local-first conference we had this year in Berlin, where everyone
0:54:42 gets excited about all the goals and all the ideals of local-first, but
0:54:48 going after a few of those already is technically very complicated.
0:54:54 And then going like all the way to making sure that the software still
0:54:58 works if the vendor goes away, etc.
0:55:01 That is, I think, right now achieved by only a very small
0:55:07 set of products and technologies.
0:55:09 I hope that in five years from now, it will be table stakes.
0:55:13 But, I think it's a little bit like Maslow's hierarchy of needs.
0:55:17 And here we have the hierarchy of ideals, and we haven't yet
0:55:21 quite made it easy to achieve all of it. Hopefully we'll get closer
0:55:26 to that over the next couple of years.
0:55:29 So those technologies that you've now mentioned, is there anything
0:55:37 List-positions
0:55:37 Let's see.
0:55:37 So the main project I've had recently is a library called list-positions.
0:55:42 So you can read about it on my blog post or look at the docs on GitHub.
0:55:45 But it's basically trying to solve this fractional index generalization problem.
0:55:49 You can think of it like a fractional index library that also
0:55:52 implements the extra features that CRDTs have to prevent some bugs.
0:55:57 The idea is that you can use this as a drop in part to do just the text and
0:56:01 list collaboration in some arbitrary data structure. So I built examples on top
0:56:06 of Triplit, Electric SQL, Replicache.
0:56:09 So these are collaborative data stores that don't talk
0:56:11 about lists or texts at all.
0:56:13 They're basically syncing maps or database tables.
0:56:15 And I said here, if we just stick these souped up fractional
0:56:18 indices on top, we can actually do text to rich text collaboration.
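To illustrate the basic idea of putting fractional indices on top of a map or table store, here is a deliberately naive sketch. It is not the list-positions API and has none of its extra CRDT-style safeguards: it assumes plain numeric positions between 0 and 1, which run out of floating-point precision after many inserts in the same spot, and it only breaks ties between equal positions by row ID. The `FractionalList` class and its methods are hypothetical.

```typescript
interface Row {
  pos: number; // fractional index, strictly between 0 and 1
  value: string;
}

class FractionalList {
  // Stand-in for a synced map or database table keyed by row ID.
  private readonly table = new Map<string, Row>();

  // Rows ordered by position, with the row ID as a tie-breaker.
  private sortedRows(): Array<[string, Row]> {
    return [...this.table.entries()].sort(([idA, a], [idB, b]) => {
      if (a.pos !== b.pos) return a.pos - b.pos;
      return idA < idB ? -1 : idA > idB ? 1 : 0;
    });
  }

  // Insert at a list index by picking the midpoint of the neighbors' positions.
  // Only one row is written; no existing row has to be renumbered.
  insertAt(index: number, id: string, value: string): void {
    const rows = this.sortedRows();
    const before = index > 0 ? rows[index - 1][1].pos : 0;
    const after = index < rows.length ? rows[index][1].pos : 1;
    this.table.set(id, { pos: (before + after) / 2, value });
  }

  text(): string {
    return this.sortedRows().map(([, row]) => row.value).join("");
  }
}
```

Since each insert writes a single row keyed by its own ID, a store that only syncs maps or tables can carry the list without knowing anything about ordering.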
0:56:23 Outro
0:56:23 Very interesting.
0:56:24 I will check this out.
0:56:25 Maybe I can use it for Overtone.
0:56:27 Maybe I could even integrate it with LiveStore.
0:56:30 I will certainly check this out and we'll put the link in the show notes.
0:56:34 Great.
0:56:34 Matthew, is there anything else you want to share with the audience?
0:56:38 No, I don't think so.
0:56:39 It's been a really good chat.
0:56:40 Thank you so much for sharing all of your knowledge about different
0:56:44 approaches to syncing state.
0:56:46 I think this is the most in depth we've gone on those topics so
0:56:49 far, and it provided a brilliant overview for future conversations.
0:56:53 It has helped me a ton to better understand this, both your blog
0:56:57 posts as well as this conversation.
0:56:59 So thank you so much for taking time today and coming on to chat.
0:57:03 Yeah, thanks so much for having me.
0:57:04 Thank you for listening to the local-first FM podcast.
0:57:07 If you've enjoyed this episode and haven't done so already, please subscribe and
0:57:10 leave a review wherever you're listening.
0:57:12 Please also share this episode with others.
0:57:15 Spreading the word about the podcast is a great way to
0:57:17 support it and to keep it going.
0:57:19 A special thanks again to Rocicorp and Expo for supporting this podcast.
0:57:24 See you next time.