0:35:13 Right.
0:35:13 And I definitely did hit it and.
0:35:16 You will find in the resource, it's
not addressing that issue right now.
0:35:20 It's in a very naive state: when a change
is made, send a fetch call to the server.
0:35:26 And there are a few problems with
that, even if you're the only client.
0:35:30 Because first off, if you send one event
per request, it's very possible that you
0:35:35 send a lot of events in very quick succession,
in a given order, and then they reach the
0:35:41 server in a different order because maybe
you pushed the first change on low network
0:35:47 latency, the next one on high latency,
or actually the reverse of that where
0:35:50 the first change hits after the second
change because that's just the speed of
0:35:54 the network and that's what happened.
0:35:56 And also you need to think
about offline capabilities.
0:36:00 If you push changes when they
happen, how do you queue them
0:36:04 when you're not connected to the
internet and then run through that
0:36:07 queue once you're back online?
0:36:09 That's another consideration
you kind of have to think about.
0:36:12 Could be solved with just like
an in-memory event log and
0:36:15 just kind of work with that.
0:36:16 But you still have the order issue.
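
A minimal sketch of the in-memory event log idea mentioned here, assuming a hypothetical `send` callback that returns True once the server has acknowledged an event. The names `OutboundQueue`, `record`, and `flush` are illustrative and not taken from the project being discussed. Sending one event at a time and stopping at the first failure keeps a single client's events in order, but it does nothing about ordering across several clients.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class OutboundQueue:
    """Hypothetical client-side queue that buffers events and flushes them in order."""

    def __init__(self, send: Callable[[Dict], bool]) -> None:
        self._send = send  # assumed to return True when the server acknowledged
        self._pending: Deque[Dict] = deque()
        self._next_seq = 1

    def record(self, payload: str) -> None:
        # Every local change gets a per-client sequence number.
        self._pending.append({"seq": self._next_seq, "payload": payload})
        self._next_seq += 1

    def flush(self) -> int:
        # Send the oldest event first; stop at the first failure so order is kept.
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # still offline: keep this event and everything after it
            self._pending.popleft()
            sent += 1
        return sent


# Usage with a fake network that can be toggled on and off.
network = {"online": False}
received: List[Dict] = []


def fake_send(event: Dict) -> bool:
    if not network["online"]:
        return False
    received.append(event)
    return True


queue = OutboundQueue(fake_send)
queue.record("title = Draft")
queue.record("body = TBD")
first_flush = queue.flush()   # offline, nothing leaves the queue
network["online"] = True
second_flush = queue.flush()  # back online, both events go out in order
```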
0:36:19 I'm familiar with atomic
clocks as a method to do this.
0:36:23 There are even SQLite extensions
that'll sort of enforce that. Having
0:36:28 not implemented atomic clocks myself,
0:36:30 is it kind of a silver bullet
for that problem, or are there more
0:36:34 considerations to think about than
just reaching for something like that?
0:36:38 Right.
0:36:39 I suppose you're referring
to vector clocks or logical
0:36:42 clocks at a higher level?
0:36:43 Yeah.
0:36:44 Because an atomic clock, at least in
my understanding, is the thing that,
0:36:47 on some super high-end
hardware,
0:36:51 actually gives
us the wall clock time.
0:36:56 So right now it's,
0:36:58 uh, 6:30 PM my time, but
this clock might drift, and this
0:37:03 is what makes it so difficult.
0:37:04 So what you were referring to is logical
clocks. This is where, basically,
0:37:09 instead of saying, "Hey, it's
6:30 in this time zone," which makes
0:37:14 everything even more complicated, I'm
keeping track of my time as 1, 2, 3.
0:37:20 It might just be a logical
counter, which is actually much simpler
0:37:24 than wall clock time.
0:37:26 This is easier to reason about,
and there are no weird issues
0:37:31 like daylight saving, where suddenly
the clock is going backwards,
0:37:36 or someone tinkering with the time.
This is why you need logical clocks.
0:37:40 And they're at least the mechanism
that I've also landed on
0:37:44 to impose a total order.
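
To make the "1, 2, 3" counter concrete, here is a minimal sketch of a Lamport-style logical clock. It is an assumption that this matches what either speaker uses; the class and method names are illustrative. The counter goes up on every local event and jumps past any larger value seen on an incoming message, so it never goes backwards the way a wall clock can.

```python
class LamportClock:
    """Illustrative logical clock: a counter instead of wall clock time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Called for every local event.
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        # Called when a message stamped with remote_time arrives.
        self.time = max(self.time, remote_time) + 1
        return self.time


# Usage: client A makes two changes, client B then hears about the second one.
a = LamportClock()
b = LamportClock()
a1 = a.tick()
a2 = a.tick()
b1 = b.receive(a2)  # B moves past everything it has heard about
b2 = b.tick()
```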
0:37:47 But then it's also tricky:
how do you exchange that?
0:37:50 How does your client know what
"three" means in my client, et cetera?
0:37:54 And the answer that I found to
this is that we all trust
0:38:00 a single authority in the system.
0:38:02 So this is where, and I think this is also
what you're going for with the Git
0:38:07 analogy, what we are trusting as the authority
in that system is GitHub or GitLab.
0:38:13 And this is where,
theoretically, you could
0:38:17 send me your IP address and I could
try to pull directly from you.
0:38:20 It would work, and that would also
work with the system that you've built.
0:38:25 However, there might still be what are
called network partitions,
0:38:29 where the two of us have
synced up, but some others haven't.
0:38:33 So as long as we're all connected to
the same main upstream node, that
0:38:39 is the easiest way to model this.
0:38:41 An alternative would be to go full on
peer to peer, which makes everything
0:38:46 a lot, lot, lot more complicated.
0:38:49 And this is where like something, like
an extension of logical clocks called
0:38:53 vector clocks, can come in handy.
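
A minimal sketch of how vector clocks can tell "happened before" apart from "concurrent" in a peer-to-peer setup. The representation (a dict from node name to counter) and the function names are assumptions for illustration, not something described in the conversation.

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    # A node bumps its own entry for every local event.
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    # On sync, take the element-wise maximum of both clocks.
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: VectorClock, b: VectorClock) -> str:
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither side has seen the other's change


# Usage: two peers edit independently, then one of them syncs and edits again.
alice = increment({}, "alice")
bob = increment({}, "bob")
independent = compare(alice, bob)
alice_after_sync = increment(merge(alice, bob), "alice")
ordered = compare(bob, alice_after_sync)
```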
0:38:55 You've mentioned the book Designing
Data-Intensive Applications by Martin
0:39:00 Kleppmann. Had him on the show before.
0:39:02 He's actually working on version two
of that book right now, but he's also done
0:39:06 a fantastic free course about distributed
systems where he is walking through all of
0:39:12 that with a whiteboard. I actually think
that's what the two of
0:39:18 you have in common: you've both nailed
the craft of showing, with simple
0:39:24 strokes, some very complicated matters.
0:39:27 So I highly recommend to anyone
who wants to learn more there:
0:39:31 learn it from Martin.
0:39:33 He's an absolute master
of explaining those difficult
0:39:37 concepts in a simple way.
0:39:40 But yeah, a lot of things go kind
of downstream from that total order.
0:39:45 So just to go together on one little
journey to understand a downstream
0:39:51 problem of this, let's say we have
implemented the queuing of those events.
0:39:56 So let's say you're currently on
a plane ride and you're
0:40:00 writing your blog post,
you're very happy with it.
0:40:03 You now have like a thousand
change
0:40:07 events that capture your work.
0:40:09 Your SQLite database is up to date.
0:40:12 But you didn't just create this new blog
post. Maybe while you were still at
0:40:16 the airport, you created the initial
version of it with like TBD in the body.
0:40:21 And your coworker thought, "Oh,
actually I have a lot of thoughts on this."
0:40:26 And they also started writing
down some notes in there.
0:40:29 And now the worlds have
kind of drifted apart.
0:40:33 Your coworker
0:40:35 has written down some important
things they don't want to lose,
0:40:38 and you've written down some things.
You are not aware of the other one's changes,
0:40:42 neither are they, and at some point
the semantic merge needs to happen.
0:40:48 But how do you even make that happen
in this sync engine thing here?
0:40:52 And this is where you need the total
order, where basically, in the worst
0:40:57 case, this is what decides who gets
a say in this, who gets the last say, and in
0:41:04 which order those events have happened.
0:41:07 The model that I've landed on, and
I think that's similar to what Git
0:41:12 does with rebasing, is basically that
before you get to push your own stuff,
0:41:18 you need to pull down the events
first, and then you need to reconcile
0:41:22 your kind of stashed local changes
0:41:26 on top of the work of whoever
got lucky enough to push
0:41:32 first without being told to pull first.
0:41:35 So in that case, it might have
been your coworker because they've
0:41:39 stayed online and kept pushing.
0:41:41 And now it sort of falls
on you to reconcile that.
0:41:46 And I've implemented an
actual rebase mechanism for this,
0:41:51 where you now have this set of
new events that your coworker has
0:41:56 produced and you still have your set
of events that reflect your changes.
0:42:01 And now you need to reconcile this.
0:42:03 So that is purely on the
0:42:05 event log level, but given that we
both want to use SQLite now, we don't
0:42:12 just need to think about going forward
with SQLite, but we also now need to
0:42:17 think about, "Hey, how do we go back?"
0:42:19 In Git
you have this stack of events, right?
0:42:24 So you have a commit, which has
a parent of another commit, which
0:42:27 has a parent of another commit.
0:42:29 It's very similar to what your events in
this event log look like, except it's now
0:42:36 no longer just one event log, but you also
get this little branch from your coworker.
0:42:41 So now you need to go to
the last common ancestor.
0:42:44 And from there you need
to figure out:
0:42:46 how do I linearize this?
0:42:49 I've opted for a model where everything
that was pushed once cannot be
0:42:53 overwritten, so there's no force push.
0:42:55 So you basically just get
to append stuff at the end.
0:42:59 But in order to get there, you need
to first roll back your own stuff, then
0:43:05 play forward what you've gotten first,
0:43:08 and then add your own on top.
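
The rebase steps described above can be sketched on the event log level alone. This is a simplified illustration under stated assumptions: events are plain `(id, description)` tuples, the last common ancestor is found as the longest shared prefix, and the function name `rebase` is hypothetical. The remote history is never rewritten (no force push); local-only events are appended at the end.

```python
from typing import List, Tuple

Event = Tuple[str, str]  # (event id, description), an assumed shape


def rebase(local_log: List[Event], remote_log: List[Event]):
    # 1. Find the last common ancestor: the longest prefix both logs share.
    common = 0
    while (
        common < len(local_log)
        and common < len(remote_log)
        and local_log[common] == remote_log[common]
    ):
        common += 1

    # 2. Local-only events have to be rolled back first.
    rolled_back = local_log[common:]
    # 3. Remote events the client has not seen yet are played forward.
    incoming = remote_log[common:]
    # 4. Local events are re-applied on top; remote history stays untouched.
    new_log = remote_log + rolled_back
    return new_log, rolled_back, incoming


# Usage: you wrote on the plane while your coworker kept pushing.
local = [("e1", "create post"), ("e2", "write intro"), ("e3", "write outro")]
remote = [("e1", "create post"), ("c1", "coworker notes")]
new_log, rolled_back, incoming = rebase(local, remote)
```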
0:43:10 And the rolling back with SQLite is
a thing that I've put a lot of
0:43:15 time into, where I've been using another
SQLite extension, called the SQLite
0:43:21 Session extension, which allows you,
per SQLite write, to basically record
0:43:27 what the write has actually done.
0:43:30 So instead of storing "INSERT
0:43:33 INTO issues, blah, blah, blah,"
0:43:35 when running that, you get a blob
of let's say 30 bytes, and that has
0:43:40 recorded, on SQLite level, what has
happened to the SQLite database.
0:43:46 And I store that alongside each
change event that sits in the event log.
0:43:53 And the very cool thing about this
is, I can use that to replay it
0:43:57 on top of another database, to
kind of catch it up more quickly.
0:44:01 But I can also invert it.
0:44:03 So now I have basically,
let's say, 20 events.
0:44:07 And for each, I've recorded what
has happened on SQLite level.
0:44:11 And now I can basically say:
0:44:13 when I need to roll back, I can revisit
each of those, invert each of those
0:44:17 changesets, apply them again on the
SQLite database, and then I'll end up
0:44:22 where I was before. That's how I've
implemented rollback on top of SQLite.
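
A rough sketch of the rollback idea. Python's standard `sqlite3` module does not, as far as I know, expose the session extension, so this example swaps in a plainly different technique: each event stores a hand-written inverse statement next to the forward write, standing in for an inverted changeset. The table, the `apply_event` and `rollback` helpers, and the event shape are all assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, title TEXT)")

# Each entry pairs a forward write with the write that undoes it.
# With the session extension the second half would be an inverted changeset blob.
event_log = []


def apply_event(forward, inverse):
    db.execute(*forward)
    event_log.append((forward, inverse))


def rollback(count):
    # Undo the newest events first, walking the log backwards.
    for _ in range(count):
        _, inverse = event_log.pop()
        db.execute(*inverse)


apply_event(
    ("INSERT INTO issues (id, title) VALUES (?, ?)", (1, "Draft")),
    ("DELETE FROM issues WHERE id = ?", (1,)),
)
apply_event(
    ("UPDATE issues SET title = ? WHERE id = ?", ("Final", 1)),
    ("UPDATE issues SET title = ? WHERE id = ?", ("Draft", 1)),
)

rollback(1)  # undo the rename
title_after_one = db.execute("SELECT title FROM issues WHERE id = 1").fetchone()[0]
rollback(1)  # undo the insert
rows_after_two = db.execute("SELECT COUNT(*) FROM issues").fetchone()[0]
```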
0:44:27 So this is, as mentioned, when
you're going down the rabbit hole
0:44:32 of imposing a total order.
0:44:34 There's a lot of downstream
things you need to do that make
0:44:37 this even more complicated.
0:44:39 But from what I can see,
you're on the right track if
0:44:43 you wanna pursue this further.
0:44:45 Yeah.
0:44:45 And I do have a rebasing mechanism
in place in mine that's more
0:44:52 just kind of a sledgehammer.
0:44:53 I've got two SQLite databases in mine.
0:44:56 In the same way that in Git you have
your local copy of the main line and your
0:45:00 local copy of your work, there's always
this local copy of main that's just
0:45:05 whatever events have come from the server.
0:45:07 So this is the source of truth that the
server has told me about, and that was
0:45:12 something I forgot to mention earlier
0:45:13 when explaining all of this: the
server is the source of truth.
0:45:16 It has that main line of the order
of all of the events, and that is
0:45:21 what all the clients trust.
0:45:23 But yeah, it has like that local
copy, and then when it pulls from
0:45:27 the server, it'll update that copy.
0:45:29 It'll look at all the events that
are kind of ahead in the client,
0:45:33 and then it'll say, okay, I'm gonna
roll back my client copy of my
0:45:39 branch to whatever the server is.
0:45:41 And it's literally just a file write call.
0:45:43 So it just overwrites
0:45:46 your client SQLite file
with a copy of the server one.
0:45:50 And then we look at the events that
the server didn't acknowledge yet
0:45:53 and then we replay those on top as
a very basic way to pull and make
0:45:58 sure, because it's very possible that
you made some changes locally that
0:46:02 the server hasn't acknowledged yet.
0:46:04 Like you've pushed them up, they're still
in process, and you pull down the
0:46:08 latest changes and you don't see
all of that stuff that you pushed
0:46:11 up yet because of network latency.
0:46:14 So this sort of avoids that problem
where you pull down from the server
0:46:18 and now you need to replay whatever
you did on the client that the
0:46:21 server hasn't acknowledged yet.
0:46:23 It hasn't received that network request.
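
A minimal sketch of this "sledgehammer" pull, with two in-memory SQLite databases. To keep the example self-contained it uses `sqlite3.Connection.backup` to overwrite the working copy instead of the file write described above; that swap, the `rects` table, and the helper names are assumptions for illustration.

```python
import sqlite3


def new_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE rects (id TEXT PRIMARY KEY, width INTEGER)")
    db.commit()
    return db


def apply(db: sqlite3.Connection, event) -> None:
    # An event here is just (rect id, new width).
    db.execute("INSERT OR REPLACE INTO rects (id, width) VALUES (?, ?)", event)
    db.commit()  # commit so no transaction is open when the copy happens


server_copy = new_db()  # local copy of "main": only server-confirmed events
client = new_db()       # working copy, includes unacknowledged local events

unacked = [("a", 120)]  # pushed, but the server has not acknowledged it yet
apply(client, unacked[0])


def pull(server_events) -> None:
    # 1. Bring the local copy of main up to date.
    for event in server_events:
        apply(server_copy, event)
    # 2. Overwrite the working copy with main (stand-in for the file write).
    server_copy.backup(client)
    # 3. Replay whatever the server has not acknowledged yet on top.
    for event in unacked:
        apply(client, event)


pull([("a", 100), ("b", 50)])
rows = client.execute("SELECT id, width FROM rects ORDER BY id").fetchall()
main_rows = server_copy.execute("SELECT id, width FROM rects ORDER BY id").fetchall()
```

After the pull, the working copy still shows the local width for rectangle `a`, while the local copy of main only reflects what the server sent.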
0:46:25 So that was a very basic need to
have some rebasing, but it does
0:46:30 get a lot more complicated when you
have collaborators on a document.
0:46:34 I've seen a few different
versions of this.
0:46:37 CRDTs are the fun, like, magic wand.
0:46:40 It does everything.
0:46:42 But there are also solutions from
Figma, for example, where they
0:46:47 say everything in Figma is kind
of its own little data structure.
0:46:50 Like you can put some text and
that's its own little data field.
0:46:54 You have rectangles.
0:46:54 Those are a data field.
0:46:56 And whenever you update a rectangle,
like you update the pixel width of
0:47:01 a rectangle, that's like an update
event on some SQL table that stores
0:47:05 all the rectangles for this document.
0:47:07 So whenever you make that update, it'll
update the pixel value of whatever
0:47:12 that row entry is, and then it'll push
it up for other people to receive.
0:47:17 And when you pull it down,
it's last write wins.
0:47:20 In other words, it's whoever the last
person is in that order that the
0:47:24 server decided on, that total order.
0:47:26 That's a new word I know about now.
0:47:28 Didn't know it was called total order,
but yeah, that, once you pull it down,
0:47:31 whatever the server said was the order
of events, that's gonna be the final
0:47:35 state of that rectangle on your device.
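
A small sketch of last write wins under a server-decided total order. The event shape (`server_seq`, `id`, `fields`) is an assumption for illustration and not Figma's actual format. Every client that applies the same events in the same server order ends up with the same final state per field.

```python
from typing import Dict, List


def apply_in_server_order(events: List[Dict]) -> Dict[str, Dict]:
    state: Dict[str, Dict] = {}
    # The server's sequence number is the total order; later events overwrite earlier ones.
    for event in sorted(events, key=lambda e: e["server_seq"]):
        state.setdefault(event["id"], {}).update(event["fields"])
    return state


# Usage: two people resize the same rectangle; events arrive out of order locally.
events = [
    {"server_seq": 2, "id": "rect-1", "fields": {"width": 250}},  # second writer
    {"server_seq": 1, "id": "rect-1", "fields": {"width": 100}},  # first writer
    {"server_seq": 3, "id": "rect-1", "fields": {"height": 40}},  # different field
]
final_state = apply_in_server_order(events)
```

Because the two width updates touch the same field, the one the server ordered last wins; the height update touches a different field and is kept alongside it.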
0:47:38 The only time it becomes a problem, and
you may have experienced this if you've
0:47:41 ever worked on a FigJam together
with a bunch of people: if you're all
0:47:45 typing in the same text box, everyone's
just overwriting each other and the
0:47:48 text box glitches out and changes to
whatever's on the other person's screen.
0:47:52 You can't see people's cursors
because you're fighting to update
0:47:55 the exact same entry in the database
and it can't reconcile those changes.
0:48:00 So it only works as long as
you're editing different things
0:48:04 in the file and you're not really
stepping on each other too much.
0:48:08 As soon as you're stepping on each other
trying to edit like the same text field,
0:48:12 then you wanna reach for something
that's very, very fancy, like CRDTs.
0:48:17 Which will try to merge elegantly
all of the changes that you're
0:48:20 typing into the same database field.
0:48:23 It's maybe over-prescribed because of how
powerful it is, but for those specific
0:48:28 scenarios, it's really nice to reach for,
and we can talk about them if you want.
0:48:32 I only have a high level understanding
of what CRDTs do, but it would be
0:48:36 something to apply to that kind of problem.
0:48:39 My takeaway on where to apply CRDTs
versus where I would apply event sourcing
0:48:45 is that CRDTs are great in two scenarios.
0:48:51 One, if you don't quite know
yet where you want to go,
0:48:54 and where in the past you might've
reached for, let's say, Firebase to
0:48:59 just have a backend as a service.
0:49:00 You know you might want to change
it later, but for now
0:49:04 you just want to get going,
particularly if you
0:49:08 don't have a strict schema
across your entire application.
0:49:12 So you just try to not go off
the rails too much, but at least the
0:49:17 data is mostly, across
the applications, in a good spot.
0:49:22 But as you roll this out in
production, and say we are shipping
0:49:26 an iOS app as well that someone
is running an old version of,
0:49:31 now you don't quite know: oh, this
data document that has
0:49:35 been synced around here, this might
not yet have this field that the
0:49:39 newer application version depends on.
0:49:42 So this is where
time drifts in a more significant
0:49:47 way. In the more traditional
application architecture approach,
0:49:52 you don't trust
the client in the first place.
0:49:54 Then you have your API endpoint,
and the API is versioned, et cetera, and
0:49:58 everything is governed through the API.
0:50:01 But now you also need to
tame that problem somehow.
0:50:03 So at this point you're already
going a little bit beyond where I
0:50:07 think CRDTs shine right now, which
brings me to my next, kind of more
0:50:12 evergreen scenario for CRDTs, which
is very specific tasks.
0:50:19 Text editing,
particularly rich text editing,
0:50:22 is such a scenario where I think CRDTs
are just a very, very good approach.
0:50:28 You can also use
OT, operational transform, which
0:50:32 is somewhat related. Under the covers it
works a bit differently, but the way
0:50:37 you would use it is pretty similar.
0:50:40 And related to rich text editing
is also when you have complex
0:50:45 list structures where you wanna
move things within the list.
0:50:49 So if you want to go for the Figma
scenario, let's say you change the
0:50:55 order of multiple rectangles:
where do they sit in that layer order?
0:51:01 How do you convey how
you wanna change that?
0:51:04 You could always have maybe
an array of all the IDs that gives
0:51:08 you this perfect order, but if
this happens concurrently,
0:51:13 then you need to reconcile that.
0:51:14 So that's not great.
0:51:16 And this is where CRDTs are also
a very special-purpose
0:51:20 tool, which works super well.
0:51:23 And so what I've landed on is to use
event sourcing for everything except
0:51:28 where I need those special-purpose
tools, and this is where I then reach
0:51:33 for CRDTs or for something else.
0:51:35 That's kind of the conclusion I took away,
if you like the event sourcing approach.
0:51:41 But I think ultimately it really
comes down to what is the application
0:51:46 that you're building,
what is the domain of what
0:51:51 you're building, and which sort
of trade-offs does this require?
0:51:54 So I think in Figma,
0:51:56 the real-timeness is really important,
and it is recognized that those different
0:52:02 pieces that are floating around are
pretty independent from each other.
0:52:07 And if they're independent,
then you don't need that total order
0:52:10 between them, which makes everything
a lot easier in terms of scalability and
0:52:14 in terms of correctness, and then
you don't need to rebase as much.
0:52:18 Distributed systems is the
ultimate case of "it depends."
0:52:22 And I think trying to build one like
you did is a very good way
0:52:28 to build a better understanding.
0:52:30 And also I think that opens your eyes
to, "Ah, now I understand why Figma
0:52:35 has this shortcoming," or Notion, if we are
trying to change the same line, change the
0:52:40 same block, where last-write-wins applies.
0:52:43 Whereas in Google Docs, for example, we
could easily change even the same word,
0:52:49 and it would reconcile
that in a better way.
0:52:52 But maybe you have some advice for
people like yourself when you were
0:52:57 just getting started on that journey.
0:53:00 What would you tell people they
should do, or maybe shouldn't yet do,
0:53:05 today in 2025?
0:53:07 There's more technologies out there now.