1 00:00:00,000 --> 00:00:03,920 So I think it's a lot easier to develop when your queries are 2 00:00:03,950 --> 00:00:05,059 co located with your components. 3 00:00:05,381 --> 00:00:08,611 So every component is responsible for getting the data it needs. 4 00:00:08,892 --> 00:00:11,492 And this makes your app like super composable, right? 5 00:00:11,492 --> 00:00:16,172 Like you can add and remove components to your app or your component tree. 6 00:00:16,442 --> 00:00:19,202 And because they're responsible for getting their data, like, you 7 00:00:19,202 --> 00:00:21,672 don't have to do any additional wiring or prop drilling. 8 00:00:21,985 --> 00:00:24,135 Welcome to the localfirst.fm podcast. 9 00:00:24,465 --> 00:00:27,375 I'm your host, Johannes Schickling, and I'm a web developer, a 10 00:00:27,375 --> 00:00:30,325 startup founder, and love the craft of software engineering. 11 00:00:30,865 --> 00:00:34,855 For the past few years, I've been on a journey to build a modern, high quality 12 00:00:34,855 --> 00:00:36,875 music app using web technologies. 13 00:00:37,055 --> 00:00:41,235 And in doing so, I've been falling down the rabbit hole of local-first software. 14 00:00:41,755 --> 00:00:44,755 This podcast is your invitation to join me in that journey. 15 00:00:44,868 --> 00:00:48,538 In this episode, I'm speaking to Matt Wanlaw, a prolific local-first tool 16 00:00:48,538 --> 00:00:52,758 builder who's behind projects such as Vlcn, CR-SQLite, and Materialite. 17 00:00:53,098 --> 00:00:56,948 Most recently, Matt also joined Rocicorp to work on their new product. 18 00:00:57,378 --> 00:01:02,278 In this conversation, we go deep on his projects covering CRDTs, SQLite, 19 00:01:02,328 --> 00:01:04,028 and incremental view maintenance. 20 00:01:04,262 --> 00:01:08,532 Before getting started, also a big thank you to Rocicorp and 21 00:01:08,532 --> 00:01:10,052 Expo for supporting this podcast. 22 00:01:10,472 --> 00:01:12,212 And now my interview with Matt. 23 00:01:13,188 --> 00:01:14,868 Hey Matt, welcome to the show. 24 00:01:14,878 --> 00:01:15,588 How are you doing? 25 00:01:16,338 --> 00:01:16,748 Good. 26 00:01:16,808 --> 00:01:17,098 Yeah. 27 00:01:17,098 --> 00:01:17,648 Thanks for having me. 28 00:01:17,932 --> 00:01:20,102 I'm super excited to have you on the show. 29 00:01:20,322 --> 00:01:23,572 You and I have been collaborating on a few projects now over the 30 00:01:23,572 --> 00:01:25,002 course of the last few years. 31 00:01:25,312 --> 00:01:29,072 Most notably, you've been working on CR-SQLite and also 32 00:01:29,072 --> 00:01:30,852 Materialite most recently. 33 00:01:31,332 --> 00:01:35,182 But for those of you who don't know Matt would you briefly 34 00:01:35,182 --> 00:01:36,332 mind introducing yourself? 35 00:01:36,747 --> 00:01:41,037 Yeah um, Matt Wonlaw, and I guess I've been involved in the local-first 36 00:01:41,057 --> 00:01:42,927 community for about two years now. 37 00:01:43,257 --> 00:01:46,247 It originally started with me trying to resurrect an ancient project 38 00:01:46,247 --> 00:01:50,817 of mine and trying to add, like, multiplayer and sync between devices. 39 00:01:51,047 --> 00:01:54,663 And I saw, like, that's actually a pretty hard problem, and there weren't, like, 40 00:01:54,663 --> 00:01:59,293 many off the shelf solutions, so then I just, like, dove into this , and that's 41 00:01:59,293 --> 00:02:02,143 where all these projects like CR-SQLite, Materialite and other things came from. 42 00:02:02,330 --> 00:02:06,780 Yeah, and I think this is also how you and I have met quite a while ago. 43 00:02:07,050 --> 00:02:10,270 And before that, you've been working at Facebook for many years. 44 00:02:10,300 --> 00:02:13,560 And even before that, I think you had your first engineering role 45 00:02:13,777 --> 00:02:18,107 working on some software that by definition had to be very local. 46 00:02:18,747 --> 00:02:20,247 There was no local-first yet. 47 00:02:20,537 --> 00:02:25,027 Would you mind telling us a little bit more about what those first apps were that 48 00:02:25,027 --> 00:02:26,427 you've been working on professionally? 49 00:02:26,900 --> 00:02:31,510 Yeah, so like my first job out of college was at a defense contractor 50 00:02:31,580 --> 00:02:34,960 called Lockheed Martin, but essentially what we were doing was we were building 51 00:02:35,187 --> 00:02:37,327 software for submarines, so like U. 52 00:02:37,327 --> 00:02:37,457 S. 53 00:02:37,457 --> 00:02:41,497 Navy submarines, and yeah, obviously a submarine has no internet connection, 54 00:02:41,497 --> 00:02:45,657 everything has to be self contained, but even beyond that, so the software I built 55 00:02:45,677 --> 00:02:50,327 was for, Like these thick clients for the workstations that sonar operators used. 56 00:02:50,927 --> 00:02:54,377 And, you know, there, there are servers on the sub, you know, servers that 57 00:02:54,377 --> 00:02:58,307 provide storage and some that provide the signal processing and whatnot. 58 00:02:58,907 --> 00:03:02,907 But like an extra requirement was, you know, a sub is in a hostile environment. 59 00:03:03,007 --> 00:03:03,997 Things can break. 60 00:03:04,117 --> 00:03:07,687 So the work, workstations need to be able to work even if all the servers were down. 61 00:03:08,027 --> 00:03:10,567 So, you know, they need to hold the last 24 hours of data. 62 00:03:10,947 --> 00:03:12,667 So the sonar operators can, you know. 63 00:03:12,977 --> 00:03:16,027 If every other system's down, they can at least work with the last 64 00:03:16,027 --> 00:03:17,357 24 hours of data that they had. 65 00:03:18,177 --> 00:03:23,120 So that, you know, I guess was my first brief introduction to local 66 00:03:23,120 --> 00:03:26,250 apps that eventually do need to do some sort of syncing once connections 67 00:03:26,250 --> 00:03:28,960 are re established and, you know, reconcile the changes that were made. 68 00:03:29,360 --> 00:03:30,490 That's fascinating. 69 00:03:30,510 --> 00:03:35,970 Have you learned some patterns back then that you still see out in the wild today? 70 00:03:36,020 --> 00:03:39,380 Or do you think the approach that you've taken back then is entirely 71 00:03:39,380 --> 00:03:40,740 different compared to today? 72 00:03:41,115 --> 00:03:44,355 Uh, Back then the data was very well segmented. 73 00:03:44,395 --> 00:03:46,355 So there was like never really conflicts. 74 00:03:46,885 --> 00:03:48,965 But event sourcing was the main pattern. 75 00:03:49,515 --> 00:03:52,525 So just have an event log of everything that has occurred. 76 00:03:52,935 --> 00:03:54,820 And when connectivity is reestablished, do you know. 77 00:03:55,040 --> 00:03:57,930 Play that event log forward from, you know, the time you lost 78 00:03:57,930 --> 00:03:59,170 connection to the current time. 79 00:03:59,597 --> 00:04:00,037 Yeah. 80 00:04:00,077 --> 00:04:04,247 I mean, event sourcing, I feel like is still a super active topic. 81 00:04:04,257 --> 00:04:09,287 I'm actually looking into it much more closely myself for the use case of 82 00:04:09,307 --> 00:04:14,277 building Overtone, and I'm also exploring to embrace it a bit more for like syncing 83 00:04:14,277 --> 00:04:16,147 and migration strategy for LiveStore. 84 00:04:16,737 --> 00:04:20,587 But I also think that there might be a renaissance for event sourcing. 85 00:04:20,597 --> 00:04:26,187 We had it for a while with Redux in a much more self contained space, but I 86 00:04:26,207 --> 00:04:30,917 think there's a, that's an interesting topic for research and production as well. 87 00:04:31,317 --> 00:04:34,087 Curious to hear your thoughts on event sourcing and how it 88 00:04:34,087 --> 00:04:35,417 could fit into local-first. 89 00:04:36,431 --> 00:04:37,451 I mean, I really like it. 90 00:04:37,848 --> 00:04:41,488 , you have a source of truth, like of all the facts, all 91 00:04:41,488 --> 00:04:42,498 the things that ever happened. 92 00:04:43,068 --> 00:04:46,885 Whereas, you know, a regular app or state or database that 93 00:04:46,885 --> 00:04:48,025 doesn't do event sourcing. 94 00:04:48,715 --> 00:04:51,435 You're losing all the things that happen, right? 95 00:04:51,435 --> 00:04:54,855 And you just have this snapshot of what things are right at this instant. 96 00:04:55,355 --> 00:04:56,875 So yeah there's no way to rewind. 97 00:04:57,305 --> 00:05:01,395 There's no way to, merge other people's changes if they did come in the past. 98 00:05:01,975 --> 00:05:05,841 , so I guess event sourcing, you know, probably fell out of favor in some 99 00:05:05,841 --> 00:05:08,668 ways because, you know, storage requirements, like are you really 100 00:05:08,668 --> 00:05:09,748 going to store all the event logs? 101 00:05:10,088 --> 00:05:12,718 And then also, you know, processing those event logs. 102 00:05:12,758 --> 00:05:16,398 You do eventually need to process them into some you know, snapshot of 103 00:05:16,398 --> 00:05:17,858 state so your app can work with it. 104 00:05:18,428 --> 00:05:20,888 So I think, you know, people seeing that, oh, I have to process the events 105 00:05:20,888 --> 00:05:23,718 and turn it into like these tables anyway, like why even do the event logs? 106 00:05:23,738 --> 00:05:24,968 Why not just mutate in place? 107 00:05:25,395 --> 00:05:30,426 Yeah, I think in case anyone of the listeners is interested in local-first 108 00:05:30,426 --> 00:05:32,354 and event sourcing, please get in touch. 109 00:05:32,354 --> 00:05:37,768 I think this is a very interesting topic to research a little bit After 110 00:05:37,798 --> 00:05:43,568 building those apps for like submarines which sounds super fascinating, you've 111 00:05:43,568 --> 00:05:49,288 at some point also decided to move on and join Facebook nowadays, Meta. 112 00:05:49,738 --> 00:05:52,568 I'm curious to hear more what you've been working on there and 113 00:05:52,568 --> 00:05:57,688 whether there was any intersection with your local-first topics now. 114 00:05:58,072 --> 00:05:58,322 Yeah. 115 00:05:58,322 --> 00:05:59,261 So Meta, 116 00:05:59,484 --> 00:06:04,384 , I guess, like, it shaped the way I approached the local-first problems after 117 00:06:04,384 --> 00:06:07,914 I left, but the problems there were very , , not related to local-first at all. 118 00:06:08,084 --> 00:06:11,344 So I guess, yeah, what I worked on, it was like three different teams, but 119 00:06:11,344 --> 00:06:12,694 they all kind of shared the same theme. 120 00:06:12,940 --> 00:06:16,130 And a lot of what I was doing, we had a system called Download Your Information. 121 00:06:16,130 --> 00:06:19,962 So this is like any Facebook user can go click a button and get 122 00:06:19,962 --> 00:06:21,392 an archive of all their data. 123 00:06:21,922 --> 00:06:25,442 So we'd have to like crawl your entire Facebook graph 124 00:06:25,682 --> 00:06:27,537 and find Everything you own. 125 00:06:27,597 --> 00:06:29,037 So we call it the ownership graph. 126 00:06:29,293 --> 00:06:33,273 And then another thing one of my team's built was the deletion framework. 127 00:06:33,463 --> 00:06:35,313 So this is like you delete your account. 128 00:06:35,523 --> 00:06:38,873 So go delete every post you've ever uploaded, every like you've ever 129 00:06:38,873 --> 00:06:43,043 made, every comment, every like ad you've ever run every message you've 130 00:06:43,043 --> 00:06:46,013 ever sent across all the products, Instagram, Messenger, Facebook. 131 00:06:46,053 --> 00:06:46,943 What's I guess not WhatsApp. 132 00:06:47,205 --> 00:06:50,225 And then, yeah, so that, that was like traversing this deletion graph. 133 00:06:50,308 --> 00:06:54,248 And then a third product, like, variation on all these themes was 134 00:06:54,258 --> 00:06:55,958 like investigations and human review. 135 00:06:56,648 --> 00:06:59,168 So, like, obviously there's lots of abuse on Facebook. 136 00:06:59,705 --> 00:07:03,050 And we had to have a way for, like, every product that's ever created, like, 137 00:07:03,640 --> 00:07:08,180 Can we make their content reviewable by like a human review team at Meta? 138 00:07:08,430 --> 00:07:12,217 So this is like, yeah, trying to fan out from the content and traverse the graph 139 00:07:12,227 --> 00:07:13,767 of things associated with it for review. 140 00:07:14,317 --> 00:07:18,617 So I guess like, all this like graph walking was very, goes 141 00:07:18,627 --> 00:07:22,471 very hand in hand with query languages and also schematization. 142 00:07:23,151 --> 00:07:27,068 So, I guess I mentioned like three different areas of like 143 00:07:27,108 --> 00:07:28,288 quote unquote compliance, right? 144 00:07:28,338 --> 00:07:32,018 Deletion, download your information, and content review. 145 00:07:32,658 --> 00:07:35,478 Well, if you're a product team at Meta, you don't want to 146 00:07:35,488 --> 00:07:37,668 be slowed down with having to integrate into all these systems. 147 00:07:37,668 --> 00:07:38,588 You don't have to learn about them. 148 00:07:38,588 --> 00:07:39,638 You just want to like ship your product. 149 00:07:40,263 --> 00:07:43,923 So like our key focus was how can we make it so developers can 150 00:07:43,923 --> 00:07:45,673 just make their product and not have to understand these things. 151 00:07:46,303 --> 00:07:48,923 So we kind of, we built it into the schema language. 152 00:07:49,153 --> 00:07:53,553 So as you're defining your schema for your product, you can say like for a 153 00:07:53,553 --> 00:07:57,083 given edge, like this is an ownership edge, or this is like a deletion edge. 154 00:07:57,483 --> 00:08:01,383 And then just like by declaratively specifying all this stuff all these 155 00:08:01,383 --> 00:08:02,723 other systems could just work. 156 00:08:03,333 --> 00:08:03,643 And. 157 00:08:03,978 --> 00:08:06,658 Yeah, eventually when I pivoted to local-first problem, I was like, wow, this 158 00:08:06,658 --> 00:08:12,098 like developer experience of everything being declarative and the schema layer 159 00:08:12,098 --> 00:08:16,038 down, taking care of it for you maybe we can do this for local-first too. 160 00:08:16,258 --> 00:08:17,438 I couldn't agree more. 161 00:08:17,448 --> 00:08:21,438 Did you eventually land on something where you got those declarative 162 00:08:21,448 --> 00:08:24,408 benefits for some local-first problems? 163 00:08:24,696 --> 00:08:28,986 Yeah, so, the project I ended up working on was called CR-SQLite, 164 00:08:29,496 --> 00:08:32,576 so I was, like, taking these ideas and adding them to SQLite. 165 00:08:32,806 --> 00:08:36,836 So, yeah, SQLite's already, you know, a relational database, and it's already 166 00:08:36,836 --> 00:08:40,486 pretty declarative in terms of you can define your schema for your table, 167 00:08:40,486 --> 00:08:43,806 and you can specify an index, and you can specify foreign key relations, and 168 00:08:43,806 --> 00:08:45,116 the database manages all that for you. 169 00:08:45,796 --> 00:08:49,536 But it didn't have any primitives for collaboration, right? 170 00:08:49,566 --> 00:08:54,451 So, like What if I give you a copy of my database and I have a copy of my 171 00:08:54,451 --> 00:08:57,661 own database and we both go offline and we both make a bunch of changes? 172 00:08:58,071 --> 00:09:00,961 What if we want to like sync our changes together or merge our DBs? 173 00:09:01,561 --> 00:09:04,791 So what I started adding to SQLite was this concept of CRDTs. 174 00:09:05,481 --> 00:09:09,971 So you could say for a given table, what type of CRDT should we modeled as? 175 00:09:10,531 --> 00:09:15,231 Like if it's a grow only set, or if it's like an add or move set, like 176 00:09:15,291 --> 00:09:17,101 technical terms in CRDT literature. 177 00:09:17,706 --> 00:09:21,076 And then for specific columns in the table, you could also say 178 00:09:21,076 --> 00:09:24,256 like, what sort of CRDTs should be used to merge those columns. 179 00:09:24,946 --> 00:09:30,583 So yeah, the idea was any developer who has an app backed by SQLite, they can 180 00:09:30,583 --> 00:09:33,083 just go in and make some schema changes. 181 00:09:33,553 --> 00:09:36,328 And then that app can become collaborative by allowing merging 182 00:09:36,328 --> 00:09:38,213 databases with other collaborators. 183 00:09:39,027 --> 00:09:39,397 Got it. 184 00:09:39,427 --> 00:09:44,867 And I think the parallel there is that many things about SQL can be also seen 185 00:09:44,907 --> 00:09:49,427 as rather declarative and the schema modeling, et cetera, those is where 186 00:09:49,687 --> 00:09:53,487 I think what we gain with local-first software is the the collaborative 187 00:09:53,487 --> 00:09:56,017 nature of the apps working together. 188 00:09:56,557 --> 00:09:57,947 And I think this is where you. 189 00:09:58,307 --> 00:10:02,657 Brought some of the nice experiences that you've seen at Meta, where there's 190 00:10:02,667 --> 00:10:06,937 such a strong engineering culture for having a great developer experience, 191 00:10:07,337 --> 00:10:11,717 bringing that to the new topic that you were interested in, local-first. 192 00:10:12,440 --> 00:10:14,970 Meta had a really interesting philosophy when I got there. 193 00:10:15,640 --> 00:10:18,430 I think like one of my first questions I asked was like, oh, how do we like 194 00:10:18,490 --> 00:10:21,840 make sure people use our stuff that our teams, our security infrastructure? 195 00:10:22,340 --> 00:10:25,237 And they're building things to make sure Meta's code was more secure. 196 00:10:25,277 --> 00:10:25,987 That was my first team. 197 00:10:26,170 --> 00:10:29,803 And like, security at Lockheed was always Everybody hated it. 198 00:10:29,803 --> 00:10:31,743 It was like, nobody wants to do it. 199 00:10:31,743 --> 00:10:33,753 And there's always this like mandated thing. 200 00:10:33,983 --> 00:10:35,253 So I got to Facebook. 201 00:10:35,253 --> 00:10:37,853 I'm like, how are we, I'm on security for, how are we going to 202 00:10:37,853 --> 00:10:38,973 make sure people use our stuff? 203 00:10:39,493 --> 00:10:40,493 And it was interesting. 204 00:10:40,503 --> 00:10:45,213 Like the manager at the time was just like, yeah, like we can't do anything. 205 00:10:45,303 --> 00:10:46,490 Facebook's bottom's up. 206 00:10:46,490 --> 00:10:47,120 We can't do anything. 207 00:10:47,120 --> 00:10:48,260 Make sure anybody uses our stuff. 208 00:10:48,610 --> 00:10:53,870 The only way we can do it is make the safe way, the default, and easiest way. 209 00:10:53,880 --> 00:10:55,930 Like, people are going to choose the secure way because 210 00:10:55,930 --> 00:10:56,840 we've made it the easiest. 211 00:10:57,320 --> 00:10:59,570 And like, there's no other way we can make them choose it. 212 00:10:59,980 --> 00:11:03,620 So yeah, there's a huge, always a huge focus on everything we did from DevX. 213 00:11:03,640 --> 00:11:07,310 So, that like, developers would just pick that solution. 214 00:11:07,390 --> 00:11:09,370 Because it was easiest, not necessarily because it was secure, just because 215 00:11:09,370 --> 00:11:10,360 it was the easiest thing around. 216 00:11:10,460 --> 00:11:11,800 And then it was also secure. 217 00:11:12,048 --> 00:11:13,668 Which is like the nice added benefit. 218 00:11:14,085 --> 00:11:19,115 So after having worked for many years at Facebook, you've later then started 219 00:11:19,115 --> 00:11:24,325 working on CR-SQLite, but you didn't arrive at CR-SQLite immediately. 220 00:11:24,415 --> 00:11:28,555 I think you were for a little bit also still working on 221 00:11:28,565 --> 00:11:32,145 bringing back an app you've been working on before called Strut. 222 00:11:32,455 --> 00:11:36,815 So I'd be curious to hear a little bit more about that and particularly 223 00:11:36,825 --> 00:11:41,205 given that you worked on it quite a bit before you joined Facebook and then 224 00:11:41,295 --> 00:11:43,495 you started working on it again after. 225 00:11:43,845 --> 00:11:47,165 So tell me more about the app and how that led you to local-first. 226 00:11:47,462 --> 00:11:47,792 Yeah. 227 00:11:47,792 --> 00:11:51,572 So it was my first, I guess, web app I was building. 228 00:11:51,802 --> 00:11:54,282 So yeah, at Lockheed, I was doing all these like Java rich clients, 229 00:11:54,812 --> 00:11:56,192 and I wanted to get into web dev. 230 00:11:56,392 --> 00:11:58,912 So I started building this thing called Strut.io. 231 00:11:58,932 --> 00:12:00,112 It's a presentation editor. 232 00:12:00,682 --> 00:12:03,732 And I knew people's devices, you know, they have storage and 233 00:12:03,732 --> 00:12:06,772 compute, and the browser had, you know, some storage back then. 234 00:12:07,352 --> 00:12:09,302 And I didn't want to, like, run servers for people. 235 00:12:09,732 --> 00:12:13,512 I just wanted them to be able to work and edit presentations if 236 00:12:13,512 --> 00:12:17,332 they're online or offline and not have to sign up for an account. 237 00:12:17,372 --> 00:12:19,402 Like you go to the site and you can immediately start doing stuff. 238 00:12:20,032 --> 00:12:25,132 So yeah, like I guess without knowing it, I was building a presentation editor that 239 00:12:25,132 --> 00:12:27,382 was, you know, local-first in some ways. 240 00:12:28,002 --> 00:12:33,098 And at that time, it was like 2011, 2012, there was, you know, I guess the prequel 241 00:12:33,098 --> 00:12:37,328 to the local-first movement, there's like remoteStorage.io and ownCloud. 242 00:12:37,798 --> 00:12:40,358 And I started getting involved with those, but then, yeah, eventually 243 00:12:40,358 --> 00:12:43,818 I got the Facebook offer and I just put Strut on the shelf to be 244 00:12:43,818 --> 00:12:45,228 resurrected sometime in the future. 245 00:12:45,515 --> 00:12:49,265 And then, yeah, when I was leaving Meta or Facebook, yeah, I didn't 246 00:12:49,265 --> 00:12:50,455 know exactly what I was going to do. 247 00:12:50,605 --> 00:12:54,675 So I figured, oh, I can just resurrect Strut.io real quick and you know, 248 00:12:54,695 --> 00:12:55,915 it'll bring in some side income. 249 00:12:56,427 --> 00:12:59,547 In the early days before I abandoned it, it had like 10, 000 monthly actives. 250 00:12:59,737 --> 00:13:02,997 So I was like, Oh maybe a decade later we can get back to that again. 251 00:13:03,597 --> 00:13:06,757 But yeah, as I resurrected it, kind of the landscape had changed a bit. 252 00:13:06,837 --> 00:13:12,240 Like people had a lot more devices and also multiplayer was an expectation where 253 00:13:12,240 --> 00:13:13,840 like, You know, Google Slides, right? 254 00:13:13,840 --> 00:13:15,310 You can have multiple people on the same deck. 255 00:13:16,000 --> 00:13:19,180 So I was, you know, looking into how I was going to solve that problem for Strut. 256 00:13:19,800 --> 00:13:23,943 And, you know, rather than I guess I've been building frameworks for so long at 257 00:13:23,943 --> 00:13:29,603 Meta, like, rather than making this a one off thing in Strut to solve the problem. 258 00:13:29,848 --> 00:13:31,768 I was like, Oh, like, can I solve it at a lower level? 259 00:13:32,218 --> 00:13:36,928 And that's when I went down the rabbit hole of CRDTs investigating SQLite, 260 00:13:36,968 --> 00:13:39,268 adding stuff to SQLite and so on. 261 00:13:40,195 --> 00:13:45,585 So the data architecture you had for the first version of Strut around, like, 262 00:13:45,975 --> 00:13:50,785 yeah, before Facebook, you mentioned like around 2012 to when you've left 263 00:13:50,785 --> 00:13:53,065 Facebook and what you've arrived at. 264 00:13:53,350 --> 00:13:54,120 Very different. 265 00:13:54,460 --> 00:13:59,240 So I'm curious, like how you went from like whether you took rather 266 00:13:59,240 --> 00:14:05,040 some smaller steps or some much bigger steps altogether to rethink the data 267 00:14:05,040 --> 00:14:06,840 architecture and how you went about that. 268 00:14:07,460 --> 00:14:11,730 Yeah, I mean, I guess beyond just collaboration, I wanted to support 269 00:14:12,340 --> 00:14:14,560 Yeah, arbitrarily large presentations. 270 00:14:15,230 --> 00:14:19,620 So right, the original version of Strut, it would load everything off disk and 271 00:14:19,620 --> 00:14:23,160 into memory, and then when you're closing the tab, or periodically, it would save. 272 00:14:23,855 --> 00:14:27,575 A big dump of memory to disk and yeah, I think that works fine, probably 273 00:14:27,575 --> 00:14:29,665 for the vast majority of cases. 274 00:14:29,905 --> 00:14:33,895 But I wanted to be able to handle presentations with hundreds of slides 275 00:14:33,895 --> 00:14:38,505 and tons of binary content or like images and videos and all sorts of things. 276 00:14:39,065 --> 00:14:42,865 So like at first I was like, okay, how do I like lazily load this stuff off of disk? 277 00:14:43,110 --> 00:14:46,770 And, you know, as you're starting to implement that in your own data 278 00:14:46,770 --> 00:14:49,740 model, you're like, Oh, well, this is kind of like a database, which, 279 00:14:50,240 --> 00:14:53,620 you know, knows how to page in and out things in and out for me. 280 00:14:54,170 --> 00:14:58,357 So I think, you know, that was also an indication that, okay, like, something 281 00:14:58,417 --> 00:14:59,807 more like a database would be good. 282 00:15:00,297 --> 00:15:02,327 And then I don't know how I found that SQLite was better. 283 00:15:04,167 --> 00:15:08,147 I guess I saw James Long's Absurd-SQL and Actual Budget. 284 00:15:08,657 --> 00:15:12,687 That's when I first realized that, yeah, like, I mean, I'd always known Wasm was 285 00:15:12,687 --> 00:15:16,467 a thing but that like SQLite was being compiled to Wasm and could run in the 286 00:15:16,467 --> 00:15:21,407 browser and it was actually faster than IndexedDB when you were, I guess there's 287 00:15:21,407 --> 00:15:25,990 some caveats there, but SQLite would store to IndexedDB, but it would store pages. 288 00:15:26,140 --> 00:15:27,580 So it would store four kilobyte chunks. 289 00:15:28,110 --> 00:15:29,450 IndexedDB, when you're like. 290 00:15:29,772 --> 00:15:32,432 Reading things in and out individually, like single objects 291 00:15:32,432 --> 00:15:33,952 by key value is like super slow. 292 00:15:34,382 --> 00:15:38,525 And the fact that like, okay, you're using SQLite and reading in chunks 293 00:15:38,525 --> 00:15:42,425 rather than individual objects, like it would be faster than IndexedDB. 294 00:15:42,595 --> 00:15:45,725 Like if you change your IndexedDB access to be paged, okay, it 295 00:15:45,725 --> 00:15:46,745 could be faster in SQLite. 296 00:15:48,020 --> 00:15:51,560 SQLite, you can still do point queries and be faster than a point query in IndexedDB. 297 00:15:52,020 --> 00:15:55,060 So that was something else that turned me on, like, oh, maybe I should use SQLite 298 00:15:55,060 --> 00:15:58,040 in the browser rather than IndexedDB, so I don't have to implement all this 299 00:15:58,170 --> 00:16:00,040 crazy stuff to make IndexedDB go fast. 300 00:16:00,740 --> 00:16:04,100 And then, you know, there's all the problem of, like, collaboration still 301 00:16:04,540 --> 00:16:09,500 and my justification for picking SQLite then was, you know, not only is SQLite 302 00:16:09,500 --> 00:16:13,910 running the browser now, if I ever want to take this app and make it native 303 00:16:14,020 --> 00:16:18,660 well, SQLite, Everything I build for collaboration in the SQLite, like it'll 304 00:16:18,660 --> 00:16:24,120 work on native, it'll work on desktop it'll work in the cloud if you want 305 00:16:24,120 --> 00:16:28,510 some like cloud peer or something rather than like a JavaScript only solution. 306 00:16:29,350 --> 00:16:34,820 So you made the decision, SQLite is the way forward, SQLite on top of IndexDB or 307 00:16:34,830 --> 00:16:39,490 on top of like some persistence mechanism, but you now also need collaboration. 308 00:16:39,510 --> 00:16:42,660 You looked a little bit into CRDTs, but you still decided 309 00:16:42,970 --> 00:16:46,860 instead of going CRDTs first in terms of the developer experience. 310 00:16:46,860 --> 00:16:52,530 You still want to give a developer yourself SQLite as the primary way to 311 00:16:52,530 --> 00:16:57,450 think about the developer experience, but you started to use CRDTs or you 312 00:16:57,450 --> 00:17:02,490 wanted to use CRDTs as a implementation detail to bring into SQLite. 313 00:17:02,500 --> 00:17:03,670 So how did you go about that? 314 00:17:03,690 --> 00:17:08,490 How did you solve that puzzle of making SQLite collaborative out of the box? 315 00:17:09,285 --> 00:17:12,965 Yeah, so SQLite has a bunch of mechanisms for extending it. 316 00:17:13,242 --> 00:17:15,832 You can make new virtual tables. 317 00:17:16,208 --> 00:17:18,918 yeah, a virtual table is essentially a table. 318 00:17:18,928 --> 00:17:24,483 It presents itself as a table in SQLite, except that The implementation of that 319 00:17:24,483 --> 00:17:28,473 table is any arbitrary code you want to write, so C code, Rust code, whatever. 320 00:17:28,760 --> 00:17:32,930 You can make functions, new functions, like a mat, you know, like min or max 321 00:17:32,930 --> 00:17:34,220 or whatever function you want to define. 322 00:17:34,690 --> 00:17:37,960 You can make new virtual file systems, so it's pretty extensible 323 00:17:38,120 --> 00:17:39,440 from the plugin side of things. 324 00:17:39,807 --> 00:17:43,990 And yeah, one of the teams I had worked with at Meta, they built this thing called 325 00:17:44,020 --> 00:17:48,347 osquery way back, I think, You know, some other company may have seen osquery 326 00:17:48,367 --> 00:17:52,820 now, but it basically lets you query all details about your OS through SQLite, 327 00:17:53,090 --> 00:17:55,490 and they did that through virtual tables. 328 00:17:55,683 --> 00:17:57,723 So yeah, for adding CRDTs to SQLite, 329 00:17:57,990 --> 00:18:01,200 it was kind of a combination of a bunch of special functions. 330 00:18:01,805 --> 00:18:06,185 That like when you create a table so, plus special functions 331 00:18:06,185 --> 00:18:07,145 and special virtual tables. 332 00:18:07,645 --> 00:18:11,055 So you can like, create a table, and then you can call a function that 333 00:18:11,055 --> 00:18:16,295 converts that table to a CRR, or Conflict-Free Replicated Relation. 334 00:18:16,735 --> 00:18:19,575 So just a fancy name for a table that's a CRDT. 335 00:18:20,215 --> 00:18:22,875 Or you could use this like virtual table syntax where you're like, I want 336 00:18:22,875 --> 00:18:30,222 to create this table as a CRR and then you list all the columns and in the 337 00:18:31,012 --> 00:18:34,632 columns, so when you're creating a virtual table, like you can customize that 338 00:18:34,632 --> 00:18:36,452 syntax all you want in your extension. 339 00:18:37,022 --> 00:18:40,062 So I could like have a slot for custom data types. 340 00:18:40,712 --> 00:18:42,402 So you can say this is a string. 341 00:18:42,402 --> 00:18:46,082 And then like another addition to the data type is like the CRDT type. 342 00:18:46,702 --> 00:18:50,832 So it's not only text, it's like a perryText, or it's like, not just an 343 00:18:50,842 --> 00:18:53,062 int, it's an int that's a counter. 344 00:18:53,512 --> 00:18:56,132 Or it's not just a date, it's a date that slash right wins. 345 00:18:56,812 --> 00:19:00,052 Yeah, and then and then I guess the last thing, right, like, it's great 346 00:19:00,062 --> 00:19:06,042 to have a database that's, you know, a CRDT but how do you merge efficiently? 347 00:19:06,632 --> 00:19:11,942 . There had been some earlier work, actually by a guy named Iver and we actually worked 348 00:19:11,942 --> 00:19:14,882 together pretty early on, on CR-SQLite. 349 00:19:15,222 --> 00:19:20,572 He did his master's thesis on essentially adding CRDTs to SQLite but some of the, 350 00:19:20,572 --> 00:19:24,042 like, Downsides of like that original approach was there's, there was no 351 00:19:24,042 --> 00:19:25,512 way to incrementally merge databases. 352 00:19:25,562 --> 00:19:29,312 You had to literally send the entire copy of the SQLiteDB to 353 00:19:29,312 --> 00:19:32,032 somebody else, and it would like scan every row and merge them all. 354 00:19:32,622 --> 00:19:34,972 So yeah, one of the key problems we had to solve was like how do we 355 00:19:34,972 --> 00:19:41,267 incrementally sync And that was done essentially like implementing a global 356 00:19:41,277 --> 00:19:45,557 version that gets tagged on every row and a virtual table that knows how 357 00:19:45,557 --> 00:19:50,927 to find all the rows past the given version and just send you those to sync. 358 00:19:51,290 --> 00:19:58,150 Was there any prior art to combining a relational database and CRDTs or 359 00:19:58,150 --> 00:20:00,280 another mechanism to synchronize? 360 00:20:00,923 --> 00:20:05,523 Yeah, so there's Iver's work, and then, I guess, James Long's work on 361 00:20:05,603 --> 00:20:08,968 well, Absurd-SQL, but I guess Yeah, I don't know if he ever gave a name to 362 00:20:08,968 --> 00:20:10,728 the layer above it that did the CRDTs. 363 00:20:10,948 --> 00:20:15,518 But yeah, James work, if I recall correctly, was all in JavaScript. 364 00:20:15,798 --> 00:20:17,668 So it wasn't built into SQLite itself. 365 00:20:17,688 --> 00:20:20,088 You couldn't just run it anywhere SQLite could run. 366 00:20:20,308 --> 00:20:21,898 It had to be in a JavaScript environment. 367 00:20:22,508 --> 00:20:23,858 And he took a different approach. 368 00:20:24,108 --> 00:20:29,058 I think he used hybrid logical clocks and Merkle DAG. 369 00:20:29,058 --> 00:20:31,798 So yeah, I guess we started getting some trade offs of like, 370 00:20:32,118 --> 00:20:33,658 Strut was the first use case. 371 00:20:34,028 --> 00:20:37,328 Which actually informed a lot of design decisions for CR-SQLite and why I 372 00:20:37,328 --> 00:20:41,658 didn't want to use something like James Long's approach or, I don't know other 373 00:20:41,808 --> 00:20:42,968 solutions that were available at the time. 374 00:20:43,818 --> 00:20:44,078 Yeah. 375 00:20:44,078 --> 00:20:48,138 I'm curious to hear more about your your judgment of the trade 376 00:20:48,138 --> 00:20:51,868 offs and what made it a good versus not so good fit for your use case. 377 00:20:52,442 --> 00:20:52,582 Yeah. 378 00:20:52,582 --> 00:20:55,612 So I think James's approach yeah, I hope I'm remembering everything correctly. 379 00:20:55,632 --> 00:20:58,772 This was like a couple of years ago, but when I looked into it, every 380 00:20:58,932 --> 00:21:01,042 single message for all time was stored. 381 00:21:01,602 --> 00:21:03,442 So this is, you know, kind of like event sourcing, right? 382 00:21:03,942 --> 00:21:08,962 But like, one of my key requirements, like, I was using Strut to drive 383 00:21:08,962 --> 00:21:10,302 requirements for CR-SQLite. 384 00:21:10,922 --> 00:21:14,372 And I wanted the ability for people to be offline as long as they wanted. 385 00:21:14,782 --> 00:21:18,172 For it to be able to run on constrained, low memory devices. 386 00:21:18,937 --> 00:21:22,307 And so those two in combination with the fact that it's like a presentation 387 00:21:22,307 --> 00:21:26,247 editor where you can like, you can drag and drop components, you can spin them, 388 00:21:26,247 --> 00:21:31,667 rotate them type, like I didn't want every single event for every keystroke 389 00:21:31,667 --> 00:21:34,877 and every drag and all these things to be logged for all time, right? 390 00:21:35,427 --> 00:21:37,317 Especially if you're offline indefinitely. 391 00:21:37,637 --> 00:21:41,527 So if you're offline for like, I don't know, a couple days editing 392 00:21:41,527 --> 00:21:43,877 a really big presentation, you come back offline, I want it to 393 00:21:43,887 --> 00:21:45,107 be able to sync almost instantly. 394 00:21:45,797 --> 00:21:50,527 So the approach I chose was the set of CRDTs that allow you to collapse history. 395 00:21:51,147 --> 00:21:56,617 So, yeah, if you've been off The database size never grows more than 396 00:21:56,617 --> 00:21:58,467 a constant factor of the base data. 397 00:21:59,057 --> 00:22:01,737 So no matter how much you're editing, you're always going to be within a 398 00:22:01,737 --> 00:22:04,067 constant factor of the regular DB size. 399 00:22:04,687 --> 00:22:07,127 I don't know, I think that constant factor was like three or four times, 400 00:22:07,677 --> 00:22:08,757 I don't recall the exact number. 401 00:22:09,277 --> 00:22:12,237 Whereas the other systems, if you're the longer offline, the more and 402 00:22:12,237 --> 00:22:15,067 more the data grew which was, you know, not a trade off I liked. 403 00:22:15,537 --> 00:22:18,490 Yeah, I guess the downside to my approach is there's a few less 404 00:22:18,510 --> 00:22:21,980 transactional guarantees and you can't, you know, rewind history. 405 00:22:22,550 --> 00:22:24,720 But if somebody really wanted rewinding history, I was like, Oh, 406 00:22:24,750 --> 00:22:26,250 they could do it in user space. 407 00:22:26,350 --> 00:22:27,680 Like, that's not the problem I'm trying to solve. 408 00:22:28,127 --> 00:22:29,187 That makes perfect sense. 409 00:22:29,217 --> 00:22:34,697 And I think I really like, you focusing on the use case that you're most familiar 410 00:22:34,697 --> 00:22:38,537 with Strut and the requirements around that, Since we're dealing with like 411 00:22:38,537 --> 00:22:43,293 distributed systems and programming is hard enough, that you can't really 412 00:22:43,423 --> 00:22:46,043 focus on all use cases at the same time. 413 00:22:47,043 --> 00:22:50,913 And this way you design for the use case that you know of. 414 00:22:51,183 --> 00:22:55,203 And I think this is where you have like a canvas app or something like that 415 00:22:55,213 --> 00:22:57,653 where you move things around constantly. 416 00:22:58,033 --> 00:23:02,683 If you work for multiple days, you don't really care exactly about 417 00:23:02,693 --> 00:23:06,273 the change you've made over like five seconds, three days ago. 418 00:23:06,493 --> 00:23:10,703 You just care about, okay, this is the state it is in right now and that 419 00:23:10,723 --> 00:23:13,173 other users converge to the same thing. 420 00:23:13,643 --> 00:23:19,378 Whereas I think a different set of requirements might be for a chat app 421 00:23:19,728 --> 00:23:24,618 where you might not send around, like, which sort of keystrokes someone has 422 00:23:25,198 --> 00:23:27,088 taken to arrive at the final message. 423 00:23:27,088 --> 00:23:30,158 You send the final message, but then those final messages, you 424 00:23:30,158 --> 00:23:32,868 might actually care about, okay, this came at this point in time. 425 00:23:32,868 --> 00:23:35,828 One message does rarely override another one. 426 00:23:35,958 --> 00:23:39,238 So this is, I think, where that might be a better fit. 427 00:23:39,418 --> 00:23:44,378 And that makes also sense that in James use case with Actual Budget that's all 428 00:23:44,378 --> 00:23:46,203 about, like, historic transactions. 429 00:23:46,223 --> 00:23:49,393 You want exactly those like records and time. 430 00:23:49,633 --> 00:23:52,643 There's probably not so many, and even if there are many, 431 00:23:52,793 --> 00:23:53,973 they're all kind of worth it. 432 00:23:54,513 --> 00:23:55,633 They need to be accurate. 433 00:23:56,263 --> 00:23:59,643 So I think that makes perfect sense, and I think there will be like many different 434 00:23:59,643 --> 00:24:01,613 solutions for many different use cases. 435 00:24:02,133 --> 00:24:06,128 And I think there's even, you know, an approach how you could apply a bit of like 436 00:24:06,128 --> 00:24:08,728 that compaction, even to event sourcing. 437 00:24:08,778 --> 00:24:12,527 I think there's like a concept called subsumption, I think. 438 00:24:12,947 --> 00:24:16,707 And that allows you to also like flatten some of those historic events. 439 00:24:16,997 --> 00:24:20,347 So that's super interesting topic of research, I think. 440 00:24:20,813 --> 00:24:23,813 . Yeah, I know there for certain CRDT algorithms and certain 441 00:24:23,823 --> 00:24:25,473 event sourcing things, like. 442 00:24:26,068 --> 00:24:29,858 Yeah, there are ways to compact history, but it did seem like, still 443 00:24:29,858 --> 00:24:31,978 a pretty evolving area of research. 444 00:24:32,508 --> 00:24:35,238 I think now it's a lot more understood especially in the, 445 00:24:35,238 --> 00:24:37,738 like, list based CRDT space. 446 00:24:37,878 --> 00:24:41,318 Sure, like, I don't know, I guess, Yjs kind of pioneered a lot of 447 00:24:41,318 --> 00:24:44,508 that work of, like, run length encoding, everything, and then stuff. 448 00:24:45,188 --> 00:24:47,908 Yeah, I guess what I want to quickly mention, so there's, when 449 00:24:47,908 --> 00:24:50,952 I was first getting started, there was a community called Braid.org 450 00:24:51,452 --> 00:24:53,622 which I got involved with early on. 451 00:24:54,132 --> 00:24:56,932 And yeah, I don't know, that was an amazing community in terms 452 00:24:56,932 --> 00:24:59,182 of like learning and being open. 453 00:24:59,322 --> 00:25:03,232 So I just want to call them out, like they helped me a ton to understand 454 00:25:03,312 --> 00:25:07,132 all these CRDT algorithms and when history can be compacted, when it can't. 455 00:25:07,827 --> 00:25:11,847 What kind of weird circumstances you get into if you know you're emerging 456 00:25:11,847 --> 00:25:14,517 changes peer-to-peer versus, you know, client server and whatnot. 457 00:25:14,987 --> 00:25:15,437 Yeah. 458 00:25:15,527 --> 00:25:19,657 There this entire community, I'm not yet a part of, but given that 459 00:25:19,657 --> 00:25:24,517 this year I'm also looking much more into implementing, syncing this. 460 00:25:24,577 --> 00:25:28,347 I feel like there's a lot I have to learn about and I would love 461 00:25:28,352 --> 00:25:31,627 to also have some folks from that community here on the podcast. 462 00:25:32,297 --> 00:25:33,892 Yeah they're trying to add. 463 00:25:34,087 --> 00:25:40,947 Syncing to the HTTP protocol, so essentially adding a new verb rather 464 00:25:40,947 --> 00:25:46,070 than get, put, and post to allow you to merge documents or resources together. 465 00:25:46,602 --> 00:25:47,972 That sounds fascinating. 466 00:25:48,002 --> 00:25:49,672 How far along is that effort? 467 00:25:49,672 --> 00:25:52,595 They are part of IETF now. 468 00:25:52,955 --> 00:25:56,385 I don't remember if they're a research, there's like two separate types of groups. 469 00:25:56,385 --> 00:25:59,448 There's like research groups and maybe there's standards groups. 470 00:25:59,448 --> 00:26:00,528 I think those are the two. 471 00:26:01,108 --> 00:26:03,858 Yeah, I don't know if they're still in the research phase or if 472 00:26:03,858 --> 00:26:04,953 they're on the standards track yet. 473 00:26:05,263 --> 00:26:06,113 Very interesting. 474 00:26:06,163 --> 00:26:07,623 I have to look more into that. 475 00:26:08,153 --> 00:26:14,823 So, besides choosing the right CRDTs to fit into SQLite through virtual 476 00:26:14,823 --> 00:26:21,063 tables I think this, like, just picking SQLite to run in the browser quite a 477 00:26:21,063 --> 00:26:25,363 few years ago, that's probably you've must have been running into a bunch of 478 00:26:25,503 --> 00:26:27,663 challenges and issues along the way. 479 00:26:27,683 --> 00:26:31,823 So I'm curious to hear first getting sort of like a broad overview of 480 00:26:31,823 --> 00:26:34,923 the different challenges you were running into and then going into them. 481 00:26:35,310 --> 00:26:39,620 Yeah, I think I wasted months of my life, like fighting SQLite issues. 482 00:26:39,980 --> 00:26:44,230 Cause yeah, it was really early where the official SQLite WASM project 483 00:26:44,350 --> 00:26:45,500 had kind of like just started. 484 00:26:46,040 --> 00:26:50,587 And I guess some of the first problems I ran into was. 485 00:26:51,058 --> 00:26:54,938 I guess compiling an extension into the WASM build. 486 00:26:55,478 --> 00:26:57,818 So yeah, it's one thing to write a SQLite extension. 487 00:26:57,958 --> 00:27:02,528 It's another thing to get it to run correctly in WASM. 488 00:27:03,288 --> 00:27:06,708 It wasn't so hard when the extension was originally written in C, but like, 489 00:27:06,858 --> 00:27:10,578 I don't know, like trying to code review contributions in C and make sure there's 490 00:27:10,578 --> 00:27:13,068 no like memory issues was a nightmare. 491 00:27:13,178 --> 00:27:16,952 So I eventually like bit the bullet, started learning Rust and re implemented 492 00:27:16,982 --> 00:27:21,282 the extension in Rust, which I think was, yeah, it became like a third 493 00:27:21,392 --> 00:27:24,952 or two thirds less code and just way faster to write after doing that. 494 00:27:24,962 --> 00:27:25,882 I think it was worth the time. 495 00:27:26,477 --> 00:27:29,677 But that introduced this whole new complication of like getting the 496 00:27:29,677 --> 00:27:33,937 Rust compiler toolchain to compile an extension that can be linked to the 497 00:27:33,967 --> 00:27:39,237 SQLite compiler toolchain that then goes through Emscripten to build a Wasm bundle. 498 00:27:39,697 --> 00:27:41,807 And let me see what else was there. 499 00:27:42,447 --> 00:27:45,317 Transactions was an interesting thing. 500 00:27:45,647 --> 00:27:49,577 So the inner, the originally the interfaces to SQLite 501 00:27:49,617 --> 00:27:51,137 in the browser were async. 502 00:27:51,647 --> 00:27:55,492 And if you ever had Well, I guess there's two builds of SQLite. 503 00:27:55,522 --> 00:27:57,362 The official SQLite build was always synchronous. 504 00:27:57,462 --> 00:28:02,212 There's this unofficial build, which I think is better, called wa-sqlite, which 505 00:28:02,212 --> 00:28:03,432 all the interfaces were asynchronous. 506 00:28:03,942 --> 00:28:07,732 But it had these interesting caveats, like if you await two calls to 507 00:28:07,772 --> 00:28:09,372 SQLite at once, it would deadlock. 508 00:28:09,578 --> 00:28:13,738 So you had to make sure like there's only one Call to SQLite ever at once. 509 00:28:14,298 --> 00:28:16,148 So, you know, working around that challenge. 510 00:28:16,265 --> 00:28:20,915 Transaction interleaving was another, so like if it's asynchronous and you 511 00:28:20,915 --> 00:28:24,165 start a transaction and you have all the statements in the transaction, okay. 512 00:28:24,175 --> 00:28:26,995 You, you began the transaction and then you like do a read, 513 00:28:26,995 --> 00:28:27,825 you're waiting the read. 514 00:28:28,165 --> 00:28:30,895 Well, while you're waiting in the read, you delegate control of the event loop, 515 00:28:31,255 --> 00:28:35,795 so somebody else could come in and, you know, kick off a new task, right? 516 00:28:36,235 --> 00:28:38,605 A new statement to do a write or something. 517 00:28:39,105 --> 00:28:41,625 Well, now that's somebody else's transaction that's trying to run while 518 00:28:41,625 --> 00:28:44,585 your other transaction is running, so you get this, like, weird interleaving. 519 00:28:44,615 --> 00:28:48,195 So, yeah trying to lock out transactions in the JavaScript layer that was 520 00:28:48,195 --> 00:28:49,375 another annoying thing to deal with. 521 00:28:49,450 --> 00:28:55,598 And then, yeah, this is not related to WASM, but just getting SQLite 522 00:28:56,288 --> 00:28:59,098 with the extension built for all the platforms I wanted to target. 523 00:28:59,331 --> 00:29:04,261 So, like, eventually Expo wants to include it in their Expo SDK, so figuring 524 00:29:04,261 --> 00:29:09,051 out how to do Android and iOS builds in addition to, you know, the WASM build. 525 00:29:09,886 --> 00:29:11,796 And Mac and Linux builds. 526 00:29:12,325 --> 00:29:18,045 Those sound like quite the buffet of different challenges that some sounds 527 00:29:18,325 --> 00:29:23,815 familiar to me as well as I think some you've been pioneering a tad earlier. 528 00:29:23,855 --> 00:29:26,975 And by now, some of those are maybe a bit more. 529 00:29:27,150 --> 00:29:31,240 A well trodden path, but I think there's probably even more 530 00:29:31,260 --> 00:29:32,900 that we don't understand yet. 531 00:29:33,670 --> 00:29:36,630 One more, I know one more interesting one I think more interesting than 532 00:29:36,630 --> 00:29:41,680 the other ones is since SQLite in the browser, at least the one 533 00:29:41,680 --> 00:29:43,430 I'm using, is IndexedDB backed. 534 00:29:43,860 --> 00:29:48,200 IndexedDB has a very high cost for opening a transaction which 535 00:29:48,200 --> 00:29:49,290 I was unaware of at the time. 536 00:29:49,850 --> 00:29:51,620 So yeah, there's a lot of work. 537 00:29:51,862 --> 00:29:55,642 And anytime you open a SQLite transaction, it would open an IndexedDB transaction. 538 00:29:56,339 --> 00:30:00,669 So there's a lot of work making sure that, like, for reads Yeah, normally 539 00:30:00,669 --> 00:30:03,109 you just issue reads, you wouldn't necessarily batch them into a transaction. 540 00:30:03,119 --> 00:30:05,479 Like normally, you know, you'd batch your writes into a transaction. 541 00:30:06,139 --> 00:30:09,609 But yeah, to make SQLite in the browser go fast if you're doing a whole bunch 542 00:30:09,609 --> 00:30:14,439 of reads, like making sure a transaction was open automatically all the reads 543 00:30:14,439 --> 00:30:16,839 would happen and then the transaction would close once all the reads are done. 544 00:30:16,959 --> 00:30:20,989 So it's like way of batching that was like a 10x speed up in the browser for SQLite. 545 00:30:21,408 --> 00:30:26,808 So once you've solved all of those performance challenges and also like 546 00:30:26,858 --> 00:30:30,638 correctness, syncing challenges, et cetera, then it's also needs 547 00:30:30,638 --> 00:30:34,738 to be competitive somehow with the typical developer experience 548 00:30:34,758 --> 00:30:39,518 you have in a React app or in a different framework you're using. 549 00:30:40,238 --> 00:30:41,638 How did you go about designing that? 550 00:30:42,132 --> 00:30:47,415 Yeah, so, yeah, I've developed a set of hooks, so, I don't know, it's, I guess 551 00:30:47,415 --> 00:30:51,349 like the React integration was probably the least amount of, I don't know, 552 00:30:51,359 --> 00:30:52,349 I'd say it's the least amount of work. 553 00:30:52,389 --> 00:30:55,079 Maybe not once I like think about all the strict code problems and 554 00:30:55,079 --> 00:30:58,669 stuff but yeah, essentially the DevEx was like a use query hook. 555 00:30:59,159 --> 00:31:03,079 So I think it's a lot easier to develop when your queries are 556 00:31:03,109 --> 00:31:04,219 co located with your components. 557 00:31:04,540 --> 00:31:07,770 So every component is responsible for getting the data it needs. 558 00:31:08,051 --> 00:31:10,651 And this makes your app like super composable, right? 559 00:31:10,651 --> 00:31:15,331 Like you can add and remove components to your app or your component tree. 560 00:31:15,601 --> 00:31:18,361 And because they're responsible for getting their data, like, you 561 00:31:18,361 --> 00:31:20,831 don't have to do any additional wiring or prop drilling. 562 00:31:20,831 --> 00:31:26,501 And yeah, they're not going to break other components because suddenly, 563 00:31:26,881 --> 00:31:28,261 Some data dependencies are missing. 564 00:31:28,931 --> 00:31:29,771 So I really like that, yeah. 565 00:31:29,831 --> 00:31:33,831 Co located queries, so every component you could use this hook called UseQuery. 566 00:31:34,491 --> 00:31:39,051 And in that UseQuery, you just literally write your SQL that fetches 567 00:31:39,051 --> 00:31:40,671 the data required by that component. 568 00:31:41,221 --> 00:31:44,301 And that SQL could be as complex or as simple as you wanted. 569 00:31:44,891 --> 00:31:49,181 So, you know, like, select from issue where id equals issue id or something 570 00:31:49,181 --> 00:31:52,731 simple or, you know, something complex like select star from issue, 571 00:31:52,741 --> 00:31:57,721 order by modify, join labels join owner, and yeah, all these things. 572 00:31:58,251 --> 00:32:03,661 So you've managed to tame most challenges with SQLite in the browser, making sure 573 00:32:03,661 --> 00:32:08,021 it works in Wasm, making sure that your own extension is working, et cetera. 574 00:32:08,441 --> 00:32:14,451 And SQLite can be super, super fast for most workloads, but once your 575 00:32:14,461 --> 00:32:19,211 cardinality grows, et cetera, your tables get really big then queries can 576 00:32:19,221 --> 00:32:21,716 also start Taking a little bit longer. 577 00:32:22,136 --> 00:32:26,516 And that is because SQLite always kind of, besides a little bit of caching, 578 00:32:26,756 --> 00:32:31,066 always starts with your results for your queries from scratch, kind of. 579 00:32:31,456 --> 00:32:35,166 And you know, I've been chatting about this for quite a while back then. 580 00:32:35,466 --> 00:32:37,856 This is how I know about all of this context. 581 00:32:37,886 --> 00:32:39,636 And that has led you to look into. 582 00:32:40,006 --> 00:32:43,436 incrementally maintaining the query results and the views. 583 00:32:43,756 --> 00:32:46,756 So I'm very curious to hear more what you've learned back then 584 00:32:46,766 --> 00:32:48,896 and which path this led you down. 585 00:32:49,486 --> 00:32:49,936 Yeah. 586 00:32:50,196 --> 00:32:50,336 Yeah. 587 00:32:50,336 --> 00:32:53,286 So I guess, you know, I was talking earlier about each component 588 00:32:53,336 --> 00:32:56,326 fetches its own data, has this use query hook, or just, you know, 589 00:32:56,326 --> 00:32:57,926 specifies a SQL query it's running. 590 00:32:58,456 --> 00:33:02,096 So when you're developing like a rich client side application like this, 591 00:33:02,646 --> 00:33:04,796 ideally your database is reactive, right? 592 00:33:04,796 --> 00:33:06,596 So any, anytime some data changes. 593 00:33:07,146 --> 00:33:09,066 It's like you, you work on Overtone, right? 594 00:33:09,076 --> 00:33:09,756 The music app. 595 00:33:10,316 --> 00:33:16,386 If somebody presses play, then, you know, all the components that care about 596 00:33:16,386 --> 00:33:18,166 play state need to immediately update. 597 00:33:18,906 --> 00:33:22,591 And yeah, as you're saying in SQLite, if you're doing this somebody changes some 598 00:33:22,591 --> 00:33:27,588 state, well, you know, SQLite doesn't have any, you know, facilities for reactivity. 599 00:33:27,648 --> 00:33:29,368 It has a few but they're not very good. 600 00:33:29,868 --> 00:33:32,388 Like they don't give you many fine grained notions about what changed 601 00:33:32,388 --> 00:33:33,618 or like what queries it impacts. 602 00:33:34,168 --> 00:33:37,678 So what you're left doing is rerunning. 603 00:33:38,098 --> 00:33:41,598 Essentially from scratch the queries that could have been affected. 604 00:33:42,268 --> 00:33:46,018 So, like, the SQLite change notifications, they'll tell you the row ID that 605 00:33:46,018 --> 00:33:48,818 changed, but not the contents, and they'll tell you, like, the table. 606 00:33:49,228 --> 00:33:52,528 So it's usually not enough to figure out exactly which queries to invalidate. 607 00:33:52,879 --> 00:33:55,619 And yeah, so you rerun tons of queries throughout the app. 608 00:33:56,209 --> 00:34:00,079 And those are all rerunning from scratch, so like, if you have, like, 609 00:34:00,079 --> 00:34:03,779 for your track list, you're showing a few hundred tracks, and that track list 610 00:34:03,779 --> 00:34:05,589 has to do a number of joins, right? 611 00:34:05,589 --> 00:34:08,919 Has to, for a track, has to join the album to get the album title, has to 612 00:34:08,919 --> 00:34:11,949 join the artist to get the artist names. 613 00:34:12,569 --> 00:34:13,689 Yeah, I don't know if there's any other joins. 614 00:34:13,719 --> 00:34:18,119 Maybe there could be joins about, like, like status, or favorite status, 615 00:34:18,289 --> 00:34:19,539 or something about the track, right? 616 00:34:20,209 --> 00:34:25,049 And, yeah, rerunning that join to select, you know, 200 some items. 617 00:34:25,509 --> 00:34:28,789 And yeah, maybe they just want to sort, and sorting it doing that from 618 00:34:28,789 --> 00:34:32,909 scratch every time somebody mutates something severely limits I guess how 619 00:34:32,909 --> 00:34:34,149 fast you can interact with the app. 620 00:34:34,749 --> 00:34:38,499 And yeah, I'd worked, tried a bunch with like, okay, you know, 621 00:34:38,499 --> 00:34:41,539 maybe a purely in memory SQLite, and that gets you pretty fast. 622 00:34:42,099 --> 00:34:45,119 But yeah, somebody was saying, oh, like, The goal for a reactive 623 00:34:45,119 --> 00:34:46,419 database is it's memory fast. 624 00:34:47,099 --> 00:34:50,669 And then when I started, like, I don't know, that idea, a lot just 625 00:34:50,669 --> 00:34:54,049 helped my brain and I started like actually benchmarking, you know, 626 00:34:54,729 --> 00:34:57,399 if I'm just observing a value in JavaScript, how fast is that? 627 00:34:57,809 --> 00:35:02,699 And if I'm like, Observing a query in SQLite and updating the 628 00:35:02,699 --> 00:35:03,909 row and re running the query. 629 00:35:03,909 --> 00:35:04,679 Like, what is that? 630 00:35:04,689 --> 00:35:08,109 And I have a observable notebook somewhere that like compares all this 631 00:35:08,109 --> 00:35:09,589 and like the difference was massive. 632 00:35:09,989 --> 00:35:13,079 So like just seeing how far I was from memory fast, like started 633 00:35:13,089 --> 00:35:15,589 making me a bit sad on SQLite. 634 00:35:15,819 --> 00:35:20,919 And also like benchmarking the original strut and the new one, which use 635 00:35:20,919 --> 00:35:22,569 SQLite and these use query hooks. 636 00:35:23,099 --> 00:35:27,239 Yeah, like the original one I could throttle my Chrome, you know, but in 637 00:35:27,249 --> 00:35:31,769 the dev tools you can like downgrade your CPU right to like 6x slower or 638 00:35:31,769 --> 00:35:35,449 something and I could go all the way down to the slowest possible and everything 639 00:35:35,449 --> 00:35:40,839 was buttery smooth, but in the new one, like if I downgraded it too much, 640 00:35:40,849 --> 00:35:42,289 like you could see some visible lag. 641 00:35:42,349 --> 00:35:43,069 SQLite 642 00:35:45,129 --> 00:35:49,399 is fundamentally built in this request response style, right? 643 00:35:50,389 --> 00:35:55,079 This is the era of the LAMP stack where, you know, databases request 644 00:35:55,079 --> 00:35:56,129 response made sense, right? 645 00:35:56,189 --> 00:35:59,839 A user would go to the website it would do a query against the 646 00:35:59,839 --> 00:36:01,099 DB and it would render, right? 647 00:36:01,099 --> 00:36:03,179 There was no, like, rich interactivity. 648 00:36:03,179 --> 00:36:04,549 It was always a full page refresh every time. 649 00:36:04,639 --> 00:36:05,919 So request response made sense. 650 00:36:06,449 --> 00:36:09,169 Yeah, but for these rich apps, request response no longer makes sense. 651 00:36:09,169 --> 00:36:13,859 And trying to fit a request response sort of DB designed around that, 652 00:36:13,859 --> 00:36:15,789 it into this reactive scenario. 653 00:36:15,819 --> 00:36:19,239 And I think maybe one day somebody can get there. 654 00:36:19,289 --> 00:36:23,309 But I think right now it didn't seem to make much sense to me and the juice 655 00:36:23,309 --> 00:36:24,539 didn't seem to be worth the squeeze. 656 00:36:25,149 --> 00:36:29,829 So I started, I guess my love affair with SQLite has slowly been coming to a close. 657 00:36:30,249 --> 00:36:32,799 And I started, you know, investigating other projects. 658 00:36:33,369 --> 00:36:36,249 Yeah maybe one, one day port it if there's time or. 659 00:36:36,723 --> 00:36:39,383 porting some of these projects back to SQLite and contributing 660 00:36:39,616 --> 00:36:42,456 incremental data flow, differential data flow to them or something. 661 00:36:43,146 --> 00:36:45,026 But yeah, I guess I should describe what these projects are, right? 662 00:36:45,719 --> 00:36:51,499 , So I like the idea of queries, being able to like, Issue a complex declarative query 663 00:36:52,229 --> 00:36:56,669 against some set of relational data to get back the data you need for your view. 664 00:36:56,853 --> 00:37:00,403 So I wanted to like, how can I , incrementally maintain an 665 00:37:00,413 --> 00:37:01,963 arbitrary query against some data? 666 00:37:02,213 --> 00:37:06,268 So like that track list example you query the, Tracks and their 667 00:37:06,278 --> 00:37:09,378 artists, and their albums, and the play state, and all these things. 668 00:37:10,058 --> 00:37:13,358 How can we, when somebody does it right, rather than re running 669 00:37:13,358 --> 00:37:17,738 the query to get the track list it knows exactly which queries. 670 00:37:17,938 --> 00:37:22,298 Should be invalidated by that write, and rather than re running them, 671 00:37:22,298 --> 00:37:26,784 knows exactly how to patch up the rows that are impacted by the write. 672 00:37:27,193 --> 00:37:31,680 So, you know, at first this problem seems absurdly complex and hard. 673 00:37:32,100 --> 00:37:34,730 And you're like, Oh, yeah, I want to like invalidate and 674 00:37:34,860 --> 00:37:36,870 patch up any arbitrary SQL query. 675 00:37:37,171 --> 00:37:39,171 But yeah, I started reading some of the papers. 676 00:37:39,171 --> 00:37:40,641 There's like a paper called DBSP. 677 00:37:41,111 --> 00:37:43,071 It's a differential data flow paper. 678 00:37:43,761 --> 00:37:47,101 I think Materialize is based on that. 679 00:37:47,781 --> 00:37:49,321 And then, yeah, started implementing it. 680 00:37:49,361 --> 00:37:52,738 And then I realized like, I don't know, this, once you've read it and started 681 00:37:52,738 --> 00:37:56,738 implementing it and you realize like, this is very similar to other stream 682 00:37:56,738 --> 00:38:02,205 processing stuff , like, and other query, you know, builders I've done,, right, 683 00:38:02,245 --> 00:38:06,686 where rather than, you know, you create a series of operations and rather than 684 00:38:06,686 --> 00:38:11,696 them asking a database for the data, they're like taking streaming data in. 685 00:38:12,123 --> 00:38:17,210 So yeah, like some of the work I'd done at Meta was on real time abuse systems 686 00:38:17,570 --> 00:38:19,730 and that was all streaming systems. 687 00:38:20,320 --> 00:38:22,840 So I was like, oh, like this problem isn't that mysterious. 688 00:38:22,840 --> 00:38:24,780 Like I've done stuff like this before. 689 00:38:25,040 --> 00:38:26,370 It seems tractable. 690 00:38:26,566 --> 00:38:27,356 So I, you know. 691 00:38:27,801 --> 00:38:32,811 went full on, decided like, I think we, I think I can implement this yeah, based 692 00:38:32,811 --> 00:38:36,431 on prior experience, based on having done query languages before so yeah, 693 00:38:36,481 --> 00:38:43,391 MaterialLite was born which is bringing differential data flow to JavaScript, so 694 00:38:43,391 --> 00:38:48,861 you can compose, filter, map, reduce, and a join operator to create these pretty 695 00:38:48,871 --> 00:38:52,761 rich queries and have them reactively updated anytime there's a write. 696 00:38:53,115 --> 00:38:53,435 Right. 697 00:38:53,455 --> 00:38:57,775 And so for those of the listeners who have not yet built their own database 698 00:38:58,165 --> 00:39:04,105 and maybe are familiar with like SQL where in SQL, you have like select from 699 00:39:04,105 --> 00:39:08,515 where, and all of those and JavaScript or other programming languages who you 700 00:39:08,515 --> 00:39:10,905 might've like, Map, filter, et cetera. 701 00:39:11,175 --> 00:39:13,145 They might have different names, but they're sort of 702 00:39:13,145 --> 00:39:14,825 like conceptually very similar. 703 00:39:15,345 --> 00:39:20,175 And this is where you're basically just now like trying to recreate the same 704 00:39:20,175 --> 00:39:25,635 semantics that we have from SQL, where you can say select star from this where so, 705 00:39:25,735 --> 00:39:31,830 and you can basically have like an array where it now say .Map .Filter, et cetera, 706 00:39:32,280 --> 00:39:38,420 and where you flip the trade offs from before you owned in SQLite, you might get 707 00:39:38,420 --> 00:39:40,650 this query just once and then it's done. 708 00:39:40,690 --> 00:39:45,630 You no longer are interested in like subsequent changes and then like, smaller 709 00:39:45,650 --> 00:39:49,950 updates, but in JavaScript where our app stays warm, we click the button. 710 00:39:49,950 --> 00:39:51,950 Something's changed slightly. 711 00:39:52,190 --> 00:39:54,940 You want to change the trade offs quite significantly. 712 00:39:55,270 --> 00:39:57,030 And that has led you to Materialite. 713 00:39:57,448 --> 00:40:02,518 Yeah, so like just to build off on like how SQL concepts map to like filter, 714 00:40:02,528 --> 00:40:08,018 map, and reduce, or like map represents select or function application, like 715 00:40:08,138 --> 00:40:12,178 you're selecting some columns, well that's mapping the original object to some 716 00:40:12,198 --> 00:40:14,158 other object with a subset of the fields. 717 00:40:14,838 --> 00:40:19,188 Filter is the same as where Reduce is what you use for any aggregate 718 00:40:19,188 --> 00:40:20,988 function, like group by, sum, count. 719 00:40:21,608 --> 00:40:24,078 Yeah, and I guess one other operator would be concat. 720 00:40:24,218 --> 00:40:28,548 So if you want to do or, you just, you fork your stream that you filtered 721 00:40:28,548 --> 00:40:32,808 or mapped over, and then after you've filtered it, you concat the streams back 722 00:40:32,808 --> 00:40:34,328 together, and then you run a distinct. 723 00:40:34,638 --> 00:40:38,798 So you can do an or and then join, joins a special thing where essentially. 724 00:40:39,343 --> 00:40:43,013 You kind of maintain what you've seen from one stream and then as you 725 00:40:43,023 --> 00:40:46,653 see results from another stream, you link them together based on some key. 726 00:40:46,943 --> 00:40:51,407 At first, it like, Seemed like a large barrier, but then I don't know, once you 727 00:40:51,407 --> 00:40:53,787 get into it and you realize, Oh, like I've done all this before and filter, 728 00:40:53,797 --> 00:40:58,216 map, reduce, like these have direct analogs in SQL, like it's not so bad. 729 00:40:58,713 --> 00:41:04,193 So we've so far like touched the most common usage of SQL. 730 00:41:04,598 --> 00:41:08,818 But SQL has, or SQLite, Postgres, et cetera, but like sticking with 731 00:41:08,818 --> 00:41:12,818 SQLite for a moment there's quite a bit more than like the common 732 00:41:12,828 --> 00:41:15,148 like select from where, et cetera. 733 00:41:15,148 --> 00:41:18,948 There's like various kinds of aggregations, various kinds of ways 734 00:41:18,948 --> 00:41:22,898 to combine queries, sub queries, joins, various kinds of joins. 735 00:41:23,298 --> 00:41:27,418 Is the system that you've came up with MaterialLite, is that on feature 736 00:41:27,418 --> 00:41:30,668 parity on what SQLite is able to do? 737 00:41:31,118 --> 00:41:35,158 Or are there some parts where you say, okay, that's out of scope because 738 00:41:35,158 --> 00:41:40,318 there's years worth, maybe decades worth of optimization have gone into 739 00:41:40,318 --> 00:41:43,918 SQLite into making that fast and that's out of scope for Materialite. 740 00:41:44,358 --> 00:41:44,638 Yeah. 741 00:41:44,658 --> 00:41:47,768 So there's no window functions and there's no recursive queries. 742 00:41:48,343 --> 00:41:52,593 And base Materialite is strictly for streaming. 743 00:41:52,823 --> 00:41:57,896 So, like, if you're processing a stream of writes or events, like 744 00:41:57,936 --> 00:42:01,570 Materialite is a great fit because it's gonna, you know, patch up your 745 00:42:01,570 --> 00:42:03,530 query results as those events come in. 746 00:42:04,060 --> 00:42:09,576 But say you have a table of a million items and you want to 747 00:42:09,596 --> 00:42:11,936 query it from scratch, right? 748 00:42:11,966 --> 00:42:15,056 Like you didn't see the events, you couldn't maintain any queries. 749 00:42:15,136 --> 00:42:16,996 You're just like, I want to know what's in this table. 750 00:42:17,476 --> 00:42:21,196 So, so you're going to have to run that query from scratch, even in Materialite. 751 00:42:21,246 --> 00:42:23,766 But for Materialite, what that means running a query from scratch is like 752 00:42:24,016 --> 00:42:27,966 setting up the Dataflow pipeline and then feeding literally every row, 753 00:42:27,966 --> 00:42:29,546 all million rows into that pipeline. 754 00:42:30,096 --> 00:42:34,246 Which, yeah, since everything's implemented with, like, filter and map, 755 00:42:34,306 --> 00:42:38,056 where, like, it's opaque, like, when you do a filter, you provide a lambda. 756 00:42:38,056 --> 00:42:38,596 It's opaque. 757 00:42:38,656 --> 00:42:42,536 That's what fields you're filtering on which makes this from scratch case hard 758 00:42:42,536 --> 00:42:45,891 because, like, Oh, if I don't know what fields you're filtering on, then I don't 759 00:42:45,891 --> 00:42:49,171 know if I can apply some sort of index that you might have against this data. 760 00:42:49,771 --> 00:42:54,575 So, yeah, I guess after working on Material 8 and getting it to 761 00:42:54,575 --> 00:42:56,945 work pretty well, like, okay, that was the next problem. 762 00:42:57,415 --> 00:43:01,231 And it's like, we need an actual query language because in a 763 00:43:01,231 --> 00:43:02,521 query language is declarative. 764 00:43:02,571 --> 00:43:05,081 You express, like, This is the field name I'm filtering on. 765 00:43:05,081 --> 00:43:07,371 This is the operator being used and this is the value. 766 00:43:07,781 --> 00:43:12,691 So the engine knows exactly where the fields are ordering by the tables you're 767 00:43:12,761 --> 00:43:14,421 using the fields you're filtering on. 768 00:43:15,011 --> 00:43:19,281 And once you have that knowledge, you can also make that like first query 769 00:43:19,291 --> 00:43:23,781 case fast, where I guess you're doing the traditional database thing of like 770 00:43:24,091 --> 00:43:25,531 figuring out what indices to pick. 771 00:43:26,151 --> 00:43:29,411 I don't think we'll ever, Be as fast as SQLite for that case. 772 00:43:29,851 --> 00:43:32,828 But I think, you know, For an app like, for a rich client 773 00:43:32,838 --> 00:43:35,145 like, Overtone or Strut, right? 774 00:43:35,735 --> 00:43:37,375 , Most of your queries are subscriptions, they're not this 775 00:43:37,375 --> 00:43:38,815 like from scratch Sort of thing. 776 00:43:38,935 --> 00:43:42,796 Like, You're gonna subscribe, you're gonna like, You're gonna like, Set up 777 00:43:42,796 --> 00:43:45,896 a query that subscribes to the slide list, set up a query that subscribes 778 00:43:45,896 --> 00:43:47,656 to the component list on a slide. 779 00:43:48,206 --> 00:43:51,776 And then as writes happen, you just want to update those rather than, 780 00:43:52,016 --> 00:43:55,646 yeah, always, like, rather than having to have first class support 781 00:43:55,656 --> 00:43:57,486 for running a query from scratch. 782 00:43:58,226 --> 00:44:00,896 So, yeah, so that latter thing needs to be just fast enough, but 783 00:44:00,896 --> 00:44:03,996 it doesn't need to be, like, Yeah, the fastest thing in the world. 784 00:44:04,446 --> 00:44:08,756 I totally agree that there's a very interesting, different kinds of trade 785 00:44:08,756 --> 00:44:14,046 off that should be explored and should be built that makes so much more sense for a 786 00:44:14,046 --> 00:44:20,016 live active application where everything stays up to date and then changes 787 00:44:20,016 --> 00:44:25,981 quite minimally as opposed to request response, which is how SQLite works today. 788 00:44:26,258 --> 00:44:31,178 That being said, you can still get quite a bit of mileage out of SQLite today. 789 00:44:31,368 --> 00:44:36,438 If you are a little bit more mindful about like how you work with it. 790 00:44:36,748 --> 00:44:42,228 So some patterns that I found to Makes SQLite work for me with Overtone. 791 00:44:42,568 --> 00:44:48,428 Is that what makes SQLite the slowest in a browser context is shoveling data 792 00:44:48,458 --> 00:44:54,548 from within the SQLite database into JavaScript land and back and forth as 793 00:44:54,548 --> 00:44:59,198 this is where you need to cross the memory boundaries from memory staying 794 00:44:59,198 --> 00:45:04,218 within Wasm to how it's being like marshaled into JavaScript objects. 795 00:45:05,043 --> 00:45:11,433 And what I found as a pattern that helps quite a lot is just keeping that small. 796 00:45:11,453 --> 00:45:15,893 So if you need like a, and this is where I think you can apply quite a few of the 797 00:45:15,903 --> 00:45:20,503 tricks that you've now mentioned and you embrace with Materialite, you can actually 798 00:45:20,513 --> 00:45:23,353 bring back and layer on top of SQLite. 799 00:45:23,453 --> 00:45:26,803 Probably still not quite as fast as Materialite. 800 00:45:27,098 --> 00:45:31,268 But probably fast enough to make a use case like Overtone work within that 801 00:45:31,268 --> 00:45:36,138 paradigm as well, that when the list changes, instead of like, getting a 802 00:45:36,138 --> 00:45:40,378 new list of 5, 000 tracks for that playlist, every time those 5, 000 803 00:45:40,428 --> 00:45:45,648 tracks, you just get a change in terms of like, Oh, this new track was added. 804 00:45:46,088 --> 00:45:52,108 And getting that SQLite, I think can also be done with a few tricks namely 805 00:45:52,108 --> 00:45:58,498 also like a temporary table where you can save the previous result of your 806 00:45:58,538 --> 00:46:01,558 query before and after the change. 807 00:46:01,588 --> 00:46:04,858 And then you query that in a diffing way. 808 00:46:05,278 --> 00:46:08,748 And then you still need to sort of like work with that 809 00:46:08,808 --> 00:46:10,878 diff information in JavaScript. 810 00:46:11,198 --> 00:46:14,468 But this is how you can work around that performance challenge. 811 00:46:14,788 --> 00:46:16,898 But your system with Materialite. 812 00:46:17,163 --> 00:46:20,303 All of that is absorbed from you, which is very attractive. 813 00:46:20,363 --> 00:46:24,233 Yeah, and I think right, so Materialite, right now, it can be backed by 814 00:46:24,563 --> 00:46:28,453 an in memory collection, or it's just some little interface that's 815 00:46:28,453 --> 00:46:29,633 called a source that you implement. 816 00:46:30,143 --> 00:46:33,893 So I think, Yeah, SQLite could be that source, and then that would 817 00:46:33,893 --> 00:46:35,603 solve the like, first query problem. 818 00:46:36,163 --> 00:46:39,913 So I just compile the Materialite queries to SQL queries. 819 00:46:40,136 --> 00:46:43,876 and then the incremental updates are maintained by Materialite going forward. 820 00:46:44,083 --> 00:46:46,833 Yeah, obviously you have to like, listen to the transaction failures 821 00:46:46,833 --> 00:46:48,163 and somehow roll those back. 822 00:46:48,603 --> 00:46:51,903 But yeah, I think it's doable to like, Marry those two together and 823 00:46:51,913 --> 00:46:53,803 have kind of the best of both worlds. 824 00:46:54,373 --> 00:46:56,793 And I know there's like a project called GRDB. 825 00:46:56,793 --> 00:47:02,313 I think it's like a iOS SQLite wrapper which adds a bunch of reactivity. 826 00:47:02,673 --> 00:47:04,143 it's not like completely general. 827 00:47:04,203 --> 00:47:06,983 It doesn't support like completely arbitrary SQL queries, at 828 00:47:06,983 --> 00:47:07,993 least in an efficient way. 829 00:47:08,033 --> 00:47:11,083 But I think it gets you close enough with the queries it does support. 830 00:47:11,463 --> 00:47:15,100 One more thing about like, moving from the request response style to 831 00:47:15,100 --> 00:47:17,030 like, queries being subscriptions. 832 00:47:17,610 --> 00:47:19,910 It gives you these really cool opportunities to optimize 833 00:47:19,910 --> 00:47:21,630 queries at a global level. 834 00:47:22,230 --> 00:47:24,960 So like, if all your queries are subscriptions, you can see 835 00:47:24,960 --> 00:47:26,150 what subscriptions are open. 836 00:47:26,740 --> 00:47:29,960 And then you can start seeing which queries are either 837 00:47:29,960 --> 00:47:34,610 identical and deduplicate or which queries share operations. 838 00:47:34,860 --> 00:47:37,270 So like, maybe you have a whole bunch of queries that all do the same join. 839 00:47:37,830 --> 00:47:39,320 While in the streaming system. 840 00:47:39,685 --> 00:47:44,425 You can optimize that so that join happens once and then the results of the join are 841 00:47:44,425 --> 00:47:45,995 fanned out to all the queries that use it. 842 00:47:46,375 --> 00:47:48,755 So yeah, you can do some really neat tricks at a global level once you 843 00:47:49,065 --> 00:47:50,645 model your queries or subscriptions. 844 00:47:50,814 --> 00:47:52,404 Yeah, that's super interesting. 845 00:47:52,434 --> 00:47:58,444 And a few of those things I'm exploring for LiveStore myself where they're 846 00:47:58,444 --> 00:48:00,984 basically the problem of reactivity. 847 00:48:01,239 --> 00:48:05,619 And efficiency, trying to keep the performance as good as possible, 848 00:48:05,649 --> 00:48:10,419 but also squeezing that into a frame budget since LiveStore and Overton 849 00:48:10,429 --> 00:48:14,109 all run within the main thread, which I'm not sure that's probably 850 00:48:14,109 --> 00:48:15,949 also the way how MaterialLite works. 851 00:48:16,519 --> 00:48:20,739 There is Interesting to also see it through a lens of like a different 852 00:48:21,059 --> 00:48:27,299 paradigm shift evolution in the JavaScript front end world which is about signals. 853 00:48:27,649 --> 00:48:32,859 And signals is also about like, I think a more lighter weight reactivity system 854 00:48:32,869 --> 00:48:35,129 that is all about being composable. 855 00:48:35,159 --> 00:48:39,389 And and I think those worlds are now finding, like, a way together 856 00:48:39,539 --> 00:48:44,849 within live store with the prior work with Jeffrey and Nicholas on Riffle. 857 00:48:45,129 --> 00:48:50,379 This was like some early implementations based on a paper called Adapt On. 858 00:48:50,859 --> 00:48:55,559 And from my understanding all of like the research that has gone into adapt on. 859 00:48:56,049 --> 00:48:59,999 And sort of like in parallel, what's been evolving as signals. 860 00:49:00,289 --> 00:49:01,759 Those are actually the same thing. 861 00:49:02,139 --> 00:49:07,099 And I think now we can do some really interesting work on combining the 862 00:49:07,109 --> 00:49:13,284 power that like more substantial state management primitives, Like a database 863 00:49:13,364 --> 00:49:17,354 gives us or something like material light gives us and bring that together 864 00:49:17,684 --> 00:49:23,544 with something like a signal system that also kind of out of the box address the 865 00:49:23,554 --> 00:49:27,754 thing you've just mentioned, which is reusing parts of the computational graph. 866 00:49:28,234 --> 00:49:29,864 The only thing signals is missing. 867 00:49:29,864 --> 00:49:31,224 I'm surprised nobody's done this yet. 868 00:49:31,514 --> 00:49:34,334 Is Incremental computation against collections, right? 869 00:49:34,334 --> 00:49:37,554 So you think of like, I have an array, and I map the array, then I filter 870 00:49:37,554 --> 00:49:39,714 the array, then I reduce the array. 871 00:49:39,764 --> 00:49:42,134 That's like creating a new copy of the array each time. 872 00:49:42,434 --> 00:49:45,484 And if you like, add one element to the array, you like, rerun all 873 00:49:45,484 --> 00:49:47,174 those against the full array again. 874 00:49:47,214 --> 00:49:48,924 So it's kind of like the query from scratch case, right? 875 00:49:49,181 --> 00:49:52,276 Whereas Materialite, right, You add an element to the array, well, 876 00:49:52,286 --> 00:49:55,016 it's only going to run the filter on that one element and the map 877 00:49:55,016 --> 00:49:58,226 on that one element rather than running against the full array. 878 00:49:58,286 --> 00:50:01,066 So there's some interesting benchmarks maybe I can share with you after of 879 00:50:01,066 --> 00:50:05,506 like, yeah, incrementally maintaining a map filter reduce pipeline 880 00:50:05,526 --> 00:50:07,426 against an array of a million items. 881 00:50:07,606 --> 00:50:10,366 Yeah, so it'd be really cool if like the signals implementers would even 882 00:50:10,386 --> 00:50:12,276 implement this sort of stuff eventually. 883 00:50:12,392 --> 00:50:17,772 So even though your love affair with SQLite has for now come to an end. 884 00:50:18,132 --> 00:50:22,962 I think the problems and goals you're chasing after are still very 885 00:50:22,992 --> 00:50:27,442 similar to what I think is feasible with SQLite, just maybe with like 886 00:50:27,442 --> 00:50:29,012 a different performance footprint. 887 00:50:29,372 --> 00:50:32,162 I think with SQLite you also get the benefit. 888 00:50:32,552 --> 00:50:35,982 of just that being like a very familiar and trusted thing that's 889 00:50:35,992 --> 00:50:40,292 like been proven and battle tested for literally decades now. 890 00:50:40,632 --> 00:50:44,032 And even though you said like you were running into some performance 891 00:50:44,052 --> 00:50:49,362 issues SQLite is still so ridiculously fast that it's very feasible to do 892 00:50:49,362 --> 00:50:52,972 write those applications in the, in a browser context, et cetera. 893 00:50:52,972 --> 00:50:57,592 And I think even if you now on a modern CPU straddle your your 894 00:50:57,592 --> 00:50:59,972 CPU, it's still manageable. 895 00:51:00,652 --> 00:51:04,762 But I think there's multiple paths to arrive at the same goal. 896 00:51:05,202 --> 00:51:11,282 And so as my understanding is that the project Materialite opened a couple of 897 00:51:11,282 --> 00:51:16,972 really interesting follow up conversations for you which is now opening a new 898 00:51:16,972 --> 00:51:18,942 chapter for your professional life. 899 00:51:19,212 --> 00:51:20,702 So do you want to tell us a bit more about that? 900 00:51:21,367 --> 00:51:25,167 Yeah, I guess, yeah, Materialite is my least known project, 901 00:51:25,207 --> 00:51:26,867 least complete project. 902 00:51:27,307 --> 00:51:32,787 But yeah, it did pick up some interest by Rocicorp specifically Aaron, who I 903 00:51:32,787 --> 00:51:35,157 met at the local-first conference in St. 904 00:51:35,157 --> 00:51:35,467 Louis. 905 00:51:36,147 --> 00:51:39,317 And yeah, we're talking about, you know, the model of subscript, those queries as 906 00:51:39,317 --> 00:51:42,967 subscriptions and like having a reactive app and I think he's the one that said 907 00:51:42,967 --> 00:51:44,707 like, oh, things need to be memory fast. 908 00:51:45,207 --> 00:51:49,757 And yeah, so, so I started working on MaterialLite and we did some explorations 909 00:51:49,767 --> 00:51:54,267 of like, if you powered one of their demo apps with MaterialLite, like they 910 00:51:54,267 --> 00:51:56,837 have a linear clone that's a demo app and they want to see like, Oh, 911 00:51:56,837 --> 00:51:58,557 how many issues can we scale this to? 912 00:51:58,567 --> 00:52:01,387 If we like use this differential data flow approach. 913 00:52:02,097 --> 00:52:02,347 Yeah. 914 00:52:02,347 --> 00:52:04,617 And they really liked kind of the initial results. 915 00:52:05,197 --> 00:52:07,670 And yeah, we worked together, I guess. 916 00:52:08,089 --> 00:52:10,439 quite a bit over the last four slash six months. 917 00:52:10,499 --> 00:52:12,699 There was like a two month break for me on paternity leave. 918 00:52:13,249 --> 00:52:18,669 And yeah, so I'll be starting a job with Rocicorp here pretty soon as a partner. 919 00:52:18,709 --> 00:52:19,799 Yeah, I'm super excited about that. 920 00:52:20,569 --> 00:52:21,459 That's amazing. 921 00:52:21,509 --> 00:52:23,349 I mean, we had Aaron on the show. 922 00:52:23,379 --> 00:52:27,639 I think that the second show for localfirst.fm, and there were so many 923 00:52:28,009 --> 00:52:33,539 deep insights there by Aaron, and I'm a big fan of the products that's already 924 00:52:33,539 --> 00:52:39,879 been built there, Replicache, Reflect, and sounds like your work with Materialite 925 00:52:40,149 --> 00:52:44,039 is also going to be incorporated into making the products even better. 926 00:52:44,479 --> 00:52:49,959 So I'm curious to hear more of your thoughts on you know, what do you feel 927 00:52:49,959 --> 00:52:54,919 like where's local-first going with the products you're working on with Rosicorp? 928 00:52:55,330 --> 00:52:58,990 Local-first provides a superior DevX because you're no longer 929 00:52:59,000 --> 00:53:02,940 worried about APIs of, like, how do I get data from the server? 930 00:53:03,370 --> 00:53:07,870 Like, a sync engine, or your database is solving, you have a sync engine solving 931 00:53:07,870 --> 00:53:11,300 that for you, right, where you're just coding against local data rather than 932 00:53:11,310 --> 00:53:14,280 setting up all these weird REST APIs that you have to call to get the data. 933 00:53:14,620 --> 00:53:19,207 And yeah, I think that's the key value of local-first and the main thing 934 00:53:19,207 --> 00:53:20,897 that Rocicorp is pushing forward. 935 00:53:21,927 --> 00:53:24,367 It's like making that sync engine experience as best, as 936 00:53:24,697 --> 00:53:25,657 good as it can possibly be. 937 00:53:25,950 --> 00:53:29,970 So you're now joining Rocicorp where you're building Replicache and Reflect, 938 00:53:30,000 --> 00:53:34,674 but the local-first space has grown quite significantly to a point that 939 00:53:34,674 --> 00:53:36,860 can be almost a bit disorienting. 940 00:53:36,920 --> 00:53:41,380 For newcomers who are trying to first figure out what does it mean to build 941 00:53:41,380 --> 00:53:45,630 the app local-first and then also choose the right tool for the job. 942 00:53:45,910 --> 00:53:49,880 Can you provide a little bit of guidance which how someone should think about 943 00:53:49,880 --> 00:53:51,770 use cases and picking the right tool? 944 00:53:52,200 --> 00:53:52,590 Yeah. 945 00:53:52,770 --> 00:53:56,970 So I guess maybe when you're somebody, when the first question somebody 946 00:53:56,970 --> 00:54:00,720 asks, should ask themselves is, "Do they want servers involved at all?" 947 00:54:01,110 --> 00:54:01,920 Is the first question. 948 00:54:02,620 --> 00:54:06,470 So, like, if you want all sync to go through a central authoritative 949 00:54:06,470 --> 00:54:10,430 server, and you want that server to be able to override decisions made 950 00:54:10,430 --> 00:54:15,650 by the client apply data integrity rules revert changes made by a client 951 00:54:16,210 --> 00:54:18,080 Replicash is a very good choice for that. 952 00:54:18,730 --> 00:54:22,887 If you want something, Where, you know, you don't want to preclude the 953 00:54:22,887 --> 00:54:27,737 ability for peer, for nodes to sync peer to peer, or maybe the data they're 954 00:54:27,737 --> 00:54:29,177 syncing is end to end encrypted. 955 00:54:29,687 --> 00:54:33,087 I think something like CR-SQLite actually shines there. 956 00:54:33,137 --> 00:54:37,387 CR-SQLite lets you merge in any topology you want, whether it be 957 00:54:37,417 --> 00:54:39,047 peer to peer or hub and spoke. 958 00:54:39,477 --> 00:54:44,307 So we've seen a couple of companies that are, have like end to end encryption, 959 00:54:44,397 --> 00:54:48,617 where they use CR-SQLite to sync, or they have literally a pub sub 960 00:54:48,617 --> 00:54:51,977 channel, where all peers just publish their messages, and any peer who 961 00:54:51,977 --> 00:54:53,717 receives the messages integrates it. 962 00:54:54,167 --> 00:54:57,640 In those use cases and then I guess if you're really looking for, like, 963 00:54:57,814 --> 00:55:01,834 if your data model is like document based, I guess going back to syncing, 964 00:55:01,874 --> 00:55:05,584 so not just CR-SQLite for peer to peer, but also Yjs and AutoMerge 965 00:55:05,594 --> 00:55:06,804 is capable of peer to peer sync. 966 00:55:07,384 --> 00:55:09,304 They're also, of course, if you can sync peer to peer, you can 967 00:55:09,304 --> 00:55:10,394 sync through central server too. 968 00:55:11,134 --> 00:55:16,204 If you're looking for, you know, you just want to support a collaborative text 969 00:55:16,204 --> 00:55:22,414 field collaborative text doc and you don't need, you know, ways to query over your 970 00:55:22,414 --> 00:55:26,754 documents, like you're fine just storing this doc in a blob in your Whatever your 971 00:55:26,754 --> 00:55:30,984 storage is you know, Yjs is a very good choice because it's primarily targeting 972 00:55:30,984 --> 00:55:33,924 it at text editing, collaborative text but they don't have, you know, 973 00:55:33,924 --> 00:55:38,624 facilities like built in for persistence or querying and these sorts of things. 974 00:55:38,867 --> 00:55:42,667 So I haven't kept super up to date with AutoMerge, but from what I've seen if 975 00:55:42,667 --> 00:55:47,187 you want like, the ability to fork and look at different portions of history. 976 00:55:47,277 --> 00:55:50,647 So like, say people are editing a document somebody wants to fork their 977 00:55:50,647 --> 00:55:55,177 document off, make a bunch of changes and merge the fork back in I mean, I see 978 00:55:55,177 --> 00:55:57,657 AutoMerge talks about this concept a lot. 979 00:55:57,997 --> 00:56:02,647 So I think if it's not there already, AutoMerge would be your best bet for some. 980 00:56:03,267 --> 00:56:04,297 for that sort of behavior. 981 00:56:04,877 --> 00:56:05,937 Yeah, then there's ElectricSQL. 982 00:56:06,837 --> 00:56:11,347 So that is like a, you know, Postgres centric solution. 983 00:56:11,737 --> 00:56:17,294 So if your database is Postgres and you're okay with, like, merging based on CRDT 984 00:56:17,294 --> 00:56:18,564 rules You know, that's a good choice. 985 00:56:18,564 --> 00:56:21,664 So yeah, so I say like your back end is Postgres and you're 986 00:56:21,664 --> 00:56:23,094 okay like using CRDT rules. 987 00:56:23,104 --> 00:56:24,444 Like yeah, what does that mean? 988 00:56:24,764 --> 00:56:27,534 So a really interesting thing about Replicache is it can work with any 989 00:56:27,534 --> 00:56:30,507 back end, pretty much any back end. 990 00:56:30,507 --> 00:56:34,514 So Postgres, MySQL, all sorts of things. 991 00:56:34,784 --> 00:56:39,634 And the way this works is that it uses sort of a rebase model where , the 992 00:56:39,634 --> 00:56:41,474 developer, defines their mutations. 993 00:56:41,729 --> 00:56:47,119 And those mutations get run in a specific order on the server and clients when 994 00:56:47,119 --> 00:56:49,949 they receive updates from the server, any mutations they have outstanding. 995 00:56:50,324 --> 00:56:52,384 They apply the server updates and any mutations that are 996 00:56:52,384 --> 00:56:53,974 outstanding, they rebase on top. 997 00:56:54,524 --> 00:56:59,724 And this is like super flexible in that you can write any, like, if you want to 998 00:56:59,724 --> 00:57:03,034 change merge and conflict behavior, well, you just write your mutations however 999 00:57:03,034 --> 00:57:04,654 you like to get the behavior you want. 1000 00:57:05,044 --> 00:57:09,474 Whereas something like CR-SQLite or electricsql to get the merge 1001 00:57:09,474 --> 00:57:12,124 behavior you want, you have to pick specific CRDT types. 1002 00:57:12,314 --> 00:57:15,064 And it, it seems a little bit less flexible since you can't 1003 00:57:15,064 --> 00:57:16,274 write like a custom mutation. 1004 00:57:16,724 --> 00:57:19,854 But ElectricSQL does seem to have a pretty good end to end story of like, 1005 00:57:19,874 --> 00:57:22,964 you set up your Postgres DB, that schema gets replicated down into a 1006 00:57:22,964 --> 00:57:27,114 local SQLite DB, and as you make schema changes upstream, they flow downstream. 1007 00:57:27,664 --> 00:57:32,270 So in terms of like, Having a good story between two specific 1008 00:57:32,270 --> 00:57:34,220 databases it seems pretty good. 1009 00:57:34,647 --> 00:57:36,577 And then I guess the last thing is PowerSync. 1010 00:57:37,017 --> 00:57:43,249 So PowerSync, uh, works on a different model that's not rebase or CRDTs. 1011 00:57:43,419 --> 00:57:47,879 So with PowerSync, a client will accumulate changes, and if a 1012 00:57:47,879 --> 00:57:50,529 client has local changes, it won't take any changes from the server. 1013 00:57:50,929 --> 00:57:53,009 So in the rebase model, if a client has changes, it'll take 1014 00:57:53,009 --> 00:57:55,369 changes from the server and then replay its local changes on top. 1015 00:57:56,059 --> 00:57:59,329 Uh, but in the PowerSync model, the client has local changes, it won't take any 1016 00:57:59,329 --> 00:58:02,449 server changes until those local changes have been incorporated by the server, 1017 00:58:02,469 --> 00:58:03,799 and then it can take server changes. 1018 00:58:04,479 --> 00:58:07,869 Um, So yeah, to me, I've never built an app with PowerSync, but this seems like 1019 00:58:07,869 --> 00:58:12,149 it'd be a lot higher latency, uh, for the syncing, because if you have any 1020 00:58:12,199 --> 00:58:14,809 outstanding local changes, you simply cannot take a change from the server. 1021 00:58:15,142 --> 00:58:18,412 Whereas rebase model, if you have outstanding changes, it's fine. 1022 00:58:18,452 --> 00:58:21,082 You get the server changes, you just replay your local changes 1023 00:58:21,082 --> 00:58:22,082 on top of the server changes. 1024 00:58:22,285 --> 00:58:26,475 That's a really great overview, and there's many other tools that 1025 00:58:26,485 --> 00:58:31,035 more more tools than we have time to exhaustively can cover here. 1026 00:58:31,325 --> 00:58:34,925 But I think that's a really great orientation, particularly with Asking 1027 00:58:34,925 --> 00:58:38,735 yourself the first question, like where should the authority originate from? 1028 00:58:38,975 --> 00:58:40,535 Do you want to have a server? 1029 00:58:40,665 --> 00:58:43,495 That's, I guess, more traditional how web apps work today. 1030 00:58:43,495 --> 00:58:48,765 And that can simplify things a lot because in that server you can enforce if there's 1031 00:58:48,765 --> 00:58:54,295 like a merge conflict or if there's like maybe a user has set some data in 1032 00:58:54,295 --> 00:58:59,535 a way That might be not compliant with the app that you want to build or for 1033 00:58:59,555 --> 00:59:02,075 other simplification reasons, et cetera. 1034 00:59:02,105 --> 00:59:04,965 So I think picking that, where does the authority live? 1035 00:59:04,985 --> 00:59:06,055 Is there an authority? 1036 00:59:06,375 --> 00:59:09,085 And , which role should a server play? 1037 00:59:09,325 --> 00:59:11,215 I think that's a really important one. 1038 00:59:11,555 --> 00:59:15,005 And then also the two kinds of like ways to go about syncing or 1039 00:59:15,215 --> 00:59:20,965 at least two ways going more about certain spacing, everything on CRDTs. 1040 00:59:21,375 --> 00:59:27,115 Or going with a event rebase model, our Replicash is working. 1041 00:59:27,485 --> 00:59:32,615 I'm curious how someone who has used neither and might have not really 1042 00:59:32,645 --> 00:59:37,215 a lot of intuition for what are the implications of choosing one or the other. 1043 00:59:37,465 --> 00:59:42,015 Can you provide a little bit of an intuition which kind of app use cases 1044 00:59:42,015 --> 00:59:43,916 are a good fit for one or the other? 1045 00:59:44,372 --> 00:59:44,772 Yeah. 1046 00:59:44,842 --> 00:59:47,052 So with CRDTs, you don't. 1047 00:59:47,492 --> 00:59:53,342 Have code on the right path, like the CRDT algorithm is going to 1048 00:59:53,342 --> 00:59:55,882 determine, I guess I should say you don't have code on the merge path. 1049 00:59:55,982 --> 00:59:58,402 The CRDT algorithm is going to determine how things merge. 1050 00:59:58,452 --> 01:00:02,252 And you're just picking a set of CRDT algorithms where something like Replicash, 1051 01:00:02,302 --> 01:00:03,802 where you write custom mutators. 1052 01:00:04,077 --> 01:00:08,317 You can write a set of mutators that run on the client, and you can write mutators 1053 01:00:08,377 --> 01:00:11,487 with the same name, same args, but do something totally different on the server. 1054 01:00:11,947 --> 01:00:16,737 Which, one it's, since you're writing code in the mutation, it's giving you control 1055 01:00:16,747 --> 01:00:19,927 over, kind of, how things merge and sync. 1056 01:00:20,367 --> 01:00:24,317 And two, since The code doesn't have to be identical on every node because 1057 01:00:24,417 --> 01:00:26,287 the server is the authoritative answer. 1058 01:00:26,584 --> 01:00:30,564 You could do something in the server mutation like check permissions or, 1059 01:00:30,634 --> 01:00:33,907 you know, set last modified times or something from the server's 1060 01:00:33,917 --> 01:00:35,457 time rather than the client's time. 1061 01:00:35,632 --> 01:00:40,916 Yeah, but I guess the one downside to that is a CRDT, like, you can sync messages in 1062 01:00:40,936 --> 01:00:46,326 any order, like, yeah, if you have message A, B, and C, if you sync them B, C, A, 1063 01:00:46,326 --> 01:00:50,506 or C, B, A, like, you always get the same state, which means that you don't have to 1064 01:00:50,506 --> 01:00:52,236 have a single server being the authority. 1065 01:00:52,236 --> 01:00:56,466 You can have every peer be on equal footing and essentially have no server 1066 01:00:56,816 --> 01:01:00,529 whereas like this, you know, Replicache model, you do have to have, Someone 1067 01:01:00,529 --> 01:01:04,629 that is that authority that's going to provide the total ordering of messages. 1068 01:01:04,979 --> 01:01:09,579 And to me, it sounds like at least the latter, the rebasing model with 1069 01:01:09,589 --> 01:01:15,109 that authority living in a server, for example, that is a bit easier to get 1070 01:01:15,129 --> 01:01:20,939 into less of a radical shift of like thinking, changing the way, how you 1071 01:01:20,949 --> 01:01:25,729 think about data modeling, et cetera, and it's probably the easier one to 1072 01:01:25,729 --> 01:01:30,624 get into and might be a better catch all solution for many app use cases. 1073 01:01:30,934 --> 01:01:33,684 Like if you look at yeah, I think it's the easier, it's like, it's more 1074 01:01:33,684 --> 01:01:38,324 familiar and if you have an existing app that you're trying to add multiplayer 1075 01:01:38,324 --> 01:01:41,824 to Like, yeah this can work on your existing data model without much 1076 01:01:41,824 --> 01:01:44,114 changes or with little to no changes. 1077 01:01:44,144 --> 01:01:47,774 Whereas, oh, you want to switch to CRDTs, like you might have to make a new data 1078 01:01:47,774 --> 01:01:50,904 model specifically for this collaboration thing and migrate your stuff over. 1079 01:01:51,304 --> 01:01:51,654 Right. 1080 01:01:51,734 --> 01:01:55,384 And I mean, the advice I would give on top of that is that. 1081 01:01:55,404 --> 01:02:00,064 There are now like so many interesting and easy to use options out there, 1082 01:02:00,064 --> 01:02:04,354 whether it's Replicache, whether it's AutoMerge or others just maybe try 1083 01:02:04,374 --> 01:02:09,184 building the same mini version of your app in different technology stacks, 1084 01:02:09,194 --> 01:02:14,274 see what works best for you, and you'll develop that intuition just by yourself. 1085 01:02:14,634 --> 01:02:16,554 And I think that's a really fun thing to do. 1086 01:02:16,979 --> 01:02:17,509 Cool. 1087 01:02:17,779 --> 01:02:21,209 Matt, thank you so much for sharing about like all those different 1088 01:02:21,209 --> 01:02:22,849 projects you've been involved with. 1089 01:02:22,899 --> 01:02:26,779 I'm really excited to see what you'll be building at Rosicorp. 1090 01:02:27,089 --> 01:02:29,559 Is there anything else you want to share with the audience now? 1091 01:02:30,129 --> 01:02:31,029 Yeah, just stay tuned. 1092 01:02:31,029 --> 01:02:31,819 We're building the next. 1093 01:02:32,104 --> 01:02:34,484 Kind of iteration of Replicache and Reflect. 1094 01:02:34,664 --> 01:02:39,701 And , it was exciting enough to make me like drop everything I was doing 1095 01:02:39,701 --> 01:02:44,921 and join the team and yeah, go back to, I yeah, work, working for somebody, 1096 01:02:44,921 --> 01:02:49,041 I guess, rather than just, you know, following whatever research interests, 1097 01:02:49,091 --> 01:02:50,391 wherever they took me from day to day. 1098 01:02:50,891 --> 01:02:53,461 So yeah, it's, I think what they're building is super compelling and 1099 01:02:53,471 --> 01:02:57,301 You'll hear more, you know, over the summer and coming months. 1100 01:02:57,944 --> 01:02:58,964 That's incredible. 1101 01:02:59,024 --> 01:03:03,104 I'm really excited to hear more about what you all have been working on. 1102 01:03:03,614 --> 01:03:06,314 And yeah, thank you so much for coming on the show. 1103 01:03:06,814 --> 01:03:06,974 Yeah. 1104 01:03:06,974 --> 01:03:07,614 Thanks for having me. 1105 01:03:08,690 --> 01:03:11,230 Thank you for listening to the localfirst.fm podcast. 1106 01:03:11,470 --> 01:03:15,200 If you've enjoyed this episode and haven't done so already, please subscribe and 1107 01:03:15,200 --> 01:03:16,960 leave a review wherever you're listening. 1108 01:03:17,330 --> 01:03:18,760 Please also tell your friends about it. 1109 01:03:18,770 --> 01:03:22,340 If you think they could be interested in local-first, if you have feedback, 1110 01:03:22,380 --> 01:03:26,120 questions or ideas for the podcast, please get in touch via hello at 1111 01:03:26,120 --> 01:03:31,900 localfirst.fm or use the feedback form on our website, special thanks to Expo and 1112 01:03:31,900 --> 01:03:33,950 Crab Nebula for supporting this podcast. 1113 01:03:34,350 --> 01:03:35,110 See you next time.