No, not really… It’s not even close, actually. They’re like distant cousins, sure. But it’s not a database. People give me shit all the about writing a database after spending my entire life telling people not to write databases; we didn’t, we did not write a database. There was hardly any query engine and there was no transactions. We wrote an optimized file format. I’m trying that out. [laughter] Anyway, and it uses a Cassandra model, so we can partition your reads over a whole bunch of nodes, so it’s really fast. It’s very important to us that this is interactive, that this is not a thing that you set a new constructive query and then walk away from your desk. It’s interactive because debugging is interactive.

When get sufficiently , they outstrip your ability to predict what is going to break. And I think a lot of us are hitting that threshold faster and sooner than ever before, because there are so many trends that are pushing this level of complexity – everything from schedulers and containers to [unintelligible 00:17:43.09] to your distributed systems, microservices… All these things are awesome, but they’re a lot harder to debug than the LAMP stack was. A lot more of the intelligence lives in the edges between the nodes, not just deep diving in the nodes themselves. In fact, you may not even have any servers, and now what do you do? And it’s really important just to stitch together everything from the edge, with your mobile or IoT device… Storage is out there increasingly, too. How do you know where this bit is supposed to be? All the way through the code that you write yourself; it has to be native SDKs, like an APM [unintelligible 00:18:23.12]

It’s really important to us to be able to debug your database. I don’t know how people DBA without this kind of thing, and the answer is that they don’t. They just know how to look for slow queries, but that’s very often not actually the problem. For instance, people are like “Oh, my database is getting slow, and I looked for slow reads, because reads that used to take one second are now taking you 30 seconds.” Well, okay, that may be the symptom that you’re seeing, but often the problem is something like your write volume is getting higher, and they’re all contending for this one row or this one lock, and because those writes can yield, if they’re just reaching period saturation you can tune read queries all you want, it’s not actually going to make a dent.

The problem – with our tool what you can do is add up all of the time that each lock is being held by each lock query, and there it is. It’s just so easy to deviate once you can see these things. [unintelligible 00:19:24.27] way more interesting Go stuff to say, but we’re using Go for everything – the UI, all the way down to the guts, and it’s not been one of our top ten problems, even close.

Source link


Please enter your comment!
Please enter your name here