[00:35:46.25] Exactly. In those situations, it shouldn’t be so easy to just blindly put in a retry. You perhaps wanna think about how this operation fared. That means that you need to know very intimately all the parts of the code downstream of you, which means a lot more coupling, a lot more knowledge of the components that you’re building on top of. There are certainly cases to do that; I think they are less widespread than people think, and in general you want to try and compose your programs out of small pieces.

To give an example, the SSH package, which [unintelligible 00:36:33.06] public keys, SSH agents… The interfaces that those types implement are just the usual read/write closer. We worked really hard to make the [unintelligible 00:36:46.16] interface and the session interface look pretty much like a read/write closer or a similar thing that you get from os/exec. People don’t expect os/exec phase to be retriable, so we don’t really expose those either. That’s all from the point of view of building packages that interoperate at a very high level. They don’t know a lot about each other apart from the interfaces.

There’s a separate part of error handling, which is: when an error does happen, how do you tell the developer or the operations person? It’s what I was saying earlier, you just kind of wave your hands and say “I’m just gonna give the error back to the person above me, the code above me. It will figure out what to do.”

Eventually, you’re gonna reach the top of your function or the main handler of your web server or whatever it is, and if that is gonna come to you, you’re gonna have to figure out what happened. In that case, you want to get as much information about the error that happened. You want to encode as much information as you can; preferably, you want to get a stack trace or something to point you to where the error actually occurred. Because as a developer, I’m gonna get a bug report, and if it just says, “Failed to request io.EOF” – where did that come from?

So the second part of error handling is using the fact that the error is a value, and we’ve just talked about it from the caller’s point of view – just making it opaque, just making it, “An error happened” and you don’t know anything more than that. Then we can use this fact and we can stick extra information into it. For a long time, the tradition has been to use fmt.Errorf, put in a little prefix and then print out the error. Then annotate the error all the way up, so you get this kind of string that’s growing, with a little bit on the front every time.

That’s been a pattern we’ve seen in the standard library a lot, Donovan and Kernighan talk about it in their book… There’s a lot of Go code written out there that goes “if err != nil { fmt.Errorf }”, with some description that says what happened, and then the text of the error. [00:39:06.14]
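To make that pattern concrete, here is a minimal sketch of annotating an error on the way up with fmt.Errorf. The functions loadConfig, startServer and run, the file name and the messages are hypothetical, made up for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// loadConfig is a hypothetical low-level step that always fails.
func loadConfig(path string) error {
	return errors.New("permission denied")
}

// startServer puts a little prefix in front of the error it received.
func startServer(path string) error {
	if err := loadConfig(path); err != nil {
		return fmt.Errorf("load config %s: %v", path, err)
	}
	return nil
}

// run annotates once more, so the string keeps growing at the front.
func run() error {
	if err := startServer("app.conf"); err != nil {
		return fmt.Errorf("start server: %v", err)
	}
	return nil
}

func main() {
	fmt.Println(run()) // start server: load config app.conf: permission denied
}
```

Each layer only adds text, so what arrives at the top is a single string with one prefix per caller.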

And that’s good, because at the top you get what Roger used to call ‘breadcrumbs’ of “This failed because this failed/because this failed/because this failed/because this failed”, and you can kind of grep for those little individual strings and kind of manually construct a stack trace of where you were in that code.

That’s good, but it has problems… There are cases – as few as they are, and as much as I would prefer they weren’t – in the standard library where you do actually want to check for a specific value. io.EOF is the super example of this. Any io.Reader must return io.EOF. It can’t return “ReadFile: io.EOF”. It must return exactly that value of io.EOF. We’re actually checking for equality there.
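As a small sketch of what that equality check looks like in practice: the reader type zeros and the function countBytes below are hypothetical, invented for this example. The reader signals the end of its stream by returning exactly io.EOF, and the caller compares against that value.

```go
package main

import (
	"fmt"
	"io"
)

// zeros is a hypothetical io.Reader that yields `left` zero bytes and then
// reports the end of the stream by returning exactly io.EOF.
type zeros struct{ left int }

func (z *zeros) Read(p []byte) (int, error) {
	if z.left == 0 {
		return 0, io.EOF // this exact value, not an annotated copy
	}
	n := len(p)
	if n > z.left {
		n = z.left
	}
	for i := 0; i < n; i++ {
		p[i] = 0
	}
	z.left -= n
	return n, nil
}

// countBytes reads r to the end and returns how many bytes it saw.
// Reaching io.EOF is the normal way out, so it is not reported as a failure.
func countBytes(r io.Reader) (int, error) {
	buf := make([]byte, 4)
	total := 0
	for {
		n, err := r.Read(buf)
		total += n
		if err == io.EOF { // comparing values, not strings
			return total, nil
		}
		if err != nil {
			return total, err
		}
	}
}

func main() {
	n, err := countBytes(&zeros{left: 10})
	fmt.Println(n, err) // 10 <nil>
}
```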

[00:40:06.18] In certain cases you can’t do this annotation, because taking io.EOF, printing its string out, appending it to another string and then returning an entirely different value from fmt.Errorf gives you something which doesn’t compare equal, and you can’t strip off that prefix anymore, because you’ve forever damaged it. So if we’re talking about using the error value to annotate extra information, some kind of message, a stack trace or something like that, it has to be undoable. And again, my work in this area is very small, and it’s certainly not unique. There’s a lot of work that I stand on, with this idea of “Okay, so if we have an error, let’s give it a method that lets you get the underlying error.”
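A short sketch of the damage being described, using hypothetical helpers (annotate, readHeader, isEOF) written just for this example: once io.EOF has been flattened into a string, the value that comes back is no longer equal to io.EOF.

```go
package main

import (
	"fmt"
	"io"
)

// annotate flattens err into text and returns a brand new error value.
func annotate(err error) error {
	return fmt.Errorf("read header: %v", err)
}

// readHeader is a hypothetical operation that hits the end of its input.
func readHeader() error {
	return annotate(io.EOF)
}

// isEOF is the kind of equality check a caller would want to make.
func isEOF(err error) bool {
	return err == io.EOF
}

func main() {
	err := readHeader()
	fmt.Println(err)        // read header: EOF
	fmt.Println(isEOF(err)) // false
}
```

The message still mentions EOF, but only as text; the original value is gone.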

If we’re stacking them one on top of another, let’s have a method that we can undo the stacking, so that if we do need this behavior of saying, “Is this io.EOF?” or if you use [unintelligible 00:41:09.01], that knows about a certain bunch of types from syscall from Windows; there’s a few other ones that it knows specifically to check, and says “I know how to interpret these error types. I know how to look at them and say, is this actually caused by a file not found?” So you need to be able – whatever wrapping you do – to add context, add a stack frame, add a message, and you need to be able to undo that, because there are cases where you need to extract the error, because that’s the way the code goes.
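One way the idea of undoable wrapping could look, as a hand-rolled sketch rather than any particular library’s API: the type withMessage, the method name Cause, and the helpers wrap, cause, readBody and atEOF are all assumptions made up for this example.

```go
package main

import (
	"fmt"
	"io"
)

// withMessage is a hypothetical wrapper. It adds a message but keeps the
// original error value instead of flattening it into a string.
type withMessage struct {
	msg   string
	cause error
}

func (w *withMessage) Error() string { return w.msg + ": " + w.cause.Error() }

// Cause returns the error one layer down, which is what makes the
// annotation undoable.
func (w *withMessage) Cause() error { return w.cause }

// wrap annotates err with msg. A nil error stays nil.
func wrap(err error, msg string) error {
	if err == nil {
		return nil
	}
	return &withMessage{msg: msg, cause: err}
}

// cause peels off every layer that offers a Cause method and returns
// whatever is at the bottom of the stack.
func cause(err error) error {
	type causer interface{ Cause() error }
	for err != nil {
		c, ok := err.(causer)
		if !ok {
			break
		}
		err = c.Cause()
	}
	return err
}

// readBody is a hypothetical operation that annotates io.EOF twice.
func readBody() error {
	return wrap(wrap(io.EOF, "read chunk"), "read body")
}

// atEOF undoes the stacking before doing the equality check.
func atEOF(err error) bool {
	return cause(err) == io.EOF
}

func main() {
	err := readBody()
	fmt.Println(err)        // read body: read chunk: EOF
	fmt.Println(atEOF(err)) // true
}
```

The breadcrumb string is the same as with plain string annotation, but the original value can still be recovered when a caller needs to compare it.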
