General rules to apply when troubleshooting
As developers, we often assume it’s our fault when things go wrong. For whatever reason, when something doesn’t work, we assume the problem lies in the solution we’ve created.
This is particularly true for newer developers, who haven’t had enough exposure and time to feel confident pushing back or asking the other team or vendor why the issue is happening.
I think it’s mainly due to some sort of imposter syndrome, or an assumption that what we’ve built might not be as good as what another team or business has built.
Often, when dealing with an external system or a library within your code base, there’s an assumption that it works flawlessly. I often do this myself, and it usually goes one of two ways: either I must not be calling something correctly, or I don’t have the skill set needed to use what was provided. If an error happens, my first reaction is “I must have done something incorrectly.”
This feeling kicks in particularly when doing greenfield development. I’ve worked through a number of issues where I just didn’t have the data formatted correctly, or something else was not right with a request.
However, after a few of these, I’ve learned one thing: get more eyes on the issue once you’ve eliminated anything obvious.
Even then… guess what… there are a number of times it will be an obvious issue… and you’ll have that DOH!!! moment.
That’s ok, though! You fixed the issue, collaborated, and got unblocked. These things can eat up so much time. I’ve seen developers (including myself) lose so much time over something that they weren’t able to clearly see.
Knowing your limits, and knowing when to walk away, grab a co-worker, and go over the issue together, is a skill that everyone has to learn during their journey as a developer.
Case in point, I recently dealt with a pretty simple issue and I spent probably more time on it than I should have.
The recent request was to update a service call to use HTTPS instead of HTTP. It was a legacy system that used a self-signed certificate.
The application, developed in Python, was using the requests package. I’m new to Python and not very familiar with all its nuances.
In theory, according to the documentation, all I needed to do was change http: to https: and then set a flag on the request to turn off certificate verification. Simple?
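For reference, the change described here might look something like the sketch below. This is not the original application code. The function names and the example URL are made up for illustration; the only thing taken from requests itself is the verify=False argument, which is the flag that turns off certificate verification.

```python
from urllib.parse import urlsplit, urlunsplit

import requests


def to_https(url: str) -> str:
    """Return the same URL with its scheme switched to https."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(scheme="https"))


def call_legacy_service(url: str, timeout: float = 10.0) -> requests.Response:
    """GET the service over HTTPS without verifying its certificate.

    Hypothetical helper: the real application's call may look different.
    """
    # verify=False skips certificate validation, which a self-signed
    # certificate would otherwise fail. requests emits an
    # InsecureRequestWarning for calls made this way.
    return requests.get(to_https(url), verify=False, timeout=timeout)
```

Calling call_legacy_service("http://legacy.example.internal/api/status") would send the request to the https: version of that (invented) address. Turning verification off is only reasonable for something like an internal legacy system whose certificate you already know is self-signed.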
Well, not so fast…
I tried running the code and got a weird SSL error message back. Something about an incorrect version.
I engaged the troubleshooting mechanism and searched Stack Overflow for the error code.
I found plenty of similar issues, reviewed them, and decided on what changes I should make.
I re-tested and got a different error… Progress!
Funny how excited we get when we don’t get the same error. We view that as progress!
Re-engage the troubleshooting mechanism… make some code adjustments, and test again.
Hmmm… no change… time to start digging deeper into the requests library… must be something I’m not understanding… Stack Overflow doesn’t seem to have anything related.
Needless to say… a couple of hours later, after multiple attempts and different configurations, I had gotten no further.
Time to walk away… Grab some lunch and mull things over.
While I was distracted eating lunch, the thought “this might not be my issue” entered my mind. I had assumed that the person who told me to switch over to “https” had actually set up their end correctly.
I decided to test the same call using both Postman and curl, and neither of them worked either. This was a bittersweet accomplishment, because I should have done this right away. If I couldn’t get an established program to work, what chance would my own code have?
As it turns out, even though they had enabled https on their end, they hadn’t configured it with a valid certificate. Grrr!
I contacted the person, indicating that something was incorrect on their end… and sure enough, within a few minutes I got a reply asking me to try it again. Voilà… both Postman and curl returned a valid response.
I reverted to my original change, and sure enough, it worked.
As the previous example shows, I lost valuable time trying to solve something I had no control over. I failed to follow some of my own general rules when dealing with unexpected issues. I hope these help people out.
Rule #1: Confirm that the system actually works using another tool if possible before you code.
Being from Missouri (the Show-Me State), this rule really rings true for me. Show me that it works; give me a working example that I can actually run.
If I’d followed that rule right away, I would have saved a couple of hours. It was a good learning opportunity, but I’m not a fan of being forced to learn things this way. I had about six different methods coded up trying to get this thing to work.
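Postman and curl are what I used, but if you would rather stay in Python, a rough equivalent is a tiny probe that uses only the standard library and never touches the code you are debugging. The sketch below is an assumption about how such a check could look, not something from the original fix. The probe_tls name and the host you pass to it are hypothetical. It only answers one question: does the other end complete a TLS handshake at all?

```python
import socket
import ssl


def probe_tls(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake and describe the outcome in one line."""
    context = ssl.create_default_context()
    # Accept self-signed certificates: we only want to know whether the
    # server speaks TLS at all, not whether its certificate is trusted.
    # check_hostname must be disabled before verify_mode is relaxed.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return f"handshake ok ({tls.version()})"
    except ssl.SSLError as exc:
        # The port answered, but not with a usable TLS handshake.
        return f"TLS failure: {exc}"
    except OSError as exc:
        # Refused, timed out, or the name could not be resolved.
        return f"connection failure: {exc}"
```

If a probe like this reports a TLS failure while your own code is failing too, that is a decent hint the problem may be on the other side, and a good moment to ask them to check their setup.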
Rule #2: Time boxing is your friend.
Pick a time to step away if you’re stuck on an issue. Move on to something else for a while and come back with a fresh set of eyes.
I did time box it, although not intentionally. I planned on spending about an hour on this, but ended up spending around two. I knew that I was missing something. Others were able to do it, so I should be able to as well.
Eventually I realized I needed to walk away. Luckily lunch got in my way, so I inadvertently time-boxed myself. The Pomodoro Technique would have come in handy here.
Rule #3: You are as good as everyone else; if they can do it, so can you.
If the same code isn’t working, then it’s probably not on your end.
This rule should help guide you to the conclusion that your code should work, and therefore that some outside factor must be preventing it from working. Have confidence in yourself. If you see that your code is the same as the example, or as what others have described, then it’s probably something else causing the issue.
The possible causes here are pretty wide-ranging. This rule goes hand in hand with Rule #1. This is where you can grab another trusted tool or application, see if that works, and get a sanity check. It could be an environment setting, or who knows what else, that’s preventing the code from working correctly.
Rule #4: Ask someone for some time to go over the implementation.
It could be something obvious that you might not see. This is actually a thing… You’re too close to it and need someone else’s perspective from time to time.
Although I didn’t use Rule #4 on this particular issue, I was pretty close to it. I was time boxing again, and if I hadn’t gotten through the issue I would have asked someone else to look at it with me.
There have been plenty of times when I was showing someone code and they noticed something pretty simple that I had kept overlooking. I was just too close to the project and my mind wasn’t able to see it. Walking through the code with a co-worker can be a great experience. It offers up other perspectives, and you get some great questions. Why do you do this? Can’t you do this instead? Or what about this or that?
I hope that these help someone out. Issues are not always with you. The way I see it, the odds are 50/50: it’s either on your end or it’s not. Don’t be afraid to say it’s not on your end. Make the other side prove it.