Working closely with GCP’s Deployment Manager recently, it was really hard not to notice that Google sometimes… makes bugs. Seriously. Not that many, I definitely introduced more, but it’s still enough to stumble across them now and then. I think within a month I found like 4 of the most obvious ones, but so did the other members of my team, so bugs in GCP is not something uncommon. So, let’s have a look at few?
In last six or so weeks Microsoft managed to release whole bunch of .NET Core 2.1 SDKs (Preview 2, Release Candidate 1, Early Access, RTM) and we tried all of them. By the end of these weeks my cluster of CI servers looked like a zoo. As everything was done in a hurry, there were servers with RC1 pretending to be Early Access ones. EA servers pretended to be RTM compatible, and the only RTM host we had was pretending to support everything. Don’t look at me funny. It happens.
The problem happened when I tried to cleanup the mess: removed P2, RC1 and EA SDK tags from release branches, deleted prerelease servers, forced remaining servers to tell exactly who they are and finally rolled out new VMs with latest and greatest .NET Core SDK 2.1 installed. Naturally, very first build failed.
Million years ago, way before the ice age, I was preparing small C++ project for “Unix Programming” university course and at some point had to debug it via command line. That was mind blowing. And surprisingly productive. Apparently, when nothing stands in the way, especially UI, debugging can become incredibly focused.
Since .NET Framework got his cross platform twin brother .NET Core, I was looking forward to repeat the trick and debug .NET Core app on Ubuntu via command line. Few days ago it finally happened and even though it wasn’t a smooth ride, that was quite an interesting experience. So, let’s have look.
llnode plugin which does exactly that. Today I’m not going to go deeply into how it works – it’s not in the focus of my daily job now. But it’s still interesting to get a feeling of it. So, let’s dive right in. Continue reading “Examining NodeJS core dump with llnode and lldb”
I’m still busy learning how to troubleshoot .NET Core apps on Linux. Now I more or less understand where to start if I need to analyze .NET Core memory usage, but that’s not the most frequent problem we usually have. Most of the time the main problem is high CPU usage, which obviously leads to the question “What this thing’s doing now??”.
On Windows I’d usually start with analyzing application logs, trace files or performance reports. If that wasn’t enough I’d download Perfview tool and profiled running executable itself. How many threads are there? What functions they execute the most? What their call stacks are? And so forth. The only downside of that tool is that I never read its documentation, so whole troubleshooting experience sometimes resembled a meditation on the monitor, worshiping the holy Google and wild guessing.
While logs and traces are obviously there on Linux as well, I was wondering, can I profile the app in the similar way I would do on Windows? The answer is “Oh, yes”. Monitor, worshiping, guessing – it’s all there.
There’re multiple tools to use out there, but the basic toolkit for profiling .NET Core app on Linux seems to be
perf utility along with
perfcollect. Let’s have a look at all of them. Continue reading “Profiling .NET Core app on Linux”
Most of the last week I’ve been experimenting with our .NET Windows project running on Linux in Kubernetes. It’s not as crazy as it sounds. We already migrated from .NET Framework to .NET Core, I fixed whatever was incompatible with Linux, tweaked here and there so it can run in k8s and it really does now. In theory.
In practice, there’re still occasional StackOverflow exceptions (zero segfaults, however) and most of troubleshooting experience I had on Windows is useless here on Linux. For instance, very quickly we noticed that memory consumption of our executable is higher than we’d expect. Physical memory varied between 300 MiB and 2 GiB and virtual memory was tens and tens of gigabytes. I know in production we could use much higher than that, but here, in container on Linux, is that OK? How do I even analyze that?
On Windows I’d took a process dump, feed it to Visual Studio or WinDBG and tried to google what’s to do next. Apparently, googling works for Linux as well, so after a few hours I managed learn several things about debugging on Linux and I’d like to share some of them today. Continue reading “Analyzing .NET Core memory on Linux with LLDB”