I've spent most of the last week experimenting with our .NET Windows project running on Linux in Kubernetes. It's not as crazy as it sounds. We had already migrated from .NET Framework to .NET Core, I fixed whatever was incompatible with Linux, tweaked things here and there so it could run in k8s, and it really does now. In theory.
In practice, there are still occasional StackOverflowExceptions (zero segfaults, however), and most of the troubleshooting experience I had on Windows is useless here on Linux. For instance, we very quickly noticed that the memory consumption of our executable was higher than we'd expect. Physical memory varied between 300 MiB and 2 GiB, and virtual memory was tens and tens of gigabytes. I know that in production we could use even more than that, but here, in a container on Linux, is that OK? How do I even analyze that?
On Windows I'd take a process dump, feed it to Visual Studio or WinDbg, and try to google what to do next. Apparently, googling works for Linux as well, so after a few hours I managed to learn several things about debugging on Linux, and I'd like to share some of them today.
The playground (debugging starts later)
Obviously, I can't use our product as an example, but in reality any .NET Core "Hello world" project would do. I'll create an Ubuntu 16.04 VM with the help of Vagrant and VirtualBox, put the project in it, and we can experiment in there.
Ubuntu VM
This is the Vagrantfile that will prepare the VM:
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/xenial64"

  config.vm.provider "virtualbox" do |vb|
    vb.memory = "3072"
  end

  config.vm.provision "shell", inline: <<-SHELL
    # Install .NET Core SDK
    curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg
    mv microsoft.gpg /etc/apt/trusted.gpg.d/microsoft.gpg
    sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/microsoft-ubuntu-xenial-prod xenial main" > /etc/apt/sources.list.d/dotnetdev.list'
    apt-get update && apt-get install -y dotnet-sdk-2.0.2

    # Dev tools
    apt-get install -y vim gdb lldb-3.6
  SHELL
end
There's nothing fancy. It's a 3 GiB RAM VM with the .NET Core 2.0.2 SDK, vim, gdb and lldb-3.6 installed (more on them later).
Now, vagrant up will bring that VM to life, and we can get into it with the vagrant ssh command. The next stop is the demo project.
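Put together, the whole workflow is just a few commands; a minimal sketch (the dotnet --version call at the end is only a sanity check that provisioning worked):

vagrant up           # create and provision the VM (takes a while on the first run)
vagrant ssh          # open a shell inside the VM
dotnet --version     # inside the VM: should print the installed SDK version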
Demo project
As I said, any hello-world .NET Core app would do, but ideally it should have something in memory to analyze. It also shouldn't exit immediately – we need some time to take a process dump.
dotnet new console -o memApp creates an almost sufficient project template, which I improved very slightly by adding a static array full of dummy strings:
using System;
using System.Linq;
using System.Text;

namespace memApp
{
    class Program
    {
        static Random random = new Random((int)DateTime.Now.Ticks);

        static char RandomChar() => Convert.ToChar(random.Next(65, 90));

        static string RandomString(int length) =>
            String.Concat(Enumerable.Range(0, length).Select(_ => RandomChar()));

        static void Main(string[] args)
        {
            var dummyStringsCollection = Enumerable.Range(0, 10000)
                .Select(_ => "Random string: " + RandomString(10000)).ToArray();

            Console.WriteLine("Hello World!");
            Console.ReadLine();
        }
    }
}
Now, let's build the app, launch it, and begin the experiments:
dotnet build
#...
#Build succeeded.
#    0 Warning(s)
#    0 Error(s)
#
#Time Elapsed 00:00:02.06

dotnet bin/Debug/netcoreapp2.0/memApp.dll
# Hello World!
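The app now sits waiting on Console.ReadLine, so the commands that follow go into a second shell in the same VM; a small sketch (pidof is just one way to grab the PID):

vagrant ssh       # from the host: open a second session into the VM
pidof dotnet      # inside the VM: the PID of the running app (4058 in my case)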
Creating a core dump
First, let's check what the initial memory stats look like:
ps u
#USER     PID %CPU %MEM     VSZ    RSS TTY   STAT START TIME COMMAND
#ubuntu  4058  7.9  7.9 2752512 243908 pts/0 SLl+ 04:10 0:06 dotnet bin/Debug/netcoreapp2.0/memApp.dll
#...
That's actually quite a lot: ~2.6 GiB of virtual memory and ~238 MiB of physical. Even though virtual memory doesn't mean we're ever going to use all of it, a process dump ('core dump' in Linux terminology) will take at least the same amount of space.
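If only those two columns are interesting, ps can report them directly for a single PID; a quick sketch (the output is approximate, numbers taken from the run above):

ps -o pid,vsz,rss,comm -p 4058
#  PID     VSZ    RSS COMMAND
# 4058 2752512 243908 dotnet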
The simplest way to create a core dump is to use the gcore utility. It comes along with the gdb debugger, and that's the only reason I had to install gdb.
Using gcore, however, in most cases requires elevated permissions. On local Ubuntu I was able to get away with sudo gcore, but inside of a Kubernetes pod even that wasn't enough, and I had to go to the underlying node and add the following option to sysctl.conf:
echo "kernel.yama.ptrace_scope=0" | sudo tee -a /etc/sysctl.conf # Append config line sudo sysctl -p # Apply changes |
But here in the Ubuntu VM sudo gcore works just fine, and I can create a core dump just by providing the target process id (PID):
sudo gcore 4058
# ...
# Saved corefile core.4058
As I mentioned before, the dump file size is the same as the amount of virtual memory:
ls -lh
#total 2.6G
#-rw-r--r-- 1 root root 2.6G Dec 12 04:25 core.4058
This actually was a problem for us in Kubernetes, with the .NET garbage collector switched to server mode and the server itself having 208 GiB of RAM. With such specs and GC settings, the virtual memory and the core dump file were just above 49 GiB. Disabling the gcServer option in .NET, however, reduced the default address space and therefore the core file size down to a more manageable 5 GiB.
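There's more than one way to flip that switch; for illustration, a sketch using the environment variable that CoreCLR reads at startup (not necessarily how we did it):

export COMPlus_gcServer=0    # 0 = workstation GC, 1 = server GC
dotnet bin/Debug/netcoreapp2.0/memApp.dll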
But I digress. We have a dump file to analyze.
Debugger and .NET support
We can use either the gdb or the lldb debugger to work with core files, but only lldb has .NET debugging support via an SOS plugin called libsosplugin.so. Moreover, the plugin itself is built against a specific version of lldb, so if you don't want to recompile CoreCLR and libsosplugin.so locally (not that hard), the safest lldb version to use at the moment is 3.6.
As a side note, I was wondering what SOS exactly means and found this wonderful SO answer. Apparently, SOS has nothing to do with ABBA or the save-our-souls Morse code distress signal. It means "Son of Strike". Who is Strike, you might ask? Strike was the name of the debugger for .NET 1.0, which had the codename Lightning. Strike of Lightning, you know. And SOS is its proud descendant. Whenever I doubt whether I should still love my profession, I find a story like this and give it another year. A few years ago, the story behind the userAgent browser property did the same trick.
OK, we have a debugger, an executable and a core dump. Where do we get the SOS plugin? Fortunately, it comes along with the .NET Core SDK, which I already installed:
find /usr -name libsosplugin.so
#/usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.0/libsosplugin.so
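There is one libsosplugin.so per installed runtime, so when several runtimes live side by side it's worth double-checking which one the dumped process actually used; listing the shared framework folder is a quick sanity check (a sketch):

ls /usr/share/dotnet/shared/Microsoft.NETCore.App/
# 2.0.0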
Finally, we can start lldb, point it to the dotnet executable that started our application and to its core dump, and then load the plugin:
$ lldb-3.6 `which dotnet` -c core.4058
# (lldb) target create "/usr/bin/dotnet" --core "core.4058"
# Core file '/home/ubuntu/core.4058' (x86_64) was loaded.
# (lldb) plugin load /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.0/libsosplugin.so
# (lldb)
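To save some typing on future sessions, the plugin load line can go into lldb's init file, which it executes on startup; a sketch assuming the same plugin path as above:

echo 'plugin load /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.0/libsosplugin.so' >> ~/.lldbinit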
This is where the real black magic begins.
Analyzing managed memory
The SOS plugin added a set of commands which are aware of the managed nature of .NET, so we can see not just which bits and bytes are stored at a given location, but also their .NET type (e.g. System.String).
The soshelp command prints out all the .NET commands the plugin added to lldb, and soshelp commandname will explain how to use a particular one. Well, except when it won't.

For instance, the DumpHeap command, which is basically the entry point for memory analysis, has no help at all. Fortunately for me, I was able to find the missing info next to the plugin's source code.
(lldb) soshelp
#...
#Object Inspection                  Examining code and stacks
#-----------------------------      -----------------------------
#DumpObj (dumpobj)                  Threads (clrthreads)
#DumpArray                          ThreadState
#..
(lldb) soshelp DumpHeap
-------------------------------------------------------------------------------
(lldb)
Memory summary
We have a working debugger, we have a DumpHeap command – let's take a look at the managed memory statistics:
(lldb) sos DumpHeap -stat
#Statistics:
#              MT    Count    TotalSize Class Name
#00007f6d32992aa8        1           24 UNKNOWN
#00007f6d329911d8        1           24 UNKNOWN
#....
#00007f6d323defd8        4        17528 System.Object[]
#00007f6d323e08a8       25        40644 System.Int32[]
#00007f6d323e0168       29        82664 System.String[]
#00007f6d323e3440      335       952398 System.Char[]
#000000000223b860    10092      6083604 Free
#00007f6d3242b460   150846    204845172 System.String
#Total 161886 objects
(lldb)
Not surprisingly, System.String objects use most of the memory. By the way, if you sum up the total sizes of all managed objects (like I did), the result comes very close to the physical memory reported by ps u: 202 MiB of managed objects vs 238 MiB of physical memory. The delta, I suppose, goes to the code itself and the execution environment.
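For the record, that summation is easy to script; a rough sketch assuming the DumpHeap -stat output was copied into a file named heap-stat.txt (the grep keeps only the per-type rows, whose third column is TotalSize in bytes):

grep -E '^[0-9a-f]{16}' heap-stat.txt | awk '{ sum += $3 } END { printf "%.0f MiB\n", sum / 1024 / 1024 }'
# ~202 MiB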
Memory details
But we can go further. We know that System.String uses most of the memory. Can we take a closer look at those strings? Sure thing:
(lldb) sos DumpHeap -type System.String
#         Address               MT     Size
#00007f6d0bfff3f0 00007f6d3242b460       26
#00007f6d0bfff4c0 00007f6d3242b460       42
#...
#00007f6d0c099ab0 00007f6d3242b460    20056
#00007f6d0c09e920 00007f6d3242b460    20056
#...
#00007f6d323e0168       29        82664 System.String[]
#00007f6d3242b460   150846    204845172 System.String
#Total 150895 objects
-type works as a mask, so the output also contains System.String[] and a few Dictionaries. Also, strings vary in size, whereas I'm actually interested in the large ones, at least 1000 bytes:
sos DumpHeap -type System.String -min 1000
# ...
# 00007f6d0e8810f0 00007f6d3242b460    20056
# 00007f6d0e885f60 00007f6d3242b460    20056
# 00007f6d0e88add0 00007f6d3242b460    20056
# ...
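The filters combine with -stat as well, which gives a quick summary of just the large strings; a sketch (the exact counts will obviously differ from run to run):

(lldb) sos DumpHeap -type System.String -min 1000 -stat
# Statistics:
#               MT    Count    TotalSize Class Name
# 00007f6d3242b460    10000    200560000 System.String
# Total 10000 objects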
Having the list of suspicious objects we can drill down even more: examine the objects one by one.
DumpObj
DumpObj can look into the managed object details at a given memory address. We have a whole first column of addresses, and I just picked one of them:
(lldb) sos DumpObj 00007f6d0e8810f0
#Name:        System.String
#MethodTable: 00007f6d3242b460
#EEClass:     00007f6d31c49eb8
#Size:        20056(0x4e58) bytes
#File:        /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.0/System.Private.CoreLib.dll
#String:
#Fields:
#              MT    Field   Offset                 Type VT     Attr            Value Name
#00007f6d3244b020  40001c9        8         System.Int32  1 instance            10015 m_stringLength
#00007f6d3242f420  40001ca        c          System.Char  1 instance               52 m_firstChar
#00007f6d3242b460  40001cb       38        System.String  0   shared           static Empty
#                                 >> Domain:Value  00000000022ab050:NotInit  <<
It's actually pretty cool. We can immediately see the type name (System.String) and what fields it is made of. I also noticed that for small strings we'd see the value right away in the String: line of the output, but not for the large ones.
I was puzzled at first about how to get the value for those. There's the m_firstChar field, but is it like a linked list or what? Where's the pointer to the next item? Only after checking out the source code for System.String did I realize that the address of m_firstChar can be used as a pointer itself, and the whole string is stored there as one contiguous block of memory. This means I can use lldb's native memory read command to get the whole string back!

For that I just need to take the object's address (00007f6d0e8810f0), add m_firstChar's field offset (c, third column in the fields table) and then do something like this:
(lldb) memory read 00007f6d0e8810f0+0xc
#0x7f6d0e8810fc: 52 00 61 00 6e 00 64 00 6f 00 6d 00 20 00 73 00  R.a.n.d.o.m. .s.
#0x7f6d0e88110c: 74 00 72 00 69 00 6e 00 67 00 3a 00 20 00 43 00  t.r.i.n.g.:. .C.
Does it look familiar? "R.a.n.d.o.m. .s.t.r.i.n.g.". C# char defaults to UTF-16 encoding and therefore takes two bytes, even though one of them is always zero for ASCII characters.
We can also experiment with memory read formatting, but even with the default settings we can get an idea of what's inside.
(lldb) memory read 00007f6d0e8810f0+0xc -f s -c 13
#0x7f6d0e8810fc: "R"
#0x7f6d0e8810fe: "a"
#0x7f6d0e881100: "n"
#0x7f6d0e881102: "d"
#0x7f6d0e881104: "o"
#0x7f6d0e881106: "m"
#0x7f6d0e881108: " "
#0x7f6d0e88110a: "s"
#0x7f6d0e88110c: "t"
#0x7f6d0e88110e: "r"
#0x7f6d0e881110: "i"
#0x7f6d0e881112: "n"
#0x7f6d0e881114: "g"
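memory read has more formatting knobs; for instance, reading the same bytes as 16-bit values makes the UTF-16 code units obvious. A sketch (-s sets the word size, -f the format, -c the count):

(lldb) memory read 00007f6d0e8810f0+0xc -s 2 -f x -c 13
#0x7f6d0e8810fc: 0x0052 0x0061 0x006e 0x0064 0x006f 0x006d 0x0020 0x0073
#0x7f6d0e88110c: 0x0074 0x0072 0x0069 0x006e 0x0067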
Conclusion
I'm just scratching the surface, but I love what I find. I've been a .NET programmer for quite a while, but it's the first time in years that I started to think about what's happening that deep under the hood. What's inside of a System.String? What fields does it have? How are those fields aligned in memory? The first field has an offset of 8. What's in those eight bytes? A type id? .NET strings are interned, so does that mean that m_firstChar of identical strings will point to the same block of memory? Can I check that?
I also wonder what debugging .NET code with lldb looks like. Many years ago I used to debug a C++ pet project with gdb, so I kind of know the feeling. But .NET applications compile Just-In-Time, so it's interesting to see how the SOS plugin deals with that.
Comments
Any way to get a dump file once a .NET Core app has crashed? I've tried some solutions from the web, but none of them worked.
If you haven't changed the default core dump settings, then most likely you won't find one. Basically, you need to configure two things in order to enable automatic crash dumps:
1) set core_pattern – the file name template for future core dumps;
2) set the core dump file size limit, which at least on Ubuntu is "0" by default, and therefore crash dumps are disabled (a sketch follows after the link below).
Here’s a good place to start: https://sigquit.wordpress.com/2009/03/13/the-core-pattern/
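For illustration, a minimal sketch of those two settings (the core_pattern template is just an example; the limits.conf line is what keeps the size limit across sessions):

echo '/tmp/core.%e.%p' | sudo tee /proc/sys/kernel/core_pattern         # 1) name template for future dumps
ulimit -c unlimited                                                      # 2) allow dumps of any size (current shell only)
echo '* soft core unlimited' | sudo tee -a /etc/security/limits.conf    # persist the limit for all users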
Could you help with my post: https://stackoverflow.com/questions/51491972/get-crash-dump-for-net-core-application-running-on-raspberry3-b-debian9-arm3?noredirect=1#comment89969648_51491972
I've done everything I could, but still could not see crash dumps. Could you help?
Hey,
Sorry I couldn't answer faster.
I won't pretend that I know all of that, but if even after changing core_pattern and the core dump size limit it still doesn't work for you, I'd bet on the core size limit still being zero.
Here's what you can try. Create a small .NET app with a single instruction in its Main() method:
throw new Exception("OK, now you _must_ produce a dump");
Build it and run it with the following command:
echo "/tmp/core-%e-%s-%u-%g-%p-%t" | sudo tee /proc/sys/kernel/core_pattern && ulimit -c unlimited && dotnet run
It configures automatic core dumps right before running the app, so it's hard to go wrong here. I just ran it on my clean, never previously configured Ubuntu, and it produced a core dump in the /tmp folder – /tmp/core-dotnet-6-1000-1000-1504-1532836438. I looked inside, and that indeed was the hardcoded unhandled exception:
(lldb) pe
Exception type: System.Exception
Message: OK, now you _must_ produce a core dump
InnerException:
StackTrace (generated):
SP IP Function
00007FFDB3264850 00007F7520D016FE ThrowsException.dll!Unknown+0x6e
StackTraceString:
HResult: 80131500
When you're able to repeat that, you could try doing the same for your main app.
Alternatively, you might try running another version of dotnet. They do make mistakes, and e.g. older/newer versions might not have the issue. Also make sure that you are using the latest SDK and runtime. Today, the latest runtime is 2.1.2, and the SDK is 2.1.302. Depending on how you're producing the build, the latest SDK might not trigger using the latest available runtime, so you might need to either update the .csproj file or do dotnet publish instead of build. The latter triggers using the latest available runtime.
Good luck.
Thank you, I've got the crash dump already – the reason for the previous failure was that the ulimit -c setting was not persisted for all users.
My problem now is how to analyze the dump. Basically, I've loaded my crash dump (running on .NET Core 2.1.3 linux-arm) with lldb-4.0 and loaded the plugin from /opt/dotnet/shared/Microsoft.NETCore.App/2.1.3-servicing-26807-02/libsosplugin.so, but in the end lldb didn't accept any of my commands (like soshelp, clrstack etc.) and just crashed itself.
For more, please check: https://stackoverflow.com/questions/51721104/analysis-net-core-console-app-crash-dump-from-linux-arm32-debian-raspberrypi
please do help me.
First SO answer ever 🙂
Basically, I don't know what's wrong with Unicode, but you'll need either lldb-3.8 or 3.9.