In the last six or so weeks Microsoft managed to release a whole bunch of .NET Core 2.1 SDKs (Preview 2, Release Candidate 1, Early Access, RTM) and we tried all of them. By the end of these weeks my cluster of CI servers looked like a zoo. As everything was done in a hurry, there were servers with RC1 pretending to be Early Access ones. EA servers pretended to be RTM compatible, and the only RTM host we had was pretending to support everything. Don’t look at me funny. It happens.
The problem surfaced when I tried to clean up the mess: removed P2, RC1 and EA SDK tags from release branches, deleted prerelease servers, forced the remaining servers to tell exactly who they are and finally rolled out new VMs with the latest and greatest .NET Core SDK 2.1 installed. Naturally, the very first build failed.
The issue
The compilation error said `Detected package downgrade: Microsoft.NETCore.App from 2.1.1 to 2.1.0`. In fact, it wasn’t even a compilation error – the build failed during the package restore phase.
There was also a chance that it was a one-off issue caused by some mysterious race condition, so I could retry the build job and in case of success pretend the error never happened (can’t deal with it now). But nope, I retried it twice and the build failed twice as well. It looks like I have to use that brainy thing again.
Troubleshooting
Poking around
It’s quite interesting that the project builds locally just fine. It’s the same Ubuntu 16.04, the same code, the same SDK... or is it? A quick `dotnet --version` on both hosts shows that the local SDK’s version is `2.1.300`, whereas the one on the build server is `2.1.301`. So Microsoft released a patch a few days ago? Interesting. After the newer version finds its way to my workstation, the project no longer builds on it either. Well, that’s a good sign.
I checked the project files, but they looked pretty much as usual and nothing suggested the cause of the conflict. It took me some time to notice that the `dotnet build` command actually works. It’s only `dotnet publish -r ubuntu-x64` that doesn’t. What’s even more interesting, if I skipped the runtime parameter (`-r`), even `publish` worked. Not sure how it helps me now, but who knows.
Getting the logs
Having zero ideas about where to look for more hints, I had no other option but to enable diagnostic output in the build/publish commands and try to find out at what point they start to behave differently.
If you have never used the `-v diag` parameter with `MSBuild` or `dotnet build`, you should probably know that it produces a lot of output. No, like this – A LOT. For our solution of ~90 projects it emits tens and tens of megabytes of unstructured text. But if there’s something to find, it should be there.
So here it is: `dotnet build -c Debug -v diag > works.txt` for the working build, `dotnet publish -c Debug -v diag -r ubuntu-x64 > fails.txt` for the broken one, and `vim` + `:diffthis` to bring them side by side.
That’s… a lot of colours. And lines. Because the `publish` build failed right during the NuGet package restore, its log is about 20 times smaller. That’s also a good thing – I can remove everything after `Done executing task 'RestoreTask'`, which separates the restore phase from the rest of the build, and significantly reduce the amount of text to deal with.
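The trimming step can be scripted instead of done by hand in the editor. This is a minimal sketch, assuming the marker line appears in the log as described above; `trim_after_restore` is a hypothetical helper name, and the exact quote character around `RestoreTask` may differ in real logs, so the pattern matches any single character there.

```shell
# Keep a diag log only up to (and including) the line that closes the restore task.
# Assumption: the marker looks like: Done executing task 'RestoreTask'
trim_after_restore() {
    # sed prints every line and quits right after the first line matching the marker.
    # The dots around RestoreTask match whichever quote character the log uses.
    sed '/Done executing task .RestoreTask./q' "$1"
}

# Possible usage (file names as in the commands above):
#   trim_after_restore works.txt > works.restore.txt
#   trim_after_restore fails.txt > fails.restore.txt
```

The two trimmed files can then be opened in the diff tool of choice, which should make the comparison far more manageable than diffing the full logs.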
Chasing the differences
The error message was saying something about a package downgrade and version `2.1.1` in it. Let’s look for it then.
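One way to narrow that search, sketched in shell: list the lines that mention the version in the failing log but never show up in the working one. The helper name `only_in_failing` is hypothetical, and the approach assumes that interesting lines are textually identical in both logs apart from the value that changed, which will not hold for lines carrying timestamps or durations.

```shell
# Print lines that mention a version string in the failing log
# but do not appear anywhere in the working log.
only_in_failing() {
    version="$1"; good_log="$2"; bad_log="$3"
    good_tmp=$(mktemp) && bad_tmp=$(mktemp) || return 1

    # -F treats the version as a literal string, so the dots are not wildcards.
    # Leading whitespace is stripped so indentation differences do not matter.
    grep -F -- "$version" "$good_log" | sed 's/^[[:space:]]*//' | sort -u > "$good_tmp"
    grep -F -- "$version" "$bad_log"  | sed 's/^[[:space:]]*//' | sort -u > "$bad_tmp"

    # comm -13 keeps only the lines unique to the second (failing) file.
    comm -13 "$good_tmp" "$bad_tmp"
    rm -f "$good_tmp" "$bad_tmp"
}

# Possible usage:
#   only_in_failing 2.1.1 works.txt fails.txt
```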
About a dozen matches later I do find a place where the `RuntimeFrameworkVersion` property becomes different: 2.1.1 in the faulty build vs 2.1.0 in the successful one. As a side note, 2.1.0 is the version of the runtime shipped by default with .NET Core SDK 2.1.300. The latest SDK at the moment, 2.1.301, comes with a patched runtime, 2.1.1. It’s very easy to check:
```
dotnet --list-runtimes
# Microsoft.AspNetCore.All 2.1.1 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
# Microsoft.AspNetCore.App 2.1.1 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
# Microsoft.NETCore.App 2.1.1 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
```
The error message stated that `Microsoft.NETCore.App` `2.1.1` – the runtime – was conflicting with its `2.1.0` counterpart, so it really looks like something in our solution caused one part of it to target the latest runtime, and the other one to stick with the base. OK, but where does `RuntimeFrameworkVersion` get its value? Nowhere. In these particular log files it looks like its value comes from outer space and never gets explicitly assigned.
OK, another try. All these build properties come from the `.props` and `.targets` files which are part of the .NET SDK. What if I search for the property assignment among them?
```
cd /usr/share/dotnet/sdk
grep -ir RuntimeFrameworkVersion
...
#<RuntimeFrameworkVersion Condition="'$(TargetLatestRuntimePatch)' == 'true' ">$(LatestNetCorePatchVersion)</RuntimeFrameworkVersion>
#<RuntimeFrameworkVersion Condition="'$(TargetLatestRuntimePatch)' != 'true' ">$(DefaultNetCorePatchVersion)</RuntimeFrameworkVersion>
```
That’s interesting. If the `TargetLatestRuntimePatch` property is set to true, then `RuntimeFrameworkVersion` will use `LatestNetCorePatchVersion`, which I believe is happening in our case. Here, it’s even in build logs:
OK, I think I see the picture here. The last question is: when does `TargetLatestRuntimePatch` become true?
Again, there’s nothing in the logs, but in the SDK itself I was able to find this:
```
grep -ir TargetLatestRuntimePatch
...
#..<TargetLatestRuntimePatch Condition="'$(SelfContained)' == 'true' ">true</TargetLatestRuntimePatch>
```
Thinking part
It makes total sense now. When we compile using `dotnet build`, `SelfContained` is false and so is `TargetLatestRuntimePatch`, leaving `RuntimeFrameworkVersion` with its default version of `2.1.0`. However, it all changes for `dotnet publish -r ubuntu-x64`: with a runtime identifier specified, `SelfContained` is true, `TargetLatestRuntimePatch` is also true and therefore for .NET Core SDK 2.1.301 `RuntimeFrameworkVersion` becomes `2.1.1`. That also explains why publishing without `-r` worked. For some reason at least one of our test projects still requires `2.1.0`, thus causing the conflict. We didn’t have the issue with SDK 2.1.300, as that was the first one to come out, so the base and latest runtime versions were the same.
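To make the chain of conditions easier to follow, here is a toy model of it in shell. It is only an illustration of the two conditional assignments found in the SDK files, not the SDK’s actual evaluation logic; the function name `resolve_runtime_framework_version` is made up for this sketch.

```shell
# Toy model of how RuntimeFrameworkVersion ends up with its value.
# Arguments: SelfContained ("true"/"false"), default patch version, latest patch version.
resolve_runtime_framework_version() {
    self_contained="$1"
    default_patch="$2"
    latest_patch="$3"

    # TargetLatestRuntimePatch is switched on only when SelfContained is true.
    if [ "$self_contained" = "true" ]; then
        target_latest_runtime_patch="true"
    else
        target_latest_runtime_patch="false"
    fi

    # RuntimeFrameworkVersion takes the latest patch only when the flag is on.
    if [ "$target_latest_runtime_patch" = "true" ]; then
        echo "$latest_patch"
    else
        echo "$default_patch"
    fi
}

# SDK 2.1.301: default runtime 2.1.0, latest patch 2.1.1
resolve_runtime_framework_version false 2.1.0 2.1.1   # prints 2.1.0
resolve_runtime_framework_version true  2.1.0 2.1.1   # prints 2.1.1

# SDK 2.1.300: both versions are 2.1.0, so the flag makes no difference
resolve_runtime_framework_version true  2.1.0 2.1.0   # prints 2.1.0
```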
So what’s next? How do I fix that? Well, there are actually three choices: the true one and two temporary remedies.
- Find the package causing the runtime version downgrade and fix it.
- When `SelfContained` is true, explicitly set `TargetLatestRuntimePatch` to true in the problematic projects, thus eliminating the conflict.
- Explicitly set `TargetLatestRuntimePatch` to false for the ‘main’ project, so we always use the base version.
Eventually we decided to come up with a fourth solution: ignore the `SelfContained` flag altogether and always require the latest runtime patch. After all, why would we want to stick with an old one?
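How exactly the property gets set is a matter of taste. One possible way, shown here as a sketch rather than a record of what we actually committed, is a solution-wide `Directory.Build.props` file that MSBuild picks up for every project underneath it. The helper name `write_latest_patch_props` is hypothetical, and the sketch assumes an explicitly set `TargetLatestRuntimePatch` takes precedence over the SDK’s `SelfContained`-based default.

```shell
# Write a Directory.Build.props that always targets the latest runtime patch.
# Argument: the directory that contains the solution (defaults to the current one).
write_latest_patch_props() {
    props_file="${1:-.}/Directory.Build.props"

    # Do not clobber settings somebody else already put there.
    if [ -e "$props_file" ]; then
        echo "refusing to overwrite $props_file" >&2
        return 1
    fi

    cat > "$props_file" <<'EOF'
<Project>
  <PropertyGroup>
    <!-- Target the latest runtime patch whether or not the build is self-contained. -->
    <TargetLatestRuntimePatch>true</TargetLatestRuntimePatch>
  </PropertyGroup>
</Project>
EOF
}

# Possible usage, from the solution root:
#   write_latest_patch_props .
```

Setting the same property in each individual project file should have the same effect; the single shared file just keeps ~90 projects from drifting apart.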
Conclusion
Even though I’m not a fan of digging through dotnet and MSBuild internals, there’s some guilty pleasure in exercises like this. A long time ago I was dealing a lot with XSLT, which, while being XML, was also a perfectly valid functional language with functions, recursion, pattern matching, etc. And it’s really something to see a program written in functional XML. Bizarre, but something. CSPROJ files along with the .props and .targets files from MSBuild are also XML based and also carry logic with them – assignments, conditionals, code imports and some form of functions with parameters. While a little bit archaic nowadays, it’s still kind of cute. Ah, good old medieval days…