Getting to the .NET Core of Things

Getting to the .NET Core of things

This post aims to help developers from other tech stacks get up to speed with .NET Core. It should be enough to follow further discussions of this tech stack, as well as help you decide whether it’s something you might want to investigate further.

Introduction

Microsoft’s tech stack (for various types of applications) has been .NET for over 15 years now. For the most part of those 15 years, Microsoft has exclusively focussed on proprietary, Windows-only software. In recent years Microsoft shifted to open source and cross platform solutions in many areas, including .NET. With that, the newest incarnation of .NET is .NET Core, which is completely open source and available across various platforms.

This post explains the state of Microsoft’s tech stack, from the perspective of this new “.NET Core”.

Note on versions: this post was written when .NET Standard 2.0 and .NET Core 2.0 have just come out. Most information also holds for earlier versions, but unless specified otherwise, all text and code below assume version 2.0 to be the context.

How .NET traditionally worked

Let’s first investigate how .NET in general works, with the pre-.NET Core context in mind.

As a developer, you can write some C# or VB.NET code. When you compile this code, you’ll get IL (Intermediate Language), which is bytecode. This bytecode is packaged in DLL and possibly EXE files, which runs on any computer. Well, technically it runs on any computer… that has the .NET Framework to run it. Remember, talking pre-.NET Core here, so this “any computer” has to be a Windows machine with the proper version of the .NET Framework.

The part of .NET that actually runs the application is the CLR (Common Language Runtime). Included with the CLR is a GC (Garbage Collector) and other memory management tools. Another important part of the .NET Framework is the BCL (Base Class Library) which contains essential base libs, for example for collections, IO, XML handling, etc.

In addition, .NET itself also used to ship with application frameworks. For example frameworks for desktop applications (WinForms and WPF), as well as web application frameworks (ASP.NET). This has changed in recent years. Now, almost all application frameworks (including ASP.NET MVC), are distributed as packages. This is done using the NuGet package manager, where application frameworks live as siblings to other libraries and SDKs. Note that Microsoft’s packages sit there along the third party packages.

And that’s all the basics for building .NET applications traditionally. With that out of the way, let’s move on to the interesting bits.

Terminology

The best way to start explaining about the “new” .NET situation is by building a glossary.

Terminology around .NET Core has been very confusing for quite some time. But since around halfway through 2017 it seems things are coming together. I’ve left all obsolete terms (Hello there, “DNX”!) for an appendix at the end, and will first focus on current terminology. Here’s a quick overview of the important terms.

Let’s start with the most important thing, which is in my opinion is not “.NET Core”. It is .NET Standard, which simply specifies an API. It lists all the (namespaced) types and methods you should implement to create a .NET Implementation (sometimes also referred to as a “.NET Framework” or a “.NET Platform”).

So what .NET Implementations are there then? Several! First, the most well-known one is the .NET Framework, which is available only for Windows.

Second, the .NET Framework framework has been ported, and this (cross platform) port is known as the Mono framework. Today Mono is not only port, but it is in fact also explicitly a .NET Implementation by implementing the .NET Standard officially.

Third, there’s Xamarin. Now there is a company named “Xamarin” (now owned by Microsoft), which develops similarly named platforms: Xamarin.iOS and Xamarin.Android.
These are both versions of Mono for their respective mobile platforms. Recent and upcoming versions of Xamarin.iOS and Xamarin.Android will be .NET Implementations that conform to the .NET Standard too.

Fourth and finally, let’s get to the main topic: .NET Core. This is a cross-platform .NET Implementation by Microsoft, conforming to the .NET Standard. Moreover, it’s completely open source, with most parts using the permissive MIT license.

Basically Microsoft re-implemented the Windows-only .NET Framework in the cross-platform .NET Core, where overlap between the two is specified by .NET Standard. Note that large parts of .NET Core are forked from the .NET Framework.

Within .NET Core there are two other important terms. First, CoreCLR is the Common Language Runtime (CLR) of .NET Core. This is the part that runs your .NET Core applications, takes care of memory management, etc. Second, CoreFX is the Base Class Library (BCL) of .NET Core. It contains the basic types such as those around collections, IO, xml handling, etc. All of these bits and pieces are available cross-platform.

With those terms laid out, let’s dive into the details.

.NET Standard

The .NET Standard API specification for .NET Implementations has different versions. The code and documentation can be found on GitHub, which also shows which implementations conform to each version of .NET Standard. Here’s a trimmed down version of the current overview:

.NET Standard versions

For example, from the above you can tell that .NET Core 1.0 implements .NET Standard 1.0 through 1.6. And as another example, .NET Standard 2.0 is implemented by both .NET Core 2.0 and .NET Framework (the Windows-only one) 4.6.1.

You can easily check what’s in a specific version by checking the markdown-based docs for all versions. It includes “diff” files showing what changed since the previous version. For example, this API was added to .NET Standard going from 1.6 to 2.0:

Now for the important part! When writing .NET code, you can choose what your intended target (“Target Framework“) is. But this does not need to be a .NET Implementation.
You can also target .NET Standard!

But “Why would you target a spec, which cannot run anything?”, you might ask. The main reason to do that would be when you’re writing some kind of library.

For example, suppose you’re targeting .NET Standard 2.0 with your hip new FooBar library. By using .NET Standard as a Target Framework you’re basically saying: anyone running an app on a .NET Implementation supporting .NET Standard 2.0 can use my library.

Now suppose you are a library or framework author who publishes things on NuGet. You then have to specify what Target Framework your code’s compatible with. So from NuGet we can extract interesting statistics, and see that the community is really getting on the .NET Standard bandwagon. Most popular libraries already support even from some 1.x version onward of .NET Standard (usually 1.3 or 1.6).

In addition to explicit Framework targeting, there’s something specific for .NET Standard 2.0. A “compatability shim” was also rolled out in the tooling around packages, meaning you can use any library that is de facto API-compatible with .NET Standard 2.0. Even if the author didn’t explicitly declare it to be compatible. And although this might seem dangerous, it works pretty well in practice, allowing for application authors to switch more quickly to .NET Core if they want to.

.NET Core

This is where things get cross-platform! You can download the SDK for Windows, various Linux distributions (e.g. RHEL, Ubuntu and Mint, SUSE), and Mac OSX. The SDK contains both the CoreCLR (runtime) to run applications, as well as the tools needed to create and build applications.

After installing you can use the command line interface to test everything is working. Just:

  1. Create a folder “hellow” and cd into it;
  2. Execute dotnet new console, which generates:
  3. Execute dotnet run;

And you should see the traditional “Hello World!” greeting now.

To move beyond the CLI to using an IDE for development, there are several choices.

  • Visual Studio is still probably the best experience on a Windows machine.
  • VS Code is available on Windows, Mac, and Linux, offering a pretty light-weight IDE.
  • JetBrains Rider is an Intellij-like IDE for .NET development, available on Windows, Mac, and Linux.

Any code you compile, on any OS, with any IDE, should be runnable on .NET Core on other OSes. If .NET Core is installed on that OS.

You can also create “self-contained applications”: applications that include the .NET Core runtime as well. Obviously, then you need to specify the platform to target because .NET Core binaries are platform-specific. You do this by publishing with a Runtime Identifier (RID) like “win10-x64“, or “osx.10.12-x64“, or “linux-x64“. This will compile your .NET Core application and bundle it with the appropriate version of .NET Core itself.

And that’s really all there is to it. From here on out it’s all about writing code in a .NET language of your choice. This means C# or F#, or VB.NET in the near future.

Wrapping Up

Microsoft is changing up their game. Although the traditional .NET Framework is here to stay, the new .NET Core framework is the future. They are both .NET Implementations and yes: they have overlap (as defined by .NET Standard). But you can safely bet on the fact that .NET Core and .NET Standard are going to get focus forward.

Given that all these efforts are both open source and cross-platform, riding along that train seems like an excellent idea. Especially if you’re currently using another tech stack, but interested in the .NET ecosystem, now is a great time to hop on and join for the ride!

Just give it a go!

~

This post formed the backbone of my talk at DomCode 2017-08-29. By and large it can be considered a transcript of that talk. If you want you can also download the slides of my presentation.


Appendix A: Bonus Topics

There are plenty more in-depth and advanced topics. Here’s a quick list of particularly interesting ones you could further pursue:

  • Docker and .NET Core go very well together. The official docs on that should be a good starting point.
  • EF Core (Entity Framework Core) gets a lot of attention too. EF is Microsoft’s ORM framework, and it has its own dedicated (sub)site with more info.
  • UWP (Universal Windows Platform) for creating Windows Store apps that could be cross platform (including things like Xbox, Windows Phone, HoloLens, etc) will also likely conform to .NET Standard. Check the main UWP docs for further info.
  • Roslyn is the code name for the open-source compilers for .NET languages. The best starting point for more details is the Roslyn Github repo.
  • .NET Native will allow you to compile your .NET code not to IL (bytecode), but to platform-specific native code. Check the official docs for more info.

Appendix B: Obsolete Terminology

Here’s a short list of (currently) prominent terms that I consider to be obsolete, along with their definition (and the source of that definition, if applicable).

  • DNX (Dotnet Execution Runtime), DNVM (script for obtaining DNX) and DNU (Dotnet Developer Utility) were part of older Release Candidates of .NET Core. The features have mostly been moved to the .NET Core CLI. See the Microsoft docs for more info.
  • project.json was meant to be the new project system, but instead Microsoft decided to move back to csproj files with some new features. Read more on these Microsoft docs pages.
  • PCL (Portable Class Library) was an earlier attempt to help library authors create code that could be reused across various fameworks and platforms. The best reference I could find is these docs from Microsoft. In light of .NET Core you can easily forget about it though, unless you need to convert a PCL project to .NET Core.
  • vNext (which at some point was also called ASP.NET 5) can best be seen as a working title of the next .NET Framework version (the one for Windows only), but has been dropped entirely. About the only semi-sensible reference left is on Stack Overflow.
  • ASP Classic is not really an obsolete term, but rather obsolete technology. The latest stable release was from around the year 2000. It has nothing to do with .NET or the various ASP.NET application frameworks. Wikipedia has a quick history recap if you want it.

References

AutoMapper: Missing type map configuration

While trying out AutoMapper I stumbled on this generic error:

Message: AutoMapper.AutoMapperMappingException : Missing type map configuration or unsupported mapping.

Below is the initial Stack Overflow question I wrote, after struggling for at least 25 minutes with this problem. The solution however was shamefully simple: if you call Mapper.Initialize twice, the latter will overwrite the first.

Full Description

So why am I writing an entire post about this? Simple: to ingrain this solution into my brain, may I never make the same mistake again.

Basically, I was trying to understand a more specific version of this generic question on AutoMapperMappingException, getting the same kind of error message:

Message: AutoMapper.AutoMapperMappingException : Missing type map configuration or unsupported mapping.

Here’s a way to repro my scenario:

  1. Using VS2017, create new “xUnit Test Project (.NET Core)” project (gets xUnit 2.2 for me, targets .NETCoreApp 1.1)
  2. Run `Install-Package AutoMapper -Version 6.0.2
  3. Add the following code
  4. Build
  5. Run all tests
  • Expected result: green test.
  • Actual result: error message:

    Message: AutoMapper.AutoMapperMappingException : Missing type map configuration or unsupported mapping.

    Mapping types:
    FooEntity -> FooViewModel
    XUnitTestProject3.FooEntity -> XUnitTestProject3.FooViewModel

If I uncomment the line marked as “culprit” the test turns green. I fail to see why.

I also placed a Mapper.Configuration.AssertConfigurationIsValid() call right before the Map call but that will run without error.

As far as I can tell, the other question, specifically its top answer talks about forgetting the initialization, but that’s explicitly there. I’ve also looked through the other answers but none of them helped me.

Another top question’s answer to this same problem tells me to add ReverseMap(), but that’s not applicable for my scenario.

Solution

Only after writing the entire above question on Stack Overflow, specifically while perfecting the minimal repro, did I realize what was causing the error: the line marked as “Culprit!”. Then, buried deep in Google’s search results (okay, okay: on page 2; but who looks on page 2 of search results?!) I find this answer that has the solution. Multiple initializations should be done like this:

I guess that teaches me for disregarding the advice to use Profiles for proper AutoMapper configuration.

Entity Framework: Cascading Delete of Optional Related Entity

After several years of NHibernate, and a couple of years Dapper and NoSQL, I’m now working on a project that uses Entity Framework as its ORM. It’s a mature ORM at this point. However, it does give me a headache as I’m struggling to find ways to do certain things. I’ll admit that I’m trying to dive in without sitting down and doing a few hours of learning first; for sure I’ll be on Pluralsight some time soon.

Fair warning: what comes next is a highly specific, rather technical, and possibly stupidly uninformed dump of a problem I ran into. Feel free to skip this one: no hard feelings, and I’ll see you on my next post!

The main problem: I’ve got a repro, but it’s extremely similar to various other questions on Stack Overflow. Except… that my code contains the accepted (and often highly upvoted) answers’ code as well, but still doesn’t work as advertised!

What I’m trying to do is make sure that EF will delete a “Child” property (i.e. the database row) automatically when its “Parent” is explicitly deleted via the DbContext. Note that the Child is optional in my scenario.

Here’s the repro. Create a new class library and install EntityFramework (I used 6.1.3) and NUnit (I used 3.6.1). Then drop in one namespace these entities:

And then this DbContext:

And finally this TestFixture:

Make sure the “TestDb” database is available on your “(LocalDb)\cascades” instance (or use another Sql Server instance and database). Then run the tests, and you’ll get:

Test Failed – Delete_will_cascade.
Message: “Child”.
Expected: 0, but was: 1.

All I want is that a “Child” is deleted with its “Parent” once that is deleted. I know I can use a Sql CASCADE, and I know I could manually remove the Child from the context, but I want EF to handle this automatically, damnit!

For sure I’ve missed the obvious solution. I was somewhat hoping it would come to me while writing this. But perhaps I just need a good night’s rest.


References:

Git intro for TFVC users

So, you’re using Team Foundation Version Control (TFVC). You know about Team Projects and Collections, have a “stable” and “dev” branch for most projects, know how to do basic merges, and know how to shelve and unshelve changes. But you only have superficial knowledge of Git.

Well, then you’ve come to the right place: my Git intro for TFVC users.

Disclaimers

Here’s a heads-up about this particular post:

  • It is not in-depth;
  • It will oversimplify things to the point where not every statement is technically true;
  • It does not teach you actual Git skills or commands;
  • It is somewhat subjective.

Oh, and in my humble opinion: Git has a crazy learning curve. So buckle up!

And, as a final disclaimer: I’m writing this because I want to be able to explain the basics of Git, and certainly not because I’m an expert. In fact, I know more about Mercurial than about Git, I have worked mostly with TFVC in the past year, and would have to look up nearly every Git command when using the command line (I prefer GUI tools most of the time). Just so you know.

Disclaimers done, let’s get started!

On “TFS” vs “TFVC”

First, let’s get the TF* terminology right. These terms are different but related things:

  • “TFVC” stands for Team Foundation Version Control, and is the actual system for keeping history of your codebase;
  • “TFS” stands for Team Foundation Server and is the “environment” (if you will) in which source control features come together.

Many people, myself included, often use the acronym “TFS” when we actually mean “TFS with TFVC”. This is probably because TFS is very often used with TFVC as the version control system; but note that you can also use Git with TFS.

This blog post focuses on TFVC (which almost always implies you’re using TFS too).

Why Learn about git?

The personal reasons for learning Git are quite simple: it’s the de facto standard for version control. Knowing about Git is:

  • crucial for your career (your new employer is likely to use Git);
  • crucial for talking to new hires (who will very likely know Git);
  • crucial for efficiently navigating open source.

The intrinsic reasons for learning Git all come down to the fact that it is insanely powerful (which also accounts for the steep learning curve). After having worked with TFVC in a 12 person team for over a year, I’d like to highlight the following main- and most direct advantages over TFVC:

  • Cheap branching;
  • Small repository size;
  • Local commits;
  • Better options for “shelving”;
  • Speed;
  • Better “offline” support.

Beyond these advantages, which you’ll get if you’re using a “central” Git repository, there are even more goodies if you tap into the distributed nature of Git, as well as the more powerful commands (e.g. to rewrite history).

Basic differences

Git is a lot like TFVC, in that it is also a Version Control System (VCS). It is also quite different. Here’s how they compare when managing the code base and doing changes:

TFVC Git
There is a central, server-hosted Team Project and you’ll have one or more local workspaces containing a copy of all the code. For now, let’s assume that there is a central repository hosted somewhere on a server. You’ll have one or more local “clone” repositories containing a copy of all the code and also all of its history.
check in is a command to send your pending local changes to the server where they will be committed to the version control history. You commit changes locally which creates a “change set” private to your “clone”. You can do this multiple times. You push one or more change sets at once to the central repository.
You can Get the Latest Version from the server which directly tries to merge with your current state. You fetch changes from the central repository and merge with your current state (or do both at once by doing a pull) possibly creating a new commit.

The fact that a TFVC “check in” is multiple separate commands in Git gives several advantages:

  • You can build history in small, individual steps (commits), which you can undo individually in several ways, and you can apply individual commits to other branches.
  • Your changes are “safe” in commits when you want to send your changes to the central repository and find out others have conflicting changes, by default they won’t get “lost” if merges go bad. With TFVC, you’re required to resolve conflicts as you try to check in, which can screw up “unsaved” changes.

Our next set of differences would be about branching, but before that it’s good to talk briefly about “shelving” changes.

Setting changes aside

With TFVC you can shelve changes you currently have to safeguard them, optionally reverting those changes in your local workspace. You don’t necessarily need to do this when switching work to another branch, because the branch is typically a sibling folder of the main branch. More on that later.

With Git, you can stash changes you currently have, reverting changes in your local clone. You have to do this before switching to another branch. In order to see how that works, let me move on to the next topic: branches.

Branches

A typical setup in TFVC starts like this (with a local workspace matching the Team Project 1-on-1 in this example):

Inside the project folder for the Team Project, there immediately is a sub folder called “main”. This is done so that it is easy to branch off the entire codebase for that project into this:

The folder “dev” is now a complete copy of “main”. This also means you could be working on two branches simultaneously, and you can check in files to both branches at ones if you so desire. You could even have “main” and “dev” at different points in your version control history.

Git is different.

With Git, the “main” folder as you’d have it in TFVC would be pointless. Instead, there is just:

You can “branch off” any state of “project-x” at any point in time. Your folder “project-x” will point at a certain state, de facto having the code for a specific branch.

If you would like to have both branches “ready for action” on your disk, you would typically have multiple “clones” of the repository. One would be at the most recent state of the “main” (typically called “master”) branch, and another might be at the most recent state of another branch.

As a summary, Git branches relative to TFVC branches:

  • Are light-weight and can be used to “switch context” for example to work on a feature;
  • Are a top-level thing, as opposed to the “path-based” branches in TFVC;
  • Allow for more precise merges and “cherry picking” of commits;

These differences allow for completely different workflows with Git. This is a topic for another post though, if you’re interested I’d recommend research things like “Git Flow”, “GitHub Flow”, and “GitLab Flow”.

Other differences

While there are many other differences, both small (e.g. specific commands) and big (e.g. the “distributed” nature of Git), this post will keep it at the above. Mentioned differences are in my opinion the most “direct” and eye-catching differences. If you want to learn about the other differences I recommend you first get started with some practical skills, e.g. by following a tutorial.

Conclusion

TFVC is a mature, decent version control system. But personally, having worked with both centralized and distributed version control, in TFVC I’m misssing:

  • Ability to do commits locally;
  • Cheap, more powerful branching;
  • Great “shelve” options;

Those three, in addition to its “distributed” nature enabling online platforms like GitHub to flourish, are likely the reasons Git is so popular nowadays. In my opinion those are great intrinsic reasons to learn Git when you’re currently using TFVC (even if you cannot switch on the short term), and if not for those reasons then because it’s crucial for your career.

So ready yourself for a bad-ass learning-curve climb, and start learning more about Git!


Resources & Further Reading

Here are some links to continue your journey:

Pet project status report

This post assumes you’ve read my previous post on this project. It’s going to be a very short status report.

In aforementioned post I’ve tried to break through my analysis paralysis by listing all the things I had to think about (it helped!). Let me re-iterate, update, and complete the list to once again gather my thoughts:

  1.  Project and Namespace structure. KISS, so I went with just a single project (plus one for tests) for now.
  2.  Folder structure. following the lead of popular C# projects, only needing some files in the root and a src folder for the projects.
  3.  Initializing git. this was actually the biggest mental blockade. I just bursted through: screw TFS history, screw being “optimal”, and just move. I copy-pasted all code, cleaned everything carefully, and initialized the git repo with an already decently sized project.
  4.  License. MIT.
  5.  GitHub setup. Let’s start simple. Project created under our organization’s GitHub account. Pushing straight there for now, using my own personal profile. Will think about working with forks and pull requests later.
  6. NuGet packaging. Haven’t started this yet.
  7. Re-including the open sourced bits in my closed source solution. Have postponed optimizing this. The not-so-optimal solution for now is that the projects are gone from TFS, and there’s a “lib” folder instead with compiled DLLs from the open source project.
  8.  Choosing a name. Chosen, but not ready to disclose yet, even though it’s very easy to sherlock this bit.
  9. .NET Core. We’ll cross that bridge when we get there.
  10.  What am I forgetting? Looking at some of the “top” C# GitHub projects (Restsharp, NodaTime, dapper-dot-net, AutoMapper) was extremely helpful. Note to self: in-depth code reviews of those projects will be extremely educational.
  11.  Minimum quality. I actually chose to make this project an exercise in being as “clean” as possible (within -though close to the edge of- reason). But one bullet at a time, so e.g. crafting a great readme is a sub-exercise left for later.
  12. Early feedback. Still on my list, but want to get to some kind of “alpha” stage before I send out review requests.
  13. Logo. Was a great excuse to get started with the recently released GitHub Projects feature.
  14.  XML Documentation. Probably way over the top, but a nice personal exercise, so worth it after all.
  15. CI. Have not started with this yet, but know I have to at some point.
  16. GitHub Wiki. Probably way over the top, but would be a nice personal exercise to create one.
  17. Domain name. Not sure how that plays together with the license, the fact that the repo was initialized under my organization’s account, or any trademark stuff. Will have to figure that out some time soon.
  18. .NET Framework Versions. Related to the .NET Core bullet I guess, but slightly more important. I found that the popular repo’s I looked at for guidance have some kind of setup with duplicated projects several times over, not sure how that works. For now I’ll have to stick with a (unfortunately slightly older) version, 4.5.1, because that is the highest version I can use in the project that is dog-food-testing the project.

Okay, now I can stop that hurricane of thoughts, and get back to this pet project, and tick some more things off the list!

Extracting and Publishing closed bits as Open Source

There’s a project at work that’s got all the traits to serve as a great vehicle for dabbling some more with open source:

  • Small-but-not-too-small;
  • Self-contained;
  • Not too specific (e.g. to our domain);
  • Keeping the IP safe isn’t a concern since it’s just a small tool unrelated to the core business (i.e. we’ve got “the okay” to make it open source);
  • It might be of some use to someone else, and failing that it’s still fun to open source it;

While trying to start with the process of extraction and open sourcing, there seem to come up a million things to think about when going through this process. I want to gather my thoughts (by writing this post) and untangle that mental mess, so that I can get to a clear plan of attack. Perhaps this post will even be of use to someone else, somewhere.

Context

Currently, the bits I want to make open source are only slightly coupled to the bits I want to keep closed. That is: currently everything is in one solution, but the split in projects already clearly demarcates what would be open and what would remain closed. To put this in a tabular overview (where “Foo” is the closed bit, and “Bar” is the open bit):

Current Situation

Closed Source

/Foo.App
/Foo.App.Tests
/Foo.Core
/Foo.Core.Tests
/Foo.MyInstances
/Foo.ConsoleWrapper
/Foo.ServiceWrapper

Target Situation

Open Source

/src/Bar.App ???
/src/Bar.App.Tests ???
/src/Bar.Core
/src/Bar.Core.Tests
/src/Bar.ConsoleWrapper
/src/Bar.ServiceWrapper

Closed Source

/packages/Bar ???
/Foo.MyInstances

/Foo.WindowsService ???
/Foo.Console

There are some question marks in that overview. Elaborating on them a bit:

  • I’m not sure if the distinction between “Core” and “App” should remain a dll/project splitup as opposed to possibly a namespace split.
  • I’m not sure how the Open Source solution will be included in the closed source app (via NuGet, or as external source or binaries, or…?).
  • I’m not sure if I can easily distribute a generic windows service wrapper, or if I need to create it in the closed source solution after all.

In addition to these source-related questions, there are other things to think about as well. So next up:

What things to think about beforehand?

Let’s summarize all the things I can currently come up with that might be important in this process, in no particular order, merely numbered to be able to reference them easily:

  1. Project, Namespace, and DLL structure: what’s wise, what’s neat, what’s useful?
  2. Folder structure: how will the repository be structured? What’s future proof here?
  3. Initializing git: currently, the project history is in our corporate TFS. So not even sure if/how to keep the history intact, or if that’s even feasible.
  4. License: okay, that’s not too difficult, but have to choose one nonetheless.
  5. GitHub setup. What’s a good setup? Should I make the organization main owner of the repo, with a personal fork from which I do pull requests?
  6. NuGet packaging: how does this even work? As an application developer I had never had the need to learn how any of this works.
  7. Re-including the open sourced bits in my closed source solution: via NuGet, as external source, or as external binaries?
  8. Choosing a name. One which is not already in use for some software, where a domain with “normal” extension is still avaible, where the name is not taken on GitHub yet, etc.
  9. .NET Core. From my rss feeds I gather that it can be a quite fuss to make open source projects .NET Core “proof”, but it would be yet another thing to tackle while starting this whole thing off. Perhaps something for later? Or is that a grave mistake?
  10. What am I forgetting? I’ve been looking at some of the “top” C# GitHub projects (Restsharp, NodaTime, dapper-dot-net, AutoMapper) and their git history to see how they have evolved, to find out if I’m forgetting anything.
  11. Decent starting point, code-wise: I might be setting unreasonably (and unnecessarily) high standards for the initial commit, but still something to consider. What is the minimum quality I’m requiring before opening the source?
  12. Early feedback: it might be useful to get friends and colleagues to review the setup and get the first version(s) of the repository just right.
  13. Logo: yeah, this project needs the best. logo. ever!

In addition, there’s a practical point: once I split off bits to an open source project, I’ll effectively have two (source-control-wise unrelated) forks of the same code-base, until I’m ready to make my closed source solution depend on the open source bits again.

In Conclusion

After writing the above: wew! No wonder I was not getting anywhere: my mind was just wandering in circles through all of the above points. And I guess “Perfect is the enemy of Good”, or some such. Time to tick off some of the above items.

Aggregated online interactions

This blog hasn’t seen much action lately, but that’s a misrepresentation of my online interactions. Most of my interaction in the past few months has been on Stack Overflow Q&A, and some on Stack Overflow “Documentation” as well as a small amount on GitHub. I wanted to aggregate some of those interactions on my blog, as well as perhaps cross-post bits and pieces here, mainly for my own reference.

Let’s start with the first: aggregating the bits and pieces that I want to have easy links to.

Stack Overflow Documentation

  • Showcasing all common Angular constructs“. I’m linking to the most up to date version. I wrote V1 of that article, which was subsequently improved by various other folks. It’s the tutorial (and equally important: the style of tutorial) I wish I’d had when I started learning Angular.
  • KnockoutJS “Equivalents of AngularJS bindings“. Linked page summarizes the state SO Documentation is currently in, at least for low-traffic tags: little and poor collaboration, and some frustration because some decent examples I wrote just don’t get reviewed (neither approved nor rejected). Thinking I might turn my content there into a (series of) blog post(s) here. Not sure yet.

Stack Exchange Q&A

At around 20 questions and 200 answers in 2016 so far I’d say I’m “moderately active”. Here’s a few that stood out when I browsed through my recent history:

I also gave SoftwareRecs.SE another shot, posting some questions, but they fit right into my question history: lots of unanswered tumbleweeds. And not for lack of trying, as I spend a lot of effort on making my questions there as good as they can be. The main reason I do that (and the reason I keep coming back to softwarerecs.se, in spite of the tumbleweed-factor) is that thinking carefully about your wishes and requirements at the least will help you find something yourself, if no-one else recommends anything.

And even though I haven’t interacted with Cooking.SE much lately, every stray upvote now and then to my “Cooking fish in a dishwasher” answer makes me smile.

GitHub

I don’t interact as much here yet as I’d like. I specifically wish I remembered more often than a measly four times to create gists, because the ones I did create are ones I tend to go back to. In addition to gists, I’ve gotten to creating only very few issues and pull requests, something I want to work on.

One shoutout by the way to the DefinitelyTyped repository, because that community has to deal with a really scattered committer base, and seem to do so pretty well. My pull request (though small) was reviewed and merged quite quickly.

In Closing…

What to do next? The tags I followed on Stack Overflow for answering seem to have dried up a bit. Perhaps some more interaction on GitHub, as well as re-editing some of the above links into blog posts? Then again, a few weeks of vacation to Hawaii are coming up as well, so it might be a while again before posting…

Nested Elastic Explorations – Part 2

Reusing a properly modelled domain for storing data in Elasticsearch does not work well out of the box. Let’s examine a problem scenario. Consider this mini-domain:

Mini Bieb Domain

This ties in with my last post, where I mentioned that loops are a pain for serializing to json. Here’s the loop, visualized:

Mini Bieb Domain Loop

The problem is that NewtonSoft (used under the hood by Nest) will start serializing “The Greatest Book”, and recurses through all properties. In the end it’ll try to serialize “The Greatest Book” again as part of “Richard Roe”‘s AuthoredBooks property.

Breaking this serialization loop is actually pretty simple with NewtonSoft, and since a while you can inject the appropriate NewtonSoft setting in Nest as well. Something like this:

Problem solved, right? Not so much. Here’s why. Suppose I use the LoopHandling “fix” and load up the mini-domain with this integration test:

This will create a document in Elasticsearch of a whopping 71 KB / 1364 lines, see this example JSON file. Not so good.

The simple solution which would do for now would be to index only Book items, and all related people (authors, editors, translators), but not those people’s Books (AuthoredBooks, etc). We somehow need to let Nest and Elasticsearch know that we want to stop recursion right there. The question is how to be explicit about how they should map my domain objects to documents. I see two courses of action I like:

  1. Declarative mapping, with Attributes. This would (to my taste) require separate DTO classes to represent the documents in Elasticsearch, and have an explicit transformation between those DTO’s and my Domain objects. (I wouldn’t like to litter my domain object classes with persistence-specific attributes.)
  2. Mapping by code. This would seemingly allow me to keep using domain object classes for persistance, having the “Mappings” in code as a strategy for the transformation in separate files. At this point though I’m unsure if this approach will “hold up” once you start adding more complex properties and logic to domain objects.

I lean towards option 1, even though it feels like it’ll be more work. Guess there’s only one way to find out…

Nested Elastic Explorations – Part 1

After my previous post on what to explore next I’ve dabbled with option 4 “Linux” (but got frustrated so I set it aside), worked some on option 2 “More Gulp!” (bu that was mostly at work). However, I think I’ve settled for now with putting some effort in learning Elasticsearch and Nest. So let’s start a series of posts on this venture. (No promises though, this “Part 1” post might end up being the only part…)

About this venture…

Elasticsearch can be seen as a NoSQL database, more specifically as a document database. I’ll be using my Bieb pet project (or at least its domain) for testing its features. Bieb’s domain should be familiar to everyone: books!

Specifically I’m writing this post because I quickly ran into a fundamental challenge with document databases, and I hope to gather my thoughts on the matter by writing about it. To set the stage, I’ll be talking about this part of the domain:

  • A Book has multiple Authors (of type Person).
  • A Person has authored mulitple Books.

The classic many-to-many relationship example. The challenge then obviously is how to express this in the various tools.

Setting the coding stage…

Before I dive into the Elasticsearch bit, let me first describe how I got this going in C#, SQL, and NHibernate. First, the SQL bit:

Run-of-the-mill stuff, with a many-to-many table. Obviously, we don’t want to see BookAuthor show up as a class in our code. That is, we want this kind of C#:

As you can see, the Book is the “main” entity and the “boss” of the relationship with authors. Apart from the virtual keyword everywhere, little to no SQL or NHibernate know-how leaked into this domain.

The NHibernate mapping looks like this:

The important bit, and the link to Elasticsearch, is the inverse="true" bit. With NHibernate, you have to indicate which entity is “in charge” of updating the many-to-many table. With document databases you have to do the same thing.

Moving to Nest & Elasticsearch…

Diving head first into throwing these entities into Elasticsearch, I tried to run the following Nest integration test:

It’ll fail, because Nest internally uses JSON.Net to serialize the Book, which means you’ll get this:

Newtonsoft.Json.JsonSerializationException : Self referencing loop detected with type ‘Bieb.Domain.Entities.Book’. Path ‘allAuthors[0].authoredBooks’.

A problem that was to be expected.

With SQL self-referencing loops are just fine, handled by the many-to-many table. In NHibernate it’s a minor nuisance, simply handled by the inverse="true" bit. However, with document databases this is somewhat less trivial, in my opinion. In fact, it may be the hardest part of using them: how do you design your documents?

What I’d like to do…

The solution obviously is to break the self-referencing loop somewhere. Simply not serializing a Person’s authored books should do the trick. (On a side note, it occurs to me that graph databases would not need this trick, and would be great for a scenario with endless Book-Person-Book-… traversal, but that’s for another time.)

However, I don’t want to hard-code the loop-break into my domain object, e.g. by marking AuthoredBooks as not-serializable. The first and foremost reason is that this would mean my storage layer leaks into my domain layer. The second reason is that I may want to break the loop at different places depending on the Elasticsearch index I’m targeting. That is, I’ll probably have an index for Authors as well as Books, and want to break the loop in different places on either occasion.

Questions…

So some questions remain. How could we do this with JSON.Net without altering the domain layer? Or is there a way to do this with Elasticsearch + Nest? Or should we bite the bullet and have a seperate set of DTO’s to/from our domain entities to represent how they’re persisted in Elasticsearch? Or do we need a different approach alltogether?

Good questions. Time to go and find out!

A New Hope

Did I mention before that I write here to collect my thoughts and as writing practice? Well, in any case, after my previous monster post, and given that it’s the start of a new year, I figured I’d do a “New Beginnings” post. Then the sci-fi addict in me awoke and typed “A New Hope“, even though I’m a Trekkie at heart. But I digress.

I wanted to summarize which technologies and experiments I’m most interested in pursuing next. Let me try to list them in order of current preference.

1. Elasticsearch …. and Nest?

Elastic as a (primary?) data store intrigues me. It feels like a second chance for Lucene towards me (experimenting with Solr didn’t “click” for me). There recently was a major version released so it seems like a great time to start investigating. I’m hesitant about Nest though, which you’d kind-of have to use if you want to access Elastic with .NET: during my first few experiments I felt like I was “wrestling” the API.

2. Gulp, or: More Gulp!

I’ve done quite some work with Gulp recently, and it was a very (and surprisingly) fun experience! In the past, most devops tasks felt like chores to me, but not so much with this Gulp. So this point would actually be “More Gulp” (as I’ve done quite a bit of it recently already), but I still have quite a list of things I want to try out, like for instance setting up JSCS.

3. Node

Yes, yes, I know: I’d be a hipster-after-the-fact if I’d start on Node now. Even so, I enjoy writing Javascript, and Node would be a way to do even more of it in my pet projects. Not sure how I would be utilizing Node yet: with ExpressJS, for custom (Gulp) packages, general scripting, or perhaps something like project Euler.

4. Linux

This one’s been on my mind for a while now: perhaps I’ve been stuck too much in my Windows-comfort-zone. It would be good to shape up my base knowledge of Linux as a development environment.

5. Grunt

Why yes, in addition to Gulp there’s also Grunt on this list. Gotta know what’s on the other side of the fence to be able to compare, right?


I’m placing a short but distinct separation here, because these are currently mostly “in reserve” kind of ideas.


6. Couchbase

Working a little with this at work as well. There’s a few things I’d be excited about to try at home, mostly features from version 3 and 4. Would need a “subject” though to make it interesting…

7. Study three-way-diff-tools

For some reason I never got my mind wrapped around these tools. And that’s frustrating. I should just spend a few hours trying to understand the details of these tools… I guess.

8. Raspberry / something IoTish

This one’s on the list because it nags me that I don’t see what all the rage with IoT is (or is it “was”?) about, and for that reason I feel like I have to “make Slackbot call a webhook that calls my webserver that calls my Raspberry which turns on a LED strip and autopilots a drone around in my room”. Or something.

9. Android app

I’m curious how the Android-dev-experience compares to my short experiments from around 2011. This would also allow me to revisit Java. But on a whole I feel I’d have to invest a lot of time or not do this one at all.

10. Closure Compiler

It’s on the list because I find the idea of it intrinsically interesting. It’s low on the list because I’m not sure if I work on things personally or professionally at the moment where this would be worth the investment.


Wow, that worked like a charm! Apologies for the stream-of-consciousness post, but now I do know what to do next! And I’m not even going back to adjust the (order of the) list accordingly, I’m just gonna leave you all hanging out there, guessing at what’s next…