Entity Framework: Cascading Delete of Optional Related Entity

After several years of NHibernate, and a couple of years Dapper and NoSQL, I’m now working on a project that uses Entity Framework as its ORM. It’s a mature ORM at this point. However, it does give me a headache as I’m struggling to find ways to do certain things. I’ll admit that I’m trying to dive in without sitting down and doing a few hours of learning first; for sure I’ll be on Pluralsight some time soon.

Fair warning: what comes next is a highly specific, rather technical, and possibly stupidly uninformed dump of a problem I ran into. Feel free to skip this one: no hard feelings, and I’ll see you on my next post!

The main problem: I’ve got a repro, but it’s extremely similar to various other questions on Stack Overflow. Except… that my code contains the accepted (and often highly upvoted) answers’ code as well, but still doesn’t work as advertised!

What I’m trying to do is make sure that EF will delete a “Child” property (i.e. the database row) automatically when its “Parent” is explicitly deleted via the DbContext. Note that the Child is optional in my scenario.

Here’s the repro. Create a new class library and install EntityFramework (I used 6.1.3) and NUnit (I used 3.6.1). Then drop in one namespace these entities:

And then this DbContext:

And finally this TestFixture:

Make sure the “TestDb” database is available on your “(LocalDb)\cascades” instance (or use another Sql Server instance and database). Then run the tests, and you’ll get:

Test Failed – Delete_will_cascade.
Message: “Child”.
Expected: 0, but was: 1.

All I want is that a “Child” is deleted with its “Parent” once that is deleted. I know I can use a Sql CASCADE, and I know I could manually remove the Child from the context, but I want EF to handle this automatically, damnit!

For sure I’ve missed the obvious solution. I was somewhat hoping it would come to me while writing this. But perhaps I just need a good night’s rest.


References:

Nested Elastic Explorations – Part 2

Reusing a properly modelled domain for storing data in Elasticsearch does not work well out of the box. Let’s examine a problem scenario. Consider this mini-domain:

Mini Bieb Domain

This ties in with my last post, where I mentioned that loops are a pain for serializing to json. Here’s the loop, visualized:

Mini Bieb Domain Loop

The problem is that NewtonSoft (used under the hood by Nest) will start serializing “The Greatest Book”, and recurses through all properties. In the end it’ll try to serialize “The Greatest Book” again as part of “Richard Roe”‘s AuthoredBooks property.

Breaking this serialization loop is actually pretty simple with NewtonSoft, and since a while you can inject the appropriate NewtonSoft setting in Nest as well. Something like this:

Problem solved, right? Not so much. Here’s why. Suppose I use the LoopHandling “fix” and load up the mini-domain with this integration test:

This will create a document in Elasticsearch of a whopping 71 KB / 1364 lines, see this example JSON file. Not so good.

The simple solution which would do for now would be to index only Book items, and all related people (authors, editors, translators), but not those people’s Books (AuthoredBooks, etc). We somehow need to let Nest and Elasticsearch know that we want to stop recursion right there. The question is how to be explicit about how they should map my domain objects to documents. I see two courses of action I like:

  1. Declarative mapping, with Attributes. This would (to my taste) require separate DTO classes to represent the documents in Elasticsearch, and have an explicit transformation between those DTO’s and my Domain objects. (I wouldn’t like to litter my domain object classes with persistence-specific attributes.)
  2. Mapping by code. This would seemingly allow me to keep using domain object classes for persistance, having the “Mappings” in code as a strategy for the transformation in separate files. At this point though I’m unsure if this approach will “hold up” once you start adding more complex properties and logic to domain objects.

I lean towards option 1, even though it feels like it’ll be more work. Guess there’s only one way to find out…

Nested Elastic Explorations – Part 1

After my previous post on what to explore next I’ve dabbled with option 4 “Linux” (but got frustrated so I set it aside), worked some on option 2 “More Gulp!” (bu that was mostly at work). However, I think I’ve settled for now with putting some effort in learning Elasticsearch and Nest. So let’s start a series of posts on this venture. (No promises though, this “Part 1” post might end up being the only part…)

About this venture…

Elasticsearch can be seen as a NoSQL database, more specifically as a document database. I’ll be using my Bieb pet project (or at least its domain) for testing its features. Bieb’s domain should be familiar to everyone: books!

Specifically I’m writing this post because I quickly ran into a fundamental challenge with document databases, and I hope to gather my thoughts on the matter by writing about it. To set the stage, I’ll be talking about this part of the domain:

  • A Book has multiple Authors (of type Person).
  • A Person has authored mulitple Books.

The classic many-to-many relationship example. The challenge then obviously is how to express this in the various tools.

Setting the coding stage…

Before I dive into the Elasticsearch bit, let me first describe how I got this going in C#, SQL, and NHibernate. First, the SQL bit:

Run-of-the-mill stuff, with a many-to-many table. Obviously, we don’t want to see BookAuthor show up as a class in our code. That is, we want this kind of C#:

As you can see, the Book is the “main” entity and the “boss” of the relationship with authors. Apart from the virtual keyword everywhere, little to no SQL or NHibernate know-how leaked into this domain.

The NHibernate mapping looks like this:

The important bit, and the link to Elasticsearch, is the inverse="true" bit. With NHibernate, you have to indicate which entity is “in charge” of updating the many-to-many table. With document databases you have to do the same thing.

Moving to Nest & Elasticsearch…

Diving head first into throwing these entities into Elasticsearch, I tried to run the following Nest integration test:

It’ll fail, because Nest internally uses JSON.Net to serialize the Book, which means you’ll get this:

Newtonsoft.Json.JsonSerializationException : Self referencing loop detected with type ‘Bieb.Domain.Entities.Book’. Path ‘allAuthors[0].authoredBooks’.

A problem that was to be expected.

With SQL self-referencing loops are just fine, handled by the many-to-many table. In NHibernate it’s a minor nuisance, simply handled by the inverse="true" bit. However, with document databases this is somewhat less trivial, in my opinion. In fact, it may be the hardest part of using them: how do you design your documents?

What I’d like to do…

The solution obviously is to break the self-referencing loop somewhere. Simply not serializing a Person’s authored books should do the trick. (On a side note, it occurs to me that graph databases would not need this trick, and would be great for a scenario with endless Book-Person-Book-… traversal, but that’s for another time.)

However, I don’t want to hard-code the loop-break into my domain object, e.g. by marking AuthoredBooks as not-serializable. The first and foremost reason is that this would mean my storage layer leaks into my domain layer. The second reason is that I may want to break the loop at different places depending on the Elasticsearch index I’m targeting. That is, I’ll probably have an index for Authors as well as Books, and want to break the loop in different places on either occasion.

Questions…

So some questions remain. How could we do this with JSON.Net without altering the domain layer? Or is there a way to do this with Elasticsearch + Nest? Or should we bite the bullet and have a seperate set of DTO’s to/from our domain entities to represent how they’re persisted in Elasticsearch? Or do we need a different approach alltogether?

Good questions. Time to go and find out!

Pretty Database Diagrams

This is a cross-post about database diagrams, originally a self-answered question on the new Software Recommendations Stack Exchange site.


Up until a few years ago I used Microsoft Visio to reverse engineer an existing database to a diagram. This feature has been deprecated for a while now, and I’m rather stuck in finding an alternative.

Requirements

So, what would be the requirements for an alternative? Most of the requirements below were in Visio, though not all of them. I guess finding one piece of software that has them all is not feasible, but I’d like to get close.

Important:
  • SQL Server 2012 databases;
  • Reverse engineer tables with all columns and Primary/Foreign Key indications;
  • Reverse engineer foreign keys as connectors between tables;
  • Manual layouting of boxes and connectors (I want to group stuff visually);
  • Basic theming: choose background colors for tables;
  • Able to handle complex models, e.g. 100+ tables;
  • Able to choose which items to show or not (e.g. leave out certain tables, etc);
  • At least a free trial available;
  • Exporting to various formats for distributing and printing purposes, PDF, or PNG would also do;
  • WYSIWYG;
Main bonus points:
  • Runs on Windows 8 (though I’d install a Linux VM for software fulfilling all important requirements)
  • Automatic layouting and connector routing to minimize overlap;
  • Able to do SQL Server 2008 and 2008R2;
  • Able to reverse engineer views;
  • Free (or a free version);
  • Incremental update, e.g. reverse engineer again and have tables added/removed, etc., whilst keeping theme, existing layout, etc.
  • Pretty diagrams (modern look and feel, anti-aliasing, choice of fonts, colors, gradients, drop shadows, theming, etc);
  • Friendly learning curve (I can handle many complex features as long as it’s easy to get the basics done quickly);
MUST HAVE:
  • Installer doesn’t secretly install toolbars in my browser or change my home page!

My Research

Of course this question has been asked before, even on sister sites (with some success even though they’re off topic there). Here’s a few:

And slightly similar, from DBA.SE:

Furthermore, Wikipedia has a table comparing several software packages for “reverse engineering” feature:

Tried

These are the things I’ve tried:

  • Visio 2003 (this is what I used to use). Not free, outdated, but otherwise it fit the bill for most points. Not an option anymore though.
  • Visio 2010 and 2013 (this is what I have currently). Reverse engineering is no longer supported. I’ve considered writing a plugin or using VBA to hack things together though. Not sure how tough that would be. It’s not free (except I guess that I personally have it already), but fits the bill on a lot of other points.
  • Graphviz. Tried this quite some time ago, from what I remember it had a bit steep learning curve and I wasn’t quite able to manually determine final layout.
  • SSMS diagrams. This works for small cases, but can be cumbersome for larger cases.
  • Visual Studio 2012 ADO.NET Entity Data Model. Manual layouting of connectors is limited, theming does not exist at all, it is really meant for a goal other than documentation and it shows (e.g. default you get “Navigation Properties”, etc), and it doesn’t work well for 100+ tables. This may be useful for smaller scenario’s but doesn’t fit the bill for me.

And I think I may have tried a few suggestions from the SO threads, though I can’t really remember specifics.

So I asked: what would others recommend?

…Radio silence…

So I guess I’ll have to answer my own question then. My suggestion:

Use plain Visio for this.

Here’s why.

Explanation of my approach

There is no tool that conveniently satisfies all mentioned requirements, not even one that fulfills the main requirements only. Visual Studio 2012 comes close, but lacks theming support and didn’t work very well for a large number of tables.

The following process worked for me:

  1. Create a blank Visio Flowchart template. I’ve also tried the UML template and shapes but this has infuriating behavior for connector routing. Not recommended! Choose a design you like. Set the page and print sizes to something you want (I chose landscape A3 format and the “Linear” theme).
  2. Open a query in SSMS on the side to get the details you’ll need later: tables, column names, and IS NULL info.

  1. Start adding the tables one by one. Choose manually which tables and columns you which to include.

This is an example of what the end result looks like:

Database Diagram Example (Zoomed In)

Or a zoomed out screenshot of the entire model (because it was manual labor I chose to include significantly less than all 100+ tables):

Database Diagram Example (Zoomed Out)

The procedure I used to make the “tables” look as they do:

  • Headers are bold and a larger font size.
  • Separators aren’t lines but just a series of bold “—” dashes after eachother.
  • Four “sections”: TableName, Primary Key, Foreign Keys, plain columns.
  • Bold for NOT NULL columns.
  • White for TableName and separators, black for column names.

In addition, I chose for foreign keys to:

  • Never allow crossing connectors, and instead creating small “reference” blocks that refer to a table that’s somewhere on the other side of the model.
  • Not include all tables (i.e. there are more foreign keys than connectors).

Hopefully the above will be of use to others too.

How it holds up to the requirements

As a bottom line, I do recommend this approach, mainly because you end up with a visually appealing model and a high degree of control. This approach doesn’t fit the entire bill though. Here’s how it holds up to the requirements:

Data connectivity
  • SQL Server 2012 databases
  • Reverse engineer tables with all columns and Primary/Foreign Key indications
  • Reverse engineer foreign keys as connectors between tables

± This is all possible but not automated. So “more or less”.

Control and visuals
  • Manual layouting, basic theming, able to choose which items to show
  • Able to handle complex models, e.g. 100+ tables

✓ Yes, this all works very well with plain Visio.

Moneys
  • At least a free trial available

✓ AFAIK there’s a free Office trial available. I haven’t looked into this though as I have Visio.

Other
  • Exporting to various formats
  • WYSIWYG;

✓ Yes, this is available in Visio.

Bonus Points

It has about half of the bonus points:

  • runs on Windows 8
  • SQL 2008 also supported
  • pretty
  • friendly learning curve

Conclusion

It’s a shame I didn’t find any tool that met my requirements. I actually feel like failing in this regard: surely I haven’t looked hard enough?

It’s also a shame that I wasn’t able to muster the perseverance to automate this. I’m a developer for fuck’s sake!? (Or am I?)

It’s without shame though that I did end up with a rather pretty, useful diagram, with exactly the layout and content I wanted. I guess “Good is the enemy of Perfect” held up here.

Episode 6: DBA Stack Exchange Challenge

Database Administrators Stack Exchange
DBA.SE

Here’s a short episode in the Stack Exchange Challenge, for Database Administrators. It’s a weird one in the eco-system in my opinion. When you first hear about it this would seem like a great and vast topic to ask and answer questions about, but in reality it turns out this one’s not all that active. But lest I start with a conclusion, let’s first dig up the facts!

Current Statistics

Here are the usual stats for this particular sub-domain:

Fact DBA.SE
Questions 8,660
Questions with no upvoted answers 767 (8.86%)
FAQ (questions with most links) 1,619
Top 3 questions 130 votes, 74 votes, 64 votes
Questions active last hour 9
All-time rep for top 3 users 38.0k, 28.9k, 21.7k
Meta questions 245

My Questions

So far I’ve asked three questions on DBA.SE, and I’m moderately pleased with all of them:

That last one didn’t draw much response at first, so I took my chance to explore the bounty system some more. This worked perfectly, because someone was triggered by this to give me the answer I was looking for.

Related to my own questions there’s one more important thing to mention. This is the thing that triggered me to do an episode on the DBA site: the @StackDBAs Twitter engine tweeted a great question by a good friend of mine, on using the “sp_” prefix for stored procedures. It’s great to see something from a friend pop up in your Twitter feed via an unexpected angle!

My Answers

Yeah, well, ermm… still working on that. No answers so far. Weird thing is: most of my Stack Overflow reputation comes from answering DBA-related questions. But the questions on DBA are not very voluminous, and often also out of my league. Oh well.

Interesting Questions

Here are a small few gems with a lot of votes:

Community Wiki

Hmm, there aren’t any real great questions in this category that stand out from the bulk.

Conclusion

The main problem with this site is that people will probably prefer to ask their DBA questions on the subdomains that are more likely to generate a lot of views and answers: Stack Overflow and ServerFault. So let me summarize the pro’s and cons of this particular Stack Exchange:

Great, friendly community
Usually fast answers
Not very active
Askers seem to opt for asking on Stack Overflow or ServerFault instead

Bottom line: great community, not so great content (relative to other Stack Exchanges). Alas.