Tuesday, November 13, 2012

MassTransit, first look

In my last project we have used self made service bus for about 3.5 years. As for me, using service bus was the one of the best things that happend to the project. After some experience in the field I think I do understand what features I need. Also I regard that developing own service bus was a bad move. It has some gaps and we never had a time and resources to close them. Not sure if today's known ESB solutions was had all needed features 3.5 years ago. As I remember them was in a very zygote state, but any way... In my future projects I prefer to move to an existing and proven free open source that can fit my vision of how ESB should work. 
I am checking MassTransit as the option and it looks mostly matching. I have to say that it's documentation lacks. Also some intresting to me features are in a very early state. But in general, MassTransit seems very promising.

Tuesday, November 6, 2012

NHibernate, bug in deep load of bags


Some days ago I write about performance difference between bag and set mapping of collections. But not all is so sunny with the bags. Let's look on the following example.
Model

We have blogs that contains posts, and posts that contain comments.  And we want to map Blog.Posts and Post.Comments associations as <bag>.
Now let's create new blog. Then add four posts. Then add one comment to second post, two comments to third post and three comments to forth post.  In total we added 7 comments to 4 posts.
But when we try to fetch whole graph of blog with its posts and with their comments in one single query we get strange results.
session.Query<Blog>()
   .Where(b => b.Id == expected.Id)
   .FetchMany(b => b.Posts)
   .ThenFetch(p => p.Comments)
   .ToList().First();


Our blog have 7 posts instead of 4. SQL query returned 7 joined rows from the DB (as expected) but it seems that NHibernate did not transformed properly received results.
If we map Blog.Posts as <set>  it works fine. So that issue appears only with bags.
I have opened a bug on NHibernate's JIRA issue tracker. There are tests that I have provided.

Sunday, November 4, 2012

NHibernate, Bidirectional vs Unidirectional associations

I heard from some peoples that bidirectional association is evil. I don’t know where them sneezed it from. I am not going to take a side and discuss it here. I am going to tell you that your choice have influence on NHibernate behavior and your code’s performance.
Lets start with bidirectional association.
Model
public class Order{
    public virtual Guid Id { get; set; }
    public virtual int Version { get; set; }
    public virtual DateTime OrderTime { get; set; }
    public virtual ICollection<OrderLine> OrderLines { get; set; }
}

public class OrderLine{
    public virtual Guid Id { get; set; }
    public virtual Order Order { get; set; }
    public virtual string ProductName { get; set; }
    public virtual decimal Price { get; set; }
    public virtual int Quantity { get; set; }
}

Order references a collection of order lines and each order line references it’s order. And I map collection as <bag> with inverse=true option.
Lets try to add a new order line to an existing order.

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    var order = session.Get<Order>(id);
    order.OrderLines.Add(new OrderLine
    {
        Order = order,
        ProductName = "Product 1",
        Price = 9.99M,
        Quantity = 15
    });

    tx.Commit();
}

Take a look on what is going in the database

SELECT 
    order0_.Id as Id0_0_, 
    order0_.Version as Version0_0_, 
    order0_.OrderTime as OrderTime0_0_ 
FROM Orders order0_ 
WHERE order0_.Id=@p0

INSERT INTO OrderLines (OrderId, ProductName, Price, Quantity, Id) 
VALUES (@p0, @p1, @p2, @p3, @p4)

UPDATE Orders SET Version = @p0, OrderTime = @p1 WHERE Id = @p2 AND Version = @p3
Works as expected. The last query is for aggregate root versioning.
Now lets see what happens when we use model with unidirectional association. Little changes to code and mapping. And that’s what we got in the database.

SELECT 
    order2x0_.Id as Id0_0_, 
    order2x0_.Version as Version0_0_, 
    order2x0_.OrderTime as OrderTime0_0_ 
FROM Orders order2x0_ 
WHERE order2x0_.Id=@p0

SELECT 
    orderlines0_.OrderId as OrderId1_, 
    orderlines0_.Id as Id1_, 
    orderlines0_.Id as Id1_0_, 
    orderlines0_.ProductName as ProductN2_1_0_, 
    orderlines0_.Price as Price1_0_, 
    orderlines0_.Quantity as Quantity1_0_ 
FROM OrderLines orderlines0_ 
WHERE orderlines0_.OrderId=@p0

INSERT INTO OrderLines (ProductName, Price, Quantity, Id) 
VALUES (@p0, @p1, @p2, @p3)

UPDATE Orders SET Version = @p0, OrderTime = @p1 WHERE Id = @p2 AND Version = @p3

UPDATE OrderLines SET OrderId = @p0 WHERE Id = @p1

Dah!? What’s going on here?
The second query ignoring that collection mapped as a bag and goes to bring us already existing order lines. Seems like a bug.
Third query forgot to set reference from the new order line to the order.
And fifth query fixes that mistake. And that update will be executed for each order line.
It’s annoying. So, finally, we have at least two unnecessary queries when we decide to use unidirectional association.
I don’t know why and what was going in the heads of NH programmers when them coded it (no offence), but that is what we have and we will use it for the win! Just know your tools and make right decisions.

Friday, November 2, 2012

NHibernate: the little-known difference between <set> and <bag>

Set is a collection of unique items. Bag is none unique collection of items. This restriction influences on set's and bag's behavior from a performance point of view.
Take a look on pretty simple model:
 SupplierModel
We have a supplier with thousands of orders. Let's try to add new order to the supplier and see what happens in the database.

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
var supplier = session.Get<Supplier>(id);
supplier.Orders.Add(new SupplierOrder
{
Supplier = supplier,
OrderTime = DateTime.Now
});

tx.Commit();
}
If we map the orders as a <set> then NHibernate produces the next sql queries:

SELECT
supplier0_.Id as Id6_0_,
supplier0_.Name as Name6_0_
FROM Suppliers supplier0_
WHERE supplier0_.Id=@p0

SELECT
orders0_.SupplierId as SupplierId1_,
orders0_.Id as Id1_,
orders0_.Id as Id7_0_,
orders0_.SupplierId as SupplierId7_0_,
orders0_.OrderTime as OrderTime7_0_
FROM SupplierOrders orders0_
WHERE orders0_.SupplierId=@p0

INSERT INTO SupplierOrders (SupplierId, OrderTime, Id) VALUES (@p0, @p1, @p2)
As we see it queries for the supplier row, its orders and then inserts new order. What happens if supplier already has thousands of orders? We fetch them all!
Let's try to map orders as a <bag>

SELECT
supplier0_.Id as Id6_0_,
supplier0_.Name as Name6_0_
FROM Suppliers supplier0_
WHERE supplier0_.Id=@p0

INSERT INTO SupplierOrders (SupplierId, OrderTime, Id) VALUES (@p0, @p1, @p2)

This one seems better. Now we just fetched supplier and inserted new order.

Take it into account when you choose between set and bag.

Pay attention. It behaves different if you are not using bidirectional association between master and detail classes. Also it behaves different when you map detail class as a composite-element. But that's another story.

Thursday, November 1, 2012

NHibernate, Paging and eager loading

Q: How to fetch with NHibernate a page from a couple of master – detail tables?
Let’s say we have a well known and simplified model of orders – order lines.
Model
And we want to fetch a 3rd page (page size is 10) of the last orders received before specified day.
The intuitive solution is just query orders table with outer join on order lines table using standard paging technics.

var res = session.Query<Order>()
    .Where(o => o.OrderTime < beforeDate)
    .OrderByDescending(o => o.OrderTime)
    .FetchMany(e => e.OrderLines)
    .Skip(20)
    .Take(10)
    .ToList();

But when we look at results we see that we fetched less than 10 orders! What was happened here?
Take a look on generated by NH sql:

SELECT TOP (@p0) 
    Id4_0_, Id5_1_, Version4_0_, OrderTime4_0_, OrderId5_1_, ProductN3_5_1_, Price5_1_, Quantity5_1_, OrderId0__, Id0__ 
FROM (
    select 
        order0_.Id as Id4_0_, 
        orderlines1_.Id as Id5_1_, 
        order0_.Version as Version4_0_, 
        order0_.OrderTime as OrderTime4_0_, 
        orderlines1_.OrderId as OrderId5_1_, 
        orderlines1_.ProductName as ProductN3_5_1_, 
        orderlines1_.Price as Price5_1_, 
        orderlines1_.Quantity as Quantity5_1_, 
        orderlines1_.OrderId as OrderId0__, 
        orderlines1_.Id as Id0__, 
        ROW_NUMBER() OVER(ORDER BY order0_.OrderTime DESC) as __hibernate_sort_row 
    from Orders order0_ 
        left outer join OrderLines orderlines1_ 
            on order0_.Id=orderlines1_.OrderId 
    where order0_.OrderTime<@p1
) as query
WHERE query.__hibernate_sort_row > @p2
ORDER BY query.__hibernate_sort_row

It firstly joins orders with their order lines so we have multiple lines per same order and only than takes a specified page. So we not only fetching less than 10 orders, but we also not actually skipping 20 orders. We are skipping and taking joined rows.
As for me it’s a bug and not intended behavior.
I have expected that it firstly will take a needed page of orders and only than will join result with order lines.
So how anyway we can fetch a page of orders with their order lines in a most efficient way?
There is the best solution that I have found:

var subQuery = session.Query<Order>()
    .Where(o => o.OrderTime < beforeDate)
    .OrderByDescending(e => e.OrderTime)
    .Select(o => o.Id)
    .Skip(20)
    .Take(10);

var res = session.Query<Order>()
    .FetchMany(e => e.OrderLines)
    .Where(e => subQuery.Contains(e.Id))
    .OrderByDescending(o => o.OrderTime)
    .ToList();

I use sub query to find a needed page of orders and than fetch orders with its order lines by order’s ids present in the sub query.

select 
    order0_.Id as Id4_0_, 
    orderlines1_.Id as Id5_1_, 
    order0_.Version as Version4_0_, 
    order0_.OrderTime as OrderTime4_0_, 
    orderlines1_.OrderId as OrderId5_1_, 
    orderlines1_.ProductName as ProductN3_5_1_, 
    orderlines1_.Price as Price5_1_, 
    orderlines1_.Quantity as Quantity5_1_, 
    orderlines1_.OrderId as OrderId0__, 
    orderlines1_.Id as Id0__ 
from Orders order0_ 
    left outer join OrderLines orderlines1_ 
        on order0_.Id=orderlines1_.OrderId 
where order0_.Id in (
    SELECT TOP (@p0) Id 
    FROM (
        select order2_.Id, 
            ROW_NUMBER() OVER(ORDER BY order2_.OrderTime DESC) as __hibernate_sort_row 
        from Orders order2_ 
        where order2_.OrderTime<@p1
    ) as query
    WHERE query.__hibernate_sort_row > @p2
    ORDER BY query.__hibernate_sort_row
)
order by order0_.OrderTime desc


Pay attention. It does not work with 3.2 version of NH. In 3.2 version NH generates wrong SQL for Contains method. In our case it totally ignored sub select. The issue was fixed in the latest released version of NH: 3.3.1.4000.