ss » s Home ss » s Online TV ss » . Technology ss » s Blog

Monday, 11 May 2009

All about LINQ



Language Integrated Query (LINQ, pronounced "link") is a Microsoft .NET Framework component that adds native data querying capabilities to .NET languages.

Microsoft LINQ defines a set of proprietary query operators that can be used to query, project and filter data in arrays, enumerable classes, XML (XLINQ), relational database, and third party data sources. While it allows any data source to be queried, it requires that the data be encapsulated as objects. So, if the data source does not natively store data as objects, the data must be mapped to the object domain. Queries written using the query operators are executed either by the LINQ query processing engine or, via an extension mechanism, handed over to LINQ providers which either implement a separate query processing engine or translate to a different format to be executed on a separate data store (such as on a database server as SQL queries (DLINQ)). The results of a query are returned as a collection of in-memory objects that can be enumerated using a standard iterator function such as C#'s foreach.

Many of the concepts that LINQ has introduced were originally tested in Microsoft's Cω research project. LINQ was released as a part of .NET Framework 3.5 on November 19, 2007.

Introducing Linq
Linq is short for Language Integrated Query. If you are used to using SQL to query databases, you are going to have something of a head start with Linq, since they have many ideas in common. Before we dig into Linq itself, let's step back and look at what makes SQL different from C#.

Imagine we have a list of orders. For this example, we will imagine they are stored in memory, but they could be in a file on disk too. We want to get a list of the costs of all orders that were placed by the customer identified by the number 84. If we set about implementing this in C# before version 3 and a range of other popular languages, we would probably write something like (assuming C# syntax for familiarity):
List Found = new List();
foreach (Order o in Orders)
if (o.CustomerID == 84)
Found.Add(o.Cost);
Here we are describing how to achieve the result we want by breaking the task into a series of instructions. This approach, which is very familiar to us, is called imperative programming. It relies on us to pick a good algorithm and not make any mistakes in the implementation of it; for more complex tasks, the algorithm is more complex and our chances of implementing it correctly decrease.

If we had the orders stored in a table in a database and we used SQL to query it, we would write something like:
SELECT Cost FROM Orders WHERE CustomerID = 84
Here we have not specified an algorithm, or how to get the data. We have just declared what we want and left the computer to work out how to do it. This is known as declarative or logic programming.

Linq brings declarative programming features into imperative languages. It is not language specific, and has been implemented in the Orcas version of VB.Net amongst other languages. In this series we are focusing on C# 3.0, but the principles will carry over to other languages.

Understanding A Simple Linq Query

Let's jump straight into a code example. First, we'll create an Order class, then make a few instances of it in a List as our test data. With that done, we'll use Linq to get the costs of all orders for customer 84.
class Order
{
private int _OrderID;
private int _CustomerID;
private double _Cost;
public int OrderID
{
get { return _OrderID; }
set { _OrderID = value; }
}
public int CustomerID
{
get { return _CustomerID; }
set { _CustomerID = value; }
}
public double Cost
{
get { return _Cost; }
set { _Cost = value; }
}
}
class Program
{
static void Main(string[] args)
{
// Set up some test orders.
var Orders = new List {
new Order {
OrderID = 1,
CustomerID = 84,
Cost = 159.12
},
new Order {
OrderID = 2,
CustomerID = 7,
Cost = 18.50
},
new Order {
OrderID = 3,
CustomerID = 84,
Cost = 2.89
}
};
// Linq query.
var Found = from o in Orders
where o.CustomerID == 84
select o.Cost;

// Display results.
foreach (var Result in Found)
Console.WriteLine("Cost: " + Result.ToString());
}
}
The output of running this program is:
Cost: 159.12
Cost: 2.89
Let's walk through the Main method. First, we use collection and object initializers to create a list of Order objects that we can run our query over. Next comes the query - the new bit. We declare the variable Found and request that its type be inferred for us by using the "var" keyword.

We then run across a new C# 3.0 keyword: "from".
from o in Orders
This is the keyword that always starts a query. You can read it a little bit like a "foreach": it takes a collection of some kind after the "in" keyword and makes what is to the left of the "in" keyword refer to a single element of the collection. Unlike "foreach", we do not have to write a type.

Following this is another new keyword: "where".
where o.CustomerID == 84
This introduces a filter, allowing us to pick only some of the objects from the Orders collection. The "from" made the identifier "o" refer to a single item from the collection, and we write the condition in terms of this. If you type this query into the IDE yourself, you will notice that it has worked out that "o" is an Order and intellisense works as expected.

The final new keyword is "select".
select o.Cost
This comes at the end of the query and is a little like a "return" statement: it states what we want to appear in the collection holding the results of the query. As well as primitive types (such as int), you can instantiate any object you like here. In this case, we will end up with Found being a List, though.

You may be thinking at this point, "hey, this looks like SQL but kind of backwards and twisted about a bit". That is a pretty good summary. I suspect many who have written a lot of SQL will find the "select comes last" a little grating at first; the other important thing to remember is that all of the conditions are to be expressed in C# syntax, not SQL syntax. That means "==" for equality testing, rather than "=" in SQL. Thankfully, in most cases that mistake will lead to a compile time error anyway

No comments:

Post a Comment