Background

Computation heavy algorithms cost more money before optimized. You can feed your server systems enough RAM and CPUs to handle the CPU cycles and memory they require. This is not one of the most recurring investments and generally good for the application it is serving overall. If you are developing Smart Client/Desktop applications, your software can take as much as it can as long as it is running. However, in two cases careful software design can save (saving is earning) obvious money.

Windows Azure – if your application demands more resources than it should (may be after refactoring and performance tuning), the more users you will start to captivate, the more the cloud resources will be used. Therefore, you will have to pay extra bills.

Windows Phone 7 – Even though Microsoft set quite high bar on system specification as minimum, you will start to see your apps suffer if you take the liberty to use the system resources without considering efficient design. Given that there is no relational database on-device, you have to do a lot of XML parsing and string processing as part of local data support such as user settings. The more your apps suffer in performance, the more you will eventually receive negative reviews and possibly the more you will be losing customers.

Let me tell you beforehand that this article is based on pure fundamental data structures, yet I have tried to give a Windows Phone and Windows Azure flavor lest developers should miss it. Everybody talks about new features, but this topic was needed to be brought to developers' attention for our very nature of getting hands on the code first.

The Problem

String algorithms are usually very costly, yet easy to understand. I’ll pick up a random string related problem here, and walkthrough step by step how to optimize it. Assume you are building a Calculator app (I know this is no exciting sample case), however, unlike real calculator, user will enter the full expression as string at once, and you will have to parse and say that the brace usage was correct or no. For example:

{1+[2+(3+4)]} Correct
{(3+5)-[2*1]} Correct
{3+5}-[2+1] Correct
(2+[3+5)+1] Incorrect

The Ad-hoc Solutions

It can be solved in couple of ways. One of them could be running an index on the string from one side and find the matching braces from the other side running another index.

Ad-hoc1.png

Another way could be running two indexes from the same side. Once a brace has encountered, the other index moves ahead and try to find matching brace immediately after.

Ad-hoc2.png

In both the designs, we are running a loop through the whole string for each brace we encounter. This means in worst case, these algorithms can run as high as O(n^2) and for large enough strings it will run pretty bad. For example, for a 32KB string, it may take 1073741824 condition checks.

Stack-based Solution

In order to improve from the prior designs, we need to reduce one of the loops. We cannot remove the first loop which basically starts from one direction and finds braces through the whole string. But we will get rid of the other loop using a clever data structure.

Stack is one of the most popular data structures, which is suitable in many design problems, such as this. It is a data storage which ensures First In Last Out (FILO) principle, meaning that items can be retrieved from the storage exactly in reverse order of entry. So, we will loop through the string, start extracting braces from one direction and as long as no matching brace is found, we will keep pushing them to the Stack. However, if we find braces facing opposite direction, we will pop from the Stack, compare and proceed.

Here is the algorithm:

BraceMatch.png

The runtime of this algorithm is O(n), which means, number of comparisons it will make in worst case is linear to the number of characters the string has, which is incredible performance improvement over ad-hoc solutions that we have considered with complexity O(n^2). So, next time you build a Windows Phone app, think about efficient string handling and general performance improvement can be achieved through better design for lack of relational database and other hardware specific limitation.

Conclusion

Do not just jump right into the code. Design first, and better, code later. It is even more necessary than useful in handheld devices and the cloud where CPU cycles and memory aren't cheap.

推荐.NET配套的通用数据层ORM框架:CYQ.Data 通用数据层框架
新浪微博粉丝精灵,刷粉丝、刷评论、刷转发、企业商家微博营销必备工具"