Tuesday 30 August 2011

The Bench-marking journey and results

We have finished bench-marking and it has been an interesting journey into the world of Amazon servers and  'cloud computing'. I will give the highlights (and low lights) and the final results. You are more than welcome to contact me if you want more details as to our findings and the procedure followed. Please note that all results and recommendations are subjective. We ran Bank Recon on one server (Application, service and database) and we were only running Bank Recon, no other applications. Amazon servers are also virtual so the performance will most likely differ to a real hardware setup with similar specs.

We also only tested the recon rules and cashbook imports. Testing of the manual screen as well as load testing may be conducted at a later stage.

We tested two different database setups. Both databases are real world databases with actual cashbook and bank transactions in them. The recon rules are in use at clients and were not changed in anyway for the testing.
1. 170 GB database where the Cashbook Transaction table has been partitioned and the Recon ID is the clustered index. Ran the rules on one domain for all transactions matched on one day.
2. 7 GB database with the standard Bank Recon schema. Ran the rules for all the domains for all transactions in the database.

The m1.Large instance has 7.5 GB memory and 2 virtual cores with 2EC2 units each. The underlying processor is an Intel Xeon E5430 (according to the Windows Server 2008 System information)
The m2.xLarge instance has 17.1 GB memory and 2 virtual cores with 3.25 EC2 units each. The underlying processor is an Intel Xeon X5550 (according to the Windows Server 2008 System information)
1 EC2 unit is approximately equal to the CPU capacity of a 1-1.2 GHz 2007 Opteron or 2007 Xeon Processor. More information on the instances is available here.
All instances are running Windows Server 2008 R2 64 bit.

And the results:

Cashbook Imports:
Only imported data into the 170 GB database, using the SQL BCP process. There were no code changes made.

Import File sizeNo of Txns Importedm1.Large Time (hh:mm:ss)m2.xLarge Time (hh:mm:ss)% Improvement
777,039 KB1 033 39500:53:3400:38:1829%
779,603 KB1 003 10400:52:0200:39:2225%

Recon Rules

m1.Large (old vs New code)m2.xLarge (new code)m2.xLarge (new code)
best time
DatabaseDomainRecon Count for all rules (approx)% Improvement code change% Improvement: increased hardware as compared to m1.Large new codeTime taken
170 GB1 24 60085%20%00:16:28
7 GB137 00014%44%01:37:44
2315 60069%47%02:46:22
335 90035%26%02:59:39
412 68074%32%00:03:15
What really jumps out from the above results is the wide swing in recon performance improvement. All domains tested show an improvement but it varies between 14% and 85% when looking at the code changes and between 20% and 47% when looking at the hardware scaling. 

What this seems to indicates, is the effect that rule design has on the performance. We didn't change the matching logic, only the reconciling logic. The matching code has to work harder to find matches when the data isn't fully filtered or the sorting isn't consistent, and the reconciling portion is then a smaller percentage of the overall process. We may make changes to the matching logic in the future but you can get further rule improvements now by looking at your rules and tweaking them where necessary. But rule design is a subject for another blog post. 

Why these instances?
We tested on the m1.Large, m1.xLarge, m2.xLarge and m2.2xLarge instances. m1.Large is the smallest instance which made sense in our case - it allowed us to scale the 'hardware' and provided enough power to test on. We then upgraded to the m1.xLarge instance. On paper this is double the m1.Large instance and very similar to the m2.xLarge instance. But to our surprise we didn't get the expected improvements. Research led us to the conclusion that not all Amazon instances are created equal. It is likely you are sharing the underlying hardware when using a m1 instance. So you can find yourself in a 'noisy neighbourhood' and suffering from 'CPU steal cycle'. The m2 instances do not seem to experience the same issues to as large a degree.

We tested the scalability of Bank Recon by running the rules on the m2.2xLarge (double the m2.xLarge instance) instance but did not see noticeable improvements. Bank Recon does not make use of parallelism so the additional cores were of no benefit. There is also a limit to the amount of memory Bank Recon can currently utilise.

Summary and Guidelines
The results indicate that the optimum configuration for Bank Recon is 18 GB RAM and a 2-3 GHz dual core processor. This is required for both the Application and Database server. Your hard disk size we be determined by the size of your database and application. Please note that this guideline is based on running Bank Recon on one server with NO other applications. If you run multiple applications or databases on the same server you will benefit from more cores and memory which allow for parallel execution of processes and multiple threading.

A big thank you to the development team for the code improvements and all the time spent in bench-marking and analysing the results. We are busy with some small tweaks to the Manual Recon screen to make it more user friendly and then we will release everything in one bumper release

Tuesday 16 August 2011

Benchmarking Performance on Amazon Servers

Bank Recon benchmarking is underway with the goal of providing measurable information as to what the optimal setup is, as well as to quantify the improvements gained from the recon rule changes that have been made. Remember, we are testing in a very specific environment so we can only give guidelines to maximise performance. Factors such as the number of applications running on your servers and the number of users will affect performance in your environment in ways we haven't tested. 

We decided to use the Amazon EC2 servers for the benchmarking exercise. The benefit of the Amazon setup is that hardware can be easily scaled to measure the effect of upgrading the processors, memory and disk space without the physical cost of buying and housing the servers . Amazon provides a number of 'instances' which have differing CPU's, cores and memory. Amazon uses the EC2 Compute Unit (ECU) to measure CPU output. One ECU unit is roughly equivalent to a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor. More information about the Amazon EC2 servers can be found here.

Thus far we are testing on the following configurations

Instances Used
EC2 Large: 4 ECUs, 2 cores, 7.5 GB memory
EC2 Extra Large: 8 ECU's, 4 cores, 15 GB Memory

Server Setup
Windows Server 2008 R2 64 bit.
SQL Server 2005. 
Database and application running on the same server.

160 GB with the Cashbook Transaction Table partitioned on Recon Id
6 GB with a standard schema

I will detail the testing done and the results in a later blog post when it is all completed, which should be in the next week or so. We have tested the recon performance improvement of the new code vs the old code as well as benchmarked the Import process on the Large instance. We are now testing how scaling the hardware affects the performance by running the recon rules and imports on the Extra Large Instance using the new code only. We may test on other instances as well to get a better idea of scalability. Every time we release a new version going forward we will benchmark it to ensure the performance is as good or better. 

And as soon as the testing is complete we will release the new build - watch this space!