Tuesday, 30 August 2011

The benchmarking journey and results

We have finished benchmarking, and it has been an interesting journey into the world of Amazon servers and 'cloud computing'. I will give the highlights (and lowlights) and the final results. You are more than welcome to contact me if you want more details on our findings and the procedure we followed. Please note that all results and recommendations are subjective. We ran Bank Recon on a single server (application, service and database), and Bank Recon was the only application running on it. Amazon servers are also virtual, so performance will most likely differ from that of a physical hardware setup with similar specs.

We tested only the recon rules and cashbook imports. Testing of the manual screen, as well as load testing, may be conducted at a later stage.

We tested two different database setups. Both are real-world databases with actual cashbook and bank transactions in them. The recon rules are in use at clients and were not changed in any way for the testing.
1. A 170 GB database where the Cashbook Transaction table has been partitioned and the Recon ID is the clustered index. We ran the rules on one domain for all transactions matched on one day.
2. A 7 GB database with the standard Bank Recon schema. We ran the rules for all the domains for all transactions in the database.

The m1.Large instance has 7.5 GB memory and 2 virtual cores with 2 EC2 units each. The underlying processor is an Intel Xeon E5430 (according to the Windows Server 2008 System Information).
The m2.xLarge instance has 17.1 GB memory and 2 virtual cores with 3.25 EC2 units each. The underlying processor is an Intel Xeon X5550 (according to the Windows Server 2008 System Information).
1 EC2 unit is approximately equal to the CPU capacity of a 1-1.2 GHz 2007 Opteron or 2007 Xeon processor. More information on the instances is available from Amazon.
All instances were running Windows Server 2008 R2 64-bit.
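As a rough sketch of what these specs mean on paper, the nominal compute capacity of each instance can be worked out as virtual cores multiplied by EC2 units per core. The dictionary layout and function names below are my own, purely for illustration; only the figures come from the specs above.

```python
# Instance figures as quoted above; the structure and names are illustrative only.
INSTANCES = {
    "m1.Large": {"memory_gb": 7.5, "cores": 2, "ecu_per_core": 2.0},
    "m2.xLarge": {"memory_gb": 17.1, "cores": 2, "ecu_per_core": 3.25},
}


def total_ecu(spec):
    """Nominal compute capacity: virtual cores times EC2 units per core."""
    return spec["cores"] * spec["ecu_per_core"]


def nominal_ratio(faster, slower):
    """How much more nominal capacity one instance has than another."""
    return total_ecu(INSTANCES[faster]) / total_ecu(INSTANCES[slower])


for name, spec in INSTANCES.items():
    print(f"{name}: {total_ecu(spec):.2f} EC2 units, {spec['memory_gb']} GB memory")

print(f"m2.xLarge vs m1.Large: {nominal_ratio('m2.xLarge', 'm1.Large'):.3f}x")
```

On these numbers the m2.xLarge has about 1.6 times the nominal CPU capacity of the m1.Large with the same number of cores, so any gain should come from faster cores and extra memory rather than from more parallelism. Nominal capacity is only a guide; measured run times are what count.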

And the results:

Cashbook Imports:
We imported data into the 170 GB database only, using the SQL BCP process. No code changes were made.

Import file size | No. of txns imported | m1.Large time (hh:mm:ss) | m2.xLarge time (hh:mm:ss) | % improvement
777,039 KB | 1 033 395 | 00:53:34 | 00:38:18 | 29%
779,603 KB | 1 003 104 | 00:52:02 | 00:39:22 | 25%
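For anyone who wants to repeat the comparison on their own timings, here is a minimal sketch of the arithmetic. It assumes the improvement is simply the time saved divided by the m1.Large time; that formula is my assumption, and results can differ from the published column by a percentage point because of rounding. The function names are illustrative, and the transactions-per-second figure is an extra derived number, not something we measured separately.

```python
def to_seconds(hms):
    """Convert an hh:mm:ss string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pct_improvement(baseline_hms, improved_hms):
    """Time saved as a percentage of the baseline run (assumed formula)."""
    baseline = to_seconds(baseline_hms)
    improved = to_seconds(improved_hms)
    return (baseline - improved) / baseline * 100


def txns_per_second(txn_count, hms):
    """Average import throughput over the whole run."""
    return txn_count / to_seconds(hms)


# (transactions imported, m1.Large time, m2.xLarge time) from the import results
IMPORTS = [
    (1_033_395, "00:53:34", "00:38:18"),
    (1_003_104, "00:52:02", "00:39:22"),
]

for txns, m1_time, m2_time in IMPORTS:
    print(
        f"{txns} txns: {pct_improvement(m1_time, m2_time):.1f}% faster, "
        f"{txns_per_second(txns, m1_time):.0f} -> "
        f"{txns_per_second(txns, m2_time):.0f} txns/s"
    )
```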


Recon Rules

Database | Domain | Recon count for all rules (approx) | m1.Large, old vs new code: % improvement from code change | m2.xLarge (new code): % improvement from increased hardware, compared to m1.Large new code | m2.xLarge (new code): best time taken (hh:mm:ss)
170 GB | 1 | 24 600 | 85% | 20% | 00:16:28
7 GB | 1 | 37 000 | 14% | 44% | 01:37:44
7 GB | 2 | 315 600 | 69% | 47% | 02:46:22
7 GB | 3 | 35 900 | 35% | 26% | 02:59:39
7 GB | 4 | 12 680 | 74% | 32% | 00:03:15

What really jumps out from the above results is the wide swing in recon performance improvement. All domains tested show an improvement, but it varies between 14% and 85% for the code changes and between 20% and 47% for the hardware scaling.

What this seems to indicate is the effect that rule design has on performance. We didn't change the matching logic, only the reconciling logic. The matching code has to work harder to find matches when the data isn't fully filtered or the sorting isn't consistent, and the reconciling portion is then a smaller percentage of the overall process. We may make changes to the matching logic in the future, but you can get further improvements now by reviewing your rules and tweaking them where necessary. Rule design, however, is a subject for another blog post.
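The percentages in the recon rules results can also be turned back into approximate run times. The sketch below assumes the hardware improvement is measured against the m1.Large new-code time, that the code improvement is measured against the m1.Large old-code time, and that the best time shown belongs to the same m2.xLarge run. Because the percentages are rounded, the output is an estimate only, not a measured result. The function names are illustrative.

```python
def to_seconds(hms):
    """Convert an hh:mm:ss string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(seconds):
    """Format a number of seconds as hh:mm:ss, rounded to the nearest second."""
    hours, remainder = divmod(round(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def estimate_earlier_times(best_hms, code_pct, hardware_pct):
    """Estimate m1.Large old-code and new-code times from the m2.xLarge time.

    Assumes each percentage is (time saved) / (earlier time) * 100.
    """
    m2_new = to_seconds(best_hms)
    m1_new = m2_new / (1 - hardware_pct / 100)
    m1_old = m1_new / (1 - code_pct / 100)
    return m1_old, m1_new


# (database, domain, % code improvement, % hardware improvement, best time)
RESULTS = [
    ("170 GB", 1, 85, 20, "00:16:28"),
    ("7 GB", 1, 14, 44, "01:37:44"),
    ("7 GB", 2, 69, 47, "02:46:22"),
    ("7 GB", 3, 35, 26, "02:59:39"),
    ("7 GB", 4, 74, 32, "00:03:15"),
]

for database, domain, code_pct, hardware_pct, best in RESULTS:
    m1_old, m1_new = estimate_earlier_times(best, code_pct, hardware_pct)
    print(
        f"{database} domain {domain}: m1.Large old ~{to_hms(m1_old)}, "
        f"m1.Large new ~{to_hms(m1_new)}, m2.xLarge new {best}"
    )
```

Seen this way, a domain with a small code improvement but a large hardware improvement is one where the matching work dominates, which is where rule design is likely to matter most.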

Why these instances?
We tested on the m1.Large, m1.xLarge, m2.xLarge and m2.2xLarge instances. The m1.Large is the smallest instance that made sense in our case: it provided enough power to test on and left us room to scale the 'hardware' up. We then upgraded to the m1.xLarge instance. On paper this is double the m1.Large instance and very similar to the m2.xLarge instance, but to our surprise we didn't get the expected improvements. Research led us to the conclusion that not all Amazon instances are created equal. It is likely that you are sharing the underlying hardware when using an m1 instance, so you can find yourself in a 'noisy neighbourhood' and suffering from CPU steal, where other tenants take CPU cycles your instance was expecting. The m2 instances do not seem to suffer from this to the same degree.

We tested the scalability of Bank Recon by running the rules on the m2.2xLarge instance (double the m2.xLarge instance) but did not see noticeable improvements. Bank Recon does not make use of parallelism, so the additional cores were of no benefit. There is also a limit to the amount of memory Bank Recon can currently utilise.
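One way to picture why the extra cores made no difference is Amdahl's law, which I am bringing in here purely as an illustration; it was not part of our testing. It estimates the best possible speedup from the fraction of the work that can run in parallel. The sketch assumes that doubling the instance means going from 2 to 4 cores, and the parallel fractions are hypothetical values, not measurements of Bank Recon.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fractions, for illustration only.
for fraction in (0.0, 0.5, 0.9):
    on_two = amdahl_speedup(fraction, 2)
    on_four = amdahl_speedup(fraction, 4)
    print(
        f"parallel fraction {fraction:.1f}: "
        f"gain from 2 to 4 cores = {on_four / on_two:.2f}x"
    )
```

With a parallel fraction of zero, which is the case for a process that does not use parallelism, the gain from extra cores is exactly 1x: no improvement, whatever the core count.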

Summary and Guidelines
The results indicate that the optimum configuration for Bank Recon is 18 GB RAM and a 2-3 GHz dual-core processor. This applies to both the application and the database server. Your hard disk size will be determined by the size of your database and application. Please note that this guideline is based on running Bank Recon on one server with NO other applications. If you run multiple applications or databases on the same server, you will benefit from more cores and memory, which allow for parallel execution of processes and multithreading.

A big thank you to the development team for the code improvements and all the time spent benchmarking and analysing the results. We are busy with some small tweaks to the Manual Recon screen to make it more user-friendly, and then we will release everything in one bumper release.
