We also only tested the recon rules and cashbook imports. Testing of the manual screen as well as load testing may be conducted at a later stage.
We tested two different database setups. Both are real-world databases with actual cashbook and bank transactions in them. The recon rules are in use at clients and were not changed in any way for the testing.
1. 170 GB database where the Cashbook Transaction table has been partitioned and the Recon ID is the clustered index. Ran the rules on one domain for all transactions matched on one day.
2. 7 GB database with the standard Bank Recon schema. Ran the rules for all the domains for all transactions in the database.
The m1.Large instance has 7.5 GB memory and 2 virtual cores with 2 EC2 units each. The underlying processor is an Intel Xeon E5430 (according to the Windows Server 2008 System Information).
The m2.xLarge instance has 17.1 GB memory and 2 virtual cores with 3.25 EC2 units each. The underlying processor is an Intel Xeon X5550 (according to the Windows Server 2008 System Information).
1 EC2 unit is approximately equal to the CPU capacity of a 1-1.2 GHz 2007 Opteron or 2007 Xeon Processor. More information on the instances is available here.
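To put the two instances side by side, the quoted figures can be turned into a total compute-unit count per instance. This is only a rough sketch: the dictionary layout and function names are made up for illustration, and the numbers are the ones quoted above.

```python
# Instance figures as quoted in this post; the structure itself is illustrative.
INSTANCES = {
    "m1.Large": {"memory_gb": 7.5, "cores": 2, "ecu_per_core": 2.0},
    "m2.xLarge": {"memory_gb": 17.1, "cores": 2, "ecu_per_core": 3.25},
}


def total_ecu(spec):
    """Total EC2 compute units = virtual cores x units per core."""
    return spec["cores"] * spec["ecu_per_core"]


def ecu_ratio(name_a, name_b):
    """How many times more total compute units name_a has than name_b."""
    return total_ecu(INSTANCES[name_a]) / total_ecu(INSTANCES[name_b])


for name, spec in INSTANCES.items():
    print(f"{name}: {total_ecu(spec):.2f} EC2 units, {spec['memory_gb']} GB memory")

print(f"m2.xLarge vs m1.Large CPU ratio: {ecu_ratio('m2.xLarge', 'm1.Large'):.3f}")
```

On paper, then, the m2.xLarge has roughly 1.6 times the CPU capacity and a little over twice the memory of the m1.Large.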
All instances are running Windows Server 2008 R2 64 bit.
And the results:
Cashbook Imports:
We only imported data into the 170 GB database, using the SQL BCP process. No code changes were made.
Import File size | No of Txns Imported | m1.Large Time (hh:mm:ss) | m2.xLarge Time (hh:mm:ss) | % Improvement |
777,039 KB | 1 033 395 | 00:53:34 | 00:38:18 | 29% |
779,603 KB | 1 003 104 | 00:52:02 | 00:39:22 | 25% |
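For anyone wanting to repeat the arithmetic on their own timings, here is a small sketch. It assumes "% Improvement" means the time saved as a share of the m1.Large time; the function names are ours, not part of Bank Recon. Under that assumption the results land within a point of the published column, which may have been rounded differently.

```python
def to_seconds(hhmmss):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def improvement_pct(baseline, improved):
    """Time saved as a percentage of the baseline time (assumed definition)."""
    base = to_seconds(baseline)
    return 100.0 * (base - to_seconds(improved)) / base


def txns_per_second(txns, elapsed):
    """Average import throughput over the whole run."""
    return txns / to_seconds(elapsed)


# (transactions imported, m1.Large time, m2.xLarge time) from the table above
rows = [
    (1_033_395, "00:53:34", "00:38:18"),
    (1_003_104, "00:52:02", "00:39:22"),
]

for txns, m1_time, m2_time in rows:
    print(
        f"{txns} txns: {improvement_pct(m1_time, m2_time):.1f}% faster, "
        f"{txns_per_second(txns, m1_time):.0f} -> "
        f"{txns_per_second(txns, m2_time):.0f} txns/s"
    )
```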
Recon Rules:
Database | Domain | Recon Count for all rules (approx) | % Improvement: code change (m1.Large, old vs new code) | % Improvement: increased hardware (m2.xLarge new code vs m1.Large new code) | m2.xLarge (new code) best time (hh:mm:ss) |
170 GB | 1 | 24 600 | 85% | 20% | 00:16:28 |
7 GB | 1 | 37 000 | 14% | 44% | 01:37:44 |
7 GB | 2 | 315 600 | 69% | 47% | 02:46:22 |
7 GB | 3 | 35 900 | 35% | 26% | 02:59:39 |
7 GB | 4 | 12 680 | 74% | 32% | 00:03:15 |
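The two percentage columns compound rather than add, because the hardware improvement is measured against the new code on the m1.Large. A quick sketch of the combined effect, assuming each percentage is a reduction in run time relative to the step before it (the function name is ours):

```python
def combined_improvement(*improvements_pct):
    """Combine successive percentage reductions in run time into one figure."""
    remaining = 1.0  # fraction of the original run time still left
    for pct in improvements_pct:
        remaining *= 1.0 - pct / 100.0
    return 100.0 * (1.0 - remaining)


# (database, domain) -> (% from code change, % from hardware), from the table above
results = {
    ("170 GB", 1): (85, 20),
    ("7 GB", 1): (14, 44),
    ("7 GB", 2): (69, 47),
    ("7 GB", 3): (35, 26),
    ("7 GB", 4): (74, 32),
}

for (database, domain), (code_pct, hardware_pct) in results.items():
    total = combined_improvement(code_pct, hardware_pct)
    print(f"{database} domain {domain}: about {total:.0f}% less time overall")
```

For example, an 85% gain followed by a 20% gain leaves 0.15 x 0.80 = 0.12 of the original run time, which is an 88% improvement overall and not 105%.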
What really jumps out from the above results is the wide swing in recon performance improvement. All domains tested show an improvement but it varies between 14% and 85% when looking at the code changes and between 20% and 47% when looking at the hardware scaling.
What this seems to indicate is the effect that rule design has on performance. We didn't change the matching logic, only the reconciling logic. When the data isn't fully filtered or the sorting isn't consistent, the matching code has to work harder to find matches, and the reconciling portion is then a smaller percentage of the overall process. We may change the matching logic in the future, but you can get further improvements now by reviewing your rules and tweaking them where necessary. Rule design, though, is a subject for another blog post.
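A simple model shows why the gain depends so heavily on how much of a run is spent reconciling. The shares and the 20x speedup below are hypothetical numbers chosen for illustration, not measurements, and the function name is ours.

```python
def overall_improvement_pct(reconcile_share, reconcile_speedup):
    """Percent reduction in total run time when only the reconciling part gets faster.

    reconcile_share   -- fraction of the original run time spent reconciling (0..1)
    reconcile_speedup -- how many times faster the reconciling part became
    """
    if not 0.0 <= reconcile_share <= 1.0:
        raise ValueError("reconcile_share must be between 0 and 1")
    if reconcile_speedup <= 0:
        raise ValueError("reconcile_speedup must be positive")
    # Matching time is unchanged; reconciling time shrinks by the speedup factor.
    new_time = (1.0 - reconcile_share) + reconcile_share / reconcile_speedup
    return 100.0 * (1.0 - new_time)


# Hypothetical: the same 20x faster reconciling code, three different rule sets.
for share in (0.15, 0.50, 0.90):
    gain = overall_improvement_pct(share, 20)
    print(f"reconciling = {share:.0%} of run time -> {gain:.1f}% faster overall")
```

In this model the same code change is worth about 14% when reconciling is 15% of the run and about 85% when it is 90%, which is the kind of spread seen in the results.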
Why these instances?
We tested on the m1.Large, m1.xLarge, m2.xLarge and m2.2xLarge instances. m1.Large is the smallest instance that made sense in our case - it allowed us to scale the 'hardware' and provided enough power to test on. We then upgraded to the m1.xLarge instance. On paper this is double the m1.Large instance and very similar to the m2.xLarge instance. But to our surprise we didn't get the expected improvements. Research led us to the conclusion that not all Amazon instances are created equal. It is likely that you are sharing the underlying hardware when using an m1 instance, so you can find yourself in a 'noisy neighbourhood' and suffering from 'CPU steal'. The m2 instances do not seem to experience the same issues to as large a degree.
We tested the scalability of Bank Recon by running the rules on the m2.2xLarge instance (double the m2.xLarge) but did not see noticeable improvements. Bank Recon does not make use of parallelism, so the additional cores were of no benefit. There is also a limit to the amount of memory Bank Recon can currently utilise.
Summary and Guidelines
The results indicate that the optimum configuration for Bank Recon is 18 GB RAM and a 2-3 GHz dual core processor. This is required for both the Application and Database server. Your hard disk size will be determined by the size of your database and application. Please note that this guideline is based on running Bank Recon on one server with NO other applications. If you run multiple applications or databases on the same server, you will benefit from more cores and memory, which allow for parallel execution of processes and multithreading.
A big thank you to the development team for the code improvements and all the time spent benchmarking and analysing the results. We are busy with some small tweaks to the Manual Recon screen to make it more user friendly, and then we will release everything in one bumper release.
Great work. Excellent info and guideline.