With one of our customers we have been struggling with severe locking on the InventSumDelta table. The effect of the locking is that users cannot perform their daily work while, for example, somebody is invoicing a large sales order or running other processes that access inventory data for shorter or longer periods of time.
While trying to fix this issue we have been attacking it from several fronts, and in this blog post I will try to list some of the actions we have taken to work our way towards a solution.
The current status is that we are still experiencing locks. However, they have been reduced to a more or less acceptable level. With performance being somewhat subjective, it is difficult to say when enough is enough.
So, what servers are we working on?
5 AOS servers plus a load balancer, all running Windows Server 2008 R2.
The SQL server is SQL Server 2012 Standard Edition, also running on a Windows Server 2008 R2 machine.
What have we done so far?
On the technical side we have done the following:
To eliminate disk I/O, which we suspected in the beginning, we added an SSD drive directly in the SQL server and moved the databases from the conventional SAN to the new drive. The disk performance rocketed, but we still had locking issues. The current setup has the AX databases on an SSD SAN and TempDB running on the local SSD to boost performance.
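Moving TempDB onto a local drive can be sketched along these lines. Note that the S:\ path is just an example, and tempdev/templog are the default logical file names; the move takes effect after a restart of the SQL Server service:

```sql
-- Point the TempDB data and log files at the local SSD
-- (S:\TempDB is an example path; create the folder first).
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = 'S:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = 'S:\TempDB\templog.ldf');
-- Restart the SQL Server service for the new locations to take effect.
```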
The next step was to remove page locking on the indexes on the InventSumDelta table. The theory here was that records were being locked because of an overlap in the pages. Then the OS and SQL Server were patched with every relevant patch and CU, and we doubled the RAM available to the SQL Server service from 32 to 64 GB, since we could see it was running a bit low on memory. Being shorthanded on RAM can actually increase locking, even with page locking disabled. Still no cigar: general performance increased, but the users were still stuck.
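Disabling page locking is a one-liner per table. A minimal sketch, assuming you want it off for every index on InventSumDelta:

```sql
-- Disable page-level locking on all indexes of InventSumDelta,
-- so SQL Server takes row locks (or escalates to table locks) instead.
ALTER INDEX ALL ON dbo.INVENTSUMDELTA
SET (ALLOW_PAGE_LOCKS = OFF);
```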
In the beginning we were convinced that this was due to general performance issues in AX 2012 R2. Yes, they do exist. 🙂
But more and more pointed towards the SQL server being the bottleneck. That made us start fiddling around with fill factors, going from 80 to 70 and experimenting with even more extreme numbers, before ending back on 80 since we saw no noticeable effect besides a rapidly increasing database size.
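Changing the fill factor requires a rebuild, which for this table could look roughly like this (80 being the value we ended up keeping):

```sql
-- Rebuild all indexes on InventSumDelta, leaving 20% free space
-- per page (FILLFACTOR = 80) to reduce page splits on inserts.
ALTER INDEX ALL ON dbo.INVENTSUMDELTA
REBUILD WITH (FILLFACTOR = 80);
```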
Assisted by Jesper from Miracle A/S, we did a lot of tracing on the SQL server and discovered multiple indexes missing from the system. A couple of them were probably specific to the way this customer works, and a couple of them were, I think, of public interest. Both of the latter are created on InventSumDelta:
Index1: TTSId, Partition, DataAreaId, IsAggregated
Index2: ItemId, DataAreaId, Partition
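In raw T-SQL the two indexes might look roughly like this. The index names are my own invention, and on a live AX system you would normally add them through the AOT instead, so that database synchronisation does not drop them again:

```sql
-- Hypothetical names; create via the AOT on a real AX installation.
CREATE NONCLUSTERED INDEX IX_InventSumDelta_TTSId
ON dbo.INVENTSUMDELTA (TTSID, [PARTITION], DATAAREAID, ISAGGREGATED);

CREATE NONCLUSTERED INDEX IX_InventSumDelta_ItemId
ON dbo.INVENTSUMDELTA (ITEMID, DATAAREAID, [PARTITION]);
```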
These index tweaks actually made a difference. Again, we did not reach the finish line. But we did move a fair bit closer.
While diving into this we came across what is probably the actual problem. We discovered that even though we had an excellently indexed table, the SQL Server optimizer seemed to randomly ignore that fact and select indexes not matching the query, thereby locking the full table.
There are several ways to address this. One is to increase the frequency of statistics updates; going down to every two minutes did not do any good. What we then did was to completely disable automatic statistics updates on all indexes on the table. The theory here was that since the content of the table is continuously switching, with high frequency, between no records and x number of records, the statistics could in most cases be based on a wrong perception of how the indexes were best utilised. And this seemed to be a hit.
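Turning off automatic statistics updates for a single table can be done with the sp_autostats system procedure, for example:

```sql
-- Disable the AUTO_UPDATE_STATISTICS behaviour for every index
-- and statistic on InventSumDelta only; the rest of the database
-- keeps its normal statistics maintenance.
EXEC sp_autostats 'dbo.INVENTSUMDELTA', 'OFF';
```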
Since this change we have seen a significant drop in locking. No doubt the new indexes have made a difference, and we would not be on our way out of the woods without them. However, I believe the main handle we pulled was in the way statistics are handled.
Afterwards we added a couple more indexes on InventSumDeltaDim and SalesInvoiceTmp to further improve performance. I will try to create a follow-up post summarising these steps.
This long path will not necessarily fix or patch your system if you are experiencing the same issues. If it does, I hope it makes your customers happy. If not, I hope you can use some of our findings, and that you will share yours by responding to this blog post.