企业绩效管理网


Correlation between "Commit" times and the Transacti ...

Posted 2014-3-16 01:50:22
Hello,

As I imagine many of you know, when a query references cells that were not previously cached, TM1 "commits" these cells at the conclusion of the query (this is what is displayed if you are watching TM1Top). During the commit TM1 becomes single-threaded; in other words, no one can do anything in TM1 until the commit completes.

What I am curious about is what exactly is going on during this "commit" process. We've noticed that there is a direct correlation between the length of time the "commit" runs for and the transaction log size. So, if our transaction log is, let's say, 30 MB, the "commit" would take 3 seconds. However, if the transaction log was 1 GB, that same query would take 20 seconds to commit (don't quote me on the exact ratio; the increase is just very noticeable).
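One way to check whether this relationship holds beyond a couple of remembered data points is to note the transaction log size and the commit duration seen in TM1Top for a number of queries, then compute a correlation coefficient. The following is a minimal sketch in Python, not anything provided by TM1: the function name `pearson` is hypothetical, and only the first and last observations reflect the rough figures quoted above; the middle ones are made-up placeholders.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# (transaction log size in MB, commit time in seconds)
# First and last pairs are the approximate figures from the post;
# the middle pairs are placeholders to be replaced with real measurements.
observations = [(30, 3.0), (250, 7.0), (500, 11.0), (1024, 20.0)]

sizes = [size for size, _ in observations]
commit_secs = [secs for _, secs in observations]
r = pearson(sizes, commit_secs)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would support the observation, but it would not by itself show that the log size causes the longer commits; both could be driven by something else, such as overall write volume since the last save.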

Any thoughts?

Thanks,
Brad

Posted 2014-3-16 03:01:13
There is no correlation between the size of a log file and anything, except for how long it may take to do a system save. You could have a large log file because you have been busy making changes to a cube or simply because it's been a week or more since you did a save. Also a "query" does nothing to force a commit. A commit is what happens when you WRITE to a cube, not READ from it.

That being said, if you have a 30 MB log file then something is wrong. You are probably logging TI processes, which is likely unnecessary, depending on your setup. Turning that off in your processes will speed them up immensely.

Posted 2014-3-16 03:05:11
I think that perhaps this is about the commit of calculated results to the shared calculation cache structure. Are you sure that the blocking affects the whole server and not just the cube being queried? Also, what version are you on?

Posted 2014-3-16 03:14:23
Tomok,

I don't believe you are correct with regard to "Commits" and WRITES and READS. We've been watching TM1Top closely for months, and while the "Commits" occur at the conclusion of a WRITE, they also most certainly occur at the end of a READ query. I think Duncan P's hypothesis with regard to the cache structure is correct.

With regard to the transaction log size, 30 MB for us is nothing. This is an extremely large, daily, highly available model with large amounts of data being pulled in every couple of minutes, and that data then needs to get replicated across multiple cubes 6x (long story). We also need our main instance to sync with an input instance every couple of minutes.

We've found that if we perform a SaveDataAll with the sync running, the instance will hang/deadlock, so that is not an option. Because of this we have intentionally left cube logging on, so if there is a server crash or reboot, no data will be lost when the instance comes up because it will read from the transaction log. We are currently working on a solution to pause our sync so we can run our SaveDataAlls. The environment of the application is extremely locked down and we are not permitted to make any "changes" to the application. Ideally, we would just turn off the sync chore and perform the SaveDataAll; however, this is considered a change, so we cannot do it.

With that said, as I mentioned previously, this is a highly available system (23 hours, 7 days a week) and the loss of data would result in us having to reload 4 years of historical daily data. That would result in an outage of 4 - 5 hours, which is simply not acceptable for our client. In this situation, the safety net of a transaction log is a necessary evil.

Duncan,

Yeah, I'm sure it locks up the entire server. For example, if it starts to perform the "Commit" you can't even navigate the instance in Perspectives. In Top you can even see the Excel Add-in verifying its connectivity to the server, waiting for the commit to complete. Personally, prior to this engagement I wasn't even aware of the nature of a "Commit" occurring because the models were smaller and less calculated. The Commits would occur, but they would happen very quickly and wouldn't be noticed. Also, I've never had to watch Top this closely before. We are currently on 9.5.1.

Thanks,
Brad

Posted 2014-3-16 03:27:57
Brad, I can confirm your observations from my experience.

With some big models you can definitely notice "Commit" calls in TM1Top when a data retrieval thread (e.g. a user refreshing a report) finishes. And I agree with you that these Commits put a lock on the server, and no other thread can continue to run until the commit has finished.

It would be nice if anyone could shed some more light on the subject, but I've always assumed it to be doing pretty much what you said.

Posted 2014-3-16 03:36:20
bradohare wrote: We've found that if we perform a SaveDataAll with the sync running, the instance will hang/deadlock, so that is not an option. Because of this we have intentionally left cube logging on, so if there is a server crash or reboot, no data will be lost when the instance comes up because it will read from the transaction log. We are currently working on a solution to pause our sync so we can run our SaveDataAlls. The environment of the application is extremely locked down and we are not permitted to make any "changes" to the application. Ideally, we would just turn off the sync chore and perform the SaveDataAll; however, this is considered a change, so we cannot do it.
What is it about a certain type of IT management that can so perfectly allow the precise lobotomization of the common sense lobe? I feel for you. It sounds like you have a reasonably large and complex model, and these sorts of operational rules only serve to make everyone's lives more difficult, make systems less agile, and make changes much slower and more expensive to implement. To use a simple car analogy, you have a steering wheel but you aren't allowed to use it; instead you have to design and deploy a self-steering system! Far from reducing operational risk, this failure to distinguish a true technical or configuration change from a change to an operational parameter leads to sub-optimally performing systems, unhappy customers and frustrated administrators and developers.
bradohare wrote: Tomok,

I don't believe you are correct with regard to "Commits" and WRITES and READS. We've been watching TM1Top closely for months, and while the "Commits" occur at the conclusion of a WRITE, they also most certainly occur at the end of a READ query. I think Duncan P's hypothesis with regard to the cache structure is correct.
I agree with this assessment as well. On completion of view retrieval/calculation, any newly calculated values get written to the calculation cache, and this commit action does cause a lock (although typically only for the specific cube(s) involved rather than server-wide, unless a metadata change is also involved, such as re-evaluation of a dynamic subset). Three suggested actions which may help:
1/ Make sure that the views being queried in the read threads causing the locks aren't using any dynamic subsets. If they are, replace them with static subsets. (Not sure of the "change" impact on your system here!)
2/ It is possible to effectively disable the calculation cache by using the setting InfiniteThresholdForSingleCellStorage=T in tm1s.cfg. As the calculation cache isn't written to, there is no commit at the end of a calculation activity and hence no lock (dynamic subsets could still be an issue though). There is a trade-off in performance, though, as no cache means every view retrieval is calculated from scratch each time. However, in a model with frequent updates this may not have much of an impact. The change is simple and just requires a restart to take effect, but in your environment it would certainly be seen as a change ...
3/ Upgrade to 9.5.2 with parallel interaction (probably a long shot in the environment you have described). With parallel interaction, server-wide locking is a thing of the past and locks only occur due to metadata changes.
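As an illustration of suggestion 2, here is one way to add or update the setting in a copy of tm1s.cfg with a small Python helper. This is a minimal sketch under assumptions: the helper name `set_cfg_param` and the sample server name and port are hypothetical, the file is assumed to use a `[TM1S]` section header followed by `Key=Value` lines, and the parameter name is taken as given in this thread rather than verified against IBM documentation.

```python
def set_cfg_param(cfg_text, key, value, section="[TM1S]"):
    """Return cfg_text with key=value set.

    An existing uncommented line for the key (matched case-insensitively)
    is replaced; otherwise the new line is added right after the section
    header. Raises ValueError if neither the key nor the section is found.
    """
    lines = cfg_text.splitlines()
    new_line = f"{key}={value}"
    # First pass: replace an existing assignment of the key, if any.
    for i, line in enumerate(lines):
        name = line.split("=", 1)[0].strip()
        if "=" in line and name.lower() == key.lower():
            lines[i] = new_line
            return "\n".join(lines) + "\n"
    # Second pass: key not present, so add it under the section header.
    for i, line in enumerate(lines):
        if line.strip().lower() == section.lower():
            lines.insert(i + 1, new_line)
            return "\n".join(lines) + "\n"
    raise ValueError(f"section {section} not found")

# Hypothetical minimal config used only for demonstration.
sample_cfg = "[TM1S]\nServerName=planning_sample\nPortNumber=12345\n"
updated = set_cfg_param(sample_cfg, "InfiniteThresholdForSingleCellStorage", "T")
print(updated)
```

Working on a copy of the file and diffing it against the original before swapping it in keeps the change easy to review, and the server still needs a restart before the setting takes effect.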

Posted 2014-3-16 04:25:55
InfiniteThresholdForSingleCellStorage=T

It's a shame you can't set this for individual cubes rather than for the whole service. In a perfect world you would then be able to leave this on for summary cubes that are hit more often.

Posted 2014-3-16 05:04:06
Hi lotsaram,

With regard to your thoughts on IT management, as you would guess, I completely agree with you. However, this is the hand that was dealt and it is what it is. I'm most certainly not going to change anything, and it's something we need to work around.

With that said, this goes back to my main question and the point of my confusion. Why would the transaction log have anything to do with the time a commit takes? It just doesn't make sense to me.

To each of your points:

1 - Knowing the potential issue with dynamic subsets, we avoid them like the plague, making almost all of our subsets static.

2 - Thanks for this, I wasn't aware of this setting. Unfortunately, we cannot use it. We have a handful of cubes in our instance; one is a "calculated" cube that is performant and reflects data as of the previous end of day, and another is a "live" cube pulling in data every couple of minutes. We need to have the caching on for the calculated cube. Jim's point is a good one, as it would be great to disable this caching for an individual cube.

3 - In the cards and definitely on the horizon. However, our model is massive and we are already maxing out the memory on our boxes. Trying to get more memory has been a chore, and we don't have a clear idea of how much more we would need for 9.5.2. IBM says 10 - 30% but I don't buy it; I think it'll be more.

thanks
brad

Posted 2014-3-16 05:38:54
Hi Brad:
With that said, this goes back to my main question and the point of my confusion. Why would the trans log have anything to do with the time a commit takes?? Just doesn't make sense to me.

Sorry to jump in a bit late, but what platform are you running your TM1 Server(s) on?  I think we've (to this point somewhat anecdotally) seen some odd performance issues that seem to correlate with larger transaction logs at one of our customers that's running TM1 on a Unix OS.  I'd have to dig around a bit to find out more on that, but wanted to mention it in case that was true in your case, too - if you're running TM1 on a Windows OS then I can't say I've seen the same correlation (and we have way more customers running TM1 on Windows servers).

Regards,
Mike