How to throttle DataStage
I work on a project where we run a number of DataStage sequences in parallel. One in particular performs poorly and consumes a lot of resources, which impacts the shared environment. A performance tuning initiative is in progress, but it will take time.
In the meantime, I was hoping we could throttle DataStage to restrict the resources available to this particular job/sequence. However, I have no hands-on experience with DataStage myself.
Can anyone confirm whether this facility exists in DataStage (v8.5, I believe) and point me toward further detail?
Secondly, I know that we can throttle based on the user (I think this ties into AIX 'ulimit', but I'm not sure). Is it easy, or even possible, to run different jobs/sequences as different users?
Comments (3)
In this type of situation, the resources for a particular job can be restricted by specifying the number of nodes and resources in a configuration file. This is possible in 8.5, and you may find something at www.datastagetips.com
Revolution_In_Progress is right.
DataStage PX has the notion of a configuration file. That file can be specified for all the jobs you run, or it can be overridden on a job-by-job basis. The configuration file can be used to limit the physical resources that are associated with a job.
In this case, if you have a 4-node config file for most of your jobs, you may want to write a 2-node config file for the job with the performance issue. That way, you'll get the minimum amount of parallelism (without going completely sequential) and use the minimum amount of resources.
http://pic.dhe.ibm.com/infocenter/iisinfsv/v8r1/index.jsp?topic=/com.ibm.swg.im.iis.ds.parjob.tut.doc/module5/lesson5.1exploringtheconfigurationfile.html
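To make the 2-node idea concrete, here is a rough sketch in Python that renders the text of a parallel configuration file with a given number of logical nodes. The host name and the disk and scratch directories are made-up placeholders, not values from this thread, and the layout follows the usual node / fastname / pools / resource structure as I understand it, so check it against a config file that already works on your server before using it.

```python
def render_config(host, node_count, disk_dir, scratch_dir):
    """Return the text of a parallel config file with node_count logical nodes.

    All arguments are illustrative placeholders; a real file should use the
    host name and directories from an existing, working configuration.
    """
    blocks = []
    for i in range(1, node_count + 1):
        blocks.append(
            f'  node "node{i}"\n'
            "  {\n"
            f'    fastname "{host}"\n'
            '    pools ""\n'
            f'    resource disk "{disk_dir}" {{pools ""}}\n'
            f'    resource scratchdisk "{scratch_dir}" {{pools ""}}\n'
            "  }\n"
        )
    return "{\n" + "".join(blocks) + "}\n"


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    print(render_config("etl-host", 2, "/data/ds/datasets", "/data/ds/scratch"))
```

Saving the 2-node output to its own file and keeping the 4-node file as the default lets you point only the heavy job at the smaller configuration.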
A sequence is a collection of individual jobs.
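Because a sequence is just a set of jobs, one way to apply a smaller configuration is job by job. The sketch below only builds the command lines; it assumes each job exposes $APT_CONFIG_FILE as a job parameter (it has to be added to the job for the override to be accepted) and that the jobs are started with the dsjob command-line client. The project name, job names, and config path are hypothetical.

```python
import shlex


def build_run_command(project, job, config_path):
    """Build a dsjob invocation that overrides the config file for one job.

    Assumes the job defines $APT_CONFIG_FILE as a job parameter.
    """
    return [
        "dsjob", "-run",
        "-param", f"$APT_CONFIG_FILE={config_path}",
        project, job,
    ]


if __name__ == "__main__":
    # Hypothetical project, jobs, and path, for illustration only.
    heavy_jobs = ["LoadStaging", "BuildFacts"]
    for job in heavy_jobs:
        cmd = build_run_command("MyProject", job, "/opt/ds/configs/2node.apt")
        # shlex.join quotes the "$" so a shell would not expand it.
        print(shlex.join(cmd))
```

If the jobs are only ever started from inside the sequence, the same parameter would instead be passed down from the sequence to each job activity rather than on a command line.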