Experience migrating legacy Cobol/PL1 to Java
ORIGINAL Q:
I'm wondering if anyone has had experience of migrating a large Cobol/PL1 codebase to Java?
How automated was the process and how maintainable was the output?
How did the move from transactional to OO work out?
Any lessons learned along the way or resources/white papers that may be of benefit would be appreciated.
EDIT 7/7: The NACA approach is certainly interesting. The ability to keep making BAU (business-as-usual) changes to the COBOL code right up to the release of the Java version has merit for any organization.
For a large organisation with a large code base, there is a valid argument for procedural Java laid out like the original COBOL: it gives the coders a sense of comfort while they familiarize themselves with the Java language. As @Didier points out, the $3mil annual saving leaves room for generous padding on any BAU changes going forward, so the code can be refactored on an ongoing basis. As he puts it, if you care about your people, you find a way to keep them happy while gradually challenging them.
The problem as I see it with the suggestion from @duffymo, namely
"Best to try and really understand the problem at its roots and re-express it as an object-oriented system",
is that if you have any BAU changes ongoing, then during the LONG project lifetime of coding your new OO system you end up coding and testing every change twice. Avoiding that is a major benefit of the NACA approach. I've had some experience of migrating client-server applications to a web implementation, and this was one of the major issues we encountered: constantly shifting requirements due to BAU changes. It made project management and scheduling a real challenge.
Thanks to @hhafez, whose experience is nicely put as "similar but slightly different" and who has had a reasonably satisfactory experience of an automatic code migration from Ada to Java.
Thanks @Didier for contributing. I'm still studying your approach, and if I have any questions I'll drop you a line.
Update 6/25: A friend just ran across the NACA Cobol to Java converter. Looks quite interesting, it was used to translate 4m lines of Cobol with 100% accuracy. Here's the NACA open source project page. The other converters I've seen were proprietary, and the materials were conspicuously lacking success stories and detailed example code. NACA is worth a long look.
Update 7/4: @Ira Baxter reports that the Java output looks very Cobol-esque, which it absolutely does. To me, this is the natural result of automatic translation. I doubt we'll ever find a much better translator. This perhaps argues for a gradual re-write approach.
Update 2/7/11: @spgennard points out that there are some Cobol compilers on the JVM, for example Veryant's isCobol Evolve. These could be used to help gradually transition the code base, though I think the OP was more interested in automated source conversion.
I'd be very cautious about this. (I used to work for a company that automatically corrected Cobol and PL/I programs for Y2K, and did the front end compiler that converted many dialects of Cobol into our intermediate analytic form, and also a code generator.) My sense is that you'd wind up with a Java code base that still would be inelegant and unsatisfying to work with. You may wind up with performance problems, dependencies on vendor-supplied libraries, generated code that's buggy, and so on. You'll certainly incur a huge testing bill.
Starting from scratch with a new object-oriented design can be the right approach, but you also have to carefully consider the decades of stored knowledge represented by the code base. Often there are many subtleties that your new code may miss. On the other hand, if you're having a hard time finding staff to maintain the legacy system, you may not have a choice.
One gradual approach would be to first upgrade to Cobol 97. This adds object-orientation, so you can rewrite and refactor subsystems individually when you add new functionality. Or you could replace individual subsystems with freshly-written Java.
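A minimal sketch of the "replace individual subsystems" idea, assuming each subsystem can be reached through a single request/response entry point. SubsystemRouter and its method names are hypothetical, and the String-in/String-out signature is only a stand-in for whatever interface the real subsystems expose. Requests for subsystems that have been rewritten go to the new Java code; everything else still falls through to the legacy system.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: routes each request either to new Java code or to the legacy system.
final class SubsystemRouter {

    // Subsystems that have already been rewritten in Java, keyed by name.
    private final Map<String, Function<String, String>> migrated = new HashMap<>();

    // Stand-in for whatever bridge calls the legacy COBOL/PL/I system (assumed to exist).
    private final Function<String, String> legacyBridge;

    SubsystemRouter(Function<String, String> legacyBridge) {
        this.legacyBridge = legacyBridge;
    }

    // Registers a Java replacement for one subsystem; all others keep using the legacy bridge.
    void markMigrated(String subsystem, Function<String, String> javaHandler) {
        migrated.put(subsystem, javaHandler);
    }

    // Dispatches to the Java handler if one is registered, otherwise to the legacy bridge.
    String handle(String subsystem, String request) {
        return migrated.getOrDefault(subsystem, legacyBridge).apply(request);
    }
}
```

Callers only ever see the router, so a subsystem can be switched over (or switched back) without touching them.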
Sometimes you'll be able to replace components with off-the-shelf software: we helped one very large insurance company that still had 2m lines of code in a legacy language it created in the 1950s. We converted half of it to Y2K compliant legacy language, and they replaced the other half with a modern payroll system they bought from an outside vendor.
It was clearly our intent to obtain initial Java code that was very close to the original COBOL in order to ease the migration of people: they find the good old app they wrote in COBOL in exactly the same structure.
One of our most important goals was to keep the initial developers on board, and this is the way we found to achieve it. Once the application has migrated to Java, those people can start making it more OO as they further develop and refactor it.
If you don't care about migrating people, you can use another strategy.
This 1-to-1 conversion also made 100% automated conversion simpler and faster. The good consequence is that we reached our recurring savings (3 million euros / year) much faster: we estimate 12-18 months. Those early savings can clearly be reinvested in OO refactoring.
feel free to contact me: [email protected] or [email protected]
didier
I just looked at the NACA page and docs. From their documentation:
"The generated java uses a Cobol-like syntax. It's as close as possible from original Cobol syntax, within of course the limits of the Java language.
Generated code doesn't look like classical native java and is not object oriented from the application point of view.
This is a by design strong choice, to enable a smooth migration of Cobol developers to the Java environment. The goal is to keep business knowledge in the hand of people who wrote the original Cobol programs."
I didn't see an example, but the quote gives a strong flavor of the result.
It's COBOL coded in Java.
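To make "COBOL coded in Java" concrete, here is a small hand-written illustration. It is NOT actual NACA output and does not use NACA's runtime library; the class, field, and method names are invented for this sketch. The first style keeps the COBOL shape (shared working-storage fields, a parameterless paragraph), while the second expresses the same interest rule the way a Java programmer would normally write it.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration only. This is NOT NACA output and does not use NACA's runtime.
final class InterestStyles {

    // Style 1: COBOL-shaped Java. Shared mutable fields play the role of
    // WORKING-STORAGE, and a paragraph becomes a void method without parameters.
    static BigDecimal wsPrincipal = BigDecimal.ZERO;
    static BigDecimal wsRatePct = BigDecimal.ZERO;
    static BigDecimal wsInterest = BigDecimal.ZERO;

    static void computeInterestPara() {
        // Mirrors: COMPUTE WS-INTEREST ROUNDED = WS-PRINCIPAL * WS-RATE-PCT / 100
        wsInterest = wsPrincipal.multiply(wsRatePct)
                .divide(BigDecimal.valueOf(100), 2, RoundingMode.HALF_UP);
    }

    // Style 2: the same rule as an immutable value object with explicit inputs and output.
    static final class Loan {
        private final BigDecimal principal;
        private final BigDecimal ratePct;

        Loan(BigDecimal principal, BigDecimal ratePct) {
            this.principal = principal;
            this.ratePct = ratePct;
        }

        BigDecimal interest() {
            return principal.multiply(ratePct)
                    .divide(BigDecimal.valueOf(100), 2, RoundingMode.HALF_UP);
        }
    }
}
```

Both styles compute the same number. The difference is in who can read and change the code afterwards: the first still requires thinking in COBOL terms, which is the point being argued here.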
You can always build a "Translator" from one language to another by
simply coding an interpreter in the target language. That's IMHO an
absolutely terrible way to translate a language, as you end up with
the worst of both worlds: you don't get the value of the new language,
and you still have to have knowledge of the old one to keep the result
alive. (No wonder this thing is called a "Transcoder"; I'd never
heard this term before).
The argument for this stunt is to dump the costs of the mainframe.
Where's the evidence that the costs of working on the converted program
don't swamp the savings? I suspect the truth is that the operations people
lowered their cost by dumping the mainframe, and they couldn't care less
that the maintenance tasks got more expensive. While that may be rational
for the operations guys, it's a stupid choice for the organization as a whole.
Heaven help the people who are victims of this tool.
EDIT May 2010: I found an example of NACA's output; one of their
test cases. This is absolutely magnificent JOBOL. It's a good thing they
are keeping their COBOL programmers and don't want to hire
any Java programmers. As you read this, be sure you remember
this is Java code.
Kids: This is only done by professionals. Do not attempt this at home.
My experience is similar but slightly different. We have a large and old code base in Ada (0.5 MLOC, built up over 15+ years) that was recently converted to Java.
It was outsourced to a company that provided a combination of automated and manual conversion. They also did testing to verify that the Ada and Java systems behaved the same.
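The "behaved the same" testing can be sketched roughly as follows. This is only an assumption about how such a check might look, not a description of what that company actually did: EquivalenceCheck and its method are hypothetical names, and in practice the legacy side would more likely be a set of recorded outputs than a callable function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical harness: names are illustrative, not taken from any tool in this thread.
final class EquivalenceCheck {

    // Runs both implementations over the same inputs and describes every mismatch found.
    static <I, O> List<String> compare(List<I> inputs,
                                       Function<I, O> legacy,
                                       Function<I, O> migrated) {
        List<String> mismatches = new ArrayList<>();
        for (I input : inputs) {
            O expected = legacy.apply(input);   // behaviour of the old system
            O actual = migrated.apply(input);   // behaviour of the converted code
            if (!Objects.equals(expected, actual)) {
                mismatches.add("input=" + input + " legacy=" + expected + " migrated=" + actual);
            }
        }
        return mismatches;
    }
}
```

An empty result means the two versions agreed on every input tried, which is evidence of equivalence rather than proof; the value of the check depends on how well the inputs cover edge cases such as negative numbers and rounding.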
Some parts of it were written in Ada 95 (i.e. had the possibility of OOP), but most of it wasn't.
Now, yes, the code is not up to the standard of code written in Java in the first place, but we have been using it successfully since then (18 months now) with no major issues. The major advantage is that we can now find more developers with the skills to produce maintainable code to maintain our code base. (Anyone can develop in Ada, but as with any other language, if you don't have experience in it you can end up with unmaintainable code.)
From a risk avoidance point of view, the NACA approach absolutely makes sense. Reusing their tools might not. They used the development of the tools to get their people up to speed in Java and Linux.
The result of the NACA conversion is not going to be good enough, or even OO, and it makes it difficult to hire new people. But it is testable, can be refactored, and you can plug in better translators.
[edit]
Ira, you don't seem to be very risk-aware.
Sending the COBOL programmers to a Java course is not going to make them write usable object-oriented code. That takes a few years. During that time, their productivity will be very low, and you can basically throw away all the code they write in the first year. In addition, you'll lose 10-20% of your programmers, who are not willing or able to make the transition. Lots of people do not like going back to beginner status, and it is going to influence the pecking order, as some programmers pick up the new language a lot faster than others.
The NACA approach allows the business to continue working, and puts no unneeded pressure on the organisation. The time schedule for the conversion is independent. Having a separate translator, in Java, written by OO experts, allows a gradual exposure to Java for the old team. Writing the test cases increases domain knowledge in the new Java team.
The real OO system is the translator, and that is the place to plug in better translators. Make it easy to do that, and you do not have to touch the generated code. If the generated code is ugly enough, that is what will happen automatically: :)
[running the translator once]
is a bad strategy. Don't do that. And if you need to edit the generated code, maintain a mapping back. That can be automated, and should be. It is a lot easier to do this kind of thing in a Smalltalk image, but you can do it with files. There are people with a lot of experience maintaining different views on the same artifact: chip designers come to mind.
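One possible shape for such a "mapping back", sketched under the assumption that the translator can report, for each statement it emits, the generated line where it starts and the source line it came from. SourceMap, Origin, and the file name used in the usage note are hypothetical; nothing here is part of NACA or any other tool named in this thread.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch; not part of NACA or any other tool named in this thread.
final class SourceMap {

    // Location in the original program that a generated statement came from.
    static final class Origin {
        final String file;
        final int line;

        Origin(String file, int line) {
            this.file = file;
            this.line = line;
        }

        @Override
        public String toString() {
            return file + ":" + line;
        }
    }

    // Key: first generated line of a translated statement. A TreeMap lets us find
    // the statement that covers any later generated line via floorEntry.
    private final TreeMap<Integer, Origin> byGeneratedLine = new TreeMap<>();

    // Called by the translator each time it starts emitting code for a source statement.
    void record(int generatedLine, String sourceFile, int sourceLine) {
        byGeneratedLine.put(generatedLine, new Origin(sourceFile, sourceLine));
    }

    // Finds the source location responsible for a given line of generated code.
    Optional<Origin> originOf(int generatedLine) {
        Map.Entry<Integer, Origin> entry = byGeneratedLine.floorEntry(generatedLine);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }
}
```

With entries recorded for generated lines 10 and 14, a lookup for line 12 resolves to the statement that started at line 10, so an edit or a stack trace in the generated Java can be traced back to the original source.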
The translator should be instrumented, so you can create the daily counts of e.g.
You might want to read:
Peter van den Hamer & Kees Lepoeter (1996) Managing Design Data: The Five Dimensions of CAD Frameworks, Configuration Management, and Data Management, Proceedings of the IEEE, Vol. 84, No. 1, January 1996
[moving Cobol platforms]
Moving from Cobol on the mainframe to Cobol on Windows/Linux could have been a viable strategy for the NACA team, but the question was about moving to java. If the long-term goal is to have a modern OO system, and to get there with as little operational risk as possible, the NACA approach is sound. It is only step one, though. A lot of refactoring is going to follow.
I'm surprised nobody has mentioned Semantic Designs' DMS Software Reengineering Toolkit. I looked into COBOL conversion in the past. I was working on "automatic programming" back then. Before writing a translator, I looked up a bunch of previous efforts and products in that area. Semantic Designs' GLR-based tool was the best of the bunch.
That was many years ago. At the time, the tool translated COBOL to a modern language, refactored it, pretty printed it, etc. Here's the link to it now.
http://www.semdesigns.com/Products/DMS/DMSToolkit.html
They're still around. They've expanded the tool. It's more general. It might help people doing automated conversions or customizing a conversion tool. It's designed to be expandable and tweakable similarly to what Stephan pointed out. Thanks to Cyrus also for mentioning SoftwareMining. I'll look into them too if I run into a COBOL migration in the future.
You are speaking of reengineering. The good thing is that a lot of people worldwide try to do this. The bad thing is that legacy application reengineering comes with a lot of problems, ranging from missing sources to complex algorithms from the fields of compiler construction and graph theory.
The idea of automatic translation is very popular, until you try to convert something. Usually the result is awful and unmaintainable, even less maintainable than the original complicated application. From my point of view, every tool that offers automatic translation from a legacy language to a modern one is very marketing oriented. It says exactly what people want to hear: "translate your application from ... to Java once, and forget!" Then you buy a contract, and then you realize that you depend very tightly on the tool (because you can't make any change to your application without it!).
The alternative approach is "understanding": a tool that gives you a very detailed understanding of your legacy application. You can use it for maintenance, for documenting, or for reinventing the application on a new platform.
I know a little about the history of Modernization Workbench before Micro Focus bought it last year and moved development to another country. It had a great number of complex analysis tools and a number of supported target languages (including Java). But no client really used automatic code generation, so development of the generation part was frozen. As far as I know, PL/I support was mostly implemented but never finished. Still, you can try it; maybe this is what you are looking for.