I'd like to extract all of the information (formatted text, images, etc.) from PowerPoint slides into a flowing, readable (MS Word-style) format.
I'm not interested in keeping the slide concept at all--think of taking class slides from a college course and batch converting them all into one collective study guide.
- I can't find a way to do this within PowerPoint (though if you know of one, please share!), and
- I don't have experience scripting Office apps. Is this kind of thing easily done? Does this kind of script already exist somewhere?
Clarification:
In an earlier version of this post, I used the word "flowing" to refer to a slide-free (MS Word-like) format. This does not, however, refer to the actual formatting of slide content. So keeping bullet lists, etc. is fine and even desirable.
Comments (6)
I don't see this being a simple task. College professors use a format of either "TITLE: BULLET POINTS OR IMAGE" or "EVERY WORD I'M ABOUT TO SAY" for their slides in my experience, and you're just not going to get flowing, readable text from the former no matter what you do. For the latter, you've already got your text, you just have to copy it to another document.
I think you might as well just open the PowerPoint, select all the text, and copy+paste into Word/Publisher/InDesign/your favorite page layout program. You'll have the same effect and the same amount of editing after the fact except without all the hassle of writing a program to do it for you.
Doing a Print operation to a PDF with the N-up options might be a good solution for handouts if that's all you need. You could expand the idea and condense ALL the slide decks into one, get it printed (with N slides per page and the note space next to it) and bound, and voila, instant study guide. I've seen that, and then you get options for note taking.
More power to you if you're doing this just because you can - don't let me stop you. There is much good learning to be had that way. You might want to look into writing a program using the Microsoft.Office.Interop namespace in .NET (starting at http://msdn.microsoft.com/en-us/library/bb772069.aspx ), or perhaps look on CPAN ( http://search.cpan.org/search?mode=all&query=powerpoint ) and do it with Perl! There are lots of ways to do it, but you've got to be up for the challenge.
Text is fairly simple to extract, but what text do you want? The text from the title and body text placeholders only? In that case, use File, Save As, and choose to save the outline.
The other text on the slide? That can be pulled out to a text file programmatically, but in what order? Suppose you have a complex diagram with text callouts. Extracting the text is going to give you gibberish. There's no obvious/meaningful order to the text other than what the human viewer supplies by noting that "Ah. The arrow next to this bit of text points to the fribulator sub-assembly, so must relate to it in some way." Try doing that in code. ;-)
You could give the author a way to sort the text into reading order so that the code knows what order to extract it in, but that would require a fair amount of work on the part of the author.
If you can be certain that all of the content is in title+bullet form, no worries. Otherwise, you'd have to be able to articulate exactly what you want extracted, in what form and in what order before you could get anywhere with this.
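To make the ordering problem concrete, here is a minimal Python sketch of one possible heuristic: sort the shapes on a slide top-to-bottom, then left-to-right, using the offsets stored in the slide XML of a .pptx file. The function name `shapes_in_reading_order` is made up for this example, and the geometric sort is only an assumption about what "reading order" means. It will still produce nonsense for diagrams with callouts, which is exactly the point made above.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by slide parts inside a .pptx file.
NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
}

def shapes_in_reading_order(slide_xml):
    """Return the text of each shape, sorted top-to-bottom, then left-to-right."""
    root = ET.fromstring(slide_xml)
    shapes = []
    for index, shape in enumerate(root.iter("{%s}sp" % NS["p"])):
        offset = shape.find("p:spPr/a:xfrm/a:off", NS)
        # Placeholders often inherit their position from the slide layout and
        # carry no offset of their own; this sketch keeps them first, in file order.
        if offset is None:
            key = (0, -1, -1, index)
        else:
            key = (1, int(offset.get("y")), int(offset.get("x")), index)
        paragraphs = [
            "".join(t.text or "" for t in para.iterfind(".//a:t", NS))
            for para in shape.iterfind(".//a:p", NS)
        ]
        text = "\n".join(p for p in paragraphs if p.strip())
        if text:
            shapes.append((key, text))
    return [text for key, text in sorted(shapes)]
```

For a title-plus-bullets slide this gives the obvious order. For anything more complex, the author would still have to supply the order by hand.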
MS Word-style is not only readable, but writeable as well (which was not specified in your requirements). If you want a read-only guide, PDF is your natural choice (either through Acrobat Distiller or LibreOffice). Combine the individual PDF presentations with PDFtk, Acrobat, or Foxit and you're good to go without any programming at all.
"Is this kind of thing easily done?" - Yes, your humble servant did a couple of similar scripts ages ago (extracting enhanced metafiles from PowerPoint slides).
"Does this kind of script already exist somewhere?" - Yes. Probably in hundreds of places, but I'm not sure any of them have been posted to the 'Net. All things considered, I think you'd be better off learning some scripting and macro programming on your own, since a ready-made script may not quite fit your needs, and understanding and rewriting it would take more time than coding and debugging one from scratch.
Since you mention that title+bullet form is OK, open the file, choose Save As, and pick Outline as the save-as type.
I think you could parse through the PowerPoint file for formatting, text and pictures. There are Visual Studio namespaces available for such a task. You open the file, parse through it and build a Word file from the result. It is complicated work: you would have to consider the type of each element and its position, so you would need a temporary structure for each slide.
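As a rough illustration of the "temporary structure for each slide" idea, here is a small Python sketch (not .NET, and not a full converter). It reads one slide's XML from a .pptx file into a hypothetical `SlideContent` record holding a title and a list of (level, text) bullets, then renders a list of such records as a plain-text outline. The names `SlideContent`, `parse_slide` and `render` are invented for this example. Images, tables and free-floating text boxes are ignored, and writing an actual Word file is left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
}

@dataclass
class SlideContent:
    """Temporary per-slide structure: a title plus (indent level, text) bullets."""
    title: str = ""
    bullets: list = field(default_factory=list)

def parse_slide(slide_xml):
    """Collect title and bullet paragraphs from one slide's XML."""
    content = SlideContent()
    root = ET.fromstring(slide_xml)
    for shape in root.iter("{%s}sp" % NS["p"]):
        # Title placeholders are marked with <p:ph type="title"> or "ctrTitle".
        ph = shape.find("p:nvSpPr/p:nvPr/p:ph", NS)
        is_title = ph is not None and ph.get("type") in ("title", "ctrTitle")
        for para in shape.iterfind("p:txBody/a:p", NS):
            text = "".join(t.text or "" for t in para.iterfind(".//a:t", NS)).strip()
            if not text:
                continue
            if is_title:
                content.title = (content.title + " " + text).strip()
            else:
                # The lvl attribute holds the indent level; absent means level 0.
                props = para.find("a:pPr", NS)
                level = int(props.get("lvl", "0")) if props is not None else 0
                content.bullets.append((level, text))
    return content

def render(slides):
    """Render parsed slides as one continuous plain-text outline."""
    lines = []
    for slide in slides:
        if slide.title:
            lines.append(slide.title)
        for level, text in slide.bullets:
            lines.append("  " * level + "- " + text)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Keeping parsing and rendering separate means the same structure could later feed a Word or HTML writer instead of plain text.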
Have a look at this sample code:
http://msdn.microsoft.com/en-us/library/office/gg278331.aspx
How to: Get All the Text in All Slides in a Presentation
Basically, using C# and the Open XML SDK 2.0, it loops through all the slides in the presentation and appends every piece of text on each slide to a string builder. You can write the result out to a text file if you like (modification required).
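For readers who would rather not set up the SDK, the same loop can be sketched in Python with only the standard library, since a .pptx file is a zip archive of XML parts. This is a rough analogue, not the MSDN sample itself. The helper names `slide_number` and `extract_text` are invented here, and the sketch assumes that ordering slides by file name (slide1.xml, slide2.xml, ...) is good enough. The real presentation order is defined in ppt/presentation.xml and can differ if slides were reordered.

```python
import re
import xml.etree.ElementTree as ET
import zipfile

# DrawingML namespace used for paragraphs and text runs in slide XML.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"

def slide_number(name):
    """Return N for a part named ppt/slides/slideN.xml, otherwise None."""
    match = re.fullmatch(r"ppt/slides/slide(\d+)\.xml", name)
    return int(match.group(1)) if match else None

def extract_text(pptx):
    """Return [(slide_number, [paragraph, ...]), ...] for a .pptx path or file object."""
    result = []
    with zipfile.ZipFile(pptx) as archive:
        names = [n for n in archive.namelist() if slide_number(n) is not None]
        # Assumption: file-name order approximates presentation order.
        for name in sorted(names, key=slide_number):
            root = ET.fromstring(archive.read(name))
            paragraphs = []
            for para in root.iter(A_NS + "p"):
                text = "".join(run.text or "" for run in para.iter(A_NS + "t"))
                if text.strip():
                    paragraphs.append(text)
            result.append((slide_number(name), paragraphs))
    return result
```

Writing the result to a text file is then a matter of looping over the returned list and printing each paragraph, one slide after another.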
Recommendation: <25 oct 2012>
For your study guide, maybe you could extract all the text in each slide and dump that text programmatically (by adding that function to the sample code above while it's iterating the slides) into the "Notes" section of each slide. With that, you can print it in Notes Page view. You'll get the entire slide image on the top half of the page and the actual slide text below it. It sure beats trying to copy and paste all the text from the slide into the notes section. You can even print 2 slides per page, since small text inside the slide image would no longer be an issue and diagrams would still be more or less visible.
Unfortunately, this method only works for simple, standard slide formats. It's OK if your slides just have a title and a center text box with all the bullet points, but any complex slide layout (maybe text boxes scattered everywhere) will come out in no particular order and will be confusing. But at least you can still look at the slide image above to make sense of it :)