Converting text to an image in Microsoft Word

Posted on 2024-12-27 11:44:33

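I have a large book written in Microsoft Word and would like to create a macro that finds all text using a predefined style and converts that text to an inline image. The text will be in Arabic and will generally not exceed 4-5 lines. Is this possible?

Update: here is an example to show what I mean:

[Image: enter image description here]

I want to replace the whole Arabic line with an image (as if I had cropped the attached image to contain only the Arabic, and then replaced the Arabic line with that image).

The reason I want a macro or script to do this is that there are hundreds of such lines. Updating them one by one is tedious and would make later revisions difficult.

UPDATE2: I found an interesting option here: http://windowssecrets.com/forums/showthread.php/31344-Convert-Text-to-an-Image-of-Text-in-VBA-(Office-2000-Sr1a)

It looks as if you can cut a piece of text and then "Paste Special" it as an image. So if there is a way to automate that, it might work.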

只等公子 2025-01-03 11:44:33

This is not an answer although I hope it will grow into a community answer. At the moment it is an exploration of what is required to solve the problem.

I know from the discussion when this question was posted on Super User that Abdullah wishes to publish his book on Kindle. So the question is really about how to get a document in English and Arabic ready for publication as an e-Book.

The Kindle does not support Arabic. The number of languages it does support is slowly increasing but there is no evidence I can find that Amazon has plans to add Arabic in the foreseeable future.

The format behind an Amazon e-Book is a cut down version of HTML. If a Word document containing Arabic letters is exported to HTML, the Arabic letters are included as character entities; for example: “&#64336; &#64337; &#64338; &#64339;”. Importing the original Word or the HTML version to Kindle results in the leading bits being discarded, so these characters are displayed as P, Q, R and S instead of “ﭐ ﭑ ﭒ ﭓ” (Alef Wasla isolated form, Alef Wasla final form, Beeh isolated form and Beeh final form).
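
The arithmetic behind that substitution can be checked from the VBA Immediate window. This is an illustration only: it assumes that "leading bits being discarded" means only the low byte of each code point survives, and the procedure name is made up for the example.

```vba
' Illustration only: keeps the low 8 bits of each code point, on the
' assumption that this is what happens when the leading bits are dropped.
Sub ShowLowByteOfEntities()
    Dim codes As Variant
    Dim i As Long

    codes = Array(64336, 64337, 64338, 64339)   ' the four entities quoted above

    For i = LBound(codes) To UBound(codes)
        ' 64336 is &HFB50, so its low byte is &H50
        Debug.Print codes(i), Hex(codes(i)), Chr(codes(i) And &HFF)
    Next i
End Sub
```

Under that assumption the last column comes out as P, Q, R and S, matching what the Kindle displayed.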

I have tried Abdullah’s idea of saving some Arabic letters in a PNG file and creating an HTML file containing <p> … </p> <img src="Arabic.png"> <p> … </p>. The appearance of this file on my Kindle 2 is perfectly acceptable so this has the potential to be a solution. The question is: how can the necessary conversions be performed?

We need to extract each Arabic string from either the Word document or its HTML equivalent and import it into a program that can convert them to PNG files.

The only way that I know of automating this would be to copy each string to a slide within PowerPoint. With PowerPoint’s SaveAs option it is possible to save each slide as a separate PNG file. The slides are named: SLIDE1.PNG, SLIDE2.PNG, SLIDE3.PNG and so on in sequence which would allow a macro to relate the results to the original strings. It would then be possible to replace the Arabic strings in the HTML file with the image elements. None of this would be too difficult to automate but there is a problem with the slides all being the size of the PowerPoint page. The page could be made smallish but what we need is for each slide to be cropped to just bigger than that slide’s text. I cannot think of any way of automating this cropping.

Does anyone have a better approach than converting each Arabic phrase to a PNG file?

I have been looking for PNG editors with some sort of command line interface but can find nothing that would be easier than using PowerPoint. Does anyone know of an alternative to PowerPoint?

Does anyone have any suggestions for automating the cropping of each image? When a string is placed in a PowerPoint slide it is possible to set its width to, say, 6.5cm (which looks good on my Kindle) and get the height determined by PowerPoint. This could be saved for later use if anyone knows how to use it.

Implementing solution

Pending any suggestions for improving the approach described above, the following outlines how I would implement it.

I would not attempt to process the Word document. I would save it as a Web Page, Filtered HTML file, which is a required step on the way to creating a Kindle eBook, and process that.

Within the HTML file created from my test document, the Arabic phrase comes out as:

<p class="MsoNormal"></p>
<p class="MsoNormal" align="center" style="text-align:center"><span dir="RTL"
style="font-size:24.0pt;font-family:Arial">
&#64336;&#64337;&#64338;&#64339;&#64340;&#64341;
&#64342;&#64343;&#65153;&#65154;&#65276;&#65275;
&#65274;&#65273;&#65246;&#65226;&#65227;&#65228;
</span><span style="font-size:24.0pt"></span></p>
<p class="MsoNormal"></p>
<p class="MsoNormal"></p>

I assume Abdullah's document will result in something similar. Note 1: the above is a random collection of Arabic letters. Note 2: they are held left-to-right in reading sequence even though, when displayed or printed, they are read right-to-left.
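
If the entity strings need to be turned back into the characters themselves (for example, before the text is placed somewhere that will render it), something like the following might do. It is a sketch only: the function name is invented for the example, it uses the late-bound VBScript.RegExp object, and it assumes every entity is decimal and below 65536, which is the range ChrW accepts.

```vba
' Sketch: convert decimal entities such as &#64336; into characters.
' Assumes decimal entities only, each below 65536 (the limit of ChrW).
Function DecodeNumericEntities(ByVal s As String) As String
    Dim re As Object, m As Object
    Dim result As String
    Dim lastPos As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "&#(\d+);"

    lastPos = 1
    For Each m In re.Execute(s)
        ' FirstIndex is zero-based; copy the text before the entity,
        ' then append the character the entity stands for
        result = result & Mid$(s, lastPos, m.FirstIndex + 1 - lastPos) _
                        & ChrW(CLng(m.SubMatches(0)))
        lastPos = m.FirstIndex + m.Length + 1
    Next m

    ' Append whatever follows the last entity
    DecodeNumericEntities = result & Mid$(s, lastPos)
End Function
```

For instance, Len(DecodeNumericEntities("&#64336;&#64337;")) should come back as 2, one character per entity.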

The whole of this block will have to be replaced with something like:

<br><img src="xxxx.png"><br>

where the file xxxx.png holds an image of the Arabic text.

The file names, such as xxxx.png, could be systematic (A001.png, A002.png, ...) but I would have thought that transliterating the first ten or twenty characters of the phrase from the Arabic to English alphabets and using the result, with a numeric suffix, as the file name would be more convenient.

I would hold the records necessary to manage the process in an Excel worksheet. I would place the VBA code in the same workbook.

The steps in the conversion process that I envisage are:

1. VBA macro to extract Arabic strings from latest HTML file and add new strings to the Excel worksheet. (More about the Excel worksheet later.)
2. VBA macro to create PowerPoint file, with one slide per new string, and use SaveAs in PNG format to create one PNG file per slide before discarding the PowerPoint file.
3. Human to crop each PNG file. (There appears to be no way of automating the cropping so this task will be minimised by use of data in the Excel worksheet.)
4. VBA macro to rename each slide from SLIDEnnn.PNG to its permanent name and to record the permanent name in the Excel worksheet.
5. VBA macro to update the latest HTML file by replacing the block containing the Arabic phrase with the appropriate HTML IMG element.

The Excel worksheet needs two columns: Arabic phrase and PNG file name. If there is any risk of the worksheet being sorted between steps 2 and 4, we may need a sequence number as well.

Macro 1 will extract an Arabic phrase from the HTML file, look down the list in the worksheet for this phrase and add the phrase at the bottom if it is not already present.
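
As a rough sketch of Macro 1, the two routines below pull the contents of every span marked dir="RTL" out of an HTML string and append any unseen phrase to the worksheet. Several things here are assumptions rather than anything stated above: the routine names, that the RTL span is a reliable marker for the Arabic text, that phrases live in column A, and that row 1 is a header row. The phrases are compared cell by cell because they may be long strings of entities.

```vba
' Sketch: return the inner text of each <span dir="RTL" ...> in the HTML.
Function ExtractRtlPhrases(ByVal html As String) As Collection
    Dim re As Object, m As Object
    Dim phrase As String
    Dim result As New Collection

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "<span[^>]*dir=""?RTL""?[^>]*>([\s\S]*?)</span>"

    For Each m In re.Execute(html)
        ' Word breaks long entity runs across lines; join them up again
        phrase = Replace(Replace(m.SubMatches(0), vbCr, ""), vbLf, "")
        phrase = Trim$(phrase)
        If Len(phrase) > 0 Then result.Add phrase
    Next m

    Set ExtractRtlPhrases = result
End Function

' Sketch: add each phrase to column A unless it is already listed.
' Assumes row 1 is a header row.
Sub AddNewPhrases(ByVal ws As Worksheet, ByVal phrases As Collection)
    Dim p As Variant
    Dim r As Long, lastRow As Long
    Dim found As Boolean

    For Each p In phrases
        lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
        found = False
        For r = 2 To lastRow
            If ws.Cells(r, 1).Value = p Then
                found = True
                Exit For
            End If
        Next r
        If Not found Then ws.Cells(lastRow + 1, 1).Value = p
    Next p
End Sub
```

Reading the HTML file into a string is left out; any of the usual VBA file-reading approaches would serve.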

Macro 2 will look for phrases in the worksheet that do not have a PNG file name. These new phrases are the ones to be written to the PowerPoint presentation. That is, a phrase only goes into this process once.
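
Macro 2 might look something like the sketch below, run from the Excel workbook with PowerPoint driven through late binding. The assumptions are mine: the routine name, phrases in column A as real Unicode text (entity strings would need decoding first), PNG names in column B, a header in row 1, a 24 point font to match the HTML above, and a text box about 6.5 cm wide. When saving in PNG format, PowerPoint is expected to create a folder at the given path holding one image per slide.

```vba
' Sketch: one slide per phrase that has no PNG name yet, then save as PNG.
Sub ExportNewPhrasesAsSlides(ByVal ws As Worksheet, ByVal outFolder As String)
    ' PowerPoint constants, declared here because the code is late-bound
    Const PP_LAYOUT_BLANK As Long = 12      ' ppLayoutBlank
    Const PP_SAVE_AS_PNG As Long = 18       ' ppSaveAsPNG
    Const TEXT_HORIZONTAL As Long = 1       ' msoTextOrientationHorizontal
    Const WIDTH_PT As Single = 184          ' about 6.5 cm at 72 points per inch

    Dim ppApp As Object, pres As Object, shp As Object
    Dim r As Long, lastRow As Long, slideNo As Long

    Set ppApp = CreateObject("PowerPoint.Application")
    Set pres = ppApp.Presentations.Add

    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    For r = 2 To lastRow
        If Len(ws.Cells(r, 2).Value) = 0 Then
            slideNo = slideNo + 1
            Set shp = pres.Slides.Add(slideNo, PP_LAYOUT_BLANK) _
                          .Shapes.AddTextbox(TEXT_HORIZONTAL, 0, 0, WIDTH_PT, 50)
            shp.TextFrame.TextRange.Text = ws.Cells(r, 1).Value
            shp.TextFrame.TextRange.Font.Size = 24
        End If
    Next r

    ' Slides are created in worksheet order, so slide n belongs to the
    ' n-th row without a PNG name
    If slideNo > 0 Then pres.SaveAs outFolder, PP_SAVE_AS_PNG

    pres.Saved = True      ' avoid a save prompt when the file is discarded
    pres.Close
    ppApp.Quit
End Sub
```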

Task 3, cropping each PNG file, will be a pain. All I can say is that it will only be once per phrase.

Macro 4 will assume that the SLIDE001.PNG, SLIDE002.PNG, … are in the sequence of phrases without PNG files in the worksheet. If this might not be true (because the worksheet has been sorted) we will either need a sequence number or to retain the PowerPoint file. The macro will assign a unique name to each new phrase, record this name in the worksheet and rename the PNG file.
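
A minimal sketch of Macro 4 follows, using the systematic naming scheme (A001.png, A002.png, ...) rather than transliteration. It assumes the worksheet has not been sorted since the slides were exported, that phrases are in column A and PNG names in column B with a header in row 1, and that the exported files are called Slide1.PNG, Slide2.PNG and so on; adjust the pattern if PowerPoint names them differently. The routine name is hypothetical.

```vba
' Sketch: give each exported slide image its permanent name and record it.
Sub RenameSlideImages(ByVal ws As Worksheet, ByVal folder As String)
    Dim r As Long, lastRow As Long
    Dim slideNo As Long, nextId As Long
    Dim newName As String

    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    If lastRow < 2 Then Exit Sub            ' nothing below the header row

    ' Continue numbering after the names already recorded in column B
    nextId = Application.WorksheetFunction.CountA(ws.Range("B2:B" & lastRow)) + 1

    For r = 2 To lastRow
        If Len(ws.Cells(r, 2).Value) = 0 Then
            slideNo = slideNo + 1
            newName = "A" & Format$(nextId, "000") & ".png"
            ' Name ... As ... renames a file on disk
            Name folder & "\Slide" & slideNo & ".PNG" As folder & "\" & newName
            ws.Cells(r, 2).Value = newName
            nextId = nextId + 1
        End If
    Next r
End Sub
```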

Macro 5 creates a new copy of the latest HTML file using the contents of the worksheet to determine which phrase to replace with which PNG file.
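
One possible shape for Macro 5 is sketched below. It replaces each paragraph that contains an RTL span with a <br><img src="..."><br> element, looking the phrase up in a Scripting.Dictionary built from the worksheet. The function names, the column layout (phrase in A, PNG name in B, header in row 1) and the use of the RTL span as the marker are assumptions. Only the paragraph holding the span is replaced; the empty paragraphs around it are left alone.

```vba
' Sketch: build a phrase -> PNG name lookup from the worksheet.
Function LoadPngMap(ByVal ws As Worksheet) As Object
    Dim d As Object
    Dim r As Long, lastRow As Long

    Set d = CreateObject("Scripting.Dictionary")
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    For r = 2 To lastRow
        If Len(ws.Cells(r, 2).Value) > 0 Then
            d.Item(CStr(ws.Cells(r, 1).Value)) = CStr(ws.Cells(r, 2).Value)
        End If
    Next r
    Set LoadPngMap = d
End Function

' Sketch: swap each RTL paragraph for an IMG element.
Function ReplaceRtlBlocks(ByVal html As String, ByVal pngFor As Object) As String
    Dim re As Object, matches As Object, m As Object
    Dim i As Long
    Dim phrase As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "<p\b[^>]*>\s*<span[^>]*dir=""?RTL""?[^>]*>([\s\S]*?)</span>[\s\S]*?</p>"

    Set matches = re.Execute(html)
    ' Work backwards so earlier match positions stay valid after each splice
    For i = matches.Count - 1 To 0 Step -1
        Set m = matches.Item(i)
        phrase = Trim$(Replace(Replace(m.SubMatches(0), vbCr, ""), vbLf, ""))
        If pngFor.Exists(phrase) Then
            html = Left$(html, m.FirstIndex) _
                 & "<br><img src=""" & pngFor.Item(phrase) & """><br>" _
                 & Mid$(html, m.FirstIndex + m.Length + 1)
        End If
    Next i

    ReplaceRtlBlocks = html
End Function
```

Phrases with no PNG name recorded are left untouched, so the function can be rerun as more images become available.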

This process is not ideal but it will achieve the desired result and has no obvious complications. Any suggestions for improving it?

一身软味 2025-01-03 11:44:33

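Before you start following these instructions, press "Record" in the Microsoft Word macro editor so that you can see what the VBA code looks like.

I wonder whether it would be easier to convert the docx file to .rtf (Rich Text Format) and replace the line with an image there? Go to File > Save As and name the file "old.rtf", then replace the line with an image, use Save As again and name the result "new.rtf", then download Beyond Compare or your favorite diff program to see what changed. Doing this programmatically should be easy, if you choose to. I think text is easier to work with than Microsoft's binary formats, unless you can find a good library for modifying the doc or docx format.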

远昼 2025-01-03 11:44:33

Sub CopySelPasteAsPicture()
' Take a picture of a selection and paste it at the
' document end
    With Selection
        .CopyAsPicture
    End With
    ActiveDocument.Content.Select
    With Selection
        .Collapse Direction:=wdCollapseEnd
        .TypeParagraph
        .TypeParagraph
        .PasteSpecial DataType:=wdPasteMetafilePicture
    End With
End Sub
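
The macro above works on whatever is currently selected. A sketch that applies the same copy-as-picture idea to every paragraph in a given style, replacing the text in place with an inline picture, could look like this. It is untested against a large document, so try it on a copy first. The style name "ArabicText" is hypothetical, and the sketch assumes the style is applied to whole paragraphs, so a phrase spread over several paragraphs becomes several pictures.

```vba
' Sketch: replace the text of every paragraph in a given style with an
' inline picture of that text. Works from the end of the document
' backwards so that edits do not disturb paragraphs still to be visited.
Sub ReplaceStyledParagraphsWithPictures()
    Const STYLE_NAME As String = "ArabicText"   ' hypothetical: use your own style name
    Dim i As Long
    Dim rng As Range

    For i = ActiveDocument.Paragraphs.Count To 1 Step -1
        If ActiveDocument.Paragraphs(i).Style = STYLE_NAME Then
            Set rng = ActiveDocument.Paragraphs(i).Range
            ' Leave the paragraph mark in place so paragraphs do not merge
            rng.MoveEnd Unit:=wdCharacter, Count:=-1
            ' Skip empty paragraphs and ones that already hold a picture
            If Len(rng.Text) > 0 And rng.InlineShapes.Count = 0 Then
                rng.CopyAsPicture
                rng.PasteSpecial DataType:=wdPasteMetafilePicture, _
                                 Placement:=wdInLine
            End If
        End If
    Next i
End Sub
```

Because the original text is overwritten, keeping an untouched master copy of the document would make later revisions easier.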

