Need documentation on extracting text from images using OneNote Interop

Posted on 2024-12-01 19:40:10

I need to write a simple program that extracts text from an image using OneNote Interop. Could anyone point me to suitable documentation for this?


Comments (1)

杀お生予夺 2024-12-08 19:40:10

OneNote stores the text recognized by its OCR in the one:OCRText element of the page XML. For example:

<one:Page ...>
    ...
    <one:Image ...>
        ...
        <one:OCRData lang="en-US">
            <one:OCRText><![CDATA[This is some sample text]]></one:OCRText>
        </one:OCRData>
    </one:Image>
</one:Page>
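
The structure above can be read without OneNote running, which is handy for trying out the parsing step on its own. The following is a minimal sketch, not a definitive implementation: the class name OcrXmlDemo, the method ExtractOcr, and the embedded sample XML are all made up for illustration. It pairs each OCRText value with the lang attribute of its parent OCRData element, and it takes the namespace from the root element rather than assuming a particular schema year.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class OcrXmlDemo
{
    // Hypothetical sample page XML, trimmed to the OCR-related elements only.
    const string SamplePageXml = @"<one:Page xmlns:one=""http://schemas.microsoft.com/office/onenote/2010/onenote"">
  <one:Image>
    <one:OCRData lang=""en-US"">
      <one:OCRText><![CDATA[This is some sample text]]></one:OCRText>
    </one:OCRData>
  </one:Image>
</one:Page>";

    // Returns one "lang: text" string per OCRData element found in the page XML.
    public static string[] ExtractOcr(string pageXml)
    {
        XDocument doc = XDocument.Parse(pageXml);

        // Read the namespace from the root element instead of hard-coding it.
        XNamespace one = doc.Root.Name.Namespace;

        return doc.Descendants(one + "OCRData")
                  .Select(d => (string)d.Attribute("lang") + ": " +
                               (string)d.Element(one + "OCRText"))
                  .ToArray();
    }

    static void Main()
    {
        foreach (string line in ExtractOcr(SamplePageXml))
            Console.WriteLine(line); // prints "en-US: This is some sample text"
    }
}
```

In a real program the string passed to ExtractOcr would be the page XML returned by OneNote rather than a constant.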

You can inspect this XML with a tool called OMSPY, which shows the XML behind OneNote pages - http://blogs.msdn.com/b/johnguin/archive/2011/07/28/onenote-spy-omspy-for-onenote-2010.aspx

To extract the text, use the OneNote COM interop, as you suggested. For example:

using System.Linq;
using System.Xml.Linq;
using Microsoft.Office.Interop.OneNote;

// Instantiate OneNote (use "new Application()" instead if interop types are embedded)
ApplicationClass onApp = new ApplicationClass();

// Get the XML of the page with the given ID
string xml;
onApp.GetPageContent("put the page id here", out xml);

// Load it into an XML document (XDocument is in System.Xml.Linq)
XDocument xDoc = XDocument.Parse(xml);

// OneNote's XML namespace (this URI is for OneNote 2010)
XNamespace one = "http://schemas.microsoft.com/office/onenote/2010/onenote";

// Collect the text of every OCRText element on the page
string[] ocrText = xDoc.Descendants(one + "OCRText").Select(x => x.Value).ToArray();
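
GetPageContent needs a page ID, and one way to get those IDs is from the notebook hierarchy XML, where each page appears as a one:Page element with ID and name attributes. The sketch below only covers the parsing side so that it runs without OneNote: the class PageIdDemo, the method ListPages, and the sample hierarchy XML (including its placeholder IDs) are hypothetical, and real hierarchy XML carries more attributes than shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class PageIdDemo
{
    // Hypothetical, heavily trimmed hierarchy XML with placeholder IDs.
    // With OneNote installed, the XML would instead come from a call such as:
    //   onApp.GetHierarchy(null, HierarchyScope.hsPages, out hierarchyXml);
    const string SampleHierarchyXml = @"<one:Notebooks xmlns:one=""http://schemas.microsoft.com/office/onenote/2010/onenote"">
  <one:Notebook name=""Work"" ID=""{NB-1}"">
    <one:Section name=""Scans"" ID=""{SEC-1}"">
      <one:Page name=""Receipt"" ID=""{PAGE-1}"" />
      <one:Page name=""Whiteboard"" ID=""{PAGE-2}"" />
    </one:Section>
  </one:Notebook>
</one:Notebooks>";

    // Returns (page ID, page name) pairs in document order.
    // A list is used rather than a dictionary because page names can repeat.
    public static List<KeyValuePair<string, string>> ListPages(string hierarchyXml)
    {
        XDocument doc = XDocument.Parse(hierarchyXml);
        XNamespace one = doc.Root.Name.Namespace;

        return doc.Descendants(one + "Page")
                  .Select(p => new KeyValuePair<string, string>(
                      (string)p.Attribute("ID"),
                      (string)p.Attribute("name")))
                  .ToList();
    }

    static void Main()
    {
        foreach (KeyValuePair<string, string> page in ListPages(SampleHierarchyXml))
            Console.WriteLine(page.Key + " " + page.Value);
    }
}
```

Each ID returned this way can be passed as the first argument to GetPageContent in place of the placeholder string.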

See the "Application Interface" docs on MSDN for more info - http://msdn.microsoft.com/en-us/library/gg649853.aspx
