PDF asp.net acrobat

Extract text and images from a selected area or set of coordinates in a PDF

Posted on 2024-08-24 08:06:33

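I need to extract text and images from a specific area of a PDF file. The area may be a selection, a highlighted region, or a rectangle defined by a given set of coordinates.

Every approach I have found extracts all of the images and text from the PDF rather than only what sits at a specified location. I have tried iTextSharp, Syncfusion, and Aspose, but could not find a way to do this.

Any help would be appreciated. Could you share your ideas and suggestions on how to implement this in .NET?

Regards, Arun M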

Comments (1)

把昨日还给我 2024-08-31 08:06:33

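This code extracts images from a PDF (it enumerates every image in the document and sends the first one to the browser):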

using System;
using System.Web.UI;
using Bytescout.PDFExtractor;

namespace ExtractAllImages
{
    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            // This test file will be copied to the project directory on the pre-build event (see the project properties).
            // Resolve it to a physical path so it loads regardless of the process working directory.
            String inputFile = Server.MapPath("sample1.pdf");

            // Create Bytescout.PDFExtractor.ImageExtractor instance
            ImageExtractor extractor = new ImageExtractor();
            extractor.RegistrationName = "demo";
            extractor.RegistrationKey = "demo";

            // Load sample PDF document from the resolved path
            extractor.LoadDocumentFromFile(inputFile);

            Response.Clear();

            int i = 0;

            // Initialize image enumeration
            if (extractor.GetFirstImage())
            {
                do
                {
                    if (i == 0) // Write only the first image to the Response stream
                    {
                        string imageFileName = "image" + i + ".png";

                        // Send headers only; writing HTML into the same response
                        // would corrupt the binary image data.
                        Response.ContentType = "image/png";
                        Response.AddHeader("Content-Disposition", "inline;filename=" + imageFileName);

                        // Write the image bytes into the Response output stream
                        Response.BinaryWrite(extractor.GetCurrentImageAsArrayOfBytes());
                    }

                    i++;

                } while (extractor.GetNextImage()); // Advance image enumeration
            }

            Response.End();
        }
    }
}
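
The snippet above walks through every image in the document, so it does not cover the coordinate-based part of the question. For text only, a minimal sketch with iTextSharp 5.x (one of the libraries the question mentions) might look like the following. The class name RegionTextExample, the method ExtractTextInRegion, and its parameters are hypothetical names chosen for illustration. The sketch assumes the rectangle is given in PDF user space (points, origin at the bottom-left corner of the page) and it has not been run against the asker's files. It does not handle images in a region.

```csharp
using System;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper, not part of iTextSharp or Bytescout.
public static class RegionTextExample
{
    // Returns the text on one page whose chunks pass the region filter.
    // x, y, width, height are assumed to be in PDF points with the origin
    // at the bottom-left corner of the page (not top-left screen coordinates).
    public static string ExtractTextInRegion(string pdfPath, int pageNumber,
                                             float x, float y, float width, float height)
    {
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            // Rectangle that bounds the area of interest
            System.util.RectangleJ region = new System.util.RectangleJ(x, y, width, height);

            // Keep only the text chunks that the region filter accepts
            RenderFilter[] filters = { new RegionTextRenderFilter(region) };
            ITextExtractionStrategy strategy = new FilteredTextRenderListener(
                new LocationTextExtractionStrategy(), filters);

            // Page numbers are 1-based
            return PdfTextExtractor.GetTextFromPage(reader, pageNumber, strategy);
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A call such as RegionTextExample.ExtractTextInRegion(Server.MapPath("sample1.pdf"), 1, 70, 80, 420, 500) would return the filtered text of page 1. The filter works on whole text chunks, so a chunk that only partly overlaps the rectangle may be returned in full. If the coordinates come from a viewer that measures from the top-left corner, the y value has to be flipped against the page height first.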

