Blazor WASM: loading and displaying large PDFs by splitting them

Posted on 2025-01-28 15:30:25

I'm working on a Blazor WASM app, and I want my users to be able to easily open PDF files at specific pages that contain additional information.
I cannot distribute those files myself or upload them to any kind of server. Each user has to provide the files themselves.

Because the files are up to 60 MB in size, I cannot convert the whole uploaded file to base64 and display it as described here.

However, I don't have to display the whole file; I could load just the needed page plus a few pages around it.

For that I tried using iText7's ExtractPageRange(). This answer indicates that I have to override the GetNextPdfWriter() method and store all streams in a collection.

    class ByteArrayPdfSplitter : PdfSplitter
    {
        public ByteArrayPdfSplitter(PdfDocument pdfDocument) : base(pdfDocument)
        {
        }

        // Called by iText for each output document; every writer gets its own stream.
        protected override PdfWriter GetNextPdfWriter(PageRange documentPageRange)
        {
            CurrentMemoryStream = new MemoryStream();
            UsedStreams.Add(CurrentMemoryStream);
            return new PdfWriter(CurrentMemoryStream);
        }

        public MemoryStream CurrentMemoryStream { get; private set; }

        public List<MemoryStream> UsedStreams { get; set; } = new List<MemoryStream>();
    }

Then I thought I could merge those streams and convert the result to base64:

    var file = loadedFiles.First();

    using (MemoryStream ms = new MemoryStream())
    {
        var rs = file.OpenReadStream(maxFileSize);

        await rs.CopyToAsync(ms);

        ms.Position = 0;

        // rs has to be copied into ms first, because the PdfReader constructor uses a
        // synchronous read that rs does not support (it throws an exception).
        PdfReader pdfReader = new PdfReader(ms);

        var document = new PdfDocument(pdfReader);
        var splitter = new ByteArrayPdfSplitter(document);

        var range = new PageRange();
        range.AddPageSequence(1, 10);

        var splitDoc = splitter.ExtractPageRange(range);

        // Edit: commented this out; it shouldn't have been here at all and leads to an exception.
        //splitDoc.Close();

        var outputMs = new MemoryStream();

        foreach (var usedMs in splitter.UsedStreams)
        {
            usedMs.Position = 0;
            outputMs.Position = outputMs.Length;
            await usedMs.CopyToAsync(outputMs);
        }

        var data = outputMs.ToArray();

        currentPdfContent = "data:application/pdf;base64,";
        currentPdfContent += Convert.ToBase64String(data);
        pdfLoaded = true;
    }

This, however, doesn't work.
Does anyone have a suggestion for how to get this working? Or maybe a simpler solution I could try?


Edit:

I took a closer look in the debugger, and it seems that the resulting stream outputMs is always empty. So the problem is probably in how I split the PDF.

Comments (1)

谢绝鈎搭 2025-02-04 15:30:26

After at least partially clearing up my misconception about what it means not to be able to access the file system from Blazor WASM, I managed to find a working solution.

        // Copy the uploaded file into the file system that is visible to the WASM app.
        string path = "test.pdf";
        var rs = file.OpenReadStream(maxFileSize);
        await using (var fs = new FileStream(path, FileMode.Create))
        {
            await rs.CopyToAsync(fs);
        }

        // Extract the wanted pages; MySplitter writes them to split.pdf.
        string range = "10 - 15";
        var pdfDocument = new PdfDocument(new PdfReader(path));
        var split = new MySplitter(pdfDocument);
        var result = split.ExtractPageRange(new PageRange(range));
        result.Close();

        // Read split.pdf back into memory and build the data URI.
        await using var ms = new MemoryStream();
        await using (var splitFs = new FileStream("split.pdf", FileMode.Open))
        {
            await splitFs.CopyToAsync(ms);
        }

        var data = ms.ToArray();

        var pdfContent = "data:application/pdf;base64,";
        pdfContent += System.Convert.ToBase64String(data);
        Console.WriteLine(pdfContent);

        currentPdfContent = pdfContent;

This uses the MySplitter class from this answer:

    class MySplitter : PdfSplitter
    {
        public MySplitter(PdfDocument pdfDocument) : base(pdfDocument)
        {
        }

        protected override PdfWriter GetNextPdfWriter(PageRange documentPageRange)
        {
            String toFile = "split.pdf";
            return new PdfWriter(toFile);
        }
    }
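
A purely in-memory variant may also work and would avoid the temporary files. The sketch below rests on two assumptions that are worth checking against your iText version: that iText only writes the extracted document to its stream when that document is closed (which would explain the empty outputMs in the question), and that the exception seen after splitDoc.Close() presumably came from touching the already closed MemoryStream (setting Position or copying from it). MemoryStream.ToArray() still works on a closed stream, so the bytes can be taken from there. InMemorySplitter and InMemorySplit.ExtractPages are hypothetical names, not part of iText or of the original post.

```csharp
using System.IO;
using iText.Kernel.Pdf;
using iText.Kernel.Utils;

// Hypothetical helper: keeps the stream of the most recently created writer.
class InMemorySplitter : PdfSplitter
{
    public MemoryStream Output { get; private set; }

    public InMemorySplitter(PdfDocument pdfDocument) : base(pdfDocument)
    {
    }

    protected override PdfWriter GetNextPdfWriter(PageRange documentPageRange)
    {
        Output = new MemoryStream();
        return new PdfWriter(Output);
    }
}

static class InMemorySplit
{
    // Returns the bytes of a new PDF that contains only the pages in `range`,
    // for example "10 - 15".
    public static byte[] ExtractPages(byte[] sourcePdf, string range)
    {
        var source = new PdfDocument(new PdfReader(new MemoryStream(sourcePdf)));
        var splitter = new InMemorySplitter(source);

        var part = splitter.ExtractPageRange(new PageRange(range));

        // Assumption: the output is only complete once the document is closed.
        part.Close();
        source.Close();

        // ToArray() is still valid after the MemoryStream has been closed.
        return splitter.Output.ToArray();
    }
}
```

The returned array could then be passed to Convert.ToBase64String as in the code above. Since ExtractPageRange produces a single document, there should be no need to concatenate several streams; appending the bytes of separate PDF files to each other would not give a valid PDF anyway.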