Is there a way to get data from multiple PDFs?

Posted 2025-01-22 03:50:38, 351 characters, 0 views, 0 comments

I have a large set of PDF files. Each file has a certain number of rows, say product references and prices, and these vary from file to file.
For example, the input PDF files could be:

File 1

Invoice number : 1

  1. Product a : 5000
  2. Product b : 3000
  3. Product d : 6000

File 2

Invoice number : 2

  1. Product b : 5000
  2. Product c : 1000

I need these written to an Excel sheet, with the product references and invoice number as headers and the values in rows.

I tried Power Query, but since the rows are not the same in every file, it didn't work.

Is there a way to do this?

Thanks in advance.

Comments (1)

孤独岁月 2025-01-29 03:50:38

Just add the invoice number to the product reference.

Assuming you have already imported the data, below are the Power Query steps that follow the import steps for your example.

  ...
  // Flag the rows that hold an invoice number
  #"Added Custom1" = Table.AddColumn(#"Changed Type", "test", each Text.Contains([Column1], "Invoice number")),
  // Extract the invoice number from the flagged rows, then fill it down onto the product rows below
  #"Added Custom" = Table.AddColumn(#"Added Custom1", "Invoice Number", each if [test] then Text.AfterDelimiter([Column1], ": ") else null),
  #"Filled Down" = Table.FillDown(#"Added Custom", {"Invoice Number"}),
  // For product rows, build the reference as "<product>_<invoice number>"
  #"Added Custom2" = Table.AddColumn(#"Filled Down", "Product reference", each if [test] = false then Text.BeforeDelimiter(Text.AfterDelimiter([Column1], ". "), " : ") & "_" & [Invoice Number] else null),
  // For product rows, take the value that follows ": "
  #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Values", each if [test] = false then Text.AfterDelimiter([Column1], ": ") else null),
  // Keep only the product rows and the three result columns
  #"Filtered Rows" = Table.SelectRows(#"Added Custom3", each ([Product reference] <> null)),
  #"Removed Other Columns2" = Table.SelectColumns(#"Filtered Rows", {"Invoice Number", "Product reference", "Values"})
in
  #"Removed Other Columns2"
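
The question asks for the product references to end up as headers, which the steps above do not do by themselves. One possible follow-up is a pivot step. The query below is only a minimal sketch under stated assumptions: the inline `Source` table is hypothetical sample data standing in for the result of the steps above, the step names `Typed` and `Pivoted` are made up for illustration, and the product reference is assumed to be the plain product name without the `_<invoice number>` suffix (with the suffix, every invoice would get its own set of columns).

```
let
    // Hypothetical sample data shaped like the result of the steps above,
    // but with plain product names (no "_<invoice number>" suffix)
    Source = #table(
        type table [#"Invoice Number" = text, #"Product reference" = text, Values = text],
        {
            {"1", "Product a", "5000"},
            {"1", "Product b", "3000"},
            {"1", "Product d", "6000"},
            {"2", "Product b", "5000"},
            {"2", "Product c", "1000"}
        }
    ),
    // Convert the values to whole numbers so they can be aggregated
    Typed = Table.TransformColumnTypes(Source, {{"Values", Int64.Type}}),
    // Turn each distinct product reference into a column, one row per invoice
    Pivoted = Table.Pivot(
        Typed,
        List.Distinct(Typed[#"Product reference"]),
        "Product reference",
        "Values",
        List.Sum
    )
in
    Pivoted
```

Products that do not appear on an invoice come out as null in that invoice's row, which is how a varying number of rows per file can still fit a single table. `List.Sum` is used here as the aggregation on the assumption that a product appearing twice on the same invoice should be added up; a different aggregation may suit your data better.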

If the answer works for you, please mark it as accepted and upvote it. If it does not work for you, add a comment below describing the problem you ran into. You will need to adapt the answer to your situation.
