Extract data using an associated tag/unit

Posted 2025-01-27 12:54:00

I have been attempting to separate out key data hidden within sentences of text.

I have made some progress with the following code; however, it pulls undesired values too:

let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Input", type text}, {"Desired OutPut", type any}, {"Bonus", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each if Text.Contains([Input], "mmHg") then Text.Remove([Input],Text.ToList(Text.Remove([Input],{"0".."9","-", " ", "."}))) else null),
    #"Trimmed Text" = Table.TransformColumns(#"Added Custom",{{"Custom", Text.Trim, type text}})
in
    #"Trimmed Text"

As you can see, other numerical data is being pulled as well.

However, I think following these rules is perhaps the wrong way to go about this, and I wonder if it's possible to use mmHg as a tag to extract 'nearby' data. Ideally the value or range will be touching "mmHg"; however, there are instances where this isn't the case, hence the idea of nearby logic. I appreciate I could remove all data except numbers and mmHg, but I think this idea of tagging, if possible, will be very useful going forward. In my mind I'm thinking something like: if the text contains mmHg, then search for {0..9,"-"} within X characters (say 10 to the left). Is this possible?

As a sort of extra, I will attempt to extract the eye that this pressure is found in. Here I wish to use some sort of logic on a first come, first served basis. I think it is an okay assumption that the first pressure will relate to the first-mentioned eye in each sentence. I am unsure how to do this in M code. This may, however, warrant a separate question.


笛声青案梦长安 2025-02-03 12:54:00

I think you can utilize regular expressions here:

Step 1):

Add a custom function to the group of your table:

In this case I called it 'fnRegexExtr' (much like a previous question you asked). The source function I used came from here and is a regex-replace function.

(x,y,z)=>
let 
   Source = Web.Page(
                     "<script>var x="&"'"&x&"'"&";var z="&"'"&z&
                     "'"&";var y=new RegExp('"&y&"','g');
                     var b=x.replace(y,z);document.write(b);</script>")
                     [Data]{0}[Children]{0}[Children]{1}[Text]{0}
in 
   Source
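
One thing to keep in mind with this function: `x` is pasted into a single-quoted JavaScript string, so an apostrophe, a backslash or a line break in the input would break the generated script and surface as an error in that row. A minimal sketch of a wrapper that escapes those three cases before calling the function is shown here. The name `fnRegexExtrSafe` is hypothetical and not part of the original source function, and the sketch assumes the function above is saved as a query named `fnRegexExtr`.

```powerquery
// fnRegexExtrSafe: hypothetical wrapper, not part of the original answer.
// Assumes the function above exists as a query named fnRegexExtr.
(x as nullable text, y as text, z as text) as nullable text =>
let
    // Escape backslashes first, then apostrophes, so the text survives
    // being placed inside a single-quoted JavaScript string literal
    EscapedSlashes = Text.Replace(x, "\", "\\"),
    EscapedQuotes = Text.Replace(EscapedSlashes, "'", "\'"),
    // A raw line break would end the string literal early; use spaces instead
    NoCr = Text.Replace(EscapedQuotes, "#(cr)", " "),
    NoBreaks = Text.Replace(NoCr, "#(lf)", " "),
    // Pass nulls straight through instead of building a script from them
    Result = if x = null then null else fnRegexExtr(NoBreaks, y, z)
in
    Result
```

This only covers those three characters; other input that is meaningful to HTML or JavaScript is not handled by the sketch.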

Step 2):

On the 'Add Column' tab, invoke this custom function. Use the following parameters:

x - Input
y - (\\d+(?:-\\d+)?)\\D*mmHg|.
z - $1 (followed by a single space)
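
To read the pattern in Step 2: the first alternative captures a number or range (`\\d+(?:-\\d+)?`) that is followed by any run of non-digit characters and then `mmHg`; the second alternative (`.`) consumes every other character, for which `$1` is empty. Because `\\D*` is unbounded, the number may sit arbitrarily far from the unit as long as no other digit comes in between. If "nearby" should mean a fixed distance, as in the question's "say 10 to the left", the quantifier could be bounded instead. A minimal sketch, assuming the `fnRegexExtr` query from Step 1; the limit of 10 and the sample sentence are made up for illustration:

```powerquery
let
    // Hypothetical sample sentence, not taken from the original data
    Sample = "Seen 3 days ago. IOP 15-17 mmHg in the right eye",
    // \\D{0,10} allows at most 10 non-digit characters between
    // the captured number (or range) and "mmHg"
    Pattern = "(\\d+(?:-\\d+)?)\\D{0,10}mmHg|.",
    Result = fnRegexExtr(Sample, Pattern, "$1 ")
in
    Result
```

The backslashes stay doubled because the pattern passes through a JavaScript string literal before it reaches the `RegExp` constructor.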

Step 3):

We can add another column using the same function with different parameters:

x - Input
y - \\b(right|left)\\s*eye\\b|.
z - $1 (followed by a single space)

Please note the trailing space in the replacement value. Putting a space after capture group 1 keeps the individual matches apart, and PQ will auto-trim the result.

Step 4):

On the 'Transform' tab, I simply replaced errors with null values.

Step 5):

I edited the M-code to replace the spaces in between values with comma-space delimiters.

Result:

M-Code:

let
    Source = Excel.CurrentWorkbook(){[Name="Tabel1_2"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Input", type text}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "mmHg", each Text.Replace(fnRegexExtr([Input], "(\\d+(?:-\\d+)?)\\D*mmHg|.", "$1 ")," ",", ")),
    #"Invoked Custom Function1" = Table.AddColumn(#"Invoked Custom Function", "Side", each Text.Replace(fnRegexExtr([Input], "\\b(right|left)\\s*eye\\b|.", "$1 ")," ",", ")),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Invoked Custom Function1", {{"mmHg", null}, {"Side", null}})
in
    #"Replaced Errors"
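
For comparison, the "search within X characters to the left of the tag" idea from the question can also be sketched in plain M without the regex helper. This is only an illustration under stated assumptions: the function name `fnNearTag`, its parameters and the default window of 10 characters are hypothetical, it looks at the first occurrence of the tag only, and it keeps every digit, "-" and "." found in the window, so unrelated numbers that fall inside the window would be mixed in.

```powerquery
// fnNearTag: hypothetical helper, not part of the original answer
(input as nullable text, optional tag as nullable text, optional window as nullable number) as nullable text =>
let
    Tag = if tag = null then "mmHg" else tag,
    Window = if window = null then 10 else window,
    // Position of the first occurrence of the tag, ignoring case; -1 when absent
    Pos = if input = null then -1
          else Text.PositionOf(input, Tag, Occurrence.First, Comparer.OrdinalIgnoreCase),
    // Start of the look-back window, clamped to the beginning of the text
    Start = List.Max({0, Pos - Window}),
    // The characters immediately to the left of the tag
    Nearby = Text.Range(input, Start, Pos - Start),
    // Keep only digits, "-" and "." from that window
    Kept = Text.Select(Nearby, {"0".."9", "-", "."}),
    Result = if Pos < 0 then null else if Kept = "" then null else Kept
in
    Result
```

It could be invoked per row with something like `Table.AddColumn(#"Changed Type", "mmHg", each fnNearTag([Input]))`. For an input such as "IOP 15 mmHg" the window is "IOP 15 " and the result should be "15".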
