PostgreSQL 字符串字符替换

发布于 2024-10-16 20:04:37 字数 1851 浏览 1 评论 0原文

我正在尝试编写一个词汇数据库来存储由词根和模式组成的单词,我想知道如何创建一个为我组合词根和模式的列,同时忽略不具有两个列的行已填充 SELECT 查询。

基本上,我有来自 PostgreSQL 数据库的输出:

SELECT root, root_i FROM tbl_roots NATURAL JOIN tbl_patterns NATURAL JOIN tbl_patterns_triliteral;

  root   | root_i
---------+--------
 {s,ş,m} | 1u2u3a
 {p,l,t} | 1u2u3a
 {t,m,s} | 1u2u3a
 {n,t,l} | 1u2u3a
 {s,ş,m} | 1a2oi3
 {p,l,t} | 1a2oi3
 {t,m,s} | 1a2oi3
 {n,t,l} | 1a2oi3
 {s,ş,m} | 1o2i3
 {p,l,t} | 1o2i3
 {t,m,s} | 1o2i3
 {n,t,l} | 1o2i3
 {s,ş,m} | a12e3
 {p,l,t} | a12e3
 {t,m,s} | a12e3
 {n,t,l} | a12e3
 {s,ş,m} | 1u2á3
 {p,l,t} | 1u2á3
 {t,m,s} | 1u2á3
 {n,t,l} | 1u2á3
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} | 1e2é3
 {p,l,t} | 1e2é3
 {t,m,s} | 1e2é3
 {n,t,l} | 1e2é3
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |

我想将其即时转换为类似这样的内容:

  root   | root_i | word_i
---------+--------+--------
 {s,ş,m} | 1u2u3a | suşuma
 {p,l,t} | 1u2u3a | puluta
 {t,m,s} | 1u2u3a | tumusa
 {n,t,l} | 1u2u3a | nutula
 {s,ş,m} | 1a2oi3 | saşoim
 {p,l,t} | 1a2oi3 | paloit
 {t,m,s} | 1a2oi3 | tamois
 {n,t,l} | 1a2oi3 | natoil
 {s,ş,m} | 1o2i3  | soşim
 {p,l,t} | 1o2i3  | polit
 {t,m,s} | 1o2i3  | tomis
 {n,t,l} | 1o2i3  | notil
 {s,ş,m} | a12e3  | asşem
 {p,l,t} | a12e3  | aplet
 {t,m,s} | a12e3  | atmes
 {n,t,l} | a12e3  | antel
 {s,ş,m} | 1u2á3  | suşám
 {p,l,t} | 1u2á3  | pulát
 {t,m,s} | 1u2á3  | tumás
 {n,t,l} | 1u2á3  | nutál
 {s,ş,m} | 1e2é3  | seşém
 {p,l,t} | 1e2é3  | pelét
 {t,m,s} | 1e2é3  | temés
 {n,t,l} | 1e2é3  | neşél

其中 word 列是通过替换 中的数字动态生成的root_i 列,其中包含 root 列中该数字索引中的字符。我还需要删除两列中都没有条目的查询行,以减少输出中的混乱。

谁能帮我设计一个 postgres 函数来合并字符[]和文本字符串?我需要的一点正则表达式不应该很复杂,但我不知道如何将其与查询混合,或者更好的是,将其转换为函数。

I'm trying to write a lexical database for storing words comprised of roots and patterns, and I was wondering how I could create a column that will combine the root and pattern for me, while ignoring rows which don't have both columns of the SELECT query populated.

Basically, I have this output from a PostgreSQL DB:

SELECT root, root_i FROM tbl_roots NATURAL JOIN tbl_patterns NATURAL JOIN tbl_patterns_triliteral;

  root   | root_i
---------+--------
 {s,ş,m} | 1u2u3a
 {p,l,t} | 1u2u3a
 {t,m,s} | 1u2u3a
 {n,t,l} | 1u2u3a
 {s,ş,m} | 1a2oi3
 {p,l,t} | 1a2oi3
 {t,m,s} | 1a2oi3
 {n,t,l} | 1a2oi3
 {s,ş,m} | 1o2i3
 {p,l,t} | 1o2i3
 {t,m,s} | 1o2i3
 {n,t,l} | 1o2i3
 {s,ş,m} | a12e3
 {p,l,t} | a12e3
 {t,m,s} | a12e3
 {n,t,l} | a12e3
 {s,ş,m} | 1u2á3
 {p,l,t} | 1u2á3
 {t,m,s} | 1u2á3
 {n,t,l} | 1u2á3
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} | 1e2é3
 {p,l,t} | 1e2é3
 {t,m,s} | 1e2é3
 {n,t,l} | 1e2é3
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |

And I want to convert it on-the-fly into something resembling this:

  root   | root_i | word_i
---------+--------+--------
 {s,ş,m} | 1u2u3a | suşuma
 {p,l,t} | 1u2u3a | puluta
 {t,m,s} | 1u2u3a | tumusa
 {n,t,l} | 1u2u3a | nutula
 {s,ş,m} | 1a2oi3 | saşoim
 {p,l,t} | 1a2oi3 | paloit
 {t,m,s} | 1a2oi3 | tamois
 {n,t,l} | 1a2oi3 | natoil
 {s,ş,m} | 1o2i3  | soşim
 {p,l,t} | 1o2i3  | polit
 {t,m,s} | 1o2i3  | tomis
 {n,t,l} | 1o2i3  | notil
 {s,ş,m} | a12e3  | asşem
 {p,l,t} | a12e3  | aplet
 {t,m,s} | a12e3  | atmes
 {n,t,l} | a12e3  | antel
 {s,ş,m} | 1u2á3  | suşám
 {p,l,t} | 1u2á3  | pulát
 {t,m,s} | 1u2á3  | tumás
 {n,t,l} | 1u2á3  | nutál
 {s,ş,m} | 1e2é3  | seşém
 {p,l,t} | 1e2é3  | pelét
 {t,m,s} | 1e2é3  | temés
 {n,t,l} | 1e2é3  | neşél

Where the word column is dynamically generated by replacing the digits in the root_i column with the character in that digit's index in the root column. I also need to drop queried rows that don't have an entry in both columns to reduce clutter in my output.

Can anyone help me devise a postgres function that will do the merging of character[] and text strings? The little bit of regex I need shouldn't be complicated, but I have no idea how to mix this with a query, or better yet, turn it into a function.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

烟燃烟灭 2024-10-23 20:04:37
select
  root,
  root_i,
  translate(root_i, "123", array_to_string(root,'')) as word_i
NATURAL JOIN tbl_patterns
NATURAL JOIN tbl_patterns_triliteral
where root is not null and root_i is not null;
select
  root,
  root_i,
  translate(root_i, "123", array_to_string(root,'')) as word_i
NATURAL JOIN tbl_patterns
NATURAL JOIN tbl_patterns_triliteral
where root is not null and root_i is not null;
暗藏城府 2024-10-23 20:04:37

我必须承认我不喜欢在 sql/plpgsql 函数中进行太多字符串操作。 Perl 有一个用生成的替换来替换正则表达式匹配的操作符,它工作得相当好:

create or replace function splice_to_word(root text, root_i text)
  returns text strict immutable language plperl as $
  my $roots = shift;
  my $template = shift;
  $template =~ s{(\d+)}{substr($roots,$1-1,1)}ge;
  return $template;
$;

postgresql 数组似乎没有被翻译成 Perl 列表,所以我假设根作为字符串传入,这有点令人讨厌,例如:

select root, root_i, splice_to_word(array_to_string(root, ''), root_i) from data

I must admit to not liking to do much string manipulation in sql/plpgsql functions. Perl has an operator for replacing regexp matches with generated replacements, which works fairly nicely:

create or replace function splice_to_word(root text, root_i text)
  returns text strict immutable language plperl as $
  my $roots = shift;
  my $template = shift;
  $template =~ s{(\d+)}{substr($roots,$1-1,1)}ge;
  return $template;
$;

There is some nastiness in that postgresql arrays don't seem to be translated into Perl lists, so I've assumed the roots are passed in as a string, e.g.:

select root, root_i, splice_to_word(array_to_string(root, ''), root_i) from data
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文