优化 PostgreSQL 9.0 中的转义 JSON

发布于 2024-10-17 12:06:22 字数 1880 浏览 2 评论 0原文

我目前在 PostgreSQL 中使用这个 JSON 转义函数作为未来原生 JSON 支持的替代。虽然它有效,但它也限制了我们的系统性能。我该如何优化它?也许某种查找数组?

CREATE OR REPLACE FUNCTION escape_json(i_text TEXT) 
RETURNS TEXT AS
$body$                                                  
DECLARE
  idx INTEGER;
  text_len INTEGER;   
  cur_char_unicode INTEGER;
  rtn_value TEXT := i_text;
BEGIN
  -- $Rev: $ --
  text_len = LENGTH(rtn_value);
  idx = 1; 

  WHILE (idx <= text_len) LOOP
    cur_char_unicode = ASCII(SUBSTR(rtn_value, idx, 1));

    IF cur_char_unicode > 255 THEN
      rtn_value = OVERLAY(rtn_value PLACING (E'\\u' || LPAD(UPPER(TO_HEX(cur_char_unicode)),4,'0')) FROM idx FOR 1);
      idx = idx + 5;
      text_len = text_len + 5;
    ELSE
      /* is the current character one of the following: " \ / bs ff nl cr tab */
      IF cur_char_unicode IN (34, 92, 47, 8, 12, 10, 13, 9) THEN
        rtn_value = OVERLAY(rtn_value PLACING (E'\\' || (CASE cur_char_unicode
                                                         WHEN 34 THEN '"'
                                                         WHEN 92 THEN E'\\'
                                                         WHEN 47 THEN '/'
                                                         WHEN  8 THEN 'b'
                                                         WHEN 12 THEN 'f'
                                                         WHEN 10 THEN 'n'
                                                         WHEN 13 THEN 'r'
                                                         WHEN  9 THEN 't'
                                                          END)
                                        )
                                FROM idx FOR 1);

        idx = idx + 1;
        text_len = text_len + 1;
      END IF;
    END IF;

    idx = idx + 1;
  END LOOP;                   

  RETURN rtn_value;
END;
$body$
LANGUAGE plpgsql;

I'm currently using this JSON escaping function in PostgreSQL as a stand in for future native JSON support. While it works, it's also limiting our systems performance. How can I go about optimizing it? Maybe some kind of lookup array?

CREATE OR REPLACE FUNCTION escape_json(i_text TEXT) 
RETURNS TEXT AS
$body$                                                  
DECLARE
  idx INTEGER;
  text_len INTEGER;   
  cur_char_unicode INTEGER;
  rtn_value TEXT := i_text;
BEGIN
  -- $Rev: $ --
  text_len = LENGTH(rtn_value);
  idx = 1; 

  WHILE (idx <= text_len) LOOP
    cur_char_unicode = ASCII(SUBSTR(rtn_value, idx, 1));

    IF cur_char_unicode > 255 THEN
      rtn_value = OVERLAY(rtn_value PLACING (E'\\u' || LPAD(UPPER(TO_HEX(cur_char_unicode)),4,'0')) FROM idx FOR 1);
      idx = idx + 5;
      text_len = text_len + 5;
    ELSE
      /* is the current character one of the following: " \ / bs ff nl cr tab */
      IF cur_char_unicode IN (34, 92, 47, 8, 12, 10, 13, 9) THEN
        rtn_value = OVERLAY(rtn_value PLACING (E'\\' || (CASE cur_char_unicode
                                                         WHEN 34 THEN '"'
                                                         WHEN 92 THEN E'\\'
                                                         WHEN 47 THEN '/'
                                                         WHEN  8 THEN 'b'
                                                         WHEN 12 THEN 'f'
                                                         WHEN 10 THEN 'n'
                                                         WHEN 13 THEN 'r'
                                                         WHEN  9 THEN 't'
                                                          END)
                                        )
                                FROM idx FOR 1);

        idx = idx + 1;
        text_len = text_len + 1;
      END IF;
    END IF;

    idx = idx + 1;
  END LOOP;                   

  RETURN rtn_value;
END;
$body$
LANGUAGE plpgsql;

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

入画浅相思 2024-10-24 12:06:22

坦白:我是 Google Summer of Code 2010 的学生,我打算尝试为 PostgreSQL 9.1 带来 JSON 支持。尽管我的代码功能相当完整,但它还没有完全准备好用于上游,并且 PostgreSQL 开发社区正在寻找一些替代实现。不过,随着春假的临近,我希望能在本周完成重写并进行最后的推动。

同时,您可以下载并安装 正在进行中的 JSON 数据类型模块,应该适用于 PostgreSQL 8.4.0 及更高版本。它是一个 PGXS 模块,因此您可以编译并安装它,而无需编译所有 PostgreSQL。但是,您将需要 PostgreSQL 服务器开发标头。

安装过程如下:

git clone git://git.postgresql.org/git/json-datatype.git
cd json-datatype/
USE_PGXS=1 make
sudo USE_PGXS=1 make install
psql -f json.sql <DBNAME1> # requires database superuser privileges

虽然构建和安装只需完成一次,但需要在您计划使用 JSON 数据类型的每个数据库上运行 json.sql。

安装后,您现在可以运行:

=> SELECT to_json(E'"quotes and \n newlines"\n'::TEXT);
            to_json             
--------------------------------
 "\"quotes and \n newlines\"\n"
(1 row)

请注意,这不会转义非 ASCII 字符。

Confession: I am the Google Summer of Code 2010 student who was going to try to bring JSON support to PostgreSQL 9.1. Although my code was fairly feature-complete , it wasn't completely ready for upstream, and the PostgreSQL development community was looking at some alternative implementations. However, with spring break coming up, I'm hoping to finish my rewrite and give it a final push this week.

In the mean time, you can download and install the work-in-progress JSON data type module, which should work on PostgreSQL 8.4.0 and up. It is a PGXS module, so you can compile and install it without having to compile all of PostgreSQL. However, you will need the PostgreSQL server development headers.

Installation goes something like this:

git clone git://git.postgresql.org/git/json-datatype.git
cd json-datatype/
USE_PGXS=1 make
sudo USE_PGXS=1 make install
psql -f json.sql <DBNAME1> # requires database superuser privileges

Although the build and install only needs to be done once, json.sql needs to be run on every database you plan to use the JSON data type on.

With that installed, you can now run:

=> SELECT to_json(E'"quotes and \n newlines"\n'::TEXT);
            to_json             
--------------------------------
 "\"quotes and \n newlines\"\n"
(1 row)

Note that this does not escape non-ASCII characters.

苏佲洛 2024-10-24 12:06:22

我所有的方法都归结为“用其他方式来做”:

  • 用其他语言编写,例如使用 pl/perl、pl/python、pl/ruby
  • 围绕用 C 编写的一些外部 JSON 库编写包装器 用 C 语言
  • 进行 JSON 转义客户端而不是查询中(假设你的客户端有一些良好的 JSON 转义支持)

根据我的经验,pl/pgsql 在这类事情上速度并不快 - 它的优势在于它对与数据库交换数据的整体支持,而不是像一种通用编程语言。

示例:

create or replace function escape_json_perl(text) returns text 
  strict immutable
  language plperlu as $
    use JSON;
    return JSON->new->allow_nonref->encode($_[0]);
  $;

快速测试表明这比 plpgsql 函数快 15 倍(尽管它返回您可能想要删除的值周围的引号)

All my approaches boil down to "do it some other way":

  • Write it in some other language, e.g. use pl/perl, pl/python, pl/ruby
  • Write a wrapper round some external JSON library written in C
  • Do the JSON escaping in the client rather than in the query (assuming your client has some good JSON escaping support)

In my experience pl/pgsql isn't fast at this sort of thing- its strength is in its integral support for exchanging data with the database, not as a general-purpose programming language.

Example:

create or replace function escape_json_perl(text) returns text 
  strict immutable
  language plperlu as $
    use JSON;
    return JSON->new->allow_nonref->encode($_[0]);
  $;

A quick test suggests this is on the order of 15x faster than the plpgsql function (although it returns quotes around the value which you probably want to strip off)

鲜血染红嫁衣 2024-10-24 12:06:22

我在这里找到了用 C 实现的 PostgreSQL 函数: http://code.google .com/p/pg-to-json-serializer/

我没有将它与您的 PLSQL 方法进行比较,但它应该比任何解释语言都要快。

另一种:http://miketeo.net/wp/index.html php/projects/json-functions-for-postgresql

I have found a PostgreSQL function implemented in C here : http://code.google.com/p/pg-to-json-serializer/

I have not compared it with your PLSQL method but it should be faster than any interpreted language.

Another one : http://miketeo.net/wp/index.php/projects/json-functions-for-postgresql

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文