Providing a human-readable representation of an identifier?

Posted on 2024-11-16 14:21:14

As a follow-up to a previous question where I asked for a solution to a broken problem, I'm trying to find a way to express an arbitrary identifier in a "readable" way.

Context: we are working with entities (domain model objects from DDD), which have an identity. This identity (mapped to a database primary key) can be expressed as a string: '123', 'ABC'.

Some entities can have a compound identity, i.e. one composed of the identities of two or more other entities: array('123','ABC').

Sometimes, we want to pretty-print this identity, or to use it in a place where just a single string is allowed (for example, in an HTML <option> value). The process has to be predictable and reversible, i.e. there should be no ambiguity in how to reverse it back to its original state.

When a human needs to read this identity, for debugging purposes, it's easier to read 123, ABC, or 123~ABC than a:2:{i:0;s:3:"123";i:1;s:3:"ABC";}. That's why we don't want to use built-in functions such as serialize() or json_encode().

json_encode() does a pretty good job, but when it comes to using it in HTML, where quotes have to be properly encoded, it becomes quite unreadable:

<option value="[&quot;123&quot;,&quot;ABC&quot;]">

Whereas we could use a nice format like this one:

<option value="123~ABC">

When posting the HTML form, we have to be able to revert this encoded identity to its original state: array('123','ABC') to retrieve the correct entity.

Finally, it is perfectly acceptable for the format to become hard for a human to read if the identity contains characters other than letters and digits.

Some basic examples:

'123' => '123'
'ABC' => 'ABC'
array('123','ABC') => '123~ABC' (just an idea)

'string with non-alphanumeric, even non-àscìì char$' => ?

Any (more or less complicated) representation is acceptable for strings containing other chars. The resulting string should contain only ASCII chars, even if the original string contains non-ASCII chars. The whole process must be entirely reversible.

Any idea on how to do this?

Comments (3)

掩耳倾听 2024-11-23 14:21:15

Based on the feedback you gave in the comments, I would suggest that you encode the identifier atoms with urlencode or rawurlencode.

You can then compose the atoms by joining them with commas.

class Identifier {
    // Percent-encode each atom, then join the atoms with ", ".
    // rawurlencode() escapes commas and spaces, so the separator cannot occur inside an atom.
    public static function encode(array $identifier) {
        return implode(', ', array_map('rawurlencode', $identifier));
    }
    // Split on commas, strip the padding spaces, then percent-decode each atom.
    public static function decode($identifier) {
        return array_map('rawurldecode',
            array_map('trim', explode(',', $identifier))
        );
    }
}

$identifier = array('111', 'abc');
var_dump($identifier);

$encoded = Identifier::encode($identifier);
var_dump($encoded);

$decoded = Identifier::decode($encoded);
var_dump($decoded);
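
As a rough check of the two requirements from the question (ASCII-only output and a lossless round trip), here is a minimal sketch that runs the same percent-encoding idea on the question's non-ASCII example. The function names encodeIdentity and decodeIdentity are hypothetical stand-ins for the class above, and the scalar type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helpers (not from the answer); they mirror Identifier::encode/decode.
function encodeIdentity(array $atoms): string {
    // Percent-encode every atom, then join with ", ".
    return implode(', ', array_map('rawurlencode', $atoms));
}

function decodeIdentity(string $encoded): array {
    // Split on commas, strip the padding spaces, percent-decode each atom.
    return array_map('rawurldecode', array_map('trim', explode(',', $encoded)));
}

$identity = array('123', 'string with non-alphanumeric, even non-àscìì char$');
$encoded  = encodeIdentity($identity);

// The encoded form should contain printable ASCII only.
var_dump(preg_match('/^[\x20-\x7E]*$/', $encoded) === 1);

// Decoding should give back exactly the original array.
var_dump(decodeIdentity($encoded) === $identity);

// The value still goes through htmlspecialchars() when written into an attribute.
echo '<option value="' . htmlspecialchars($encoded) . '">', "\n";
```

One edge case to keep in mind with this sketch: an empty array and array('') both encode to an empty string, so those two inputs cannot be told apart after decoding.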

黑寡妇 2024-11-23 14:21:15

str_replace( array('[',']','"',',') ,
             array('','','','~'),
            json_encode($stuff)
);

Your question is utterly verbose and doesn't explain what you really want to achieve.

甚是思念 2024-11-23 14:21:15

You can use 2 special characters:

~ - delimiter

* - escape character (to escape a delimiter or an escape character itself)

Examples:

array('123','ABC') => 123~ABC
array('12*3','A~BC') => 12**3~A*~BC

You can choose different characters for the delimiter and the escape character. If the characters you select rarely occur in real identities, the encoded string will usually remain easy to read.
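
The delimiter-plus-escape scheme could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names encodeCompound and decodeCompound are hypothetical, '~' and '*' are hard-coded as in the examples, and the scalar type hints assume PHP 7 or later.

```php
<?php
// Sketch only: '~' is the delimiter and '*' the escape character.
function encodeCompound(array $atoms): string {
    $escaped = array_map(function ($atom) {
        // strtr() with an array replaces in a single pass, so '*' and '~'
        // are each escaped exactly once.
        return strtr($atom, array('*' => '**', '~' => '*~'));
    }, $atoms);
    return implode('~', $escaped);
}

function decodeCompound(string $encoded): array {
    $atoms   = array();
    $current = '';
    $length  = strlen($encoded);
    for ($i = 0; $i < $length; $i++) {
        $char = $encoded[$i];
        if ($char === '*' && $i + 1 < $length) {
            // Escape character: take the next byte literally.
            $current .= $encoded[++$i];
        } elseif ($char === '~') {
            // Unescaped delimiter: the current atom is complete.
            $atoms[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $atoms[] = $current;
    return $atoms;
}

echo encodeCompound(array('123', 'ABC')), "\n";    // prints 123~ABC
echo encodeCompound(array('12*3', 'A~BC')), "\n";  // prints 12**3~A*~BC
var_dump(decodeCompound('12**3~A*~BC') === array('12*3', 'A~BC')); // bool(true)
```

The decoder walks the string byte by byte, which is safe for UTF-8 input because '~' and '*' never appear inside a multi-byte sequence. Note that this sketch leaves non-ASCII bytes untouched, so the ASCII-only requirement from the question would need an additional encoding step.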
