Shortest way to represent a UInt64 as a string

Posted 2024-12-16 00:43:51 · 714 words · 2 views · 0 comments

I get a possibly large number (up to UInt64.MaxValue: 18446744073709551615) as a normal base-10 number.
This number will eventually become a filename: 12345678945768.txt

Since filenames on Windows aren't limited to numerical digits, I would like to "compress" this into a shorter string, but I need to make sure the string can be mapped back to the number.

For smaller numbers such as 0001365555, hex is much shorter than anything else.
Everything I've found so far states that Base64 would be shortest, but it isn't.

So far I've tried this:

// Decimal: "18446744073709551615" (20 characters)
UInt64 i = UInt64.MaxValue; // or a smaller value such as 0001365555

// Base64 of the raw bytes: "//////////8=" (12 characters)
string encoded = Convert.ToBase64String(BitConverter.GetBytes(i));

// Hex: "FFFFFFFFFFFFFFFF" (16 characters)
string hexed = i.ToString("X");

// Base64 of the decimal text: "MTg0NDY3NDQwNzM3MDk1NTE2MTU=" (28 characters)
string utf = Convert.ToBase64String(System.Text.Encoding.ASCII.GetBytes(i.ToString()));

Is there a better way to "compress" an integer, similar to hex, but using digits 00-zz rather than just 00-FF?

Thanks in advance!

Comments (5)

就是爱搞怪 2024-12-23 00:43:51

Everything I've found so far states that Base64 would be shortest, but it isn't.

You don't want to use Base64. Base64 encoded text can use the / character, which is disallowed in file names on Windows. You need to come up with something else.

What else?

Well, you could write your own base conversion, perhaps something like this:

// Requires: using System.Text;
public static string Convert(ulong number)
{
    var validCharacters = "qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM1234567890!@#$%^&()_-";
    char[] charArray = validCharacters.ToCharArray();
    var buffer = new StringBuilder();
    var quotient = number;
    ulong remainder;
    // do/while so that 0 encodes as a single character instead of an empty string
    do
    {
        remainder = quotient % (ulong)charArray.LongLength;
        quotient = quotient / (ulong)charArray.LongLength;
        // Digits come out least significant first, so prepend each one
        buffer.Insert(0, charArray[remainder].ToString());
    } while (quotient != 0);
    return buffer.ToString();
}

This is a "base-73" result. The more characters in validCharacters, the shorter the output will be. Feel free to add more, so long as they are legal characters in your file system.
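
One caveat when choosing the alphabet: Windows file systems normally compare names case-insensitively, so two encoded values that differ only in letter case could end up pointing at the same file. A minimal sketch of a sanity check for a candidate alphabet follows. The class and method names are hypothetical and not part of the answer; Path.GetInvalidFileNameChars is the standard .NET call and reports the invalid characters for the platform the code runs on.

```csharp
using System;
using System.IO;
using System.Linq;

public static class AlphabetCheck
{
    // Hypothetical helper: true when every character is allowed in a file name
    // on the current platform and no two characters differ only by case.
    public static bool IsSafeAlphabet(string alphabet)
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        bool allLegal = alphabet.All(c => Array.IndexOf(invalid, c) < 0);
        bool distinctIgnoringCase =
            alphabet.Select(c => char.ToUpperInvariant(c)).Distinct().Count() == alphabet.Length;
        return allLegal && distinctIgnoringCase;
    }
}
```

Under this check the 73-character alphabet above is rejected because it contains both cases of each letter, while a 36-character alphabet of uppercase letters and digits passes.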

What is your allowed character set? If you could identify 7132 different Unicode characters that were safe to use, you could encode a 64-bit number as five Unicode characters. On the other hand, not all file systems will support such characters. If you could identify 139 legal characters, you could compress the data to a nine-character string. With 85, you could use a ten-character string.
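
The arithmetic behind those figures can be checked with a short sketch: an alphabet of n characters needs the smallest k such that n^k >= 2^64. The class and method names below are illustrative only.

```csharp
using System;
using System.Numerics;

public static class EncodedLength
{
    // Smallest number of digits such that radix^digits >= 2^64,
    // i.e. enough to represent every possible ulong value.
    public static int DigitsNeeded(int radix)
    {
        if (radix < 2) throw new ArgumentOutOfRangeException(nameof(radix));
        BigInteger limit = BigInteger.Pow(2, 64);
        BigInteger capacity = BigInteger.One;
        int digits = 0;
        while (capacity < limit)
        {
            capacity *= radix;
            digits++;
        }
        return digits;
    }
}
```

For example, DigitsNeeded(85) gives 10 and DigitsNeeded(139) gives 9, matching the numbers quoted above, while hex (16) needs 16 digits and a 36-character alphabet needs 13.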

杀お生予夺 2024-12-23 00:43:51

You misused Base64.

System.Text.Encoding.ASCII.GetBytes(i.ToString())

This produces a byte sequence that contains the base-10 text of the integer and then encodes that text again in Base64. That's obviously inefficient.

You need to get the raw bytes of your integer and encode them with base64. Which encoding is the most efficient depends on how many characters you want to allow. If you want the sho

And you should trim 0 bytes on one side of the array.

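// input is the ulong value to encode
var bytes = BitConverter.GetBytes(input);
// Find the highest non-zero byte; keep at least one byte so 0 is still encoded
int len = 1;
for (int i = 7; i >= 0; i--)
{
  if (bytes[i] != 0)
  {
    len = i + 1;
    break;
  }
}
// '/' is not allowed in file names, so swap it for '-'
string s = Convert.ToBase64String(bytes, 0, len).Replace('/', '-');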

Note that this will not work as expected on big-endian systems.
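
As a rough sketch of the same idea with the reverse direction included, the variant below builds the little-endian bytes with shifts rather than BitConverter, so it does not depend on the machine's byte order. That is a swapped-in technique, not what the answer's snippet does, and the class and method names are hypothetical.

```csharp
using System;

public static class TrimmedBase64
{
    public static string Encode(ulong value)
    {
        // Emit bytes least significant first, independent of machine endianness.
        byte[] bytes = new byte[8];
        int len = 1; // keep at least one byte so that 0 still encodes to something
        for (int i = 0; i < 8; i++)
        {
            bytes[i] = (byte)(value >> (8 * i));
            if (bytes[i] != 0) len = i + 1;
        }
        // '/' is not allowed in file names; '-' is not in the standard Base64 alphabet.
        return Convert.ToBase64String(bytes, 0, len).Replace('/', '-');
    }

    public static ulong Decode(string text)
    {
        byte[] bytes = Convert.FromBase64String(text.Replace('-', '/'));
        if (bytes.Length > 8) throw new FormatException("Too many bytes for a UInt64.");
        ulong value = 0;
        for (int i = 0; i < bytes.Length; i++)
            value |= (ulong)bytes[i] << (8 * i);
        return value;
    }
}
```

With this sketch, TrimmedBase64.Encode(ulong.MaxValue) yields "----------8=", and Decode maps it back to the original value.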

But perhaps you should avoid byte encodings altogether, and just use integer encodings with a higher base.

A simple version might be:

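string digitChars = "0123..."; // the full digit alphabet goes here
string result = "";
ulong radix = (ulong)digitChars.Length;
while (i != 0) // note: i == 0 produces an empty string
{
  int digit = (int)(i % radix);
  i /= radix;
  // digits come out least significant first, so prepend
  result = digitChars[digit] + result;
}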
葵雨 2024-12-23 00:43:51

Here's some code that uses vcsjones's answer above, but also includes the reverse conversion. As in his answer, feel free to add more characters if you need to reduce the string size. The characters below produce a string of length 13 for ulong.MaxValue.

private const string _conversionCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

public static string UlongToCompressedString(ulong number)
{
    char[] charArray = _conversionCharacters.ToCharArray();
    var buffer = new System.Text.StringBuilder();
    var quotient = number;
    ulong remainder;
    do
    {
        remainder = quotient % (ulong)charArray.LongLength;
        quotient = quotient / (ulong)charArray.LongLength;
        buffer.Insert(0, charArray[remainder].ToString());
    } while (quotient != 0);
    return buffer.ToString();
}

public static ulong? CompressedStringToULong(string compressedNumber)
{
    if (compressedNumber == null)
        return null;

    if (compressedNumber.Length == 0)
        return 0;
    
    ulong result   = 0;
    int   baseNum  = _conversionCharacters.Length;
    ulong baseMult = 1;
    
    for (int i=compressedNumber.Length-1; i>=0; i--)
    {
        int cPos = _conversionCharacters.IndexOf(compressedNumber[i]);
        if (cPos < 0)
            return null;
        result += baseMult * (ulong)cPos;
        baseMult *= (ulong)baseNum;
    }

    return result;
}
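
One thing the decoder above does not do is detect input that is too large for a ulong: the arithmetic wraps silently. A possible variant, sketched here under the same 36-character alphabet, accumulates the digits from the most significant end and uses checked arithmetic to return null on overflow. The class name and the choice to return null in that case are assumptions for illustration.

```csharp
using System;

public static class CheckedBase36
{
    private const string Digits = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Returns null for null input, unknown characters, or values above ulong.MaxValue.
    // An empty string decodes to 0, as in the code above.
    public static ulong? Decode(string text)
    {
        if (text == null) return null;
        ulong radix = (ulong)Digits.Length;
        ulong result = 0;
        foreach (char c in text)
        {
            int digit = Digits.IndexOf(c);
            if (digit < 0) return null;
            try
            {
                // Horner's method: shift the running value up one digit, then add
                result = checked(result * radix + (ulong)digit);
            }
            catch (OverflowException)
            {
                return null;
            }
        }
        return result;
    }
}
```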