What is wrong with this code when handling large "long" values?

Posted on 2024-12-04 04:27:30

I wrote a utility class to encode numbers in a custom numeral system with base N. As any self-respecting Java programmer would, I then wrote a unit test to check that the code works as expected (for any number I could throw at it).

It turned out that it worked for small numbers. However, for sufficiently large numbers, the tests failed.

The code:

public class EncodeUtil {

    private String symbols;

    private boolean isCaseSensitive;
    private boolean useDefaultSymbols;

    private int[] symbolLookup = new int[255];

    public EncodeUtil() {
        this(true);
    }

    public EncodeUtil(boolean isCaseSensitive) {
        this.useDefaultSymbols = true;
        setCaseSensitive(isCaseSensitive);
    }

    public EncodeUtil(boolean isCaseSensitive, String symbols) {
        this.useDefaultSymbols = false;
        setCaseSensitive(isCaseSensitive);
        setSymbols(symbols);
    }

    public void setSymbols(String symbols) {
        this.symbols = symbols;
        fillLookupArray();
    }

    public void setCaseSensitive(boolean isCaseSensitive) {
        this.isCaseSensitive = isCaseSensitive;
        if (useDefaultSymbols) {
            setSymbols(makeAlphaNumericString(isCaseSensitive));
        }
    }

    private void fillLookupArray() {
        //reset lookup array
        for (int i = 0; i < symbolLookup.length; i++) {
            symbolLookup[i] = -1;
        }
        for (int i = 0; i < symbols.length(); i++) {
            char c = symbols.charAt(i);
            if (symbolLookup[(int) c] == -1) {
                symbolLookup[(int) c] = i;
            } else {
                throw new IllegalArgumentException("duplicate symbol:" + c);
            }
        }
    }

    private static String makeAlphaNumericString(boolean caseSensitive) {
        StringBuilder sb = new StringBuilder(255);
        int caseDiff = 'a' - 'A';
        for (int i = 'A'; i <= 'Z'; i++) {
            sb.append((char) i);
            if (caseSensitive) sb.append((char) (i + caseDiff));
        }
        for (int i = '0'; i <= '9'; i++) {
            sb.append((char) i);
        }
        return sb.toString();
    }

    public String encodeNumber(long decNum) {
        return encodeNumber(decNum, 0);
    }

    public String encodeNumber(long decNum, int minLen) {
        StringBuilder result = new StringBuilder(20);
        long num = decNum;
        long mod = 0;
        int base = symbols.length();
        do {
            mod = num % base;
            result.append(symbols.charAt((int) mod));
            num = Math.round(Math.floor((num-mod) / base));
        } while (num > 0);
        if (result.length() < minLen) {
            for (int i = result.length(); i < minLen; i++) {
                result.append(symbols.charAt(0));
            }
        }
        return result.toString();
    }

    public long decodeNumber(String encNum) {
        if (encNum == null) return 0;
        if (!isCaseSensitive) encNum = encNum.toUpperCase();
        long result = 0;
        int base = symbols.length();
        long multiplier = 1;
        for (int i = 0; i < encNum.length(); i++) {
            char c = encNum.charAt(i);
            int pos = symbolLookup[(int) c];
            if (pos == -1) {
                String debugValue = encNum.substring(0, i) + "[" + c + "]";
                if (encNum.length()-1 > i) {
                    debugValue += encNum.substring(i + 1);
                }
                throw new IllegalArgumentException(
                    "invalid symbol '" + c + "' at position " 
                    + (i+1) + ": " + debugValue);
            } else {
                result += pos * multiplier;
                multiplier = multiplier * base;
            }
        }
        return result;
    }

    @Override
    public String toString() {
        return symbols;
    }

}

The test:

public class EncodeUtilTest {

    @Test
    public void testRoundTrip() throws Exception {
        //for some reason, numbers larger than this range will not be decoded correctly
        //maybe some bug in JVM with arithmetic with long values?
        //tried also BigDecimal, didn't make any difference
        //anyway, it is highly improbable that we ever need such large numbers
        long value = 288230376151711743L;
        test(value, new EncodeUtil());
        test(value, new EncodeUtil(false));
        test(value, new EncodeUtil(true, "1234567890qwertyuiopasdfghjklzxcvbnm"));
    }

    @Test
    public void testRoundTripMax() throws Exception {
        //this will fail, see above
        test(Long.MAX_VALUE, new EncodeUtil());
    }

    @Test
    public void testRoundTripGettingCloserToMax() throws Exception {
        //here we test different values, getting closer to Long.MAX_VALUE
        //this will fail, see above
        EncodeUtil util = new EncodeUtil();
        for (long i = 1000; i > 0; i--) {
            System.out.println(i);
            test(Long.MAX_VALUE / i, util);
        }
    }

    private void test(long number, EncodeUtil util) throws Exception {
        String encoded = util.encodeNumber(number);
        long result = util.decodeNumber(encoded);
        long diff = number - result;
        //System.out.println(number + " = " + encoded + " diff " + diff);
        assertEquals("original=" + number + ", result=" + result + ", encoded=" + encoded, 0, diff);
    }

}

Any ideas why things start failing when the values get large? I also tried BigInteger, but it did not seem to make a difference.

Comments (2)

梦在深巷 2024-12-11 04:27:30

You're using floating point maths in your encodeNumber method, which makes your code rely on the precision of the double type.

Replacing

num = Math.round(Math.floor((num-mod) / base));

with

num = (num - mod) / base;

makes the tests pass. Actually,

num = num / base;

should work just as well (thought experiment: what is 19 / 10 when / is integer division?).
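
To make the suggestion concrete, here is a minimal sketch of the encode/decode pair using only long arithmetic. The class name IntegerDivisionEncode and its fixed 36-symbol SYMBOLS constant are hypothetical stand-ins for the question's EncodeUtil, not its actual code. Like the original, it writes the least significant digit first and assumes non-negative input.

```java
public class IntegerDivisionEncode {

    // Hypothetical alphabet for illustration; the question's class builds its own.
    static final String SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Encodes a non-negative value, least significant digit first,
    // using only long arithmetic (no conversion to double).
    public static String encode(long num) {
        StringBuilder result = new StringBuilder(20);
        int base = SYMBOLS.length();
        do {
            result.append(SYMBOLS.charAt((int) (num % base)));
            num = num / base; // integer division already discards the remainder
        } while (num > 0);
        return result.toString();
    }

    // Decodes a string produced by encode().
    public static long decode(String encoded) {
        long result = 0;
        long multiplier = 1;
        int base = SYMBOLS.length();
        for (int i = 0; i < encoded.length(); i++) {
            result += SYMBOLS.indexOf(encoded.charAt(i)) * multiplier;
            multiplier *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        long value = Long.MAX_VALUE;
        String encoded = encode(value);
        // Prints the encoded string followed by whether the round trip matched.
        System.out.println(encoded + " -> " + (decode(encoded) == value));
    }
}
```

Because every intermediate value stays a long, the round trip should hold for the whole non-negative range, including Long.MAX_VALUE.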

请远离我 2024-12-11 04:27:30

You have a conversion to double in your code, which could be generating strange results for large values.

num = Math.round(Math.floor((num-mod) / base));

That would be my first port of call.
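
A small sketch of why that conversion matters: a double has a 53-bit significand, so not every long above 2^53 can be represented exactly. The class DoublePrecisionDemo and its helper viaDouble are hypothetical names; viaDouble simply mirrors the expression from the question, and base 62 is assumed because the default case-sensitive alphabet has 62 symbols.

```java
public class DoublePrecisionDemo {

    // Mirrors the question's expression: the long quotient is widened to double
    // when passed to Math.floor, then converted back to long by Math.round.
    public static long viaDouble(long num, int base) {
        long mod = num % base;
        return Math.round(Math.floor((num - mod) / base));
    }

    public static void main(String[] args) {
        // 2^53 + 1 is the smallest positive long with no exact double representation.
        long exact = (1L << 53) + 1;
        long roundTripped = (long) (double) exact;
        System.out.println(exact + " -> " + roundTripped);

        // Compare pure integer division with the detour through double.
        long big = Long.MAX_VALUE;
        System.out.println((big / 62) + " vs " + viaDouble(big, 62));
    }
}
```

In both printed pairs the two numbers differ, which is the kind of silent corruption that makes the round-trip test fail for large inputs while small inputs pass.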
