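What is wrong with this code when handling large long values?
I wrote a utility class to encode numbers in a custom numeral system with base N. As any self-respecting Java programmer would, I then wrote a unit test to check that the code works as expected (for any number I could throw at it).
It turned out that for small numbers it worked. However, for sufficiently large numbers, the tests failed.
The code: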
public class EncodeUtil {
    private String symbols;
    private boolean isCaseSensitive;
    private boolean useDefaultSymbols;
    private int[] symbolLookup = new int[255];

    public EncodeUtil() {
        this(true);
    }

    public EncodeUtil(boolean isCaseSensitive) {
        this.useDefaultSymbols = true;
        setCaseSensitive(isCaseSensitive);
    }

    public EncodeUtil(boolean isCaseSensitive, String symbols) {
        this.useDefaultSymbols = false;
        setCaseSensitive(isCaseSensitive);
        setSymbols(symbols);
    }

    public void setSymbols(String symbols) {
        this.symbols = symbols;
        fillLookupArray();
    }

    public void setCaseSensitive(boolean isCaseSensitive) {
        this.isCaseSensitive = isCaseSensitive;
        if (useDefaultSymbols) {
            setSymbols(makeAlphaNumericString(isCaseSensitive));
        }
    }

    private void fillLookupArray() {
        //reset lookup array
        for (int i = 0; i < symbolLookup.length; i++) {
            symbolLookup[i] = -1;
        }
        for (int i = 0; i < symbols.length(); i++) {
            char c = symbols.charAt(i);
            if (symbolLookup[(int) c] == -1) {
                symbolLookup[(int) c] = i;
            } else {
                throw new IllegalArgumentException("duplicate symbol:" + c);
            }
        }
    }

    private static String makeAlphaNumericString(boolean caseSensitive) {
        StringBuilder sb = new StringBuilder(255);
        int caseDiff = 'a' - 'A';
        for (int i = 'A'; i <= 'Z'; i++) {
            sb.append((char) i);
            if (caseSensitive) sb.append((char) (i + caseDiff));
        }
        for (int i = '0'; i <= '9'; i++) {
            sb.append((char) i);
        }
        return sb.toString();
    }

    public String encodeNumber(long decNum) {
        return encodeNumber(decNum, 0);
    }

    public String encodeNumber(long decNum, int minLen) {
        StringBuilder result = new StringBuilder(20);
        long num = decNum;
        long mod = 0;
        int base = symbols.length();
        do {
            mod = num % base;
            result.append(symbols.charAt((int) mod));
            num = Math.round(Math.floor((num-mod) / base));
        } while (num > 0);
        if (result.length() < minLen) {
            for (int i = result.length(); i < minLen; i++) {
                result.append(symbols.charAt(0));
            }
        }
        return result.toString();
    }

    public long decodeNumber(String encNum) {
        if (encNum == null) return 0;
        if (!isCaseSensitive) encNum = encNum.toUpperCase();
        long result = 0;
        int base = symbols.length();
        long multiplier = 1;
        for (int i = 0; i < encNum.length(); i++) {
            char c = encNum.charAt(i);
            int pos = symbolLookup[(int) c];
            if (pos == -1) {
                String debugValue = encNum.substring(0, i) + "[" + c + "]";
                if (encNum.length()-1 > i) {
                    debugValue += encNum.substring(i + 1);
                }
                throw new IllegalArgumentException(
                    "invalid symbol '" + c + "' at position "
                    + (i+1) + ": " + debugValue);
            } else {
                result += pos * multiplier;
                multiplier = multiplier * base;
            }
        }
        return result;
    }

    @Override
    public String toString() {
        return symbols;
    }
}
The test:
public class EncodeUtilTest {
    @Test
    public void testRoundTrip() throws Exception {
        //for some reason, numbers larger than this range will not be decoded correctly
        //maybe some bug in JVM with arithmetic with long values?
        //tried also BigDecimal, didn't make any difference
        //anyway, it is highly improbable that we ever need such large numbers
        long value = 288230376151711743L;
        test(value, new EncodeUtil());
        test(value, new EncodeUtil(false));
        test(value, new EncodeUtil(true, "1234567890qwertyuiopasdfghjklzxcvbnm"));
    }

    @Test
    public void testRoundTripMax() throws Exception {
        //this will fail, see above
        test(Long.MAX_VALUE, new EncodeUtil());
    }

    @Test
    public void testRoundTripGettingCloserToMax() throws Exception {
        //here we test different values, getting closer to Long.MAX_VALUE
        //this will fail, see above
        EncodeUtil util = new EncodeUtil();
        for (long i = 1000; i > 0; i--) {
            System.out.println(i);
            test(Long.MAX_VALUE / i, util);
        }
    }

    private void test(long number, EncodeUtil util) throws Exception {
        String encoded = util.encodeNumber(number);
        long result = util.decodeNumber(encoded);
        long diff = number - result;
        //System.out.println(number + " = " + encoded + " diff " + diff);
        assertEquals("original=" + number + ", result=" + result + ", encoded=" + encoded, 0, diff);
    }
}
Any ideas why things start failing when the values get large? I also tried BigInteger, but it did not seem to make a difference.
Comments (2)
You're using floating-point maths in your encodeNumber method, which makes your code rely on the precision of the double type.
Replacing the Math.round(Math.floor((num-mod) / base)) expression with plain integer division makes the tests pass. Actually, dividing num by base directly, without subtracting mod first, should work just as well (thought experiment: what is 19 / 10 when / is integer division?).
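For illustration, here is a minimal sketch of the encode loop with the division kept entirely in long arithmetic. IntegerBaseCodec, its static encode/decode helpers, and the 36-symbol alphabet used in main are hypothetical stand-ins for this example, not the original EncodeUtil API; rejecting negative input is a simplifying assumption.

```java
// Sketch only: IntegerBaseCodec is a hypothetical name, not the original EncodeUtil.
final class IntegerBaseCodec {
    private IntegerBaseCodec() {}

    /** Encodes a non-negative value, least significant digit first (same order as the question's code). */
    static String encode(long value, String symbols) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not supported in this sketch");
        }
        int base = symbols.length();
        StringBuilder result = new StringBuilder(20);
        long num = value;
        do {
            result.append(symbols.charAt((int) (num % base)));
            num = num / base; // long / int is integer division: the remainder is simply dropped
        } while (num > 0);
        return result.toString();
    }

    /** Decodes a string produced by encode, using the same symbol string. */
    static long decode(String encoded, String symbols) {
        int base = symbols.length();
        long result = 0;
        long multiplier = 1;
        for (int i = 0; i < encoded.length(); i++) {
            int pos = symbols.indexOf(encoded.charAt(i));
            if (pos < 0) {
                throw new IllegalArgumentException("invalid symbol: " + encoded.charAt(i));
            }
            result += pos * multiplier;
            multiplier *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        String symbols = "0123456789abcdefghijklmnopqrstuvwxyz"; // assumed alphabet for the demo
        long[] samples = {0L, 19L, 288230376151711743L, Long.MAX_VALUE};
        for (long n : samples) {
            String enc = encode(n, symbols);
            System.out.println(n + " -> " + enc + " -> " + decode(enc, symbols));
        }
    }
}
```

Because no value ever passes through a double, the round trip no longer depends on the 53-bit precision of that type. The digits come out least significant first, so 19 in base 10 encodes as "91".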
You have a conversion to double in your code, which could be generating strange results for large values.
That would be my first port of call.
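To see the effect in isolation, this small sketch pushes a few long values through the same Math.floor / Math.round round trip as the question's encodeNumber. DoublePrecisionDemo and throughDouble are made-up names for the example. A double has a 53-bit significand, so every long up to 2^53 survives the trip, while larger values may not.

```java
// Sketch only: DoublePrecisionDemo and throughDouble are hypothetical names for this illustration.
final class DoublePrecisionDemo {
    private DoublePrecisionDemo() {}

    /** Widens the long to double for Math.floor, then rounds back to long. */
    static long throughDouble(long value) {
        return Math.round(Math.floor(value));
    }

    public static void main(String[] args) {
        long limit = 1L << 53;                   // 2^53, exactly representable as a double
        long justAbove = limit + 1;              // needs 54 significant bits
        long firstQuotient = Long.MAX_VALUE / 62; // first quotient when encoding Long.MAX_VALUE in base 62

        System.out.println(throughDouble(limit) == limit);                 // true
        System.out.println(throughDouble(justAbove) == justAbove);         // false
        System.out.println(throughDouble(firstQuotient) == firstQuotient); // false
    }
}
```

This is consistent with the threshold noted in the test: 288230376151711743 / 62 is still below 2^53, whereas Long.MAX_VALUE / 62 is not.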