Unorthodox and devious Android crash during JNI/OpenGL ES loading code
Bounty
Since this is an important problem to me I've stuck a bounty on. I'm not looking for the exact answer -- whatever answer leads me to fix this problem gets the bounty. Please make sure you've seen the edit just below.
Edit: I've since managed to catch the crash in gdb just as it dies (via "adb shell setprop debug.db.uid 32767") and noticed this is the exact same problem as the one mentioned in a post on Google Groups. The backtrace shown there is the same (except for precise addresses) as my crashing thread. I'll admit, I'm no debugging tool wizard, so if you've any ideas of what I should be looking for please let me know.
The quick and dirty rundown
I've whittled away most of my reasonably large application's code so that the app does the following: Loads in a bunch of textures via JNI'd wrappers (from C++ --> Java) so that the Java libraries handle the decoding for me, makes OpenGL textures out of them, and clears the screen to a rather pretty but mocking dark blue color. It's dying in libc, but only one in every ten times.
To make matters worse, it doesn't even look like it's dying related to any of the code I've written -- it seems to happen in a delayed fashion, but it doesn't seem to be related to something as convenient to blame as the garbage collector. There is no specific point in my own code that the crash occurs at -- it seems to shift around on a per-run basis.
The longer story
I'm ending up with a standard crash dump with a stack that tells me just about nothing because it's got two entries, one to libc and one to what looks like an invalid or null stack frame. The resolved symbol in libc is pthread_mutex_unlock. I no longer even use this function myself since I've stripped out the need for multi-threading. (The native code is called in a surface view and just renders.)
pthread_mutex_unlock is resulting in a segmentation fault, usually at address 0 but sometimes at a small value (less than 0x200) instead of 0. The default (and most common) mutex in Bionic only has one pointer it can segfault on, and that's the pointer to the pthread_mutex_t structure itself. However, a more complex mutex (there are several options) may use additional pointers. So, chances are libc is fine and libdvm is having the issue (assuming I can trust my stack trace even that far).
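A fault address that is small but not zero is the pattern you get when a field is accessed through a null pointer to a structure: the faulting address is just the field's offset. The sketch below illustrates the arithmetic with a made-up layout. `HypotheticalObject` and `fault_address` are assumptions for illustration only and do not reflect Bionic's or Dalvik's real structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout, NOT a real Bionic or Dalvik structure.
struct HypotheticalObject {
    std::uint32_t header;
    std::uint32_t flags;
    std::uint32_t payload[16];
    std::uint32_t lock_word;   // imagine the code touching this field
};

// Address the CPU would try to access for a field at `field_offset`
// inside an object located at `base`.
constexpr std::uintptr_t fault_address(std::uintptr_t base, std::size_t field_offset) {
    return base + field_offset;
}
```

With a null base, `fault_address(0, offsetof(HypotheticalObject, lock_word))` is 0x48, well under 0x200. If the nonzero fault addresses are stable between runs, comparing them with field offsets of the structures involved may hint at which pointer was null.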
Let me note this problem only seems to be reproducible if I do one of these two things: disabling the loading of the data portion of images (while still reading format/dimension information) and leaving the buffer I use for loading textures into OpenGL uninitialized, or disabling the creation of the OpenGL texture by disabling only the final glTexImage2D call.
Note that the aforementioned buffer for loading textures into OpenGL is only created once and destroyed once. I've tried enlarging it and determined that I'm not troubled by a buffer overrun issue specific to that buffer.
The main culprits I can think of are:
- I'm not using JNI right and it's doing something nasty to the stack.
- I have an off-by-one error someplace that's corrupting a stack frame.
- I'm passing OpenGL ES something bad and it's doing something equally bad(tm).
- My custom-rolled memory allocator isn't functioning properly.
I've been combing my code for such culprits (and more!) for days. I'm hesitant to use a debugger because this crash seems to be timing-sensitive. However, I can still get the crash with my own native code entirely unoptimized with debug options enabled. (gdb itself runs at a crawl and so does the app when it's connected)
Things I've done
- Used CheckJNI.
- Stripped down as much of the code as I possibly can until it stops crashing.
- Written a signal handler and coded a small logging system to dump out the last things done before the signal was thrown.
- Tried (and failed) to exacerbate the problem.
- Padded native heap arrays on both ends with canaries. They never changed.
- Audited 100% of the code in the code path. (I'm just not seeing the issue.)
- Thought the problem magically disappeared when I fixed a minor error, ran the code fifty times to make sure this was so, and then crashed the next day the first time I ran. (Ooh, I've never been so angry at a bug before!)
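For readers unfamiliar with the canary technique mentioned in the list above, here is a minimal sketch of one way to do it. `GuardedBuffer`, the guard size and the fill byte are all assumptions for illustration, not the asker's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: a byte buffer with guard regions on both ends.
class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t payload_bytes)
        : storage_(payload_bytes + 2 * kGuardBytes, 0) {
        // Fill the leading and trailing guard regions with the canary byte.
        std::memset(storage_.data(), kCanary, kGuardBytes);
        std::memset(storage_.data() + kGuardBytes + payload_bytes, kCanary, kGuardBytes);
    }

    // Pointer to the usable payload, just past the leading guard.
    std::uint8_t* data() { return storage_.data() + kGuardBytes; }
    std::size_t size() const { return storage_.size() - 2 * kGuardBytes; }

    // True while neither guard region has been overwritten.
    bool intact() const {
        for (std::size_t i = 0; i < kGuardBytes; ++i) {
            if (storage_[i] != kCanary) return false;
            if (storage_[storage_.size() - 1 - i] != kCanary) return false;
        }
        return true;
    }

private:
    static constexpr std::size_t kGuardBytes = 16;
    static constexpr std::uint8_t kCanary = 0xA5;
    std::vector<std::uint8_t> storage_;
};
```

Calling `intact()` after each suspicious step (for example after every chunk copy) narrows down where an overrun happens. Note that canaries only catch writes that land in the guard regions. A wild write through an unrelated pointer, or a write far past the end, leaves them untouched, which is consistent with the canaries never changing here.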
Here's a snippet of the usual native crash info from LogCat:
I/DEBUG ( 5818): signal 11 (SIGSEGV), fault addr 00000000
I/DEBUG ( 5818): r0 0000006e r1 00000080 r2 fffffc5e r3 100ffe58
I/DEBUG ( 5818): r4 00000000 r5 00000000 r6 00000000 r7 00000000
I/DEBUG ( 5818): r8 00000000 r9 8054f999 10 10000000 fp 0013e768
I/DEBUG ( 5818): ip 3b9aca00 sp 100ffe58 lr afd10640 pc 00000000 cpsr 60000010
I/DEBUG ( 5818): d0 643a64696f72646e d1 6472656767756265
I/DEBUG ( 5818): d2 8083297880832965 d3 8083298880832973
I/DEBUG ( 5818): d4 8083291080832908 d5 8083292080832918
I/DEBUG ( 5818): d6 8083293080832928 d7 8083294880832938
I/DEBUG ( 5818): d8 0000000000000000 d9 0000000000000000
I/DEBUG ( 5818): d10 0000000000000000 d11 0000000000000000
I/DEBUG ( 5818): d12 0000000000000000 d13 0000000000000000
I/DEBUG ( 5818): d14 0000000000000000 d15 0000000000000000
I/DEBUG ( 5818): d16 0000000000000000 d17 3fe999999999999a
I/DEBUG ( 5818): d18 42eccefa43de3400 d19 3fe00000000000b4
I/DEBUG ( 5818): d20 4008000000000000 d21 3fd99a27ad32ddf5
I/DEBUG ( 5818): d22 3fd24998d6307188 d23 3fcc7288e957b53b
I/DEBUG ( 5818): d24 3fc74721cad6b0ed d25 3fc39a09d078c69f
I/DEBUG ( 5818): d26 0000000000000000 d27 0000000000000000
I/DEBUG ( 5818): d28 0000000000000000 d29 0000000000000000
I/DEBUG ( 5818): d30 0000000000000000 d31 0000000000000000
I/DEBUG ( 5818): scr 80000012
I/DEBUG ( 5818):
I/DEBUG ( 5818): #00 pc 00000000
I/DEBUG ( 5818): #01 pc 0001063c /system/lib/libc.so
I/DEBUG ( 5818):
I/DEBUG ( 5818): code around pc:
I/DEBUG ( 5818):
I/DEBUG ( 5818): code around lr:
I/DEBUG ( 5818): afd10620 e1a01008 e1a02007 e1a03006 e1a00005
I/DEBUG ( 5818): afd10630 ebfff95d e1a05000 e1a00004 ebffff46
I/DEBUG ( 5818): afd10640 e375006e 03a0006e 13a00000 e8bd81f0
I/DEBUG ( 5818): afd10650 e304cdd3 e3043240 e92d4010 e341c062
I/DEBUG ( 5818): afd10660 e1a0e002 e24dd008 e340300f e1a0200d
I/DEBUG ( 5818):
I/DEBUG ( 5818): stack:
I/DEBUG ( 5818): 100ffe18 00000000
I/DEBUG ( 5818): 100ffe1c 00000000
I/DEBUG ( 5818): 100ffe20 00000000
I/DEBUG ( 5818): 100ffe24 ffffff92
I/DEBUG ( 5818): 100ffe28 100ffe58
I/DEBUG ( 5818): 100ffe2c 00000000
I/DEBUG ( 5818): 100ffe30 00000080
I/DEBUG ( 5818): 100ffe34 8054f999 /system/lib/libdvm.so
I/DEBUG ( 5818): 100ffe38 10000000
I/DEBUG ( 5818): 100ffe3c afd10640 /system/lib/libc.so
I/DEBUG ( 5818): 100ffe40 00000000
I/DEBUG ( 5818): 100ffe44 00000000
I/DEBUG ( 5818): 100ffe48 00000000
I/DEBUG ( 5818): 100ffe4c 00000000
I/DEBUG ( 5818): 100ffe50 e3a07077
I/DEBUG ( 5818): 100ffe54 ef900077
I/DEBUG ( 5818): #01 100ffe58 00000000
I/DEBUG ( 5818): 100ffe5c 00000000
I/DEBUG ( 5818): 100ffe60 00000000
I/DEBUG ( 5818): 100ffe64 00000000
I/DEBUG ( 5818): 100ffe68 00000000
I/DEBUG ( 5818): 100ffe6c 00000000
I/DEBUG ( 5818): 100ffe70 00000000
I/DEBUG ( 5818): 100ffe74 00000000
I/DEBUG ( 5818): 100ffe78 00000000
I/DEBUG ( 5818): 100ffe7c 00000000
I/DEBUG ( 5818): 100ffe80 00000000
I/DEBUG ( 5818): 100ffe84 00000000
I/DEBUG ( 5818): 100ffe88 00000000
I/DEBUG ( 5818): 100ffe8c 00000000
I/DEBUG ( 5818): 100ffe90 00000000
I/DEBUG ( 5818): 100ffe94 00000000
I/DEBUG ( 5818): 100ffe98 00000000
I/DEBUG ( 5818): 100ffe9c 00000000
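When reading a dump like this it can help to check whether register contents are text rather than pointers. The sketch below decodes a 64-bit register value as eight ASCII bytes, assuming little-endian byte order. `ascii_of_register` is a hypothetical helper, not part of any Android tool.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Interpret a 64-bit value as 8 bytes in little-endian order and render
// printable ASCII bytes as characters, everything else as '.'.
inline std::string ascii_of_register(std::uint64_t value) {
    std::string out;
    for (int i = 0; i < 8; ++i) {
        unsigned char byte = static_cast<unsigned char>(value >> (8 * i));
        bool printable = byte >= 0x20 && byte < 0x7f;
        out.push_back(printable ? static_cast<char>(byte) : '.');
    }
    return out;
}
```

Applied to the dump above, d0 (643a64696f72646e) decodes to "ndroid:d" and d1 (6472656767756265) to "ebuggerd". That looks like leftover string data in the NEON registers rather than anything tied to the crash itself, so those two registers are probably safe to ignore.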
Using ndk r6, Android platform 2.2 (API level 8), compiling with -Wall -Werror, ARM mode only.
I'm looking at any ideas, especially those which are verifiable in a deterministic way. If more information would help, just leave a comment (or if you can't, an answer) and I'll update my question ASAP. Thanks for reading this far!
JNI Interface
There are both j2n and n2j calls. The only j2n calls right now are here:
private static class Renderer implements GLSurfaceView.Renderer {
    public void onDrawFrame(GL10 gl) {
        GraphicsLib.graphicsStep();
    }
    public void onSurfaceChanged(GL10 gl, int width, int height) {
        GraphicsLib.graphicsInit(width, height);
    }
    public void onSurfaceCreated(GL10 gl, EGLConfig config) {
        // Do nothing.
    }
}
This code goes through this interface:
public class GraphicsLib {
    static {
        System.loadLibrary("graphicslib");
    }
    public static native void graphicsInit(int width, int height);
    public static native void graphicsStep();
}
Which on the native side looks like:
extern "C" {
    JNIEXPORT void JNICALL FN(graphicsInit)(JNIEnv* env, jobject obj, jint width, jint height);
    JNIEXPORT void JNICALL FN(graphicsStep)(JNIEnv* env, jobject obj);
};
The function definitions themselves begin with a copy of the prototypes.
graphicsInit just stores away the dimensions it was passed and sets up OpenGL a bit, without anything particularly interesting. graphicsStep clears the screen to a nice color and calls LoadSprites(env).
The more complex side consists of the n2j calls used in LoadSprites(), which loads in one sprite every frame. Not an elegant solution, but it's been working with the exception of this crash.
LoadSprites works like this:
GameAssetsInfo gai;
void LoadSprites(JNIEnv* env)
{
    InitGameAssets(gai, env);
    CatchJNIException(env, "j0");
    ...
    static int z = 0;
    if (z < numSprites)
    {
        CatchJNIException(env, "j1");
        OpenGameImage(gai, SpriteIDFromNumber(z));
        CatchJNIException(env, "j2");
        unsigned int actualWidth = GetGameImageWidth(gai);
        CatchJNIException(env, "j3");
        unsigned int actualHeight = GetGameImageHeight(gai);
        CatchJNIException(env, "j4");
        ...
        jint i;
        int r = 0;
        CatchJNIException(env, "j5");
        do {
            CatchJNIException(env, "j6");
            i = ReadGameImage(gai);
            CatchJNIException(env, "j7");
            if (i > 0)
            {
                // Deal with the pure data chunk -- One line at a time.
                CatchJNIException(env, "j8");
                StoreGameImageChunk(gai, (int*)sprites[z].data + r, 0, i);
                ...
                r += sprites[z].width;
                CatchJNIException(env, "j9");
                UnreadGameImage(gai);
                CatchJNIException(env, "j10");
            } else {
                break;
            }
        } while (true);
        CatchJNIException(env, "j11");
        CloseGameImage(gai);
        CatchJNIException(env, "j12");
        ... OpenGL ES calls ...
        glTexImage2D( ... );
        z++;
    }
    CatchJNIException(env, "j13");
}
Where CatchJNIException is this (and never prints anything for me):
void CatchJNIException(JNIEnv* env, const char* str)
{
    jthrowable exc = env->ExceptionOccurred();
    if (exc) {
        jclass newExcCls;
        env->ExceptionDescribe();
        env->ExceptionClear();
        newExcCls = env->FindClass(
            "java/lang/IllegalArgumentException");
        if (newExcCls == NULL) {
            // Couldn't find the exception class.. Uuh..
            LOGE("Failed to catch JNI exception entirely -- could not find exception class.");
            return;
            abort();
        }
        LOGE("Caught JNI exception. (%s)", str);
        env->ThrowNew( newExcCls, "thrown from C code");
        // abort();
    }
}
The relevant part of GameAssetsInfo and its associated code is only called from native code and works like this:
void InitGameAssets(GameAssetsInfo& gameasset, JNIEnv* env)
{
    CatchJNIException(env, "jS0");
    FST;
    char str[64];
    sprintf(str, "%s/GameAssets", ROOTSTR);
    gameasset.env = env;
    CatchJNIException(gameasset.env, "jS1");
    gameasset.cls = gameasset.env->FindClass(str);
    CatchJNIException(gameasset.env, "jS2");
    gameasset.openAsset = gameasset.env->GetStaticMethodID(gameasset.cls, "OpenAsset", "(I)V");
    CatchJNIException(gameasset.env, "jS3");
    gameasset.readAsset = gameasset.env->GetStaticMethodID(gameasset.cls, "ReadAsset", "()I");
    CatchJNIException(gameasset.env, "jS4");
    gameasset.closeAsset = gameasset.env->GetStaticMethodID(gameasset.cls, "CloseAsset", "()V");
    CatchJNIException(gameasset.env, "jS5");
    gameasset.buffID = gameasset.env->GetStaticFieldID(gameasset.cls, "buff", "[B");
    CatchJNIException(gameasset.env, "jS6");
    gameasset.openImage = gameasset.env->GetStaticMethodID(gameasset.cls, "OpenImage", "(I)V");
    CatchJNIException(gameasset.env, "jS7");
    gameasset.readImage = gameasset.env->GetStaticMethodID(gameasset.cls, "ReadImage", "()I");
    CatchJNIException(gameasset.env, "jS8");
    gameasset.closeImage = gameasset.env->GetStaticMethodID(gameasset.cls, "CloseImage", "()V");
    CatchJNIException(gameasset.env, "jS9");
    gameasset.buffIntID = gameasset.env->GetStaticFieldID(gameasset.cls, "buffInt", "[I");
    CatchJNIException(gameasset.env, "jS10");
    gameasset.imageWidth = gameasset.env->GetStaticFieldID(gameasset.cls, "imageWidth", "I");
    CatchJNIException(gameasset.env, "jS11");
    gameasset.imageHeight = gameasset.env->GetStaticFieldID(gameasset.cls, "imageHeight", "I");
    CatchJNIException(gameasset.env, "jS12");
    gameasset.imageHasAlpha = gameasset.env->GetStaticFieldID(gameasset.cls, "imageHasAlpha", "I");
    CatchJNIException(gameasset.env, "jS13");
}
void OpenGameAsset(GameAssetsInfo& gameasset, int rsc)
{
    FST;
    CatchJNIException(gameasset.env, "jS14");
    gameasset.env->CallStaticVoidMethod(gameasset.cls, gameasset.openAsset, rsc);
    CatchJNIException(gameasset.env, "jS15");
}
void CloseGameAsset(GameAssetsInfo& gameasset)
{
    FST;
    CatchJNIException(gameasset.env, "jS16");
    gameasset.env->CallStaticVoidMethod(gameasset.cls, gameasset.closeAsset);
    CatchJNIException(gameasset.env, "jS17");
}
int ReadGameAsset(GameAssetsInfo& gameasset)
{
    FST;
    CatchJNIException(gameasset.env, "jS18");
    int ret = gameasset.env->CallStaticIntMethod(gameasset.cls, gameasset.readAsset);
    CatchJNIException(gameasset.env, "jS19");
    if (ret > 0)
    {
        CatchJNIException(gameasset.env, "jS20");
        gameasset.obj = gameasset.env->GetStaticObjectField(gameasset.cls, gameasset.buffID);
        CatchJNIException(gameasset.env, "jS21");
        gameasset.arr = reinterpret_cast<jbyteArray*>(&gameasset.obj);
    }
    return ret;
}
void UnreadGameAsset(GameAssetsInfo& gameasset)
{
    FST;
    CatchJNIException(gameasset.env, "jS22");
    gameasset.env->DeleteLocalRef(gameasset.obj);
    CatchJNIException(gameasset.env, "jS23");
}
void StoreGameAssetChunk(GameAssetsInfo& gameasset, void* store, int offset, int length)
{
    FST;
    CatchJNIException(gameasset.env, "jS24");
    gameasset.env->GetByteArrayRegion(*gameasset.arr, offset, length, (jbyte*)store);
    CatchJNIException(gameasset.env, "jS25");
}
void OpenGameImage(GameAssetsInfo& gameasset, int rsc)
{
    FST;
    CatchJNIException(gameasset.env, "jS26");
    gameasset.env->CallStaticVoidMethod(gameasset.cls, gameasset.openImage, rsc);
    CatchJNIException(gameasset.env, "jS27");
    gameasset.l_imageWidth = (int)gameasset.env->GetStaticIntField(gameasset.cls, gameasset.imageWidth);
    CatchJNIException(gameasset.env, "jS28");
    gameasset.l_imageHeight = (int)gameasset.env->GetStaticIntField(gameasset.cls, gameasset.imageHeight);
    CatchJNIException(gameasset.env, "jS29");
    gameasset.l_imageHasAlpha = (int)gameasset.env->GetStaticIntField(gameasset.cls, gameasset.imageHasAlpha);
    CatchJNIException(gameasset.env, "jS30");
}
void CloseGameImage(GameAssetsInfo& gameasset)
{
    FST;
    CatchJNIException(gameasset.env, "jS31");
    gameasset.env->CallStaticVoidMethod(gameasset.cls, gameasset.closeImage);
    CatchJNIException(gameasset.env, "jS32");
}
int ReadGameImage(GameAssetsInfo& gameasset)
{
    FST;
    CatchJNIException(gameasset.env, "jS33");
    int ret = gameasset.env->CallStaticIntMethod(gameasset.cls, gameasset.readImage);
    CatchJNIException(gameasset.env, "jS34");
    if ( ret > 0 )
    {
        CatchJNIException(gameasset.env, "jS35");
        gameasset.obj = gameasset.env->GetStaticObjectField(gameasset.cls, gameasset.buffIntID);
        CatchJNIException(gameasset.env, "jS36");
        gameasset.arrInt = reinterpret_cast<jintArray*>(&gameasset.obj);
    }
    return ret;
}
void UnreadGameImage(GameAssetsInfo& gameasset)
{
    FST;
    CatchJNIException(gameasset.env, "jS37");
    gameasset.env->DeleteLocalRef(gameasset.obj);
    CatchJNIException(gameasset.env, "jS38");
}
void StoreGameImageChunk(GameAssetsInfo& gameasset, void* store, int offset, int length)
{
    FST;
    CatchJNIException(gameasset.env, "jS39");
    gameasset.env->GetIntArrayRegion(*gameasset.arrInt, offset, length, (jint*)store);
    CatchJNIException(gameasset.env, "jS40");
}
int GetGameImageWidth(GameAssetsInfo& gameasset) { return gameasset.l_imageWidth; }
int GetGameImageHeight(GameAssetsInfo& gameasset) { return gameasset.l_imageHeight; }
int GetGameImageHasAlpha(GameAssetsInfo& gameasset) { return gameasset.l_imageHasAlpha; }
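The code above stores `arr` and `arrInt` as pointers to the `obj` slot, so after `DeleteLocalRef` they still dereference to whatever stale handle `obj` holds. A possibly simpler alternative is to keep a typed copy of the handle value itself and clear both when the reference is released. The sketch below uses stand-in types so it compiles without the NDK. The `mockjni` namespace, `AssetHandles`, `remember` and `forget` are all hypothetical; as far as I know the real jni.h in C++ mode declares its handle types as pointers to a similar small class hierarchy, which is what makes a `static_cast` on the handle value possible.

```cpp
#include <cassert>

// Stand-ins for the JNI handle types; NOT the real jni.h declarations.
namespace mockjni {
struct Object {};
struct Array : Object {};
struct ByteArray : Array {};
using jobject = Object*;
using jbyteArray = ByteArray*;
}  // namespace mockjni

// Hypothetical replacement for the obj/arr pair in GameAssetsInfo.
struct AssetHandles {
    mockjni::jobject obj = nullptr;     // what GetStaticObjectField would return
    mockjni::jbyteArray arr = nullptr;  // typed copy of the same handle value
};

// Store a freshly obtained handle and its typed view together.
inline void remember(AssetHandles& h, mockjni::jobject fresh) {
    h.obj = fresh;
    h.arr = static_cast<mockjni::jbyteArray>(fresh);  // cast the value, not its address
}

// Call right after the local reference is deleted so no stale handle survives.
inline void forget(AssetHandles& h) {
    h.obj = nullptr;
    h.arr = nullptr;
}
```

This does not by itself explain the crash, since `LoadSprites` reads the array before releasing it, but it removes one place where a stale handle could be passed back into JNI by accident.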
And it's backed by this on the Java side:
public class GameAssets {
    static public Resources res = null;
    static public InputStream is = null;
    static public byte buff[];
    static public int buffInt[];
    static public final int buffSize = 1024;
    static public final int buffIntSize = 2048;
    static public int imageWidth;
    static public int imageHeight;
    static public int imageHasAlpha;
    static public int imageLocX;
    static public int imageLocY;
    static public Bitmap mBitmap;
    static public BitmapFactory.Options decodeResourceOptions = new BitmapFactory.Options();
    public GameAssets(Resources r) {
        res = r;
        buff = new byte[buffSize];
        buffInt = new int[buffIntSize];
        decodeResourceOptions.inScaled = false;
    }
    public static final void OpenAsset(int id) {
        is = res.openRawResource(id);
    }
    public static final int ReadAsset() {
        int num = 0;
        try {
            num = is.read(buff);
        } catch (Exception e) {
            ;
        }
        return num;
    }
    public static final void CloseAsset() {
        try {
            is.close();
        } catch (Exception e) {
            ;
        }
        is = null;
    }
    // We want all the advantages that BitmapFactory can provide -- reading
    // images of compressed image formats -- so we provide our own interface
    // for it.
    public static final void OpenImage(int id) {
        mBitmap = BitmapFactory.decodeResource(res, id, decodeResourceOptions);
        imageWidth = mBitmap.getWidth();
        imageHeight = mBitmap.getHeight();
        imageHasAlpha = mBitmap.hasAlpha() ? 1 : 0;
        imageLocX = 0;
        imageLocY = 0;
    }
    public static final int ReadImage() {
        if (imageLocY >= imageHeight) return 0;
        int numReadPixels = buffIntSize;
        if (imageLocX + buffIntSize >= imageWidth)
        {
            numReadPixels = imageWidth - imageLocX;
            mBitmap.getPixels(buffInt, 0, imageWidth, imageLocX, imageLocY, numReadPixels, 1);
            imageLocY++;
        }
        else
        {
            mBitmap.getPixels(buffInt, 0, imageWidth, imageLocX, imageLocY, numReadPixels, 1);
            imageLocX += numReadPixels;
        }
        return numReadPixels;
    }
    public static final void CloseImage() {
    }
}
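The chunking logic in `ReadImage` can be checked away from the device by modeling just its cursor. The sketch below is a C++ port of that state machine (C++ to match the native side), with one deliberate difference: it resets the column to 0 when a row is finished, which the Java version above does not do. `RowCursor` and `next_chunk` are hypothetical names. The difference only matters for images wider than the chunk size (`buffIntSize`, 2048 here), so it may well be unrelated to the crash.

```cpp
#include <cassert>

// Hypothetical model of the imageLocX / imageLocY cursor in ReadImage().
struct RowCursor {
    RowCursor(int w, int h, int c) : width(w), height(h), chunk(c), x(0), y(0) {}
    int width;   // image width in pixels
    int height;  // image height in pixels
    int chunk;   // maximum pixels per read (buffIntSize in the Java code)
    int x;       // current column
    int y;       // current row
};

// Returns how many pixels the next read would deliver, or 0 when done.
inline int next_chunk(RowCursor& c) {
    if (c.y >= c.height) return 0;
    if (c.x + c.chunk >= c.width) {
        int n = c.width - c.x;  // last (possibly short) chunk of this row
        c.x = 0;                // differs from the Java code: restart at column 0
        ++c.y;
        return n;
    }
    c.x += c.chunk;
    return c.chunk;
}
```

For a 5x2 image read in chunks of 2, this yields 2, 2, 1, 2, 2, 1 and then 0. Note that the native loop advances its destination offset by a full row after every chunk, so it implicitly assumes one chunk per row. Wide images would need that side changed as well.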
Please note the distinct lack of thread safety in the game asset code.
Let me know if more information would be useful.
Comments (4)
Posting from my earlier comments: "A JNI exception could be happening, and since you don't return after the exception, it could cause a crash. I don't know how Android's logging works, but in C a simple printf need not output the log immediately. So in the scenario where the crash occurred, it could be that an exception happened but the system crashed before the log could be output."
Wasn't online for a few days. Hope the crash isn't back. I hate it when some issues magically disappear without a clear explanation. They usually come right back and bite you ;-) Anyway, hope you don't get bitten.
I can't give a reply :(. I just had a similar issue and noticed that the Java documentation says somewhere that you need to be careful when there is threading (and if you are doing OpenGL, there is threading). Be careful with the first parameter (env) and the second parameter (jobject): you can't share the two across different threads because they are thread-specific.
In my case there is an event thread and a rendering thread. I had global env and jself variables, which were the two parameters passed in when calling JNI. I changed my code to ensure that only the rendering thread touches the env/jself variables. In the event thread I pass in primitive data and simply take note of what needs to be done, so it doesn't need the env/jself variables. Of course, I use mutexes to lock my event structure.
It looks like you are setting env here, potentially globally:
gameasset.env = env;
If gameasset is global and/or used by different threads, simply sharing env or the jobject class variable behind mutexes/locking won't work (they are thread-specific).
TL;DR: when calling a JNI method from Java, I only access the env variable and the jobject second parameter on the rendering thread, and nowhere else, which so far has relieved my problem.
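The point about env being thread-specific can be illustrated without JNI. The sketch below uses a `thread_local` slot as a stand-in for the per-thread environment. `FakeEnv`, `t_env` and `other_thread_sees` are hypothetical names, and nothing here is the real JNI API. In real code the usual approach is to cache the process-wide `JavaVM*` and ask it for the current thread's env (`GetEnv`, or `AttachCurrentThread` for native threads) instead of storing a `JNIEnv*` globally.

```cpp
#include <cassert>
#include <thread>

// Stand-in for JNIEnv; each thread is supposed to use only its own.
struct FakeEnv {
    int owner_id;
};

// One slot per thread, like the environment the VM hands to each thread.
thread_local FakeEnv* t_env = nullptr;

// Starts a second thread that installs its own FakeEnv and reports whether
// the environment it sees is the same object as `shared`.
inline bool other_thread_sees(const FakeEnv* shared) {
    bool same = true;
    std::thread worker([&same, shared] {
        FakeEnv mine{2};
        t_env = &mine;              // this thread's own environment
        same = (t_env == shared);   // compare with the pointer captured elsewhere
        t_env = nullptr;
    });
    worker.join();
    return same;
}
```

If the render thread sets `t_env` to its own `FakeEnv` and then calls `other_thread_sees` with that pointer, the result is false: the pointer captured on one thread is not the environment the other thread should be using. A global like `gameasset.env` is only safe if every call that reads it runs on the thread that last wrote it.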
It may seem foolish at first, but your issue reminds me of a problem we had in one of our Android applications. We were doing our best to be optimal by using "static" objects for things that we were sure should only exist once and that we only wanted created one time. After lots of debugging and headaches, this turned out to be at odds with the Activity lifecycle within Android, so we switched to using instances and allowed the OS to handle cleanup and optimization of the Activity. This fixed our problem and our application was sufficiently stable.
At the risk of pointing out the extremely obvious... are you sure you're not overflowing str?
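One way to rule that out is to build the class name with `snprintf` and check for truncation instead of calling `sprintf` into a fixed 64-byte buffer. A minimal sketch follows; `build_class_name` is a hypothetical helper, and the package path in the usage note is made up because the real value of ROOTSTR isn't shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Writes "<root>/GameAssets" into out. Returns false if it did not fit,
// in which case out holds a truncated, still NUL-terminated string.
inline bool build_class_name(const char* root, char* out, std::size_t out_size) {
    int needed = std::snprintf(out, out_size, "%s/GameAssets", root);
    return needed >= 0 && static_cast<std::size_t>(needed) < out_size;
}
```

For example, `build_class_name("com/example/game", buf, sizeof buf)` with a 64-byte `buf` succeeds and produces "com/example/game/GameAssets", while the same call with an 8-byte buffer returns false rather than writing past the end.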