Separating image names in JSON according to the text instances on each image

Posted on 2025-02-07 00:33:58

Hi, can someone help me with this problem? I'm generating images with text on them, and the code below writes the bounding-box coordinates to a JSON file. In that JSON file I now need each text instance to be written under the image it belongs to.

You can see the problem in these pictures:

[Image] The word couch is on clouds_125.jpg_0

[Image] But the word proof is on kerala_25.jpg_0

If I generate 5 images with text, all 5 image names end up in the same gt_[ ] key. I just want to separate the image names according to which text instances are on each image.

So what I need is this format:

"gt_1": [
    {"points": [[x1, y1], [x2, y2], …, [xn, yn]], "transcription": "trans1", "language": "Latin", "illegibility": false},
    …
    {"points": [[x1, y1], [x2, y2], …, [xn, yn]], "transcription": "trans2", "language": "Chinese", "illegibility": false}],

"gt_2": [
    {"points": [[x1, y1], [x2, y2], …, [xn, yn]], "transcription": "trans3", "language": "Latin", "illegibility": false}],

……
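For reference, here is a minimal sketch of how that layout could be built in Python and serialized with the standard json module. The helper name make_instance and the coordinate values are made up for illustration; they are not part of my real code. Note that a Python bool (False) is written out as false, whereas the string "False" would stay a quoted string.

```python
import json


def make_instance(points, transcription, language="Latin"):
    """Build one text-instance record in the target layout (hypothetical helper)."""
    return {
        "points": [list(p) for p in points],
        "transcription": transcription,
        "language": language,
        "illegibility": False,  # Python bool, serialized by json as false
    }


# Each "gt_N" key maps to a list of text instances for one image.
# The coordinates below are placeholder values.
annotations = {
    "gt_1": [
        make_instance([[0, 0], [10, 0], [10, 5], [0, 5]], "trans1"),
        make_instance([[0, 8], [10, 8], [10, 13], [0, 13]], "trans2", "Chinese"),
    ],
    "gt_2": [
        make_instance([[2, 2], [12, 2], [12, 7], [2, 7]], "trans3"),
    ],
}

serialized = json.dumps(annotations, indent=2)
print(serialized)
```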

Here is the code that generates the bounding-box coordinates for the JSON file. The results are stored in the Output.h5 file, and the data written to the JSON is read from it.

import json

import h5py
import numpy as np

db = h5py.File('results/Output.h5', 'r')
dsets = sorted(db['data'].keys())
for k in dsets:
    db = get_data()
    imnames = sorted(db['data'].keys())  # names of ALL generated images

start = 0
coordinate = []
name = []
final = []
FinalFinal = []
for eachWord in textToList:
    length = len(eachWord)
    # Collect the 4 corner points of every character in this word.
    for i in range(0, 4):
        for j in range(start, length + start):
            coordinate.append([charBB_list[0][0][i][j],
                               charBB_list[0][1][i][j]])
        name.append(coordinate)
        coordinate = []
    # Reorder the points so they are grouped per character.
    for j in range(0, length):
        for i in range(len(name)):
            print(i, j, name[i][j])  # print the coordinates so I can keep track of them
            final.append(name[i][j])

    with open("annotation.json") as json_file:
        data = json.load(json_file)
        temp = data["annotations"]
        y = {
            # imnames is the whole list, so every image name ends up in this key
            f'gt_{imnames}':
                {
                    "transcription": eachWord,
                    "language": "Latin",
                    "illegibility": "False",
                    "points": [final]
                }
        }
        temp.append(y)

    write_json(data)
    finalToList = np.array(final)

    # Reset the buffers and move on to the next word's characters.
    name = []
    final = []
    start = len(eachWord) + start

I know that the imnames variable contains the names of all the generated images, but I don't know how to separate them.
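One possible direction, sketched under assumptions: loop over the image names and build a separate list of instances for each one, instead of formatting the whole imnames list into a single key. The sketch assumes that the words and character boxes can be obtained per image, and that the boxes are indexed as char_bb[axis][corner][char] like charBB_list[0] above. The names word_points, build_annotations and per_image, as well as the sample words and coordinates, are hypothetical.

```python
def word_points(char_bb, start, length):
    """Return the corner points of characters start .. start+length-1.

    char_bb is assumed to be indexed as char_bb[axis][corner][char],
    with axis 0 = x and axis 1 = y.
    """
    points = []
    for j in range(start, start + length):
        for i in range(4):
            points.append([char_bb[0][i][j], char_bb[1][i][j]])
    return points


def build_annotations(per_image):
    """Map each image name to its own list of text instances."""
    annotations = {}
    for imname, (words, char_bb) in per_image.items():
        instances = []
        start = 0  # character offset restarts for every image
        for word in words:
            instances.append({
                "points": word_points(char_bb, start, len(word)),
                "transcription": word,
                "language": "Latin",
                "illegibility": False,
            })
            start += len(word)
        annotations[f"gt_{imname}"] = instances
    return annotations


# Placeholder boxes for two characters: x per corner, then y per corner.
sample_bb = [
    [[0, 10], [5, 15], [5, 15], [0, 10]],
    [[0, 0], [0, 0], [8, 8], [8, 8]],
]
per_image = {
    "clouds_125.jpg_0": (["hi"], sample_bb),
    "kerala_25.jpg_0": (["a", "b"], sample_bb),
}
result = build_annotations(per_image)
```

The resulting dictionary has one key per image, so it can be written out once with json.dump after the loop rather than reopening annotation.json for every word.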