Passing images to Tesseract OCR with known bounding box coordinates
I have a few images in one folder, and each image has its bounding box coordinates in a TXT file in this form:
0 0.503 0.503 0.334 0.994
(class,x,y,w,h)
I want to use Tesseract OCR to extract the text inside this bounding box on each image.
I am having trouble with the coding part.
Any help would be appreciated.
(I basically have two folders: one holds all the images, and the other holds a TXT file with the bounding box coordinates for each image.)
This is my code:
import os
import cv2
import pytesseract
config = ('-l eng --oem 3 --psm 3')
image_path='D:\\Object detection\\test images\\'
labels_path='D:\\Object detection\\labels\\'
for images in os.listdir(image_path):
    spl=images.split('.')[0]
    img_name =os.path.join(image_path,images)
    image=cv2.imread(img_name)
    print(image.shape)
    with open(os.path.join(labels_path,spl+'.txt')) as f:
        t = f.read()
    arr=t.split()
    x,y,w,h=arr[1],arr[2],arr[3],arr[4]
    x=int(x*image.shape[0])
    y=int(y*image.shape[1])
    w=int(w*image.shape[0])
    h=int(h*image.shape[1])
    x1 = round(x-w/2)
    y1 = round(y-h/2)
    x2 = round(x+w/2)
    y2 = round(y+h/2)
    rect=cv2.rectangle(image,(x1,y1),(x2,y2),(0,0,200),3)
    cropped_img = image[y1:y2, x1:x2]
    data = pytesseract.image_to_string(cropped_img, lang='eng',config=config)
    print(data)
There are two folders: one with the images and one with their bounding box coordinates in the form given above.
What I want is to loop over both, match each image to the label file with the same name, and extract the text inside the ROI. I am getting this error:
13 arr=t.split()
14 x,y,w,h=arr[1],arr[2],arr[3],arr[4]
---> 15 x=int(x*image.shape[0])
16 y=int(y*image.shape[1])
17 w=int(w*image.shape[0])
ValueError: invalid literal for int() with base 10: '0.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.5030.50
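A likely cause of this error: `t.split()` returns strings, so `x` is the string `'0.503'`, and multiplying a string by an integer repeats it rather than scaling it. That matches the long run of `0.503` in the message. A second issue is that `image.shape` is `(height, width, channels)`, so `x` and `w` should be scaled by `shape[1]` and `y` and `h` by `shape[0]`. Below is a minimal sketch of the conversion, assuming the label lines are in normalized centre format (class, x centre, y centre, width, height, all as fractions of the image size). The helper name `yolo_to_pixel_box` and the 1000x500 image size are made up for illustration.

```python
def yolo_to_pixel_box(label_line, img_w, img_h):
    """Convert 'class x y w h' (normalized, centre-based) to pixel corners.

    Hypothetical helper: assumes x and w are fractions of the image width
    and y and h are fractions of the image height.
    """
    fields = label_line.split()
    # The fields are strings, so convert them before doing arithmetic.
    xc, yc, w, h = (float(v) for v in fields[1:5])
    x1 = int(round((xc - w / 2) * img_w))
    y1 = int(round((yc - h / 2) * img_h))
    x2 = int(round((xc + w / 2) * img_w))
    y2 = int(round((yc + h / 2) * img_h))
    # Clamp to the image so that slicing never sees negative indices.
    x1, y1 = max(x1, 0), max(y1, 0)
    x2, y2 = min(x2, img_w), min(y2, img_h)
    return x1, y1, x2, y2


# What the original code did: string repetition, not multiplication.
print("0.503" * 3)  # prints 0.5030.5030.503

# The label line from the question on an assumed 1000x500 (width x height) image.
print(yolo_to_pixel_box("0 0.503 0.503 0.334 0.994", 1000, 500))
```

In the loop, the image size would come from `img_h, img_w = image.shape[:2]`, and the crop would be `image[y1:y2, x1:x2]`. Since `cv2.rectangle` draws on the image in place, it may be safer to take the crop before drawing the rectangle so the border does not end up in the region passed to `pytesseract.image_to_string`.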