I need to train a custom OCR model in Vertex AI. My data consists of a folder of cropped images, where each image contains a single line of text, and a CSV file with two columns: the image name and the text in the image.
But when I tried to import it into a dataset in Vertex AI, I saw that image datasets only support classification, segmentation, and object detection. All of these dataset types have a fixed number of labels, but my data has an unbounded number of labels (if we treat the text in an image as its label), so none of the types matches my requirement. Can I use Vertex AI for training, and if so, how?
Comments (1)
Since Vertex AI managed datasets do not support OCR applications, you can train and deploy a custom model using Vertex AI’s training and prediction services.
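Because no managed dataset is involved, the training code can read the CSV directly. The sketch below is one possible way to do that, not the method from any particular article: it treats each label as a sequence of characters drawn from a fixed vocabulary, which sidesteps the "unbounded number of labels" concern. The sample CSV contents, the header row, and the helper names `load_labels`, `build_vocab`, and `encode` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical CSV contents; in practice, open the real labels file instead.
SAMPLE_CSV = "image_name,text\nline_001.png,hello world\nline_002.png,low dose\n"


def load_labels(csv_file):
    """Read (image_name, text) pairs from a two-column CSV with a header row."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row (assumed to be present)
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def build_vocab(texts):
    """Map each distinct character to an integer id.

    Id 0 is left unused so it can be reserved for padding or a blank token.
    """
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(text, vocab):
    """Turn a text line into a list of character ids."""
    return [vocab[ch] for ch in text]


pairs = load_labels(io.StringIO(SAMPLE_CSV))
vocab = build_vocab(text for _, text in pairs)
encoded = [(name, encode(text, vocab)) for name, text in pairs]
```

With this framing the number of output classes equals the size of the character vocabulary, however many distinct text lines the data contains.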
I found a good article on building an OCR system from scratch. The OCR system it describes is implemented in two steps.
Please note that this article is not officially supported by Google Cloud.
Once you have tested the model locally, you can train the same model on Vertex AI using the custom training service. Please follow this codelab for step-by-step instructions on training and deploying a custom model.
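As a rough sketch of what submitting such a job might look like with the Vertex AI SDK for Python (`google-cloud-aiplatform`): the project, region, bucket, container image URIs, script path `trainer/task.py`, and the `--data-dir` argument below are all placeholders, not values from the codelab. The training script is assumed to write its model to the directory given by the `AIP_MODEL_DIR` environment variable so that the SDK can register the result as a model.

```python
def build_training_config(project, region, bucket, train_image, serve_image):
    """Collect the arguments for a custom training job (all values are placeholders)."""
    init_kwargs = {
        "project": project,
        "location": region,
        "staging_bucket": bucket,
    }
    job_kwargs = {
        "display_name": "ocr-custom-training",
        "script_path": "trainer/task.py",  # hypothetical local training script
        "container_uri": train_image,  # prebuilt or custom training image
        "model_serving_container_image_uri": serve_image,
    }
    run_kwargs = {
        "args": [f"--data-dir={bucket}/ocr-data"],  # hypothetical script argument
        "replica_count": 1,
        "machine_type": "n1-standard-8",
    }
    return init_kwargs, job_kwargs, run_kwargs


def submit(config):
    """Submit the job; needs google-cloud-aiplatform and valid credentials."""
    from google.cloud import aiplatform

    init_kwargs, job_kwargs, run_kwargs = config
    aiplatform.init(**init_kwargs)
    job = aiplatform.CustomTrainingJob(**job_kwargs)
    # Returns a Model resource because a serving container image was supplied.
    return job.run(**run_kwargs)
```

Calling `submit(build_training_config(...))` would start the job. The image URIs should be taken from the current list of Vertex AI prebuilt containers or point to your own custom image.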
Once the training is complete, the model can be deployed for inference using either a pre-built container offered by Vertex AI or a custom container, depending on your requirements. You can also choose between online predictions for synchronous requests and batch predictions for asynchronous requests.
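For online prediction, a request might be put together roughly as follows. This is only a sketch: the instance format depends entirely on how the deployed model's serving input is defined, so the `image_bytes` key and the `{"b64": ...}` wrapping are assumptions that fit a model accepting base64-encoded image bytes. The helper names are hypothetical.

```python
import base64


def build_instance(image_bytes, input_key="image_bytes"):
    """Wrap raw image bytes as a JSON-serializable prediction instance.

    The key name is an assumption; it must match the model's serving input.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {input_key: {"b64": encoded}}


def predict_online(endpoint_name, image_paths):
    """Send line images to a deployed endpoint and return the raw predictions.

    Needs google-cloud-aiplatform and credentials. endpoint_name is assumed to
    be a full endpoint resource name, or an endpoint ID if aiplatform.init()
    has already set the project and location.
    """
    from google.cloud import aiplatform

    endpoint = aiplatform.Endpoint(endpoint_name=endpoint_name)
    instances = []
    for path in image_paths:
        with open(path, "rb") as f:
            instances.append(build_instance(f.read()))
    return endpoint.predict(instances=instances).predictions
```

For large offline workloads, a batch prediction job that reads its inputs from Cloud Storage is the alternative to calling an endpoint like this.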