How to develop a program that minimizes human transcription errors in handwritten surveys

Posted on 2024-09-04 00:44:07 · 870 words · 2 views · 0 comments

I need to develop custom software for running surveys. Questions are multiple choice or, in a very few cases, free text.

I was asked to design a subsystem that checks for errors in the manual data entry of the multiple-choice part. We're trying to speed up the data entry process and to minimize differences between the digital forms and the original questionnaires. The surveys are filled in with handwritten marks and text by human interviewers, so some marks may be hard to read, or the user could accidentally select a different value for some question, and we would like to avoid that.

The software must include some automatic control to detect possible typing differences. Each answer of the multiple choice questions has the same probability of being selected.

This question has two parts:

  • The GUI.

The simplest thing I have in mind is to make the question display as usable as possible: large, readable fonts and generous spacing between the choices. Is there something else? For faster input, I would like to use drop-down lists (favoring keyboard over mouse). Given that the questions are grouped in sections, I would like to show the answers selected for the questions of the current section, but this could slow down the process. Any other ideas?

  • The error checking subsystem.

What else can I do to minimize or detect human typos in the multiple-choice questions? Is this a solvable problem? Is there a statistical methodology to check that the values entered by the users match the hand-filled forms? For example, suppose the survey has 5 questions with 4 options each, and I have n survey forms filled in on paper by interviewers, ready to be entered into the software. How can I minimize the accidental differences introduced by the manual transcription of the n surveys, without double-checking every one of the 5 questions on all n surveys?

My first suggestion is that at the end of the processing of all the hand filled forms, the software could choose some forms randomly to make a double check of the responses in a few instances, but on what criteria can I make this selection? Would this validation be enough to cover everything in a significant way?

The actual survey is national and has 56 pages with over 200 questions in total, so there will be a lot of handwritten pages from many people. The intention is to reduce the likelihood of errors and to optimize the speed of the data entry process. The surveys must be filled in on paper first, given the complications of having interviewers carry laptops or handhelds.

Comments (12)

我只土不豪 2024-09-11 00:44:07

Call me old-school, but I still think the most pragmatic way to do this is to use double entry. Two data entry clerks enter their surveys, then swap stacks and enter the other clerk's surveys. Whenever your system detects a difference between the two, it throws up a flag - then the two clerks put their heads together and decide on the correct answer (or maybe it gets reviewed by a more senior research staff member, etc.). Combined with some of the other suggestions here (I like mdma's suggestions for the GUI a lot), this would make for a low-error system.

Yes, this will (maybe) double your data entry time, but it's dead simple and will cut your errors way, way down. The OMR idea is a great one, but it doesn't sound to me like this project (a national, 56-page survey) is the best case for a lone hacker to try to implement that for the first time. What software do you need? What hardware is available to do that? There will still be a lot of human work involved in identifying the goofy stuff where an interviewer marks all four possible answers and then writes a note off to the side, and you'll likely want to randomly sample surveys to get a sense of what the machine-read error rate is. Even then you still just have an estimate of the error rate, not corrected data.

Try a simpler method to give your employer quality results this time - then use those results as a pre-validated data set for experimenting with the OMR stuff for next time.
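The comparison step of double entry can be sketched in a few lines. This is only an illustration: the function name `find_conflicts` and the representation of a keyed form as a dict from question id to selected option are assumptions, not part of any existing system.

```python
def find_conflicts(entry_a, entry_b):
    """Return the question ids whose values differ between two
    independent entries of the same paper form."""
    if entry_a.keys() != entry_b.keys():
        raise ValueError("both entries must cover the same questions")
    return sorted(q for q in entry_a if entry_a[q] != entry_b[q])


# Hypothetical data: the same form keyed by two different clerks.
first_pass = {"Q1": 2, "Q2": 4, "Q3": 1, "Q4": 3, "Q5": 1}
second_pass = {"Q1": 2, "Q2": 1, "Q3": 1, "Q4": 3, "Q5": 4}

print(find_conflicts(first_pass, second_pass))  # ['Q2', 'Q5']
```

Only the flagged questions need to go back to the clerks (or to a reviewer) together with the paper form; everything that matches is accepted as entered.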

一世旳自豪 2024-09-11 00:44:07

OCR/OMR is probably the best choice, since you rule out unpredictable human error and replace it with fairly predictable machine error. It may even be possible to filter out forms that the OCR may struggle with and have these amended to improve scan accuracy.

But, tackling the original question head on:

Error Checking

  • Have questions correlated, so that essentially the same thing is asked more than once, or asked again in the negative. If the answers to correlated questions do not also correlate, this could be an indication of input error.
  • Deviations from the norm: if there are patterns in the typical responses, then deviations from these typical responses could be considered potential input errors. E.g. if questions 2 and 3 are answered A, then question 4 is likely to be C or D. This is a generalization of the correlation above. The correlations can be computed dynamically from the data already entered.
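A rough sketch of the "deviations from the norm" idea, for a single pair of questions: count how often each answer to one question accompanies each answer to another in the forms already entered, then flag a new form whose combination is rare. The function names and the `min_support` and `min_share` thresholds are illustrative assumptions that would need tuning on real data.

```python
from collections import Counter, defaultdict


def build_pair_counts(forms, q_given, q_target):
    """Count how often each q_target answer occurs together with
    each q_given answer in the forms entered so far."""
    counts = defaultdict(Counter)
    for form in forms:
        counts[form[q_given]][form[q_target]] += 1
    return counts


def is_unusual(form, counts, q_given, q_target, min_support=20, min_share=0.02):
    """Flag a form whose q_target answer is rare given its q_given answer."""
    seen = counts.get(form[q_given], Counter())
    total = sum(seen.values())
    if total < min_support:
        return False  # too little data to judge this combination
    return seen[form[q_target]] / total < min_share
```

A flag here only means "look at the paper again": an unusual combination can be a genuine answer, so flagged values should be re-checked against the form, never corrected automatically.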

GUI

  • Have the GUI mimic the paper form, so that what entry clerks see on paper is reflected on the screen. Entering a paper question response into the wrong question in the GUI is then less likely.
  • Provide visual assistance to data entry clerks, such as using a slider to maintain the current question location on paper.
  • A custom entry device for inputting the data may be easier to use than keyboard navigation and listboxes. For example, a touch display with all options spelled out A B C D. The clerk only has to hit an option, and it is selected and the next question shown after a brief pause. In the event the clerk makes an error, they can use the prev/next buttons next to each question.
  • Provide audio feedback of entered data, so when the clerk enters "A" they hear "A".

EDIT:
If you consider performing dual-entry of data or implementing an improved GUI, it may be worth conducting a pilot scheme to assess the effectiveness of various approaches. Dual-entry can be expensive (doubling the cost of the data entry task), which may or may not be justified by the improvement in accuracy. A pilot scheme will allow you to assess the effectiveness of dual-entry quickly and relatively inexpensively. It will also give you an idea of the level of error from a single data entry clerk without any UI changes, which can help determine whether UI changes or other error-reducing strategies are needed and how much cost can be justified in implementing them.

你与昨日 2024-09-11 00:44:07

"My first suggestion is that at the end of the processing of all the hand filled forms, the software could choose some forms randomly to make a double check of the responses in a few instances"

I don't think this will actually produce a meaningful outcome. Presumably the errors are unintentional and random. Random checks would find systemic errors, but you'll only find 10% of random errors if you double-check 10% of the forms (and 20% of errors if you check 20% of forms, etc).

What do the paper surveys look like? If possible, I would guess that a better solution is an OCR system that scans the handwritten surveys and compares the answer it detects with the one the data entry operator entered. You might still end up manually double-checking a fair number of surveys, but you'll have some confidence that the surveys you double-check are more likely to contain an error than if you just picked them out at random.

If you also control what the paper surveys look like, then that's even better: you can design them specifically so that OCR can be made as accurate as possible.

清风无影 2024-09-11 00:44:07

Forgive me for totally side-stepping the question, but yesterday I went to eBay and paid US $99 for a 7 inch Android slate PC. Not the world's fastest processor, nor with heaps of RAM, but certainly enough to fill in user surveys in the field.

I can't believe that your organization can't afford $99 per interviewer to make this problem go away.

It's worth suggesting to your boss, at least, isn't it?

伏妖词 2024-09-11 00:44:07

I would support Matt Parker's suggestion of using double entry to reduce errors. I have even seen triple entry used for very error-sensitive data entry tasks.

The good thing about double entry is it enables you to come up with a ballpark estimate of your overall error rate by making some assumptions (mainly that the error rate is consistent across entry items and clerks) and using the rate at which entry conflicts are encountered.
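One way to turn the conflict rate into such a ballpark figure is sketched below. The model is an assumption, not something taken from a specific tool: both clerks err independently with the same per-field rate p, each question has k options, and a wrong entry is equally likely to be any of the other k - 1 options. Under that model the conflict rate is d = 2p - p²·k/(k - 1), which can be solved for p. The function names are hypothetical.

```python
import math


def estimate_error_rate(conflict_rate, n_options):
    """Estimate the per-field single-entry error rate p from the observed
    double-entry conflict rate d, assuming d = 2p - p^2 * k / (k - 1)."""
    c = n_options / (n_options - 1)
    disc = 1 - c * conflict_rate
    if disc < 0:
        raise ValueError("conflict rate too high for this model")
    # Smaller root of c*p^2 - 2p + d = 0, the plausible one for small rates.
    return (1 - math.sqrt(disc)) / c


def residual_error_rate(p, n_options):
    """Share of fields where both clerks made the same mistake,
    so no conflict is raised and the error survives double entry."""
    return p * p / (n_options - 1)
```

The second function shows why double entry works so well under these assumptions: the errors that slip through are those where both clerks pick the same wrong option, which happens at a rate on the order of p² rather than p.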

More sophisticated double entry systems can also measure the error rates of parts of the data entry task and individual clerks so that you can make improvements to reduce the error rate.

川水往事 2024-09-11 00:44:07

It sounds like a combined approach is needed, and the actual forms should be suitable for automated processing. You could scan the documents and deal only with the electronic version; if the multiple-choice input can be processed automatically, you might get better error ratios by keeping the user out of the loop. Depending on the OCR package, I would guess that you get a value back that tells you how sure the system is about a selection it has made; depending on that value, you will want to have the form verified by a person. Note that I am talking about using OCR on the multiple-choice marks, not on the freeform entries, which are probably an issue by themselves.

In parallel you will probably want to do random checks to find the error ratio of the OCR system. This value can then be used to determine the confidence value for the sum of the multiple-choice questions.

I think a similar approach would be helpful if you just go with human input. You will probably not get rid of all the errors, because people will make errors and they will make errors correcting errors, but with a large enough sample size you will probably be able to determine the ratio of errors in the human input. This number can then be used when determining the results of the survey.
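To make "a large enough sample size" concrete, the error ratio found in a random audit can be reported with a confidence interval. The sketch below uses the Wilson score interval, a standard choice for proportions; it is a suggestion of mine rather than something the answer prescribes, and the function name and audit counts are made up.

```python
import math


def wilson_interval(errors, checked, z=1.96):
    """Wilson score interval for the true error proportion.
    z = 1.96 corresponds to roughly 95% confidence."""
    if checked <= 0:
        raise ValueError("checked must be positive")
    p_hat = errors / checked
    denom = 1 + z * z / checked
    centre = (p_hat + z * z / (2 * checked)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / checked + z * z / (4 * checked * checked)
    ) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical audit: 12 wrong fields found among 1000 re-checked fields.
low, high = wilson_interval(12, 1000)
print(f"error rate is plausibly between {low:.4f} and {high:.4f}")
```

If the interval is too wide to be useful, the audit sample is too small; re-checking more randomly chosen fields narrows it.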

As for other UI ideas, you could use the scanned forms and overlay the UI in a way that the UI checkbox is close to the written checkbox. If you have a couple of known lines at angles, straightening and scaling the form should not be too hard. If the UI input element is close to the pencil marks chances are you are going to get higher rates for correct classification.

You can probably also use statistical analysis to pick forms that seem out of line, but you might then skew the result through non-uniform selection, which might be worse than a uniform random error. Depending on the design of the paper survey, it might be helpful to copy that design in the UI; it will be easier for everybody to find errors if the two look similar. If you don't stick to that, some of the references on survey design might be helpful.

This seems to be a rather large operation, and I am sure there are some statisticians on staff. Talk to them about what they need, what you could do to help them, and what you should not do so as not to skew the results even more.

岁月打碎记忆 2024-09-11 00:44:07

After you've implemented your best mix of software approaches to this problem, you could also consider running the output through Amazon's Mechanical Turk program to perform a human cross-check of the transcription against the originals. Other projects along those lines are reCaptcha (though it's only for printed text OCR as far as I can tell), and I just came across Beextra, which seems to be doing things like cataloging Smithsonian media.

燃情 2024-09-11 00:44:07

Regarding detection of errors in transcription of multiple-choice answers, my suggestion is to use multiple data entry people and statistical profiling.

A statistician could compare the results to see if any questions stand out as having a markedly different answer distribution for answers entered by one data entry user vs. those of others. If so, then those questions can be flagged to be reentered from the forms.

Assuming that the forms are randomly assigned to data entry personnel, the entered results should have fairly similar answer distributions for a sufficiently large number of forms per data entry user.
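A statistician would likely start from something like a chi-square test of independence between clerk and answer for each question. The sketch below computes only the Pearson statistic and its degrees of freedom with the standard library; the function name and input layout are assumptions, and the resulting statistic still has to be compared against a chi-square critical value (for 3 degrees of freedom at the 5% level that value is 7.815).

```python
from collections import Counter


def chi_square_by_clerk(entries, options):
    """Pearson chi-square statistic for the clerk x answer table of one question.

    entries: iterable of (clerk_id, answer) pairs.
    Returns (statistic, degrees_of_freedom).
    """
    table = Counter(entries)
    clerks = sorted({clerk for clerk, _ in table})
    row = {c: sum(table[(c, o)] for o in options) for c in clerks}
    col = {o: sum(table[(c, o)] for c in clerks) for o in options}
    total = sum(row.values())
    stat = 0.0
    for c in clerks:
        for o in options:
            expected = row[c] * col[o] / total
            if expected > 0:
                stat += (table[(c, o)] - expected) ** 2 / expected
    used_cols = sum(1 for o in options if col[o] > 0)
    return stat, (len(clerks) - 1) * (used_cols - 1)
```

With over 200 questions, some will exceed the critical value by chance alone, so a flagged question is a reason to re-key a sample of that clerk's forms, not proof of an error.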

-黛色若梦 2024-09-11 00:44:07

Human double-checking is probably the most popular way to reach a low error count. If you'd like to speed it up, one person can just count the total number of answers given and write this number at the bottom of the survey (a sort of 'control sum'). The person who enters the data into your application should also type that number into a special field; the system can then count the entered answers and compare the count with the expected value. This catches errors in quantity but not in the correctness of the data.
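The control-sum comparison is simple to express. In this sketch the function name is hypothetical and a skipped question is assumed to be stored as `None`.

```python
def control_count_matches(entered_answers, control_count):
    """True when the number of non-blank entered answers equals the
    total that was counted by hand and written on the paper form."""
    answered = sum(1 for value in entered_answers.values() if value is not None)
    return answered == control_count


# Hypothetical form: Q3 was left blank on paper, so the hand count is 4.
form = {"Q1": 2, "Q2": 4, "Q3": None, "Q4": 3, "Q5": 1}
print(control_count_matches(form, 4))  # True
```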

You can also use some methods from data mining to detect errors in the entered data. Example: if you ask for age and salary range, you can create a rule that says: if age < X, the person most likely does not earn more than Y, so raise an alert and ask for revision. These are called association rules.
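A hand-written version of such a rule check might look like the sketch below. The field names `age` and `salary_band` and the thresholds are invented for illustration; in practice the rules would come from the survey designers or be mined from the data already entered.

```python
# Hypothetical rule set: each rule is a (predicate, message) pair.
# The thresholds here are made up and only show the mechanism.
RULES = [
    (
        lambda r: r["age"] < 18 and r["salary_band"] >= 4,
        "age under 18 with a top salary band, please re-check both fields",
    ),
]


def check_rules(record, rules=RULES):
    """Return the warning message of every rule that the record triggers."""
    return [message for predicate, message in rules if predicate(record)]
```

A triggered rule should ask the clerk to look at the paper again rather than block the entry, since an unlikely answer can still be the one the interviewer wrote down.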

GUI: it should be a 1:1 representation of the paper form. Some keyboard shortcuts might help speed up the work.

留蓝 2024-09-11 00:44:07

As has been mentioned, key it twice. Yes it's "double the work", but that leads to point 2.

Make the surveys EASY TO KEY.

They should be simple to read for the keyers. The sections that need their attention should be well highlighted so they stand out from the noise of the form.

Your "GUI" shouldn't be. A GUI's primary benefit is "discoverability", and these folks shouldn't be "discovering" anything. Keyboard navigation should be the "only" way once they start keying stuff in. One or two hands on the keyboard, one hand for changing survey page == no hands for a mouse. Attention to the screen (for a mouse, or anything really) is attention away from the survey for keying.

The keyers should be "heads down", not having to look at the screen at all. If practical, you can use audio prompts to tell the keyers when they've switched pages, to help ensure that what they're keying and what the computer is recording are basically the same thing. If audio prompts aren't possible, then simply have the entry people key in the page of the survey that they are on. The computer will already "know" it's on page "2", so when the keyer keys in the page number, it can validate that they're on the same spot.

DO use audible prompts for keying errors. Don't let them key in garbage, hit "save" and then correct errors. If you KNOW the data is wrong right away, STOP them and have them fix it immediately. Nothing catches their attention like 5 or 6 "ding ding dings", because they're already keying 3 fields later before they realize the computer stopped them. Auditing a long questionnaire for errors is a waste of time.

Do NOT "scroll" your data screens. Page back and forth. Scrolling sucks. When you scroll, fields on the screens move. When you don't they're always in the same spot so when the entry person DOES need to look at the screen, they can always look at the same place.

Because of this, drop down lists of any length -- suck. They shouldn't be using drop downs anyway, as they shouldn't be looking at the screen anyway. The form should TELL THEM EXACTLY what they need to key.

Be consistent with the data entry. Use the 10 key as much as possible. If you have more than 10 options, and 0-9 isn't practical for the entire questionnaire, then you should use 00-99. Don't use A-Z for options, as people don't think of keys that way. They don't memorize letters on the keyboard as much as they memorize word patterns on the keyboard. 01-26 is far faster to key than A-Z any day of the week.

Also, the SHIFT key is NOT your friend. But it'll be fine when they're in "typing English" mode.

Finally, organize the survey so all the "typing", "fill in the blank" stuff is in one section (ideally at the end). This lets them 10 key the rest in a blaze, get into a zone, and not have to move their hands back and forth. Many folks will "top key" numbers when typing "English" (i.e. use the top row) and 10 key numbers when not.

此岸叶落 2024-09-11 00:44:07

For the multiple choice questions, it seems like an automated scan would be fairly reliable. If you have the option of scanning in all the documents before data entry starts, then incorporate the scans into the UI with computer guesses in place.

For a multiple choice question, have the data entry form on one side and the original scan on the other side. If the computer guess is above a certain threshold, fill in that choice in the data entry area. If the computer guess is below a certain threshold (multiple answers or no answer found) then do not mark an initial answer and highlight that question as needing attention. Even without the guesses, having the scanned paper visible on screen next to the data entry seems helpful.
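The threshold logic could be sketched as follows. It assumes the scanning step yields, for each option, a confidence between 0 and 1 that the box is marked; that interface, the function name, and the `accept` and `margin` values are all assumptions for illustration rather than the API of any particular OMR package.

```python
def prefill_from_scan(scores, accept=0.90, margin=0.50):
    """Decide whether to pre-fill a question from scan confidences.

    scores: dict mapping option -> confidence in [0, 1] that it is marked.
    Returns (prefilled_option_or_None, needs_attention).
    """
    if not scores:
        return None, True
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_option, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # Pre-fill only when one option is clearly marked and no other comes close.
    if best >= accept and best - runner_up >= margin:
        return best_option, False
    return None, True  # multiple marks or no clear mark: highlight for the clerk
```

The margin test covers the "multiple answers" case: two boxes with high confidence leave the field empty and highlighted instead of silently picking one.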

For the handwritten answers I have no real suggestions beyond having the scanned input beside the data entry area. Even if the image is not as legible as the original document, it will help ensure that the correct text is entered for each question. A fairly common input error is to be off by one, where the correct answer is entered for the wrong question. Having the image on screen could reduce that a little, and make it easier for another human to verify.

This assumes that all the forms are identical in layout so you can write some code to display a certain part of a certain page and expect it to be the right part of the form.

青柠芒果 2024-09-11 00:44:07

Design a closed loop system.

You have to inject, once in a while, double-blind "reference forms" to be entered by your regular personnel, so that you can automatically rate their performance and provide feedback based on the success rate.

This will control the human factor motivation and eliminate the major source of input errors.
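Scoring a clerk against seeded reference forms could look like the sketch below. The data layout (form id mapped to a dict of question answers) and the function name are assumptions; the reference answers would be keyed in advance by a trusted reviewer.

```python
def score_clerk(entered_forms, reference_answers):
    """Accuracy of one clerk on the seeded reference forms.

    entered_forms: form_id -> {question: value} as keyed by the clerk.
    reference_answers: form_id -> {question: value}, known-correct values
    for the seeded forms only.
    Returns the share of correct fields, or None if nothing was checked.
    """
    checked = 0
    correct = 0
    for form_id, truth in reference_answers.items():
        entered = entered_forms.get(form_id)
        if entered is None:
            continue  # this clerk has not keyed that reference form yet
        for question, value in truth.items():
            checked += 1
            if entered.get(question) == value:
                correct += 1
    return correct / checked if checked else None
```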
