How to conjugate English words in Java?

Posted on 2024-08-28 08:08:29

Say I have a base form of a word and a tag from the Penn Treebank Tag Set. How can I get the conjugated form? For example for "do" and "VBN" how can I get "done"?

I think this task is already implemented in some NLP library, so I'd rather not reinvent the wheel. Does something like that exist?

Comments (2)

征﹌骨岁月お 2024-09-04 08:08:29

If you have a class:

public class Treebank {
    public String conjugate(String base, String formTag) { ... }

    ...
}

Then:

String conjugated = treebank.conjugate(base, formTag);

If you don't have the Treebank class it might look a bit like this:

import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

public class Treebank {
    // base form -> (Penn Treebank tag -> conjugated form)
    private final Map<String, Map<String, String>> m_map = new HashMap<String, Map<String, String>>();

    public Treebank() {
        populate();
    }

    // Returns the conjugated form, or null if the base/tag pair is unknown.
    public String conjugate(String base, String formTag) {
        Map<String, String> entry = m_map.get(base);
        return entry == null ? null : entry.get(formTag);
    }

    private void populate() {
        InputStream istream = openDataFile();

        try {
            for (Record record = readRecord(istream); record != null; record = readRecord(istream)) {

                // Add the entry, creating the inner map the first time a base form is seen
                Map<String, String> entry = m_map.get(record.base);

                if (entry == null) {
                    entry = new HashMap<String, String>();
                    m_map.put(record.base, entry);
                }

                entry.put(record.formTag, record.conjugatedForm);
            }
        }
        finally {
            closeDataFile(istream);
        }
    }

    // Data management - to be implemented.
    private InputStream openDataFile()                     { ... }
    private Record      readRecord(InputStream istream)    { ... }
    private void        closeDataFile(InputStream istream) { ... }

    // One row of the data file: a base form, a tag, and the resulting form.
    private static class Record {
        String base;
        String formTag;
        String conjugatedForm;
    }
}

A better solution might involve a database instead of a data file. I would also refactor the data access code into a Data Access Object.
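
To show the lookup running end to end, here is a minimal self-contained sketch. It assumes the data arrives as lines of the form base<TAB>tag<TAB>form; that layout, the class name ConjugationTable, and the method fromLines are all made up for illustration and are not part of the answer above or of any library.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a nested-map lookup table filled from text lines.
class ConjugationTable {
    // base form -> (Penn Treebank tag -> inflected form)
    private final Map<String, Map<String, String>> forms = new HashMap<>();

    // Builds a table from lines shaped like "do\tVBN\tdone".
    // The tab-separated layout is an assumption made for this example.
    static ConjugationTable fromLines(Iterable<String> lines) {
        ConjugationTable table = new ConjugationTable();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length != 3) {
                continue; // skip blank or malformed lines
            }
            table.forms
                 .computeIfAbsent(fields[0], k -> new HashMap<>())
                 .put(fields[1], fields[2]);
        }
        return table;
    }

    // Returns the inflected form, or null when the pair is not in the table.
    String conjugate(String base, String tag) {
        Map<String, String> byTag = forms.get(base);
        return byTag == null ? null : byTag.get(tag);
    }

    public static void main(String[] args) {
        ConjugationTable table = fromLines(Arrays.asList("do\tVBN\tdone", "do\tVBD\tdid"));
        System.out.println(table.conjugate("do", "VBN")); // prints "done"
    }
}
```

In practice the lines could come from something like Files.readAllLines on a data file. A pair that is missing from the data simply yields null, so the caller decides how to handle unknown words.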

无需解释 2024-09-04 08:08:29

What you want to do here is create a sparse two-dimensional table holding the answers, indexed by the term itself as one key and the Penn Treebank tag (CC, TO, VBD) as the other key.
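
One way to read "sparse array" in Java is a single hash map keyed by the (term, tag) pair, so only the combinations that actually exist take up space. The sketch below is one possible shape under that reading; the names SparseConjugations and Key are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: a sparse two-key table backed by one HashMap.
class SparseConjugations {
    // Composite key made of the base form and the Penn Treebank tag.
    private static final class Key {
        final String base;
        final String tag;

        Key(String base, String tag) {
            this.base = base;
            this.tag = tag;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return base.equals(that.base) && tag.equals(that.tag);
        }

        @Override
        public int hashCode() {
            return Objects.hash(base, tag);
        }
    }

    // Only populated (base, tag) cells are stored.
    private final Map<Key, String> cells = new HashMap<>();

    void put(String base, String tag, String form) {
        cells.put(new Key(base, tag), form);
    }

    // Returns the stored form, or null for an empty cell.
    String get(String base, String tag) {
        return cells.get(new Key(base, tag));
    }
}
```

Compared with a map of maps, the composite key keeps every lookup to a single hash probe, at the cost of not being able to list all forms of one word cheaply.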
