atomic Java vespa

Vespa visitor on indexed documents

Posted on 2025-01-10 19:33:00

I want to assign an ID to every document in a Vespa cluster.

But I don't completely understand how visitors work in Vespa.

Can I get a shared field (meaning shared by all instances of my visitor) that I can atomically increment (using some lock) every time I visit a document?

What I tried obviously doesn't work, but you'll see the general idea:

import java.util.Iterator;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

import com.yahoo.docproc.DocumentProcessor;
import com.yahoo.docproc.Processing;
import com.yahoo.document.Document;
import com.yahoo.document.DocumentOperation;
import com.yahoo.document.DocumentPut;

public class MyVisitor extends DocumentProcessor {

    // where should I put this?
    private int document_id;

    private final Lock lock = new ReentrantLock();

    @Override
    public Progress process(Processing processing) {
        Iterator<DocumentOperation> it = processing.getDocumentOperations().iterator();
        while (it.hasNext()) {

            DocumentOperation op = it.next();
            if (op instanceof DocumentPut) {

                Document doc = ((DocumentPut) op).getDocument();
                /*
                 * Remove the PUT operation from the iterator so that it is not indexed back in
                 * the document cluster
                 */
                it.remove();

                // Acquire the lock before the try block so that unlock() only
                // runs when lock() actually succeeded.
                lock.lock();
                try {
                    document_id += 1;
                } finally {
                    lock.unlock();
                }
            }
        }
        return Progress.DONE;
    }
}

Another idea is to get the number of buckets and the ID of the bucket I'm currently dealing with, and to increment using this pattern:

document_id = bucket_id
document_id += bucket_count

This would work (if I can ensure my visitor operates on a single bucket at a time), but I don't know how to get this information from my visitor.
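
To make the pattern concrete, here is a minimal sketch of the arithmetic only. `StridedIdGenerator` is a hypothetical name, not a Vespa API, and the sketch assumes bucket IDs are numbered from 0 up to bucket_count - 1, which may not match how Vespa actually identifies buckets:

```java
/**
 * Hypothetical helper: yields offset, offset + stride, offset + 2 * stride, ...
 * With offset = bucket_id and stride = bucket_count, generators for
 * different buckets never produce the same value.
 */
final class StridedIdGenerator {

    private long next;
    private final long stride;

    StridedIdGenerator(long offset, long stride) {
        if (offset < 0 || offset >= stride) {
            throw new IllegalArgumentException("offset must be in [0, stride)");
        }
        this.next = offset;
        this.stride = stride;
    }

    /** Returns the current ID and advances by one stride. */
    long nextId() {
        long id = next;
        next += stride;
        return id;
    }
}
```

With 3 buckets, the generator for bucket 0 yields 0, 3, 6 and the generator for bucket 1 yields 1, 4, 7.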


滥情稳全场 2025-01-17 19:33:00

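Document processors operate on incoming document writes, so they cannot be applied to the results of a visit (not without more setup, anyway).

To visit the documents, you can simply fetch all of them over HTTP/2: https://docs.vespa.ai/en/reference/document-v1-api-reference.html#visit

Then use the same API to issue an update operation for each document that sets the field: https://docs.vespa.ai/en/reference/document-v1-api-reference.html#put

Since this is done by a single process, you can keep a document_id counter there that assigns unique values.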
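
A rough sketch of the client-side pieces, under stated assumptions: the class `DocumentNumbering`, the endpoint, the namespace and the document type are all hypothetical, and only document IDs of the plain form `id:<namespace>:<type>::<key>` are handled. The sketch only builds the URLs and the update body. Actually sending the requests (for example with `java.net.http.HttpClient`) and parsing the `documents` and `continuation` fields of each visit response are left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper for numbering documents from one client process. */
final class DocumentNumbering {

    // The single counter owned by the one process that does the visiting.
    private final AtomicLong nextNumber = new AtomicLong(0);

    /** Visit URL for one page; pass null as the continuation for the first page. */
    static String visitUrl(String endpoint, String namespace, String docType, String continuation) {
        String url = endpoint + "/document/v1/" + namespace + "/" + docType + "/docid";
        if (continuation == null) {
            return url;
        }
        return url + "?continuation=" + URLEncoder.encode(continuation, StandardCharsets.UTF_8);
    }

    /** Maps an ID of the form id:<namespace>:<type>::<key> to its /document/v1 URL. */
    static String documentUrl(String endpoint, String documentId) {
        String[] parts = documentId.split(":", 5);
        if (parts.length != 5 || !parts[0].equals("id") || !parts[3].isEmpty()) {
            throw new IllegalArgumentException("Unsupported document id: " + documentId);
        }
        // URLEncoder produces form encoding, so turn '+' back into '%20' for a path segment.
        String key = URLEncoder.encode(parts[4], StandardCharsets.UTF_8).replace("+", "%20");
        return endpoint + "/document/v1/" + parts[1] + "/" + parts[2] + "/docid/" + key;
    }

    /** JSON body for a PUT (update) request that assigns the next counter value to the field. */
    String nextUpdateBody(String fieldName) {
        return "{\"fields\":{\"" + fieldName + "\":{\"assign\":" + nextNumber.getAndIncrement() + "}}}";
    }
}
```

The loop would then be: request `visitUrl(...)`, send one PUT to `documentUrl(...)` with `nextUpdateBody("document_id")` for every document in the response, and repeat with the returned continuation token until the response no longer contains one.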

As an aside, a common trick for avoiding this requirement is to generate a UUID for each document.
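
For illustration, a minimal sketch of the UUID approach using only the JDK. `DocumentUuids` and its method names are hypothetical. The second variant, which derives the UUID from an existing key, is an assumption about what may be convenient and is not part of the answer above:

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Hypothetical helper that produces per-document identifiers without a shared counter. */
final class DocumentUuids {

    /** Random (version 4) UUID: no coordination between feeders is needed. */
    static String randomId() {
        return UUID.randomUUID().toString();
    }

    /** Name-based (version 3) UUID: the same source key always yields the same ID. */
    static String stableId(String sourceKey) {
        return UUID.nameUUIDFromBytes(sourceKey.getBytes(StandardCharsets.UTF_8)).toString();
    }
}
```

The value can be set as a field when the document is first fed, so no later visiting pass is required.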
