How can read-only data in a ByteBuffer be safely consumed multiple times by an InputStream?

Posted 01-19 12:14 · 806 words · 6 views · 0 comments

I am building an API and I want some read-only, static data to be loaded at startup. Below, I call a remote API at startup to load this data, and its SDK only returns a ByteBuffer:

class MyService {
  private ByteBuffer remoteData;

  @PostConstruct
  public void init() {
    remoteData = callAPI(); // the remote SDK returns a ByteBuffer
  }

  public void getDataAndDoSomething(Request req) throws IOException {
    // Reading through the stream advances remoteData's position, so a second call sees no data.
    try (InputStream is = new ByteBufferBackedInputStream(remoteData)) {
      // proceed with ByteBufferBackedInputStream
    }
  }
}

The issue with the above is that after the first invocation of getDataAndDoSomething(), remoteData is no longer consumable. This wouldn't be an issue if I made remoteData a local variable and called the remote API each time, but I'd like to load remoteData only once, at startup.

I suspect I need to make a deep copy of it somehow each time an InputStream wants to consume it, but the ByteBuffer APIs are rather confusing. What is a good approach to make this safe to consume from multiple threads that invoke getDataAndDoSomething()?

冷默言语 · 2025-01-26 12:14:41

One idea is to clone the current ByteBuffer on every call, which you can do with the duplicate() method:

ByteBuffer java.nio.ByteBuffer.duplicate()
Creates a new byte buffer that shares this buffer's content.

class MyService {
  private ByteBuffer remoteData;

  @PostConstruct
  public void init() {
    remoteData = callAPI(); // the remote SDK returns a ByteBuffer
  }

  public void getDataAndDoSomething(Request req) throws IOException {
    // duplicate() shares the bytes but has its own position, so remoteData is never consumed.
    ByteBuffer remoteToBeUsed = remoteData.duplicate();
    try (InputStream is = new ByteBufferBackedInputStream(remoteToBeUsed)) {
      // proceed with ByteBufferBackedInputStream
    }
  }
}
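
To make "shares this buffer's content" concrete, here is a small sketch using only java.nio. The class and method names (DuplicateDemo, positionsAfterOneRead, writeThroughDuplicate) are made up for illustration, and a wrapped byte array stands in for whatever callAPI() returns.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class DuplicateDemo {
    /** Reads one byte through a duplicate; returns {original position, duplicate position}. */
    static int[] positionsAfterOneRead(ByteBuffer original) {
        ByteBuffer copy = original.duplicate();
        copy.get(); // advances only the duplicate's position
        return new int[] {original.position(), copy.position()};
    }

    /** Writes through a duplicate and returns what the original then holds at index 0. */
    static byte writeThroughDuplicate(ByteBuffer original, byte value) {
        original.duplicate().put(0, value); // absolute put; the bytes are shared
        return original.get(0);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the buffer returned by callAPI().
        ByteBuffer original = ByteBuffer.wrap(new byte[] {10, 20, 30});
        System.out.println(Arrays.toString(positionsAfterOneRead(original))); // [0, 1]
        System.out.println(writeThroughDuplicate(original, (byte) 99));       // 99
    }
}
```

The positions are independent, but the bytes are not copied, so this is only safe as long as nobody writes to the shared data.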

戒ㄋ · 2025-01-26 12:14:41

No need for a deep copy. ByteBuffers are purely in-memory constructs (which does mean that if the API returns a huge amount of data, say 500MB or more, keeping it all around is not a good idea), and you can trivially reset them.

Buffers start at index 0 (that part's simple enough) and have a fixed capacity (a set size; buffers do not grow or shrink). They also have a limit, which makes it possible to fill less than the full capacity and then 'flip' the buffer to read back exactly what was just written. Their usual purpose is to be an intermediary: a 'writer' fills the buffer, whether to capacity or not, and then the buffer is flipped so that a 'reader' reads what was just put in it; when the reader is done, the buffer is reset to 'write' mode, back and forth, over and over. Thus, a buffer tracks 4 numbers: position, limit, capacity, and an optional mark (a saved position that reset() returns to), with 0 <= mark <= position <= limit <= capacity.

The buffer as provided presumably begins in this state (as returned by callAPI()):

position = 0
limit = the total size of the data sent by the API
capacity = something; hopefully equal to limit, otherwise it's wasted memory
mark = not set

When you then use it as the source for your ByteBufferBackedInputStream, whatever consumes the input stream ends up moving the position forward. Assuming it reads the entire content, the position ends up equal to the limit.
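
The effect of consuming a buffer can be observed directly. The following is a minimal sketch, assuming a wrapped byte array as a stand-in for the buffer returned by callAPI(); BufferStateDemo, state and drain are names invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BufferStateDemo {
    /** Formats the bookkeeping values the buffer exposes. */
    static String state(ByteBuffer buf) {
        return "position=" + buf.position() + " limit=" + buf.limit() + " capacity=" + buf.capacity();
    }

    /** Reads every remaining byte, which advances position up to limit. */
    static int drain(ByteBuffer buf) {
        int count = 0;
        while (buf.hasRemaining()) {
            buf.get();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // wrap() yields position 0 and limit == capacity == array length.
        ByteBuffer data = ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println(state(data)); // position=0 limit=5 capacity=5
        drain(data);
        System.out.println(state(data)); // position=5 limit=5 capacity=5
    }
}
```

Once position equals limit, hasRemaining() is false, which is why a stream built on the same buffer a second time appears empty.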

Thus, all you need to do to get back to the original state is to reset the position to 0.

Which is trivial to do, fortunately:

remoteData.position(0);

and you can use it again.
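
A single-threaded sketch of that reset, assuming the buffer started at position 0 when it was handed over. RewindDemo and readAll are hypothetical names; rewind() is the standard Buffer method that also sets the position to 0 (and additionally discards the mark).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class RewindDemo {
    /** Consumes everything between position and limit and returns it as text. */
    static String readAll(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.get(out); // advances position to limit
        return new String(out, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.wrap("static data".getBytes(StandardCharsets.US_ASCII));
        System.out.println("[" + readAll(data) + "]"); // [static data]
        System.out.println("[" + readAll(data) + "]"); // [] because position == limit
        data.position(0);                              // data.rewind() would also work here
        System.out.println("[" + readAll(data) + "]"); // [static data]
    }
}
```

This only works when one consumer at a time uses the buffer, since the position being reset is shared state.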

A ByteBuffer object refers to the actual data (usually a byte[], though that is abstracted and it could be something else, such as off-heap memory for a direct buffer), as well as those 4 numbers.

Hence, resetting the position is not going to work if you create 4 ByteBufferBackedInputStreams over the same buffer simultaneously and hand them off to various threads. They all only read, so the data itself is not going to get corrupted, but the position is shared between them. You want each thread to have its own.

You can do that too: create new ByteBuffer objects that share the same backing data:

ByteBuffer clone = remoteData.duplicate();

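The name 'duplicate' is a bit of a misnomer: it does not duplicate the backing data, but it does give you a new buffer with its own independent position, limit, and mark. The duplicate starts out with whatever values the source has at the moment duplicate() is called, so always duplicate from a buffer that has not been consumed. For 4 threads, duplicate the buffer 3 times for a total of 4 buffers and hand each thread one of them, or simply call duplicate() at the start of every request.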
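
Putting it together, here is one way a service could hand out independent streams over the same bytes. This is a sketch under assumptions, not the SDK's or Jackson's API: SharedBufferDemo, BufferInputStream, openStream and readFully are hypothetical names, BufferInputStream is a minimal stand-in for ByteBufferBackedInputStream, asReadOnlyBuffer() is an optional extra guard against accidental writes, and readAllBytes() requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedBufferDemo {
    /** Hypothetical minimal stand-in for ByteBufferBackedInputStream. */
    static final class BufferInputStream extends InputStream {
        private final ByteBuffer buf;

        BufferInputStream(ByteBuffer buf) { this.buf = buf; }

        @Override
        public int read() {
            return buf.hasRemaining() ? (buf.get() & 0xFF) : -1;
        }

        @Override
        public int read(byte[] dst, int off, int len) {
            if (len == 0) return 0;
            if (!buf.hasRemaining()) return -1;
            int n = Math.min(len, buf.remaining());
            buf.get(dst, off, n);
            return n;
        }
    }

    // Never read from this buffer directly; it is only ever duplicated.
    private final ByteBuffer master;

    SharedBufferDemo(ByteBuffer source) {
        this.master = source.asReadOnlyBuffer();
    }

    /** Every caller gets its own position and limit over the same bytes. */
    InputStream openStream() {
        return new BufferInputStream(master.duplicate());
    }

    static String readFully(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        SharedBufferDemo service = new SharedBufferDemo(
                ByteBuffer.wrap("static data".getBytes(StandardCharsets.US_ASCII)));
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < 4; i++) {
                results.add(pool.submit(() -> {
                    try (InputStream in = service.openStream()) {
                        return readFully(in);
                    }
                }));
            }
            for (Future<String> f : results) {
                System.out.println(f.get()); // static data
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The design relies on the master buffer never being read or repositioned after construction. Buffers are documented as not safe for concurrent use in general, so the only thing threads do with the shared instance here is call duplicate() on it.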
