Can I use the [] operator in C++ to create virtual arrays?

Posted on 2024-08-31 17:00:06

I have a large code base, originally C ported to C++ many years ago, that operates on a number of large arrays of spatial data. These arrays contain structs representing the point and triangle entities that make up surface models. I need to refactor the code so that the specific way these entities are stored internally varies for specific scenarios. For example, if the points lie on a regular flat grid, I don't need to store the X and Y coordinates, as they can be calculated on the fly, as can the triangles. Similarly, I want to take advantage of out-of-core tools such as STXXL for storage. The simplest way of doing this is replacing array access with put and get type functions, e.g.

point[i].x = XV;

becomes

Point p = GetPoint(i);
p.x = XV;
PutPoint(i,p);

As you can imagine, this is a very tedious refactor on a large code base, prone to all sorts of errors en route. What I'd like to do is write a class that mimics the array by overloading the [] operator. As the arrays already live on the heap, and move around with reallocs, the code already assumes that references into the array such as

point *p = point + i;

may not be used. Is this class feasible to write? For example writing the methods below in terms of the [] operator;

void MyClass::PutPoint(int Index, Point p)
{
   if (m_StorageStrategy == RegularGrid)
   {
      int xoffs,yoffs;
      ComputeGridFromIndex(Index,xoffs,yoffs);
      StoreGridPoint(xoffs,yoffs,p.z);   // only z is stored; x and y are implied by the grid
   }
   else
   {
      m_PointArray[Index] = p;
   }
}

Point MyClass::GetPoint(int Index)
{
   if (m_StorageStrategy == RegularGrid)
   {
      int xoffs,yoffs;
      ComputeGridFromIndex(Index,xoffs,yoffs);
      return GetGridPoint(xoffs,yoffs);   // GetGridPoint returns Point
   }
   else
   {
      return m_PointArray[Index];
   }
}

My concern is that all the array classes I've seen tend to pass by reference, whereas I think I'll have to pass structs by value. I think it should work, but other than performance, can anyone see any major pitfalls with this approach? N.B. the reason I have to pass by value is to get

point[a].z = point[b].z + point[c].z

to work correctly where the underlying storage type varies.

病女 2024-09-07 17:00:06

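You do not need to pass the array by value. To mutate the values in the array, you want two versions of operator[]: one which returns a reference (for mutation) and one which returns a const reference.

There is no reason in principle not to use operator[], as long as you do not need to vary the type of the storage at run time. Operators are rarely declared virtual, so if you want runtime polymorphism you would normally route through a named function. In that case, you can create a simple struct which adapts the operator calls to function calls (though it rather depends on the storage API: if the code assumes that assigning to the point's member variables changes the stored data, you might also have to make the point type a template parameter so that this can be overridden).

Looking at your sample code, it tests the storage strategy on every access. Do not do this. Either use OO and have your storage object implement a common virtual interface, or (probably better) use template programming to vary the storage mechanism.

If you look at the guarantees made by std::vector (in the more recent C++ standards), it is possible to have something which has dynamic storage and still allows pointer arithmetic, though that requires contiguous storage. Given that some of your values are created on the fly, it is probably not worth placing that restriction on your implementations, but the constraint in itself does not prevent the use of operator[].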
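
As a minimal sketch of the template route suggested above, the storage mechanism can be a template parameter, with an array facade exposing the two operator[] overloads. The names DenseStorage, PointArray and at() are assumptions made for illustration and are not part of the original code base. A grid-backed storage that computes x and y on the fly could not return Point& like this; its at() would have to return some other type, such as a proxy.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Point { double x, y, z; };

// Hypothetical dense storage: every point is held explicitly,
// so handing out references is safe.
class DenseStorage {
public:
    explicit DenseStorage(std::size_t n) : points_(n) {}
    Point&       at(std::size_t i)       { return points_[i]; }
    const Point& at(std::size_t i) const { return points_[i]; }
    std::size_t  size() const { return points_.size(); }
private:
    std::vector<Point> points_;
};

// Array facade: the storage mechanism is chosen at compile time,
// so there is no per-access test of a storage strategy.
template <class Storage>
class PointArray {
public:
    explicit PointArray(Storage s) : storage_(std::move(s)) {}

    // Mutating access.
    Point& operator[](std::size_t i) { return storage_.at(i); }
    // Read-only access, selected when the array itself is const.
    const Point& operator[](std::size_t i) const { return storage_.at(i); }

    std::size_t size() const { return storage_.size(); }
private:
    Storage storage_;
};
```

With this in place, existing statements such as point[a].z = point[b].z + point[c].z compile unchanged against PointArray<DenseStorage>.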

阳光的暖冬 2024-09-07 17:00:06

What you want is possible, but as you need write access as well, the result will sometimes be a little more complex. Instead of returning direct write access to a Point, operator[] should return a temporary copy, which performs the write once the copy goes out of scope.

The following code fragment tries to outline the solution:

class PointVector
{
  MyClass container_;

  public:
  // Temporary copy of a point that writes itself back when it goes out of scope.
  class PointExSet: public Point
  {
    MyClass &container_;
    int index_;

    public:
    // GetPoint and PutPoint are the accessors declared in the question.
    PointExSet(MyClass &container, int index)
      :Point(container.GetPoint(index)),container_(container),index_(index)
    {
    }

    ~PointExSet()
    {
      container_.PutPoint(index_, *this);
    }
  };

  PointExSet operator [] (int i)
  {
    return PointExSet(container_,i);
  }
};

It is not as nice as you would probably hope it to be, but I am afraid you cannot get a much better solution in C++.
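
To make the outline above concrete, here is a self-contained sketch of the write-back idea against a regular-grid store. GridStore, its constructor parameters and the "write back only if changed" check are assumptions added for illustration; they are not part of the answer or of the original code base. One caveat worth knowing: standard C++ does not allow assignment to a built-in member of a temporary, so with a proxy that inherits plain double members, point[i].z = v is rejected by conforming compilers. The proxy has to be given a name first, as in the usage shown after the code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { double x, y, z; };

// Hypothetical regular-grid store: only z is kept, x and y derive from the index.
class GridStore {
public:
    GridStore(int cols, int rows, double spacing)
        : cols_(cols), spacing_(spacing),
          z_(static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows), 0.0) {}

    Point GetPoint(int index) const {
        Point p;
        p.x = (index % cols_) * spacing_;
        p.y = (index / cols_) * spacing_;
        p.z = z_[static_cast<std::size_t>(index)];
        return p;
    }
    void PutPoint(int index, const Point& p) {
        z_[static_cast<std::size_t>(index)] = p.z;
    }

private:
    int cols_;
    double spacing_;
    std::vector<double> z_;
};

class PointVector {
public:
    // Proxy: a copy of the point that writes itself back on destruction,
    // but only if a member was changed (an addition to the outline above,
    // so that pure reads do not trigger a store).
    class PointExSet : public Point {
    public:
        PointExSet(GridStore& store, int index)
            : Point(store.GetPoint(index)), store_(store), index_(index),
              original_(static_cast<const Point&>(*this)) {}
        ~PointExSet() {
            if (x != original_.x || y != original_.y || z != original_.z)
                store_.PutPoint(index_, *this);
        }
    private:
        GridStore& store_;
        int index_;
        Point original_;   // snapshot taken when the proxy was created
    };

    PointVector(int cols, int rows, double spacing) : store_(cols, rows, spacing) {}

    PointExSet operator[](int i) { return PointExSet(store_, i); }

private:
    GridStore store_;
};
```

A write then looks like { PointVector::PointExSet p = point[5]; p.z = 2.5; }, with the store updated when p leaves the block, while reads such as point[5].z work directly on the temporary. Two named proxies for the same index that are alive at once can still overwrite each other, so this suits short, local scopes.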

夏の忆 2024-09-07 17:00:06

To have full control over operations on the array, operator[] should return a special object (invented long ago and called a "cursor") that will handle the operations for you.
As an example:

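class Container;   // forward declaration, the cursors only hold a pointer

// Cursor for the x member of element i.
class XCursor
{
public:
    XCursor(Container* _container, int _i)
      : container(_container), i(_i) {}

    // write path: my_container[i].x = value
    XCursor& operator = (double value);
    // cursor-to-cursor assignment copies the value, not the cursor
    XCursor& operator = (const XCursor& xc) { return *this = double(xc); }
    // read path: double v = my_container[i].x
    operator double () const;

private:
    Container* container;
    int i;
};

class PointCursor
{
public:
    PointCursor(Container* _container, int _i)
       : x(_container, _i),     //initialize subcursor
         container(_container), i(_i) {}

    //subcursor
    XCursor x;
private:
    Container* container;
    int i;
};

class Container
{
public:
    PointCursor operator [] (int i)
    {
        return PointCursor(this, i);
    }
    // GetX and SetX are placeholders for the storage-specific accessors;
    // do whatever you want over x inside them
    double GetX(int i) const;
    void SetX(int i, double value);
};

inline XCursor& XCursor::operator = (double value) { container->SetX(i, value); return *this; }
inline XCursor::operator double () const { return container->GetX(i); }

//usage
my_container[i].x = their_container[j].x; //calls XCursor::operator = ()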

北笙凉宸 2024-09-07 17:00:06

After reading the above answers, I decided that Pete's answer with two versions of operator[] was the best way forward. To handle the morphing between types at run time, I created a new array template class that takes four parameters, as follows:

template<class TYPE, class ARG_TYPE, class BASE_TYPE, class BASE_ARG_TYPE>
class CMorphArray
{
public:
   // the class has pure virtual functions, so it is meant to be derived from
   virtual ~CMorphArray() {}

   int GetSize() { return m_BaseData.GetSize(); }
   BOOL IsEmpty() { return m_BaseData.IsEmpty(); }

   // Accessing elements
   const TYPE& GetAt(int nIndex) const;
   TYPE& GetAt(int nIndex);
   void SetAt(int nIndex, ARG_TYPE newElement);
   const TYPE& ElementAt(int nIndex) const;
   TYPE& ElementAt(int nIndex);

   // Potentially growing the array
   int Add(ARG_TYPE newElement);

   // overloaded operator helpers
   const TYPE& operator[](int nIndex) const;
   TYPE& operator[](int nIndex);

   // data in its native storage format
   CBigArray<BASE_TYPE, BASE_ARG_TYPE>  m_BaseData;
private:
   // elements converted to the format the calling code expects
   CBigArray<TYPE, ARG_TYPE>    m_RefCache;
   CBigArray<int, int&> m_RefIndex;
   CBigArray<int, int&> m_CacheIndex;

   // conversions between native and expected formats, supplied by derived classes
   virtual void Convert(BASE_TYPE,ARG_TYPE) = 0;
   virtual void Convert(TYPE,BASE_ARG_TYPE) = 0;

   void InitCache();
   TYPE&    GetCachedElement(int nIndex);
};

The main data storage is m_BaseData, which holds the data in its native format; as discussed, that format can vary in type. m_RefCache is a secondary array that caches elements in the expected format, and the GetCachedElement function uses the virtual Convert functions to translate the data as it moves in and out of the cache. The cache needs to be at least as big as the number of references that can be active at any one time, but in my case it will probably benefit from being bigger, as that reduces the number of conversions required. While Alsk's cursor implementation would probably have worked well, the solution given requires fewer object copies and temporary variables, and ought to afford slightly better performance, which is important in this case.

Apologies to all you STL fans for the older MFC look and feel; the rest of the project is MFC, so it makes more sense in this case. CBigArray was the result of a related Stack Overflow question and became the basis of my large array handling. I hope to finish the implementation today and test tomorrow. If it all goes belly up on me, I'll edit this post accordingly.
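
The reference cache described above can be sketched in a few lines of standard C++. This is only an illustration of the idea under stated assumptions, not the CMorphArray implementation: GridMorphArray, its round-robin eviction and its linear cache lookup are all hypothetical, and std::vector stands in for CBigArray. The native data holds one z value per grid node, and operator[] returns a reference into a small cache of full Point structs, converting on the way in and out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { double x, y, z; };

// Hypothetical sketch: native storage keeps one z per grid node, while
// operator[] hands out references into a small cache of full Points.
class GridMorphArray {
public:
    GridMorphArray(int cols, int rows, double spacing, std::size_t cacheSlots)
        : cols_(cols), spacing_(spacing),
          base_(static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows), 0.0),
          cache_(cacheSlots), owner_(cacheSlots, -1), next_(0) {
        assert(cacheSlots > 0);
    }

    Point& operator[](int index) {
        // A hit returns the existing slot, so repeated accesses alias one object.
        for (std::size_t s = 0; s < cache_.size(); ++s)
            if (owner_[s] == index) return cache_[s];
        // A miss evicts slots round-robin, converting the old element back first.
        const std::size_t s = next_;
        next_ = (next_ + 1) % cache_.size();
        if (owner_[s] >= 0)
            base_[static_cast<std::size_t>(owner_[s])] = cache_[s].z;
        cache_[s] = toPoint(index);
        owner_[s] = index;
        return cache_[s];
    }

private:
    // Native-to-Point conversion: x and y are computed, z is loaded.
    Point toPoint(int index) const {
        Point p;
        p.x = (index % cols_) * spacing_;
        p.y = (index / cols_) * spacing_;
        p.z = base_[static_cast<std::size_t>(index)];
        return p;
    }

    int cols_;
    double spacing_;
    std::vector<double> base_;   // plays the role of m_BaseData
    std::vector<Point> cache_;   // plays the role of m_RefCache
    std::vector<int> owner_;     // index held by each cache slot, -1 if empty
    std::size_t next_;           // next slot to evict
};
```

Because real references are returned, point[a].z = point[b].z + point[c].z works as written, provided the cache has at least as many slots as there are references alive in one expression (three here). In this grid sketch, writes to x or y are silently dropped on eviction, since only z is converted back.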
