@taby
It would be better to modify ‘test_function(char **c, size_t size)’ to take a vector of vectors. If that is not an option,
you probably need something like:
std::vector<char*> vectorToCharArray(std::vector<std::vector<char>>& vec) {
    std::vector<char*> ptrptr(vec.size());
    for (size_t i = 0; i < vec.size(); ++i)
        ptrptr[i] = vec[i].data();
    return ptrptr;
}
auto p = vectorToCharArray(….);
test_function(p.data(), p.size());
But keep in mind that the original std::vector<std::vector<char>> must outlive ptrptr and any copy of its pointers, because those pointers refer to memory owned by the original vectors!
Even resizing one of the original vectors can reallocate its storage and leave the stored pointer dangling, which may cause memory corruption!
If you respect that, you don't need to call ‘freeCharArray’ at all.
Q: ‘test_function(char **c, size_t size)’ gets just one size. What about the sizes of the subarrays?
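One way to make the lifetime rule above hard to break is to keep the rows and the pointer table inside one object, so they are destroyed together and nothing needs to be freed by hand. This is only a minimal sketch: the `CharTable` name and its members are hypothetical, and it assumes the legacy function does not keep the pointers after it returns.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: owns the rows AND the char* table that points into them.
struct CharTable {
    std::vector<std::vector<char>> rows;
    std::vector<char*> ptrs;

    explicit CharTable(std::vector<std::vector<char>> r) : rows(std::move(r)) {
        refresh();
    }

    // A copy would carry pointers into the other object's rows, so forbid it.
    CharTable(const CharTable&) = delete;
    CharTable& operator=(const CharTable&) = delete;

    // Rebuild the pointer table; call this again after resizing any row.
    void refresh() {
        ptrs.resize(rows.size());
        for (std::size_t i = 0; i < rows.size(); ++i)
            ptrs[i] = rows[i].data();
    }

    char** data() { return ptrs.data(); }
    std::size_t size() const { return ptrs.size(); }
};
```

With this, a call would look like `test_function(table.data(), table.size())`, and both vectors are released when `table` goes out of scope.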
C++ pointers that are released upon going out of scope
taby said:
OK, one more question. I am trying to convert a vector<vector<char>> to **char, and all of the AIs told me …
Is there any way around this? Surely there's a way to make it so that I don't have to manually call freeCharArray()?
That's where understanding the meaning behind it is far more important. AI doesn't understand that.
The pattern you are using is a naive irregular / jagged 2D array, where each row of the array may be a different size. The approach you're looking at is a simple way to manipulate one, but it will be slow and cache unfriendly. There are many implementations of jagged arrays that avoid those problems, each making different tradeoffs depending on the usage pattern.
I notice you're also storing null character terminators at the end of each row for some reason. Are you trying to manipulate strings? If your goal is a string table, there are data structures optimized for that instead: containers that manage string interning and allow you to use numeric constant representations of the strings, as it's much faster to manipulate an integer string ID than to process character arrays.
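To make the interning idea concrete, here is a minimal sketch of a table that hands out one integer ID per distinct string. The `StringInterner` class and its method names are hypothetical, not taken from any particular library, and a production version would likely avoid storing each string twice.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical interning table: each distinct string gets one stable integer ID.
class StringInterner {
public:
    // Returns the existing ID for s, or assigns the next free one.
    std::uint32_t intern(const std::string& s) {
        auto it = ids_.find(s);
        if (it != ids_.end())
            return it->second;
        const auto id = static_cast<std::uint32_t>(strings_.size());
        strings_.push_back(s);
        ids_.emplace(s, id);
        return id;
    }

    // Maps an ID back to its text; throws std::out_of_range for unknown IDs.
    const std::string& lookup(std::uint32_t id) const { return strings_.at(id); }

private:
    std::unordered_map<std::string, std::uint32_t> ids_;
    std::vector<std::string> strings_;
};
```

After interning, comparing two strings for equality is a single integer comparison.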
Yes, what you're doing, creating a vector<vector<char>> structure, can work, but the cost of growing the outer vector can be enormous, as adding each row can require a reallocation that relocates every one of the sub-vectors. At the very least, if you follow that approach, make a pass where you reserve them all before you start using them. You're looking at potential problems depending on how the implementation handles its growth pattern; many grow in ways that can waste tremendous amounts of memory if you don't reserve the full size. Further, since each sub-vector has its own heap allocation, you end up with lots of small blocks, which is again wasteful and slow.
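The reserve pass mentioned above might look like the following sketch. `make_rows` is a hypothetical helper, and it assumes the number of rows is known before any are added.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds row_count rows of row_len zeroed chars each.
std::vector<std::vector<char>> make_rows(std::size_t row_count, std::size_t row_len) {
    std::vector<std::vector<char>> rows;
    rows.reserve(row_count);  // allocate the outer storage once, up front
    for (std::size_t i = 0; i < row_count; ++i)
        rows.emplace_back(row_len, '\0');  // stays within capacity, so no outer reallocation
    return rows;
}
```

Each row is still its own heap block, so this only removes the repeated growth of the outer vector, not the fragmentation.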
You can use an array of pointers to arrays, but that comes with similar tradeoffs. A string table is often one flat array, with the individual strings referenced by pointers into that larger array. They can be stored as offsets into the array instead of raw pointers, so the table can be serialized (loaded from / stored to disk) extremely efficiently.
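A minimal sketch of such a flat string table, using offsets rather than raw pointers, could look like this. The `FlatStringTable` name and its members are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat string table: one char buffer plus one offset per string.
struct FlatStringTable {
    std::vector<char> chars;           // all strings back to back, each NUL-terminated
    std::vector<std::size_t> offsets;  // start of each string inside chars

    // Appends s and returns its index in the table.
    std::size_t add(const std::string& s) {
        offsets.push_back(chars.size());
        chars.insert(chars.end(), s.begin(), s.end());
        chars.push_back('\0');
        return offsets.size() - 1;
    }

    // Pointer to the stored string; only valid until the next add().
    const char* c_str(std::size_t index) const {
        return chars.data() + offsets.at(index);
    }
};
```

The offsets stay valid when `chars` reallocates, which is why they can be written to disk as-is; raw pointers obtained from `c_str` cannot.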
In any event, usually an optimal approach is to encode or embed a jagged array into either a single allocation or two allocations, with efficient techniques to move between them.
If you've actually got a rectangular 2D array that someone has written as a jagged 2D array, switch it over to a single flat block of memory. Then you just use the basic (row*width)+column calculation to index into it.
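As an illustration of the (row*width)+column scheme, here is a small sketch of a rectangular grid backed by one vector. `Grid2D` is a hypothetical name, and bounds checking is left out to keep it short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical rectangular grid stored in one contiguous allocation.
template <class T>
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t width)
        : width_(width), cells_(rows * width) {}

    // Element access using (row * width) + column; no bounds checking.
    T& at(std::size_t row, std::size_t column) {
        return cells_[row * width_ + column];
    }

    T* data() { return cells_.data(); }
    std::size_t size() const { return cells_.size(); }

private:
    std::size_t width_;
    std::vector<T> cells_;
};
```

A function that accepts a flat `T*` plus the dimensions can be handed `grid.data()` directly.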
Understanding the real problem you're trying to solve can help avoid a lot of nonsense like that. Sometimes there is no more efficient form, but for what you're describing there are many different patterns that can be selected based on their use. In different contexts different patterns will be better or worse, so it's important to understand that rather than blindly working with blocks of memory. Lots of blocks forming jagged arrays can work, but performance is going to suck, and overhead can be terrible.
AIs told me to allocate and deallocate using new[] and delete[]
Those are the array forms of new and delete. If you use the single item form new you must use the corresponding delete. If you use the array form new[] to allocate an array you need to use the array form delete[] to delete an array.
If you mix and match the array and non-array form that's undefined behavior as you've passed in the wrong argument. If you're lucky it will crash.
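If a raw array really is needed, one way to keep the forms matched is to let `std::unique_ptr<T[]>` pair them for you. The sketch below assumes C++14 or later for `std::make_unique`, and `sum_first_n` is a hypothetical function used only to show the buffer being released automatically.

```cpp
#include <cstddef>
#include <memory>

// Matching forms, for reference:
//   int* one  = new int(5);   delete one;     // single-object form
//   int* many = new int[8];   delete[] many;  // array form

// Hypothetical example: fills a temporary array with 1..n and returns the sum.
int sum_first_n(std::size_t n) {
    // make_unique<int[]> allocates with new[]; the unique_ptr<int[]> deleter
    // calls delete[], so the two forms cannot be mismatched.
    std::unique_ptr<int[]> buffer = std::make_unique<int[]>(n);
    for (std::size_t i = 0; i < n; ++i)
        buffer[i] = static_cast<int>(i + 1);

    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += buffer[i];
    return total;  // buffer is released here, even on early return or exception
}
```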
frob said:
AI doesn't understand that.
Confirmed. I tried it, and I even pointed the AI to its errors, but after a lot of apologies it writes exactly the same code again and again.
So, for me, AI programming is a kind of smart StackOverflow search: helpful, but not for writing complex libraries/projects.
I found a code through Claude AI. It creates an int*** out of vector<vector<vector<int>>> without using new[]. It looks promising:
#include <iostream>
#include <string>
#include <vector>
#include <memory>
using namespace std;
template<class T>
void test_function(T ***c, const size_t size_x, const size_t size_y, const size_t size_z)
{
}
template<class T>
void test_function(T **c, const size_t size_x, const size_t size_y)
{
}
template<class T>
void test_function(T *c, const size_t size_x)
{
cout << "template test function" << endl;
}
void test_function(double* c, const size_t size_x)
{
cout << "double test function" << endl;
}
template<class T>
T*** get_pointer_from_vector(vector<vector<vector<T>>> &input_data, const size_t size_x, const size_t size_y)
{
vector<vector<T*>> ptr_array2(size_x, vector<T*>(size_y));
vector<T**> ptr_array1(size_x);
for (size_t i = 0; i < size_x; i++)
{
for (size_t j = 0; j < size_y; j++)
ptr_array2[i][j] = input_data[i][j].data();
ptr_array1[i] = ptr_array2[i].data();
}
T*** legacy_ptr = ptr_array1.data();
return legacy_ptr;
}
template<class T>
T** get_pointer_from_vector(vector<vector<T>>& input_data, const size_t size_x)
{
vector<T*> row_ptrs(size_x);
for (size_t i = 0; i < size_x; i++)
row_ptrs[i] = input_data[i].data();
T** legacy_ptr = row_ptrs.data();
return legacy_ptr;
}
template<class T>
T* get_pointer_from_vector(vector<T>& input_data)
{
T* legacy_ptr = input_data.data();
return legacy_ptr;
}
int main(void)
{
const size_t size_x = 2, size_y = 2, size_z = 3;
vector<vector<vector<int>>> data3(size_x, vector<vector<int>>(size_y, vector<int>(size_z)));
test_function(get_pointer_from_vector(data3, size_x, size_y), size_x, size_y, size_z);
vector<vector<char>> data2(size_x, vector<char>(size_y));
test_function(get_pointer_from_vector(data2, size_x), size_x, size_y);
vector<double> data1(size_x);
test_function(get_pointer_from_vector(data1), size_x);
return 0;
}
@taby if the size of the arrays is constant for each dimension, it's a really bad idea to use a vector of vectors of vectors…
As @frob pointed out before, in this case you just need to allocate flat memory of size dim1*dim2*dim3 and access elements like array[i1*dim2*dim3 + i2*dim3 + i3].
Otherwise you will have to deal with memory fragmentation and many cache misses, which reduce execution speed dramatically.
I must absolutely not allocate memory using new or malloc. I will take whatever penalty there is.
taby said:
I must absolutely not allocate memory using new or malloc. I will take whatever penalty there is.
Why? There are real reasons and scenarios where they might be constraints, but they come with tradeoffs. What is the reason for the constraint?
The code is C and I’m converting it to C++. One of the reasons that I want to get away from using new and malloc is because the malloc() call count is much larger than the free() call count. That cannot be good. The whole point of the exercise is to get rid of this problem altogether, by replacing it with vectors, and simply commenting out calls to free().
I understand the desire to optimize by allocating 1D arrays and using that indexing scheme. I use this for my graphics apps all of the time. But it’s not what I need here.
taby said:
I found a code through Claude AI. It creates an int*** out of vector<vector<vector<int>>> without using new[]. It looks promising
This code is awful and broken, and is further evidence that LLMs are incapable of writing good-quality code. This code is an example of what NOT to do. It will almost assuredly crash as soon as you dereference the double or triple pointer returned by those functions. Those functions create temporary vectors on the stack, then return a pointer to the underlying memory allocation. The problem is that the temporary vectors get destructed when the functions return, and those allocations are freed before you could ever dereference the pointers. If you try, you will be accessing bad memory addresses which is very likely to crash and/or lead to non-deterministic program behavior.
Instead, do what Frob says and create a 2D or 3D array with a single vector, using manual index calculation:
std::vector<int> myData3D( size.x*size.y*size.z );
size_t index1D = index3D.x * (size.y*size.z) + index3D.y * size.z + index3D.z;
myData3D[index1D] // access data at position index3D
Not only does this do the minimum number of allocations (just 1), it will also be more efficient to traverse the data, since there are no multiple pointer dereferences and fewer cache misses. Double and especially triple pointers are almost always indicative of badly designed code and data structures.
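If the legacy function cannot be changed and still needs a `T***`, the flat storage can be combined with pointer tables that live in the same object, so nothing dangles when a helper function returns. This is a sketch under assumptions: `Legacy3D` is a hypothetical name, the dimensions are fixed at construction, and the legacy code is assumed not to keep the pointer after the owning object is destroyed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical owner: flat cell storage plus the pointer tables a T*** API expects.
template <class T>
class Legacy3D {
public:
    Legacy3D(std::size_t sx, std::size_t sy, std::size_t sz)
        : sy_(sy), sz_(sz), cells_(sx * sy * sz), rows_(sx * sy), planes_(sx) {
        // Each row pointer addresses sz consecutive cells in the flat buffer.
        for (std::size_t i = 0; i < sx * sy; ++i)
            rows_[i] = cells_.data() + i * sz;
        // Each plane pointer addresses sy consecutive row pointers.
        for (std::size_t i = 0; i < sx; ++i)
            planes_[i] = rows_.data() + i * sy;
    }

    // The tables point into this object's own vectors, so forbid copying.
    Legacy3D(const Legacy3D&) = delete;
    Legacy3D& operator=(const Legacy3D&) = delete;

    // Valid for as long as this object is alive.
    T*** data() { return planes_.data(); }

    // Same element via the flat index x*(sy*sz) + y*sz + z.
    T& at(std::size_t x, std::size_t y, std::size_t z) {
        return cells_[x * (sy_ * sz_) + y * sz_ + z];
    }

private:
    std::size_t sy_, sz_;
    std::vector<T> cells_;
    std::vector<T*> rows_;
    std::vector<T**> planes_;
};
```

This uses three allocations in total regardless of the dimensions, and every one of them is released by the vectors' destructors, so there is no manual free call.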