[C#] Arrays, Lists, or Dictionaries for a Minecraft-like chunk system?

HekutaFelis

Hello! I have written a working voxel chunk-based system (similar to Minecraft's) in GDScript, and now I want to try rewriting it in the new language I am practicing, C#. I would like your opinion on how to store chunk data. Every chunk object needs to store information for every voxel (block) it holds: Position, Shape, Rotation, Material, etc. With GDScript I used a Dictionary with Vector3 keys for position and Arrays as values that held the rest of the info. Here are three options I thought of for approaching this in C#, and I would love to hear your opinion on which option is best.

Option 1: Arrays

I thought about using a three-dimensional array for each chunk, in which the index in the array also works as position information. Arrays are fast, but the problem with this solution is that they need to be of fixed size: x and z would be 10 (chunk size) and y would be the value of the build height limit. That means that when I need to iterate through the array to load a chunk, the loop has to visit every single index. I am worried that this would make it slow and result in many useless calculations. Even if the chunk only has one block, the loop will pass through a ton of empty indexes. So that brings us to the next option.

Option 2: Lists

Lists are generally slower than arrays, but since the loop does not run over a fixed size, it will only iterate over entries that actually hold information. The speed at which the chunk loads will depend greatly on the number of blocks it holds.

Option 3: Dictionaries

Similar to lists, but the keys will make it much easier to find specific blocks, since they hold the position of the block. It’s how I did it with GDScript. I am only worried about the load time when it gets iterated in a loop. Performance is my highest priority for this script.

Are dictionaries the best solution for this? Correct me if I am wrong about something, I am new to C#.

TwistedTwigleg

I've written a voxel system in both GDScript and C# in Godot, but I'm by no means an expert on this, so please take all of this with a grain of salt!

I would say if you are concerned about performance, which makes sense given how much data has to be processed in a voxel chunk, the biggest constraint is probably going to be how you render the Voxel, what optimizations you want to have, and whether your voxels need to access adjacent voxels.

For example, if you want to optimize for the GPU and reduce overdraw (drawing voxels you cannot see), then you will likely want to dynamically construct each face of the voxel that could be visible to the player, and then skip adding the mesh data for all the many voxels that may be completely obscured by voxels surrounding all faces. However, this optimization requires going through every voxel in the chunk, checking all adjacent faces, and then adding the relevant mesh data. This requires at least a single full loop through all the data and 6 adjacent voxel checks per voxel, so it is very processor intensive.
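
To make that loop concrete, here is a rough sketch of the neighbor check. It only counts the faces that would need mesh data rather than building a mesh. The class name `CullingChunk`, the 10 x 32 x 10 dimensions, and the convention that block id 0 means "empty" are all assumptions made up for this example, not anything from Godot.

```csharp
using System;

// Hypothetical chunk type for illustration only; id 0 means "empty" by assumption.
public sealed class CullingChunk
{
    public const int SizeX = 10, SizeY = 32, SizeZ = 10; // assumed dimensions

    private readonly byte[,,] blocks = new byte[SizeX, SizeY, SizeZ];

    // Offsets to the six axis-aligned neighbors of a voxel.
    private static readonly (int dx, int dy, int dz)[] Neighbors =
    {
        (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)
    };

    public void Set(int x, int y, int z, byte id) => blocks[x, y, z] = id;

    // Positions outside the chunk count as empty here, so border faces are
    // treated as visible. A real system would probably ask the neighboring chunk.
    private bool IsSolid(int x, int y, int z) =>
        x >= 0 && x < SizeX &&
        y >= 0 && y < SizeY &&
        z >= 0 && z < SizeZ &&
        blocks[x, y, z] != 0;

    // One full pass over the chunk with six neighbor checks per solid voxel.
    public int CountVisibleFaces()
    {
        int faces = 0;
        for (int x = 0; x < SizeX; x++)
        for (int y = 0; y < SizeY; y++)
        for (int z = 0; z < SizeZ; z++)
        {
            if (blocks[x, y, z] == 0) continue; // empty voxels have no faces

            foreach (var (dx, dy, dz) in Neighbors)
            {
                // A face is needed only when the neighbor on that side is empty.
                if (!IsSolid(x + dx, y + dy, z + dz)) faces++;
            }
        }
        return faces;
    }
}
```

In a real mesher, the `faces++` line is where the vertices for that face would be added instead.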

So, with that in mind, here's what I have found:

Arrays

This is probably the fastest solution for most situations where you want to optimize for the GPU, perform voxel lighting calculations in code, or otherwise need to go through the data in a loop when something changes. Arrays are good for this because they are fast, fixed size, and there are some optimizations you can make while iterating over them to speed things up. However, they can be unwieldy to program with and, as you have mentioned, may require looping through the entire array even when the overwhelming majority is empty, since there is no way to know exactly what data you have without trying to access it.

I would recommend using an array if you are going to iterate through every voxel in the chunk routinely, as then you will see the most benefit from it. In my projects, I've used Arrays for voxel chunks because I have to iterate through the entire voxel chunk to generate the mesh. Most voxel projects use arrays, but that doesn't mean it's the best solution for all cases! Depending on how you display and process voxels, using an alternative data structure may be MUCH faster, as you can skip a bunch of otherwise unnecessary calculations. If you have a way to just update small sections of a chunk, then an array might actually be slower since you would (potentially) have to iterate over the entire thing unnecessarily. The kicker here is that many voxel projects have visuals and gameplay mechanics that rely on 'seeing' other nearby voxels, which almost always means looping through the entire chunk, so using an array is the general route.

Side note: If you want the absolute fastest access speed, the fastest solution is a three-dimensional array flattened to a single dimension, though to be honest I've always just used multidimensional arrays and taken the slight performance hit :smile:
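
For anyone who has not seen the flattening trick, a minimal sketch follows. The name `FlatChunk`, the dimensions, and the choice to make x the fastest-varying axis are assumptions for illustration; any consistent ordering works as long as reads and writes use the same formula.

```csharp
using System;

// Hypothetical flattened chunk; dimensions and axis order are assumptions.
public sealed class FlatChunk
{
    public const int SizeX = 10, SizeY = 32, SizeZ = 10;

    // One contiguous block of memory instead of a byte[,,].
    private readonly byte[] blocks = new byte[SizeX * SizeY * SizeZ];

    // Maps (x, y, z) to a single index: x varies fastest, then z, then y.
    public static int Index(int x, int y, int z) => x + SizeX * (z + SizeZ * y);

    public byte Get(int x, int y, int z) => blocks[Index(x, y, z)];

    public void Set(int x, int y, int z, byte id) => blocks[Index(x, y, z)] = id;

    // A full pass becomes a single loop over one array.
    public int CountSolid()
    {
        int count = 0;
        for (int i = 0; i < blocks.Length; i++)
        {
            if (blocks[i] != 0) count++;
        }
        return count;
    }
}
```

With this layout, a whole horizontal layer (fixed y) sits next to itself in memory, which may suit a top-down view where layers get hidden or shown together.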

Lists

Can't really speak for this one, as I haven't really tried it myself. I think the hard part with a list is going to be trying to determine if the list has a voxel at a certain position without doing a search through the entire list, which could be quite slow. If you fill the list to the maximum size of the chunk to get around the search issue, then you basically have an array and I'm not sure there would be any benefits to using a List. Again though, I haven't tried this at all so I'm really not sure!
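
To show what that search looks like, here is a small sketch of a list-backed chunk. `PlacedBlock` and `ListChunk` are names invented for this example. Looking up one position means walking the list until a match turns up, so the cost grows with the number of blocks stored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record of one placed block; invented for this example.
public struct PlacedBlock
{
    public int X, Y, Z;
    public byte Id;

    public PlacedBlock(int x, int y, int z, byte id)
    {
        X = x; Y = y; Z = z; Id = id;
    }
}

public sealed class ListChunk
{
    private readonly List<PlacedBlock> blocks = new List<PlacedBlock>();

    public int Count => blocks.Count;

    public void Add(int x, int y, int z, byte id) =>
        blocks.Add(new PlacedBlock(x, y, z, id));

    // Linear search: in the worst case every stored block is compared.
    public bool TryFind(int x, int y, int z, out byte id)
    {
        foreach (PlacedBlock b in blocks)
        {
            if (b.X == x && b.Y == y && b.Z == z)
            {
                id = b.Id;
                return true;
            }
        }
        id = 0;
        return false;
    }
}
```

Iterating only the stored blocks is cheap, but a face-culling pass that asks about six neighbors per block would call `TryFind` six times per block, which is where this approach is likely to hurt.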

Dictionaries

This one is a bit of a toss-up, and I would say it greatly depends on whether you need to iterate over the entire chunk or not, what the key is (voxel position?), and how much data/RAM you want to use.

The real question that would determine its performance is how fast checking for an element that does not exist is. If the code for determining whether a key is in a dictionary goes through the entire dictionary doing comparisons, then it would be much slower than an Array or List. On the other hand, if the lookup is really quick, then a dictionary could be fairly fast.
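
On that question: the .NET `Dictionary<TKey, TValue>` is implemented as a hash table, so a lookup (including one for a missing key) does not scan the whole collection and is close to constant time on average. Below is a minimal sketch of a sparse chunk built on it. `SparseChunk` is a made-up name, and the value-tuple key is just one possible choice; in Godot a custom struct or an integer vector type could serve as the key instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sparse chunk: only non-empty voxels are stored.
public sealed class SparseChunk
{
    // Value tuples compare and hash by their components, so they work as keys.
    private readonly Dictionary<(int x, int y, int z), byte> blocks =
        new Dictionary<(int x, int y, int z), byte>();

    public int Count => blocks.Count;

    // Storing id 0 (assumed to mean "empty") removes the entry instead,
    // so empty space never takes up room in the dictionary.
    public void Set(int x, int y, int z, byte id)
    {
        if (id == 0) blocks.Remove((x, y, z));
        else blocks[(x, y, z)] = id;
    }

    // Hash lookup; returns false without scanning when the key is absent.
    public bool TryGet(int x, int y, int z, out byte id) =>
        blocks.TryGetValue((x, y, z), out id);
}
```

Each lookup still has to hash the key, so it costs more than indexing into an array, and each entry carries more memory than a single array slot. Whether that matters depends on how full the chunks are.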

Though as you mentioned, iterating is going to be slow here, and that's really the big trade off. If you can minimize the need to iterate through every voxel in the chunk, then it could be decently fast and has the nice bonus of being really easy to work with.

I have tried this before with some success on some of my earliest voxel experiments, but those projects were slow. Whether it was the dictionary or just my code, I am not sure and am somewhat inclined to think it might have been my code rather than the dictionary itself. I do know that I iterated through every voxel in the dictionary, regardless of whether it had a voxel or not (used null for empty space), so I can probably say I wasn't using the data structure in the most optimal way.

My only concern with using Dictionaries is that I haven't really seen it used too much in very large voxel projects, which would make me hesitant to recommend it simply because there is probably a reason it's not used too often. I think it is probably because most voxel games need to iterate over the entire array too often to reap the benefits of using dictionaries, that or the memory footprint is just too great for the platforms the developers target.

Again, I want to stress that I'm not an expert on this, so I would highly recommend taking everything I wrote with a huge grain of salt! Ideally, if you want the best idea of the potential performance impacts, you may want to make a super simple voxel chunk system using each solution and do some benchmarks to see how each one performs.
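
A very rough starting point for such a benchmark is sketched below, using `System.Diagnostics.Stopwatch`. Everything in it is an assumption for illustration: the name `ChunkBenchmark`, the dimensions, and the workload, which just counts solid voxels in a flat array and in a dictionary holding the same data. Timings will vary by machine and say nothing about meshing cost, so treat the numbers as a first hint only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical micro-benchmark; names, sizes and workload are assumptions.
public static class ChunkBenchmark
{
    public const int SizeX = 10, SizeY = 32, SizeZ = 10;

    // Fills the bottom `filledLayers` layers, then times `repeats` full
    // iterations over an array and over a dictionary with the same blocks.
    public static (int arrayCount, int dictCount, TimeSpan arrayTime, TimeSpan dictTime)
        Run(int filledLayers, int repeats)
    {
        var array = new byte[SizeX * SizeY * SizeZ];
        var dict = new Dictionary<(int x, int y, int z), byte>();

        for (int y = 0; y < filledLayers; y++)
        for (int z = 0; z < SizeZ; z++)
        for (int x = 0; x < SizeX; x++)
        {
            array[x + SizeX * (z + SizeZ * y)] = 1;
            dict[(x, y, z)] = 1;
        }

        // Array: visits every slot, empty or not.
        int arrayCount = 0;
        var watch = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++)
        {
            arrayCount = 0;
            for (int i = 0; i < array.Length; i++)
            {
                if (array[i] != 0) arrayCount++;
            }
        }
        watch.Stop();
        TimeSpan arrayTime = watch.Elapsed;

        // Dictionary: visits only the stored entries.
        int dictCount = 0;
        watch.Restart();
        for (int r = 0; r < repeats; r++)
        {
            dictCount = 0;
            foreach (KeyValuePair<(int x, int y, int z), byte> entry in dict)
            {
                if (entry.Value != 0) dictCount++;
            }
        }
        watch.Stop();

        return (arrayCount, dictCount, arrayTime, watch.Elapsed);
    }
}
```

Calling something like `ChunkBenchmark.Run(4, 10000)` and printing `arrayTime` and `dictTime` for a few different fill levels should give a feel for where the crossover is, if there is one, for your chunk sizes.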

Hopefully what I wrote helps give some potential considerations though! Regardless of which you choose, please keep us updated as I think voxel engines are pretty cool and would love to see how your voxel system develops :smile:

HekutaFelis

The chunk will need to be iterated very frequently. Last time (with GDScript) I used the SurfaceTool. The way I wrote it, it needed to loop through the whole chunk every time you made a change in-game (add/break block). It also needed to update the chunk every time the character went behind a wall, or inside a room, to render it without the blocks above the player (the camera is top-down, unlike Minecraft. The good thing is that there’s no need for many chunks to be loaded at once, since the view range is more limited than in a first-person game).

It’s going to loop often, and that’s why fast iteration is my no. 1 priority with this script. Last time I also made it hide the occluded faces of the blocks, so as you said, that results in extra calculations every time.

I am going to experiment with Arrays and see how it goes! :)

I want to thank you for replying on yet another post. Your comments are always useful on these forums!