Merge matrices/2d-Arrays of arbitrary size into one big 2d array

Question

I'm new to Scala and it's kind of a noobish question, yet I'm stuck here. I have multiple 2D-Arrays of Ints that I want to merge into one big 2D-Array.

The code so far:

object Generator
{
  // Use a HashMap for greate flexibility in the future. We could more easily add additional checks for biome placement like if(x, y-1) ...
  val biomes:mutable.HashMap[Vector2, Biome] = new mutable.HashMap[Vector2, Biome]

  def create(width:Int, height:Int)
  {
    for(x <- 0 until width; y <- 0 until height)
    {
      if(y < height / 3 * 2)
        biomes.put(new Vector2(x, y), new GrasslandBiome(128, 64))
      else
        biomes.put(new Vector2(x, y), new SkyBiome(128, 64))
    }

    terrain
  }

  def terrain
  {
    var tileMap:ArrayBuffer[Array[Array[Int]]] = new ArrayBuffer[Array[Array[Int]]]
    for((position, biome) <- biomes)
    {
      tileMap += biome.tileMap
    }

    Predef.println(tileMap.toString) // TODO remove
  }
}

Where a Biome is nothing more than a Generator for a 2D-Array: var data: Array[Array[Int]] (And yes, this is going to be a rather simple game to get to know Scala a bit better! :))

My problem is that the ArrayBuffer seems to be the wrong choice as I don't get the expected result, which is one big 2D-Array. In fact I get:

ArrayBuffer([[I@1ed8f33e, [[I@78c86c9a, [[I@46c4ace3, [[I@4de0dedb)

and then a NullPointerException for the println :(

I'd highly appreciate any help in resolving my error in reasoning (and programming) and want to wish you a wonderful day! :)

edit: Of course the corresponding Biome's data is NOT uninitialized/null! If I print it manually, everything is as expected.

edit2: As requested:

abstract class Biome(width:Int, height:Int)
{
  var tileMap: Array[Array[Int]]

  def create();

  def printTileMap()
  {
    printTileMap(tileMap, width, height)
  }

  def printTileMap(map: Array[Array[Int]], mapWidth:Int, mapHeight:Int)
  {
    var x = mapHeight - 1
    while(x >= 0)
    {
      for (y <- 0 until mapWidth) Predef.print(map(y)(x) + " ")
      println()
      x -= 1
    }
  }
}

and one example:

class SkyBiome(width:Int, height:Int) extends Biome(width:Int, height:Int)
{
  var tileMap: Array[Array[Int]] = Array.ofDim[Int](width, height)

  create()
  override def create()
  {
    for(x <- 0 until width; y <- 0 until height)
    {
     tileMap(x)(y) = 0
    }
  }

Also, one ugly [b]solution[/b] would be to iterate over the whole HashMap, sum up width and height of each biome, initialize the tileMap with that information and simple copy everything to the right position. But this seems to be a nasty solution.

Could you provide your implementation of `Biome.tileMap`? I tried to reconstruct your example (http://pastie.org/9582380) but I cannot reconstruct your NullPointerException. Also, your terrain method does nothing: you create an ArrayBuffer, modify this buffer, but never return it. `biome.tileMap` is not modified. Furthermore, there are probably better Collections than Array, which you can use, and the use of mutable collections looks like premature optimization. Use immutable objects where possible. http://alvinalexander.com/scala/scala-idiom-immutable-code-functional-programming-immutability — Kulu Limpa, Sep 21 '14 at 23:06
@KuluLimpa: I'll add the implementation. I'm totally aware that it doesn't return anything though. I wanted to debug first and then return it. If the output was okay, I could easily add a return value. I'm new to scala and the idea of immutable collections is a bit strange to me. If I make the HashMap immutable I can't even "put" anything in there :) — , Sep 22 '14 at 07:01
I'll happily look at your code and try to give you a useful answer once I'm home from work. Until then (or until someone else beats me to an answer), I'd suggest you have a look at Scala's `Optional` type (to make possible nullity explicit) and at some of the functional programming you can do with Scala's collections, e.g., `map`, `collect`, `foldLeft`. I'm almost certain that those would lead to a neat solution. Regarding immutable maps: You essentially create a new Map from the old one, containing the element you add. I'd only resort to mutable structures, if it improves performance — Kulu Limpa, Sep 22 '14 at 09:24
@Kulu Limpa: I'm sure there is a really nice solution. I have a little experience with Haskell and functional programming was a bless with some problems, but here I'm still stuck. If you have nothing better to do and still want to help I'd highly appreciate any nice, scalable solution to merge an arbitrary size of 2D-Arrays (of arbitrary size) to one big 2D-array :) — , Sep 22 '14 at 14:46

score 0 · Accepted Answer · edited May 23 '17 at 12:21

First of, I found an explanation on the difference between the various array-types: scala - array vs arrayseq

Now on your question on how you could get a one-dimensional Array from a two-dimensional array: But let us assume that, instead of Array[Array[T]] we have List[List[T]] (just because I like lists more; but you should have no problem applying this to arrays).

What does foldLeft[A](initial: A)(function: (A, B) => A) do? It iterates over the collection and aggregates a value of type A, by applying a function (A, B) => A at each element. Here, B is the generic type of the list. In our example, we fold a list of lists of integers (B), to a list of integers (A) by concatenating (++) the intermediate value a with the next element in the list.

Edit2:

When applying this to the Biome-example, we can create a map from positions to biomes as follows:

def create(width:Int, height:Int): Map[Vector2, Biome] =
{
  // we can define functions within functions
  def isGrasslandBiome(x: Int, y: Int): Boolean =
    y < height / 3 * 2

  // A for-comprehension returns a sequence of all computed elements
  val positionWithBiomes: Seq[(Vector2, Biome)] =
    for{
      x <- 0 until width
      y <- 0 until height
    } yield
    {
      val biome =
        if (isGrasslandBiome(x, y))
          GrasslandBiome(DefaultWidth, DefaultHeight)
        else
          SkyBiome(DefaultWidth, DefaultHeight)
      ((x, y), biome)
    }
  // We fold each element of the sequence into a map
  val biomes: Map[Vector2, Biome] =
    positionWithBiomes.foldLeft[Map[Vector2, Biome]](Map.empty){
      case (map, ((x, y), biom)) => map + ((x, y) -> biom)
    }
  biomes
}

We iterate over a sequence of ((Int, Int), Biome)s and aggregate them in a Map. I would prefer immutable Collections (most standard Scala collections) to mutable collections. Mutability can give you a hard time when debugging code for a possible benefit in performance. Also, immutability fits the functional style, because a function (in the mathematical sense) is always pure, i.e., does not alter the state.

Nevertheless, for the sake of the example, and because the internal representation of tileMap might actually be performance-critical, let us define a class TileMap, which is essentially a wrapper for a two-dimensional array: (Note: type TileMapInternal = mutable.ArraySeq[mutable.ArraySeq[Int]])

/** A tile map is our internal, mutable representation of a Biome
  *
  * Note that this makes Biome mutable
  */
/*mutable*/ class TileMap private(private val value: TileMapInternal, private val width: Int, private val height: Int) {

def print()
{
  value foreach { row =>
    row foreach { tile =>
      Predef.print(s"$tile ")
    }
    println()
  }
}

def setAllTiles(tileValue: Int) {
  for{
    x <- 0 until width
    y <- 0 until height
  }{
    setTile(tileValue)(x, y)
  }
}

def setTile(tileValue: Int)(x: Int, y: Int) {
  value(y)(x) = tileValue
}

def addRows(other: TileMap): TileMap =
  new TileMap(value ++ other.value,
    width, // we add rows -> width stays the same
    height + other.height
  )

def addColumns(other: TileMap): TileMap =
  new TileMap(
    value zip other.value map {case (first, second) => first ++ second},
    width + other.width,
    height // we add columns -> height stays the same
  )

}

This class has a private constructor, because we do not want the internal representation to leak out. In Scala, you can define a companion object which has access to all the private members. Basically, a companion object contains all methods that you would declare static in Java.

// companion object
object TileMap {
  def empty(width: Int, height: Int): TileMap = {
    val rows = new mutable.ArraySeq[mutable.ArraySeq[Int]](height)
    for (row <- 0 until height) {
      rows(row) = new mutable.ArraySeq[Int](width)
      for (cell <- 0 until width) {
        rows(row)(cell) = 0
      }
    }
    new TileMap(rows, width, height)
  }
}

The companion object will initialize an empty TileMap for us. This could most likely be written more elegant, but it does what it is supposed to: It creates an array of width x height and sets all elements to zero. Your Nullpointer exception probably happened because you didn't initialize the elements properly. When you tried to print them manually, they probably just printed 0 instead of null because of autoboxing.

Edit

How to accumulate the map in a single 2d array:

The methods addRows and addColumns create a new TileMap (i.e., your two-dimensional array) by combining either the inner or the outer array. Combining the outer array is easy: Simply add them using the method ++ which is defined on all scala collections. To combine the inner arrays, we can first zip the two arrays together, which gives us a single array of tuples of arrays (Array[(Array[Int], Array[Int])]). We can then map this array to a new, regular two-dimensional array by applying ++ on each tuple.

We can now use those two functions to reduce a two dimensional array of tile maps (which is basically what we have stored by using val biomes: Map[Vector2, Biome]) into a single tile map.

I should mention that, by design, this looks very fragile: I am imagining things to break once your tile maps are not of equal size. Also, messing up the order in which the rows/columns are combined could be nearly impossible to debug, and if your map of biomes isn't complete, computing the terrain may simply crash. Nevertheless, here's how I'd solve the problem with the current design:

def terrain(biomes: Map[Vector2, Biome]): TileMap = {
  val (rows, _) = biomes.keys.unzip
  rows.toList.sortBy{id => id} map {
    row => terrainForRow(row, biomes)
  } reduceLeft {
      (accumulator: TileMap, nexTileMap: TileMap) => accumulator addColumns nexTileMap
  }
}

def terrainForRow(column: Int, biomes: Map[Vector2, Biome]): TileMap =
  biomes.filter{
    case ((x, y), _) => x == column
  }.toList.sortBy{
    case ((_, y), _) => y
  }.map{
    case ((_, _), biome) => biome.tileMap
  }.reduceLeft[TileMap]{
    (accumulator, nextTileMap) => accumulator addRows nextTileMap
  }

This certainly needs some explanation: terrainForRow shall take a single row (i.e., all values in the map with the same value for x) and concatenate them in the correct order. The order is determined by y, because we want the order of the columns to be preserved.

First, we filter all biomes that have the same value for x
Second, we add ordering to the collection by converting it to list, and sort the list according to the value of y
Third, we throw away the position (value of x and y), and extract the tile map. This is needed if we want to use reduceLeft, because of its type signature.
Last, we accumulate the tile maps, calling addRows which we defined earlier. Note that reduceLeft is only defined for collections of size two and bigger

In the method terrain, we call terrainForRow for each row (that is, after we sorted the tile maps by row). Each result already is a TileMap, so we can call reduceLeft on the sequence of results and reduce them to a single TileMap by using the function addColumns that we defined.

I've put the updated code at http://pastie.org/9598459

I'd like to add some things you can do in scala, since you want to learn the language:

Currying and partially applied functions

Look at the signature of the method setTile: It has two parameter lists. This is called currying and allows us to partially apply functions. For example, we can create a function like this:

// let us partially apply the setTile function
val setGrassTileAt: (Int, Int) => Unit =
  biomesTileMap.setTile(GrassTileValue)
// now tileBuilder will set the value of GrassTile at each position we provide
setGrassTileAt(0, 0)
setGrassTileAt(3, 1)
setGrassTileAt(0, 1)

The function setGrassTile (stored in a variable, because in scala, functions are first-class citizens), takes two integer values (the coordinates) and sets the value of the biome at that coordinate to the GrassTileValue.

Uniform access principle and case classes

abstract class Biome
{
  // Scala supports the uniform access princple
  // A client is indifferent whether this is def, val or var
  // A subclass can implement width and height as val
  def width: Int
  def height: Int

  def tileMap: TileMap

}

// Case classes automatically define the methods equals, hashCode and toString
// They also have an implicit companion object defining apply and unapply
// Note that equals is now defined on width and height, which is probably not what
// we want in this case.
case class SkyBiome(width: Int, height: Int) extends Biome {
  val tileMap: TileMap = TileMap.empty(width, height)

}

case class GrasslandBiome(width: Int, height: Int) extends Biome {
  val tileMap: TileMap = TileMap.empty(width, height)
}

In Scala, clients should not distinguish between variables and methods when accessing a member of an object. For example, it does not matter whether width is a def in the supertype or a val in the subtype.

Case classes are classes that override equals, hashCode and toString. Also, they provide an extractor unapply (which you can use when matching) and an apply method (which is applyied when you look at the object as a function). Case classes are most beneficial for value objects.

I hope I could help you a bit. Edit: Code before my updated answer: http://pastie.org/9586136

Heyho! First of all: I'm really sorry to let you wait this long. I had some real life stuff going on and no time to look into your solution. One thing tho: "Now on your question on how you could get a one-dimensional Array from a two-dimensional array ..." This is NOT what I want to do. I want to "glue together" many bits of 2D-Arrays to one huge 2D-Array. Like (real life example!) you have parts of a map and glue them together so you have only one huge map. Still every coordinate is 2D. I'll dive into your solution and see if I can adapt it to my needs asap. Thanks again! — , Sep 26 '14 at 14:36
Sorry, that I did not understand you correctly. So what you want to do is something like: `biomes.foldLeft[Array[Array[Int]]](Array.empty)(case (a, (position, biome)) => a ++ biome.tileMap)` where `biomes: Map[Vector2, Biome]`? This should accumulate all tilemaps into one 2-d array. — Kulu Limpa, Sep 26 '14 at 22:56
I updated the answer with what I hope is answering your question. Things get a bit complicated, because we want to accumulate tiles based on two independent attributes: The position of the Biome in the map, and the position of the tile in the TileMap. I'd probably rethink my design: Why do you store separate Biomes and add them together in a combined tilemap, instead of storing the combined tilemap directly and let the biome define areas in this tilemap, e.g., by storing the top left and bottom right corner? — Kulu Limpa, Sep 27 '14 at 01:29
No problem. Sorry, I'm not a native speaker, so please bear with me. Well, I thought that creating the map in "bunches/chunks" and defining the creation of them (=biomes) and merge them together afterwards, would give me more creative freedom. Instead of one huge algorithm I can define many small ones and easily change how the world/map is build up :) Also later down the line I would like to dynamically create biomes by small .json files or something. There probably are many ways to achieve what I want to achieve, but I wanted to avoid one big, complicated algorithm, which I had originally. — , Sep 27 '14 at 11:44
I didn't mean you should not split up your map into the individual biomes. However, personally, I find it easier to store the map as a whole and extract individual biomes than to store the individual biomes and glue them together. Even if you store everything in one really big 2d array, your algorithms can stil work on the biome-abstraction you can build upon it. Though, you still have to carefully think about serialization and deserialization from json. — Kulu Limpa, Sep 27 '14 at 12:00
But if I build the map beforehand and split it up afterwards I lose freedom as I would need to know how big the map was etc. Like it is now one biome could be 1024*256 and the next one could be 16384*64 and the next 16*16 etc. A lot more freedom for me. At least how I have it in mind. But I know it's harder, hence I asked for help :D My first approach was to create one huge map :) — , Sep 27 '14 at 15:56
I wouldn't agree with the increased flexibility: How do you intend to glue two adjacent biomes of distinct size together? Say you have a biome of height 2 next to a biome of height 4: what would the combined array look like? As soon as you have inherent restrictions on what you can do, the design becomes complicated. However, if you store everything in one array, I think it is easier to define abstractions on it, e.g., you could have overlapping biomes. In any case, it is important to define the right abstractions that hide the implementation details from your application. — Kulu Limpa, Sep 27 '14 at 16:19
" Say you have a biome of height 2 next to a biome of height 4: what would the combined array look like?" Filling it up with "air" or another block, depending on the height and position of the biome. But I'll think about what you said. — , Sep 28 '14 at 11:36

Merge matrices/2d-Arrays of arbitrary size into one big 2d array

1 Answers1