Thursday 15 November 2012

Machinery Translation

I ported my Soma-cube-solving algorithm to C++, to see how the implementation would be affected by the language. This time, because there's no such thing as Literate C++, I've left the program listing to the very end. I tried to write idiomatic, readable C++. For example, I overloaded operators where that seemed mathematically plausible, rather than wherever possible. Similarly, I used STL machinery as I imagine its authors intended, and tried not to disappear down the pseudo-functional rabbit-hole suggested by STL functions like std::transform. I didn't try to second-guess compiler behaviour, letting functions return containers by value. And I avoided playing tricks involving the memory layout of data structures.

I ended up with a 388-line program (350 non-blank lines). That compares to about 150 lines in the Haskell version (about 105 non-blank lines) once the computation of statistics is included.

This is how the program performed:

$ g++ -O2 cube.cc -o cube
$ time ./cube > solutions-c.txt 
Possibilities (explored, kept) at each stage:
  (4,4)
  (256,130)
  (9360,2334)
  (168048,16625)
  (1596000,26403)
  (2534688,4080)
  (587520,240)
Total possibilities explored: 4895876
Total possibilities kept: 49816

real    0m0.032s
user    0m0.022s
sys 0m0.006s

The report of the work done is exactly the same as the one from the Haskell version, which is reassuring. (The solutions are also the same, though listed in a slightly different order.)

The C++ version is a little more than twice as fast: 0.03s instead of 0.07s.

What's not obvious from the source code or the numbers is how much I missed having any kind of interactive session while writing the C++ version. It's really useful, when writing code, to be able to evaluate a handful of expressions to see whether each function appears to do the right thing.

And, as promised, here's the source code:

#include <algorithm>
#include <iostream>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using namespace std;

enum Shape { L, S, T, R, P, Q, Y, kNumShapes };

const char* plan(Shape s) {
  switch (s) {
  case L: return
      "XXX"
      "X  ";
  case S: return
      " XX"
      "XX ";
  case T: return
      "XXX"
      " X ";
  case R: return
      "XX "
      "X  ";
  case P: return
      "X  "
      "X  "
      "   "
      "XX ";
  case Q: return
      " X "
      " X "
      "   "
      "XX ";
  case Y: return
      "XX "
      "X  "
      "   "
      "X  ";
  }
  return "";  // not reached for the seven real shapes; quiets the missing-return warning
}

char name(Shape s) {
  switch (s) {
  case L: return 'L';
  case S: return 'S';
  case T: return 'T';
  case R: return 'R';
  case P: return 'P';
  case Q: return 'Q';
  case Y: return 'Y';
  }
  return '?';  // not reached for the seven real shapes; quiets the missing-return warning
}

const int N = 3;

struct Form { char a[N][N][N]; };

bool operator==(const Form& x, const Form& y) {
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        if (x.a[i][j][k] != y.a[i][j][k]) return false;
  return true;
}

bool operator<(const Form& x, const Form& y) {
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        if (x.a[i][j][k] < y.a[i][j][k]) return true;
        else if (x.a[i][j][k] > y.a[i][j][k]) return false;
  return false;
}

ostream& operator<<(ostream& o, const Form& f) {
  for (int j = 0; j < N; ++j) {
    if (j > 0) o << endl;
    for (int i = 0; i < N; ++i){
      if (i > 0) o << "  ";
      o << '|';
      for (int k = 0; k < N; ++k)
        o << f.a[i][j][k];
      o << '|';
    }
  }
  return o;
}

Form empty() {
  Form f;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        f.a[i][j][k] = ' ';
  return f;
}

Form operator+(const Form& x, const Form& y) {
  Form z;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k) {
        char xa = x.a[i][j][k];
        char ya = y.a[i][j][k];
        char& za = z.a[i][j][k];
        if (xa == ' ' && ya == ' ') za = ' ';
        else if (ya == ' ') za = xa;
        else if (xa == ' ') za = ya;
        else za = '#';
      }
  return z;
}

string flatten(const Form& f) {
  ostringstream o;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        o << f.a[i][j][k];
  return o.str();
}

int as_bits(const Form& f) {
  int bits = 0;
  int n = 0;
  string s = flatten(f);
  for (string::const_iterator it = s.begin();
       it != s.end();
       ++it, ++n)
    if (*it != ' ')
      bits |= 1 << n;
  return bits;
}

int size(const Form& f) {
  int n = 0;
  string s = flatten(f);
  for (string::const_iterator it = s.begin();
       it != s.end();
       ++it)
    if (*it != ' ')
      ++n;
  return n;
}

Form default_form(Shape s) {
  const char* p = plan(s);
  Form f;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k) {
        char& fa = f.a[i][j][k];
        switch(*p) {
        case 0: fa = ' '; break;
        case ' ': fa = ' '; ++p; break;
        default: fa = name(s); ++p;
        }
      }
  return f;
}

Form tx(const Form& f) {
  Form g;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        g.a[i][j][k] = (k == 0) ? ' ' : f.a[i][j][k-1];
  return g;
}

Form ty(const Form& f) {
  Form g;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        g.a[i][j][k] = (j == 0) ? ' ' : f.a[i][j-1][k];
  return g;
}

Form tz(const Form& f) {
  Form g;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        g.a[i][j][k] = (i == 0) ? ' ' : f.a[i-1][j][k];
  return g;
}

Form rx(const Form& f) {
  Form g;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        g.a[i][j][k] = f.a[j][N-1-i][k];
  return g;
}

Form ry(const Form& f) {
  Form g;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        g.a[i][j][k] = f.a[k][j][N-1-i];
  return g;
}

Form rz(const Form& f) {
  Form g;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        g.a[i][j][k] = f.a[i][N-1-k][j];
  return g;
}

vector<Form> majors(const Form& f) {
  vector<Form> result;
  result.push_back(f);
  result.push_back(ry(f));
  result.push_back(ry(ry(f)));
  result.push_back(ry(ry(ry(f))));
  result.push_back(rz(f));
  result.push_back(rz(rz(rz(f))));
  return result;
}

vector<Form> minors(const Form& f) {
  vector<Form> result;
  result.push_back(f);
  result.push_back(rx(f));
  result.push_back(rx(rx(f)));
  result.push_back(rx(rx(rx(f))));
  return result;
}

bool in_bounds(Shape s, const Form& f) {
  return size(default_form(s)) == size(f);
}

vector<Form> shifts(Shape s) {
  vector<Form> result;
  Form fi = default_form(s);
  for (int i = 0; i < N; ++i) {
    Form fj = fi;
    for (int j = 0; j < N; ++j) {
      Form fk = fj;
      for (int k = 0; k < N; ++k) {
        if (in_bounds(s, fk))
          result.push_back(fk);
        fk = tz(fk);
      }
      fj = ty(fj);
    }
    fi = tx(fi);
  }
  return result;
}

vector<Form> all_forms(Shape s) {
  vector<Form> result;
  if (s == L) {
    Form f = default_form(s);
    result.push_back(f);
    result.push_back(ty(f));
    result.push_back(tz(f));
    result.push_back(ty(tz(f)));
  } else {
    set<Form> distinct_forms;
    vector<Form> f_sh = shifts(s);
    for (vector<Form>::const_iterator it_sh = f_sh.begin();
         it_sh != f_sh.end();
         ++it_sh) {
      vector<Form> f_min = minors(*it_sh);
      for (vector<Form>::const_iterator it_min = f_min.begin();
           it_min != f_min.end();
           ++it_min) {
        vector<Form> f_maj = majors(*it_min);
        for (vector<Form>::const_iterator it_maj = f_maj.begin();
             it_maj != f_maj.end();
             ++it_maj) {
          if (distinct_forms.find(*it_maj) == distinct_forms.end()) {
            result.push_back(*it_maj);
            distinct_forms.insert(*it_maj);
          }
        }
      }
    }
  }
  return result;
}

typedef pair<Form, int> FormBits;

vector<vector<FormBits> > combos() {
  vector<vector<FormBits> > result;
  for (int s = 0; s < kNumShapes; ++s) {
    result.push_back(vector<FormBits>());
    vector<FormBits>& forms_and_bits = result.back();
    vector<Form> forms = all_forms(static_cast<Shape>(s));
    for (vector<Form>::const_iterator it = forms.begin();
         it != forms.end();
         ++it) {
      forms_and_bits.push_back(FormBits(*it, as_bits(*it)));
    }
  }
  return result;
}

struct Stats {
  Stats() : tried(0), kept(0) {}
  int tried;
  int kept;
};

Stats operator+(const Stats& x, const Stats& y) {
  Stats z;
  z.tried = x.tried + y.tried;
  z.kept = x.kept + y.kept;
  return z;
}

template<typename C>
struct CompareBySize {
  bool operator()(const C& x, const C& y) const {
    return x.size() < y.size();
  }
};

pair<vector<Form>, vector<Stats> > solutions() {
  vector<vector<FormBits> > cs = combos();

  sort(cs.begin(), cs.end(), CompareBySize<vector<FormBits> >());

  vector<FormBits> partials;
  vector<Stats> stats;
  partials.push_back(FormBits(empty(), 0));

  for (vector<vector<FormBits> >::const_iterator fbs = cs.begin();
       fbs != cs.end();
       ++fbs) {
    vector<FormBits> next_partials;
    stats.push_back(Stats());
    Stats& partial_stats = stats.back();
    for (vector<FormBits>::const_iterator p = partials.begin();
         p != partials.end();
         ++p)
      for (vector<FormBits>::const_iterator fb = fbs->begin();
           fb != fbs->end();
           ++fb) {
        ++partial_stats.tried;
        if ((p->second & fb->second) == 0) {
          next_partials.push_back(FormBits(p->first + fb->first,
                                           p->second | fb->second));
          ++partial_stats.kept;
        }
      }
    partials.swap(next_partials);
  }

  pair<vector<Form>, vector<Stats> > result;
  for (vector<FormBits>::const_iterator p = partials.begin();
       p != partials.end();
       ++p)
    result.first.push_back(p->first);
  result.second.swap(stats);
  return result;
}

int main(int argc, char* argv[]) {
  pair<vector<Form>, vector<Stats> > sols = solutions();
  for (vector<Form>::const_iterator it = sols.first.begin();
       it != sols.first.end();
       ++it)
    cout << *it << endl << endl;

  Stats totals;
  cerr << "Possibilities (explored, kept) at each stage:" << endl;
  for (vector<Stats>::const_iterator it = sols.second.begin();
       it != sols.second.end();
       ++it) {
    totals = totals + *it;
    cerr << "  (" << it->tried << ',' << it->kept << ')' << endl;
  }
  cerr << "Total possibilities explored: " << totals.tried << endl;
  cerr << "Total possibilities kept: " << totals.kept << endl;
  return 0;
}

Tuesday 13 November 2012

A Better Sort of Input

In a previous post I developed a backtracking algorithm for finding solutions of a Soma cube. It was good to discover that an optimized version of the program would find all the solutions in only 1.2s.

Later, I began to wonder how much of the speed was down to being able to prune the solution space effectively, versus simply being able to check each candidate solution very quickly. A revised version of the program emits statistics to help find out.

> module CountingCube where
> import Cube hiding (main)
> import Data.Bits
> import Data.Monoid (mappend, mempty)
> import Data.List (intercalate, sortBy)
> import Data.Ord (comparing)
> import Data.Word
> import System.IO

In this version, the computation returns not just a list of solutions, but also a list of details of how much work was done for each additional shape folded into the partial solution. (The work list has to be reversed once it is complete because it is built up backwards.)

> countedSolutions = (map fst pbs, reverse work) where
>     (pbs, work) = foldr addForm ([(mempty, 0)], []) combos

For each shape, all the forms of that shape have to be tried against all the partial solutions built so far, so the number of candidates tried will be the product of those two lists' lengths. The number of candidates kept is just the number of partial solutions resulting from this step of the fold.

> addForm form (partialSolutions, work) = (partialSolutions', work') where
>     partialSolutions' = [(ps `mappend` f, psb .|. fb) |
>                          (f, fb) <- form,
>                          (ps, psb) <- partialSolutions,
>                          psb .&. fb == 0]
>     work' = (tried, kept) : work
>     tried = length form * length partialSolutions
>     kept = length partialSolutions'

As before, the program prints out all the solutions, separated by newlines. But this time it also describes how much work was done.

> printSolutions (solutions, work) = do
>   putStrLn $ intercalate "\n" [show s | s <- solutions]
>   hPutStrLn stderr $ unlines $ [
>                  "Possibilities (explored, kept) at each stage:"
>                 ] ++ map (("  " ++) . show) work ++ [
>                  "Total possibilities explored: " ++ show (sum (map fst work)),
>                  "Total possibilities kept: " ++ show (sum (map snd work))
>                  ]

> main = printSolutions countedSolutions

How did that turn out?

$ ghc --make -main-is CountingCube -O2 CountingCube.lhs
[2 of 2] Compiling CountingCube     ( CountingCube.lhs, CountingCube.o )
Linking CountingCube ...
$ time ./CountingCube > solutions.txt 
Possibilities (explored, kept) at each stage:
  (64,64)
  (6144,2544)
  (244224,38448)
  (5536512,805584)
  (58002048,1817232)
  (130840704,552384)
  (2209536,240)
Total possibilities explored: 196839232
Total possibilities kept: 3216496


real    0m3.594s
user    0m3.374s
sys 0m0.212s

Because the fold is right-associative, it starts with the last shape in the list: Y. As can be seen, the Y was tried in all 64 positions in which it could appear, and was found to fit in all of them - just as expected, given that no shapes had been placed ahead of it. The next shape fitted in only 2544 of the 6144 positions tried - less than half of the time.

Overall, the number of moves attempted was 196,839,232. The entire solution space (up to rotations and reflections) has 268,435,456 possibilities, but each of seven shapes has to be placed for each one of those, so a completely stupid non-backtracking implementation would try 1,879,048,192 moves. So pruning of the solution space like this avoids about 90% of the work.

But it gets better. While writing a version of the same algorithm in C++ (of which more later) I noticed that the total work done was different, even though the number of solutions ended up being the same, simply because the shapes were being considered in a different order. Of course ... so it would be better to place the shapes having fewer forms first, because they constrain the solution space more, leaving less backtracking to do.

Note that smarterSolutions here is identical to countedSolutions above, except that it sorts the input first.

> smarterSolutions = (map fst pbs, reverse work) where
>     (pbs, work) = foldr addForm
>                   ([(mempty, 0)], [])
>                   (reverse $ sortBy (comparing length) combos)

> main' = printSolutions smarterSolutions

GHC --make unfortunately doesn't seem to tell the difference between programs built with different -main-is settings, so the previous run's outputs need to be deleted first.

$ rm CountingCube CountingCube.o CountingCube.hi
$ ghc --make -main-is "CountingCube.main'" -O2 CountingCube.lhs
[2 of 2] Compiling CountingCube     ( CountingCube.lhs, CountingCube.o )
Linking CountingCube ...
$ time ./CountingCube > solutions.txt 
Possibilities (explored, kept) at each stage:
  (4,4)
  (256,130)
  (9360,2334)
  (168048,16625)
  (1596000,26403)
  (2534688,4080)
  (587520,240)
Total possibilities explored: 4895876
Total possibilities kept: 49816


real    0m0.070s
user    0m0.061s
sys 0m0.007s

That really was a valuable optimisation! Instead of trying nearly 200,000,000 moves, this version explores only 4,895,876 - it is over 40 times more efficient. And the solutions are now all found in under a tenth of a second. I am embarrassed that I didn't think of this sooner.

Thursday 8 November 2012

When Simpler Isn't Better

In a previous post I investigated the number of ways of solving a Soma cube by writing a Haskell program to find out for me. I was pleased to find that an optimised version of the program took about 1.2 seconds to find 240 solutions.

In that program, I represented each partial solution and each position of a shape as a 27-bit number, allowing simple bitwise operations to combine forms and filter out the invalid ones. Was that a premature optimisation? It would definitely have been faster in terms of programming time not to write all the conversion to bit patterns, and to operate directly on the representation of forms as lists of characters. But would the resulting program have run as fast?

A different version of just a few functions will show whether the extra work was necessary.

> module DumbCube where
> import Cube hiding (main)
> import Data.Monoid (mappend)
> import Data.List (intercalate)

For each shape, all the forms are precalculated, but not the bit representations that Cube.combos computes.

> dumbCombos = [allForms shape | shape <- enumFrom L]

Likewise, the fold to enumerate solutions combines form representations directly, instead of working on any bit representations.

> dumbSolutions = foldr1 addForm dumbCombos where
>   addForm partialSolution form = [cf |
>                                   cf <- [(pf `mappend` f) |
>                                          f <- form,
>                                          pf <- partialSolution],
>                                   not ('#' `elem` flatten cf)]

Finally, as before, print out all the solutions, separated by newlines.

> main = putStrLn $ intercalate "\n" [show s | s <- dumbSolutions]

How fast is the result now?

$ ghc --make -main-is DumbCube -O2 DumbCube.lhs
[1 of 2] Compiling Cube             ( Cube.lhs, Cube.o )
[2 of 2] Compiling DumbCube         ( DumbCube.lhs, DumbCube.o )
Linking DumbCube ...
$ time ./DumbCube > solutions.txt 

real    4m0.143s
user    3m58.932s
sys 0m1.072s

This slightly simpler program produces the same output but takes about 240 seconds - almost exactly one second per solution, on average. That also means it's roughly 200 times slower than the original program with its bitwise representation. That conversion to patterns of 27 bits obviously was worthwhile!

Tuesday 16 October 2012

Of Cubes and Computers

Mike has a Cube, like this.



The cube is made of seven shapes, like these.



How many ways are there of assembling the cube? One could try placing each shape in turn in each position that it can occupy within the cube, and then seeing which of those combinations fill the cube. How many combinations of placements of pieces are there, then? For a very loose upper bound on this figure, assume that each of the seven shapes can be oriented against any of six faces of the cube, and in any of four directions relative to that, and can be translated to originate in any of the 27 cells in the cube. That gives (6 × 4 × 27)^7 possibilities, which is 47,976,111,050,506,371,072.

But that's only a very loose bound. Many of the shapes have symmetries that mean that, after trying them one way round, there's no point in trying them the other way round - it'll just give the same answers. And none of the shapes can really be translated into all 27 different starting points without poking out of the sides of the cube, so that number can be replaced with a more accurate estimate too. Taking the symmetries and sizes of the seven shapes into account reduces the estimate to only about 40 million million possibilities.

There are about a million seconds in two weeks, so if a program can be written to check 40 million possibilities a second, then it'll only take two weeks to finish. On a 1GHz processor, checking 40 million possibilities a second requires a check to take on average 25 clock cycles. That seems doable enough to be worth investigating.

However, most of the work of checking possibilities can easily be avoided by noticing that, if any two pieces are placed in such a way that they would occupy the same space, then that definitely won't lead to any valid solutions. In other words, most of the possibilities will quickly turn out to be impossibilities. A program that can take advantage of that will probably finish in much less than two weeks.

One way to enumerate all the candidate shape placements is to enumerate all the placements of the first shape, then for each of those enumerate all the placements of the second shape, and then for each of those the placements of the third shape, and so on. In fact this approach allows exactly the work-avoiding strategy that is needed: when considering placing the Nth shape in a given position, look at whether that placement would conflict with the current positions of any of the N-1 shapes which have already been placed; if so, there is no need to continue investigating the positions of the remaining shapes. A solution is detected when all seven shapes have been placed, since that means each of the 27 cells in the shapes must be in different cells of the cube.

It's still worth thinking about how to check possible shape placements efficiently, though. One way to do it is to consider that a shape in a given position will fill some of the 27 cells of the cube and not others. So a partially-completed cube can be represented by a string of 27 bits, with a set bit indicating that the corresponding cell has been filled and a clear bit indicating that it is still free. And a choice of position for a particular shape can also be represented by a string of 27 bits, this time with the set bits indicating which of the cells will be filled by placing the shape in that position. Then, to see whether a new shape will fit into a partial solution in a given position, just check that the bit pattern for the partial solution and the bit pattern for the new shape placement have no set bits in common. And, to make a new partial solution with the new shape in its chosen place, just take the disjunction of those two bit patterns. Also, there aren't that many useful positions for each shape (between 64 and 144 in every case) so the bit patterns for each shape can be computed once to begin with and then reused.

Time to work this out in a Haskell module ...

> module Cube where

There will definitely be a need for some bit-twiddling and other things.

> import Data.Bits
> import Data.List (intercalate, nub, transpose, unfoldr)
> import Data.Monoid
> import Data.Word

There are seven shapes. The names here are intended to suggest their forms where possible. The P and Q shapes don't really look like any letter. They are mirror images of each other, though.

> data Shape = L | S | T | R | P | Q | Y
>   deriving (Enum, Show)

Each shape has a plan, which describes the cells of the cube that it occupies if placed in the left back corner of the bottom layer of the cube. The plan describes each row of each layer of the cube in turn, up to the point where all the remaining rows are empty. So the L plan can stop after only two rows, but the Y plan needs to go as far as the first row of the second layer.

> plan L = "XXX" ++
>          "X  "

> plan S = " XX" ++
>          "XX "

> plan T = "XXX" ++
>          " X "

> plan R = "XX " ++
>          "X  "

> plan P = "X  " ++
>          "X  " ++
>          "   " ++
>          "XX "

> plan Q = " X " ++
>          " X " ++
>          "   " ++
>          "XX "

> plan Y = "XX " ++
>          "X  " ++
>          "   " ++
>          "X  "

The terminology 'form' is introduced to mean an assignment of shapes to positions in the cube. A form could be represented in all sorts of ways - 27 bits, or a string of 27 characters, or an array of booleans, or something else. A three-dimensional list of characters is slightly ugly, in that it is quite possible to construct forms that have the wrong dimensions or meaningless contents. But it has the advantage of allowing lots of built-in list manipulation and display functions to be brought to bear.

> newtype Form = Form [[[Char]]]
> unform (Form x) = x

It's useful to be able to compare and sort forms. Again, the choice of representation allows different encodings of the same form to seem unequal when they shouldn't, but that isn't a problem in practice.

> instance Eq Form where
>   (Form x) == (Form y) = x == y

> instance Ord Form where
>   Form x `compare` Form y = x `compare` y

An easily readable representation of a form is helpful for listing the solutions (as well as for playing with individual shapes' forms to show that things are working properly). Transposing the layers of rows allows the successive layers to be printed next to one another instead of above one another to make better use of screen space - an example of the list representation making things easier.

> instance Show Form where
>   show (Form f) = unlines [showSlice s | s <- transpose f]
>    where
>     showSlice s = intercalate "  " [ showRow r | r <- s ]
>     showRow r = "|" ++ r ++ "|"

There is an empty form (corresponding to the partial solution where no shapes have been placed yet) and it's easy to see how to combine a form with another form and that (if the combination is valid) this will be a symmetric operation, so these forms naturally produce a Monoid.

> instance Monoid Form where
>   mempty = Form $ (replicate 3 . replicate 3 . replicate 3) ' '
>   mappend (Form x) (Form y) = Form $ (zipWith . zipWith . zipWith)
>                                      superpose x y
>    where
>     superpose ' ' ' ' = ' '
>     superpose  a  ' ' =  a
>     superpose ' '  b  =  b
>     superpose  _   _  = '#'  -- This should never happen!

To compute the bit pattern for a form, string the characters for each of its rows together and then convert each character in turn into a single bit.

> flatten = concat . concat . unform
> asBits :: Form -> Word32
> asBits f = sum [bit n | (n, x) <- zip [0..] (flatten f), x /= ' ']

It'll be useful to know how many cells of the cube are occupied by a given form.

> size f = length [x | x <- flatten f, x /= ' ']

As mentioned before, the plan of each shape describes the cells it occupies when pushed into a particular corner of the cube. Here is a way to take this plan (which doesn't even describe the entire cube!) and turn it into a complete form suitable for combining with other forms.

> defaultForm :: Shape -> Form
> defaultForm shape = Form layers
>  where
>   layers = (inThrees . inThrees) fullPlan
>   fullPlan = [formChar c | c <- plan shape] ++
>              replicate (27 - length (plan shape)) ' '
>   inThrees = takeWhile (not . null) . unfoldr (Just . splitAt 3)
>   formChar ' ' = ' '
>   formChar  _  = head (show shape)

Translation functions take a form for a shape, and give a form for the same shape but shifted by one place along one of the three axes. Because the default forms of the shapes are always in one particular corner, these translation functions only need to cover shifts away from that one corner in order to be able to describe all the possible positions into which the shape can be translated.

These functions implicitly fix the coordinate system of the form representation: a list of layers, each of which is a list of rows, each of which is a list of cells. This means, for example, that manipulating the outermost list of the structure corresponds to dealing with the cube layer-wise.

Each of these functions inserts empty space on one face of the cube and chops one slice of cells off the opposite face of the cube. Again, the list representation makes them easy to define.

> tx (Form f) = Form [[[' ', a, b] | [a, b, _] <- layer] | layer <- f]
> ty (Form f) = Form [["   ", a, b] | [a, b, _] <- f]
> tz (Form [a, b, _]) = Form [replicate 3 "   ", a, b]

The rotation functions are slightly more complicated. For convenience, they are based on a 90-degree rotation being equivalent to a transposition followed by a reflection.

> cw = transpose . reverse
> ccw = reverse . transpose

Rotation about the X axis leaves each row unchanged, so it's just the rotation applied to the outer two layers of lists.

> withForm f = Form . f . unform
> rx = withForm ccw
> rx' = withForm cw

To rotate about the Y axis, first transpose Z and Y to bring the Z and X axes next to each other, then apply the rotation to those Z and X axes, then transpose Y and Z to restore the coordinate system. Note that the direction of the applied rotation is reversed because transposition reverses the chirality of the representation.

> ry = withForm (transpose . map ccw . transpose)
> ry' = withForm (transpose . map cw . transpose)

Finally, to rotate about Z, apply the desired rotation to each layer in turn.

> rz = withForm (map cw)
> rz' = withForm (map ccw)

These translation and rotation functions are a basis for enumerating all the forms that a shape can produce. As discussed earlier, each shape can be rotated to have its base against one of six faces of the cube ...

> majors = [id, ry, (ry . ry), ry', rz, rz']

... and then rotated in the plane of that face in one of four ways ...

> minors = [id, rx, (rx . rx), rx']

... and then translated by up to two cells in each of three directions, but only so long as the shape still lies within the bounds of the cube.

> shifts shape = [t | t <- [times x tx . times y ty . times z tz |
>                           x <- [0..2],
>                           y <- [0..2],
>                           z <- [0..2]], inBounds t]
>  where
>   inBounds t = size (t (defaultForm shape)) == size (defaultForm shape)
>   times 0 f = id
>   times n f = f . times (n-1) f

So, all the forms that a shape can have can be listed by applying each combination of major rotation, minor rotation and shift to the default form for the shape. Some of these will be duplicates because of the symmetries of the shape. nub is used to eliminate them, which is quite inefficient but this only needs to be computed once for each shape.

As described so far, combining these lists of forms will give 48 times more solutions than are interesting, because every solution that is merely the rotation or reflection of another solution will also be found. An easy way to avoid that is to fix the orientation of one of the shapes - say, the first one. Then, there can't be any solutions that are simply rotations of another solution, because that one piece is always facing the same way. Similarly, omit any translations for it that give any forms which are mirror images. These considerations reduce the forms for the L shape to just four.

> allForms :: Shape -> [Form]
> allForms shape = nub $ [t (defaultForm shape) | t <- transforms shape]
>  where
>   transforms L = [id, ty, tz, (ty . tz)]
>   transforms shape =  [maj . min . shift |
>                        maj <- majors,
>                        min <- minors,
>                        shift <- shifts shape]

For each shape, all the forms it can have and their bit representations. Keeping this list on its own avoids computing a shape's forms and their bit representations afresh for each new partial solution in which the shape is being considered.

> combos = [[(f, asBits f) | f <- allForms shape] | shape <- enumFrom L]

For each shape in turn, add each of its forms in turn into the partial solutions found so far, but only if the bit patterns for the form and the partial solution don't overlap.

> solutions = map fst $ foldr1 addForm combos where
>   addForm partialSolution form = [(pf `mappend` f, pb .|. b) |
>                                   (f, b) <- form,
>                                   (pf, pb) <- partialSolution,
>                                   pb .&. b == 0]

Finally, print out all the solutions, separated by newlines.

> main = putStrLn $ intercalate "\n" [show s | s <- solutions]

An optimised build of this program is easily fast enough:

$ ghc --make -O2 -main-is Cube Cube.lhs
[1 of 1] Compiling Cube             ( Cube.lhs, Cube.o )
Linking Cube ...
$ time ./Cube > solutions.txt

real    0m1.239s
user    0m1.229s
sys 0m0.008s

(These times are from a system with dual 2.5GHz hyperthreaded Intel Core i5 CPUs.)

Naturally, the Soma cube (for that is what it is) has been studied thoroughly already.

It will be interesting to see what optimisations the compiler has found and what the structure of the optimised program looks like. It will also be interesting to see what a comparable program in C or C++ or Python would look like, and how fast they go.

Wednesday 16 May 2012

Why your software has bugs

I really like getting things right. Getting things right was such a rewarding skill at school that I've never managed to unlearn the lesson. And that's unfortunate, because writing software isn't about getting things right.

Writing software, like making any tool, is about getting things right enough.

The value of software is a function of the amount of work you can automate. The more your software can correctly do things on your behalf, the more time you can reclaim in which to make money in some additional way. So, if we (simplistically) define quality of software as the fraction of the time that it does the right thing, then its value should be a linear function of its quality. The line won't start at zero, because software can do the wrong thing very very fast and then you have to clear up after it. And it will never reach 100%, because even in the best case you'll have to spend a second or two casting an eye over the results to make sure things did go according to plan.

The cost of software doesn't work like that, though. Doing twice as much testing doesn't find you twice as many bugs, and there will always be some defects lurking in obscure corners where nobody has thought to look.

We can sketch the cost and value of software:


It's clear that, beyond a certain quality threshold, the cost of producing the software will outweigh the value that can be derived from it. Make the product that little bit more perfect, and it will have cost more to make than you can possibly hope to gain from it.

But it's worse than that. At the intersection of those two lines, the author of the software is spending all available resources on making software of the highest possible quality, making zero gain. It's better to maximise the difference between the two curves. The optimal quality is further down the scale:


This turns out to be a result in production theory and it's easy to find equivalents to it in all sorts of other fields. And the consequence is equally general. Once that blue curve is headed back downwards, every incremental increase in quality - every bug fixed - costs more than it's worth. There really is no point in fixing those bugs.