 Wild Tech "Garden"

# Results for Solving Zebra and Sherlock

## Results

### Hypotheses 1 and 2

The hypothesis that achieving arc consistency using AC-3 is sufficient to solve the Zebra problem can be quickly disproved using the benchmark Zebra problem presented earlier in this paper. Once arc consistency has been achieved there still remain several variables with more than one possible value in its domain. Similar results can be seen for the Sherlock problem by applying AC-3 to a random case generated by the GenerateRandomProblem() procedure of the preceeding section.

One such sample problem appears in Appendix A

Kumar (in [Kumar92]) reports that arc consistency is not, in general, sufficient to find a solution (or that there is no solution). He then goes on to show that for a contraint graph having n nodes, it can be solved using only arc consistency, if the graph is n-consistent. He also shows that it is sometimes possible to solve a n-node bcsp with less than n-consistency.

### Hypotheses 3 and 4

Given that arc consistency by itself cannot solve all binary constraint problems it follows that an algorithm such as BM-CBJ2 which can is better at solving the bcsp’s. Having said that, comparing the performance of AC-3 to that of BM-CBJ2 for problems that AC-3 can solve is still an interesting exercise. Therefore, using problems generated using GenerateStrongKProblem(), we obtain the following results over 1000 test cases for each of the Zebra and Sherlock problem.

Measurement AC-3 vs. BM-CBJ2
Fewer Checks (number of cases) 25975
Less Time (number of cases) 277710
Average Number of Checks 234103755
Minimum Number of Checks 9876137
Maximum Number of Checks 42086134394
Average Time (CPU + System) (seconds) 1.771.72
Minimum Time (CPU + System) (seconds) 1.571.39
Maximum Time (CPU + System) (seconds) 1.985.24

AC-3 vs. BM-CBJ2: Zebra

Measurement AC-3 vs. BM-CBJ2
Fewer Checks (number of cases) 124876
Less Time (number of cases) 417573
Average Number of Checks 89090139170
Minimum Number of Checks 33529294
Maximum Number of Checks 16439161250399
Average Time (CPU + System) (seconds) 2.164.67
Minimum Time (CPU + System) (seconds) 1.831.50
Maximum Time (CPU + System) (seconds) 2.561145.12

AC-3 vs. BM-CBJ2: Sherlock

What is most interesting about the results is the fact that in a small number cases BM-CBJ2 is much worse than AC-3, so much so that for Sherlock most of the performance measures look worse for BM-CBJ2 than AC-3 despite the fact that it was better more than half the time. Exploring the properties of the problems that result in such bad performance is a useful future exercise.

### Hypothesis 5

The hypothesis that BM-CBJ2 is more efficient at solving most of a large random sample of of Zebra problems is true, as can be seen below. Further, we can see that the algorithm tested can be ordered, both in terms of constraint checks and in terms of time, as follows: BM-CBJ2 < DoubleX < Xover < Mutate although DoubleX is only slightly better than Xover. The change to Xover to get DoubleX is not a clear winner, but crossover plus mutation is without a doubt better than mutation alone. The numerical results follow.

Measurement BM-CBJ2 Mutate Xover DoubleX
Average Number of Checks 3554 12588855 3897429 3315161
Minimum Number of Checks 412 541110 379620 499423
Maximum Number of Checks 15093 132081622 323868611 212341720
Average Time (CPU + System) (seconds) 2.17 62.65 22.00 19.30
Minimum Time (CPU + System) (seconds) 1.82 5.44 4.66 5.14
Maximum Time (CPU + System) (seconds) 6.49 686.84 1603.49 986.12

Zebra, BM-CBJ2 vs Mutate vs Xover vs DoubleX

## Conclusions and Future Work

This paper studied a number of hypothesis, and in the process developed some algorithms. The algorithms for generating random problems proved to be harder to develop than expected. Original attempts involved attempts to use a heuristic rather than solving the bcsp for each new addition of a constraint, but that failed so the current solution (using AC-3 for strongly k-consistent generation and CBJ-Multi for ‘totally random’ problems) was developed.

The genetic algorithm development proved easier although a genetic algorithm which bested BM-CBJ2 was not found in producing this paper (a possible route to take for this is to make the chromasones more complex, rather than simply being the solution to the bcsp). BM-CBJ2 proved to be the best complete (i.e. works for all problems) algorithm as AC-3 cannot always find a solution, but for strongly k-consistent problems AC-3 was found to be the better choice for both the Zebra and the Sherlock problems. If determining k-consistency before solving the problem proves to be both possible and efficient a hybrid approach could prove to be ideal.

Further work on random problem generation is needed. It would, example, be useful to test the ‘totally random’ problems to determine if they are a good cross-section of real-world problems, or if they tend to cluster around a particular type of problem. In addition, it could be informative to determine how many of good random selection of problems are strongly k-consistent (and therefore solveable by AC-3).

Additional questions include the value of AC-3 preprocessing, which according to [Kumar92] is a common practise, and how BM-CBJ2 stacks up against both other backtracking algorithms ([Kondrak97] proves that BM-CBJ2 performs fewer constraint checks but whether is is faster depends on whether the reduction in constraint checking comes with a greater overhead cost). BM-CBJ2 also needs to be compared to other algorithms out there. In addition, this paper makes no attempt to compare variable instantiation orders for BM-CBJ2, and it is known that variable instantiation order can have a significant performance impact on backtracking and backjumping algorithms [Prosser93].