Friday, March 7, 2014

Sudoku in Loco

The Obligatory Sudoku Example

No discussion about constraint solvers is complete without the obligatory sudoku example. Unfortunately, sudoku is such a basic exercise for a constraint solver that it doesn’t really tell you much about the engine. But if anything, because sudoku is so standard, it has become a good way to get a feel for the style of the constraint DSL.

The Loco (0.2.0) version is concise and readable, with a nice separation between the core set of constraints that underlie all sudoku puzzles, versus the additional starter numbers provided by the puzzle.

In most constraint solvers, the constraint operators imperatively impose constraints on a variable store. In Loco, the constraint operators simply produce Clojure data structures. Since we’re working with Clojure data, assembling the full model is simply a matter of concatenating the sequence of constraints describing the fundamental rules of sudoku with the sequence of constraints specific to a given puzzle. This is a good example of how Loco lets you easily build portions of your model separately and then combine them or otherwise manipulate them with standard Clojure functions.

To follow the example, the main thing you need to know about Loco is that the notion of a subscripted variable, like gridi,j, is represented in Loco as a vector, e.g., [:grid i j]

First, here’s how we’re going to ultimately input the Sudoku grids to our solver:

;http://www.mirror.co.uk/news/weird-news/worlds-hardest-sudoku-puzzle-ever-942299
(def worlds-hardest-puzzle
  [[8 - - - - - - - -]
   [- - 3 6 - - - - -]
   [- 7 - - 9 - 2 - -]
   [- 5 - - - 7 - - -]
   [- - - - 4 5 7 - -]
   [- - - 1 - - - 3 -]
   [- - 1 - - - - 6 8]
   [- - 8 5 - - - 1 -]
   [- 9 - - - - 4 - -]])

Here’s the solving code:

(def basic-model
  (concat
    ; range-constraints
    (for [i (range 9) j (range 9)] 
      ($in [:grid i j] 1 9)),
    ; row-constraints
    (for [i (range 9)]
      ($distinct (for [j (range 9)] [:grid i j]))),
    ; col-constraints
    (for [j (range 9)]
      ($distinct (for [i (range 9)] [:grid i j]))),
    ; section-constraints
    (for [section1 [[0 1 2] [3 4 5] [6 7 8]]
          section2 [[0 1 2] [3 4 5] [6 7 8]]]
      ($distinct (for [i section1, j section2] [:grid i j])))))

(defn solve-sudoku [grid]
  (solution
    (concat basic-model
            (for [i (range 9), j (range 9)
                  :let [hint (get-in grid [i j])]
                  :when (number? hint)]
              ($= [:grid i j] hint)))))

Let’s test it out in the REPL:

=> (solve-sudoku worlds-hardest-puzzle)
{[:grid 4 0] 3, [:grid 5 1] 8, [:grid 6 2] 1, [:grid 7 3] 5, 
 [:grid 8 4] 1, [:grid 3 0] 1, [:grid 4 1] 6, [:grid 5 2] 7, 
 [:grid 6 3] 9, [:grid 7 4] 2, [:grid 8 5] 8, [:grid 2 0] 6,
 [:grid 3 1] 5, [:grid 4 2] 9, [:grid 5 3] 1, [:grid 6 4] 7,
 [:grid 7 5] 6, [:grid 8 6] 4, [:grid 1 0] 9, [:grid 2 1] 7,
 [:grid 3 2] 4, [:grid 4 3] 8, [:grid 5 4] 6, [:grid 6 5] 4,
 [:grid 7 6] 9, [:grid 8 7] 5, [:grid 0 0] 8, [:grid 1 1] 4,
 [:grid 2 2] 5, [:grid 3 3] 2, [:grid 4 4] 4, [:grid 5 5] 9,
 [:grid 6 6] 3, [:grid 7 7] 1, [:grid 8 8] 2, [:grid 0 1] 1,
 [:grid 1 2] 3, [:grid 2 3] 4, [:grid 3 4] 3, [:grid 4 5] 5,
 [:grid 5 6] 5, [:grid 6 7] 6, [:grid 7 8] 7, [:grid 0 2] 2,
 [:grid 1 3] 6, [:grid 2 4] 9, [:grid 3 5] 7, [:grid 4 6] 7,
 [:grid 5 7] 3, [:grid 6 8] 8, [:grid 0 3] 7, [:grid 1 4] 8,
 [:grid 2 5] 1, [:grid 3 6] 8, [:grid 4 7] 2, [:grid 5 8] 4,
 [:grid 0 4] 5, [:grid 1 5] 2, [:grid 2 6] 2, [:grid 3 7] 9,
 [:grid 4 8] 1, [:grid 0 5] 3, [:grid 1 6] 1, [:grid 2 7] 8,
 [:grid 3 8] 6, [:grid 0 6] 6, [:grid 1 7] 7, [:grid 2 8] 3,
 [:grid 0 7] 4, [:grid 1 8] 5, [:grid 0 8] 9, [:grid 8 0] 7,
 [:grid 7 0] 4, [:grid 8 1] 9, [:grid 6 0] 5, [:grid 7 1] 3,
 [:grid 8 2] 6, [:grid 5 0] 2, [:grid 6 1] 2, [:grid 7 2] 8,
 [:grid 8 3] 3}

On my machine, this “hardest Sudoku” takes about 17ms to solve. Benchmarking against your favorite constraint solver on your machine, and pretty-printing the output as a readable grid are left as an exercise for the reader.

Wednesday, March 5, 2014

Appointment scheduling in Clojure with Loco

Loco makes it easy to declaratively build constraint satisfaction models. In this blog post, we’ll look at a common use for constraint programming – appointment scheduling – and in so doing, we’ll see some of the ways that Loco goes beyond the features found in other Clojure constraint libraries.

(use 'loco.core 'loco.constraints)

Scheduling appointments with no conflicts

Imagine you have four people coming in for an interview, and you’ve set aside four timeslots in your day to conduct these interviews. You ask each person to list the timeslots when he/she can potentially come in.

Let’s use 0-based indexing to refer to the people, and 1-based indexing to refer to the timeslots.

Person 0 says she can come in at any of the four timeslots: 1, 2, 3, or 4.
Person 1 says he can come in at timeslot 2 or 3.
Person 2 says she can come in at timeslot 1 or 4.
Person 3 says he can come in at timeslot 1 or 4.

So the availability data looks like this:

(def availability
  [[1 2 3 4]
   [2 3]
   [1 4]
   [1 4]])

Let the variable [:person 0] denote the timeslot when person 0 is scheduled to come in, [:person 1] when person 1 comes in, etc.

(def person-vars
  (for [i (range (count availability))] [:person i]))

We want to constrain each [:person i] variable to the available timeslots.

(def availability-constraints
  (for [i (range (count availability))]
    ($in [:person i] (availability i))))

We want to ensure we don’t schedule two people in the same timeslot.

(def all-different-constraint
  (apply $all-different? person-vars))

For convenience, let’s assemble the constraints into one big list (the order doesn’t matter in Loco):

(def all-constraints 
  (conj availability-constraints all-different-constraint))

Now we’re ready to solve. Let’s dump the solution into a sorted-map for easy readability.

=> (into (sorted-map) (solution all-constraints))
{[:person 0] 3, [:person 1] 2, [:person 2] 4, [:person 3] 1}

So there you have it. Once we’ve played around with this example interactively in the REPL, and are confident in the model, we can easily abstract this into a function that takes availability data, and returns the schedule:

(defn schedule [availability]
  (->>
    (solution
      (conj
        (for [i (range (count availability))]
          ($in [:person i] (availability i)))
        ($distinct
          (for [i (range (count availability))] [:person i]))))
    (into (sorted-map))))

=> (schedule
     [[1 3 5]
      [2 4 5]
      [1 3 4]
      [2 3 4]
      [3 4 5]])
{[:person 0] 5, [:person 1] 4, [:person 2] 1, [:person 3] 2, [:person 4] 3}

I think the declarative Loco way of modeling constraints is concise and elegant, but this example could just as easily be done in, say, core.logic. So let’s push beyond, into an area that (as far as I know) can’t be done with core.logic.

Scheduling appointments minimizing conflicts

The above scheduler is somewhat naive.

=> (schedule 
     [[1 2 3 4]
      [1 4]
      [1 4]
      [1 4]])
{}

This doesn’t work because there’s no way to satisfy the constraint that no two people can be scheduled in the same timeslot.

But let’s say, hypothetically, that if absolutely necessary, we can potentially squeeze two candidates into the same timeslot. We’d rather not, but we can do it if we have to. Can we build a model for this?

Again, let’s start exploring the problem interactively with global defs and playing around with it at the REPL. Here’s the problematic availability example:

(def availability
  [[1 2 3 4]
   [1 4]
   [1 4]
   [1 4]])

As before, we’ll want to constraint each person’s timeslot to his/her availability schedule:

(def availability-constraints
  (for [i (range (count availability))]
    ($in [:person i] (availability i))))

Let’s define a few names for convenience. Let timeslots be a list of all the timeslot numbers.

(def timeslots (distinct (apply concat availability)))

Let person-vars be the list of all [:person i] variables.

(def person-vars
  (for [i (range (count availability))] [:person i]))

Now for the interesting part. We want to allow up to 2 people in a given timeslot. So we’ll let the variable [:num-people-in-timeslot 1] be the number of people signed up for timeslot 1, and so on. Let people-in-timeslot-vars be the list of all such variables.

(def people-in-timeslot-vars
  (for [i timeslots] [:num-people-in-timeslot i]))

Now, we create a list of constraints that state that each of these [:num-people-in-timeslot i] variables ranges between 0 and 2.

(def conflict-constraints
  (for [i timeslots]
    ($in [:num-people-in-timeslot i] 0 2)))

To give these :num-people-in-timeslot variables the appropriate meaning, we need to bind each [:num-people-in-timeslot i] variable to the number of times i occurs among the variables [:person 1], [:person 2], etc. Loco’s $cardinality constraint allows us to do exactly that. For example,

($cardinality [:x :y :z] {1 :number-of-ones})

will bind :number-of-ones to the number of times 1 occurs among :x, :y, and :z. So, the following constraint will bind all the [:num-people-in-timeslot i] variables to their appropriate values.

(def number-in-timeslots
  ($cardinality person-vars
                (zipmap timeslots people-in-timeslot-vars)))

To minimize the number of conflicts, we need to count the number of conflicts.

Let the variable :number-of-conflicts stand for the number of timeslot conflicts we have. We need two constraints on :number-of-conflicts. The first constraint just sets up the finite domain that the variable could range over (i.e., 0 to the total number of timeslots). We need to do this because in Loco, every variable must be declared somewhere in the model. The second constraint binds :number-of-conflicts to the number of times 2 appears in the variables [:num-people-in-timeslot 1], [:num-people-in-timeslot 2], etc.

(def number-of-conflicts
  [($in :number-of-conflicts 0 (count timeslots))
    ($cardinality people-in-timeslot-vars {2 :number-of-conflicts})])

We built the constraints in parts; now building the model is simply a matter of concatting all the constraints together. (Note that number-in-timeslots is a single constraint, so we concatenate [number-in-timeslots] in with the other lists of constraints).

(def all-constraints (concat availability-constraints
                             conflict-constraints
                             [number-in-timeslots]
                              number-of-conflicts))

Now, we’re all set up to solve the model.

=> (solution all-constraints
             :minimize :number-of-conflicts)
{[:person 0] 2,
 [:person 1] 4,
 [:person 2] 4,
 [:person 3] 1,
 :number-of-conflicts 1,
 [:num-people-in-timeslot 1] 1,
 [:num-people-in-timeslot 2] 1,
 [:num-people-in-timeslot 3] 0,
 [:num-people-in-timeslot 4] 2}

In the final version, we really only want to see the [:person i] variables; Loco allows us to hide the other variables from the output by prepending an underscore character in front of the variable names.

So let’s abstract this into a more robust schedule-with-conflicts function.

(defn schedule-with-conflicts [availability]
  (let [timeslots (distinct (apply concat availability)),

        availability-constraints
        (for [i (range (count availability))]
          ($in [:person i] (availability i))),

        person-vars
        (for [i (range (count availability))] [:person i]),

        people-in-timeslot-vars
        (for [i timeslots] [:_num-people-in-timeslot i]),

        conflict-constraints
        (for [i timeslots]
          ($in [:_num-people-in-timeslot i] 0 2)),

        number-in-timeslots
        ($cardinality person-vars 
                      (zipmap timeslots people-in-timeslot-vars)),

        number-of-conflicts
        [($in :_number-of-conflicts 0 (count timeslots))
          ($cardinality people-in-timeslot-vars {2 :_number-of-conflicts})]

        all-constraints (concat availability-constraints
                                conflict-constraints
                                [number-in-timeslots]
                                number-of-conflicts)]

    (into (sorted-map)
          (solution all-constraints :minimize :_number-of-conflicts))))

Let’s give it a spin:

=> (schedule-with-conflicts
       [[1 2 3 4]
        [1 4]
        [1 4]
        [1 4]])
{[:person 0] 2, [:person 1] 4, [:person 2] 4, [:person 3] 1}

Written with StackEdit.

Optimization with Loco

I had the pleasure of beta-testing Loco, which was announced today. Loco is a Clojure constraint solver library built on top of the Java library Choco. There are several constraint libraries available for Clojure, including core.logic, propaganda, and clocop, each with a slightly different focus.

The features that make Loco shine are the performance of the constraint engine, the fully declarative API, the ease with which one can build models, support for several interesting global constraints, and the ability to find an optimal solution for models with multiple solutions.

This is the first article of what I hope to be a series, detailing some of the interesting problems you can solve with Loco.

Scheduling Buses with Loco

(use 'loco.core loco.constraints)

Loco is a powerful and expressive constraint solver, but it can also be used to solve certain kinds of integer linear programs.

One classic example is bus scheduling. Imagine that we are city transportation planners and we want to minimize the number of buses we need to operate in order to meet demand. We know the number of buses demanded for four-hour blocks of time.

(def demands
  {:12am-4am  8
   :4am-8am  10
   :8am-12pm  7
   :12pm-4pm 12
   :4pm-8pm   4
   :8pm-12am  4})

So for example, this map tells us that there is sufficient demand for 8 buses operating between 12am and 4am.

The interesting twist is that buses operate for 8 hours at a time. So, if we set a bus into operation at 12am, it operates an 8-hour shift from 12am-8am. So the question is, how many buses do we need to run from 12am-8am, and from 4am-12pm, etc. in order to meet the above demands.

We can represent this by a series of variables, each of which must be an integer from 0 through 12 (since 12 is the maximum overall demand).

So with loco, we can get the solution quite simply:

(solution
  [($in :bus-12am-8am 0 12)
   ($in :bus-4am-12pm 0 12)
   ($in :bus-8am-4pm 0 12)
   ($in :bus-12pm-8pm 0 12)
   ($in :bus-4pm-12am 0 12)
   ($in :bus-8pm-4am 0 12)
   ($>= ($+ :bus-8pm-4am :bus-12am-8am) (demands :12am-4am))
   ($>= ($+ :bus-12am-8am :bus-4am-12pm) (demands :4am-8am))
   ($>= ($+ :bus-4am-12pm :bus-8am-4pm) (demands :8am-12pm))
   ($>= ($+ :bus-8am-4pm :bus-12pm-8pm) (demands :12pm-4pm))
   ($>= ($+ :bus-12pm-8pm :bus-4pm-12am) (demands :4pm-8pm))
   ($>= ($+ :bus-4pm-12am :bus-8pm-4am) (demands :8pm-12am))]
  :minimize
  ($+ :bus-12am-8am :bus-4am-12pm :bus-8am-4pm 
      :bus-12pm-8pm :bus-4pm-12am :bus-8pm-4am))

which yields

{:bus-8pm-4am 0, :bus-4pm-12am 4, :bus-12pm-8pm 7, 
 :bus-8am-4pm 5, :bus-4am-12pm 2, :bus-12am-8am 8}

Let’s see if we can generalize this to handle an arbitrary number of evenly-spaced time periods. Clearly, to do this we’ll need to get away from demands and variables that directly name the timespan. Instead, for our example in which we sliced the day into six 4-hour time periods, we can imagine indexing these blocks of time (0-based) as “Time period 0” through “Time period 5”. So we can just use a vector for our demands, for example:

[8 10 7 12 4 4]

means that 8 buses are demanded for time period “0” (corresponding to 12am-4am), 10 buses are demanded for time period “1” (corresponding to 4am-8am),… up to a demand of 5 buses for time period “5”.

We’ll make use of Loco’s ability to treat vectors as subscripted variables. So [:buses 0] denotes , which is our variable for how many buses we schedule starting at the beginning of time period 0 (i.e., 12am). [:buses 1] (i.e., ) is the variable for the number of buses starting at the beginning of time period 1, etc.

We will also need an input, span which indicates how many consecutive time periods are spanned by the bus’s shift. In our example, span would be 2 (because a bus works for 2 of our 4-hour time periods).

(defn minimize-buses
  "Takes a vector of the demands for any number of equally-spaced time slots.
   span is the number of time slots that a bus's operating time spans"
  [demands span]
  (let [time-slots (count demands),
        max-demand (apply max demands),

        declarations
        (for [i (range time-slots)]
          ($in [:buses i] 0 max-demand))

        constraints
        (for [i (range time-slots)]
          ($>=
            (apply $+ (for [j (range (inc (- i span)) (inc i))]
                        [:buses (mod j time-slots)]))
            (demands i)))]

    (solution
      (concat declarations constraints)
      :minimize (apply $+ (for [i (range time-slots)] [:buses i])))))

Let’s test it out on our original sample demand:

=> (minimize-buses [8 10 7 12 4 4] 2)
{[:buses 5] 0, [:buses 4] 4, [:buses 3] 7, [:buses 2] 5, [:buses 1] 2, [:buses 0] 8}

Hmmm, it’s a little hard to read. We can fix that:

=> (into (sorted-map) (minimize-buses [8 10 7 12 4 4] 2))
{[:buses 0] 8, [:buses 1] 2, [:buses 2] 5, [:buses 3] 7, [:buses 4] 4, [:buses 5] 0}

Good, same answer as before. But now we can easily adjust to alternative demand schedules. For example, here’s a solution for a demand schedule based on 2-hour time periods, while buses still work 8-hour shifts:

=> (into (sorted-map)
      (minimize-buses [1 5 7 9 11 12 18 17 15 13 4 2] 4))

{[:buses 0] 0, [:buses 1] 1, [:buses 2] 2, [:buses 3] 6, [:buses 4] 2,
 [:buses 5] 2, [:buses 6] 8, [:buses 7] 5, [:buses 8] 0, [:buses 9] 0,
 [:buses 10] 0, [:buses 11] 4}

Now, let’s try a demand schedule with 1-hour time periods, with buses working 8-hour shifts:

=> (into (sorted-map)
      (minimize-buses
        [1 3 5 7 9 11 12 13 14 15 16 19
         18 17 15 13 15 16 10 8 6 5 4 2]
        8))

Uh oh, this seems to run forever. We can fix this with the timeout feature. In the definition of minimize-buses, we change the call to solution as follows:

(solution
      (concat declarations constraints [constraint])
      :minimize (apply $+ (for [i (range time-slots)] [:bus i]))
      :timeout 1000)

The :timeout keyword specifies a number of milliseconds, after which the solver should return the best solution it has found so far:

=> (into (sorted-map)
      (minimize-buses
        [1 3 5 7 9 11 12 13 14 15 16 19
         18 17 15 13 15 16 10 8 6 5 4 2]
        8))

{[:bus 0] 1, [:bus 1] 2, [:bus 2] 2, [:bus 3] 2, [:bus 4] 2, [:bus 5] 2,
 [:bus 6] 1, [:bus 7] 1, [:bus 8] 2, [:bus 9] 3, [:bus 10] 3, [:bus 11] 5,
 [:bus 12] 1, [:bus 13] 1, [:bus 14] 2, [:bus 15] 2, [:bus 16] 2, [:bus 17] 0,
 [:bus 18] 0, [:bus 19] 0, [:bus 20] 0, [:bus 21] 0, [:bus 22] 0, [:bus 23] 0}

Written with StackEdit.