Information about http://research.microsoft.com/~horvitz/keys_aaai04.pdf

The Backdoor Key: A Path to Understanding Problem…

Tags: 2003a, backdoors, bility, department of computer science, empirical studies, ensembles, eric horvitz, faction, gomes, henry kautz, key size, microsoft research, problem instance, redmond wa, ruan, search algorithms, seattle wa, selman, solution time, washington edu,
Pages: 6
Language: english
Display cached document
Page 1
image
Page 2
image
Page 3
image
Page 4
image
Page 5
image
Page 6
image
              The Backdoor Key: A Path to Understanding Problem Hardness
                   Yongshao Ruan, Henry Kautz                                        Eric Horvitz
                     Department of Computer Science                              Microsoft Research
                        University of Washington                                Redmond, WA 98052
                           Seattle, WA 98195                                 horvitz@microsoft.com
                      ruan@cs.washington.edu
                     kautz@cs.washington.edu


                            Abstract                                 doors (on the order of a few dozen variables). These al-
                                                                     gorithms are best-case exponential in the size of the back-
  We introduce our work on the backdoor key, a concept that          door, and thus are practical for problems with backdoors up
  shows promise for characterizing problem hardness in back-
  tracking search algorithms. The general notion of backdoors        to about size 20. For problems with larger backdoors --
  was recently introduced to explain the source of heavy-tailed      which do indeed appear to be the vast majority of satisfia-
  behaviors in backtracking algorithms (Williams, Gomes, &           bility problems of interest -- the best complete algorithms
  Selman 2003a; 2003b). We describe empirical studies that           remain ones based on Davis-Putnam-Loveland-Logemann
  show that the key faction,i.e., the ratio of the key size to the   style backtracking search. Therefore it is important to in-
  corresponding backdoor size, is a good predictor of problem        vestigate the precise connection between the backdoor prop-
  hardness of ensembles and individual instances within an en-       erties of an problem instance or ensemble and problem
  semble for structure domains with large key fraction.              hardness as measured by solution time using backtracking
                                                                     search.
                        Introduction                                    Intuitively, the difficulty of identifying a backdoor set and
                                                                     setting a correct assignment to the backdoor variables by a
Propositional reasoning (SAT) is the best-known NP-                  backtracking solver is positively correlated to problem hard-
complete problem in computer science. Even though a                  ness because once the backdoor set is assigned correctly, the
class of SAT problems may be intractable in the worst case,          remaining problem becomes trivial. However, as we shall
most of its instances may still be polynomial-time solvable          describe below, problem hardness is not a simple function of
in practice and their hardness may vary significantly. It            backdoor size. In particular, the backdoor size does not cap-
has proven extremely difficult in practice to define realis-         ture the dependencies among backdoor variables. We intro-
tic problem ensembles where instances do not vary widely             duce the concepts of backdoor key variables to capture such
in hardness (Kautz et al. 2001). Different instances have            dependencies. A backdoor key variable is a backdoor vari-
widely varying hardness even in the well-known random 3-             able whose value is logically determined by settings of other
SAT problem ensembles with fixed clause-to-variable ratios           backdoor variables. We define the key fraction of a problem
(Kirkpatrick & Selman 1994). In such cases, it is crucial to         as the ratio of the key size to the corresponding backdoor
understand the hidden structures that can be used to identify        size, and investigate its relation with instance hardness and
problem hardness and as points of attack for search heuris-          ensemble hardness. Instance hardness is the hardness of an
tics.                                                                individual instance, which we take as the median search cost
   There has been considerable research interest in identify-        of a set of runs of a search algorithm applied to the given
ing such structures for many years. One example of such              instance. Ensemble hardness is the median hardness of an
hidden structure is the backdoor set (Williams, Gomes, &             ensemble of instances. Empirical results demonstrate that
Selman 2003a). Given a SAT formula, a set of variables               key fractions are good predictors for both ensemble hard-
forms a backdoor for a problem instance if there is a truth as-      ness and instance hardness for structure domains with large
signment to these variables such that the simplified formula         key fractions.
can be solved in polynomial time via use of the propaga-
tion and simplification mechanism of the SAT solver under
consideration.                                                              Previous Work on Hidden Structure
   The original motivation of the backdoor concept was to            We first summarize prior work on identifying hidden struc-
explain the heavy-tailed behavior of backtracking search al-         tures that are linked to problem hardness. Next, we de-
gorithms (Williams, Gomes, & Selman 2003b). Williams                 scribe two structure benchmark domains used in our empir-
et al. went on to describe a variety of simple random-               ical studies.
ized "guessing" algorithms for solving problems with small
backdoor sets, and empirically showed than many interest-            Hidden Structures for Problem Hardness
ing benchmark SAT problems have surprisingly small back-
                                                                     An important step in understanding problem hardness is
Copyright c 2004, American Association for Artificial Intelli-       to relate hardness with observations of phase transitions in
gence (www.aaai.org). All rights reserved.                           problem distributions as the underlying parameters are var-
ied. Many NP-complete problems like satisfiability and con-      QCP problem is to complete a partially-filled Latin square,
straint satisfaction display phase transition behavior in sol-   where the "order" of the instance is the length of a side of
ubility and the easy-hard-easy pattern of search cost asso-      the square. We used a version called Quasigroup with Holes
ciating with increasing constrainedness. Problems that are       (QWH), where problem instances are generated by erasing
under-constrained tend to have many solutions and thus gen-      values from a solved Latin square (Achlioptas et al. 2000;
erally easy to solve due to the high probability of guess-       Kautz et al. 2001). Note that QWH problems are satisfi-
ing a solution. On the other hand, problems that are over-       able by definition as they are built from previously solved
constrained tend to have no solution. These problems are         problems.
also generally easy to solve because there are many con-            We also will study problem hardness for a second prob-
straints and thus many possible solutions can be pruned ef-      lem domain, the morphed Graph Coloring Problem intro-
ficiently. Hard problems are usually located at a critical-      duced by Gent et al. (Gent et al. 1999). The Graph Col-
constrained point, where roughly half the instances are sat-     oring Problem (GCP) is a well-known combinatorial prob-
isfiable and half the instances are unsatisfiable. One well-     lem from graph theory. Given a graph G = (V, E), where
know example is the random 3-SAT problem with varying            V = {v1 , v2 , ..., vn } is the set of vertices and E the set of
clause-to-variable ratio: instances with the ratio equals 4.25   edges connecting the vertices, we seek to find a coloring
are the most difficult ones on average (Mitchell, Selman,        C : V  N , such that connected vertices always have dif-
& Levesque 1992). Other examples include the average             ferent colors. The challenge is to decide whether a coloring
graph connectivity for graph coloring problem (Cheeseman,        of the given graph exists for a particular number of colors.
Kanefsky, & Taylor 1991); constraint tightness (Smith 1994;      A p-morph of two graphs A = (V, E1 ) and B = (V, E2 ) is
Prosser 1996) for constraint satisfaction problems. A gener-     a graph C = (V, E) where E contains all the edges com-
alized the notion of constrainedness for an ensemble of in-      mon to A and B, a fraction p of the edges from E1 - E2
stances was introduced by Gent et al. (Gent et al. 1996).        (the remaining edges of A), and a fraction 1 - p of the edges
    However, information about the number of solutions does      from E2 - E1 . The test sets considered here are obtained
not tell the whole story. Mammem et al. (Mammen & Hogg           by morphing regular ring lattices, where the vertices are or-
1997) showed that the easy-hard-easy pattern still appears       dered cyclically and each vertex is connected to its k closest
for some search methods even the number of solution is held      in this ordering, and random graphs from the well-known
constant. In these cases, the pattern appears to be due to       class Gnm . The morphing ratio p controls the amount of
changes in the size of the minimal unsolvable subproblems,       structure in the problem instances. The test sets used in our
rather than changing number of solutions.                        studies are from (sat ).
    Another important line of research has been focused on          Finally, we consider domains of logistics planning (Kautz
the relationship between hardness and backbone variables.        & Selman 1996) and circuit synthesis problems (Kamath et
For satisfiable SAT instances, the backbone is the set of lit-   al. 1993). All instances of these domains are satisfiable
erals which are logically entailed by the clauses of the in-     and encoded in propositional encodings. The solver we used
stance, that is, backbone variables take on the same values      was Satz-Rand (Gomes, Selman, & Kautz 1998), a random-
in all solutions.                                                ized version of the Satz system (Li & Anbulagan 1997) with
    Achlioptas et al. (Achlioptas et al. 2000) demonstrated      powerful variable selection heuristics, and zChaff (version
that there is a phase transition phenomenon in the backbone      Z2001.2.17) (Zhang et al. 2001).
fraction­the ratio of the size of backbone to the total number
of variables­with changes in the number of unset variables                               Backdoors
or "holes" created in solutions to Quasigroup with Holes
                                                                 The backdoor (Williams, Gomes, & Selman 2003a; 2003b)
(QWH) instances. They also show that the phase transition
                                                                 of a problem aims to capture structural properties in a prob-
coincides with the hardness peak in local search. However,
                                                                 lem that underlie heavy-tailed behaviors in backtracking al-
the hardness peak of backtrack algorithms does not coincide
                                                                 gorithms. A set of variables forms a backdoor for a prob-
with the phase transition of backbone fraction seen in lo-
                                                                 lem instance if there is a truth assignment for these variables
cal search. The relationships between the hardness peak and
                                                                 such that the simplified formula can be solved in polyno-
backbone fraction is even less clear for optimization and ap-
                                                                 mial time by the propagation and simplification mechanism
proximation problems where backbone size has been found
                                                                 of the SAT solver under consideration. That is, after setting
to be negatively or positively correlated with hardness de-
                                                                 the backdoor variables correctly, the simplified formula falls
pending on the domain at hand (Slaney & Walsh 2001).
                                                                 in a polynomially solvable class.
    In other work, Singer et al. (Singer, Gent, & Smaill 2000)      Given a Boolean formula F, let V be the set of variables
found that the number of solutions is only relevant for small-   in F. Let AB : B  V  {T rue, F alse} be a partial
backbone random 3-SAT instances. They introduced a mea-          truth assignment and F[AB ] denote the simplified formula
sure of the backbone fragility of an instance, which indi-       obtained from the formula F by setting the partial truth as-
cates how persistent the backbone is as clauses are removed.     signment AB .
The backbone fragility of an instance is positively correlated
                                                                    A set of backdoor variables is defined with respect to a
with the problem hardness for local search.
                                                                 particular search algorithm; once the backdoor variables are
                                                                 assigned certain values, the problem becomes polynomial
Benchmark Domains and Solvers                                    time solvable by the algorithm. Such algorithms are called
We shall first investigate hardness and backdoor keys within     sub-solvers (Williams, Gomes, & Selman 2003a). A sub-
the benchmark domain of a version of the Quasigroup Com-         solver S always runs in polynomial time. Given a formula F
pletion Problem (QCP) (Gomes & Selman 1997). The basic           as the input, S either rejects F, or determines F correctly as
                                                  1                                                                                                         1
                                                                                                   Hardness                                                                    Hardness
                                                                                               Backdoor size                                                               Backdoor size




                    Hardness and backdoor size




                                                                                                                              Hardness and backdoor size
                                                 0.8                                                                                                       0.8


                                                 0.6                                                                                                       0.6


                                                 0.4                                                                                                       0.4


                                                 0.2                                                                                                       0.2


                                                  0                                                                                                         0
                                                  2^-8         2^-7        2^-6   2^-6 2^-4 2^-3          2^-2   2^-1   2^0                                  310       320      330   340 350 360 370      380   390   400
                                                                                    Morphing ratio                                                                                       Number of hole

Figure 1: Normalized backdoor sizes and ensemble hardness. Left: Morphed GCP and its x-axis is the morphed ratio. Right:
QWH of order 33 and its x-axis is the number of holes. For hardness, each data point represents the median value of the
run-time of Satz-rand on 100 instances. For backdoor size, data points represent the mean value of 100 instances
                                                 1000                                                                                                      100000


                                                                                                                                                            10000
                    Hardness (Satz)




                                                                                                                              Hardness (Satz)
                                                 100
                                                                                                                                                             1000


                                                                                                                                                                 100
                                                  10

                                                                                                                                                                 10


                                                       1                                                                                                           1
                                                           9          10          11        12       13          14     15                                             2       4      6    8    10    12   14    16    18
                                                                                       Backdoor size                                                                                       Backdoor size

Figure 2: Backdoor sizes and instance hardness. The x-axis represents the backdoor size and the y-axis is the instance hardness.
Left: Morphed GCP with morphed ratio = 2-6 . Right: QWH of order 33 and 363 holes. For each data point (x, y), y represents
the median search cost of 100 runs of Satz-rand and x represents the mean size of 100 backdoor sets.

unsatisfiable or satisfiable, returning a solution if satisfiable.                                                                                         variable branching function, i.e., allowing only unit propa-
  Given a sub-solver S, a formula F and its variables set V ,                                                                                              gation and simplification. By the definition of backdoor, the
we can definite formally the notion of backdoor.                                                                                                           set of branching variables in a solution contains a backdoor
Definition 1 (Backdoor) A nonempty subset B of the vari-                                                                                                   set. Unnecessary variables were randomly removed one by
                                                                                                                                                           one from the set of branching variables until the remaining
able set V is a backdoor for F w.r.t S if for some partial
                                                                                                                                                           set of variables is a minimal backdoor set. For each ensem-
truth assignment AB , namely, a backdoor truth assignment,
S returns a satisfying assignment of F[AB ].                                                                                                               ble, we collected data of 100 instances; for each instance,
                                                                                                                                                           we generated 100 backdoor sets.
    Note that, by definition, backdoor sets exist only for sat-                                                                                               Recall that a backdoor is a set of variables that, if set cor-
isfiable instances. A corresponding notion of strong back-                                                                                                 rectly, renders the remaining problem trivial to solve. Thus,
doors can be defined for unsatisfiable instances (Williams,                                                                                                the difficulty of identifying and assigning backdoor vari-
Gomes, & Selman 2003a). We will not discuss strong back-                                                                                                   ables correctly is proportional to the difficulty of solving a
doors.                                                                                                                                                     problem. It is reasonable to assume that the backdoor of
    Note that any superset of a backdoor set is also a backdoor                                                                                            a problem reflects a core structural property that is closely
set. We are interested in backdoor sets with no unnecessarily                                                                                              linked to the problem hardness of the problem. However, we
variables, which we refer to as minimal backdoor sets.                                                                                                     have found that the straightforward way of using backdoor
Definition 2 (Minimal Backdoor) A nonempty backdoor                                                                                                        size as a predictor does not correlate well with ensemble or
set B for F w.r.t S is minimal if for any v  B such that                                                                                                   instance hardness.
B - v is not a backdoor set for F w.r.t S.                                                                                                                    Figure 1 shows that there is no significant correlation be-
                                                                                                                                                           tween backdoor size and ensemble hardness for either mor-
  For the rest of this paper, unless otherwise specified, we                                                                                               phed GCP or QWH domains. For morphed GCP, the back-
assume that a backdoor set is minimal.                                                                                                                     door size increases with increasing morphing ratio p, or the
                                                                                                                                                           declining amount of structure in the problem instances. This
Backdoor Size and Hardness                                                                                                                                 is consistent with prior observations that structured domains
We performed empirical studies to test the relation between                                                                                                tend to have smaller backdoor sizes, while random domains
backdoor sizes and hardness on the morphed GCP, QWH,                                                                                                       tend to have larger backdoor sizes (Williams, Gomes, & Sel-
circuit synthesis, and logistics planning domains. We modi-                                                                                                man 2003a).
fied the SAT solver Satz-rand to keep track of the set of vari-                                                                                               We have also noted that ensemble hardness for morphed
ables selected for branching and their truth assignments in a                                                                                              GCP does not increase monotonically with increasing back-
solution. We also used Satz as a sub-solver by turning off its                                                                                             door size. Instead, the ensemble with the largest backdoor
size is the easiest to solve and the peak of hardness is located   assignment AB if and only if v is a dependent variable in F
at some intermediate point between the totally random and          with respect to the partial truth assignment AB-v
totally structured ensembles. The same observation applies
to QWH problems: Backdoor size increases monotonically                In distinction to variables with superficial dependencies,
with the increasing number of holes while the peak of hard-        which can be easily detected by unit-propagation and sim-
ness is located at some intermediate point.                        plification, a backdoor key set represent a deep depen-
   We additionally discovered that there is poor correlation       dency whose detection requires extra effort beyond unit-
between backdoor sizes and instance hardness within an en-         propagation and simplification. We believe that such deep
semble. In Figure 2, the graph on the left captures results ob-    dependencies provide a view onto the core structural prop-
tained with morphed GCP, showing a weak negative corre-            erties that lay at the foundations of problem hardness.
lation while the right figure for QWH shows a weak positive           We know that once the backdoor variables are set cor-
correlation. Empirical studies using zChaff demonstrated           rectly, the remaining problem becomes trivial. Thus, the
similar results, i.e., no significant correlation between back-    difficulty of solving a problem is proportional to the diffi-
door sizes and ensemble or instance hardness, which are not        culty of identifying and assigning backdoor variables. We
shown here due to limitation of space.                             do not yet know how to estimate the difficulty of identifying
   We believe that the reason that the size of a backdoor set      backdoor variables; this is an open challenge. It is generally
alone is not enough to predict hardness is that the back-          believed that modern variable selection heuristics have done
door size does not capture the dependencies among back-            reasonably well at identifying backdoor variables from the
door variables, i.e., the "dependent" variables whose truth        success of the state-of-the-art backtrack search algorithms.
values are completely determined by other variables in the            We believe that the difficulty of assigning backdoor vari-
backdoor set. For an example, in the case of morphed GCP,          ables correctly is captured by the notion of a backdoor key,
although the backdoor sizes of the totally random ensem-           because the dependent variables are the ones which are
ble are larger, there are few dependent variables. There-          likely to be assigned incorrectly. Thus, the size of a back-
fore, the probability of setting some backdoor variable to         door key is a prime candidate for predicting problem hard-
the "wrong" value is smaller, even though more backdoor            ness. Another candidate is the relative size of a backdoor
variables must be set. On the other hand, for instances at the     key set, which we refer to as the backdoor key fraction:
hardness peak, the dependencies among backdoor variables           Definition 5 (Backdoor Key Fraction) A backdoor key
are higher and there are more dependent variables. In these        fraction is the ratio of the size of the backdoor key set to the
cases, it is likely that a backdoor variable will be set to a      size of the corresponding backdoor set.
value that makes the remaining problem unsatisfiable.
   This observation suggests that problem hardness is the re-        We will see that in many domains the key fraction is a
sult of the interaction between backdoor size and dependen-        precise predictor of problem hardness.
cies among backdoor variables, which we will address in the
next section.                                                      Experiments
                                                                   We performed a set of empirical studies to test our hypoth-
                     Backdoor Keys                                 esis. We used the method described in previous section to
In the last section, we showed that problem hardness is not        collect minimal backdoor sets and then calculated the key
a simple function of backdoor size, and proposed the need          size for each backdoor set by testing whether a backdoor
for considering the dependencies among backdoor variables.         variable is in the key set with respect to the truth assignment
To capture such dependencies, we introduce the notion of           of the backdoor set.
the backdoor key and examine its relationship with problem            Fig. 3 shows the relation between ensemble hardness and
hardness.                                                          key fractions as well as key sizes. The graph at the left dis-
                                                                   plays results for the morphed GCP domain and the graph on
Definitions                                                        the right shows results for the QWH domain. For both prob-
Given a sub-solver S, let B be a backdoor of a formula F           lem domains, the peaks of key fraction coincide with those
with respect to S, and AB be a backdoor truth assignment,          of ensemble hardness, which illustrates a strong correlation
i.e., a value setting of B such that S returns a satisfiable       between key fraction and ensemble hardness. On the other
assignment of F. We use B - v as a simple denotation of            hand, there is little or no correlation between key size and
B - {v}, for any v  V . Before our exposition of backdoor          problem hardness. For morphed GCP, the peak of backdoor
keys, we need to define the notion of a dependent variable.        key size shifts to the left of the hardness peak. For QWH,
Definition 3 (Dependent Variable) A variable v  V is a             the size of backdoor key increases monotonically, which
dependent variable of formula F with respect to a partial          does not coincide with the easy-hard-easy pattern of prob-
truth assignment AB if F[AB ] determines v, i.e., there is a       lem hardness. Empirical studies using zChaff demonstrated
unique value assignment x of v such that F[AB  {v/x}] is           similar results.
satisfiable.                                                          In predicting instance hardness within an ensemble, the
                                                                   key fraction also shows the strongest correlation. Fig. 4
   To capture the dependencies among backdoor variables,           shows the correlation between key fraction and instance
we introduce the definition of a backdoor key, which we de-        hardness. For both domains, the overall shapes suggest lin-
fine formally as follows:                                          ear correlations between the key fraction and the log of in-
Definition 4 (Backdoor Key) A backdoor variable v is in            stance hardness. We performed linear regression on the data
the backdoor key set of B with respect to a backdoor truth         and summarized the results in table 1. As a comparison, we
                                                           1                                                                                                          1
                                                                                                     Hardness                                                                         Hardness




                    Hardness, key size and key fraction




                                                                                                                               Hardness, key size and key fraction
                                                                                                      Key size                                                                         Key size
                                                                                                   Key fraction                                                                     Key fraction
                                                          0.8                                                                                                        0.8


                                                          0.6                                                                                                        0.6


                                                          0.4                                                                                                        0.4


                                                          0.2                                                                                                        0.2


                                                           0                                                                                                          0
                                                           2^-8        2^-7    2^-6   2^-6 2^-4 2^-3     2^-2     2^-1   2^0                                           310       320    330    340 350 360 370        380   390   400
                                                                                        Morphing ratio                                                                                            Number of hole

Figure 3: Normalized key sizes, key fractions and ensemble hardness. Left: Morphed GCP where the x-axis represents the
morphed ratio. Right: QWH where the x-axis represents the number of unset variables or holes, removed from prior solutions.
For hardness, each data point represents the median hardness of 100 instances solved by Satz-rand. For backdoor size, the data
points represent the mean value of 100 instances.
                                                          1000                                                                                                       100000

                                                                                                                                                                      10000
                    Hardness (Satz)




                                                                                                                               Hardness (Satz)
                                                          100                                                                                                          1000

                                                                                                                                                                           100

                                                           10                                                                                                              10

                                                                                                                                                                             1

                                                                1                                                                                                          0.1
                                                                0.25     0.3     0.35    0.4    0.45     0.5      0.55   0.6                                                  0.1      0.2    0.3   0.4   0.5   0.6   0.7   0.8   0.9
                                                                                         Key fraction                                                                                                Key Fraction

Figure 4: Key fractions and instance hardness. The x-axis is the key fraction (the ratio of key size and backdoor size) and the
y-axis is the instance hardness. Left: Morphed GCP with morphed ratio = 2 -6 . Right: QWH of order 33 and 363 holes. For
each data point (x, y), y represents the median search cost of 100 runs of Satz-rand and x represents the mean key fraction of
100 backdoor sets.
also include results for backdoor size and key size. We re-                                                                                                          the phenomena of backbones in graph coloring problems
port correlation coefficients as well as two error measures                                                                                                          (Culberson & Gent 2001). By the original definition of
for linear regression, the root mean squared error (RMSE)                                                                                                            backbone--a variable that is fixed in all solutions--the prob-
and mean absolute error (MAE). The table shows that the                                                                                                              lems had no backbones. Culberson and Gent therefore
key fraction is the best predictor for problem hardness in all                                                                                                       adopted a new definition of backbone in terms of pairs of
aspects.                                                                                                                                                             variables that must take on different values in all solutions.
   An intuitive explanation for the success of the key frac-                                                                                                         We are currently pursuing a similar generalized notion of
tion in predicting hardness is that it represents the probabil-                                                                                                      backdoor key to take into consideration special problems
ity that a backdoor variable is dependent. We have argued                                                                                                            where the key fails to be predictive of hardness. For exam-
that variable selection heuristics essentially attempt to iden-                                                                                                      ple, one can define dependent pairs of variables with respect
tify backdoor variables, that only dependent backdoor vari-                                                                                                          to a partial truth assignment, and then analogously define the
ables can be set to incorrect values (values that rule out all                                                                                                       set of backdoor key pairs.
solutions), and that large backtracking trees and long runs
result when backdoor variables are initially set to incorrect                                                                                                                    Conclusion and Research Directions
values. Putting these arguments together makes the predic-                                                                                                           We extended the notion of backdoor variables by introduc-
tion that as the key fraction increases the probability that the                                                                                                     ing and investigating the concept of backdoor keys. Back-
solver has a long run should increase as well. Our experi-                                                                                                           door keys capture the link between dependencies among
mental results show that this is exactly the case.                                                                                                                   backdoor variables and problem complexity. To examine the
   The concept of backdoor keys works well for domains                                                                                                               relationship between the backdoor key of a problem and its
with relatively large key fractions, such as QWH and mor-                                                                                                            hardness, we performed experiments on several structured
phed GCP. We did the same investigation for the logistics                                                                                                            domains. Our analysis suggests that the key fraction is a
and circuit synthesis domains but found no significant cor-                                                                                                          good predictor for both ensemble and instance hardness for
relations. It turns out that for the two latter domains, most in-                                                                                                    domains with relatively large key fractions.
stances from the two domains have key sets of size zero! In                                                                                                             We are pursuing a deeper understanding of backdoors
other words, given any backdoor and its corresponding solu-                                                                                                          and backdoor keys, as well as their relationships to prob-
tion, you can flip the truth assignment of any single variable                                                                                                       lem hardness. In parallel, we are seeking to apply the back-
in the backdoor and still extend the backdoor to a solution.                                                                                                         door and backdoor key concepts to inform problem-solving
   A similar issue was highlighted in work that looked at                                                                                                            methods. Moving beyond backdoor keys, we are seeking to
                                              Satz-rand                                           zChaff
              Measures                  QWH                           GCP                 QWH                         GCP
                                    F    K    B      F                 K     B         F   K    B       F              K        B
       Correlation Coefficient     0.86 0.79 0.42 0.78                0.56 -0.26     0.70 0.70 0.42 0.54              0.53     0.01
               RMSE                0.41 0.47 0.71 0.24                0.37 0.38      0.39 0.68 0.87 0.37              0.37     0.38
                MAE                0.29 0.37 0.57 0.29                0.28 0.28      0.29 0.53 0.69 0.31              0.31     0.32
Table 1: Results of linear regression analysis for instance hardness, where F represents key fraction, K represents key size and
B represents backdoor size


understand more about the hidden structure of problems. We                able problems. In Proceedings of the Sixteenth International Joint
are particularly interested in understanding structure among              Conference on Artificial Intelligence (IJCAI-01), 351­358.
variables in the backdoor set for domains with small key                  Kirkpatrick, S., and Selman, B. 1994. Critical behavior in the
fractions. The presence and nature of such hidden struc-                  satisfiability of random Boolean expressions. Science 264:1297­
tures may be related to the difficulty of identifying backdoor            1301.
variables. Finally, we interested in investigating the effect of          Li, C. M., and Anbulagan. 1997. Heuristics based on unit propa-
clause learning on backdoor and key sizes. Clause learning                gation for satisfiability problems. In Proceedings of the Interna-
(Marques-Silva & Sakallah 1996; Bayardo & Schrag 1997;                    tional Joint Conference on Artificial Intelligence, 366­371. AAAI
Zhang 1997; Moskewicz et al. 2001; Zhang et al. 2001)                     Pess.
can be seen as a mechanism that helps to identify variable                Mammen, D. L., and Hogg, T. 1997. A new look at the easy-
dependencies. Our hypothesis is that the effectiveness of                 hard-easy pattern of combinatorial search difficulty. Journal of
clause learning may be related to the variation in backdoor               Artificial Intelligence Research 7:47­66.
or key sizes during the learning process.                                 Marques-Silva, J., and Sakallah, K. 1996. Grasp ­ a new search
                                                                          algorithm for satisfiability. In Proceedings of the International
                         References                                       Conference on Computer Aided Design, 220­227.
 Achlioptas, D.; Gomes, C. P.; Kautz, H. A.; and Selman, B. 2000.         Mitchell, D. G.; Selman, B.; and Levesque, H. J. 1992. Hard
 Generating satisfiable problem instances. In AAAI/IAAI, 256­261.         and easy distributions for SAT problems. In Rosenbloom, P.,
                                                                          and Szolovits, P., eds., Proceedings of the Tenth National Confer-
 Bayardo, R., and Schrag, R. 1997. Using csp look-back tech-              ence on Artificial Intelligence, 459­465. Menlo Park, California:
 niques to solve real-world sat instances. In Proceedings of the          AAAI Press.
 Fourteenth National Conference on Artificial Intelligence (AAAI-
 97), 203­208. New Providence, RI: AAAI Press.                            Moskewicz, M. W.; Madigan, C. F.; Zhao, Y.; Zhang, L.; and Ma-
                                                                          lik, S. 2001. Chaff: Engineering an Efficient SAT Solver. In Pro-
 Cheeseman, P.; Kanefsky, B.; and Taylor, W. M. 1991. Where               ceedings of the 38th Design Automation Conference (DAC'01).
 the Really Hard Problems Are. In Proceedings of the Twelfth
 International Joint Conference on Artificial Intelligence, IJCAI-        Prosser, P. 1996. An empirical study of phase transitions in binary
 91, Sidney, Australia, 331­337.                                          constraint satisfaction problems. Artificial Intelligence 81:81­
                                                                          109.
 Culberson, J., and Gent, I. 2001. Frozen development in graph
 coloring. Theoretical Computer Science 265(1­2):227­264.                 www.satlib.org.
 Gent, I. P.; MacIntyre, E.; Prosser, P.; and Walsh, T. 1996. The         Singer, J.; Gent, I.; and Smaill, A. 2000. Backbone Fragility
 constrainedness of search. In AAAI/IAAI, Vol. 1, 246­252.                and the Local Search Cost Peak. Journal of Artificial Intelligence
                                                                          Research 12:235­270.
 Gent, I. P.; Hoos, H. H.; Prosser, P.; and Walsh, T. 1999. Mor-
 phing: Combining structure and randomness. In Proceedings                Slaney, J. K., and Walsh, T. 2001. Backbones in optimization and
 of the Sixteenth National Conference on Artificial Intelligence          approximation. In IJCAI, 254­259.
 (AAAI'99), 654­660.                                                      Smith, B. 1994. Phase transition and the mushy region in con-
 Gomes, C. P., and Selman, B. 1997. Problem Structure in the              straint satisfaction problems. In In Proceedings of the eleventh
 Presence of Perturbations. In Proceedings of the Fourteenth Na-          European Conference on Artificial Intelligence, 100­104.
 tional Conference on Artificial Intelligence (AAAI-97), 221­227.         Williams, R.; Gomes, C.; and Selman, B. 2003a. Backdoors to
 New Providence, RI: AAAI Press.                                          typical case complexity. In IJCAI03.
 Gomes, C. P.; Selman, B.; and Kautz, H. 1998. Boosting Com-              Williams, R.; Gomes, C.; and Selman, B. 2003b. On the connec-
 binatorial Search Through Randomization. In Proceedings of the           tions between backdoors, restarts, and heavy-tailedness in combi-
 Fifteenth National Conference on Artificial Intelligence (AAAI-          natorial search. In SAT03.
 98), 431­438. New Providence, RI: AAAI Press.                            Zhang, L.; Madigan, C. F.; Moskewicz, M. W.; and Malik, S.
 Kamath, A. P.; Karmarkar, N. K.; Ramakrishnan, K. G.; and Re-            2001. Efficient conflict driven learning in boolean satisfiability
 sende, M. G. C. 1993. An interior point approach to Boolean              solver. In ICCAD, 279­285.
 vector function synthesis. Proceedings of the 36th MSCAS 185­            Zhang, H. 1997. SATO: an efficient propositional prover. In
 189.                                                                     Proceedings of the International Conference on Automated De-
 Kautz, H., and Selman, B. 1996. Pushing the envelope: planning,          duction (CADE'97), volume 1249 of LNAI, 272­275.
 propositional logic, and stochastic search. In Proceedings of the
 Thirteenth National Conference on Artificial Intelligence (AAAI-
 96), 1188­1194. Portland, OR: AAAI Press.
 Kautz, H.; Ruan, Y.; Achlioptas, D.; Gomes, C. P.; Selman, B.;
 and Stickel, M. 2001. Balance and filtering in structured satisfi-