Information about http://www.ics.uci.edu/~majumder/vispercep/shapeObject.pdf

Shape and Object Recognition: Models for Understanding …

Tags: alignment, correlations, different window, dilation, dilations, equivalence, generalization, global structure, hypothesis, invariant features, julian, object recognition, perceptual systems, recognition models, reference frames, reflections, representations, rotations, shape similarity, similarity methods,
Pages: 21
Language: english
Created: Fri May 16 12:38:28 2008
 Shape and Object Recognition:
   Models for Understanding
      Perceptual Systems
              Julian Yarkony




                 Outline
· Challenges and discussion
  of difficulties
· Shape equivalence
  methods
· Shape Similarity Methods




               Motivation
· Shape allows the observer to predict more
  properties of an object than any other
  property does
· Shape is not a single property but is made up
  of other properties
· In perceiving shape, local bits, parts, and
  correlations must be organized into
  representations, features, parts, and global
  structure.




      Difficulties with Geometric
            Transformations
· Translations: massive spaces must be
  searched, because the center of the desired
  image is unknown
· Rotations: even given the center,
  recognition is difficult
· Reflections: add further difficulty
· Dilations: many different window sizes
  must be searched
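
The four transformations above can be written as simple operations on 2D points. This is only a minimal sketch; the function names and the sample shape are illustrative assumptions, not taken from the slides.

```python
import math

def translate(points, dx, dy):
    """Shift every point by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]

def rotate(points, theta):
    """Rotate every point about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def dilate(points, k):
    """Scale every point about the origin by factor k."""
    return [(k * x, k * y) for x, y in points]

def reflect_x(points):
    """Mirror every point across the y-axis (x -> -x)."""
    return [(-x, y) for x, y in points]

# Hypothetical asymmetric triangle used only for illustration
shape = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]

# Dilate by 2, rotate by 90 degrees, then translate by (5, -1)
moved = translate(rotate(dilate(shape, 2.0), math.pi / 2), 5.0, -1.0)
```

A recognizer that must undo an unknown combination of these has to search over dx, dy, theta and k at once, which is where the large search spaces come from.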




Translation




 Rotation




      Dilation




    Reflection
NOT same as rotation
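
One way to see why a reflection is not a rotation: every 2D rotation matrix has determinant +1, while a mirror reflection has determinant -1, so no rotation angle can reproduce it. The sketch below checks this numerically; the helper names are assumptions made for illustration.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def rotation(theta):
    """2x2 rotation matrix for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

# Mirror across the y-axis
reflection = ((-1.0, 0.0), (0.0, 1.0))

# Determinants of rotations sampled every 15 degrees
rotation_dets = [det2(rotation(math.radians(d))) for d in range(0, 360, 15)]
reflection_det = det2(reflection)
```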




    Generalization, learning and noise

·   Other key difficulties
·   Generalization of shape structure
·   Learning new shape structure
·   Finding shapes in images
·   Noisy, or occluded object




      And that's just in the 2D world
· Just imagine what it's
  like in the 3D world




         Shape equivalence I
· Concerned with finding objects with the
  same shape despite differences in spatial
  viewing conditions
· If a perceived shape can be identified with a
  stored representation in memory, then its
  properties can be inferred.




        Three Theories of Shape
             Equivalence
· Invariant features hypothesis
· Transformational alignment hypothesis
· Object-centered reference frames
  hypothesis




           Invariant features
· Uses properties which are invariant under
  translation, rotation, dilation, and reflection:
· Number of lines or angles
· Relative size of lines
· Relative distances between parts
· Relative Orientation of lines and angles
· Closeness and connectedness
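
As a rough illustration of such properties, the sketch below computes three of them for a polygon (number of vertices, relative side lengths, and angles between adjacent edges) and compares a shape with a mirrored, rotated, dilated and translated copy. The function names, the rounding tolerance and the test triangle are assumptions chosen for the example.

```python
import math

def invariant_features(poly, ndigits=6):
    """Vertex count, sorted relative side lengths, sorted vertex angles."""
    n = len(poly)
    sides, angles = [], []
    for i in range(n):
        px, py = poly[i - 1]
        cx, cy = poly[i]
        qx, qy = poly[(i + 1) % n]
        sides.append(math.hypot(qx - cx, qy - cy))
        ax, ay, bx, by = px - cx, py - cy, qx - cx, qy - cy
        # Unsigned angle between the two edges meeting at this vertex
        angles.append(abs(math.atan2(ax * by - ay * bx, ax * bx + ay * by)))
    perimeter = sum(sides)
    # Sorting removes the dependence on starting vertex and traversal direction
    return (n,
            tuple(sorted(round(s / perimeter, ndigits) for s in sides)),
            tuple(sorted(round(a, ndigits) for a in angles)))

def transform(poly, theta, k, dx, dy, mirror=False):
    """Optionally mirror, then rotate by theta, dilate by k, translate."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y in poly:
        if mirror:
            x = -x
        out.append((k * (c * x - s * y) + dx, k * (s * x + c * y) + dy))
    return out

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
moved = transform(triangle, theta=0.7, k=2.5, dx=10.0, dy=-3.0, mirror=True)
same = invariant_features(triangle) == invariant_features(moved)
```

Because the features are computed directly, no search over transformations is needed before comparing the two shapes.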




 Invariant Features: Dominant ...
           until recently
· Simple model, easily generalized
· Means that fast methods can be used, without
  trying large numbers of transformations.
  Classification can be done immediately.
· McCulloch and Pitts, founders of neural
  networks, supported this theory and it
  came to dominate the literature




         Mach's Square/Diamond
· Uh oh, and yes those are the SAME size
· 45 degree rotation
· Mach's Square/Diamond perceived as having different
  shape depending on rotation
· Critically wounds invariant features hypothesis




       Transformational alignment




    Transformational Alignment
· Steps: consider two candidate shapes, A and B
· Find "anchor points"
· Find point correspondences
· Determine the transformation needed to align
  the anchor points of B with those of A (and
  those of A with B)
· Determine if the transformed versions are
  identical
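
The steps above can be sketched for the 2D case by treating points as complex numbers. Two anchor-point pairs then fix a transformation z -> m*z + t that combines rotation, dilation and translation. This sketch assumes the correspondences are already known (points are listed in matching order and the first two act as anchors), and it does not handle reflections; all names are illustrative.

```python
import cmath

def align(anchors_a, anchors_b):
    """Return (m, t) such that m * b + t maps B's two anchors onto A's."""
    a0, a1 = anchors_a
    b0, b1 = anchors_b
    m = (a1 - a0) / (b1 - b0)   # combined rotation and dilation
    t = a0 - m * b0             # translation
    return m, t

def same_shape(shape_a, shape_b, tol=1e-9):
    """Align B to A using the first two points, then compare all points."""
    if len(shape_a) != len(shape_b):
        return False
    m, t = align(shape_a[:2], shape_b[:2])
    return all(abs(m * zb + t - za) <= tol for za, zb in zip(shape_a, shape_b))

# Hypothetical shapes: B is A rotated by 0.7 rad, doubled in size and shifted
shape_a = [0 + 0j, 4 + 0j, 0 + 3j]
shape_b = [cmath.rect(2.0, 0.7) * z + (1 - 2j) for z in shape_a]
shape_c = [0 + 0j, 4 + 0j, 0 + 4j]

match_ab = same_shape(shape_a, shape_b)
match_ac = same_shape(shape_a, shape_c)
```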




    Transformational alignment II

· Very plausible
· Connected with many important visual
  phenomena
· machine learning note: computationally
  intensive
· Points of maximum concavity make good
  anchor points




                   Problems
· Objects do not come labeled with anchor points
· 3 non-collinear points (no single line passes
  through all 3) are needed for 3D shapes
· Correspondences between points are not
  predetermined: n! possible combinations
· 5! = 120, 10! = 3,628,800, 20! ≈ 2.4 × 10^18
· Mach's square/diamond will be rotated into
  alignment and so judged the same shape
· Anchor points must be visible in both figures
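
The factorial growth in candidate correspondences is easy to check. The short sketch below computes n! for the three cases on the slide and counts the orderings of 5 points by brute force; the variable names are illustrative.

```python
import math
from itertools import permutations

# Number of possible one-to-one correspondences between n points
counts = {n: math.factorial(n) for n in (5, 10, 20)}

# Brute-force enumeration is only feasible for very small n
orderings_of_5 = sum(1 for _ in permutations(range(5)))

for n, count in counts.items():
    print(f"{n}! = {count:,}")
```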




     Object-Centered Reference
              Frames
· Each object has its own "made to order" reference
  frame
· Each frame can handle transformational variance
· Each frame fits the structure of its designated
  object
· NOT COMPUTATIONALLY POSSIBLE
· How does one determine which is the correct
  frame?
· Mach's square/diamond? CAN FIND
  DIFFERENCE




Circle object centered coordinate
             systems




                  R: $X^2 + Y^2 = 1$
                  B: $(X-2)^2 + (Y-2)^2 = 4$
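
A short worked step shows why an object-centered frame helps here. Assuming circle B's own frame is placed at its center (2, 2) and scaled by its radius 2 (the symbols u and v are introduced only for this illustration), B takes exactly the same form as R:

```latex
\begin{align*}
u &= \frac{X-2}{2}, \qquad v = \frac{Y-2}{2} \\
(X-2)^2 + (Y-2)^2 = 4 \;&\Longrightarrow\; (2u)^2 + (2v)^2 = 4
\;\Longrightarrow\; u^2 + v^2 = 1
\end{align*}
```

In their own frames the two circles have identical descriptions, even though their equations differ in the shared coordinate system.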




    Object Centered Reference
  Frames Explanation for Failure
· According to Palmer, frames fail for 3 reasons
· Intrinsic bias: heuristics for internal structure are used
  (think axis of elongation for trees)
· Relative description: comparisons are made between
  descriptions of objects, not between the objects
  themselves
· Extrinsic bias: perceived orientation of object
  biased by orientation of the environment and other
  elements in the environment.




             Perceptual notes
· Human brain better at perceiving objects with
  "good" shape over amorphous objects
· Human subjects more quickly recall objects with a
  good orientation axis when presented under
  rotation than they do amorphous objects
  presented under rotation.
· Wiser believed that objects are stored in memory
  upright relative to their own reference frame




    Heuristics for reference frame
               selection
·   Gravitational orientation
·   Axis of relative symmetry
·   Axis of elongation
·   Contour orientation
·   Textural orientation
·   Contextual orientation
·   Motion orientation
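
As one example of these heuristics, an axis of elongation can be estimated from the second moments of a shape's points. This is a common moment-based technique offered here as an assumption; the slides do not say how the axis would be computed, and the function name is illustrative.

```python
import math

def elongation_axis(points):
    """Angle (radians, from the x-axis) of the principal axis of a point set."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Second central moments of the point cloud
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

# Hypothetical elongated shape: points along a line tilted by 30 degrees
tilt = math.radians(30)
bar = [(t * math.cos(tilt), t * math.sin(tilt)) for t in range(-5, 6)]
angle = elongation_axis(bar)
```

For a shape with no preferred direction, such as a square, the moments are equal and the axis is ambiguous, which fits the observation that other heuristics are then needed.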




       Contextual orientation




     Diamond             Tilted Square




               Shape similarity
· Shape equivalence alone is inadequate to describe
  the power and versatility of human shape perception
· The following theories describe how the
  human mind might represent shape
· Representing means creating a model for a
  set of cases that encompasses the generalities,
  parameters, and variances of the members
  of a class




                       Templates
· Ridiculed in vision texts
· Can build shape detectors using the
  figure below
· Performs convolution
· High positive or high negative
  numbers = high correlation
· One can also build grandmother
  detectors
· Look at the similarity to neural
  networks
· The features are taken directly
  from the image

     -   -   -   -   -
     -   +   +   +   -
     -   +   +   +   -
     -   +   +   +   -
     -   -   -   -   -
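
A minimal sketch of such a detector follows, assuming the template's "-" and "+" cells are encoded as -1 and +1 and the image uses the same encoding. It slides the template over the image and sums the products at each position (strictly a cross-correlation, which equals convolution for this symmetric template). The image and function name are made up for the example.

```python
TEMPLATE = [
    [-1, -1, -1, -1, -1],
    [-1, +1, +1, +1, -1],
    [-1, +1, +1, +1, -1],
    [-1, +1, +1, +1, -1],
    [-1, -1, -1, -1, -1],
]

def correlate(image, template):
    """Sum of elementwise products at every position where the template fits."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - th + 1):
        row = []
        for c in range(iw - tw + 1):
            row.append(sum(template[i][j] * image[r + i][c + j]
                           for i in range(th) for j in range(tw)))
        out.append(row)
    return out

# Hypothetical 7x7 image: background of -1 with a 3x3 block of +1 in the middle
image = [[+1 if 2 <= r <= 4 and 2 <= c <= 4 else -1 for c in range(7)]
         for r in range(7)]

responses = correlate(image, TEMPLATE)
inverted = correlate([[-v for v in row] for row in image], TEMPLATE)
```

The response peaks where the block lines up with the template, and the contrast-inverted image gives an equally large negative response at the same position.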




          Problems with templates
   · Cannot process rotations or reflections well
   · Multi-sensory channels create exponential
     growth in the number of templates
   · Edges can be used in multi-sensory
     channels, but complex shapes are too
     expensive




         Normalization: Making
            Templates Work
· Decrease or increase size so that the object
  is properly scaled for use in the template.
· Adjust orientation by longest axis, e.g. make
  the longest axis vertical
· Squashing or stretching (this can cause
  problems with the operations above)
· No scheme has yet been devised which does
  not result in combinatorial explosion
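
The first two normalization steps can be sketched as follows, under the assumption that the "longest axis" is the line between the two farthest-apart points of the shape. The shape is centered, rotated so that this axis is vertical, and scaled so the axis has length 1. The function name and the sample triangle are illustrative, and stretching is left out.

```python
import math
from itertools import combinations

def normalize(points):
    """Center the shape, make its longest axis vertical, scale it to length 1."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    # Longest axis: the pair of points that are farthest apart
    (x0, y0), (x1, y1) = max(combinations(centered, 2),
                             key=lambda pq: math.dist(pq[0], pq[1]))
    length = math.hypot(x1 - x0, y1 - y0)
    # Rotation that turns the longest axis to point straight up
    rot = math.pi / 2 - math.atan2(y1 - y0, x1 - x0)
    c, s = math.cos(rot), math.sin(rot)
    return [((c * x - s * y) / length, (s * x + c * y) / length)
            for x, y in centered]

# Hypothetical shape and a rotated, enlarged, shifted copy of it
tri = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0)]
k, th = 3.0, 1.1
copy = [(k * (math.cos(th) * x - math.sin(th) * y) + 7.0,
         k * (math.sin(th) * x + math.cos(th) * y) - 2.0) for x, y in tri]

norm_tri = normalize(tri)
norm_copy = normalize(copy)
```

Both versions end up in the same canonical pose, so a single template could be tried against them. Shapes with several equally long axes would make the choice ambiguous.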




    Normalization via Dilation
     -    -   -   -   -      -   -   -    -      -
     -    +   +   +   -      -   +   +    +      -
     -    +   +   +   -      -   +   +    +      -
     -    +   +   +   -      -   +   +    +      -
     -    -   -   -   -      -   -   -    -      -




Normalization via Rotation about
         longest axis
 -   -      -    -     -              -     -     -       -   -

 -   -      -    -     -              -     -     -       -   -

 -   +     +     +     -              -     +     +       +   -

 -   +     +     +     -              -     +     +       +   -

 -   -      -    -     -              -     -     -       -   -




  Normalization via stretching




         Note this transformation can create difficulty
         with dilation and rotation transformations




         Problem to note: Acuity
     -    -   -     -   -       -   -   -    -     -

     -    +   +     +   -       -   +   +    +     -

     -    +   +     +   -       -   +   +    +     -

     -    +   +     +   -       -   +   +    +     -
     -    -   -     -   -       -   -   -    -     -




                  Feature Lists
· Encode long lists of features that are present
  and not present
· Features are of two types:
· Local: has a 58° angle, has a curved
  line...
· Global: X symmetry, closed
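
A feature list can be represented as a fixed vocabulary of features, each marked present (1) or absent (0) for a given shape. The sketch below is only illustrative: the vocabulary, the three example shapes and the use of Hamming distance for comparison are all assumptions.

```python
# Hypothetical vocabulary mixing local and global features
VOCABULARY = ("has_curved_line", "has_acute_angle", "is_closed", "has_x_symmetry")

def encode(present):
    """Turn a set of present features into a 0/1 tuple over the vocabulary."""
    return tuple(1 if name in present else 0 for name in VOCABULARY)

def hamming(u, v):
    """Number of features on which two shapes disagree."""
    return sum(a != b for a, b in zip(u, v))

circle = encode({"has_curved_line", "is_closed", "has_x_symmetry"})
triangle = encode({"has_acute_angle", "is_closed", "has_x_symmetry"})
letter_c = encode({"has_curved_line", "has_x_symmetry"})

d_circle_triangle = hamming(circle, triangle)
d_circle_c = hamming(circle, letter_c)
```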




      Feature lists: The Biggest
               Problem
· Which features to choose?
· How to find them?
· Finding features is a lot like object
  recognition




           Multi-Dimensional
           scaling/Clustering
· Takes in a list of distances between objects
· Distance between A and B != distance between B
  and A
· Used to make a diagram for people to use, or for clustering
· Reduces dimensionality to (2 or 3) dimensions,
  then apply clustering algorithm
· Then objects can be identified with clusters in the
  new space.
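
The dimensionality-reduction step can be sketched with classical (metric) multidimensional scaling. This is one standard variant and not necessarily the one the slides have in mind. Since classical MDS expects symmetric distances, the sketch averages d(A,B) and d(B,A) first, which is an assumption, and it finds the leading eigenvectors by simple power iteration to stay dependency-free.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Embed n objects in `dims` dimensions from an n x n distance matrix."""
    n = len(dist)
    # Symmetrise, then square the distances
    d2 = [[((dist[i][j] + dist[j][i]) / 2.0) ** 2 for j in range(n)]
          for i in range(n)]
    row = [sum(r) / n for r in d2]
    total = sum(row) / n
    # Double centering turns squared distances into inner products
    b = [[-0.5 * (d2[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [math.sin(i + 1.0 + k) for i in range(n)]  # deterministic start
        for _ in range(iters):                         # power iteration
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the current vector
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        if lam > 0:
            for i in range(n):
                coords[i][k] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords

# Hypothetical objects: corners of a 3 x 4 rectangle, given only as distances
pts = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
coords = classical_mds(dist)
```

The recovered coordinates reproduce the input distances up to rotation and reflection, and can then be passed to any clustering algorithm.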




                Clustering




 Multi-Dimensional Scaling

Buf to nyc=15
Vat to lib=40
Jea to pla=52
Bmg to roc=14
...




         Machine learning note
· Invariant features means feature vectors can
  be used to describe an object
· Just like chemical fingerprints describe
  molecules
· Now we can apply clustering and generative
  modeling tools
· It may not be perfect but one can go a great
  distance on this theory
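
Once shapes are described by feature vectors, an off-the-shelf clustering method can group them. The sketch below uses a bare-bones k-means purely as an example of such a tool. The slides do not name a specific algorithm, and the initialisation (first k vectors) and the sample data are assumptions made to keep the example deterministic.

```python
import math

def kmeans(vectors, k, iters=20):
    """Very small k-means: returns (labels, centers)."""
    centers = [list(v) for v in vectors[:k]]  # naive deterministic start
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest center
        labels = [min(range(k), key=lambda c: math.dist(v, centers[c]))
                  for v in vectors]
        # Move each center to the mean of its members
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical 2D feature vectors forming two obvious groups
vectors = [(0.0, 0.0), (5.0, 5.0), (0.0, 1.0), (5.0, 6.0), (1.0, 0.0), (6.0, 5.0)]
labels, centers = kmeans(vectors, k=2)
```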




          Take Home Points
· Many ideas exist for describing the way the
  human brain recognizes objects
· None are quite sufficient
· Some say the brain is a quantum computer;
  my advisor is among them.



