Information about http://ils.unc.edu/~march/journalism_april08.pdf

Pages: 6
Language: english
Created: Thu Apr 17 12:03:52 2008
Digital Video: From Digital Libraries to Social Interaction

The Mary Junck Research Colloquium Series
School of Journalism and Mass Communication
University of North Carolina at Chapel Hill
April 17, 2008

Gary Marchionini
School of Information and Library Science
University of North Carolina at Chapel Hill
march@ils.unc.edu
www.ils.unc.edu/~march

Outline
Thesis: Digital Video blurs traditional information and communication boundaries
Three ongoing DV projects
    Open Video DL
        · Accessible, reusable files
        · Surrogation as key R&D challenge
        · User studies on surrogate effects
        · The question of channel synchronicity
    VidArch and preserving context
        · YouTube and 2008 Election case
    UNC YouTube Channel: issues and policies
From retrieval to (public) expression and conversation

2    Gary Marchionini, UNC-CH    4/17/2008




Digital Video Status
Digital video a burgeoning DL challenge: YouTube phenomenon (fall 2007: 65K new videos/day; 20TB/mo; 100M views/day)
Substantial research activity on storage, retrieval from engineering perspective (see IEEE, ACM MM)
Many large-scale DLs and services
    InforMedia, Fischlar, ECHO, Internet Archive, Open Video, public.tv, researchchannel
Most attention on system/collection building rather than services
Commercial attention on system and management
    IBM, MERL, Microsoft, Artesia, Virage, WFMY locally
NIST TREC Video Track for retrieval evaluation
Advances on capture, critical need for reuse tools
Portability advances (e.g., wifi, iPhone)
On the cusp of 'natural'

Theoretical View
Digital video is crafted expression
    Multiple channels (analog and digital)
    Visceral as well as intellectual effects (analog and digital)
    A descendant of film but with potential dynamics/behavior (digital)--changes over time, every time
Digital Libraries are journeys (learning environments) rather than destinations for patrons and librarians
    Beyond libraries as repositories to sharium
Open Video deals with reusable (open) video objects
    A journey toward new forms of expression and reflections on history
    What do you do with 24/7 feeds of video from every street corner in Manhattan?




Open Video Vision/Contributions
http://open-video.org
An open repository of video files that can be re-used in a variety of ways by the education and research communities
    Encourages contributions
    A testbed for interactive interfaces
An easy to use DL based upon the agile views interface design framework
    Multiple, cascading, easy to control views (pre, over, re, shared, peripheral)
    Views based upon empirically validated surrogates
    An environment for building theory of human information interaction
A set of methods and metrics that reveal how people understand digital video through surrogates

Background & Status
Begun 1995 with colleagues at UMD & BCPS; current instance at UNC initiated in 1999
Funding: NSF# IIS-0099538 1999-2004; NSF IIS 0455970 2006-07; Library of Congress NDIIPP (2007-08); IBM (2007); Google (2007)
Collaborators/Contributors: I2-DSI, ibiblio, CMU, UMD, Prelinger Archive, Internet Archive, NASA, ACM
~4000+ video segments
~40000 unique visitors per month
~1.8M hits/month
MPEG-1, MPEG-2, MPEG-4, QT
OAI provider
Ongoing user studies of surrogation




What is (was) a Surrogate?
Condensed representation for human consumption constructed to stand for an information object
Information compressions
Surrogates
    Enable decision-making by presenting search results in a uniform way
    Support sense making and incidental learning
    Save human time (compaction)
    Save network capacity and system resources
Examples
    Abstract, gloss, summary
    Title, bibliographic record
    Preview, snippet
    Profile
    Logo
    Your avatar

The Blur
The (relatively) Neat Past and the Very Scruffy Present
    Blurring the 'levels of representation' model of information (primary-secondary-tertiary-n-ary)
The metadata--surrogate continuum within the levels of representation continua
    Metadata region mainly for retrieval
    Metadata region mainly for and by machines (semantic web)
        · Automatic metadata generation advances
        · Implicit links and mining of interactions as metadata
    Surrogate region mainly for sense making
    Surrogate region mainly for and by people
        · Professional abstracting
        · Social tagging and annotations/links as surrogates




Representation and the Digital Blur
[Diagram. Labels: World (Physical, Mental); Information Object (Recorded, Ephemeral); Representation Level 1; Representation Level 2 (metadata, surrogates); Digital Representation: World (Physical, Mental), Information Object, metadata, surrogates, annotations]

Digital Video Surrogates
Classes
    Textual
    Visual
    Audio
Cost benefit analysis: maximize 'meaning' per unit time
    Transmission time
    Compaction rate
    Cognitive processing time
Performance vs. Preference
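The cost-benefit idea on the Digital Video Surrogates slide (maximize 'meaning' per unit time) can be made concrete with a rough sketch. Everything numeric here is an assumption for illustration: the slides do not define how 'meaning' is scored, so the `meaning` field is a hypothetical value in [0, 1] (for example, accuracy on a gist task), and the three time components are simply added.

```python
from dataclasses import dataclass


@dataclass
class Surrogate:
    name: str
    meaning: float          # hypothetical score in [0, 1], e.g. gist-task accuracy
    transmission_s: float   # seconds to deliver the surrogate to the user
    viewing_s: float        # seconds to play or scan the surrogate
    processing_s: float     # assumed extra seconds of cognitive processing

    def compaction_rate(self, source_s: float) -> float:
        """Ratio of source video duration to surrogate viewing time."""
        return source_s / self.viewing_s

    def meaning_per_second(self) -> float:
        """'Meaning' divided by the total time the surrogate costs the user."""
        total_s = self.transmission_s + self.viewing_s + self.processing_s
        return self.meaning / total_s


def rank(surrogates):
    """Order surrogates from best to worst meaning per unit time."""
    return sorted(surrogates, key=Surrogate.meaning_per_second, reverse=True)


if __name__ == "__main__":
    # Illustrative values only; they are not results from the studies.
    candidates = [
        Surrogate("storyboard", meaning=0.6, transmission_s=1.0,
                  viewing_s=10.0, processing_s=4.0),
        Surrogate("fast forward", meaning=0.7, transmission_s=3.0,
                  viewing_s=9.0, processing_s=8.0),
    ]
    for s in rank(candidates):
        print(f"{s.name}: {s.meaning_per_second():.3f} meaning/s, "
              f"compaction {s.compaction_rate(600):.0f}x")
```

A surrogate with a higher raw score can still rank lower once transmission and processing time are counted, which is one way to read the slide's "Performance vs. Preference" line.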




Research Framework
GOALS
    learning, work, entertainment
EFFORT
    TIME
        · time spent searching and viewing results
    MENTAL LOAD
        · perceptual load
        · cognitive load
    PHYSICAL LOAD
        · amount of muscle movement
VIDEO CHARACTERISTICS
    genre: documentary, narrative
    topic: literal, figurative
    style: visual, audio, textual, place
TASKS
    select video for viewing
    select scene for viewing
    copy and use scenes
    copy and use frames
    other tasks?
INDIVIDUAL CHARACTERISTICS
    domain experience
    video experience
    cultural experience
    computer experience
    info seeking experience
    metacognitive abilities
    demographics
SURROGATES, AGILE VIEWS
    display controls
    keywords
    storyboard w/ text, audio
    slide show w/ text, audio
    fast forward w/ audio
    poster frames
OUTCOMES
    PERFORMANCE
        · retrieval (precision, recall)
        · recognition (objects, action)
        · gist comprehension (linguistic, visual)
    SATISFACTION
        · perceived usefulness
        · perceived ease of use
        · flow
        · user satisfaction

Surrogates Examined
Storyboard with text keywords (20-36 per board @ 500 ms)
Storyboard with audio keywords
Slide show with text keywords (250 ms repeated once)
Slide show with audio keywords
Fast forwards 32X, 64X, 128X, 256X
Poster frames (1-3)
Real time clips/excerpts (7 sec)
Text (title, keywords, etc.)
Visual features (e.g., in/out, people, etc.)
Spoken descriptions
Spoken keywords
Combined visual (storyboard, fast forward) and spoken (descriptions, keywords)
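To give a feel for the fast forward rates listed under Surrogates Examined (32X, 64X, 128X, 256X), here is a minimal sketch. It assumes a fast forward is built by keeping every Nth frame and playing the kept frames at the source frame rate; the slides do not say how the surrogates were generated, and the 30 fps rate and two-minute clip are hypothetical.

```python
def fast_forward_indices(total_frames: int, speedup: int) -> range:
    """Indices of the frames kept when sampling every `speedup`-th frame."""
    if speedup < 1:
        raise ValueError("speedup must be >= 1")
    return range(0, total_frames, speedup)


def surrogate_seconds(source_seconds: float, speedup: int) -> float:
    """Playback time of the surrogate at the source frame rate."""
    return source_seconds / speedup


if __name__ == "__main__":
    fps = 30               # assumed frame rate
    source_seconds = 120   # hypothetical two-minute segment
    total_frames = fps * source_seconds
    for speedup in (32, 64, 128, 256):
        kept = len(fast_forward_indices(total_frames, speedup))
        seconds = surrogate_seconds(source_seconds, speedup)
        print(f"{speedup}X: {kept} frames, {seconds:.2f} s")
```

Under these assumptions, the two-minute clip shrinks to under four seconds at 32X and to about half a second at 256X, so the choice of rate trades compaction against how much a viewer can still recognize.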




Tasks
Recognition/Selection
    Text: object selection (text), keyword selection, description selection, title selection
    Still Image: object selection (graphical), keyframe selection
    Moving Image: excerpt selection
    Audio: select spoken description, select spoken keyword
Generative Inference
    Text: gist writing (free text)
    Still Image: visual gist determination
Metrics
    Accuracy
    Confidence
    Time to complete
    Usefulness, usability, engagement, enjoyment, preferences

User Studies
Qualitative Comparison of Surrogates (Spring 02, ECDL 02)
Fast Forwards (Fall 02, JCDL 03)
Text or Pictures (Spring 03, CIVR 03)
Narrativity (CHI 02, ASIST 03)
Shared views and History Views (Geisler dissertation)
TREC evaluation (Spring/summer 03; 05)
ViSOR (Gruss Master's paper)
Look vs Read (Hughes Master's paper)
Video relevance (CHI 05; ASIST04; Yang dissertation)
Cognitive load (Mu dissertation)
Teachers using video (Brown dissertation)
Spoken Audio and Storyboards (CHI 07)
Spoken Audio and Fast Forwards (current)




Audio Surrogates
Spoken descriptions, summaries, keywords
Visual displays of audio signals
Audio skims (excerpts)
Compressed speech
Parallel streams (cocktail party effect)




Recent Study (CHI 07)
36 participants, within subjects design used audio-only (spoken descriptions), visual-only (storyboards), and combined surrogates to do 5 kinds of recognition and gist tasks.
Accuracy, time to view, time to complete task, suite of affective measures
Statistically reliable differences on 3 of 5 accuracy tasks, time to view, and most affective measures. Combined generally better and preferred, audio almost as good as combined, visual alone faster to consume but no time penalties for audio and combined on task completion.
Implications
    Add audio surrogates
    Use audio in small form-factor devices
    Audio and visual quality important
    Synchronizing different channels in surrogates may not be necessary
    User controlled tradeoffs: time, satisfaction, performance

Synchronicity
Coordinated media channels lead to better understanding, retention, and satisfaction
What about multi-channel surrogates?
    Assume surrogate channels should also be coordinated?
    Perhaps more sense making possible if sampling across different channels and integrating in the head at consumption time rather than pre-coordination at indexing time?
We have initiated a series of studies




Tradeoffs
[Diagram: visual channels and audio channels, with the most salient samples marked, integrated in one of two ways (OR?)]
    Pre-processed integration yields less cognitive load. Less sense making and retention?
    User-centered integration (constructivist). More cognitive load. Better sense making and retention due to active participation and ?better? information samples

Video Preservation (VidArch) Project
http://ils.unc.edu/vidarch
What kind and how much context to preserve?
National Digital Information Infrastructure and Preservation Program (NDIIPP) funding via NSF and LoC.
Focus on specific topics
    2008 Presidential campaign (15K May 07-present)
    Energy, truth commissions, health, pandemics
Harvest video, metadata, and activity from YouTube; use API to query rather than crawl
Create Curator's tools and services
Fundamental DL issue of content/metadata/context boundaries in WWW objects




Extend Documentation Strategies to Web
Power of the masses to produce documentation of a subject or issue (folksonomy).
Web-based materials can have a strong impact on society, including phenomena such as voting behavior.
Democratizes collecting strategies; more than network news or campaign materials being collected.
Materials never before created or collected.

Challenges of Extending Documentation Strategies to Web Videos
Potentially too many for hand-crafted, artisan approach.
Ephemerality: here today; gone tomorrow.
Variable quality and relevance.
General lack of metadata.
Unclear provenance and authenticity.
Lack of contextualizing information.




Video Harvesting from YouTube

Election 2008 Collecting Scenario
Curator of Hillary Clinton's campaign.
Direct feed of materials from Clinton's staff and Democratic Party.
    Press releases, video, interviews, Facebook, etc.
Wide variety of traditional media: newspapers, TV, radio.
Now wide variety of bottom-up materials, including YouTube videos
    "Official" CNN debate videos, reactions, etc.




Video Harvesting from YouTube for Election 2008
56 queries (6 general and 50 names)
Use YouTube APIs, screen scraping and other tools to collect videos and context
Crawl every day (almost) since May 07
Get top 100 results for each query
Collect more than 20 attributes (including all the comments)
Download flash videos
Compare to blog postings

Overview of the collection (as of 04/17/2008)
Crawls = 273
Unique videos = 17,862
Total videos = 19,570
Video files = 181 GB
Total views = 496,581,313
Total comments = 3,017,625
Total ratings = 2,847,427
Total honors = 547
                                                                                                                                Capra, R., et al., (in press). Selection and Context Scoping for Digital Video Collections: An Investigation of YouTube and
                                                                                                                                Blogs. ACM/IEEE Joint Conference on Digital Libraries (June 2008).
                                                                                                                                Also see http://ils.unc.edu/vidarch
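
The harvesting steps listed on the slide (a fixed query list, top 100 results per query, repeated crawls) can be illustrated with a small sketch. This is not the VidArch crawler: `Collection`, `crawl_once`, and `fake_fetch` are hypothetical names, the fetch function stands in for whatever API or scraping call returns ranked results, and the reading of "total" versus "unique" videos (one video counted once per query that returned it) is an assumption, since the slides do not define those figures.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set, Tuple

TOP_N = 100  # the slide says the top 100 results are kept for each query


@dataclass
class Collection:
    """Running state of a harvested collection (hypothetical structure)."""
    videos: Dict[str, dict] = field(default_factory=dict)  # video id -> latest record
    query_hits: Set[Tuple[str, str]] = field(default_factory=set)  # (query, video id)
    crawls: int = 0

    @property
    def unique_videos(self) -> int:
        # Each video id is counted once, however many queries returned it.
        return len(self.videos)

    @property
    def total_videos(self) -> int:
        # Assumption: a video is counted once per query that returned it.
        return len(self.query_hits)


def crawl_once(
    queries: Iterable[str],
    fetch_results: Callable[[str], List[dict]],
    collection: Collection,
) -> Collection:
    """Run one crawl: keep the top TOP_N ranked results for every query."""
    for query in queries:
        for record in fetch_results(query)[:TOP_N]:
            collection.videos[record["id"]] = record  # refresh attributes
            collection.query_hits.add((query, record["id"]))
    collection.crawls += 1
    return collection


def fake_fetch(query: str) -> List[dict]:
    """Stand-in for a real search call: 150 ranked records, one shared by all queries."""
    shared = [{"id": "shared", "views": 10}]
    own = [{"id": f"{query}-{i}", "views": i} for i in range(149)]
    return shared + own


if __name__ == "__main__":
    collection = Collection()
    for _ in range(2):  # two crawls over the same two-query list
        crawl_once(["clinton", "obama"], fake_fetch, collection)
    print(collection.crawls, collection.unique_videos, collection.total_videos)
```

With 56 queries and 100 results each, a single crawl returns at most 56 × 100 = 5,600 results; repeated crawls add only the videos that newly enter a query's top 100, which is consistent with a collection of 17,862 unique videos after 273 crawls.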





                      Implications to Date
                            YouTube is as much a conversation as an information source
                               Comments and responses
                                    · Textual
                                    · Video
                            Layers of video representation that are strongly culture dependent
                                    · Video allusions (e.g., McCain's Obama girls, mama, etc.; Vote different)
                            The Internet is not quite mainstream
                                    · Ron Paul in blogosphere and YouTube

                      UNC YouTube Channel
                            http://youtube.com/uncchappelhill
                            Content
                               Information in Life Series
                                    · Lectures (SILS, ibiblio, Public Health)
                                    · Interviews with UNC Faculty/staff
                               Carolina Week playlist
                               Global Health playlist
                               News and Publicity




                      Policy Issues
                         Collection Development
                               Campus gatekeeping
                               Instructor ownership
                         Intellectual Property and Reuse
                               CC
                               Incidentals
                         Management
                               Content development
                               Channel freshness
                               Blowback
                         Marchionini, G. (in press). Digital Video Policy and Practice in Higher Education: From Gatekeeping to Viral Lectures. Educational Technology.

                      Implications
                         The medium gets attention; attention brings new requirements for response
                               Sturm's viral lecture on storytelling
                               Carson memorial comments
                         New opportunities and challenges for teaching and learning
                         http://youtube.com/uncchappelhill




Summary Implications:

     The beginning of an information and communication paradigm shift
       Lots of content to reuse
        · Professional to web/security cam
       Capture everything mentality
        · Public/private blur
       Non-textual tags and annotations
        · Primary/n-ary blur
       Conversation interjections
       Create/find/share blur

Q&A

     Thanks for your attention

     Thanks also to NSF, Library of Congress, NASA, Microsoft, IBM, and Google for partial support of this work




