Erik W. Selberg, Ph. D.
4815 36th Ave. NE
Seattle, WA. 98105
h: (206) 517-3039 c: (206) 915-1472
erik@selberg.org
http://www.selberg.org
Short Biography
Dr. Selberg is one of the founding members of Microsoft Search Labs. Search Labs is a research
and development group within the Windows Live Search team. Its mission is to invent, build,
and deploy differentiated products and services for Windows Live, leading to new business
opportunities and thought leadership for Microsoft in the Internet space. He has a proven
track record in developing and operating large-scale Internet search systems, integrated Internet
applications, and hosted services. He created the MetaCrawler search service, now part of
InfoSpace and still one of the top meta-search services available. He is currently working at
Microsoft Corporation, leading a team responsible for the algorithmic relevance of MSN Search
and Live Search. Dr. Selberg is well published and holds two patents. He is focused on the
business of technology, creating profitable technology products and services that have a profound
impact on a large number of people.
Technical Interests
Search, information retrieval, collaborative systems, large scale systems, retrieval and scalability
issues of the World Wide Web, security and cryptography, audio and video systems.
Professional Experience
Senior Software Development Engineer, Search Labs
Microsoft Corporation, Redmond, WA June 2006 - present
As a senior software development engineer, I am one of the founding members of Microsoft Search
Labs. Search Labs is a research and development group within the Windows Live Search team.
Its mission is to invent, build, and deploy differentiated products and services for Windows Live,
leading to new business opportunities and thought leadership for Microsoft in the Internet space.
The project I am working on has not been publicly disclosed, and thus I am unable to comment
on it at this time.
Senior Program Manager, Web Search Algorithmic Relevance
Microsoft Corporation, Redmond, WA Feb. 2003 - June 2006
As the program manager for Web Search Algorithmic Relevance, part of the Windows Live
Search (formerly MSN Search) group at Microsoft, I drove a cross-group team of developers,
testers, and researchers to create the best world-wide algorithmic search engine. My primary
focus was core relevance of Web search results. Key to this endeavor was leveraging Microsoft's
strong researcher community, not just in MSR - Redmond but also MSR-SVC in Mountain
View, MSRC in Cambridge, UK, and MSRA in Beijing, China. The collaboration I drove has
been extremely successful - code and algorithms have been integrated from every lab into MSN
Search. This helped turn the Search group into a model of how to effectively partner with MSR
throughout Microsoft. The MSN Search Engine was built from the ground up and has continued
to improve to the point where it is very close in relevance to both Yahoo! and Google, both of
which have had several years' head start on their projects.
Development Manager, Content Discovery Systems
RealNetworks, Inc. Seattle, WA Aug. 2001 - Feb. 2003
As the Development Manager for the Content Discovery Systems Group of RealNetworks, I
oversaw the development, maintenance, and operations of nearly a dozen system platforms for
the Real Subscription Platform. One of my primary responsibilities was ensuring the proper
development and operation of the Real Media Authorization Services, mission-critical components
of the Real service that ensured subscribers were able to access content they were entitled to and
that non-subscribers were denied and presented with the appropriate upsell or error message. I
also managed the Real Content Management Services, which included the Real Search Service.
These services enabled various vendors and internal editors to provide content for the various
Real subscriptions as well as enabled various front-end teams in the company, including external
localization groups, to provide customers with the ability to discover and play various pieces
of content. Two of the primary systems under my direct supervision were the Radio Service,
which powered the various Radio offerings of Real, and the Album Information Service, which
is automatically used by every user of the RealPlayer to provide information to that user about
the media they are playing, such as album, artist, and track information. These two services are
the most commonly used services of people using the RealPlayer.
Startup Consultant
Various Engagements 1996, 1999, 2000
As a startup consultant, I aided various startup companies with early stage business and technical
development. I helped draft and edit business plans and patent applications. I aided the
development of initial prototypes and demos. I was also instrumental in facilitating initial and
secondary meetings with local VCs and angel funds, such as Madrona Venture Capital and Arch
Venture Partners.
My first engagement was with Netbot Inc., a Seattle-based startup that licensed MetaCrawler
from my research group at the University of Washington. Over the summer of 1996, I worked
with Netbot to transfer the MetaCrawler technology and initiate the groundwork for Netbot to
host a commercial MetaCrawler service, which would be able to handle upwards of 10 million
queries per day at launch. In addition, I was responsible for initial negotiations with the Web
search services for commercial use of MetaCrawler with their services, and was heavily involved
with the conception of the business plan for the commercial MetaCrawler service.
Shortly after I completed my work at Netbot, it was acquired by Excite Inc. for $35 million
for its Jango software product, a comparison shopping agent that used the MetaCrawler
engine. Excite was later acquired by @Home, Inc. in spring of 1999 for $7 billion.
Director of Technology
FizzyLab, Inc. Seattle, WA July, 2000 - Jan., 2001
FizzyLab was a contextual advertising and search company. Its primary product, Content
Relevator, provided a "More articles like this" for several online media providers, such as BusinessWeek
and People, and its secondary product, Commerce Relevator, provided relevant advertising links.
Time Magazine was also among the initial customers. FizzyLab ran on a Java / Solaris / Oracle
platform.
As the Director of Technology for FizzyLab, I led a team of 20 people responsible for the
design, implementation, and maintenance of the four primary FizzyLab services and various
infrastructure components. I was also responsible for both strategic planning as well as tactical
implementation of the technology group for the company. I was involved with forming company
strategy as it pertains to both existing and developing technologies, assessing the technical
challenges of various market opportunities, evaluating third party technology, and evaluating
business opportunities that arise from FizzyLab's Advanced Technology Group. With the other
company Directors, I created company-wide tactical road maps on a quarterly basis. I then
executed against the company road map, ensuring that initiatives were properly scoped and
staffed, teams were engaged, and engineering work was properly scheduled. My managerial
responsibilities included creating project and technology teams, giving direction to team leads and
members, hiring and firing, setting individual goals, and developing and executing development
processes.
Director of Search
Go2Net, Inc., Seattle, WA June, 1999 - June, 2000
Go2Net was a diversified Internet company with products in four main areas: Web search,
small business hosting, financial message boards, and online games, with emerging initiatives
in broadband properties. The Search group was responsible for the development, maintenance,
and operations of MetaCrawler, DogPile, and 100hot. In July of 2000, MetaCrawler and
DogPile were each handling over 2 million queries per day on about 20 lower-end Linux boxes.
As the Director of Search for Go2Net, I led a team that managed the daily operation and
enhancement of MetaCrawler, DogPile, and 100hot, which were responsible for generating several
million dollars of revenue per quarter. I helped form the product strategies, created the schedule,
and assessed third-party technologies for acquisition or partnerships. On the technology side, I
created and implemented the development, build, and release processes, developed the high level
architectures of the system, scheduled resources, and oversaw the hiring of the engineering team.
Research Assistant
University of Washington, Seattle, WA September, 1993 - June, 1999
While earning my doctorate at the UW, I created the MetaCrawler parallel search engine
with my advisor, Prof. Oren Etzioni. I wrote all the code and handled all the administration
of MetaCrawler for over a year. I was responsible for ensuring MetaCrawler remained fast and
responsive as the number of users grew, given limited hardware resources - four DEC Alphas. I
developed many software optimizations and automated server administration tools towards this
effort. Before MetaCrawler was licensed in 1996, it was handling almost 100,000 queries per day
with room to grow, which was roughly 3-5 times as many as its closest competitor, SavvySearch.
While investigating the overlap of search engine results, I observed that experiments that used
search results as data were very unstable. This led to my work on empirical evaluation of major
search engines. This evaluation measured how rapidly the results of a query change over time,
as well as how different the results of a query are when the query is submitted with different
query options. The evaluation showed that the results of Web search engines change extremely
rapidly, even when given the same query.
Research Intern
AT&T Bell Laboratories, Murray Hill, NJ Summer, 1994
At AT&T Bell Laboratories, I worked with Bart Selman and Henry Kautz on the Bots project.
The Bots project was an exploration into creating personal assistant software agents that would
communicate with one another to accomplish various tasks, such as meeting scheduling, expertise
referral, and e-mail prioritizing. My tasks were to re-write the Bots in a more manageable way as
well as to create a Bot communications protocol so that Bots could effectively transfer information
between themselves.
Research Intern
Pittsburgh Science Center, Pittsburgh, PA Summer, 1993
I was a research intern at the Pittsburgh Science Center, working with Prof. Adam Beguelin on
the Parallel Virtual Machine (PVM) project, which simulated a distributed memory multiprocessor
by using a network of workstations. I designed a monitoring and debugging tool for applications
using PVM as well as designed and integrated a Kerberos-based authorization system for PVM
applications.
Education
Ph.D., Computer Science and Engineering June, 1999
University of Washington, Seattle, WA.
M.S., Computer Science and Engineering June, 1995
University of Washington, Seattle, WA.
B.S., Mathematics / Computer Science and Logic & Computation May, 1993
Carnegie Mellon University, Pittsburgh, PA.
Awards
The C|Net Awards for Internet Excellence, 1995
MetaCrawler was one of three finalists for the Best Internet Search Engine.
Allen Newell Award for Excellence in Undergraduate Research, 1993
Teaching Experience
University of Washington, Seattle, WA
Winter - Spring, 1994
Teaching Assistant for Professors John Zahorjan (Winter Quarter) and Steve Hanks (Spring
Quarter) for UW CSE undergraduate second-quarter introduction to computer science course.
This course is taken by roughly 400 students per quarter, and gives undergraduates a further
understanding of more advanced introductory topics, such as object oriented programming,
searching and sorting, pointers, etc. With John Zahorjan and three other TAs, I migrated the
course from using Ada on UNIX systems to C++ on Windows, Mac, and UNIX systems.
Publications
Thesis
"Towards Comprehensive Web Search." Erik Selberg. Ph. D. Thesis, University of Washington,
June, 1999.
Fully Refereed Papers
"On the Instability of Web Search." Erik Selberg and Oren Etzioni. In RIAO '00: Content-based
Multimedia Access, Apr., 1999.
"Multi-Service Search and Comparison using the MetaCrawler." Erik Selberg and Oren Etzioni.
In Proceedings of the 4th International World Wide Web Conference, Dec., 1995.
"TRON: Process-Specific File Protection for the UNIX Operating System." Andrew Berman,
Virgil Bourassa, and Erik Selberg. In Proceedings of the 1995 Winter USENIX Conference, Jan.,
1995.
Invited Papers
"The MetaCrawler Architecture for Resource Aggregation on the Web." IEEE Expert, Jan. /
Feb. 1997, 12(1).
Technical Reports
"Experiments with Collaborative Index Enhancement." Erik Selberg and Oren Etzioni. University
of Washington Tech Report UW-CSE-98-06-01, June 1998.
"How to Stop a Cheater: Secret Sharing with Dishonest Participants." Erik Selberg. Carnegie
Mellon University Tech Report CMU-CS-93-182, June 1993.
Patents
Dr. Selberg holds two patents covering a method and system that use a wrapper description
language to execute a query on a network:
6,102,969 Method and system using information written in a
wrapper description language to execute query on a network
6,085,186 Method and system using information written in a
wrapper description language to execute query on a network
Senior Thesis Advising
As part of the undergraduate honors program, undergraduate seniors complete a senior project
where they work closely with graduate students and professors.
Christin Boyd
Winter and Spring 1996
Design and implementation of a query refinement system for HuskySearch. This system attempted
to aid the user in improving a given query as well as provide us with failure data on poor queries.
Darren Schack
Summer 1996
Design and implementation of a distributed document caching mechanism for HuskySearch. This
system would cache documents downloaded from the Web on demand to benefit repeated or
similar queries.
Tim Bradley
Fall 1995 and Winter 1996
Design and implementation of a system to facilitate MetaCrawler log mining. This data would
allow us to discern patterns in what clients were looking for, what they received, and what they
subsequently followed.
Invited Talks
Decade of the Web Symposium, University of Iowa
March, 1999
Presentation on the current state of meta search technology and the World Wide Web, including
details on MetaCrawler, HuskySearch, and other Web-related IR projects at the University of
Washington.
IBM T.J. Watson Research Center
Oct. 1998
Boeing Corp.
May 1998
Presentations on MetaCrawler and HuskySearch, describing both the technical innovations and
practical applications for both Internet and Intranet use.
Data Mining Summit
Mar. 1997
Presentation on meta search technology for data mining professionals, with emphasis on practical
applications of current research.
Distributed Indexing and Searching Workshop
May 1996
IETF Group Meeting
June 1996
Presentation on MetaCrawler technology emphasizing practical and economic implication of meta
search services for leading academics and industry professionals. During the workshop a proposal
was reached for scaling meta search services via query routing, which was then presented at the
FIND group of the IETF.
Software
HuskySearch. HuskySearch is a second generation meta search service available at the University
of Washington. HuskySearch is the primary search service available for searching the University
of Washington and is an ongoing research testbed for World Wide Web IR. HuskySearch is
available at http://huskysearch.cs.washington.edu.
MetaCrawler. MetaCrawler is one of the original World Wide Web meta search services. Its
early popularity was instrumental in the formation of Netbot, Inc., a startup company founded
at the University of Washington and acquired by Excite, Inc. for $35 million. MetaCrawler has
since been licensed to Go2Net Inc., a Seattle start-up, and is one of the premier sites on the
Go2Net Network. The Go2Net Network was ranked #25 by Media Metrix in terms of overall
web traffic in January 1999. MetaCrawler is available at http://www.metacrawler.com.
Department and University Activities
Graduate Student Orientation Committee, 1994 - 1995: Supervised orientation presentation for
new graduate students in the CSE Department.
References
Tom Haug tom.haug@pacificedge.com
Pacific Edge Software, Inc.
Steve Newman steve.newman@infospace.com
InfoSpace, Inc.
Chek Lim chek.lim@infospace.com
InfoSpace, Inc.
Professor Oren Etzioni etzioni@cs.washington.edu
Department of Computer Science and Engineering, University of Washington
Professor Ed Lazowska lazowska@cs.washington.edu
Department of Computer Science and Engineering, University of Washington
Professor Efthimis Efthimiadis efthimis@u.washington.edu
Information School, University of Washington