The North East DB/IR Day is a semi-annual conference that brings together
database and information retrieval researchers and students from academic
and research institutions in the area for an exciting technical program as
well as informal discussion. The DB/IR Day provides a regular forum for
presenting diverse viewpoints on database systems and information retrieval,
addressing current topics, and promoting information exchange among
researchers.
Thanks to all the participants who
made this DB/IR Day really fun! We would also like to thank our gold sponsors,
CA Labs and the
FLWOR Foundation, as well as our bronze sponsor, Yahoo Research,
for making this happen. Statistics: we had over 130 registered participants,
about 120 attendees (tallied at lunch), and around 25 registered posters.
Pics can be found here.
Poster Prizes
| First Prize | Alpa Jain, Columbia University, SQOUT: SQL queries over text databases |
| Second Prize | Qingqing Gan, Brooklyn Polytech, Automatic Detection of Web Spam |
| Third Prize (1) | Jalal Mahmud, Stony Brook, Information Overload in Non-Visual Web Transaction: Context Analysis Spells Relief |
| Third Prize (2) | Nikolay Archak, NYU Stern Graduate School of Business, Show me the Money! Deriving the pricing power of product features by mining consumer reviews |
Short note on the name: The DB/IR Day has gone by numerous
names in the past, including Greater NY Area, Columbia, Greater Philadelphia,
etc. We chose to name it North East, with the aim of enlarging its scope and
maybe growing it into an East Coast version of itself. Let us know what you
think :)
The Fall 2007 DB/IR Day is hosted by Stony Brook University on Friday and
Saturday, October 5-6, 2007.
The program consists of technical keynote lectures from distinguished
researchers in databases and information retrieval. In addition, there are
group introductions, student presentations, and a poster session (with
substantial prizes) to promote awareness of current DB/IR research at
various graduate departments in the North East area and to stimulate
collaborations between academia and industry. A second, optional day
(Saturday) includes a series of social activities (short research rump
sessions, wine tasting trip, boat tour and dinner) intended to both enable
synergies between participants and showcase the beautiful areas and beaches
of Long Island.
Directions and Parking
The DB/IR Day will be held in the
Student Activities Center
(Rooms: Auditorium, Ballroom A). Directions can be found
here.
Here are the Google
Maps and the Yahoo Maps pointers to Stony Brook.
Parking directions can be found on this
map. We recommend parking in the "administration parking garage" depicted in E5
on the map. The SAC Building is depicted in D5 (a 3-5 minute walk from the garage).
Another set of directions (to the Wang Center building in E4, right in front of the garage) with more
options can be found here.
If you would prefer to take the train from Penn Station in Manhattan,
here is the schedule to Stony Brook
(you will likely need to catch the 7:49am train from Penn Station or, in the worst case,
the 9:15 that arrives here at 11:10). This is another map
(#3 is the garage, #61 is the RR station, and #78 is the SAC where the action will be).
Even more detailed instructions are here.
Lodging
The workshop hotel is the Radisson,
which offers a special
rate of $115.00 per night that includes complimentary high-speed wireless internet
access and shuttle service to/from Stony Brook (just mention DBIR when
reserving).
There are a variety of other lodging choices in the area: the famous
Three Village Inn
in historic downtown Stony Brook (4 miles from campus);
The Danfords Inn
Marina,
The Heritage
Inn, and the Holly Berry Bed and
Breakfast, all located near
a picturesque marina/harbor and the charming Port Jefferson village (5
miles from campus); and the Holiday Inn Express (3 miles), located on the Nesconset
Highway.
Preliminary Program
Friday
| 09:00 - 17:00 | Registration |
| 09:15 - 09:30 | Welcome and Opening Remarks |
| 09:30 - 10:10 | Group Presentations |
| 10:10 - 10:20 | Coffee Break |
| 10:20 - 11:20 | Invited Talk: Raghu Ramakrishnan (Yahoo! Research) |
| |
Web Data Management: Powering the New Web
The Web is no longer a static repository of documents; it is a dynamic
repository of information that connects people with their passions, and
on a more prosaic note, the applications they use in their personal and
professional lives. How is the Web evolving as an information source,
and how does this affect the future of information discovery? What are
the implications of the rapid growth of social networks? How does the
emergence of the Web as a delivery channel for services affect the
future of software? Technically, these trends have given rise to a new
wave of challenges and led to vigorous research on a number of fronts,
ranging from social network analysis, information extraction, and
community information management to massively distributed storage and
computing platforms, and have placed a premium on hosted service
architectures. In this talk, I will discuss these issues and outline
some of the solutions that are beginning to emerge.
Raghu Ramakrishnan is Chief Scientist for Audience and Research Fellow
at Yahoo!, and heads the Community Systems Group in Yahoo! Research. He
is on leave from the University of Wisconsin-Madison, where he is
Professor of Computer Sciences, and was founder and CTO of QUIQ, a
company that pioneered question-answering communities, powering Ask
Jeeves' AnswerPoint as well as customer-support for companies such as
Compaq. His research has influenced query optimization in commercial
database systems, and the design of window functions in SQL:1999. His
paper on the BIRCH clustering algorithm received the SIGMOD 10-Year
Test-of-Time award, and he has written the widely used text "Database
Management Systems" (with Johannes Gehrke). He is Chair of ACM SIGMOD,
on the Board of Directors of ACM SIGKDD and the Board of Trustees of the
VLDB Endowment, and has served as editor-in-chief of the Journal of Data
Mining and Knowledge Discovery, associate editor of ACM Transactions on
Database Systems, and the Database area editor of the Journal of Logic
Programming. Dr. Ramakrishnan is a Fellow of the Association for
Computing Machinery (ACM), and has received several awards, including a
Distinguished Alumnus Award from IIT Madras, a Packard Foundation
Fellowship, an NSF Presidential Young Investigator Award, and an ACM
SIGMOD Contributions Award.
|
| 11:20 - 11:30 | Coffee Break |
| 11:30 - 12:30 | Invited Talk: Marianne Winslett (UIUC) |
| |
Managing Scientific Data: New Challenges for Database Researchers
The database research community's appetite for new applications has led to
increased interest in the data management needs of scientists. This area
encompasses a huge range of applications, extending from public repositories
of observational data such as the popular Sloan Digital Sky Survey to
one-of-a-kind runs of simulation codes crafted by individual scientists. In
this talk, we will survey the most common data management needs found in the
hard sciences, describe the new database research challenges that arise from
these needs, and outline ways to address some of these challenges.
Marianne Winslett has been a professor at the University of Illinois at
Urbana-Champaign since 1987. Her current research interests include security
in open systems and data management for scientific applications. She has
served on the editorial boards of ACM Transactions on Database Systems and
IEEE Transactions on Knowledge and Data Engineering, and is currently on
the board of ACM Transactions on the Web. She is an ACM Fellow, a past
vice-chair of ACM SIGMOD and the recipient of an NSF Presidential Young
Investigator Award.
|
| 12:30 - 14:20 | Lunch and Poster Session |
| 14:20 - 15:10 | Invited Talk: Divesh Srivastava (AT&T Labs Research) |
| |
The Bellman data quality browser
Data quality is a serious concern in complex industrial-scale
databases, which often have thousands of tables and tens of thousands
of columns. Commonly encountered problems include duplicates and
default values in columns treated as keys, data inconsistencies, and
poor quality join paths. Compounding the data quality problems are
incomplete and out-of-date metadata about the database and the
processes used to populate the database. These problems make the task
of analyzing data particularly challenging. The Bellman data quality
browser has been built to effectively address such problems. Bellman
profiles the database and computes concise statistical summaries of
the contents of the database to identify approximate keys, frequent
values of a field (often default values), joinable fields, and to
understand database dynamics (changes in a database over time). In
this talk, I'll describe the technology underlying Bellman and how
it is used to help make sense of complex databases.
Divesh Srivastava is the head of Database Research at AT&T Labs
Research. He received his Ph.D. from the University of Wisconsin,
Madison, and his Bachelor of Technology from the Indian Institute of
Technology, Bombay, India. His current research interests include
data quality and data stream management systems.
|
| 15:10 - 15:20 | Coffee Break |
| 15:20 - 15:50 | Short fast-forward research presentations |
| 15:50 - 16:00 | Poster Awards |
| 19:00 | Dinner Outing (in groups, on your own, with directions, mostly in the Port Jefferson Harbor) |
Saturday
| 10:00 - 14:00 | Long Island Beach and Wine-Tasting Tour |
| 14:00 - 16:00 | Vineyard Lunch and Research Rump Sessions |
| 15:00 - 18:00 | Optional Boat Trip |
| 16:00 | Conclusion |
Registration.
While registration is free, we appreciate your RSVP.
Past DB/IR Days.
Past DB/IR Days were hosted by
Columbia University (Spring 2005), University of Pennsylvania (Fall 2005),
Rutgers University (Spring 2006),
NYU (Fall 2006), and
IBM Research (Spring 2007).