Workshop on Text-Mining & Link-Analysis (TextLink 2007)
January 7, 2007
To be held at
IJCAI-2007,
The Twentieth International Joint Conference on Artificial Intelligence,
January 6-12, 2007, Hyderabad, India
Call for papers was open until September 25, 2006
The workshop aims to focus on the
intersection of two increasingly important areas of analytic research:
Text-Mining and Link-Analysis. Both areas deal with so-called unstructured data
representations, such as text and graphs, which share many characteristics in
the context of analysis. Although the two areas are closely related in both the
technical and the historical sense, there have been almost no events so far that
explicitly address their common problems and techniques. The aim of the
workshop is therefore to bring together scientists from both areas so that they
gain better insight into each other's work and, potentially, new ideas for
future research.
Link-Analysis is an area that has
developed over the last 20 years in various fields, such as Social Sciences
(Social-Network-Analysis), Mathematics (Graph-Theory), and Computer-Science
(graph as a data-structure). Recently the area has received much more attention,
especially in the Data Mining / KDD community, because of its wide applicability
in areas such as law enforcement investigations (e.g. terrorism), fraud detection
(e.g. insurance, banking), web analytics (e.g. search engines, web marketing),
and telecommunications (e.g. routers, traffic, connectivity).
Text-Mining has received growing attention over the
last 6 years, mainly because of the availability of large text
corpora in electronic form and because of the lack of “intelligent” tools
and techniques for solving the difficult problems that appear on the market,
such as information extraction, text categorization, ontology building,
visualization, intelligent search, etc.
At the intersection of the two fields there
are many interesting problems and issues from which both fields can benefit.
To name just some of the potential problem and application areas: trend
analysis, community identification, web user profiling, media clipping,
marketing, etc. The intersection also includes ideas such as
representing text with a graph structure (which has recently become popular in
the social-networks area) and analytic procedures for discovering various
pieces of knowledge using such alternative representations. In
particular, currently “hot” areas of research and applications are the analysis of
dynamic (evolving) datasets that include text and link structure, and emerging
semantics from electronic social structures (blogs, emails, folksonomies, social
bookmarking, Wikipedia etc.).
The broader context of the workshop can
be related in some respect to the areas of Data-Mining, Machine-Learning,
Semantic-Web, Information Retrieval, Natural-Language-Processing,
Social-Networks-Analysis and general Graph-Theory.
Topics of interest
Particular topics of interest for the workshop include but are not limited to:
- Link-Analysis / Social Networks Analysis
- Text-Mining / Language technologies
- Web-Mining
- Semantic-Web
- Emerging Semantics / Folksonomies
- Information-Extraction
- Scalability of developed approaches
- Visualization of text and link structures
- Performance evaluation measures
- Dynamic Networks
- Visualization / HCI
- Innovative applications
Workshop Program
The workshop consists of an invited talk, presentations of refereed papers, and discussions.
We hope that the program will stimulate future collaboration among researchers.
- 8:45 - 9:00 Opening
- 9:00 - 10:45 Session I: Classification, Clustering
- Classification of Assamese documents using word similarity heuristics, Kamakhya Prasad Gupta, Nitin Indurkhya
- Classification with Pedigree and its Applicability to Record Linkage, Evan S. Gamble, Sofus A. Macskassy, Steve Minton
- Improving Within-Network Classification with Local Attributes, Sofus A. Macskassy
- Document clustering using Lexical Chains, Dinakar Jayarajan, Dipti Deodhare, B.Ravindran, Sandipan Sarkar
- Information Gain Feature Selection for Ordinal Text Classification using Probability Re-distribution, Rahman Mukras, Nirmalie Wiratunga, Robert Lothian, Sutanu Chakraborti, David Harper
- 10:45 - 11:15 Coffee Break
- 11:15 - 12:30 Session II: Extracting Information
- Information Extraction using Non-consecutive Word Sequences, Sachindra Joshi, Ganesh Ramakrishnan, Sreeram Balakrishnan, Ashwin Srinivasan
- A Text Mining Model for Concept Chain Queries, Rohini K. Srihari, Anmol Bhasin, Li Xu
- Identifying People on the Web through Automatically Extracted Key Phrases, Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka
- 12:30 - 14:00 Lunch Break
- 14:00 - 15:00 Session III: Extracting Information
- Exploiting Syntactic and Semantic Information for Relation Extraction from Wikipedia, Dat P.T Nguyen, Yutaka Matsuo, Mitsuru Ishizuka
- Unsupervised extraction of concept instance names from Web sources, Kostyantyn Shchekotykhin, Gerhard Friedrich
- Discovering Semantic Similarity from the World Wide Web, Delip Rao, Deepak Khemani
- 15:00 - 15:45 Session IV: Linked Data
- Personalized Document Rankings by Incorporating Trust Information From Social Network Data into Link-Based Measures, Claudia Hess, Klaus Stein
- A Large-Scale Study on Persian Weblogs, Vahed Qazvinian, Abtin Rassolian, Mohammad Shafiei
- 15:45 - 16:15 Coffee Break
- 16:15 - 17:15 Session V: Linked Data
- From Social Network to Light-weight Ontology, Marko Grobelnik, Dunja Mladenic, Blaz Fortuna
- Analysis of Enron Email Threads and Quantification of Employee Responsiveness, Deepak Padmanabhan, Dinesh Garg, Virendra K. Varshney
- Discovering More Accurate Frequent Web Usage Patterns, Murat Ali Bayır, Ismail H. Toroslu, Ahmet Coşar, Güven Fidan
- 17:15 - 17:30 Discussion and Closing
Submissions
Submissions should be sent by September 25, 2006, in electronic form as a PDF file, to marko.grobelnik@ijs.si.
Please ensure you include the following text in your email subject: “TextLink-2007 Workshop submission”.
Submissions are limited to a maximum of 10 pages. Papers should be formatted
according to the LNCS (Lecture Notes in Computer Science) format (templates can be
found on the Springer-Verlag LNCS Authors’ Instructions page). Authors are
strongly encouraged to use LaTeX2e, although Word files, converted to PDF, will
also be accepted. The reviews will not be blind, so authors should include their full contact information in the papers.
Submitted papers will be reviewed by referees from the Program Committee.
Accepted papers will be published in the Workshop proceedings.
Notification of acceptance and rejection will be sent by October 23, 2006.
Submission Deadline: September 25, 2006
Acceptance Notification: November 3, 2006
Camera-ready Copies: November 10, 2006
Workshop date: January 7, 2007
Attendance
Attendance is not limited to the paper authors.
The workshop should be of interest primarily to researchers,
students and people from industry working in research and application areas
that deal with various aspects of data analysis and rich data & knowledge
representations.
We expect that the workshop will attract people from the following
areas and subareas:
- Academic Data-Mining (analytical aspects of dealing with text and link structures, dynamic networks)
- Commercial Data-Mining (new application areas, such as blog analysis, trend detection etc.)
- Natural-Language-Processing (representational aspects)
- Social-Networks-Analysis (algorithmic aspects of dealing with large network structures)
- Semantic-Web (especially emerging semantics coming out of bottom-up collaborative efforts e.g. folksonomies)
Our assumption is that the topic will attract people who are already attending
IJCAI and are interested in Data-Mining, Machine-Learning and
Natural-Language-Processing. We also expect some additional
participants from the Social-Network-Analysis area who would be drawn by the
workshop topics and would not otherwise come to IJCAI.
Organization
Program Chairs
Marko Grobelnik
J.Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Natasa Milic-Frayling
Microsoft Research Ltd, 7 J J Thomson Avenue, Cambridge, CB3 0FB, United Kingdom
Dunja Mladenic
J.Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Program Committee
- Deepak Agarwal, Yahoo Research, USA
- Janez Brank, J.Stefan Institute, Ljubljana, Slovenia
- Mark Craven, University of Wisconsin, USA
- Chris Diehl, Johns Hopkins University, USA
- Blaz Fortuna, J.Stefan Institute, Ljubljana, Slovenia
- Lise Getoor, University of Maryland, USA
- Rayid Ghani, Accenture Technology Labs - Research, Chicago, USA
- Antonio Gulli, AskJeeves/Teoma and University of Pisa, Italy
- Jure Leskovec, Carnegie Mellon University, USA
- Rada Mihalcea, University of North Texas, USA
- Blaz Novak, J.Stefan Institute, Ljubljana, Slovenia
- Dragomir Radev, University of Michigan, USA
Past events
We feel that the continuity of meeting and exchanging ideas
is essential for effective promotion and development of this
research area.