### CFP: 25th Conference of the SEPLN (Spanish Society for Natural Language Processing)

. Thursday, December 18, 2008

September 8-10, 2009
Palacio Miramar, Donostia - San Sebastián
http://ixa2.si.ehu.es/sepln2009/

INTRODUCTION

The 25th edition of the Annual Conference of the Spanish Society for Natural Language Processing (SEPLN) will take place in the Miramar Palace in San Sebastian on September 8, 9 and 10, 2009.

We also expect to organise three satellite workshops during the week of the conference (see list of workshops).

The huge amount of information available in digital format and in different languages demands systems that enable us to access this vast library in an increasingly structured way.

In this same area, there is a renewed interest in improving information accessibility and information exploitation in multilingual environments. Many of the formal foundations for dealing appropriately with these needs have been, and are still being, established in the area of Natural Language Processing and its many branches:

• Information extraction and retrieval
• Question answering systems
• Machine translation
• Automatic analysis of textual content
• Text generation
• Speech recognition and synthesis

The aim of the conference is to provide a forum for discussion and communication where the latest research work and developments in the field of Natural Language Processing (NLP) can be presented by scientific and business communities. The conference also aims at exposing new possibilities of real applications and R&D projects in this field.

Moreover, as in previous editions, the conference aims to identify future directions for basic research and foreseeable software applications, in order to compare them against market needs. Finally, the conference intends to be an appropriate forum for helping new professionals become active members of this field.

TOPICS

Researchers and companies are encouraged to send communications, project abstracts or demonstrations related to any of the following language technology topics:
• Linguistic, mathematical and psycholinguistic models of language
• Corpus linguistics
• Development of linguistic resources and tools
• Grammars and formalisms for morphological and syntactic analysis
• Semantics, pragmatics and discourse
• Lexical ambiguity resolution
• Machine Learning in NLP
• Monolingual and multilingual text generation
• Machine translation
• Speech synthesis and recognition
• Monolingual and multilingual information extraction and retrieval
• Automatic textual content analysis
• Text summarization
• NLP-based generation of teaching resources
• NLP for languages with limited resources
• NLP industrial applications

STRUCTURE OF THE CONFERENCE

The conference will last three days and will consist of sessions devoted to presenting papers, posters, ongoing research projects, prototype demonstrations or products connected with the topics addressed in the conference. In addition, we expect to organize three satellite workshops during the week of the conference.

SUBMISSIONS

Proposals must be submitted by April 24, 2009, and must meet certain format and style requirements.

Both the delivery and revision of proposals will be done exclusively in PDF electronic format via the Myreview system. We recommend using the LaTeX and Word templates that can be downloaded from the conference webpage.

In addition, proposals must comply with the following requirements, depending on whether they are communications, demos or projects.

COMMUNICATIONS

Authors are encouraged to send theoretical or system-related proposals.

The proposals must include the following sections:

• A title of the communication.
• The complete names of the authors, their affiliations, address, and e-mail (anonymous in the submitted proposal).
• An abstract in English and Spanish (maximum 150 words), including a list of keywords or related topics.
• The proposal can be written and presented in Spanish or English, and its overall maximum length will be 8 pages, excluding references, which can take up an additional whole page at the most.
• The documents must not include headings or footnotes.

Submitted papers will be assessed by at least three reviewers, and can be accepted for presentation either as posters or as communications, depending on the needs of the program. However, no distinction will be made between communications and posters in the printed version of the SEPLN magazine.

PROJECTS AND DEMOS

As in previous editions, the organizers encourage participants to give oral presentations of R&D projects and demos of systems or tools related to the NLP field. For oral presentations on R&D projects to be accepted, the following information must be included:

• Project title
• Name, affiliation, address, e-mail and phone number of the project director
• Funding institutions
• Groups participating in the project
• Abstract (2 pages maximum)

For demonstrations to be accepted, the following information is mandatory:

• Demo title
• Name, affiliation, e-mail and phone number of the authors
• Abstract (2 pages maximum)
• Time estimation for the whole presentation

IMPORTANT DATES

• April 24, 2009: Deadline for submitting papers, projects and demos
• May 25, 2009: Notification of acceptance
• June 19, 2009: Deadline for submitting the final version
• July 15, 2009: Deadline for early registration
• Sept. 7, 2009: Workshops
• Sept. 8, 9 & 10: 25th SEPLN Conference

SCIENTIFIC COMMITTEE

Chairman: Kepa Sarasola (Euskal Herriko Unibertsitatea)

Members:

* Itziar Aduriz (Universitat de Barcelona)
* José Gabriel Amores (Universidad de Sevilla)
* Jose Maria Arriola (Euskal Herriko Unibertsitatea)
* Xabier Artola (Euskal Herriko Unibertsitatea)
* Toni Badía (Universitat Pompeu Fabra)
* Irene Castellón (Universitat de Barcelona)
* Arantza Díaz de Ilarraza (Euskal Herriko Unibertsitatea)
* Antonio Ferrández (Universitat d'Alacant)
* Alexander Gelbukh (Instituto Politécnico Nacional. México)
* Koldo Gojenola (Euskal Herriko Unibertsitatea)
* Xavier Gómez Guinovart (Universidade de Vigo)
* Julio Gonzalo (UNED)
* Montserrat Marichalar (Euskal Herriko Unibertsitatea)
* José Mariño (Universitat Politècnica de Catalunya)
* M. Antonia Martí (Universitat de Barcelona)
* María Teresa Martín (Universidad de Jaén)
* Patricio Martínez (Universitat d'Alacant)
* Raquel Martínez (UNED)
* Ruslan Mitkov (Universidad de Wolverhampton)
* Manuel Montes y Gómez (Instituto Nacional de Astrofísica, Óptica y Electrónica. México)
* Lidia Moreno (Universitat Politècnica de València)
* Lluís Padró (Universitat Politècnica de Catalunya)
* Manuel Palomar (Universitat d'Alacant)
* Ferrán Pla (Universitat Politècnica de València)
* German Rigau (Euskal Herriko Unibertsitatea)
* Horacio Rodríguez (Universitat Politècnica de Catalunya)
* Leonel Ruiz Miyares (Centro de Lingüística Aplicada de Santiago de Cuba)
* Emilio Sanchís (Universitat Politècnica de València)
* Kepa Sarasola (Euskal Herriko Unibertsitatea)
* Mariona Taulé (Universitat de Barcelona)
* L. Alfonso Ureña (Universidad de Jaén)
* Felisa Verdejo (UNED)
* Manuel Vilares (Universidad de A Coruña)
* Luis Villaseñor-Pineda (Instituto Nacional de Astrofísica, Óptica y Electrónica. México)

CONTACT INFORMATION

All the information about the conference is available on the 25th SEPLN Conference website: http://ixa2.si.ehu.es/sepln2009/
E-mail: sepln2009@ehu.es

### CFP: 21st International Joint Conference on Artificial Intelligence (IJCAI-09)

. Monday, December 08, 2008

The IJCAI-09 Program Committee invites submissions of technical papers for IJCAI-09, to be held in Pasadena, CA, USA, July 11-17, 2009. Submissions are invited on significant, original, and previously unpublished research on all aspects of artificial intelligence.

The theme of IJCAI-09 is "The Interdisciplinary Reach of Artificial Intelligence," with a focus on the broad impact of artificial intelligence on science, engineering, medicine, social sciences, arts and humanities. The conference will include invited talks, workshops, tutorials, and other events dedicated to this theme.
Important dates for authors of technical papers:
• Electronic abstract submission: January 7, 2009 (11:59PM, PST)
• Electronic paper submission: January 12, 2009 (11:59PM, PST)
• Author feedback period: March 13-16, 2009 (11:59PM, PDT). Please note: daylight saving time starts on March 8.
• Author notification of acceptance/rejection: March 31, 2009
• Camera-ready copy due: April 14, 2009
• Technical sessions: July 13-17, 2009

Submission Details

Submitted papers must be formatted according to IJCAI guidelines and submitted electronically through the IJCAI-09 paper submission site. Full instructions for submission, including formatting guidelines and electronic templates for paper submission, are available on the IJCAI-09 website: http://www.ijcai-09.org (see the link titled Submission Details). Submitting authors will be required to register with the IJCAI-09 paper submission software (this will be linked from the IJCAI-09 website during the first week of December, 2008).

Papers may be accepted for either oral or poster presentation; papers accepted for either form of presentation will not be distinguished in the conference proceedings, nor will designation of oral or poster presentation be made on the quality of the contribution. Instead, these distinctions will be made in the interests of overall program coherence and quality.

To facilitate review, the paper title, author names, contact details, and a brief abstract must be submitted electronically by Jan. 7, 2009 (11:59PM, PST). No paper will be accepted for review unless an accompanying abstract is received by the deadline. Technical papers are due electronically on Jan. 12, 2009 (11:59PM, PST). Authors bear full responsibility for compliance with submission standards. Submissions received after the deadline or that do not meet the length or formatting requirements will not be accepted for review. No email or fax submissions will be accepted. Notification of receipt of the electronically submitted papers will be emailed to the designated contact author soon after receipt. If there are problems with the electronic submission, the program chair will contact the designated author by email. The last day for inquiries regarding lost submissions is Jan. 19, 2009. Notification of acceptance or rejection of submitted papers will be emailed to the designated author by March 31, 2009. The opportunity to respond to preliminary reviews will be made available to authors prior to this date, during the period March 13-16, 2009.

Guidelines for such responses, along with details of the reviewing process will be posted on the IJCAI-09 website. Camera-ready copy of accepted papers must be received by the publisher by April 14, 2009. Note: at least one author of each accepted paper is required to attend the conference to present the work. Authors will be required to confirm their acceptance of this requirement at the time of submission.

Authors who do not have access to the web should contact the program chair at pcchair09@ijcai.org no later than December 15, 2008 for alternate submission instructions.

Content Areas

To facilitate the reviewing process, authors will be required to choose two to four appropriate content area keywords from the list provided by the IJCAI-09 submission software, which will be part of the online paper registration process. Authors are encouraged to select the most specific keywords that accurately describe the main aspects of their contributions. General categories should only be used if specific categories do not apply or do not accurately reflect the main contributions. Each keyword is placed within one of ten major themes; however, many of the keywords cut across multiple themes, and authors should feel free to select any keyword descriptive of the contribution, even if the major theme within which it is categorized is not the most appropriate. A list of keywords is appended to the end of this call.

The major themes are:

Agent-based and Multi-agent Systems
Constraints, Satisfiability, and Search
Knowledge Representation, Reasoning and Logic
Machine Learning
Multidisciplinary Topics and Applications
Natural Language Processing
Planning and Scheduling
Robotics and Vision
Uncertainty in AI
Web and Knowledge-based Information Systems

Policy on Multiple Submissions

IJCAI will not accept any paper which, at the time of submission, is under review for or has already been published or accepted for publication in a journal or another conference. Authors are also required not to submit their papers elsewhere during IJCAI's review period. These restrictions apply only to journals and conferences, not to workshops and similar specialized presentations with a limited audience and without archival proceedings. Authors will be required to confirm that their submissions conform to these requirements at the time of submission.

Paper Length and Format

Submitted technical papers must be no longer than six pages, including all figures and references, and must be formatted according to posted IJCAI-09 guidelines. Specifically, papers must be formatted for "letter-size" (8.5" x 11") paper, in double-column format with a 10pt font. Electronic templates for the LaTeX typesetting package, as well as a Word template, that conform to IJCAI-09 guidelines will be made available at the conference website (see above) during the first week of December, as will further details on formatting.

Authors are required to submit their electronic papers in PDF format. Files in PostScript (ps) or any other format will not be accepted.


### The Need for Open Source Software in Machine Learning

. Wednesday, June 25, 2008

Reading the Undirect Grad blog, I found an interesting paper about the need for more open source software in Machine Learning. The abstract:
Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for diverse applications. However, the true potential of these methods is not used, since existing implementations are not openly shared, resulting in software with low usability, and weak interoperability. We argue that this situation can be significantly improved by increasing incentives for researchers to publish their software under an open source model. Additionally, we outline the problems authors are faced with when trying to publish algorithmic implementations of machine learning methods. We believe that a resource of peer reviewed software accompanied by short articles would be highly valuable to both the machine learning and the general scientific community.
I think this paper addresses a very interesting problem, and not only for the ML community. As said in the paper, "Open Source model allows better reproducibility of the results, quicker detection of errors, innovative applications, faster adoption of ML methods in other disciplines". It also avoids constant reinvention of the wheel, and it is a fairer model: if most research is funded by public money, why should researchers restrict access to the code?

The same happens with publications. Open Access should be a necessary condition for all publicly funded research. Luckily, there are several initiatives around the globe trying to spread the benefits of the Open Access model, such as Harvard's adoption of Open Access or the support of the Comunidad de Madrid (a Spanish region) for several Open Access initiatives (sorry for the link in Spanish).

In recent years, the ML community has improved in these respects. We have a very good Open Source ML framework in Weka, a top Open Access journal in JMLR, which also supports ML Open Source software, and a very good Open Source software repository in MLOSS.

### Automated Microarray Classification Challenge

. Tuesday, June 24, 2008

The diagnosis of cancer on the basis of gene expression profiles is well established, so much so that micro-array classification has become one of the classic applications of machine learning in computational biology. The field has now reached the stage where a large scale evaluation exercise is warranted to determine the advantages and disadvantages of competing approaches. We have therefore organized a challenge for ICMLA'08, the aim of which is to determine the best fully automated approach to micro-array classification. An unusual feature of the competition is that instead of submitting predictions on test cases, the competitors submit a MATLAB implementation of their algorithm (R and Java interfaces are also in development), which is then tested off-line by the challenge organizers. This will test the true operational value of the method, in the hands of an end user who is not necessarily an expert in a given technique. The winner of the challenge will receive a free registration to ICMLA'08.

Further details and background information regarding the competition are available from the challenge website, http://theoval.cmp.uea.ac.uk/~gcc/projects/amcc. If you have any questions, please feel free to contact the challenge organizers (g...@cmp.uea.ac.uk).

The results of the challenge will be presented at a special session at ICMLA'08. Competitors are encouraged to participate in the special session and are invited to submit a technical paper describing their technique. Submissions should be made electronically in PDF format using the central ICMLA'08 website. The deadline for submissions is June 15, 2008. All accepted papers must be presented by one of the authors in order to be published in the conference proceedings.

Important Dates

Challenge opens: March 10, 2008
Challenge closes: July 15, 2008
Paper submission due: July 15, 2008
Camera-ready papers & pre-registration: October 1, 2008
ICMLA'08 conference: December 11-13, 2008

Special Session Chair

Dr Wenjia Wang, University of East Anglia, Norwich, U.K.

Special Session Organizers

Dr Gavin Cawley, University of East Anglia, Norwich, U.K.
Dr Wenjia Wang, University of East Anglia, Norwich, U.K.
Mr Geoffrey Guile, University of East Anglia, Norwich, U.K.

### KDD Cup 2008 and Workshop on Mining Medical Data

.

KDD Cup is the first and the oldest data mining competition, and is an integral part of the annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). Based on data provided by Siemens Medical Solutions USA, this year's KDD Cup competition focuses on the early detection of breast cancer from X-ray images of the breast. We are looking forward to an interesting competition and your participation. We particularly encourage the participation of students.

There are 2 different parallel options for participating:
1. Submit entries to the KDD Cup competition
2. Submit papers to the associated Workshop on Mining Medical Data

Further details on each option are provided below.

KDD Cup 2008

Siemens Medical Solutions is proud to provide the data for the KDD Cup 2008 competition. The competition focuses on the early detection of breast cancer from X-ray images of the breast. There are two specific tasks, selected to be interesting to participants from academia and industry. The tasks are described in detail at www.kddcup2008.com. You can choose to compete in either or both of the tasks. The training data can be downloaded after April 3, 2008. Important dates are listed below.

April 1: Web site up. Registration opens
April 3: Training data and evaluation code available after login
June 20: Registration for KDD Cup closes
July 7: Last date for submission of results on test set
July 15: Notification of KDD Cup competition results
July 31: Winners submit their camera ready papers to the workshop
August 24-27: Winners present their work at the workshop

Workshop on Mining Medical Data

We invite the submission of papers related to mining medical data. Participants in the KDD Cup 2008 may optionally submit papers to this workshop describing their entry. However, the workshop is broader in scope, and we also welcome other submissions related to the mining of medical data from structured sources such as structured databases and from unstructured data sources such as medical images, textual notes, etc. We particularly invite papers describing systems that are able to combine all available patient information, whether from structured sources or from unstructured sources, to support medical decision making.

All submitted papers will be evaluated by the workshop program committee based on scientific merits and novelty as perceived by the committee. Accepted papers will appear in the workshop proceedings. Authors of the accepted papers are required to present their papers at the workshop. Depending on interest, a subset of the selected papers may also be published in a special issue of a journal later on. Important dates are listed below.

All submitted papers must be in PDF format, must be restricted to 4 pages, and must use the template found at http://www.acm.org/sigs/publications/proceedings-templates.

July 7: Last date for submitting papers for the workshop
July 31: Final camera ready papers due
August 24-27: Authors of accepted papers present their work

.

Before joining Yahoo!, Dr. Usama Fayyad worked for 5 years at Microsoft Research, building data mining solutions for Microsoft's servers division. From 1989 to 1996, Usama held a leadership role at NASA's Jet Propulsion Laboratory (JPL). In 2000, he co-founded and served as CEO of digiMine Inc. (now Revenue Science Inc.), a data analysis and data mining company.

Dr. Fayyad has been at Yahoo! for more than 4 years as chief data officer and executive vice president of research and strategic data solutions. In that position, he has been responsible for Yahoo!'s overall data strategy, for architecting Yahoo!'s data policies and systems, and for managing Yahoo!'s data analytics and data processing infrastructure.

On June 12, New York Times Bits reported that
Mr. Fayyad told his staff yesterday that he would be leaving and his departure is expected to be officially announced later today. Mr. Fayyad was the data guru at Yahoo, the person in charge of mining the terabytes of data collected by the company to improve things like the targeting of ads and content to Yahoo users. He was also in charge of Yahoo’s well-respected research organization.
Gregory Piatetsky-Shapiro reported in KDnuggets some interesting remarks from Usama Fayyad, in which he says it is a good time to leave Yahoo!, as his team will be able to continue his work. Usama seems to want to start a new company that takes advantage of his data mining knowledge and of the broad view of the Internet, search, advertising and the future of interactive media that Yahoo! has given him.

With this announcement, Usama joins many other Yahoo! execs who are currently trying to "run away" from Yahoo!.

### Computational Linguistics (CL) goes Open Access

. Thursday, June 19, 2008

Hal announces that the CL journal will be Open Access starting with the first issue of next year. There will be no print version of the journal, and the electronic version will be Open Access.

The existence of an important Open Access journal related to Computational Linguistics has been a topic of discussion in recent years. In May 2007, Hal published the post "Whence JCLR?", where he discussed the existence of JMLR, an Open Access Machine Learning journal that is one of the key journals for the ML community.

This is really very good news for the CL community.

### The Discipline of Machine Learning

. Tuesday, June 17, 2008

Tom Mitchell is one of the key figures in the Machine Learning discipline. He has been working in this area since the end of the 70s, has published some reference ML textbooks and, above all, is the head of the first Machine Learning department in the world.

In 2006, when he was "fighting" for the creation of the ML department at Carnegie Mellon University, he was told that "you can only have a department if you have a discipline that is going to be here in one hundred years otherwise you can not have a department". To argue that ML would last more than a hundred years, he wrote a white paper, "The Discipline of Machine Learning", which is a must-read for everyone interested in ML. The abstract of the paper states:

Over the past 50 years the study of Machine Learning has grown from the efforts of a handful of computer engineers exploring whether computers could learn to play games, and a field of Statistics that largely ignored computational considerations, to a broad discipline that has produced fundamental statistical-computational theories of learning processes, has designed learning algorithms that are routinely used in commercial systems for speech recognition, computer vision, and a variety of other tasks, and has spun off an industry in data mining to discover hidden regularities in the growing volumes of online data. This document provides a brief and personal view of the discipline that has emerged as Machine Learning, the fundamental questions it addresses, its relationship to other sciences and society, and where it might be headed.

Tom also gave a speech related to this matter at the Carnegie Mellon University School of Computer Science's Machine Learning Department in March 2007. You can watch Mitchell's speech in this video.

### ECML PKDD Discovery Challenge 2008

. Monday, June 16, 2008

This year, the ECML/PKDD discovery challenge is about social bookmarking. There are two main tasks: Spam Detection in Social Bookmarking Systems and Tag Recommendation in Social Bookmark Systems. The challenge is organized in conjunction with the Web 2.0 Mining workshop, and seems very interesting. The test data set will be released on July 30th, so there is enough time to try something :)

### Interviews

. Wednesday, June 11, 2008

Some interesting interviews with important people from the DM & ML communities. Thanks to VideoLectures for hosting all that interesting stuff.

Dr. Usama Fayyad is responsible for Yahoo!'s overall data strategy, architecting Yahoo!'s data policies and systems, prioritizing data investments, and managing the Company's data analytics and data processing infrastructure.

Tom Mitchell is the first Chair of the first Machine Learning Department in the world, based at Carnegie Mellon.

Gregory Piatetsky-Shapiro, Ph.D. is the President of KDnuggets, which provides research and consulting services in the areas of data mining, knowledge discovery, bioinformatics, and business analytics.

### Journal of Interesting Negative Results in Natural Language Processing and Machine Learning

. Saturday, May 24, 2008

Johannes Fuernkranz sent this announcement to the ML-news list. I think it is great news for the NLP and ML communities, as some negative results can be even more useful than some positive results. This is a good way to prevent others from spending time exploring hypotheses that have already been invalidated.

------------------------------------------------------------------------
Journal of Interesting Negative Results
http://www.jinr.org
------------------------------------------------------------------------

We are happy to announce the on-line publication of the first article in the Journal of Interesting Negative Results in Natural Language Processing and Machine Learning. Please visit http://www.jinr.org and click on "articles".

JINR is an electronic journal, with a printed version to be negotiated with a major publisher once we have established a steady presence. The journal will bring to the fore research in Natural Language Processing and Machine Learning that uncovers interesting negative results.

It is becoming more and more obvious that the research community in general, and those who work in NLP and ML in particular, are biased towards publishing successful ideas and experiments. Insofar as both our research areas focus on theories "proven" via empirical methods, we are sure to encounter ideas that fail at the experimental stage for unexpected, and often interesting, reasons. Much can be learned by analysing why some ideas, while intuitive and plausible, do not work. The importance of counter-examples for disproving conjectures is already well known. Negative results may point to interesting and important open problems. Knowing directions that lead to dead-ends in research can help others avoid replicating paths that take them nowhere. This might accelerate progress or even break through walls!

We propose this journal as a resource that gives a voice to negative results which stem from intuitive and justifiable ideas, proven wrong through thorough and well-conducted experiments. We also encourage the submission of short papers/communications presenting counter-examples to usually accepted conjectures or to published papers.

The journal's scope encompasses all areas of Natural Language Processing and Machine Learning. Papers published in JINR will meet the highest quality standards, as measured by the originality and significance of the contribution. They will describe research with theoretical and practical significance. All theories and ideas will have to be clearly stated and justified by a deep literature review.