Lab of Advanced Algorithms and Applications
Highlights
We are developing a new semantic-annotation technology for short text fragments, called TAGME. This tool has been applied successfully in many contexts involving the clustering, classification and similarity comparison of short texts. We are also studying the design of distribution-aware compressed indexes (à la FM-index) and new data compressors based on the philosophy of “Multi-objective optimization design”, in which the user specifies a set of computational resources and the index or the compressor adapts itself to optimize all of them simultaneously.
Current Partnerships
- Tiscali Italia
- Google Zurich
- Bassilichi
- Net7
- SpazioDati
Current Grants
- [2013-2015] Regione Toscana, Net7, StudioFlu, SpazioDati
- [2013-2014] Bassilichi
- [2013-2014] Google Research Award
- [2012-2014] Italian MIUR-PRIN Project "ARS-Technomedia"
- [2011-2012] Telecom Italia Working Capital
- [2010] Google Research Award
- [2010-2012] Italian MIUR-PRIN Project "The Mad Web"
- [2006-2011] Yahoo! Research
- [2009-2012] Italian MIUR-FIRB Project "Linguistica"
Latest News
New Google grant! » February 21, 2013
Our project proposal “A novel graph for social-network analysis and search built by entity-annotators, and its applications” has received a Google grant as part of Google’s Faculty Research Award program!
BAT-Framework for benchmarking Topic Annotators now available! » January 30, 2013
As a contribution to the scientific community working in the field of topic annotation, we developed a framework to compare text annotators: systems that, given a text document, aim at finding the topics the text is about, identified as Wikipedia pages. The BAT-Framework, written in Java, comes along with a formal framework that defines a set of problems, the way systems can be compared to each other, and a set of measures that, extending classic IR measures, fairly and fully compare the features of topic annotators. The formal framework, whose understanding is required to use the benchmark framework, is presented in this thesis.
Main features:
- Compares any Topic Annotation system in a fair and complete way.
- Provides an implementation for all defined measures and match relations.
- Easily extensible by adding new systems, new datasets, new similarity measures.
- Performs extensive testing on any Topic Annotator and any dataset.
- Performs runtime testing.
- Generates gnuplot charts and LaTeX tables summarizing test results.
- Completely open source, distributed under the GPLv3 license.
You can download the BAT-Framework Environment 0.1 and read the Quick Reference.
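As a rough illustration of what "extending classic IR measures" with a match relation can look like, here is a minimal Python sketch. It is not code from the BAT-Framework (which is written in Java) and does not reproduce its API: the annotation format and the names `strong_match`, `weak_match`, `precision`, `recall` and `f1` are assumptions made for this example only.

```python
from typing import Callable, Set, Tuple

# Hypothetical annotation format for this sketch: (start, end, topic),
# where `topic` stands for the title of a Wikipedia page.
Annotation = Tuple[int, int, str]
MatchRelation = Callable[[Annotation, Annotation], bool]


def strong_match(a: Annotation, b: Annotation) -> bool:
    """Two annotations match only if span and topic are identical."""
    return a == b


def weak_match(a: Annotation, b: Annotation) -> bool:
    """Two annotations match if they share the topic and their
    half-open spans [start, end) overlap."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def precision(pred: Set[Annotation], gold: Set[Annotation],
              match: MatchRelation) -> float:
    """Fraction of predicted annotations matched by some gold annotation."""
    if not pred:
        return 0.0  # convention chosen for this sketch
    hits = sum(1 for p in pred if any(match(p, g) for g in gold))
    return hits / len(pred)


def recall(pred: Set[Annotation], gold: Set[Annotation],
           match: MatchRelation) -> float:
    """Fraction of gold annotations matched by some predicted annotation."""
    if not gold:
        return 0.0  # convention chosen for this sketch
    hits = sum(1 for g in gold if any(match(p, g) for p in pred))
    return hits / len(gold)


def f1(pred: Set[Annotation], gold: Set[Annotation],
       match: MatchRelation) -> float:
    """Harmonic mean of precision and recall under the given match relation."""
    p = precision(pred, gold, match)
    r = recall(pred, gold, match)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


if __name__ == "__main__":
    gold = {(0, 5, "Pisa"), (10, 20, "Google")}
    pred = {(0, 5, "Pisa"), (12, 20, "Google"), (25, 30, "Java")}
    for name, rel in (("strong", strong_match), ("weak", weak_match)):
        print(name, precision(pred, gold, rel), recall(pred, gold, rel),
              f1(pred, gold, rel))
```

Passing the match relation as a parameter keeps the measures themselves unchanged while the notion of a "correct" annotation varies, which is one simple way to compare systems whose outputs differ slightly in span boundaries.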
New grants for our lab! » November 2, 2012
We are pleased to announce that our lab has been awarded a MIUR-PRIN grant, named “ARS Techno-media”, on algorithms for big data arising from the Web and Social Networks. These data are challenging because they are derived from activities with an extreme level of detail, are varied in nature (structured or not), and typically consist of short, fragmented and noisy texts (e.g. micro-blogs, tweets, news, ...). They therefore require innovative algorithms for their storage, analysis, access and communication, so that they can be used efficiently and effectively in the many kinds of (social-network) applications that are spreading daily over our PCs, smart-phones and tablets. We will mainly concentrate on three issues: storage and indexing of these massive datasets, efficient and effective mining of these heterogeneous datasets, and the study of Mobile Ad-hoc Social Networks (MASN).
ACM CIKM 2010 and IEEE Software » January 17, 2012
A paper on our topic-annotation technology TAGME, already presented as a short paper at CIKM 2010, has been published in the January/February 2012 issue of IEEE Software (vol. 29, no. 1).


