Carsten Eickhoff
- T: +31 15 27 87241
- E: C.eickhoff@tudelft.nl
Intelligent Systems
Mekelweg 4
2628 CD Delft
The Netherlands
Office: HB 11.080
About Me:
I am a PhD researcher working on the PuppyIR project, which aims to facilitate information search for children. My particular research interest lies in relevance modelling for information retrieval. I am also interested in natural language technologies such as statistical NLP and machine translation. A current CV, more information on my research topics, and general news can be found on my home page.
News:
- 04.04.2013: SIGIR'13: Copulas for Information Retrieval
- 04.12.2012: ECIR: Exploiting User Comments for Audio-visual Content Indexing and Retrieval
- 04.12.2012: ECIR: Designing Human-Readable User Profiles for Search Evaluation
- 16.11.2012: WSDM: Personalizing Atypical Web Search Sessions
- 16.07.2012: CIKM: The Downside of Markup...
Students:
Available M.Sc. Projects
Indexability of Web Pages
The continued development and maturation of web technologies such as Cascading Style Sheets (CSS), JavaScript, and AJAX, together with their widespread adoption by browsers, have made web pages far more sophisticated and interactive. Unfortunately, this presents challenges to the web search community, as a web page's representation in the browser (i.e., what users see) can diverge dramatically from its raw HTML content (i.e., what search engines index and retrieve). For example, interactive pages may contain content in regions that are not visible before a user action, such as focusing a tab, but which are nonetheless contained within the raw HTML.
Previous work (http://dmirlab.tudelft.nl/sites/default/files/sp253-gyllstrom.pdf) has confirmed this divergence and its effect on state-of-the-art retrieval systems. The aim of this project is to devise a metric that accurately describes how far apart a given web page's rendered and static representations are. Such a metric could ultimately be used as a surrogate for indexing quality in modern web search engines.
This project requires:
- Solid programming skills
- Understanding of modern web technologies and markup languages
- Foundations of IR techniques
- Experience with large-scale text processing and big data (optional)
Information Retrieval in Virtual Worlds
Virtual worlds (VWs) are a topic of steadily growing relevance. Some providers report user numbers that exceed the population of real-world countries such as Chile or the Netherlands. VWs are typically highly complex, in some areas approaching the real world's richness of detail. Without “living” in a given VW, it is hard to gain insight into that world and its inhabitants. Knowledge about users' interactions within a virtual world can be of socio-economic and scientific interest.
Information Retrieval (IR) and related fields offer a wide array of methods for extracting traces of knowledge from large-scale, potentially unstructured resources. In this project, we propose bringing the two fields together for mutual benefit. Example applications include event prediction, social interaction analysis, and friend recommendation within the VW.
The concrete scope of the project is relatively open, and we welcome ideas from your side. We already have a large corpus of MMORPG server logs to work with.
This project requires:
- Foundations of IR techniques
- Solid understanding of data mining and statistical methods
- Programming skills
- Server administration skills in case of additional data collection (optional)
- Playing experience of MMOGs may come in handy ;-)
For further inquiries or if you would like to discuss your own project ideas, please contact me.


