Our work on using Deep Learning techniques for endoluminal image analysis was presented as an example at NVIDIA’s GPU Technology Conference (http://www.gputechconf.com/).
Next autumn, BCNPCL will hold a series of “lightning talks”. A lightning talk is a very short presentation given at a conference or similar forum. Unlike other presentations, lightning talks last only a few minutes, and several are usually delivered in a single session by different speakers.
We propose that each doctor give a 5-minute lightning talk presenting one or several open research topics on which we are looking for collaboration. After the talks, we will open a discussion to go into more detail.
- Michal Drozdzal: TBA
- Laura Igual: Problems in Neuroimaging
- David Masip: Emotion classification: Deep features and dynamics
- Jordi Vitrià: Zero-shot learning of food
- Oriol Pujol: TBA
- Petia Radeva: Lifelogging: a problem to solve.
- Xevi Baró: TBA
- Simone Balocco: TBA
Universitat de Barcelona’s Data Science and Big Data course offers students a program that covers the concepts and tools they will need throughout the entire data science pipeline: asking the right questions; wrangling and cleaning data; generating hypotheses; making inferences; visualizing data; assessing solutions; and building data products.
Schedule: October 2, 2014 – May 30, 2015, Thursday 18h-21h
Estimated workload: 6 hours per week (including lectures and homework)
Location: Edifici Històric de la Universitat de Barcelona, Gran Via de les Corts Catalanes 585, 08007, Barcelona
The paper “Gradient Histogram Background Modeling for People Detection in Stationary Camera Environments”, by V. Borjas, J. Vitrià and P. Radeva, received the Best Poster Award at the 13th IAPR Conference on Machine Vision Applications, Japan, 2013.
Multimodal Interaction in Image and Video Applications.
Sappa, Angel D., Vitrià, Jordi.
Springer, Series: Intelligent Systems Reference Library, Vol. 48. 2013, XIV, 203 p.
Traditional Pattern Recognition (PR) and Computer Vision (CV) technologies have mainly focused on full automation, even though full automation often proves elusive or unnatural in many applications, where the technology is expected to assist rather than replace the human agents. Not every problem can be solved automatically, and in such applications human interaction is the only way to tackle the task.
Recently, multimodal human interaction has become a field of increasing interest in the research community. Advanced man-machine interfaces with high cognitive capabilities are a hot research topic that aims at solving challenging problems in image and video applications. In fact, the idea of interactive computer systems was already proposed in the early stages of computer science. Nowadays, the ubiquity of image sensors together with ever-increasing computing performance has opened new and challenging opportunities for research in multimodal human interaction.
This book aims to show how existing PR and CV technologies can naturally evolve using this new paradigm. The chapters of this book show different successful case studies of multimodal interactive technologies for both image and video applications. They cover a wide spectrum of applications, ranging from interactive handwriting transcriptions to human-robot interactions in real environments.
We have published a new book: Escalera, S., Baró, X., Pujol, O., Vitrià, J., Radeva, P. Traffic-Sign Recognition Systems. Springer. Series: SpringerBriefs in Computer Science. 2011, ISBN 978-1-4471-2244-9, based on our work in a long coordinated project with the Institut Cartogràfic de Catalunya.
This work presents a full generic approach to the detection and recognition of traffic signs. The approach, originally developed for a mobile mapping application, is based on the latest computer vision methods for object detection, and on powerful methods for multiclass classification. The challenge was to robustly detect a set of different sign classes in real time, and to classify each detected sign into a large, extensible set of classes. To address this challenge, several state-of-the-art methods were developed that can be used for different recognition problems. Following an introduction to the problems of traffic sign detection and categorization, the text focuses on the problem of detection, and presents recent developments in this field. The text then surveys a specific methodology for the problem of traffic sign categorization – Error-Correcting Output Codes – and presents several algorithms, performing experimental validation on a mobile mapping application. The work ends with a discussion on future lines of research, and continuing challenges for traffic sign recognition.
We have recently published a new paper in the journal PLOS ONE:
Rojas Q. M, Masip D, Todorov A, Vitria J (2011) Automatic Prediction of Facial Trait Judgments: Appearance vs. Structural Models. PLoS ONE 6(8): e23323. doi:10.1371/journal.pone.0023323
This paper, which studies whether the ability to decide which faces fall into social categories such as attractive or threatening is learnable from a computer science point of view, has attracted a lot of interest and has been featured in:
- Radio Nacional Vasca.
- Cadena COPE http://www.cope.es/el-plan-c/audio-te-decimos-si-eres-atractivo-o-no-120681
We are also deeply involved in the organization of the Fifth Iberian Conference on Pattern Recognition and Image Analysis, which will be held in Las Palmas de Gran Canaria (Spain), June 8-10, 2011. The deadline for proposals has been extended until November 29th.
The 13th International Conference on Multimodal Interaction, ICMI 2011, will be held in Alicante, Spain, 14-18 November 2011. Our group is involved in the organization. We expect this conference to be a great success!
The public website of the conference is available at: http://www.acm.org/icmi/2011