Research Class: Activities within the HRZZ project “Automatic Recognition of Actions and Activities in Multimedia Content from the Sports Domain - RAASS”

Date: Wednesday, 10 April 2019, at 12:00, room O-403 (Council Room of the Department of Informatics)
Speakers: Mate Krišto, Ingrid Hrga, Matija Burić and Goran Paulin


The session will present activities within the project “Automatic Recognition of Actions and Activities in Multimedia Content from the Sports Domain”. The project aims to develop procedures for the detection and automatic recognition of individual actions in multimedia content from a selected sports domain, to develop models for representing knowledge and interpreting activities in that domain, and to define a metric for comparing the performance of an action with a reference performance of that action.

Thermal Imaging Dataset for Person Detection
Speaker: Mate Krišto
Abstract: An original thermal dataset designed for training machine learning models for person detection will be presented. The dataset contains 7412 thermal images of humans captured in various scenarios while walking, running or sneaking. The recordings were captured in the LWIR segment of the electromagnetic (EM) spectrum in various weather conditions (clear, fog and rain), at different distances from the camera, with different body positions (upright, hunched) and movement speeds (regular walking, running). In addition to the camera's standard lens, we used a telephoto lens for video recording. For both lenses, we compared the image quality under different weather conditions and at different distances, and determined the parameters that provide a level of image detail sufficient for detecting a person.

Keywords: Person detection, thermal imaging, surveillance, access control, dataset, biometric

Deep Image Captioning: An Overview
Speaker: Ingrid Hrga
Abstract: Image captioning is the process of automatically describing an image with one or more natural language sentences. In recent years, image captioning has witnessed rapid progress, from the initial template-based models to the current ones based on deep neural networks. An overview of open issues and recent image captioning research will be given, with particular emphasis on models that use the deep encoder-decoder architecture. The advantages and disadvantages of different approaches will be discussed, along with some of the most commonly used evaluation metrics and datasets.

Keywords: image captioning, encoder-decoder, attention mechanism, deep neural networks

Ball detection in handball scenes
Speaker: Matija Burić
Abstract: Detecting the ball is still a challenge because it is a very small object that occupies only a few pixels in the image yet carries a lot of information relevant to the interpretation of a scene. Balls can vary greatly in color and appearance owing to varying distances from the camera and motion blur. Occlusion is also common, especially since handball players carry the ball in their hands during the game, and the player with the ball is understood to be the key player for the current action. Handball players are located at different distances from the camera, are often occluded, and assume postures that differ from those in the ordinary activities on which most object detectors are trained. We compare the performance of four models based on the YOLOv2 object detector, trained on an image dataset of publicly available sports images and images from custom handball recordings.

Keywords: Object Detector, Convolutional Neural Networks, YOLO, Sports, Handball

Method for procedural generation of optimal image datasets
Speaker: Goran Paulin
Abstract: Creating and annotating image datasets for training machine learning models is a very time-consuming and costly process when it is based on photo or video sources. It can be replaced by artificial data synthesis, which provides 100% error-free segmentation through pixel-level classification of every object in the scene. Generation of such artificial data is currently based on building 3D scenes from pre-made 3D objects. We propose a method for artificial data synthesis based on procedural generation of 3D objects and scenes, along with an optimal set of parameters, to efficiently produce image datasets for training models that generalize well.

Keywords: artificial data synthesis, image dataset, procedural generation, convolutional neural networks