III Workshop on Data and Knowledge Engineering – WDKE 2019

10th International Conference on Computing and Informatics in Northern Chile

The Workshop on Data and Knowledge Engineering is a space for the dissemination of scientific and professional academic activity in the area of data and knowledge engineering, including invited talks, posters, and presentations of short papers. WDKE accepts research work on data and knowledge engineering, mainly in, but not limited to, the following topics: big data, intelligent systems, data analytics, web and data mining, machine learning, deep learning, knowledge representation, expert systems, data visualization, and computer vision.

Invited Talk #1: Learning Bayesian network classifiers with applications in orthodontics and sentiment analysis

Speaker: Dr. Gonzalo A. Ruz, Facultad de Ingeniería y Ciencias, Universidad Adolfo Ibáñez, Chile.

Abstract. A Bayesian network is a directed acyclic graph whose nodes represent discrete attributes and whose edges represent probabilistic relationships among them. An interesting feature of Bayesian networks is that they satisfy the Markov condition, which enables the computation of the joint probability distribution of all the attributes (variables) in a factorized form. Learning Bayesian networks from data has two components that must be handled: 1) the structure of the network, and 2) the parameters (conditional probability tables). This is a difficult task (NP-complete), so several approximate learning approaches have been devised to simplify the learning process. Probabilistic classification consists of computing a posterior probability given an input data point. By Bayes' rule, the posterior probability can be computed from the joint probability distribution of the attributes and the class variable; if we use Bayesian networks to compute this joint probability distribution, we obtain Bayesian network classifiers. In this talk we will review some of the most popular Bayesian network classifiers. Then we will analyze two recent applications. First, we consider the problem of facial biotype classification, an important stage during orthodontic treatment planning, using Bayesian network classifiers with continuous attributes. Second, we consider sentiment analysis, which involves classifying opinions in text into categories such as positive, negative, or neutral. In particular, we will consider Twitter data during critical events such as natural disasters and social movements. The authors would like to thank Conicyt-Chile under grant Fondecyt 1180706.
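
As an illustration of the classification idea described in the abstract above, the following is a minimal sketch of naive Bayes, the simplest Bayesian network classifier, in which every attribute depends only on the class, so the joint distribution factorizes as P(C) multiplied by the product of P(X_i | C). It is not the speaker's method: the function names, the toy data, and the Laplace smoothing parameter alpha are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Count class frequencies and per-attribute value frequencies per class."""
    class_counts = Counter(labels)
    cond_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)           # attribute index -> observed values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            cond_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, cond_counts, values, alpha


def posterior(model, row):
    """Return P(class | row) via Bayes' rule on the factorized joint."""
    class_counts, cond_counts, values, alpha = model
    total = sum(class_counts.values())
    joint = {}
    for c, n_c in class_counts.items():
        p = n_c / total  # prior P(C = c)
        for i, value in enumerate(row):
            # Laplace-smoothed estimate of P(X_i = value | C = c)
            p *= (cond_counts[(i, c)][value] + alpha) / (n_c + alpha * len(values[i]))
        joint[c] = p
    evidence = sum(joint.values())  # P(row), the normalizing constant
    return {c: p / evidence for c, p in joint.items()}


# Hypothetical toy data: two discrete attributes per example.
rows = [("good", "happy"), ("good", "sad"), ("bad", "sad"), ("bad", "happy")]
labels = ["positive", "positive", "negative", "negative"]

model = train_naive_bayes(rows, labels)
result = posterior(model, ("good", "happy"))
print(result)
```

More expressive Bayesian network classifiers relax the independence assumption by allowing edges between attributes; the posterior computation keeps the same shape, with each factor conditioned on the attribute's parents as well as the class.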

________________________________________________________________________

Invited Talk #2: Elasticsearch Experiences in Astronomical Observatories

Speaker: Juan Pablo Gil Ramírez, Software Engineer at ESO-Paranal.

Abstract. Elasticsearch is a distributed search engine that can be configured as a database implementing a NoSQL paradigm. It has been used extensively for software log analysis by operations teams in scientific centers such as ALMA and CERN, and is currently part of the Paranal MSE DataLab for the same purpose. This talk focuses on technical aspects based on experiences at Paranal and ALMA, and also describes the implementation and its adoption into existing workflows.

____________________________________________________________

Special Session: Natural Language Processing & Computational Linguistics

Chairs: Dr. Bell Manrique, Dr. Claudio Meneses

The University of Medellín (Colombia) and the Universidad Católica del Norte (Chile), in the framework of INFONOR 2019 and specifically within the IV Workshop on Data and Knowledge Engineering, are organizing a special session whose purpose is to learn about research in the area and to form a Latin American network on Natural Language Processing, Computational Linguistics, and related pattern recognition topics.

Download the invitation to the NLP&CL special session here (in Spanish).

________________________________________________________________________

Invited Talk #3 WDKE: Paranal DataLab

Speaker: Eduardo Peña, Software Engineer at ESO-Paranal.

Abstract. For almost two decades, large volumes of technical data, in a variety of formats, have resulted from normal operations at the observatory. Similarly, in the last few years, dealing with huge amounts of data has become a priority for several industries, and as a consequence, terms like “Big Data” or “Data Lake” have become more and more common. Under these circumstances, frameworks and tools have proliferated and were later released as “Open Software”; the hardware, on the other hand, has also changed, providing the power to deal with this volume of data in a reasonable timeframe and at a reasonable price. We present the first version of a modern data lab developed for the Maintenance Support and Engineering Department (MSE) at the Paranal Observatory, “The MSE DataLab”. This DataLab will allow us to take advantage of this technological evolution and to be prepared for current and future challenges. These challenges refer to improving the overall observatory dependability (Reliability, Availability and Maintainability) by supporting operations in our current and forthcoming telescopes: first, in our Very Large Telescopes (VLT), the VLT Interferometer (VLTI), and the survey telescopes (VISTA and VST); second, in the Extremely Large Telescope (ELT) and the Cherenkov Telescope Array (CTA).

____________________________________________________________

Tutorial: Deep Learning

Speaker: Dr. Juan Bekios Calfa, Universidad Católica del Norte, Chile.

Abstract. The concept of Artificial Intelligence involves a set of techniques and algorithms that aim to solve tasks that are usually solved by human beings, and that are often repetitive or risky. There are numerous ways to solve these types of problems. In the last decade, within the area of machine learning, connectionist techniques have stood out for the development of models that are capable of learning and predicting patterns that do not belong to the training data set. Since the 1943 work of Warren McCulloch and Walter Pitts, and through the more recent work of Geoffrey Hinton and many others, connectionist theories have reached a degree of precision never seen before. These new technologies, based on networks of artificial neurons, have allowed the development of tools and applications for daily use in fields such as robotics, natural language, computer vision, etc. This Deep Learning tutorial aims to show the theoretical foundations of artificial neural networks and how they work. Additionally, some simple cases of Deep Learning models will be studied using Python and the libraries provided by Keras.
The author would like to thank Conicyt-Chile under grant PCI-REDI170607.
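
To make the notion of an artificial neuron mentioned in the abstract concrete, here is a minimal sketch of a threshold unit in the spirit of the McCulloch and Pitts model: the unit fires (outputs 1) when the weighted sum of its binary inputs reaches a threshold. This is a simplified illustration and not material from the tutorial; the function names, weights, and thresholds are assumptions chosen for the example, and it uses plain Python rather than Keras.

```python
def threshold_unit(inputs, weights, threshold):
    """Output 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0


def logical_and(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return threshold_unit((a, b), (1, 1), threshold=2)


def logical_or(a, b):
    # A single active input is enough to reach a threshold of 1.
    return threshold_unit((a, b), (1, 1), threshold=1)


# Print the truth tables computed by the two units.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", logical_and(a, b), "OR:", logical_or(a, b))
```

Modern deep learning models replace the hard threshold with differentiable activation functions and learn the weights from data instead of fixing them by hand.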

Short Bio. Bachelor of Engineering Science and Civil Engineer in Computing and Informatics from the Universidad Católica del Norte in Antofagasta, Chile. Since 1996 he has been an academic in the Department of Systems and Computing Engineering at the same university. He obtained his PhD in Artificial Intelligence at the Polytechnic University of Madrid in 2016. His research topic is the automatic analysis of the human face (demographic classification) using computer vision and machine learning techniques. Additionally, he has worked on data analysis projects and knowledge-based systems.