Multimedia Technologies

At atlanTTic we work with technologies for understanding human communication. These include technologies based on audiovisual sources (speech recognition and synthesis, expression recognition and identification of people) and technologies for textual communication (linguistic, syntactic and semantic analysis of natural language, natural language generation, chatterbots and technologies to support communication with people with cognitive disabilities).

In addition to communication analysis, this line also covers technologies for detecting and classifying events based on image, video and acoustic signals. By acquiring audio with state-of-the-art sensors called Acoustic Vector Sensors, atlanTTic has developed tools that make it possible to locate the origin of a sound. This technology, combined with video detection technologies, opens up a multitude of security-related applications (detecting and locating intruders) and applications in the monitoring of complex machines (not only detecting an abnormal operating pattern, but also determining its location and therefore the cause of the problem).
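As a rough illustration of how an acoustic vector sensor can indicate where a sound comes from, the sketch below uses the textbook intensity-based approach: the time-averaged product of sound pressure and particle velocity points along the direction of propagation, so the source lies at the opposite bearing. This is not necessarily the method used in atlanTTic's tools. The function name, the synthetic plane wave and all numeric values are assumptions made for the example.

```python
import numpy as np

def estimate_azimuth(p, ux, uy):
    """Estimate the source azimuth (radians) from pressure and particle velocity.

    The time-averaged intensity <p*u> points along the direction of
    propagation, i.e. away from the source, so the source bearing is
    the opposite direction.
    """
    ix = np.mean(p * ux)
    iy = np.mean(p * uy)
    return np.arctan2(-iy, -ix)

# Synthetic plane wave from a hypothetical source at 60 degrees azimuth.
rho_c = 413.0                      # approximate characteristic impedance of air (Pa*s/m)
true_az = np.deg2rad(60.0)
t = np.arange(4800) / 48000.0      # 0.1 s sampled at 48 kHz (assumed values)
p = np.sin(2 * np.pi * 500.0 * t)  # pressure of a 500 Hz tone
# The wave travels away from the source, hence the minus signs.
ux = -p * np.cos(true_az) / rho_c
uy = -p * np.sin(true_az) / rho_c

estimated_az_deg = np.rad2deg(estimate_azimuth(p, ux, uy))
```

In a real deployment the signals would contain noise and reflections, so the averaging window and any frequency weighting would need to be tuned to the environment.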

Lines of Research

Natural language analysis

Development of programs capable of abstracting behaviors from information provided in the form of examples.

Automatic generation of natural language based on linguistic and statistical knowledge

A system for the automatic generation of natural language in Spanish has been developed. It is based on linguistic and statistical knowledge and integrates in-house lexicons.

Communication and stimulation for people with cognitive disabilities

A family of apps aimed at the communication and stimulation of people with cognitive disabilities has been developed. It is an altruistic initiative that today has a large user base.

Conversation technologies in natural language (chatterbots)

atlanTTic has developed a family of chatterbot technologies that allow the construction of conversational interfaces for advanced applications. Digital assistants have been adapted to specialized areas.

Voice-to-Text Conversion Technologies

Recognition engines have been developed for Spanish and Galician. High-quality linguistic resources in Galician are available for use in the development of speech technology.

Text-to-speech and voice conversion technologies

An open-source text-to-speech converter for Galician and Spanish has been developed (https://sourceforge.net/projects/cotovia/). Various methods have been proposed for transforming or converting the speech signal in order to modify the speaker identity. Applications of these techniques include adding new speakers to text-to-speech converters and de-identification (anonymization) of speakers while preserving the rest of the information in the original recording.

Mental state detection technologies

Development of detection systems based on statistical classification that evaluate a person’s depression status through speech processing. The ultimate goal is to screen for patients with a very high level of depression.

Biometric identification technologies and personal traits

Development of technologies for identifying people from biometric features such as voice, face or handwritten signature, as well as for estimating personal attributes such as age and sex or temporary features such as emotional state. Some of these technologies have been transferred to industry.

Automatic image analysis and video streaming technologies

Applied to advanced driver assistance systems (ADAS: traffic sign recognition, pedestrian detection, vehicle detection, lane departure, etc.), to environments with flows of people, to industrial vision, etc.

Systems for the acquisition and processing of low cost audio and ultrasonic signals for monitoring and diagnostics in industrial environments

Design and prototyping of acoustic sensors adapted to the characteristics of the industrial environment, and development of signal processing algorithms for automatically detecting the operating condition of machines or detecting events.

Assessment of sound quality

Using both subjective test batteries with opinion collection and objective measures based on perception. This makes it possible to classify sounds according to their perceptual relevance, pleasantness or unpleasantness, and other metrics associated with the concept of acoustic comfort.

Research Group

Multimedia Technology Group (GTM)
Information Technologies Group (GTI)

The research area uses the following equipment to implement multimedia processing, machine learning and deep learning algorithms:

  • 1 x Dual Xeon server + 2 GPUs
  • 2 x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40 GHz, 12 cores/24 threads, 128 GB RAM, 4 x NVidia GeForce GTX Titan X 12 GB GDDR5, 3072 CUDA cores.
  • 2 x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40 GHz, 12 cores/24 threads, 128 GB RAM, 4 x NVidia GeForce GTX Titan Black 6 GB GDDR5, 2880 CUDA cores.
  • 2 x Intel(R) Xeon(R) CPU E5-2609 v4 @ 1.70 GHz, 16 cores/16 threads, 128 GB RAM, 2 x TITAN X (Pascal) 12 GB GDDR5, 3584 CUDA cores.

For integration of systems for detection, location and classification of acoustic events:

  • Nvidia Jetson TK1 development kit, built around the Tegra K1 chip, which contains a 192-core Kepler GPU and a 4-core Arm Cortex A15 CPU (able to run Linux). It also has 2 GB of RAM, 16 GB of storage and all the connectivity of a computer (USB, HDMI, Ethernet, etc.)
  • PCB soldering equipment
  • 3D printers

In terms of more specific equipment:

  • A sensor room for the reception of audiovisual signals (smart room), consisting of 3 arrays of 8 directional microphones, 6 Kinects, 5 fixed RGB cameras, 3 cameras with PTZ control, 1 infrared camera and 2 HD webcams.
  • A semi-anechoic chamber with Metadyne technology, cutoff frequency 100 Hz, and a 5.1 high-definition listening room, non-environment type.
  • Acoustic equipment: multi-channel acoustic and vibration acquisition system, binaural recording manikin (HATS), microphones, accelerometers, p-p and p-u intensity probes, acoustic modeling software.
  • Spanish and English dictionaries and vocabularies, such as EuroWordNet and GilCUB.

Application sectors:

  • Analytical marketing.
  • Assistance to people affected by communication disorders.
  • Groups with different types of disability, especially children with autism spectrum disorders, but also, for example, patients in hospital settings or elderly people.
  • Assistants for communication with consumers through mobile devices, web support in general and, in particular, in the educational field.
  • E-learning
  • Security (Restricted Access, Video Surveillance).
  • Audiovisual content (Retail, Advertising, Media).
  • Building (insulation and acoustic conditioning).
  • Automotive (tests and acoustic measures, acoustic comfort).
  • Energy sector.
Title: Natural language analysis
Summary: Development of programs capable of abstracting behaviors from information provided in the form of examples. We are currently working on automatic analysis systems that integrate mathematical and semantic analysis of natural language into deep learning systems.
Application and pros: Analysis of aspects, topics and sentiment in human-written texts (such as social network posts), with automatic detection of the relevant areas of the text. Our differentiating value, compared with other approaches, is the use of semantic analysis as part of hybrid solutions.
Application sectors: Analytical marketing; consumer opinion analysis in call centers, chats or web forms; opinion polls; collective intelligence systems for collaborative business tools, among other fields.
Intellectual property: Trade secret.
Title: Automatic generation of natural language based on linguistic and statistical knowledge
Summary: atlanTTic has developed a system for the automatic generation of natural language in several languages, based on linguistic and statistical knowledge, which integrates in-house lexicons. The system receives words as input and returns complete, coherent sentences.
Application and pros: Automatic generation of information, and commercial and business analysis. Education and pedagogy in general, and in particular reinforced learning to correct communicative deficiencies.
Pros:

  • Low response time and minimal storage requirements.
  • Robustness and consistency.
  • Ease of integration and extension to other languages and fields of application.
Application sectors: All those in which human-readable text must be generated from any type of data. We have recently begun applying generation technology to augmentative and alternative communication systems to assist people affected by communication disorders.
Intellectual property: Trade secret.
Title: Communication and stimulation for people with cognitive disabilities
Summary: atlanTTic has developed a family of apps aimed at communication with people with cognitive disabilities. It is an altruistic initiative that today has a large user base.
Application and pros: Communicators for people with disabilities and cognitive stimulation games. All our applications are highly configurable and free of charge, and can run on low-cost Android devices.
Application sectors: Groups with different types of disability, especially children with autism spectrum disorders, but also, for example, patients in hospital settings or elderly people. Of particular note is the Accegal project, developed with the support of researchers from the Department of Didactics of Language, Literature and Social Sciences at the University of Santiago de Compostela. Accegal offers fourteen applications for Android mobile devices, with more than 70,000 downloads to date. All applications are available in five languages and are highly customizable. They have received several prizes and have been frequently covered in the press.
Intellectual property: Free Android apps, but not open source.
Title: Conversation technologies in natural language (chatterbots)
Summary: We have adapted digital assistants (similar to Siri, Cortana, etc.) to specialized areas. To do this we use open source technologies such as AIML (Artificial Intelligence Markup Language) and adapted interpreters to improve the dialogue and understanding functions.
Application and pros: These technologies are mainly used to implement assistants on Android smartphones (with speech support from Google Voice) or on web pages. The assistants can search for content of interest, make recommendations in a specific area, answer frequently asked questions or help the user in general.
They have also been used on Twitter and for tutoring in e-learning environments.
Among our main success stories is the integration of our technology in Negobot, a virtual decoy for catching pedophiles online.
We are currently adapting our digital assistants to communicate with people with cognitive disabilities. The communication itself will generate databases for training affective computing algorithms, an aspect that links to atlanTTic's work in natural language processing.
Application sectors: Assistants for communication with consumers through mobile devices, web support in general and, in particular, in the educational field.
Intellectual property: Solutions based on modification of open source technologies.
Title: Multimedia indexing technologies
Summary: We have integrated video, audio and text processing technologies to index multimedia content with information about the people present in the material.
Application and pros: The main advantage of this integration is that it makes it possible to analyze the content of an audiovisual source in a communicative setting (news, interviews, debates, etc.) and provide useful information for advanced searches, greatly expanding the scarce metadata that usually accompanies these formats.
Application sectors: Media companies, media publishers, creators and reusers of content, creation and consumption of online courses (MOOCs).
Intellectual property: Solutions built on open source technologies and proprietary technologies.
Title: Text-to-speech and voice conversion technologies
Summary: Development of an open-source text-to-speech converter for Galician and Spanish. Several methods for transforming or converting the voice signal are applied in order to modify the identity of the speaker.
Application and pros:

  • Applications with spoken response to the user.
  • Text-to-speech converters with multiple speakers.
  • De-identification (anonymization) of speakers in recordings.
Application sectors: Human-machine interaction, privacy protection.
Intellectual property: Cotovía: text-to-speech conversion system in Galician and Spanish (open source) (https://sourceforge.net/projects/cotovia/).
Title: Biometric identification technologies and personal traits
Summary: atlanTTic has developed technologies for identifying people based on biometric features such as voice, face or handwritten signature, as well as for estimating personal attributes such as age and sex or temporary features such as emotional state. Both classical modeling and learning techniques and deep learning are used.
Application and pros: The fields of application of these technologies are very varied: restricted access systems (both physical and logical), demographic analysis, analysis of emotional response, speaker segmentation, etc. One of the main advantages over other systems is the ability to combine multiple modalities.
Application sectors: Security, banking, retail, advertising.
Intellectual property: Software registrations:

  • VG330-11 – Demographic estimation module (transferred to company).
  • VG332-11 – Tools for coupled hidden Markov models (transferred to Technological Center).
  • VG331-11 – Verification of dynamic signature (transferred to Technological Center).
Title: Automatic image analysis and video streaming technologies
Summary: atlanTTic has developed an image and video processing workflow and applied it to a wide variety of systems: driver assistance (ADAS: traffic sign recognition, pedestrian detection, vehicle detection, lane departure, etc.), environments with flows of people, industrial vision, medical imaging, etc.
Application and pros: Any environment in which a decision must be made based on the content of an image, a sequence of images or a video stream. Systems can make decisions autonomously or support diagnosis by a human.
Application sectors: The sectors are very varied: automotive, audiovisual, retail, medical diagnosis, quality control, etc.
Intellectual property: Solutions built on proprietary and open source technologies.
Title: Systems for the acquisition and processing of low cost audio and ultrasonic signals for monitoring and diagnostics in industrial environments
Summary: Integration of low-cost sound pressure sensors with signal conditioning and digitization systems. Time-domain and frequency-domain analyses are carried out on the acquired signals, allowing events to be detected and classified using ad-hoc databases.
Application and pros: These technologies make it possible to monitor the operating condition of mechanical systems from the noise they generate, contributing to the prevention or minimization of malfunctions and to the detection of abnormal behavior. The advantage of using sound signals is that the instrumentation is non-invasive and its installation does not interfere with the operation of the system being monitored. Capture systems are tailor-made to suit the environment and the peculiarities of the system to be monitored, using robust, low-cost technologies. Signal analysis processes are developed specifically for the events or operating conditions to be detected or classified. Depending on the application, the necessary intelligence can be installed in situ, centralized in a remote computer, or split between both.
Application sectors: Wind energy sector, hydroelectric energy sector, automotive.
Intellectual property: Trade secret.
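As a minimal sketch of the kind of time-domain analysis mentioned in the summary above, the following example flags frames of a signal whose short-time energy rises well above the background level. It is only an illustration under assumed values. The function names, frame sizes, the 10 dB margin and the synthetic test signal are all hypothetical, and the actual systems rely on application-specific analyses and databases.

```python
import numpy as np

def frame_energy_db(signal, frame_len, hop):
    """Short-time energy (in dB) of each frame of a 1-D signal."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        # Small constant avoids log(0) on silent frames.
        energies[i] = 10.0 * np.log10(np.mean(frame ** 2) + 1e-12)
    return energies

def detect_events(signal, frame_len=1024, hop=512, margin_db=10.0):
    """Return indices of frames whose energy exceeds the median by margin_db."""
    energy = frame_energy_db(signal, frame_len, hop)
    return np.flatnonzero(energy > np.median(energy) + margin_db)

# Synthetic test signal (assumed values): 1 s of low-level noise at 16 kHz
# with a 0.1 s, 1 kHz tone burst standing in for a machine event.
rng = np.random.default_rng(0)
fs = 16000
x = 0.01 * rng.standard_normal(fs)
t = np.arange(1600) / fs
x[8000:9600] += 0.5 * np.sin(2 * np.pi * 1000.0 * t)

event_frames = detect_events(x)
```

A frequency-domain variant would apply the same thresholding idea to the energy in selected spectral bands rather than to the full-band energy.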
Title: System for detecting vehicles on a road from their sound
Summary: Within the general line of detection and classification of events, a system has been developed that detects passing vehicles and automatically classifies them (light / heavy vehicles) based on the audio signal.
Application and pros: Producing noise maps of road infrastructure (urban and interurban) requires knowledge of the number and type of vehicles on the road. In many cases this information is not available. This system provides a fast, portable and non-invasive way to obtain traffic count data for a road.
Application sectors: Environmental, traffic.
Intellectual property: Spanish patent: P200801046.