Multimedia Technologies

At atlanTTic we work with technologies that aim to understand human communication. These include technologies related to the audiovisual source (speech recognition and synthesis, recognition of expressions and identification of people) and technologies related to textual communication (linguistic, syntactic and semantic analysis of natural language, natural language generation, chatterbots and technologies to support communication with people with cognitive disabilities).

In addition to communicative analysis, this line of work also uses technologies to detect and classify events based on image, video and acoustic signals. By capturing audio with state-of-the-art sensors called Acoustic Vector Sensors, atlanTTic has developed tools that make it possible to locate the origin of a sound. This technology, combined with video detection technologies, opens up numerous applications in security (detection and location of intruders) and in the monitoring of complex machines (not only detecting an abnormal operating pattern, but also determining its location and, therefore, the cause of the problem).
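
One textbook way to locate a source with a single acoustic vector sensor is to time-average the product of sound pressure and particle velocity (the active intensity vector) and read the direction from its angle. The sketch below illustrates that idea only; it is not a description of atlanTTic's tools. The function names, the simulated plane wave and the impedance value are all assumptions made for the example.

```python
import math

# Approximate characteristic impedance of air in Pa*s/m (assumed value).
RHO_C = 413.0


def estimate_azimuth(pressure, vel_x, vel_y):
    """Estimate the source azimuth (radians) from a 2-D pressure-velocity sensor.

    The time-averaged products p*vx and p*vy are the components of the active
    intensity, which points in the direction of propagation (away from the
    source), so the source lies in the opposite direction.
    """
    n = len(pressure)
    ix = sum(p * vx for p, vx in zip(pressure, vel_x)) / n
    iy = sum(p * vy for p, vy in zip(pressure, vel_y)) / n
    return math.atan2(-iy, -ix)


def simulate_plane_wave(azimuth, freq=500.0, fs=8000.0, n=800):
    """Hypothetical test signal: a plane wave arriving from `azimuth`."""
    ux, uy = math.cos(azimuth), math.sin(azimuth)  # unit vector towards the source
    pressure = [math.sin(2.0 * math.pi * freq * k / fs) for k in range(n)]
    # For a plane wave, particle velocity is p / (rho * c) along the propagation
    # direction, which is opposite to the unit vector pointing at the source.
    vel_x = [-p / RHO_C * ux for p in pressure]
    vel_y = [-p / RHO_C * uy for p in pressure]
    return pressure, vel_x, vel_y


true_azimuth = math.radians(60.0)
p, vx, vy = simulate_plane_wave(true_azimuth)
estimated_deg = math.degrees(estimate_azimuth(p, vx, vy))
print(round(estimated_deg, 1))
```

In a real setting the averages would be taken over noisy, reverberant recordings and usually per frequency band, so the estimate is only as good as the free-field assumption behind it.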

Lines of Research

Natural language analysis

Development of programs capable of constructing behaviors from information based on examples.

Automatic generation of natural language based on linguistic and statistical knowledge

A system for the automatic generation of natural language in Spanish has been developed. It is based on linguistic and statistical knowledge and integrates lexicons produced in-house.

Communication and stimulation for people with cognitive disabilities

A range of apps aimed at communication with people with cognitive disabilities has been developed and deployed. It is an altruistic initiative with a large user base today.

Conversation technologies in natural language (chatterbots)

atlanTTic has developed a family of chatterbot technologies that allow the creation of conversational interfaces for advanced applications. Digital assistants have been adapted to specialized areas.

Voice-to-Text Conversion Technologies

Recognition engines have been developed for Spanish and Galician. High-quality linguistic resources for Galician are available for use in the development of speech technology.

Text-to-speech and voice conversion technologies

An open source text-to-speech converter has been developed for Galician and Spanish (https://sourceforge.net/projects/cotovia/). Various methods for transforming / converting the speech signal in order to modify the speaker identity have been proposed. Applications of these techniques include adding new speakers to text-to-speech converters and the de-identification (anonymization) of speakers while preserving the rest of the information in the original recording.

Mental state detection technologies

Development of detection systems, based on statistical classification, that evaluate a person’s depression status through speech processing. The ultimate goal is to screen patients with a very high level of depression.

Biometric identification technologies and personal traits

Development of technologies to identify people based on their biometric features such as voice, face or handwritten signature, as well as estimation of personal attributes such as age and sex or temporary features such as emotional state. Some of these technologies are transferred to the productive sector.

Automatic image analysis and video streaming technologies

Applied to driver assistance systems (ADAS: traffic sign recognition, pedestrian detection, vehicle detection, lane departure, etc.), to environments with flows of people, to industrial vision, etc.

Systems for the acquisition and processing of low cost audio and ultrasonic signals for monitoring and diagnostics in industrial environments

Designing and prototyping acoustic sensors suited to the characteristics of the industrial environment, and developing signal processing algorithms for the automatic detection of machine operating conditions or of events.

Assessment of sound quality

Sounds are classified according to their perceptual relevance, pleasantness / unpleasantness and other metrics associated with the concept of acoustic comfort, using both subjective test batteries with opinion collection and objective perception-based measures.

Research Group

Multimedia Technology Group (GTM)
Information Technologies Group (GTI)

The research area uses the following equipment to implement multimedia processing, machine learning and deep learning algorithms:

  • 1x Dual Xeon server + 2 GPUs.
  • 2 x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 12 cores/24 threads, 128GB RAM, 4 x NVidia GeForce GTX Titan X 12GB GDDR5, 3072 CUDA cores.
  • 2 x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 12 cores/24 threads, 128GB RAM, 4 x NVidia GeForce GTX Titan Black 6GB GDDR5, 2880 CUDA cores.
  • 2 x Intel(R) Xeon(R) CPU E5-2609 v4 @ 1.70GHz, 16 cores/16 threads, 128GB RAM, 2 x TITAN X (Pascal) 12GB GDDR5, 3584 CUDA cores.

For the integration of systems for the detection, location and classification of acoustic events:

  • Nvidia Jetson TK1 development kit, built around the Tegra K1 chip. This chip contains a 192-core Kepler GPU and a 4-core ARM Cortex-A15 CPU (capable of running Linux). The kit also has 2GB of RAM, 16GB of storage and all the connectivity of a computer (USB, HDMI, Ethernet, etc.).
  • PCB soldering equipment
  • 3D printers

Other specific equipment:

  • A sensor room for the reception of audiovisual signals (smart room), consisting of 3 arrays of 8 directional microphones, 6 Kinects, 5 fixed RGB cameras, 3 cameras with PTZ control, 1 infrared camera and 2 HD webcams.
  • A semi-anechoic chamber with Metadyne technology (cutoff frequency 100 Hz) and a 5.1 high-definition listening room of the non-environment type.
  • Acoustic equipment: multi-channel acoustic and vibration acquisition system, binaural recording manikin (HATS), microphones, accelerometers, p-p and p-u intensity probes, acoustic modeling software.
  • Spanish and English dictionaries and vocabularies, such as EuroWordNet and GilCUB.

Application sectors:

  • Analytical marketing.
  • Assistance to people affected by communication disorders.
  • Groups with different types of disability, especially children with autism spectrum disorders, but also patients in hospital settings or elderly people.
  • Assistants for communication with consumers through mobile devices, web support in the educational field.
  • E-learning
  • Security (Restricted Access, Video Surveillance).
  • Audiovisual content (Retail, Advertising, Media).
  • Building (insulation and acoustic conditioning).
  • Automotive (tests and acoustic measures, acoustic comfort).
  • Energy sector.

Title: Natural language analysis
Summary: Development of programs capable of constructing behaviors from information provided in the form of examples. We are currently working on automatic analysis systems that integrate mathematical and semantic analysis of natural language into deep learning systems.
Application and pros: Analysis of aspects, topics and sentiment in human-written texts (such as social network posts), with automatic detection of the relevant areas of the text. Our differential value, compared to other approaches, is the emphasis on semantic analysis as part of hybrid solutions.
Application sectors: Analytical marketing, consumer opinion analysis in call centers, chats or web forms, opinion polls and collective intelligence systems for collaborative business tools, among other fields.
Intellectual property: Trade secret.

Title: Automatic generation of natural language based on linguistic and statistical knowledge
Summary: atlanTTic has developed a system for the automatic generation of natural language in several languages, based on linguistic and statistical knowledge, which integrates lexicons produced in-house. The system receives words as input and returns complete and coherent sentences.
Application and pros: Automatic generation of general, commercial and business analysis information. Education and teaching in general and, in particular, reinforced learning to correct communicative deficiencies.
Pros:

  • Low response time and minimum storage requirements.
  • Robustness and consistency.
  • Ease of integration and extension to other languages and fields of application.
Application sectors: All those where text intelligible to humans must be generated from any type of data. We have recently started applying this generation technology to augmentative and alternative communication systems to assist people affected by communication disorders.
Intellectual property: Trade secret.

Title: Communication and stimulation for people with cognitive disabilities
Summary: atlanTTic has developed a family of apps aimed at communication with people with cognitive disabilities. It is an altruistic initiative with a large user base today.
Application and pros: Communicators for people with disabilities and cognitive stimulation games. All our applications are highly configurable and free of charge, and can run on low-cost Android devices.
Application sectors: Groups with different types of disability, especially children with autism spectrum disorders, but also patients in hospital settings or elderly people. The Accegal project stands out; it was developed with the support of researchers from the Department of Didactics of Language, Literature and Social Sciences at the University of Santiago de Compostela. Accegal offers fourteen applications for Android mobile devices, with more than 70,000 downloads to date. All applications are available in five languages and are highly customizable. They have received several prizes and have been frequently reviewed in the press.
Intellectual property: Free Android apps, but not open source.

Title: Conversation technologies in natural language (chatterbots)
Summary: We have adapted digital assistants (similar to Siri, Cortana, etc.) to specialized areas. To do this we use open source technologies such as AIML (Artificial Intelligence Markup Language) and adapted interpreters to improve dialogue and understanding functions.
Application and pros: These technologies are mainly used to implement assistants for Android smartphones (with text-voice support from Google Voice) or web pages. The assistants can search for content of interest, make recommendations in a specific area, answer frequently asked questions or help the user in general.
They have also been used on Twitter and for tutoring in e-learning environments.
Among our main success stories is the integration of our technology in Negobot, a virtual trap for catching pedophiles on online networks.
We are currently adapting our digital assistants to communicate with people with cognitive disabilities. The communication itself will generate databases for training affective computing algorithms, an aspect that links to the work of atlanTTic in natural language processing.
Application sectors: Assistants for communication with consumers through mobile devices, web support in general and, in particular, in the educational field.
Intellectual property: Solutions based on modification of open source technologies.
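
AIML organizes a bot's knowledge as categories, each pairing an input pattern (optionally with a `*` wildcard) with a response template. The sketch below mimics that idea in plain Python to show how pattern-based dialogue works. It is not an AIML interpreter and not atlanTTic's implementation: the categories, the function names and the simple first-match rule (real interpreters use a priority-ordered matching graph) are assumptions for illustration.

```python
import re

# Hypothetical categories in the spirit of AIML <pattern>/<template> pairs.
# Order matters here: more specific patterns come before the catch-all "*".
CATEGORIES = [
    ("HELLO", "Hello! How can I help you?"),
    ("MY NAME IS *", "Nice to meet you, {star}."),
    ("WHAT ARE YOUR OPENING HOURS", "We are open from 9:00 to 17:00."),
    ("*", "Sorry, I did not understand that."),
]


def normalize(text):
    """Upper-case the input and replace punctuation with spaces before matching."""
    return " ".join(re.sub(r"[^\w\s]", " ", text).upper().split())


def pattern_to_regex(pattern):
    """Turn a pattern such as 'MY NAME IS *' into an anchored regular expression."""
    parts = [r"(.+)" if token == "*" else re.escape(token) for token in pattern.split()]
    return re.compile(r"^" + r"\s".join(parts) + r"$")


def respond(text):
    """Return the template of the first category whose pattern matches the input."""
    sentence = normalize(text)
    for pattern, template in CATEGORIES:
        match = pattern_to_regex(pattern).match(sentence)
        if match:
            # Text captured by the wildcard, if the pattern has one.
            star = match.group(1).title() if match.groups() else ""
            return template.format(star=star)
    return ""


print(respond("Hello!"))
print(respond("My name is Ana."))
print(respond("Tell me a joke"))
```

Adapting an assistant to a specialized area then amounts largely to writing domain-specific categories, which is why the catch-all pattern is kept last as a fallback.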

Title: Multimedia indexing technologies
Summary: We have integrated video, audio and text processing technologies to index multimedia content with information related to the people present in the multimedia material.
Application and pros: The main advantage of this integration is that it can analyze the content of an audiovisual source in a communicative environment (news, interviews, debates, etc.) to provide useful information for advanced searches, greatly expanding the sparse metadata that usually accompanies these formats.
Application sectors: Media companies, media publishers, creators and reusers of content, creation and consumption of online courses (MOOCs).
Intellectual property: Solutions built on open source technologies and proprietary technologies.

Title: Text-to-speech and voice conversion technologies
Summary: Development of an open source text-to-speech converter for Galician and Spanish. It applies several methods of transforming / converting the voice signal in order to modify the identity of the speaker.
Application and pros:

  • Applications with spoken responses to the user.
  • Text-to-speech converters with multiple speakers.
  • De-identification (anonymization) of speakers in recordings.
Application sectors: Man-machine interaction, privacy protection.
Intellectual property: Cotovía: text-to-speech conversion system in Galician and Spanish (open source) (https://sourceforge.net/projects/cotovia/).

Title: Biometric identification technologies and personal traits
Summary: atlanTTic has developed technologies to identify people based on their biometric features, such as voice, face or handwritten signature, as well as to estimate personal attributes, such as age and sex, or temporary features, such as emotional state. Both classical modeling and learning techniques and deep learning are used.
Application and pros: These technologies have many fields of application: restricted access systems (both physical and logical), demographic analysis, analysis of emotional response, speaker segmentation, etc. One of the main advantages over other systems is the ability to combine several modalities.
Application sectors: Security, banking, retail, advertising.
Intellectual property: Software registrations:

  • VG330-11 – Demographic estimation module (transferred to company).
  • VG332-11 – Tools for coupled hidden Markov models (transferred to Technological Center).
  • VG331-11 – Verification of dynamic signature (transferred to Technological Center).

Title: Automatic image analysis and video streaming technologies
Summary: atlanTTic has developed an image and video processing workflow and applied it to a wide variety of systems: driver assistance (ADAS: traffic sign recognition, pedestrian detection, vehicle detection, lane departure, etc.), environments with flows of people, industrial vision, medical imaging, etc.
Application and pros: Any environment in which a decision must be made based on the content of an image, sequence of images or video stream. Systems can make decisions autonomously or assist human diagnosis.
Application sectors: The sectors are very varied: automotive, audiovisual, retail, medical diagnosis, quality control, etc.
Intellectual property: Solutions built on proprietary and open source technologies.

Title: Systems for the acquisition and processing of low cost audio and ultrasonic signals for monitoring and diagnostics in industrial environments
Summary: Integration of low-cost sound pressure sensors with signal conditioning and digitization systems. Time-domain and frequency-domain analyses are carried out on the acquired signals, allowing events to be detected and classified using ad hoc databases.
Application and pros: These technologies monitor the operating condition of mechanical systems from the noise they generate, helping to prevent or minimize malfunctions and to detect abnormal behavior. The advantage of using sound signals is that the instrumentation is non-invasive, and the installation does not interfere with the operation of the system being monitored. Capture systems are tailor-made to fit the surroundings and the peculiarities of the system being monitored, using robust and low-cost technologies. Signal analysis processes are developed specifically for the events or operating conditions to be detected or classified. Depending on the application, the necessary intelligence can be installed in situ, centralized in a remote computer, or split between both.
Application sectors: Wind energy sector, hydroelectric energy sector, automotive.
Intellectual property: Trade secret.
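
As a rough illustration of combining time-domain and frequency-domain analysis for event detection, the sketch below flags frames whose short-time energy exceeds a threshold and then uses the Goertzel algorithm to check whether that energy is concentrated at one frequency of interest. It is a minimal example under assumed conditions, not the method used by atlanTTic: the thresholds, frame length, tone frequency, labels and synthetic test signal are all hypothetical.

```python
import math
import random


def frame_energy(frame):
    """Mean square value of the frame (time-domain measure)."""
    return sum(x * x for x in frame) / len(frame)


def goertzel_power(frame, freq, fs):
    """Squared DFT magnitude of `frame` at the bin nearest `freq` (Goertzel recurrence)."""
    n = len(frame)
    k = round(n * freq / fs)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def classify_frames(signal, fs, frame_len, energy_threshold, tone_freq, share_threshold):
    """Return (start_index, label) for every frame loud enough to count as an event."""
    events = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = frame_energy(frame)
        if energy < energy_threshold:
            continue  # quiet frame: no event
        # Normalized so that a pure tone lying exactly on the analysed bin gives 1.
        tone_share = goertzel_power(frame, tone_freq, fs) / (energy * frame_len ** 2 / 2.0)
        label = "tonal" if tone_share > share_threshold else "broadband"
        events.append((start, label))
    return events


# Synthetic signal: low-level background, one 1 kHz tone burst, one noise burst.
FS, FRAME = 8000, 400
rng = random.Random(0)
signal = [rng.uniform(-0.01, 0.01) for _ in range(6 * FRAME)]
for k in range(2 * FRAME, 3 * FRAME):
    signal[k] += math.sin(2.0 * math.pi * 1000.0 * k / FS)
for k in range(4 * FRAME, 5 * FRAME):
    signal[k] += rng.uniform(-1.0, 1.0)

events = classify_frames(signal, FS, FRAME, 0.01, 1000.0, 0.5)
print(events)
```

Goertzel is used here because it evaluates a single frequency cheaply, which suits low-cost in-situ hardware; a full spectrum or learned classifier would replace it when many event types must be told apart.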

Title: Sound-based system for detecting vehicles passing on a road
Summary: Within the general line of event detection and classification, a system has been specified that detects passing vehicles and classifies them automatically (light / heavy vehicles) based on the audio signal.
Application and pros: Producing noise maps of road infrastructures (urban and interurban) requires knowing the number and type of vehicles on the road. In many cases this information is not available. This system provides a fast, portable and non-invasive way to obtain traffic data for a road.
Application sectors: Environmental, traffic.
Intellectual property: Spanish patent: P200801046.