Show simple item record

dc.contributor.advisor: Nolazco Flores, Juan Arturo
dc.contributor.author: Báez Suárez, Abraham
dc.creator: https://orcid.org/0000-0001-8729-0781
dc.date.accessioned: 2020-04-17T16:43:34Z
dc.date.available: 2020-04-17T16:43:34Z
dc.date.created: 2020-04-16
dc.identifier.citation: Báez Suárez, A. (2020). Unsupervised Deep Learning Recurrent Model for Audio Fingerprinting (Doctoral Dissertation). Instituto Tecnológico y de Estudios Superiores de Monterrey (ITESM), Monterrey, México. https://hdl.handle.net/11285/636319 [es_MX]
dc.identifier.doi: 10.1145/3380828
dc.identifier.uri: https://hdl.handle.net/11285/636319
dc.description.abstract: Audio fingerprinting techniques were developed to index and retrieve audio samples by comparing a compact, content-based signature of the audio instead of the entire sample, thereby reducing memory and computational expense. Various techniques have been applied to create audio fingerprints; with the introduction of deep learning, however, new data-driven unsupervised approaches have become available. This doctoral dissertation presents a Sequence-to-Sequence Autoencoder Model for Audio Fingerprinting (SAMAF), which improved hash generation through a novel loss function composed of three terms: Mean Square Error, which minimizes the reconstruction error; Hash Loss, which minimizes the distance between similar hashes and encourages clustering; and Bitwise Entropy Loss, which minimizes the variation inside the clusters. The performance of the model was assessed on a subset of the VoxCeleb1 dataset, a "speech in-the-wild" dataset. The model was also compared against three baselines: Dejavu, a Shazam-like algorithm; the Robust Audio Fingerprinting System (RAFS), a Bit Error Rate (BER) methodology robust to time-frequency distortions and coding/decoding transformations; and Panako, a constellation-based algorithm that adds resilience to time-frequency distortions. Extensive empirical evidence showed that our approach outperformed all the baselines in the audio identification task and in other classification tasks related to the attributes of the audio signal, with an economical hash size of either 128 or 256 bits per second of audio. Additionally, the developed technology was deployed in two 9-1-1 Emergency Operation Centers (EOCs), located in Palm Beach County (PBC) and Greater Harris County (GH), allowing us to evaluate its real-time performance in an industrial environment. [es_MX]
dc.format.medium: Text [es_MX]
dc.language.iso: eng [es_MX]
dc.publisher: Instituto Tecnológico y de Estudios Superiores de Monterrey [es_MX]
dc.relation: Department of Homeland Security (DHS) D15PC00185 [es_MX]
dc.relation: Consejo Nacional de Ciencia y Tecnología (CONACYT) 328083 [es_MX]
dc.relation: North Atlantic Treaty Organization (NATO) G4919 [es_MX]
dc.relation.isFormatOf: published version [es_MX]
dc.rights: openAccess [es_MX]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0 [*]
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::TELECOMMUNICATIONS TECHNOLOGY [es_MX]
dc.subject.lcsh: Technology [es_MX]
dc.title: Unsupervised Deep Learning Recurrent Model for Audio Fingerprinting [es_MX]
dc.type: Doctoral Thesis [es_MX]
dc.contributor.department: Escuela de Ingeniería y Ciencias [es_MX]
dc.contributor.committeemember: Vargas Rosales, César Vargas
dc.contributor.committeemember: Gutiérrez Rodríguez, Andrés Eduardo
dc.contributor.committeemember: Rodríguez Dagnino, Ramón Martín
dc.contributor.committeemember: Loyola González, Octavio
dc.subject.keyword: Artificial Intelligence [es_MX]
dc.subject.keyword: Machine Learning [es_MX]
dc.subject.keyword: Deep Learning [es_MX]
dc.subject.keyword: Unsupervised Learning [es_MX]
dc.subject.keyword: Sequence-to-Sequence Autoencoder [es_MX]
dc.subject.keyword: Audio Fingerprinting [es_MX]
dc.subject.keyword: Audio Identification [es_MX]
dc.subject.keyword: Music Information Retrieval [es_MX]
dc.contributor.institution: Campus Monterrey [es_MX]
dc.description.degree: Doctorate in Information and Communication Technologies [es_MX]
dc.audience.educationlevel: Researchers [es_MX]
dc.date.Accepted: 2020-04-15
dc.identificator: 7||33||3325 [es_MX]




Except where otherwise noted, this item's license is described as openAccess