Visualization and machine learning techniques to support web traffic analysis
dc.contributor.advisor | Monroy, Raúl | |
dc.creator | Gómez-Herrera, Fernando | |
dc.date.accessioned | 2018-12-10T20:18:04Z | |
dc.date.available | 2018-12-10T20:18:04Z | |
dc.date.created | 2016 | |
dc.identifier.citation | Gómez-Herrera, F. (2018). Visualization and Machine Learning Techniques to Support Web Traffic Analysis. | en_US |
dc.identifier.uri | http://hdl.handle.net/11285/632432 | |
dc.description.abstract | Web Analytics (WA) services are one of the main tools that marketing experts use to measure the success of an online business, so it is important to have tools that support WA analysis. Nevertheless, we observed that the way these services display traffic reports has changed little. Regarding the trustworthiness of the information, Web Analytics Services (WAS) face the problem that more than half of Internet traffic is Non-Human Traffic (NHT). As a result, online reports can be misleading and marketing budget can be wasted. Some research has been done, yet most of it involves intrusive methods and does not take advantage of the information provided by current WAS. In the present work, we provide tools that help the marketing expert to get better reports, to have useful visualizations, and to ensure the trustworthiness of the traffic. First, we propose a new Visualization Tool. It shows website performance in terms of a preferred metric and enables us to identify potential online strategies based on it. Second, we use Machine Learning Binary Classification (BC) and One-Class Classification (OCC) to obtain more reliable information by identifying NHT and abnormal traffic; marketing analysts can then contrast NHT against their current reports. Third, we show how Pattern Extraction algorithms (like PBC4cip's miner) can help to conduct traffic analysis (once visitor segmentation is done) and to propose new strategies that may improve the online business. The patterns can later be used in the Visualization Tool to analyze the traffic in detail. We confirmed the usefulness of the Visualization Tool by using it to analyze bot traffic we generated. NHT shared a very similar linear navigation path, in contrast with the more complex human path. Furthermore, BC and OCC (BaggingTPMiner) successfully detected well-known bots and abnormal traffic. We achieved a ROC AUC of 0.844 and 0.982 for each approach, respectively. | en_US |
dc.format.medium | Text | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Instituto Tecnológico y de Estudios Superiores de Monterrey | esp |
dc.relation.ispartof | N/A | en_US |
dc.rights | Open Access | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-sa/3.0/us/ | * |
dc.subject | 7 ENGINEERING AND TECHNOLOGY | en_US |
dc.title | Visualization and machine learning techniques to support web traffic analysis | en_US |
dc.type | Master's Thesis | en_US |
dc.contributor.mentor | Monroy, Raúl | |
dc.publisher.institution | Instituto Tecnológico y de Estudios Superiores de Monterrey | en_US |
dc.subject.keyword | machine learning | en_US |
dc.subject.keyword | web analytics | en_US |
dc.subject.keyword | bot detection | en_US |
dc.contributor.institution | Campus Estado de México | en_US |
dc.subject.discipline | Ingeniería y Ciencias Aplicadas / Engineering & Applied Sciences | en_US |
dc.description.degree | Master of Science in Computer Science | en_US |
dc.audience.educationlevel | Empresas/Companies | en_US |
dc.audience.educationlevel | Estudiantes/Students | en_US |
dc.audience.educationlevel | Investigadores/Researchers | en_US |
dc.audience.educationlevel | Medios de comunicación/News Media | en_US |
dc.relation.impreso | 2018-12 |