Abstract
This dissertation is submitted to the Graduate Programs in the School of Engineering and Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science. It explores how a digital twin model of the TEC District is developed and how a knowledge graph can be used for dense captioning of security events occurring in that model. It also describes and analyzes the factors responsible for security breaches in the city using the concept of neutrosophy. The thesis proposes novel techniques for advancing dense captioning by integrating knowledge graphs and neutrosophy and by harnessing the capabilities of digital twin technology. Digital twins, as virtual replicas of physical entities or systems, offer a comprehensive framework for understanding and simulating real-world scenarios. They have emerged as a powerful tool in various industries, including manufacturing, healthcare, and urban planning. These models rely on detailed simulations of cities, including video data, to analyze and describe security events using dense captioning. However, the accuracy and relevance of these simulations depend heavily on the quality of the captions generated for the video content. Captioning videos based on temporal information is challenging because distracting information must be limited across both time and space. The literature also identifies robustness to false positives during captioning and storage requirements as significant challenges. Gathering information from knowledge graphs and providing context is another key task, made harder by the presence of indeterminacy in the data. This indeterminacy makes it difficult to define aggression subjectively and to describe events automatically while optimizing classifiers for faster caption generation and selecting optimal parameters.
A widely used technique for dense video captioning is the knowledge graph, which provides a structured representation of knowledge by organizing and connecting information extracted from videos. Incorporating knowledge graphs into the digital twin model significantly enhances the relevance and context of the captions and enables a more comprehensive understanding of video content. However, knowledge graphs may fail to capture indeterminate factors that can dramatically affect situation analysis. Indeterminate factors, such as unpredictable human behavior or environmental conditions, are crucial in determining event sequences in digital twin models. In this dissertation, we aim to create a digital twin model of the TEC District for effective dense captioning of events in the district area using a knowledge graph model. We also use neutrosophy to address indeterminate and uncertain events, thereby enhancing the efficiency of dense captioning. The work is carried out in three phases. The first phase uses Neutrosophic Cognitive Maps (NCMs) to identify character traits, drawn from datasets and the literature, that lead to different events among the masses. The goal is to establish the significance of the various determinate and indeterminate factors when analyzing security events. In research domains other than dense video captioning, this task was previously performed using Fuzzy Cognitive Maps (FCMs), without considering indeterminate or uncertain factors. We therefore provide a brief comparison between NCMs and FCMs and show how effective NCMs are when the uncertainty of concepts is taken into account in tests for describing events. In the second phase, a knowledge graph model for dense captioning is developed.
Because captioning is based on a knowledge graph, the time required to generate the video captions is considerably reduced. We also use a Bidirectional Long Short-Term Memory (BiLSTM) classifier to analyze the flow of the information provided by the captions, and efficiency is further enhanced by using a Recurrent Neural Network (RNN). Applying the Squacc optimization algorithm to both the RNN and the BiLSTM effectively optimized the classifier parameters and helped to obtain an efficient output. Evaluation with the BLEU, ROUGE, CIDEr, METEOR, and SPICE metrics demonstrated the superiority of the proposed approach. In the third phase, we developed a digital twin of the TEC District, Monterrey, Nuevo Leon, Mexico. We carried out this work by defining and developing five layers in our digital twin model: the ground layer, the BIM layer, the mobility infrastructure layer, the district 3D model, and finally the digital twin. To develop the TEC District digital twin, we used several common software applications: Esri ArcGIS for data management (map data, GeoJSON, and 2D data); CityEngine for assigning rule files to buildings, vegetation, water, and the road network and for manipulating 2D and 3D data; QGIS for shapefiles; the 3D modeling software Blender; and NVIDIA Omniverse for the final digital twin. Using these tools and techniques, a digital twin is proposed for the buildings, road network, and vegetation of the TEC District (Tecnologico de Monterrey District) region. We then integrated our dense captioning model with the TEC District digital twin to obtain captions of security events using knowledge graphs. The general aim of this investigation is to provide a better understanding of digital twins and dense video captioning. By leveraging these technologies, organizations can generate more accurate and insightful analyses of digital twin models, enabling a wide range of applications in various fields.
These technologies will also aid surveillance and security in urban planning, offering significant benefits for organizations looking to optimize their operations and enhance their decision-making processes. All the models described in this investigation can be applied to a wider range of instances to achieve acceptable results with respect to time and quality.
Description
https://orcid.org/0000-0002-5320-0773