Data Analytics and Cybersecurity Poster Session

Fastnet: A revolutionary tool to handle big data from real-world networks

Xu Dong and Nazrul I. Shaikh

Department of Industrial Engineering

Networks are everywhere: the Internet, transportation networks, social networks, IoT networks, and biological networks are typical instances. These real-world networks often have millions of nodes and edges and a complex structure, since the interactions among nodes are irregular, or stochastic. Measuring the structure of real-world networks helps us better understand the dynamic processes and mechanisms that take place on them, such as the spread of viruses, cascades of failures, and the propagation of fake news. However, traditional tools are rarely scalable or efficient enough to handle the scale and complexity of real-world networks. We need more efficient ways to represent network data and easier ways to analyze networks. This poster presents an open-source network analytics tool, fastnet (https://cran.r-project.org/web/packages/fastnet/index.html), aimed at fast simulation and analysis of large relational data via sampling techniques. By utilizing multi-core processing and node-/edge-wise sampling strategies, fastnet can make graph analytics more than 10X faster than previous network analysis tools. In addition, this R-based tool works seamlessly with other data acquisition and analytics packages in the R environment, giving more users access to the analytical power of both the tool and the R ecosystem. fastnet is attracting attention from both academia and industry.
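
The node-wise sampling idea can be illustrated with a short sketch (in Python rather than R, with a made-up random graph and hypothetical function names, not fastnet's API): a global metric such as mean degree is estimated from a uniform sample of nodes instead of a full pass over the network.

```python
import random
from collections import defaultdict

def make_random_graph(n, p, seed=0):
    """Build a sparse random graph as an adjacency list by sampling
    roughly p * n * (n - 1) / 2 random edges (a stand-in for a large
    real-world network)."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for _ in range(int(p * n * (n - 1) / 2)):
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def sampled_mean_degree(adj, n, sample_size, seed=1):
    """Node-wise sampling: estimate the mean degree from a uniform
    sample of nodes rather than touching every node."""
    rng = random.Random(seed)
    nodes = rng.sample(range(n), sample_size)
    return sum(len(adj[u]) for u in nodes) / sample_size

n = 10_000
adj = make_random_graph(n, 0.001)
est = sampled_mean_degree(adj, n, 500)            # sampled estimate
full = sum(len(adj[u]) for u in range(n)) / n     # exact, full pass
```

For a 10,000-node graph the 500-node sample visits 5% of the nodes yet lands close to the exact mean degree; the same trick extends to other node-level statistics.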

View poster

Facial Analytics: 3D Reconstruction from Video
Olga Jumbo1, Dr. Mohamed Abdel-Mottaleb2, Dr. Shihab Asfour2, Maolin Pang3

1Department of Industrial Engineering, 2Department of Electrical and Computer Engineering, 3University of Science and Technology of China

Retracted by author.

Poster
The Role of Data Analytics in Operations Research and Management Science

Busra Keles

Department of Industrial Engineering

Theorists in operations research (OR) and management science (MS) have favored analytical tools and mathematical models, although many manufacturing and service environments are too complex to model and solve using such techniques. On the other hand, firms incorporating (big) data analytics have reported 5–6% gains in productivity and profitability. Surely, the role of data and the effective use of data are exciting matters. However, as theorists in OR/MS, we cannot set aside the premises of mathematical modeling techniques, because they provide the most acceptable idealizations in a utopian-world setting. But neither should we develop theoretically correct models that ignore what is actually being done in real-world settings. Data is indeed an exciting resource to mine for such information in order to build computing algorithms, based on statistics and mathematics, for solving real-world business problems. In this context, this presentation covers four topics in data science: the philosophy of data; the four operations of data science (descriptive, diagnostic, predictive, and prescriptive); the compass of data science (reasoning, searching for facts and factors, and learning); and exploratory examples from industry (e.g., Amazon, eBay, Netflix) to point out where we are today in OR/MS.

Multimodal deep representation learning for disaster information management

Yudong Tao, Yuexuan Tu, Mei-Ling Shyu

Department of Electrical and Computer Engineering

Data analysis for real-world applications, such as disaster information management, usually encounters data with various modalities. Disaster information management applications have always attracted much attention due to their impact on society and government. To enhance these applications, it is essential to adequately analyze the information extracted from all data modalities, while most existing learning models focus on a single modality. This poster presents a multimodal deep learning framework for content analysis of disaster-related videos. First, several deep learning models are utilized to extract useful information from multiple modalities: pre-trained Convolutional Neural Networks (CNNs) for visual and audio feature extraction, and a word embedding model for textual analysis. Next, our novel fusion technique is applied to integrate the data representations at different levels. The proposed multimodal framework can reason about a missing data type using the other available data modalities. It is then evaluated on a web-crawled disaster video dataset and compared with several state-of-the-art single-modality and fusion techniques.
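
A minimal sketch of tolerating a missing modality, under simplifying assumptions (hypothetical fixed-length embeddings and a plain late-fusion scheme that renormalizes modality weights over whatever is available; this is not the poster's actual fusion technique, which integrates representations at multiple levels):

```python
import random

random.seed(0)
# Hypothetical fixed-length embeddings from per-modality extractors for
# one video clip; the text transcript is missing for this clip.
features = {
    "visual": [random.gauss(0, 1) for _ in range(16)],  # e.g. CNN features
    "audio":  [random.gauss(0, 1) for _ in range(8)],   # e.g. audio features
    "text":   None,                                     # missing modality
}

def late_fuse(features, weights):
    """Combine per-modality scores, renormalizing the weights over the
    modalities that are actually present, so a missing data type simply
    drops out instead of breaking the pipeline."""
    avail = {k: v for k, v in features.items() if v is not None}
    total_w = sum(weights[k] for k in avail)
    # The mean activation stands in for each modality's classifier score.
    return sum(weights[k] * (sum(v) / len(v)) for k, v in avail.items()) / total_w

score = late_fuse(features, {"visual": 0.5, "audio": 0.3, "text": 0.2})
```

A stronger variant, closer in spirit to the poster, would predict the missing modality's representation from the available ones rather than dropping it.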

Poster

Understanding high-throughput DNA methylation data via computationally affordable analytics

Haluk Damgacioglu1, Nurcin Celik1, Emrah Celik, PhD2

1Department of Industrial Engineering

2Department of Mechanical and Aerospace Engineering, University of Miami, Coral Gables

Epigenetics refers to all heritable alterations in gene function that occur without any change in the DNA sequence. DNA methylation, i.e., the addition of a methyl group to cytosine, is a very common type of epigenetic alteration. Identifying idiosyncratic DNA methylation profiles of different tumor types and subtypes can provide invaluable insights for 

  • Accurate diagnosis, 
  • Early detection, 
  • Tailoring of the related treatment for cancer. 

This profiling has led to extensive use of conventional distance-based clustering algorithms such as hierarchical clustering and k-means clustering. Despite their speed in high-throughput analysis, these methods commonly yield suboptimal solutions and/or trivial clusters due to their greedy search nature. Hence, methodologies are needed to improve the quality of the clusters formed by these algorithms without sacrificing their speed. We introduce three algorithms for a complete high-throughput methylation analysis: 

  1. a variance-based dimension reduction algorithm to reduce the number of dimensions of the methylation data before it is processed for clustering, 
  2. a distance-based outlier detection algorithm to detect the outliers and micro-clusters of the methylation data with reduced dimensionality, 
  3. an advanced Tabu-based iterative k-medoids clustering algorithm (T-CLUST) to reduce the impact of initial solutions on the performance of the conventional k-medoids algorithm.
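
A rough sketch of steps 1 and 3 under simplifying assumptions (synthetic methylation-like data, and plain k-medoids without the Tabu-based restart strategy that T-CLUST adds to escape poor initial medoids):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "methylation" matrix: 60 samples x 200 CpG sites in [0, 1],
# with two groups that differ on the first 20 sites.
X = rng.uniform(0.4, 0.6, size=(60, 200))
X[:30, :20] += 0.3
X = np.clip(X, 0, 1)

# Step 1: variance-based dimension reduction -- keep the most variable sites.
top = np.argsort(X.var(axis=0))[::-1][:20]
Xr = X[:, top]

# Step 3 (baseline only): plain iterative k-medoids.
def k_medoids(X, k, iters=20, seed=1):
    r = np.random.default_rng(seed)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances
    medoids = r.choice(len(X), k, replace=False)
    for _ in range(iters):
        labels = np.argmin(D[:, medoids], axis=1)
        # Pick, within each cluster, the point minimizing total distance.
        new = np.array([np.argmin(D[labels == j][:, labels == j].sum(axis=0))
                        for j in range(k)])
        new = np.array([np.flatnonzero(labels == j)[new[j]] for j in range(k)])
        if np.array_equal(new, medoids):
            break
        medoids = new
    return np.argmin(D[:, medoids], axis=1)

labels = k_medoids(Xr, 2)
```

On this synthetic data the variance filter recovers the 20 informative sites and the two planted groups separate cleanly; real methylation data is where the Tabu-based restarts earn their keep.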


Contextual Combinatorial Multi-armed Bandits with Volatile Arms and Submodular Reward
Lixing Chen1, Jie Xu1, Zhuo Lu2

1University of Miami, 2University of South Florida

Exploration-Exploitation Dilemma
  • Explore: learn the reward of arms
  • Exploit: pull the arm that yielded the highest reward in the past
Objective
  • Maximize cumulative reward over time horizon by balancing exploration and exploitation.  
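
The explore/exploit balance above can be sketched with the classic UCB1 index policy (a deliberate simplification: fixed non-contextual arms with Bernoulli rewards, not the poster's contextual combinatorial, volatile-arm, submodular-reward setting):

```python
import math
import random

def ucb1(true_means, horizon, seed=0):
    """Minimal UCB1: pull each arm once, then pick the arm maximizing
    empirical mean + exploration bonus. The bonus shrinks as an arm is
    pulled more, shifting effort from exploration to exploitation."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k    # pulls per arm
    sums = [0.0] * k    # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            a = t - 1   # explore: try every arm once
        else:
            a = max(range(k), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < true_means[a] else 0.0  # Bernoulli reward
        counts[a] += 1
        sums[a] += r
        total += r
    return total, counts

total, counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

Over 2,000 rounds the best arm (mean 0.8) ends up pulled far more often than the others, while the suboptimal arms still receive the occasional exploratory pull.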
Poster
Efficient Computation of Belief Theoretic Conditionals for Time Sensitive Uncertainty Reasoning Applications
Lalintha G. Polpitiya, Dr. Kamal Premaratne, Dr. Manohar N. Murthi, Dr. Dilip Sarkar

Department of Electrical and Computer Engineering and Department of Computer Science

Artificial Intelligence (AI) applications in data analytics are growing at a rapid pace in a wide range of critical and sensitive domains. However, expert systems are still prone to collapse due to the difficulty of accommodating uncertainties and replicating complex environments in many real-world domains. Dempster-Shafer (DS) belief theory plays a major role in modeling these uncertainties and data imperfections. A major limitation associated with the application of DS theoretic techniques for reasoning under uncertainty is the absence of a feasible computational framework to overcome the prohibitive computational burden of the conditional operation, a problem known to be NP-hard. We address this critical challenge via two novel generalized conditional computational models, DS-Conditional-One and DS-Conditional-All, which allow the conditional mass and belief to be computed with significantly less computational time and space complexity. They provide valuable insight into the DS theoretic conditional itself and can be utilized as a tool for visualizing the conditional computation. We provide implementations for computing both Dempster's conditional and the Fagin-Halpern conditional, the two most widely utilized DST conditioning strategies. A new computational library, which we refer to as DS-COCA (DS-Conditional-One and DS-Conditional-All), has been developed and harnessed in the simulations.

Poster
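
For intuition, Dempster's rule of conditioning can be sketched as a naive pass over the focal elements of a mass function (a brute-force baseline on a toy frame; the cost of such passes at scale is exactly what DS-Conditional-One and DS-Conditional-All target):

```python
def dempster_condition(m, B):
    """Dempster's rule of conditioning for a mass function m, given as a
    dict mapping frozenset focal elements to masses: intersect each focal
    element with the conditioning event B, pool mass on each nonempty
    intersection, and renormalize away the conflicting (empty) mass."""
    out = {}
    conflict = 0.0
    for C, w in m.items():
        A = C & B
        if A:
            out[A] = out.get(A, 0.0) + w
        else:
            conflict += w
    norm = 1.0 - conflict
    return {A: w / norm for A, w in out.items()}

# Toy frame {a, b, c}; the evidence is conditioned on B = {a, b}.
m = {frozenset("a"): 0.3, frozenset("bc"): 0.5, frozenset("abc"): 0.2}
cond = dempster_condition(m, frozenset("ab"))
```

Here the mass on {b, c} collapses onto {b} and the mass on the full frame onto {a, b}; with no conflicting mass, no renormalization is needed. On realistic frames the number of focal elements can grow exponentially, which is where a naive pass like this becomes prohibitive.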
Convolutional Neural Network Transfer for Automated Glaucoma Detection
Manal Ghamdi, Linhao Luo, Arda Efe Okay, Mohamed Abdel-Mottaleb, Mohamed Abou Shousha

Department of Electrical and Computer Engineering, Umm Al-Qura University, Harbin Institute of Technology, Bascom Palmer Eye Institute

Retracted by the author.
Characterization of Perfluoroalkyl and Polyfluoroalkyl Substances (PFAS) in Landfill Leachate and Preliminary Evaluation of Leachate Treatment Processes
Athena Jones, Hekai Zhang, Helena Solo-Gabriele

Department of Civil, Architectural, and Environmental Engineering

Perfluoroalkyl and polyfluoroalkyl substances (PFAS) are fluorine-containing chemicals found in many stick- and stain-resistant products. The most common of the PFASs are perfluorooctanoic acid (PFOA), which is used to make Teflon, and perfluorooctane sulfonate (PFOS), a breakdown product of a common water-resistant chemical known as Scotchgard. Although widely used, only recently have their human health impacts been recognized. Studies have linked PFOA and PFOS to thyroid and liver diseases, diseases of the immune system, and cancer. Due to their wide-ranging use in consumer products, landfills represent a logical end-of-life reservoir for PFASs. The objectives of this study are to evaluate the concentrations of PFASs in leachates from Florida landfills and to assess the capacity of current treatments to remove PFASs from leachate. Leachate samples will be collected from landfills in the State of Florida and from the effluent of leachate treatment facilities. These samples are to be analyzed with LC-MS/MS for PFASs. Data on leachate volumes and treatment have been consolidated for landfills in the State of Florida. From this literature information, coupled with leachate measurements, a preliminary assessment will be made of the effectiveness of existing leachate treatment strategies in reducing PFOA and PFOS levels. In an effort to broadly assess the health risks associated with PFASs, results from leachate measurements will be compared to the U.S. Environmental Protection Agency's PFC health advisory of 0.07 parts per billion. Results can be used by regulators to assess whether treatment systems are needed to remove PFASs from landfill leachates in Florida.