Machine Learning and Neural Network Approaches to Statement Validation (Work in progress)
I am currently working on a project to design a system that assesses the factual accuracy of statements. We are designing a multimodal learning model that predicts whether a statement is true or false. We also plan to build an information retrieval module that automatically retrieves the evidence needed to support the model's prediction. We plan to use multiple datasets; for now, we are considering LIAR, Sentimental LIAR, MM-COVID, FEVER, FakeNewsNet, and X-fact.
Microsoft & Open Data Institute (ODI)
Education Open Data Challenge
As the Lead Machine Learning Engineer, I used the K-Means clustering algorithm to cluster and categorize Texas zip codes by the state of the digital divide before and during the COVID-19 pandemic. Our team then mapped each zip code’s category on a color gradient running from high to low digital divide.
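To illustrate the kind of clustering described above, here is a minimal K-Means sketch in plain Python. It is not the project's code: the two features (share of households with broadband, share with a computer), the six sample rows, the choice of k = 2, and the farthest-point initialization are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-point initialization (an assumed heuristic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iters=100):
    """Alternate assignment and centroid updates until centroids stop moving."""
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical rows: (broadband share, computer share) per zip code.
points = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.10),  # low-access group
    (0.90, 0.80), (0.85, 0.90), (0.80, 0.95),  # high-access group
]
labels, centroids = kmeans(points, k=2)
```

Each entry in `labels` is the cluster index for the matching row of `points`; ordering the clusters by their centroid values is one way such labels could be turned into a high-to-low gradient on a map.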
Dimensionality Reduction with PCA
For the data analysis project in the course SDS 395 – Data Viz in R, I conducted a Principal Component Analysis (PCA) to find underlying correlations and interrelations among the observed variables in the Airline Safety dataset, available in the FiveThirtyEight package for R.
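As a rough illustration of what PCA computes, the sketch below finds the first principal component of a small made-up table. It is written in Python rather than R and is not the course project's code: the five sample rows are invented, the data are only centered (not scaled), and power iteration stands in for the full eigendecomposition a statistics package would perform.

```python
import math


def center(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iters=200):
    """Power iteration: leading eigenvector and eigenvalue of cov."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance along v (v has unit length).
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue


# Made-up observations of two strongly correlated variables.
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0], [5.0, 10.0]]
cov = covariance(center(rows))
direction, variance = first_component(cov)
# Share of total variance captured by the first component.
explained_ratio = variance / sum(cov[i][i] for i in range(len(cov)))
```

Here `direction` holds the loadings of the first component and `explained_ratio` the fraction of total variance it captures. When variables are measured on very different scales, they are usually standardized before this step.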