TY - GEN
T1 - A Review of Unstructured Data Analysis and Parsing Methods
AU - Jain, Shubham
AU - De Buitleir, Amy
AU - Fallon, Enda
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/3
Y1 - 2020/3
AB - Computer applications generate an enormous amount of data every day through their logs, system-generated files or other reports. This generated data depicts the state of the running system and contains abundant information that can be used for system diagnostics and monitoring. Network monitoring systems produce a wide variety of unstructured information, so there is a need for an automated way to extract the relevant data, which currently requires a multitude of custom parsers. Developing and testing custom parsers can be time-consuming. Instead, data can be automatically processed and parsed into a machine-readable format, building a generic model for standard or vendor-specific data, and generating insights for analytics, anomaly detection, intrusion detection, node failures and various other applications. This paper reviews some existing approaches for unstructured data mining and parsing, discusses the challenges in information extraction and the creation of knowledge bases, and presents a generic framework for automatic parsing.
KW - Data Mining
KW - Information Extraction
KW - Knowledge base
KW - NLP
KW - Similarity
UR - http://www.scopus.com/inward/record.url?scp=85092523337&partnerID=8YFLogxK
U2 - 10.1109/ESCI48226.2020.9167588
DO - 10.1109/ESCI48226.2020.9167588
M3 - Conference contribution
AN - SCOPUS:85092523337
SN - 9781728152639
T3 - 2020 International Conference on Emerging Smart Computing and Informatics, ESCI 2020
SP - 164
EP - 169
BT - 2020 International Conference on Emerging Smart Computing and Informatics, ESCI 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd IEEE International Conference on Emerging Smart Computing and Informatics, ESCI 2020
Y2 - 12 March 2020 through 14 March 2020
ER -