One of my most treasured experiences here at Parsons has been serving as a mentor in the internship program run by our Early Talent Program. Students can be a valuable source of fresh ideas and are often eager to learn and contribute to research. This past summer, I had the opportunity to work with two such individuals, Garrett Fuchs and Nic Quance. Together, we designed a machine learning tool to support our solutions for deep packet inspection and network traffic identification.
In order to identify network packets by their payload contents, engineers often construct databases of regular expression signatures, which may be used to match network packets on common features like method names or byte sequences. “RExACtor” is a regular expression signature apriori constructor which uses unsupervised learning and natural language processing (NLP) to extract these common features and motifs.
Furthermore, it uses a bioinformatics approach to sequence alignment to generate reusable regular expressions that accurately match network traffic. Automating signature generation not only frees engineers from manually extracting common features and writing complex signatures but also allows features to be discovered in new or unknown protocol stacks. We can use this tool to examine traffic surveys and generate signatures, which we can then add to scanning databases for future identification workloads.
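To make the idea of a signature database concrete, here is a minimal Python sketch of matching packet payloads against a small set of regular expression signatures. The signature names, the patterns, and the `identify` helper are illustrative assumptions of ours; they are not taken from RExACtor or from any production database.

```python
import re
from typing import List

# Hypothetical signature database: label -> compiled byte-level regex.
# These hand-written patterns stand in for the kind of signatures an
# engineer (or a generator) might produce.
SIGNATURES = {
    "http_request": re.compile(rb"^(GET|POST|PUT|DELETE|HEAD) \S+ HTTP/1\.[01]\r\n"),
    "ssh_banner": re.compile(rb"^SSH-\d\.\d+-\S+"),
    "tls_handshake": re.compile(rb"^\x16\x03[\x00-\x04]"),
}


def identify(payload: bytes) -> List[str]:
    """Return the labels of every signature that matches the payload."""
    return [name for name, sig in SIGNATURES.items() if sig.search(payload)]


if __name__ == "__main__":
    print(identify(b"GET /index.html HTTP/1.1\r\nHost: example\r\n\r\n"))
    print(identify(b"SSH-2.0-OpenSSH_8.9\r\n"))
```

Writing and maintaining patterns like these by hand is the manual effort that automatic signature generation is meant to reduce.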
The processing pipeline my team developed is a novel approach to automatic signature generation and draws on several disciplines within artificial intelligence. Our most distinctive contribution to regular expression (“regex”) generation algorithms is an encoding based on genetic sequence alignment: the algorithm aligns packet payloads pairwise and applies rules that replace certain characters with encoded symbols representing regex classes. In the final step, the string is decoded by replacing each symbol with its regex interpretation.
Our algorithm can generate more complex and more verbose signatures than those reported in previous industry research, placing our teams at the forefront of this area of network traffic processing.
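The align, encode, and decode steps described above can be illustrated with a short Python sketch. This is not the RExACtor implementation: it assumes a textbook Needleman-Wunsch global alignment with arbitrary scoring values, three placeholder symbols of our own choosing (private-use Unicode characters that are assumed not to occur in payloads), and payloads already decoded to text. All function names are hypothetical.

```python
import re
import string
from itertools import groupby

# Placeholder symbols (an assumption of this sketch) and their regex meanings.
DIGIT, ALPHA, ANY = "\ue000", "\ue001", "\ue002"
DECODE = {DIGIT: r"\d+", ALPHA: r"[A-Za-z]+", ANY: r"[\s\S]*"}


def align(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns columns, None marks a gap."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    cols, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            cols.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            cols.append((a[i - 1], None)); i -= 1
        else:
            cols.append((None, b[j - 1])); j -= 1
    return cols[::-1]


def encode(columns):
    """Keep agreeing characters; replace disagreeing columns with symbols."""
    out = []
    for x, y in columns:
        if x is None or y is None:
            out.append(ANY)      # gap: the character may be absent
        elif x == y:
            out.append(x)        # literal shared by both payloads
        elif x in string.digits and y in string.digits:
            out.append(DIGIT)
        elif x in string.ascii_letters and y in string.ascii_letters:
            out.append(ALPHA)
        else:
            out.append(ANY)
    return "".join(out)


def decode(encoded):
    """Replace each run of a symbol with its regex; escape literal runs."""
    parts = []
    for key, run in groupby(encoded, key=lambda c: c if c in DECODE else None):
        parts.append(DECODE[key] if key is not None else re.escape("".join(run)))
    return "".join(parts)


def build_signature(a, b):
    return decode(encode(align(a, b)))


if __name__ == "__main__":
    sig = build_signature("GET /user/42 HTTP/1.1", "GET /user/917 HTTP/1.1")
    print(sig)
```

By construction, the resulting pattern fully matches both input payloads: every literal run is shared by both, and every symbol run decodes to a class that covers the characters either payload had at that position. Extending the idea from one pair to a whole cluster of payloads, and choosing richer symbol classes, is where a real generator would differ from this sketch.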
As representatives of Parsons and the Defense and Intelligence market, we shared our research at the IEEE 2021 International Symposium on Network Computing and Applications and published “RExACtor” in the conference proceedings. Sharing RExACtor with the research community is an important step for our team and business unit toward advancing the forefront of theoretical computing. More importantly, as solutions engineers, we are focused on turning these ideas into tools that people can use.
Through projects like RExACtor, we are filling a pivotal role in turning theoretical research into real systems. This innovation would not be possible without the collaboration of other industry partners and even university students, who have proven to be a source of meaningful growth and contribution.
Together, we can continue to go beyond in the mission for our partners in the Department of Defense and the broader intelligence community.