TY - JOUR
T1 - Invoice #31415 attached: Automated analysis of malicious Microsoft Office documents
AU - Koutsokostas, Vasilios
AU - Lykousas, Nikolaos
AU - Apostolopoulos, Theodoros
AU - Orazi, Gabriele
AU - Ghosal, Amrita
AU - Casino, Fran
AU - Conti, Mauro
AU - Patsakis, Constantinos
N1 - Publisher Copyright: © 2021
PY - 2022/3
Y1 - 2022/3
AB - Microsoft Office may be by far the most widely used suite for processing documents, spreadsheets, and presentations. Due to its popularity, it is continuously utilised to carry out malicious campaigns. Threat actors, exploiting the platform's dynamic features, use it to launch their attacks and penetrate millions of hosts in their campaigns. This work explores the modern landscape of malicious Microsoft Office documents, exposing the means that malware authors use. We leverage a taxonomy of the tools used to weaponise Microsoft Office documents and explore the modus operandi of malicious actors. Moreover, we generated and publicly shared a specially crafted dataset, which relies on incorporating benign and malicious documents containing many dynamic features such as VBA macros and DDE. The latter is crucial for a fair and realistic analysis, an open issue in the current state of the art. This allows us to draw safe conclusions on the malicious features and behaviour. More precisely, we extract the necessary features with an automated analysis pipeline to efficiently and accurately classify a document as benign or malicious using machine learning with an F1 score above 0.98, outperforming the current state of the art detection algorithms.
KW - LOLBAS
KW - Macro malware
KW - Malware
KW - Office documents
KW - Powershell
UR - http://www.scopus.com/inward/record.url?scp=85121786444&partnerID=8YFLogxK
U2 - 10.1016/j.cose.2021.102582
DO - 10.1016/j.cose.2021.102582
M3 - Article
AN - SCOPUS:85121786444
SN - 0167-4048
VL - 114
JO - Computers and Security
JF - Computers and Security
M1 - 102582
ER -