[clamav-users] How do heuristics block MS Office xml OLE blobs?

Alessandro Vesely vesely at tana.it
Thu Nov 15 18:29:54 UTC 2018

Hi all,

I'm trying to block Office files that contain executable content.  Decalage's mraptor works fine, except that it doesn't cover Office 2007 and similar formats.  Those have 4-char extensions, such as xlsx (XML), xlsm (macro), xlsb (binary), and many more.  For a tentative list, see e.g.:

They are zip containers, possibly containing xml and other files.  Most often, they contain a file named printerSettings1.bin.  An xlsx I got also contains a file named oleObject1.bin.  Kaspersky flags it as HEUR:Exploit.MSOffice.Generic, see:

The whole xlsx file is detected similarly.  However, the content of the only OLE stream contained therein, extracted using oledump, is flagged as clean on VirusTotal.  I don't understand what kind of content it is.  VirusTotal says it is an MS Word Document, see:

I only get:

ale@pcale:~/tmp$ python oledump.py sample.xlsx
A: xl/embeddings/oleObject1.bin
 A1:      1386 'eQuaTion nATIve'

ale@pcale:~/tmp$ python oledump.py -s A1 -d sample.xlsx > streamA1_of_oleObject1.bin
ale@pcale:~/tmp$ file !$
file streamA1_of_oleObject1.bin
streamA1_of_oleObject1.bin: data
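
For reference, here is a rough sketch of the naive check I have in mind: open the zip container and report every member that begins with the OLE2 (Compound File Binary) signature.  I am not claiming this is what ClamAV, Kaspersky, or any other scanner does.  The function name find_ole_members is made up for illustration, and the check uses only the Python standard library.

```python
import zipfile

# Signature at the start of every OLE2 / Compound File Binary file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def find_ole_members(source):
    """Return the names of zip members that start with the OLE2 signature.

    `source` is a filesystem path or a binary file-like object.
    This only looks at the first 8 bytes of each member; it does not
    parse the OLE structure or judge whether the content is malicious.
    """
    hits = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Read just enough bytes to compare against the signature.
            with archive.open(info) as member:
                header = member.read(len(OLE2_MAGIC))
            if header == OLE2_MAGIC:
                hits.append(info.filename)
    return hits
```

Calling find_ole_members("sample.xlsx") should list xl/embeddings/oleObject1.bin if that member really is an OLE2 file.  Members that are not OLE2 files, whatever their extension, are skipped.  Of course this only says that an OLE object is present, not whether it is harmful.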

So, what is the heuristic?  If it contains an OLE object then it is evil?


