[clamav-users] Heuristics.Limits.Exceeded FOUND
Micah Snyder (micasnyd)
micasnyd at cisco.com
Sun Apr 12 17:39:22 UTC 2020
Paul,
I investigated further and realize now that it ISN'T double-extracting files from plain zips. It is double-extracting files from zips within other raw container formats, like TAR or image file formats. For a plain zip, it detects the file entries twice, but doesn't extract them a second time if the parent file is a zip.
I tested this by making a simple zip with two text files in it, then tar.gz'd it. Scanning the zip.tar.gz file resulted in double-extraction of both text files.
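For anyone who wants to build a similar test file, here is a minimal Python sketch of the steps above. The function name, the entry names (a.txt, b.txt, test.zip) and the file contents are placeholders of mine, not the files used in the original test.

```python
import io
import tarfile
import zipfile


def build_zip_in_tar_gz(out_path="zip.tar.gz"):
    """Create a zip holding two small text files, then wrap it in a tar.gz."""
    # Build the inner zip in memory (file names and contents are made up).
    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.txt", "first text file\n")
        zf.writestr("b.txt", "second text file\n")
    zip_bytes = zip_buf.getvalue()

    # Store the zip as the only member of a gzip-compressed tar archive.
    with tarfile.open(out_path, "w:gz") as tf:
        info = tarfile.TarInfo(name="test.zip")
        info.size = len(zip_bytes)
        tf.addfile(info, io.BytesIO(zip_bytes))
    return out_path


if __name__ == "__main__":
    print(build_zip_in_tar_gz())
```

Scanning the resulting zip.tar.gz with clamscan should then exercise the zip-inside-tar path described above.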
Funny story, the omni.ja file is not a real zip. The author of the format decided to place the central directory header at the beginning of the file instead of at the end, resulting in a new zip-like file format. We're able to parse out the files from omni.ja okay because we have self-extracting zip signatures that identify the individual file entries and because the omni.ja file itself is detected as "binary data" (so the ZIPSFX-in-a-ZIP exclusion rule does not apply).
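To illustrate the idea of finding file entries by signature, here is a rough Python sketch. It is not ClamAV's code. It just searches a buffer for the standard ZIP local file header marker (PK\x03\x04) and central directory header marker (PK\x01\x02), and reports whether a central directory record shows up before the first local entry. The helper names are mine, and a naive byte search like this can hit false positives inside compressed data.

```python
import io
import zipfile

LOCAL_HEADER = b"PK\x03\x04"    # ZIP local file header signature
CENTRAL_HEADER = b"PK\x01\x02"  # ZIP central directory file header signature


def find_all(data, sig):
    """Return every offset at which sig occurs in data."""
    offsets = []
    pos = data.find(sig)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(sig, pos + 1)
    return offsets


def central_directory_first(data):
    """True if a central directory header precedes the first local header."""
    local = find_all(data, LOCAL_HEADER)
    central = find_all(data, CENTRAL_HEADER)
    if not local or not central:
        return False
    return central[0] < local[0]


if __name__ == "__main__":
    # An ordinary zip: local entries first, central directory at the end.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello\n")
        zf.writestr("b.txt", "world\n")
    data = buf.getvalue()
    print(find_all(data, LOCAL_HEADER), central_directory_first(data))
```

For a conventional zip the check returns False. A file laid out the way omni.ja is described above would be expected to return True, though I have not run this against a real omni.ja.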
Anyhow, I now suspect that the omni.ja file in a tar.gz file will also get double-extracted. The simplest option would be to disable file-type-recognition scans for embedded file formats in TAR files (and also GPT and other non-compressed archive file formats). I had been wanting to do this anyway after investigating a closely related issue regarding ISO/GPT file formats. This definitely gives us more reason to do so.
-Micah
On 4/10/20, 6:55 PM, "Paul Kosinski" <clamav-users at iment.com> wrote:
Is this a generic problem with compressed archives (like the Firefox
".tar.bz2") or is it zip specific?
If it is zip specific, there are 2 files in the Firefox distribution
file that are zip format compressed which might explain the slowness.
(They are both named omni.ja, but have different contents).
On Fri, 10 Apr 2020 19:58:35 +0000
"Micah Snyder (micasnyd)" <micasnyd at cisco.com> wrote:
> One issue ClamAV currently has with scanning Zip archives is that
> ClamAV's self-extracting zip detection logic has a flaw wherein it
> detects every file within a zip as a new self-extracting zip. As a
> result, I believe (and I could be wrong on this), that Clam ends up
> extracting and scanning every file in a zip *twice*. I'm still
> brainstorming the best way to fix this -- but I suspect this is a
> large part of why zip-based file formats take much longer than
> expected to scan.
>
> -Micah
>
>
> Micah Snyder
> ClamAV Development
> Talos
> Cisco Systems, Inc.
>
>
>
>
> On 4/7/20, 1:38 PM, "clamav-users on behalf of Paul Kosinski via
> clamav-users" <clamav-users-bounces at lists.clamav.net on behalf of
> clamav-users at lists.clamav.net> wrote:
>
> I didn't want to screw around with my clamdscan (clamd.conf)
> settings, so I ran my optioned-up clamscan command on a smaller and
> much less complicated file. It took less than 11 seconds total time.
> (My previous guess on clamscan's DB load time was apparently way off.)
>
> This suggests that the ClamAV scanning process really does take a
> lot of CPU to deal with a big, complicated file like a Firefox
> package:
> time clamscan
> --alert-exceeds-max=yes --max-scantime=999999
> --max-scansize=4090M --max-filesize=4090M --max-files=30000
> --max-recursion=30 --pcre-match-limit=999999999
> --pcre-max-filesize=999999999 audiofile.wav
> audiofile.wav: OK
>
> ----------- SCAN SUMMARY -----------
> Known viruses: 6804144
> Engine version: 0.102.1
> Scanned directories: 0
> Scanned files: 1
> Infected files: 0
> Data scanned: 1.74 MB
> Data read: 1.73 MB (ratio 1.01:1)
> Time: 10.836 sec (0 m 10 s)
>
> real 0m10.851s
> user 0m10.439s
> sys 0m0.412s
>
> P.S. This is an actual audio intermediate file, not just random
> bytes.
>
>
> On Mon, 6 Apr 2020 21:50:15 -0700
> Al Varnell via clamav-users <clamav-users at lists.clamav.net> wrote:
>
> > Much of that time is almost certainly being consumed by loading
> > the signature database into RAM. How long does it take using
> > clamdscan?
> >
> > Sent from my iPad
> >
> > -Al-
> >
> > On Apr 6, 2020, at 12:29, Paul Kosinski via clamav-users
> > <clamav-users at lists.clamav.net> wrote:
> > >
> > > It *does* take more than 120 secs for the clamscan command to
> > > fully scan the 62 MB Firefox installation file (.tar.bz2).
> > > Trying the scan with the default clamscan limits results in
> > > 62 MB "Data read" but *zero* "Data scanned"!
>