[clamav-users] Scanning a large file through HTTP

Micah Snyder (micasnyd) micasnyd at cisco.com
Wed Apr 7 22:14:39 UTC 2021


Hi,

> Is it 4 GB? Can this size be increased?

You can increase the maximum file size by setting the MaxFileSize option in clamd.conf.  ClamAV’s option parser won’t allow you to set a maximum scan size higher than 4GB.  In reality, the file size limit is 2GB.  Anything larger than that will be automatically skipped and marked as “OK”.

The reason for the 2GB file size limit is that in the past there were several bug reports for files larger than 2GB causing crashes.  Rather than fix the parsers, the devs slipped in a 2GB file size limit to prevent crashes.  I only just realized it a few weeks ago, and you can see my comments on this here: https://github.com/Cisco-Talos/clamav-devel/commit/1a3b784e1954e00b6463000a817da0c5092296cd

There’s a lot of technical work to be done to safely raise that limitation, as large files of various file types have never been tested.  A large TAR, for example, may well work fine while a large ZIP might crash the program.  We really have no idea.  Basically it’s going to take a bunch of testing when someone goes to work on this.

A lot of folks seem to be unhappy with it saying “OK” when a file hasn’t been scanned (myself included).  So we have been talking about changing the output to something like the following messages when files are not scanned or are only partially scanned:

  *   “SKIPPED (exceeded max file size)”
  *   “INCOMPLETE (exceeded max scan size)”

The exact wording is TBD.  If anyone has any specific requests, I’d enjoy some help brainstorming.
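
Until the output changes, a client that submits files can avoid mistaking a skipped file for a clean one by checking the size itself before it trusts an “OK”. What follows is only a minimal sketch: the function name is hypothetical, the 2 GiB threshold is an assumption standing in for the effective limit described above (the exact cut-off in the code may differ), and the returned string simply reuses the wording proposed above, which is itself still TBD.

```python
import os

# Assumption: 2 GiB stands in for the effective limit discussed above.
EFFECTIVE_MAX_FILE_SIZE = 2 * 1024**3


def pre_scan_status(size_bytes, max_file_size=EFFECTIVE_MAX_FILE_SIZE):
    """Return a 'skipped' status if the file is too large to be scanned.

    Returns None when the file is within the limit, meaning the scanner's
    own verdict can be trusted.
    """
    if size_bytes > max_file_size:
        return "SKIPPED (exceeded max file size)"
    return None


def pre_scan_status_for_path(path):
    """Convenience wrapper that looks up the size of a file on disk."""
    return pre_scan_status(os.path.getsize(path))
```

A caller would run this check first and only treat the scanner’s “OK” as meaningful when it returns None.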

> Is the memory or persistent storage a limit for ClamAV to scan a file? If it is persistent storage, then can I increase the limit by attaching an external NFS?

Sorry, persistent storage is not the concern.

> Read somewhere the full file size is mapped to memory. Is it the case for INSTREAM command also?

Yes, INSTREAM is also limited to 4GB (or _really_ 2GB).

> If it is the case then even if chunking is supported then the server side must have at least 4GB of memory.

Scanning a file in chunks is a waste of CPU cycles.  ClamAV was designed to process a whole file all at once.  Some file formats, like PDF, DMG, and ZIP*, store metadata at the end of the file which is necessary to properly parse the file.  Streaming scanners like the one in Snort struggle with these files or can’t process them at all.  I put a * near ZIP because zips are actually pretty easy to parse in-order even if the central directory is missing.  Files like DMG, on the other hand, can’t even be identified as DMGs without reading the end of the file first, or trusting the “.dmg” file extension (which is dangerous).

In short, don’t send chunks of a file as separate files to be scanned; it probably won’t catch any malware that way, and it may print lots of warnings or errors if ClamAV gets confused about the type of the file and starts processing it with the wrong parser.
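
To make the distinction concrete: the chunks used by the INSTREAM command are only transport framing for a single stream, which clamd still scans as one file, so they do not get around the size limits above (clamd also enforces StreamMaxLength from clamd.conf on the stream). A minimal sketch of that framing follows, based on the protocol described in the clamd man page: each chunk is a 4-byte length in network byte order followed by the data, and a zero-length chunk ends the stream. The function names, the chunk size, and the host and port defaults are assumptions for illustration only.

```python
import socket
import struct

CHUNK_SIZE = 64 * 1024  # assumption: any transport chunk size would do


def instream_frames(stream, chunk_size=CHUNK_SIZE):
    """Yield the byte frames of ONE clamd INSTREAM session for `stream`."""
    yield b"zINSTREAM\0"  # null-terminated form of the command
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # <length><data>, length as 4-byte unsigned int, network byte order
        yield struct.pack("!I", len(chunk)) + chunk
    yield struct.pack("!I", 0)  # zero-length chunk terminates the stream


def scan_stream(stream, host="127.0.0.1", port=3310):
    """Send `stream` to clamd over TCP and return its reply as text.

    Assumes clamd is listening on a TCP socket at host:port.
    """
    with socket.create_connection((host, port)) as sock:
        for frame in instream_frames(stream):
            sock.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):
            data = sock.recv(4096)
            if not data:
                break
            reply += data
    return reply.rstrip(b"\0").decode("utf-8", "replace")
```

Every frame here belongs to the same stream; a client that wants a whole-file verdict has to send the whole file in one such session rather than opening a new session per chunk.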

Regards,
-Micah


Micah Snyder
ClamAV
Talos
Cisco Systems, Inc.



From: clamav-users <clamav-users-bounces at lists.clamav.net> On Behalf Of Saurav Sarkar via clamav-users
Sent: Wednesday, April 7, 2021 7:39 AM
To: clamav-users at lists.clamav.net
Cc: Saurav Sarkar <saurav.sarkar1 at gmail.com>
Subject: [clamav-users] Scanning a large file through HTTP

Hi All,

We are using an HTTP-enabled malware scanning service based on ClamAV.

The service is built something like this:
https://github.com/solita/clamav-rest

We have files, such as CAD files, which can run to several GB, and we want to send them to this malware scanning service.

Is there a possibility to send the file in chunks and get it scanned on the server side in chunks?

I observed that there is an INSTREAM command in clamd for this purpose, and also that there is a 4GB size limit.
https://linux.die.net/man/8/clamd


Read somewhere the full file size is mapped to memory. Is it the case for INSTREAM command also?

If it is the case then even if chunking is supported then the server side must have at least 4GB of memory.

Best Regards,
Saurav



