[clamav-users] Google safebrowsing types and usage questions

Micah Snyder (micasnyd) micasnyd at cisco.com
Fri Oct 16 17:24:42 UTC 2020


Hi Alex,

I'm glad to hear that the clamav-safebrowsing tool is working for you.  Please do report bugs to the project's GitHub issue tracker, though bear in mind that it may take a while before anyone has time to work on them.  If you're able to submit bugfixes yourself, pull requests are always appreciated.

The clamav-safebrowsing tool was developed by another team and gifted to my team for OSS maintenance. While I've tested it a little bit, I'm not intimately familiar with it. I'll try to answer your questions as best I can inline...

> From: clamav-users <clamav-users-bounces at lists.clamav.net> On Behalf Of
> Alex via clamav-users 

> btw, I found out the hard way that having a percent sign in the password
> causes the clamav-safebrowsing script to fail.

That's a good observation.  If it's not an escaping issue as Ged suggested, can you please create a GitHub issue for it?
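
To illustrate the kind of escaping issue that could be involved: Python's standard configparser treats `%` as an interpolation character by default, so a literal percent sign has to be written as `%%`. Whether clamav-safebrowsing actually reads its password this way is an assumption that has not been verified here, and the section and key names below are hypothetical.

```python
import configparser

# Hypothetical config contents; section and key names are made up for illustration.
RAW = "[database]\npassword = pa%ss\n"
ESCAPED = "[database]\npassword = pa%%ss\n"


def read_password(text, interpolation=True):
    """Parse config text and return the password value."""
    if interpolation:
        parser = configparser.ConfigParser()  # default BasicInterpolation
    else:
        parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser.get("database", "password")


def try_read(text, interpolation=True):
    """Return the password, or the name of the configparser exception raised."""
    try:
        return read_password(text, interpolation)
    except configparser.Error as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(try_read(RAW))                       # a bare % is rejected
    print(try_read(ESCAPED))                   # %% yields a literal %
    print(try_read(RAW, interpolation=False))  # interpolation off: % is literal
```

If the failure disappears once the percent sign is doubled, it is most likely an escaping issue rather than a bug in the tool.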

> It appears to have loaded another 3M signatures. Where can I find more info
> about those signatures? I'm especially interested in the types of attacks it is
> designed to stop. I've located this URL that appears to describe four
> categories, but is there any more info available?
> 
> https://developers.google.com/safe-browsing/v4/reference/rest/v4/ThreatType
> 
> Are there any more specifics available about each category? Do the patterns
> have names in the same way the sanesecurity patterns do?

As far as I know, Google Safe Browsing rules have no names; you simply have to trust that they identify sites which are not safe to browse.

> What is the purpose of the mysql database if the signatures are in a GDB file in
> /var/lib/clamav? I'm assuming the database is updated then "build" is used to
> dump it to a file instead of having to download it in full every time?
> 
> I'd like to replicate the database across all servers to save on bandwidth and
> just have the master be updated. Does this make sense? I can then rsync the
> GDB file from the master server, or is it possible to just dump the database
> without also trying to update it?

The MySQL database exists for precisely that reason. The safebrowsing rules change a lot and, as you noted, it's a huge ruleset.  MySQL is more efficient to update than our .gdb file format.  Your idea to rsync the .gdb file after each update makes good sense to me.
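
When distributing the file to other servers, it is worth making sure clamd never sees a half-written database. rsync already writes to a temporary file and renames it unless `--inplace` is used; the sketch below shows the same copy-then-rename idea for any other transfer method. The function name and paths are hypothetical, and it assumes clamd ignores files ending in `.tmp` in its database directory.

```python
import os
import shutil
import tempfile


def install_database(src_path, db_dir):
    """Copy src_path into db_dir without exposing a partially written file."""
    final_path = os.path.join(db_dir, os.path.basename(src_path))
    # Create the temp file inside db_dir so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=db_dir, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(src_path, tmp_path)
        shutil.copymode(src_path, tmp_path)  # mkstemp creates 0600; keep source mode
        os.replace(tmp_path, final_path)     # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

A replica could call this on a file received into a staging directory, for example `install_database("/tmp/staging/example.gdb", "/var/lib/clamav")`, where the staging path and file name are placeholders.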

> I also still have the old safebrowsing.cld database from the end of
> 2019 (version: 49191, sigs: 2213119, f-level: 63, builder: google).
> Should I delete that?

Yes, if you're generating your own safebrowsing ruleset, the old safebrowsing.cld file is obsolete.

> How much memory needs to be allocated for clamav to store/process 14M
> signatures?

I don't know offhand. It certainly varies by signature type.  If you're asking about safebrowsing rules, then what you can do is modify your clamd.conf to point DatabaseDirectory to a location that contains only safebrowsing rules.  When clamd loads, check how much RAM it's using.  The summary info for `clamscan -d <DATABASE> blah` will tell you the number of "known viruses" (loaded signatures).  After that, a little math will help you estimate how much RAM a larger rule set would require.
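
The "little math" can be sketched as a simple proportion. This assumes memory use scales roughly linearly with signature count for a single signature type and ignores clamd's fixed baseline overhead, so treat the result as a rough estimate. The numbers in the example are placeholders, not measurements.

```python
def estimate_ram_bytes(measured_bytes, loaded_signatures, target_signatures):
    """Scale a measured memory footprint to a larger signature count.

    measured_bytes: memory clamd used with only the test ruleset loaded
    loaded_signatures: the "known viruses" count reported for that ruleset
    target_signatures: the signature count you want an estimate for
    """
    if loaded_signatures <= 0:
        raise ValueError("loaded_signatures must be positive")
    return measured_bytes * target_signatures / loaded_signatures


if __name__ == "__main__":
    # Placeholder inputs: substitute your own measurement and signature counts.
    measured = 600 * 1024 * 1024  # hypothetical: 600 MiB observed
    estimate = estimate_ram_bytes(measured, 3_000_000, 14_000_000)
    print(f"estimated: {estimate / (1024 * 1024):.0f} MiB")
```

Because different signature types have different per-signature costs, it is best to measure each ruleset type separately and add up the estimates.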


Regards,
Micah


