[clamav-users] Google safebrowsing types and usage questions
Micah Snyder (micasnyd)
micasnyd at cisco.com
Fri Oct 16 17:24:42 UTC 2020
Hi Alex,
I'm glad to hear that the clamav-safebrowsing tool is working for you. Please do report bugs to the project's GitHub issue tracker, though bear in mind that it may take a while before anyone has time to work on them. If you're able to submit bugfixes yourself, pull requests are always appreciated.
The clamav-safebrowsing tool was developed by another team and gifted to my team for OSS maintenance. While I've tested it a little bit, I'm not intimately familiar with it. I'll try to answer your questions as best I can inline...
> From: clamav-users <clamav-users-bounces at lists.clamav.net> On Behalf Of
> Alex via clamav-users
> btw, I found out the hard way that having a percent sign in the password
> causes the clamav-safebrowsing script to fail.
That's a good observation. If it turns out not to be an escaping issue, as Ged suggested, could you please create a GitHub issue for it?
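One way an escaping issue like this can arise, sketched below in Python: if a config file is parsed with the standard library's `configparser` using its default interpolation, a lone `%` in a value raises an error when the value is read, while `%%` is read back as a single `%`. Whether clamav-safebrowsing actually reads its password this way is an assumption, not something confirmed in this thread. The `[mysql]` section and `password` key are hypothetical names used only for illustration.

```python
import configparser


def read_password(config_text, interpolation=configparser.BasicInterpolation()):
    """Parse config_text and return the password value.

    The [mysql] section and the "password" key are hypothetical names,
    not taken from the real clamav-safebrowsing config format.
    """
    parser = configparser.ConfigParser(interpolation=interpolation)
    parser.read_string(config_text)
    return parser.get("mysql", "password")


if __name__ == "__main__":
    escaped = "[mysql]\npassword = pa%%ss\n"
    unescaped = "[mysql]\npassword = pa%ss\n"

    # A doubled percent sign is read back as a single one.
    print(read_password(escaped))

    # A lone percent sign fails under the default (basic) interpolation.
    try:
        read_password(unescaped)
    except configparser.InterpolationSyntaxError as exc:
        print("lone % rejected:", exc)

    # Disabling interpolation accepts the value unchanged.
    print(read_password(unescaped, interpolation=None))
```

If the tool does behave like this, doubling the percent sign in the config file would be a workaround to try before filing the issue.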
> It appears to have loaded another 3M signatures. Where can I find more info
> about those signatures? I'm especially interested in the types of attacks it is
> designed to stop. I've located this URL that appears to describe four
> categories, but is there any more info available?
>
> https://developers.google.com/safe-browsing/v4/reference/rest/v4/ThreatType
>
> Are there any more specifics available about each category? Do the patterns
> have names in the same way the sanesecurity patterns do?
As far as I know, Google Safe Browsing rules have no names; you simply have to trust that they identify sites that are not safe to browse.
> What is the purpose of the mysql database if the signatures are in a GDB file in
> /var/lib/clamav? I'm assuming the database is updated then "build" is used to
> dump it to a file instead of having to download it in full every time?
>
> I'd like to replicate the database across all servers to save on bandwidth and
> just have the master be updated. Does this make sense? I can then rsync the
> GDB file from the master server, or is it possible to just dump the database
> without also trying to update it?
The MySQL database exists for precisely that reason. The safebrowsing rules change a lot and, as you noted, it's a huge ruleset. MySQL is more efficient to update than our .gdb file format. Your idea to rsync the .gdb file after each update makes good sense to me.
> I also still have the old safebrowsing.cld database from the end of
> 2019 (version: 49191, sigs: 2213119, f-level: 63, builder: google).
> Should I delete that?
Yes, if you're generating your own safebrowsing ruleset, the old safebrowsing.cld file is obsolete.
> How much memory needs to be allocated for clamav to store/process 14M
> signatures?
I don't know off-hand. It certainly varies by signature type. If you're asking about safebrowsing rules, then what you can do is modify your clamd.conf to point DatabaseDirectory to a location that contains only safebrowsing rules. Once clamd has loaded them, check how much RAM it's using. The summary info for `clamscan -d <DATABASE> blah` will tell you the number of "known viruses" (loaded signatures). After that, a little math will help you estimate how much RAM a larger ruleset would require.
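The "little math" above could look something like the following Python sketch. It assumes memory use grows roughly linearly with the number of signatures of the same type, which is only an approximation. The function name, the optional baseline parameter, and the numbers in the usage example are all made up for illustration; substitute your own measurements.

```python
def estimate_ram_bytes(measured_bytes, loaded_sigs, target_sigs, baseline_bytes=0):
    """Linearly extrapolate RAM use from one measurement.

    measured_bytes: RAM used by clamd with only the test ruleset loaded.
    loaded_sigs:    "known viruses" count reported for that ruleset.
    target_sigs:    signature count you want an estimate for.
    baseline_bytes: optional RAM used with no signatures loaded
                    (an assumption added here; defaults to 0).
    """
    if loaded_sigs <= 0:
        raise ValueError("loaded_sigs must be positive")
    per_signature = (measured_bytes - baseline_bytes) / loaded_sigs
    return baseline_bytes + per_signature * target_sigs


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical figures: 550 MiB measured with 3,000,000 signatures loaded.
    estimate = estimate_ram_bytes(550 * MIB, 3_000_000, 14_000_000)
    print(f"estimated RAM: {estimate / MIB:.0f} MiB")
```

Since memory per signature varies by signature type, an estimate like this is only meaningful when the measured and target rulesets are of the same kind.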
Regards,
Micah