[clamav-users] No good deed goes unpunished, or, why CVD files don't work

Paul Kosinski clamav-users at iment.com
Wed Dec 19 15:26:51 EST 2018

In light of The Delays, and the fact that CVDs are so much bigger than
CDIFFs, I have changed our ClamAVs to use Scripted Update (CDIFFs) and
thus fetch directly from database.clamav.net.

We currently have fewer than a half-dozen machines on our LAN, which
share a single Comcast dynamic IP address (and hit Cloudflare's BOS),
plus one remote machine -- our virtual (cloud) Web server, which has
static IPs (and hits Cloudflare's IAD).

They all do DNS TXT queries 3-5 times per hour, and *only* if the TXT
record says there are new CDIFFs do they invoke freshclam. As before,
this is all based on cron, and the times are staggered to avoid peaking.
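
A minimal sketch of the "check DNS first, run freshclam only if needed"
test described above. It is not the actual script in use here. It
assumes the TXT record (for example, what `dig +short TXT
current.cvd.clamav.net` prints) is a colon-separated string whose
fields 1, 2 and 7 (counting from 0) are the main, daily and bytecode
database versions. The names DbVersions, parse_txt_record and
needs_update are hypothetical, and the sample record is made up.

```python
from typing import NamedTuple


class DbVersions(NamedTuple):
    """Database version numbers (hypothetical helper type)."""
    main: int
    daily: int
    bytecode: int


def parse_txt_record(txt: str) -> DbVersions:
    """Extract database versions from a ClamAV-style DNS TXT string.

    Assumed layout (colon-separated): field 1 = main version,
    field 2 = daily version, field 7 = bytecode version.
    """
    fields = txt.strip().strip('"').split(":")
    if len(fields) < 8:
        raise ValueError(f"unexpected TXT record: {txt!r}")
    return DbVersions(
        main=int(fields[1]),
        daily=int(fields[2]),
        bytecode=int(fields[7]),
    )


def needs_update(published: DbVersions, local: DbVersions) -> bool:
    """True if any published version is newer than the local one."""
    return any(p > l for p, l in zip(published, local))


if __name__ == "__main__":
    # Made-up record, for illustration only.
    sample = '"0.101.0:58:25215:1545250140:1:63:48210:327"'
    published = parse_txt_record(sample)
    local = DbVersions(main=58, daily=25214, bytecode=327)
    if needs_update(published, local):
        print("new database version published; run freshclam")
    else:
        print("up to date; skip freshclam")
```

A cron job would feed the real TXT string and the locally installed
versions into needs_update() and invoke freshclam only when it returns
True, which keeps the hourly traffic down to a few DNS lookups.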

So far, we have seen none of the dreaded Delays. And since the CDIFFs
should be "immune" to cache misbehavior, we don't expect to.

Importantly, using Scripted Update (CDIFFs and thus CLDs) seems to work
OK with HAVP, but it sort of has to, since HAVP just uses libclamav to
(re)load the database. (HAVP can also work with other AV engines, and
they all load their databases however *they* please.)

This is not as elegant as locally mirroring the updates, and uses a bit
more Cloudflare bandwidth (although still less in steady state than
using CVDs). I had considered using Polipo as a RAM-only HTTP proxy just
for this, but that would put more load on our gateway machine, and take
up more of my time, so I will defer it until later. (Much later.)


P.S. Thanks to Steve Basford for suggesting Polipo.

On Mon, 17 Dec 2018 19:57:35 +0000
"Joel Esler (jesler)" <jesler at cisco.com> wrote:

> Inline:
> > On Dec 15, 2018, at 6:23 PM, Paul Kosinski <clamav-users at iment.com>
> > wrote:
> > 
> > I don't know if flushing the daily.cvd cache would be adequate,
> > since there are probably some downstream caches that wouldn't
> > follow suit.
> Actually I had someone correct me after I wrote this email, we
> already have been doing that the whole time.  
> > 
> > Pointing *everyone* directly at Cloudflare might be expensive, if
> > that meant millions (or even thousands) of new clients.
> At least it would let us know how many users we have.  Best I can
> tell on a given day, we have 2.5M users daily that hit us.  Obviously
> the unique user count is much higher (as there are several users
> behind one NAT IP, and local mirrors and the like.). Our monthly
> numbers are north of 11M users, (as some people only run freshclam
> once a week or something like that.). I guess what I am trying to say
> is, it may not be that much more traffic.
> > 
> > How does Cloudflare charge Talos for ClamAV? Is the cost only per
> > byte, or is there also a significant per-connection charge. (And if
> > so, is it per HTTP or per TCP connection)? Unless the per-byte cost
> > is near zero (which is unlikely), multiple cdiffs are almost
> > certainly cheaper than one cvd.
> I can't disclose those details, I'm sorry.
> > 
> > For my experiment, I used tinyproxy on our web server machine to
> > access Cloudflare's IAD servers instead of the BOS servers that
> > Comcast routed to, but tinyproxy doesn't do caching. That being the
> > case, I don't much like the idea of having to run squid just to
> > cache what amounts to one cdiff file for each ClamAV update.
> Paul, how about you just point everything you have at us and see if
> it makes a difference?
