[clamav-users] We STILL cannot reliably get virus updates (since new mirrors)

Paul Kosinski clamav-users at iment.com
Mon Jul 2 21:37:51 EDT 2018

Any system whereby new versions of files are announced before they are
actually available to automated downloads is awkward (to say the least).

If, in addition, a server which doesn't have the announced version is
blacklisted by the automated downloader, the whole mechanism can grind
to a halt (as it has for us).
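
As a toy illustration of that failure mode (this is not freshclam's actual
code; the mirror names, version numbers and the simulate_run helper are all
made up), compare blacklisting an out-of-sync mirror with merely skipping
it for the current run:

```python
def simulate_run(mirrors, announced, blacklist, blacklist_on_stale):
    """One simulated update run; returns the mirror that served the update, or None.

    mirrors maps a hypothetical mirror name to the version it currently holds.
    """
    for name, version in mirrors.items():
        if name in blacklist:
            continue  # a blacklisted mirror is never asked again
        if version >= announced:
            return name  # this mirror has the announced version
        if blacklist_on_stale:
            blacklist.add(name)  # the mirror is penalized for merely being behind
    return None


announced = 24723  # version announced before the mirrors have it (made-up number)
stale = {"mirror-a": 24722, "mirror-b": 24722}
synced = {"mirror-a": 24723, "mirror-b": 24723}

# Policy 1: an out-of-sync mirror is blacklisted.
blacklist = set()
first_try = simulate_run(stale, announced, blacklist, True)
later_try = simulate_run(synced, announced, blacklist, True)

# Policy 2: an out-of-sync mirror is only skipped for this run.
no_blacklist = set()
simulate_run(stale, announced, no_blacklist, False)
later_try_skip_only = simulate_run(synced, announced, no_blacklist, False)

print(first_try, later_try, later_try_skip_only)
```

Under the first policy the later run still gets nothing even though every
mirror has caught up, because nobody is left to ask.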

Even if a server which is out of sync (i.e., behind) is not
blacklisted, but merely temporarily skipped, it uses extra bandwidth in
the current scheme. In the case of daily.cvd, the only way freshclam
detects that the server is out of sync is by downloading the whole file
(currently about 47 MB) -- the waste of bandwidth is enormous. For
example, our logs this afternoon show 15 complete downloads of
daily.cvd over about 1 hour. Of these, all but the last failed because
the mirror was out of sync. This is why we have recently taken to
deleting mirrors.dat before each freshclam run -- to compensate for the
blacklisting -- and running freshclam 3 times an hour hoping for sync.
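
A rough sketch of the arithmetic, and of what a cheaper check could look
like. It assumes that a .cvd file starts with a 512-byte, colon-separated
text header whose third field is the version, and that a client could fetch
just those bytes (for instance with an HTTP Range request, which the mirrors
may or may not honor). The sample header, the helper names and the announced
version are made up for illustration; this is not how freshclam works today.

```python
CVD_HEADER_SIZE = 512  # assumed fixed header length at the start of a .cvd file


def cvd_version(header: bytes) -> int:
    """Extract the version field from a CVD header (layout is an assumption)."""
    text = header[:CVD_HEADER_SIZE].decode("ascii", errors="replace")
    fields = text.split(":")
    if fields[0] != "ClamAV-VDB":
        raise ValueError("not a CVD header")
    return int(fields[2])


def is_in_sync(header: bytes, announced_version: int) -> bool:
    """True if the mirror's file is at least as new as the announced version."""
    return cvd_version(header) >= announced_version


# Made-up header for illustration, padded with spaces to the assumed length.
sample = b"ClamAV-VDB:02 Jul 2018 12-00 -0400:24722:1000:63:md5:sig:builder:0"
sample = sample.ljust(CVD_HEADER_SIZE)

announced = 24723                       # hypothetical version from the DNS TXT record
mirror_ok = is_in_sync(sample, announced)

# Figures from this afternoon's log: 15 downloads, all but the last failed.
failed_downloads = 15 - 1
daily_cvd_mb = 47
wasted_mb = failed_downloads * daily_cvd_mb              # full-file downloads
header_only_bytes = failed_downloads * CVD_HEADER_SIZE   # header-only checks

print(mirror_ok, wasted_mb, header_only_bytes)
```

With these numbers the failed attempts cost about 658 MB, where header-only
checks would have cost a few kilobytes.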

This behavior is both unreasonable and inefficient.

P.S. Just before I sent this mail, I sent some proposals for how ClamAV
might possibly avoid this behavior.

On Mon, 2 Jul 2018 15:12:56 -0700
Dennis Peterson <dennispe at inetnw.com> wrote:

> The current system announces that a new version of the signatures is
> available before all the mirrors have received the update.  Another
> design option is for ClamAV to upload the updates to all the mirrors
> and then announce the new version. That is not what we have and there
> are good reasons for it.
>
> Freshclam 101 as I understand it:
> The site that announces the version ID via DNS is not the mirror that 
> distributes the new version. In the current system there is lag
> between the announcement in DNS and the availability at the mirrors.
> In the alternate design described above the mirrors are guaranteed to
> have the update when the announcement (DNS result) is updated. It
> also means that many mirrors have the update long before the last
> mirror does, but the freshclam clients don't know to look for it
> until the DNS record is changed.
> The as-built process:
> Freshclam is aware of the currently installed signature version and
> if the next time it runs it learns there is a new version from the
> DNS server it will attempt to retrieve it from a mirror. Freshclam
> has no means of knowing if any of the mirrors it is configured to
> retrieve from is synched until it asks a mirror. Being told it does
> not exist at the first mirror it polls, it gracefully adds a log
> entry and tries the next. It is entirely possible that none of the
> configured mirrors has the update because there is an unavoidable lag
> in the system. It will gracefully stop trying when the last mirror is
> found out of sync. So far, so good. It will try again at the next
> scheduled poll.
>
> Things will go to hell in a hurry though if the mirrors.dat algorithm 
> disqualifies a mirror for being out of sync. I've never looked but I
> hope this is not the case because a mirror that is out of sync is not
> evidence of a broken mirror.
> Not having run a mirror I don't know if the updates are pushed out
> from the repository or if they are pulled by the mirrors. If they
> are pushed there should be little lag updating the first mirror. If
> they are pushed out in parallel to every mirror there will be a big
> burst of bandwidth and lag will be determined by bandwidth. If they
> are pushed out serially there will be lag between the first and last
> mirror update.
> A different push method would push the new version ID to the mirrors
> via a lightweight process (DNS, for example) and each mirror would
> respond by pulling the new signature. This would also create a burst
> of bandwidth, but it would be more gentle than a parallel push.
> If a push is not employed then the repository would have to be polled
> from the mirror and there is lag in the polling process. How often
> does the polling run?
> An intelligent design would create mirror tiers where a subset of
> mirrors are synched (push) quickly from the repository and the next
> tier of mirrors can now update from this block of mirrors rather than
> the repository alone, and this will distribute the load and minimize
> bandwidth induced lag. NIS works in this fashion.
> Another option is to build a tuple space server which can service
> these requests in a massively parallel way.
> dp
> On 7/2/18 7:20 AM, Paul Kosinski wrote:
> > I don't understand your reply. Exactly *how* do we "wait until every
> > mirror is synchronized, become notified, then try".
> >
> > Freshclam is run periodically, automatically (via cron, in our
> > case). Shouldn't it be freshclam's job to do things at the right
> > time? And how would *it* know when all mirrors are synced? Is it
> > Talos that populates the mirrors? Then Talos shouldn't update the
> > DNS TXT records until *all* mirrors are ready.
> >
> > P.S. The client's mirrors.dat file is updated in 18 different
> > places in manager.c, which is in the freshclam subsystem.
> >
> >
> > On Sun, 1 Jul 2018 21:11:29 -0700
> > Dennis Peterson <dennispe at inetnw.com> wrote:
> >
> >> What makes it a problem? You can never dl it until it is available,
> >> so the problem is you become aware of it too soon. But think about
> >> what that means. Your choices are to know immediately when an
> >> update is available and try to get it, or wait until every mirror
> >> is synchronized, become notified, then try. The first choice is a
> >> crapshoot you might win. The second choice isn't a crapshoot but it
> >> also doesn't save time. Remembering that all this is automated, the
> >> result is actually some uninteresting log entries.
> >>
> >> It would be interesting to know if an update notice is sent to all
> >> mirrors in the fashion of a DNS notification to slaves which would
> >> cause a parallel pull, or if the update itself is pushed, and what
> >> the process is for updating the client mirrors.dat file.
> >>
> >> dp
> >>
> >> On 7/1/18 9:01 PM, Al Varnell wrote:
> >>> Seems to me that it's only a problem if it takes a significant
> >>> amount of time between the DNS update and the mirror updates. I
> >>> don't have a good feel for how long that is from the postings so
> >>> far, but it does sound like it may have increased as a result of
> >>> the move from ClamAV mirrors to the ClamAV CDN.
> >>>
> >>> Sent from my iPad
> >>>
> >>> -Al-
> >>>
> >>>> On Jul 1, 2018, at 20:38, Dennis Peterson <dennispe at inetnw.com>
> >>>> wrote:
> >>>>
> >>>>> On 7/1/18 8:24 PM, Paul Kosinski wrote:
> >>>>> My conclusion is that the cause of this is a typical race
> >>>>> condition: the DNS TXT record is updated before Cloudflare has
> >>>>> propagated the new cvd file to all the mirrors.
> >>>>>
> >>>>>
> >>>> Is this a problem?
> >>>>
> >>>> dp
