[clamav-users] excluding a URL from "heuristics" scanning

joe a joea-lists at j4computers.com
Thu Aug 11 20:01:38 UTC 2022


On 8/11/2022 2:02 PM, joe a wrote:
> On 8/11/2022 1:17 PM, G.W. Haywood via clamav-users wrote:
>> Hi there,
>>
>> On Thu, 11 Aug 2022, joe a wrote:
>>
>>> A while back we discussed excluding some URLs from triggering the 
>>> heuristics scan.  That seemed to work.  Postfix, spamassassin, and 
>>> clamav are in use.
>>>
>>> Now it seems some additional URLs are involved.  Perhaps I am doing 
>>> something wrong here.
>>>
>>> I have been determining the offending URLs by examining the entire 
>>> email using:
>>>
>>> clamscan --debug --file-list=SFILE --log=RESULT.txt 2> result.txt
>>>
>>> then looking for offenders using:
>>>
>>> grep -iB4 "Phishing scan result: URLs are way too different" result.txt
>>>
>>> then entering the URL seen in "Real URL: http://some.url" into 
>>> "/var/lib/clamav/somefile.wdb" and restarting clamd (systemctl 
>>> restart clamd.service).
>>>
>>> I would presume that re-scanning as above should no longer flag the 
>>> offending URL(s)?
>>
>> You presume a lot.  The documentation seems to say otherwise:
>>
>> https://docs.clamav.net/manual/Signatures/PhishSigs.html#wdb-format
>>
> 
> Well!
> 
> Thanks for the direct links.  The content appears a bit different from 
> what I recall when I attempted to decipher it some months back.
> 
> Might even prove enjoyable wading through it, were I an S&M enthusiast.
> 
> 

I do not understand why, when entering more than one URL, the first line 
in my "exclude" file, "/var/lib/clamav/ImaOK2day.wdb", seems to match 
when entered in plain text, while subsequent lines seem to require 
actual regex notation ("." escaped), with only the domains entered.
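For what it may be worth, the PhishSigs documentation linked earlier in this thread describes two distinct entry types in .wdb files, which could account for the plain-text versus regex difference. A sketch (the hostnames below are placeholders, not real entries):

```
M:real.example.com:displayed.example.com
X:.+\.example\.com([/?].*)?:.+\.example\.org([/?].*)?:17-
```

If I read the docs correctly, M: lines take plain real/displayed hostname pairs, while X: lines are full regexes matched against the whole URL, hence the escaped dots and the optional path tail, plus a functionality-level suffix.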

At least, that is what it seems to take to run clean when re-scanned in 
debug mode.

To add to the above: I found a few recent emails containing the URLs 
from the first entry mentioned above, and they were flagged.  Those 
emails passed without notice when scanned as above.  I removed that 
first entry, scanned again, and the emails were flagged.  I then entered 
those URLs again, as the first line, this time in regex notation ("." 
escaped, no "http://" or "https://"), scanned again, and they were not 
flagged.
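If the regex form is what clamd actually honors, it may be less error-prone to generate those entries mechanically rather than escaping dots by hand. A minimal Python sketch, assuming the X-entry shape from the PhishSigs docs linked above (the function name and hostnames are my own, purely illustrative):

```python
import re

def wdb_x_entry(real_host: str, displayed_host: str) -> str:
    """Build an X: (regex) whitelist line from two plain hostnames.

    re.escape turns '.' into '\\.' so the hostname matches literally;
    the ([/?].*)? tail allows any path or query string after the host.
    """
    real = re.escape(real_host) + "([/?].*)?"
    displayed = re.escape(displayed_host) + "([/?].*)?"
    return f"X:{real}:{displayed}:17-"

print(wdb_x_entry("some.url", "click.example.net"))
# -> X:some\.url([/?].*)?:click\.example\.net([/?].*)?:17-
```

Appending the generated line to the .wdb file and restarting clamd, as described above, should then behave the same as the hand-escaped entries.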
