[clamav-users] Kindly help in create unofficial signature

Kris Deugau kdeugau at vianet.ca
Mon Sep 21 16:23:15 UTC 2020


Dismas Axel (Thomas) via clamav-users wrote:
> 3) I ran the command:
> 
> cat Returned_Swift Copy,PDF.tar.xz | sigtool --hex-dump | head -c 2048 > Returned_Swift_Copy.ndb

If you don't have multiple similar but not quite identical samples, and 
you're not familiar with the structure of Windows executables, I'd 
advise against this.  It's likely to either trigger false positives, or 
match so little live traffic that you'd be no better off than if you had 
just created a hash signature.

Locally, I generally just create hash signatures unless I have a couple 
of samples that look to be similar.
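
For reference, a hash signature is just one line of the form 
hash:filesize:name, and sigtool --sha256 will normally generate it for 
you.  The snippet below is only a minimal sketch of what ends up in the 
.hsb file, written in Python purely for illustration.  The function name 
hsb_line and the signature name passed to it are made up for the 
example.

```python
import hashlib


def hsb_line(path, name):
    """Format one hash-signature line: SHA-256 hex digest, file size, name."""
    with open(path, "rb") as fh:
        data = fh.read()
    # Fields are colon-separated: HashString:FileSize:MalwareName
    return "%s:%d:%s" % (hashlib.sha256(data).hexdigest(), len(data), name)
```

Appending the returned line to a local .hsb file in the database 
directory should be enough for clamscan to pick it up, assuming the 
usual database layout.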

I use the attached Perl script to read a set of files and spit out the 
hex dump with wildcards in place of the bytes that are different between 
files at that exact byte position.  Adjust the $baseoffset and 
$fromstart variables to shift the starting point forward and back 
through the file, or to work from the end of the file instead of the 
beginning.  (I originally created it to automate signature creation on 
files that had variable chunks of different data starting ~30-60 bytes 
or so in, but had big long runs of identical bytes starting from the end.)
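
To make the idea concrete, here is a rough Python sketch of the same 
position-by-position comparison (not the attached script itself).  It 
assumes the samples are already in memory as bytes, compares only up to 
the length of the shortest sample, and always emits a wildcard where 
bytes differ.  The name merge_samples is hypothetical.

```python
def merge_samples(samples):
    """Return a hex pattern with wildcards where the samples disagree."""
    length = min(len(s) for s in samples)  # assumption: stop at shortest sample
    out = []
    run = 0  # length of the current run of differing byte positions
    for i in range(length):
        if len({s[i] for s in samples}) > 1:
            run += 1
            continue
        if run:
            # one differing byte becomes ??, a longer run becomes {n}
            out.append("??" if run == 1 else "{%d}" % run)
            run = 0
        out.append("%02x" % samples[0][i])
    # a trailing run of differing bytes is dropped rather than emitted
    return "".join(out)
```

For example, the two four-byte inputs 4d5a9000 and 4d5a9100 come out as 
4d5a??00.  Like the Perl script, this does nothing to check that the 
result is a usable signature.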

Note that many sets of files will quickly turn into one giant wildcard, 
or a generic match on any Windows executable.  It also does no checking 
for whether the result is actually an acceptable ClamAV signature.  You 
may want to use clamscan --leave-temps to extract whatever extractable 
subsections it can, and build signatures based on those (I sometimes do 
this with sets of Office documents).

If you're looking to be more aggressive about blocking "possibly 
malicious things in archive files", I'd suggest the Sanesecurity 
"Foxhole" signatures.  They simply match on the filename extension of 
the file inside the archive or disk image.  If you're sure you'll never 
need to send or receive Windows executables or a couple of document file 
types wrapped in an archive file, they'll block a lot of 0-day junk.

-kgd
-------------- next part --------------
#!/usr/bin/perl
# generate clamav sigs from variant virus files

use strict;
use warnings;
use Data::Dumper;

my $trimextra = 0;
my $reffile = shift @ARGV;
my $refsig = qx { sigtool --hex-dump < "$reffile" };

my $baseoffset = 0;
# work from the front or the back of the file?
my $fromstart = 1;

my $hs = 0;
# number of differing byte values at one position at which to emit a
# wildcard instead of a limited (aa|bb) alternation
my $ndiff = 1;

if ($fromstart) {
  $refsig =~ s/^.{1024}// while $hs++ < $baseoffset;
} else {
  $refsig =~ s/.{1024}$// while $hs++ < $baseoffset;
}
if ($fromstart) {
  $refsig =~ s/^(.{16384}).+/$1/;
} else {
  $refsig =~ s/.+(.{16384})$/$1/;
}

my @refbytes = ($refsig =~ /../g);

my @basesigs;
foreach my $vfile (@ARGV) {
  my $sig = qx { sigtool --hex-dump < "$vfile" };
  $hs = 0;
  if ($fromstart) {
    $sig =~ s/^.{1024}// while $hs++ < $baseoffset;
  } else {
    $sig =~ s/.{1024}$// while $hs++ < $baseoffset;
  }
  if ($fromstart) {
    $sig =~ s/^(.{16384}).+/$1/;
  } else {
    $sig =~ s/.+(.{16384})$/$1/;
  }
  my @foo = ($sig =~ /../g);
  push @basesigs, \@foo;
}

my @outsig;
for (my $i = 0; $i < 8192; $i++) {
  my @tmp;
  push @tmp, $refbytes[$i];
  no warnings qw (uninitialized);
  foreach my $b (@basesigs) {
    next if !defined($b);
    next if !defined($b->[$i]);
    push @tmp, $b->[$i] if !grep /^$b->[$i]$/, @tmp;
  }
  if ($#tmp) {
    if ($#tmp >= $ndiff) {
      push @outsig, '{1}';
    } else {
      push @outsig, '('.join('|', @tmp).')';
    }
  } else {
    push @outsig, $tmp[0];
  }
}

my $n = 0;
my $i = 0;
foreach my $byte (@outsig) {
  next if !defined($byte);
  if ($byte eq '{1}') {
    $n++;
    next;
  } else {
    if ($n) {
      print "{$n}" if $n > 1;
      print '??' if $n == 1;
    }
    print $byte;
    $n = 0;
  }
}
print "\n";

