Antivirus software (Al Viro; Theodore Tso)

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro
Date: Wed, 06 Aug 2008 01:51:48 UTC
Message-ID: <fa.2t8IdR9TdaGpORThuyOKN8z3EA4@ifi.uio.no>

On Tue, Aug 05, 2008 at 08:46:00PM -0400, Rik van Riel wrote:
> My real worry is that the anti-virus companies have been working
> with an enforcement policy that has been evolving slowly from the
> DOS days, while today's threat model has changed considerably.

... and which also doesn't take into account some of the facilities which
Linux has, that DOS/Windows does not have.

Part of the problem I suspect is that the AV folks have managed to get
CIOs to believe that all computer systems need to have anti-virus
software, of the same design that is needed for DOS/Windows systems.
This state of delusion is so bad that apparently some AV engineers
aren't even willing to reason from first principles what is necessary
or not to maintain a secure system.

And arguably, if the goal is security theater, much like the security
lines in airports, perhaps it doesn't matter.  If there are silly
CIOs that are willing to pay for such a thing, regardless of whether
or not it is actually *necessary* to maintain security, one school of
capitalism would say it doesn't matter if it actually provides any
functional value or not.

On the other hand, it seems pretty clear there are plenty of LKML
developers who aren't buying it.  :-)

It may be helpful to separate the threat model into at least three
different scenarios:

	The Linux Desktop (where clueless users may be tricked into
		running malware).

	The Linux File Server (where it is *highly* unlikely to have
		active running malware, since there are no clueless
		users running on said file server), but where malware
		may be stored and read over CIFS, NFS, etc.

	The Linux Mail server is really a restricted case of the Linux
		Fileserver; where the only way in is SMTP, and the
		only protocol out is IMAP/POP.

Clamav arguably does a very nice job for the third case.  And the
number of ways in and out for a Linux fileserver is sufficiently small
(and there are no clueless users to start the malware program
running), that it's relatively easy to reason about.

In the Linux Desktop case, you do have to worry about clueless users,
but in general you don't have to worry about serving CIFS or NFS on
such boxes.

It seems that the AV folks are trying to argue for a worst case
scenario --- one where you have a clueless user, *and* you have a root
compromise, *and* it is also simultaneously serving as a high output
fileserver.  #1, I think it is questionable whether this is a
reasonable model, and #2, if root is compromised, no amount of
scanning software will help you, since the malware can simply directly
attack and disable the scanning software.

But it is specifically this sort of threat analysis and explicit
detailing of the assumptions of what capabilities the attacker has
which is critical for proceeding.  The fact at least one AV engineer
thinks it's pointless to do this sort of low-level design is
disappointing.

						- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro
Date: Tue, 05 Aug 2008 23:58:22 UTC
Message-ID: <fa.tlGdVvzvgY7tAprI0yi0jIFvIYo@ifi.uio.no>

On Tue, Aug 05, 2008 at 06:12:34PM -0400, Press, Jonathan wrote:
>
> I don't think I'm stupid, but frankly I don't understand the point
> of the questions being asked in the last three responses to my
> statement.  I don't know why they are relevant, and I don't know how
> to answer them in a framework that we can all understand at the same
> time.  What is my threat model?  Naively stated, it is that there is
> a file on a machine that might do damage, either there or elsewhere,
> and I want to find it and get rid of it in the most efficient way.
> I am not defining the nature of the damage or the mechanism.

This is actually quite shocking to me.  You don't know how to define
the threat model?  And you call yourself in the security business?
Read some books or essays by Bruce Schneier.  A good one might be his
recent book, "Beyond Fear: Thinking Sensibly About Security In An
Uncertain World".

The naive refusal to think about threat models is why we have to
submit to really insane, useless, "security theater" every time we get
on an airplane and have to take off our shoes and throw our bottled
water into a huge heap in front of the security line.  (If they really
thought the water bottles could contain explosives, why leave them in
a huge pile in front of the TSA employees?  :-)

If the goal is to make sure we are proof against malware, we need to be
very clear about the whys and wherefores about how the file might have
gotten there.  And if you are going to be serving that file a million
times a day, does it really make sense to block the open a million
times a day, or do you make sure that you notice when it gets
corrupted in the first place?

And security is not an absolute.  Just as the terrorists win if they can
induce the White House to shred the constitution and force us all to
live in a constant state of fear, it is also pointless to induce
people to install software that horrifically slows down their server
so badly that you can't get anything done.

If people in the AV industry don't know how to think about threat
models, it says a lot about their competence as security engineers.
And I say this as someone who was team lead of Kerberos at MIT, and
was the chair of the IP Security working group at the IETF (the
standards body for the Internet), and who has served on the Security
Area Directorate (alongside Bruce Schneier) at the IETF.

     		 	    	  	       	   - Ted

From: Al Viro <viro@ZenIV.linux.org.uk>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro to a
Date: Tue, 05 Aug 2008 21:06:01 UTC
Message-ID: <fa.cK8tvGUDN/6+dhqnu3FvOMl1NYs@ifi.uio.no>

On Tue, Aug 05, 2008 at 01:38:32PM -0700, Arjan van de Ven wrote:

> This does assume that at some point you have a transition from "ok"
> program to the first time you run a "bad" one (via exec or open); and
> that you catch it at that point.
>
> I don't yet buy the argument "but what if the virus corrupted your ld
> preload", because if it can do that your own virus scanner is also
> corrupted.
>
>
> Can you explain what gap is left after you do these two things?

Actually, the real question (and the reason why I question the personal
integrity of the people in "AV community" pushing that kind of trash)
is very simple:

Where Is Your Threat Profile?

Various people had been asking for _years_ to define what the hell are you
trying to prevent.  Not only there'd been no coherent answer (and no, this
list of requirements is _not_ that - it's "what kind of hooks do we want"),
you guys seem to be unable to decide whether you expect the malware in
question to be passive or to be actively evading detection with infected
processes running on the host that does scanning.

Moreover, the answer seems to be changing back and forth to suit the needs
of the moment in the argument.  Slightly exaggerated it goes like this:

-- Why don't you do $FOO?
-- Running virus would be able to evade $FOO, of course!
-- No shit, Sherlock; it would also be able to evade much more intrusive $BAR
you are proposing; here's how <obvious evasion method>
-- Oh, but that's not a problem; think of Linux server with Windows clients
and Windows viruses...

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro to a linux interface
Date: Wed, 06 Aug 2008 21:53:25 UTC
Message-ID: <fa.uXugm//GdehBAW9R3BZYMyCGaPo@ifi.uio.no>

On Wed, Aug 06, 2008 at 05:28:01PM -0400, Eric Paris wrote:
> > In this scenario, are you positing that you are worried about Windows
> > malware, or Linux malware?  What OS are the clients running?  I will
> > note that Windows has such a sucky NFS implementation that nearly all
> > Windows clients will be running CIFS/SMB, not NFS
>
> I believe I specifically did not make any such claims at all about the
> client OS and merely claimed the intended target was not the linux NFS
> server.

I know you didn't say; that's why I asked.  :-)

I dispute your assertion that this question is irrelevant.  It's
highly relevant, because if it's Windows clients, they ***won't*** be
using NFS.

As for other large desktop OS's, that would be MacOS and Linux;
anything else?  And the big, huge, vast difference between Windows and
MacOS/Linux is that with Windows, in practice people ran with
Administrator privileges because most applications (including at one
point Microsoft Visual Studio :-) died and/or completely refused to
install if you didn't have Administrator privileges.  So people very
regularly ran with Root privs.  With Vista, you no longer run with
root privileges by default --- instead, applications still assume they
have Administrator privileges, causing the Really Annoying Popup boxes
to pop up each time the application needs to do something that requires
privileges --- which has trained users to mindlessly click "OK" each
time the Annoying Popup Box comes up.

Given that MacOS and Linux don't have these flaws with respect to
applications regularly expecting root privileges, will you admit that
perhaps some of the extreme scanning tactics that were required by
AntiVirus vendors might be not as necessary for "other desktops"?

Asking the question is important because if they are spending all of
their time on Windows virii, then your "elementary threat" is really
an "elementary strawman".  Or, at the very least, it's a low priority
effort, since the number of virii out in the field for Linux and MacOS
desktops is in the noise compared to Windows.  I know that it's
convenient for AV vendors to claim in their marketing literature that
this is only because Windows is more popular, but while that might be
part of it, it is also true that there are significant, structural
differences between Windows and those other large desktop candidates.

> Your argument is irrelevant for the threat given and you seem to have
> contorted the actual point of the statements to fit something else.  But
> I'm sure you are a fan of multiple layers of security that you don't
> actually believe that "just check on the clients" is the right thing to
> do.

Giving up my water bottles and having to take off my shoes at airport
security has been justified in the name of "multiple layers of
security".  No, I'm NOT a fan of mindlessly using "defense in depth"
as an excuse for arbitrary amounts of security and giving up arbitrary
amounts of my private data.  You need to prove to me that from a cost
benefit tradeoff it's really worth it.

   	       	    	    	     	  	 - Ted

From: Theodore Tso <tytso@MIT.EDU>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro to a
Date: Wed, 06 Aug 2008 15:23:48 UTC
Message-ID: <fa.kCJNPJjLo6/k+c3bjfllUGLOihQ@ifi.uio.no>

On Wed, Aug 06, 2008 at 03:16:02PM +0100, tvrtko.ursulin@sophos.com wrote:
>
> You can't do something like inotify("/") (made up API) but you have to set
> up a watch for every directory you want to watch. That seems like a waste
> of resources.
>
> Then you get back a file name, if you want to report it or attempt* to
> scan it you have to build a pathname yourself, which means you have to
> maintain the whole tree of names in memory. Even bigger waste.

Yes, it would be nice if inotify gave you back a full pathname and
where a single watch would return all changes anywhere in the
filesystem tree.  I'd recommend that folks try to create such a patch.
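
To make the cost in the quoted complaint concrete, here is a minimal
Python sketch of the bookkeeping a scanner has to carry on top of
per-directory watches.  It does not call inotify at all: WatchTable,
add_tree and resolve are hypothetical names, and the integer watch
descriptors are handed out by the sketch itself, standing in for what
inotify_add_watch(2) would return for each directory.

```python
import os
import tempfile


class WatchTable:
    """Userspace map from watch descriptor to directory path.

    A model only: the descriptors are made up by this class, standing in
    for the values inotify_add_watch(2) would return for each directory.
    """

    def __init__(self):
        self._next_wd = 1
        self.paths = {}  # wd -> directory path, one entry per directory

    def add_tree(self, root):
        # One watch per directory: there is no single recursive watch.
        for dirpath, _dirnames, _filenames in os.walk(root):
            self.paths[self._next_wd] = dirpath
            self._next_wd += 1

    def resolve(self, wd, name):
        # An event carries only (wd, name); the full path is rebuilt here.
        return os.path.join(self.paths[wd], name)


def demo():
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "a", "b"))
        os.makedirs(os.path.join(root, "c"))
        table = WatchTable()
        table.add_tree(root)
        # Find the descriptor that was assigned to root/a/b.
        target = os.path.join(root, "a", "b")
        wd = next(w for w, p in table.paths.items() if p == target)
        rebuilt = table.resolve(wd, "report.pdf")
        return len(table.paths), os.path.relpath(rebuilt, root)


if __name__ == "__main__":
    print(demo())
```

The table grows with the number of directories, not the number of
files, but on a large fileserver that is still a lot of state to keep
in memory just to turn an event back into a name.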

> When I say attempt to scan it above I mean that we are back into the
> pathname territory. It is not guaranteed we will be able to open and scan
> using that pathname. I don't know what inotify reports with chroots and
> private namespaces, but it can certainly fail with NFS and root_squash. So
> it is less effective as well as being resource intensive.

Linux's namespace support does break a lot of traditional paradigms.
I'll note the TALPA "requirements" are broken themselves since they
refer to pathnames.

Furthermore, I assume you'll always want to do the scanning in
userspace; the virus signature files for Windows are ***huge***.  And
if you are going to be scanning for Windows virii on the argument that
you want to stop malware on fileservers, I don't think you want to put
all of that code into the kernel.  (I'll note that all that code
complexity leads to bugs, which will in kernel code cause system
crashes.  One company's Linux AV code --- I won't say which --- almost
led to a rather big and public customer abandoning a Linux
deployment because said proprietary, badly/disastrously written,
kernel code was leaking a small amount of memory on every file open,
and no one could figure out why their file server was crashing every
five days or so.  I was called in to rescue said customer before they
cancelled the contract in disgust, and I traced it back to a
proprietary AV kernel module.  What fun...)

So if we are going to have to deal with namespaces, I suspect the best
we can do for any interface (whether it is inotify based or not) is to
have it return pathnames that are valid in the namespace that the
program calling said interface happens to be running in.  If necessary
the AV program can be given access to a highly privileged namespace
where all mounts are visible.  And if you want to restrict namespaces
from being created at all, that's a security policy decision that
should be made via the LSM hooks.

As far as blocking opens are concerned, my suggestion there would be
that changes would probably be much more likely to be accepted if they solved
more problems than just what the AV folks need.  For example, think
about hierarchical storage management, and DMAPI.  DMAPI is a total
disaster because it doesn't know about namespaces and so is completely
pathname based (which doesn't work well when you have namespaces).
But a solution which is general enough that it can also be used to
support HSM would probably be a good thing.

Also, it may very well be that instead of one, purpose-specific
interface such as what you suggested in TALPA, it might be much better
if it was a series of different interfaces; and in some cases, some of
the changes might be extensions and improvements to existing
facilities, such as inotify.

Regards,

						- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: TALPA - a threat model?  well sorta.
Date: Wed, 13 Aug 2008 18:16:53 UTC
Message-ID: <fa.D5ARj3AI9lXr/bzjI6ijC9bwhSs@ifi.uio.no>

On Wed, Aug 13, 2008 at 10:39:51AM -0700, Arjan van de Ven wrote:
> for the "dirty" case it gets muddy. You clearly want to scan "some
> time" after the write, from the principle of getting rid of malware
> that's on the disk, but it's unclear if this HAS to be synchronous.
> (obviously, synchronous behavior hurts performance bigtime so lets do
> as little as we can of that without hurting the protection).

Something else to think about is what happens if the file is naturally
written in pieces.  For example, I've been playing with bittorrent
recently, and it appears that trackerd will do something... not very
intelligent in that it will repeatedly try to index a file which is
being written in pieces, and in some cases, it will do things like
call pdftotext that aren't exactly cheap.  A timeout *can* help (i.e.,
don't try to scan/index this file until 15 minutes after the last
write), but it won't help if the torrent is very large, or the
download bitrate is very slow.  One very simple workaround is to
disable trackerd altogether while you are downloading the file, but
that's not a very pleasant solution; it's horribly manual.

Most of this may end up being outside of the kernel (i.e., some kind of
interface where a bittorrent client can say, "look this file is still
being downloaded, so don't bother scanning it unless some process
*other* than the bittorrent client tries to access the file").  And
maybe there should be some other more complex policies, such as the
bittorrent client explicitly telling the indexer/scanner that the file
has been completely downloaded, so it's safe to index it now.
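
As a rough userspace illustration of the two policies just described
(a quiet period after the last write, plus an explicit "still
downloading" / "download complete" hint from the client), here is a
minimal Python sketch.  QuietPeriodScheduler and all of its method
names are hypothetical; time is passed in as a plain number so the
behaviour is easy to follow, and the "unless some other process
accesses the file" rule is deliberately not modelled.

```python
class QuietPeriodScheduler:
    """Defer scanning until a file has been quiet, or is declared complete."""

    def __init__(self, quiet_seconds=15 * 60):
        self.quiet_seconds = quiet_seconds
        self.last_write = {}      # path -> time of the most recent write
        self.in_progress = set()  # paths a downloader asked us to leave alone
        self.complete = set()     # paths declared finished; scan right away

    def note_write(self, path, now):
        self.last_write[path] = now

    def mark_in_progress(self, path):
        self.in_progress.add(path)

    def mark_complete(self, path):
        self.in_progress.discard(path)
        if path in self.last_write:
            self.complete.add(path)

    def due(self, now):
        """Return the paths that should be scanned at time `now`."""
        ready = []
        for path, when in sorted(self.last_write.items()):
            if path in self.in_progress:
                continue  # still downloading; the timeout is ignored
            if path in self.complete or now - when >= self.quiet_seconds:
                ready.append(path)
        for path in ready:
            # Handed out once; a later write will queue the path again.
            del self.last_write[path]
            self.complete.discard(path)
        return ready
```

Whether a client should be trusted to call mark_in_progress() at all
is the policy question, and nothing in this sketch answers it.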

But what this points out is that if you want a good solution, (a) it
probably shouldn't all be in the kernel, since trying to get all of
this complexity into the kernel will be painful, and (b) the policy
about whether or not a bittorrent client should be allowed to say,
"it's OK not to check the file until it's completely downloaded, even
if I am handing out pieces to other people over the network --- after
all the entire file has its own SHA checksum for data integrity
verification" --- is very much a policy question where different system
administrators will come down on different sides about what should and
shouldn't be allowed --- and therefore this kind of policy decision
should ****NOT**** be in the kernel.

> For efficiency the kernel ought to keep track of which files have been
> declared clean, and it needs to keep track of a 'generation' of the scan
> with which it has been found clean (so that if you update your virus
> definitions, you can invalidate all previous scanning just by bumping
> the 'generation' number in whatever format we use).

We have i_version support for NFSv4, so we have that already as far
as the version of the file.  We can have a single bit which means
"block on open" that is stored on a file, and some kind of policy
which dictates whether or not any modification to the file contents
should automatically set the bit.

However, questions of which version of virus database was used to scan
a particular file should be stored outside of the filesystem, since
each product will have its own version namespace, and the questions of
what happens if a user switches from one version checker to another is
going to be messy.  So better that this be done in userspace, and that
this information be stored in some on-disk database.

						- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: TALPA - a threat model?  well sorta.
Date: Wed, 13 Aug 2008 19:29:57 UTC
Message-ID: <fa.icZtg4DAzdRyR4WldkOHAa9ULz4@ifi.uio.no>

On Wed, Aug 13, 2008 at 03:02:48PM -0400, Eric Paris wrote:
> I never suggested putting a scanner in kernel.  Sound like you want the
> "allow don't cache" response from your userspace scanner while this is
> going on.  The kernel doesn't need to be making decisions about when to
> send events, nor should userspace tell the kernel not to send events.
> Its up to whatever the scanner is to agree not to actually do any
> scanning...

And if the system isn't running a virus checker, but just a file
indexer (ala tracker), it shouldn't go to userspace at all.  In that
case all that is necessary is an asynchronous notification.

Also something else that is needed is support for multiple clients.
(i.e., what happens if the user runs two virus checkers, or a virus
checker plus a hierarchical storage manager driving a tape robot, or
all of the above plus trackerd --- where some clients need to block
open(2) access, and some do not need to block open(2) --- and in the case
of HSM, ordering becomes important; you want to retrieve the file from
the tape robot first, *then* scan it using the virus checker.  :-)
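
A minimal Python sketch of what "multiple clients, some blocking, with
ordering" could look like, purely as an illustration.  Client,
dispatch_open and the priority field are hypothetical names, not part
of any proposed interface; the sketch assumes that blocking clients run
in priority order, that any of them can veto the open, and that
non-blocking clients are only told about opens that go ahead.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(order=True)
class Client:
    priority: int                                   # lower runs first
    name: str = field(compare=False)
    blocking: bool = field(compare=False)
    handler: Callable[[str], bool] = field(compare=False)


def dispatch_open(path: str, clients: List[Client], log: List[str]) -> bool:
    """Return True if the open may proceed."""
    # Blocking clients run one after another, in priority order.
    for client in sorted(c for c in clients if c.blocking):
        log.append(f"block:{client.name}")
        if not client.handler(path):
            return False  # a blocking client vetoed the open
    # Everyone else only gets a notification; the result is ignored.
    for client in clients:
        if not client.blocking:
            log.append(f"notify:{client.name}")
            client.handler(path)
    return True


def demo():
    online = set()  # files the pretend HSM has recalled from tape

    def hsm(path):
        online.add(path)
        return True

    def av(path):
        # Can only scan data that is actually there.
        return path in online and "eicar" not in path

    def indexer(path):
        return True

    clients = [
        Client(1, "av", True, av),
        Client(5, "trackerd", False, indexer),
        Client(0, "hsm", True, hsm),
    ]
    log = []
    allowed = dispatch_open("/srv/report.pdf", clients, log)
    return allowed, log
```

In demo() the HSM client is registered after the AV client but still
runs first because of its priority, which is the ordering constraint
described above.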

> No.  How in the heck can some out of kernel database store information
> about what inodes have been scanned in any even slightly sane way?  And
> people think the race between open and read is too large and you suggest
> moving clean/dirty marking to a userspace database?  I MUCH prefer my
> (and it sounds like arjan agrees) clean/dirty versioned flag in inode.

Don't ask me; I think most AV checkers for linux are security theater
and not very much use (other than making money for the AV company's
shareholders) anyway.  I thought you were the one who wanted to record
information about which version of the virus db a particular file had
been scanned against.  The place where I can see this being useful is
what happens when you get a new virus DB, and so you need to start scanning
all of the files in your 5TB enterprise file server --- and then the
system crashes or it needs to be taken down for scheduled maintenance.

You want to have *some* off-line database for storing this
information, since it would be silly to want to have the first thing
that happens after a new virus DB gets downloaded is to iterate over
the entire filesystem, clearing a persistent "clean" bit --- that
would take *forever* on a 5TB fileserver; and what happens if you
crash in the middle of clearing the "clean" bit?  And if the system
gets shutdown in the middle of the scan, you need some way of
remembering which inodes have been scanned using the "new" db, and
which ones haven't yet been scanned via the new virus db.  All of this
should be kept in userspace, and is strictly speaking Not Our Problem.
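
One way such an off-line database might look, as a minimal Python
sketch using the standard sqlite3 module.  The table layout and the
function names (open_scan_db, record_clean, pending) are assumptions
made for illustration, and so is the idea of a per-product signature
version string such as "sigs-2".  Inode numbers are only unique within
one filesystem, so the sketch assumes a single filesystem.

```python
import sqlite3


def open_scan_db(path=":memory:"):
    """Open (or create) the scanner's private database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS scanned ("
        " ino INTEGER PRIMARY KEY,"
        " sig_version TEXT NOT NULL)"
    )
    return db


def record_clean(db, ino, sig_version):
    # Committed per file, so a crash loses at most the scan in flight.
    db.execute(
        "INSERT OR REPLACE INTO scanned (ino, sig_version) VALUES (?, ?)",
        (ino, sig_version),
    )
    db.commit()


def pending(db, all_inodes, sig_version):
    """Inodes not yet scanned against `sig_version`, in the given order."""
    rows = db.execute(
        "SELECT ino FROM scanned WHERE sig_version = ?", (sig_version,)
    )
    done = {ino for (ino,) in rows}
    return [ino for ino in all_inodes if ino not in done]
```

Switching to a new signature set touches nothing on disk: every inode
simply becomes pending again, and after a crash pending() picks up
where the interrupted pass left off.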

I'm just arguing that there should be absolutely *no* support in the
kernel for solving this particular problem, since the question of
whether a file has been scanned with a particular version of the virus
DB is purely a userspace problem.

						- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] TALPA - a threat model?  well sorta.
Date: Thu, 14 Aug 2008 15:50:57 UTC
Message-ID: <fa.9t+P608EY4ynZb0ObQndN8/Yx7w@ifi.uio.no>

On Thu, Aug 14, 2008 at 09:48:33AM -0400, Eric Paris wrote:
>
> There needs to be a way to say that an inode in cache needs to be
> rescanned.  3 states this flag can be.  Clean, Dirty, Infected.  The
> current talpa solution involves a global monotonically increasing
> counter every time you change virus defs or make some "interesting"
> change.  If global == inode flag we are clean.  If global == negative
> inode flag we are infected.  if global > inode flag we are dirty and
> need a scan.

"Infected" just means to instantly return an error when the file is
opened or if an already opened file descriptor is read or mmap'ed,
right?  If the file is already mmap'ed, what's the plan?  Send a kill -9
to the process, even if it ends up killing off an emacs or openoffice
process?
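
For reference, the counter comparison in the quoted description can be
written down in a few lines of Python.  This is only a reading of the
three rules as quoted; classify is a hypothetical name, and the sketch
assumes the global counter starts at 1 so that a flag of 0 can stand
for "never scanned".

```python
def classify(global_gen: int, inode_flag: int) -> str:
    """Map a per-inode flag to clean/infected/dirty for a given generation."""
    if inode_flag == global_gen:
        return "clean"
    if inode_flag == -global_gen:
        return "infected"
    # Any other value is stale: it was set under an older generation.
    return "dirty"


def demo():
    gen = 3
    before = [classify(gen, f) for f in (3, -3, 2, 0)]
    gen += 1  # new virus definitions: one increment invalidates everything
    after = [classify(gen, f) for f in (3, -3, 2, 0)]
    return before, after
```

Note that bumping the counter also turns a previously "infected" inode
back into "dirty", so it would be rescanned under the new definitions.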

> > That seems fair; if it turns out there is an AV product that wants to
> > optimize this a bit further, as long as we provide a persistent inode
> > version/generation number, they can always do their own persistent
> > database in userspace.
>
> exporting i_version might be useful for better userspace caching,
> although I've yet to see any reasonable description of how a userspace
> database can map between data on disk and what they have in userspace.
> How can a userspace process, given 2 file descriptors know they are
> actually the same thing on disk?
>

If a userspace database knows that inode X, i_version Y was checked a
day ago, and inode X still has i_version Y, even if that inode has
been evicted from memory, the contents will be the same absent root
messing about with direct access to the block device.  If there was an
intervening boot, then someone could remove the disk, edit the disk
blocks directly -- but that person could also add a backdoor to the
kernel while they were at it.

If your threat model is, "we do file scanning; that's it", then having
an external database which uses the inode number and i_version as a
tuple makes a lot of sense --- for filesystems where i_version is
getting bumped on every disk write, which is needed to support NFSv4
cache support, anyway.
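
The lookup this implies is tiny.  A minimal Python sketch, with
CheckedSet as a hypothetical name; how userspace would actually read
i_version is an open question in this thread, so the sketch simply
takes it as an argument.

```python
class CheckedSet:
    """Remembers which (inode number, i_version) pairs were found clean."""

    def __init__(self):
        self._checked = set()

    def mark_checked(self, ino, i_version):
        self._checked.add((ino, i_version))

    def is_checked(self, ino, i_version):
        # A write bumps i_version, so the old tuple no longer matches.
        return (ino, i_version) in self._checked
```

Nothing has to be invalidated when a file changes; the stale tuple
just stops matching.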

							- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] TALPA - a threat model?  well sorta.
Date: Thu, 14 Aug 2008 19:17:54 UTC
Message-ID: <fa.+qJ4X2GyUZiZbLn5HrxhcRrELpg@ifi.uio.no>

On Thu, Aug 14, 2008 at 01:29:45PM -0400, Eric Paris wrote:
>
> is i_version an on disk thing?  didn't realize that and just assumed it
> was an in core thing.  I wouldn't have an issue sending i_version to the
> userspace scanner for them to use as they like.
>

It's on-disk for some filesystems, in order to support NFSv4 advanced
caching semantics (which means i_version has to survive a reboot,
which means it has to be on disk).  It is *not* on disk for ext3,
although it is for ext4.

					- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] TALPA - a threat model?  well sorta.
Date: Thu, 14 Aug 2008 19:41:44 UTC
Message-ID: <fa.0Lezb1fC+bXiSmYMMG2vbRx3UMU@ifi.uio.no>

On Thu, Aug 14, 2008 at 03:34:08PM -0400, Christoph Hellwig wrote:
>
> It's not used at all on regular files except for ext4 with a non-default,
> undocumented mount option.  XFS will grow it soon in a similar way as ext4,
> except that it will be documented or I might have even figured out by
> then how to just enable it from nfsd.

We do need a standardized way of enabling it (since it does cost you
something in performance, so not everyone will want it on), and a
standardized way of reading i_version out to userspace.  Maybe a mount
option is the right way to do it, maybe not.

We may want to take this to the linux-fs list and try to get
agreements on these points; the main reason why it's not enabled by
default in ext4 is because the NFSv4 advanced caching code isn't in
common use (is it even in mainline?)

						- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro to alinuxinterfaceforon
Date: Fri, 15 Aug 2008 00:44:21 UTC
Message-ID: <fa.AiUrhslomdSj+AatSL8+nNRDBeg@ifi.uio.no>

On Thu, Aug 14, 2008 at 08:00:05PM -0400, Rik van Riel wrote:
> > Yes, that's the part libmalware.so proposal solves. Given scary number
> > of 0 Linux viruses in wild, it seems to solve the problem pretty well.
>
> If you're trolling, you're not being very good at it.
>
> Just because you cannot easily infect a Linux system from a
> user application does not mean malware cannot do all kinds
> of damage with user privileges.  Think of a key sniffer (using
> the same interface that the X screensavers use) or a spam bot
> running with user privileges.

But Pavel is raising a good question.  In Eric's proposed threat
model, he claimed the only thing that he was trying to solve was
"scanning".  Just file scanning.  That implies no root privileges, but
it also implies that he wasn't worried about malware running with user
privileges, either.  Presumably, that would be caught and stopped by
the file scanner before the malware had a chance to run; that is the
execve(2) system call would also be blocked until the executable was
scanned.

So if that is the threat model, then the only thing libmalware.so
doesn't solve is knfsd access, and it should be evaluated on that
basis.  If the threat model *does* include malware which is **not**
caught by the AV scanner, and is running with user privileges, then
there are a whole host of other attacks that we have to worry about.
So let's be real clear, up front, what the threat model is, and avoid
changing the model around to rule out solutions that don't fit the
initially preconceived one.  That's how you get to the TSA
confiscating water bottles in airport security lines.

	     	   	      	      	       - Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro to alinuxinterfaceforon
Date: Fri, 15 Aug 2008 11:36:26 UTC
Message-ID: <fa.FnyMcgi1FSciq2uoMsX0Wu5W20I@ifi.uio.no>

On Fri, Aug 15, 2008 at 09:35:13AM +0100, Alan Cox wrote:
> We shouldn't need to care what people do with a good interface. What
> matters in your airport example is that at the infrastructure level
> there is a point you can choose to do scanning and we agree where.
> Whether people use this to provide a Starbucks or goons with rubber
> gloves who take away babies milk is an application layer problem.

If it's a good interface that also happens to address HSM/DMAPI
functionality, as well as a more efficient way for trackerd to work, I
agree completely.  I think you will agree the proposed TALPA interface
is a bit too virus-scanner specific, though?  Especially with explicit
talk of specialized (persistent or not) "clean/dirty/infected" bits
that the kernel would store in the inode for the benefit of the AV
scanner?  That's rather optimized for the goons-with-rubber-gloves-
that-make-mothers-drink-their-own-breast-milk-to-prove-it's-not-explosives
crowd, I think...

       	     	      	  	    		   - Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] [RFC 0/5] [TALPA] Intro to
Date: Fri, 15 Aug 2008 13:17:13 UTC
Message-ID: <fa.YXBjFlRoHYSIJNtEs/t+KDyRsJk@ifi.uio.no>

On Fri, Aug 15, 2008 at 08:57:48AM -0400, Press, Jonathan wrote:
> That may just be a question of terminology.  If the bits are construed
> not as clean/dirty/infected, but as "I care about this file" vs. "I
> don't care about this file" then the rubber gloves come off.

Sure, as long as we're very clear about the semantics of the bits.  If
the bits are not persistent, but get dropped if the inode is
ever evicted from memory, and it's considered OK, or even desirable,
to rescan the file when it is brought back into memory, that may be
acceptable to the rubber gloves folks (make people go through lots
of superfluous security scans, even when they are transferring between
flights --- security is always more important than passengers'
convenience!), but perhaps not to other applications such as file
indexers, who would view rescanning files that have already been
scanned, and have not been modified, as a waste of time, battery, CPU
and disk bandwidth, etc.

As I understand it, the TALPA proposal had non-persistent
clean/dirty/infected bits.
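
To see what non-persistence costs, here is a toy Python model of an
inode cache with LRU eviction.  count_scans and its parameters are
hypothetical, the files are assumed never to be modified, and the
model is far simpler than the real inode cache; it only counts how
often a scan is triggered.

```python
from collections import OrderedDict


def count_scans(accesses, cache_slots, persistent):
    """Count scans for a sequence of inode accesses.

    With persistent=False the "scanned" mark lives only in the cached
    inode and is lost on eviction.  With persistent=True it survives.
    """
    in_core = OrderedDict()  # inode number -> True, kept in LRU order
    scanned_ever = set()
    scans = 0
    for ino in accesses:
        if ino in in_core:
            in_core.move_to_end(ino)
            continue  # still cached, mark still set: no scan
        # The inode is being brought (back) into memory.
        if not (persistent and ino in scanned_ever):
            scans += 1
            scanned_ever.add(ino)
        in_core[ino] = True
        if len(in_core) > cache_slots:
            in_core.popitem(last=False)  # evict the least recently used
    return scans
```

With three unmodified files accessed in rotation and room for only
two inodes in the cache, the non-persistent variant scans on every
access while the persistent one scans each file once.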

						- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] TALPA - a threat model?  well sorta.
Date: Fri, 15 Aug 2008 17:56:30 UTC
Message-ID: <fa.7B9cpGb876vkCwdPVeUcC8LDiwM@ifi.uio.no>

On Fri, Aug 15, 2008 at 02:18:12PM +0100, douglas.leeder@sophos.com wrote:
> > - New infection makes it onto the machine before the signatures have
> > caught up with it.  This also happens.  There is an ongoing PR race
> > among AV vendors about who was faster on the draw to get out signatures
> > to detect some new malware.  The fact that this race exists reflects
> > the reality that there is some window during which new malware will
> > make it onto some number of machines before the scanners catch up.

Let's go back to the threat model.  The Threat Model which Eric Paris
has suggested is that we are only trying to solve the Scanning
Problem.  Just Scanning.

That implies if the malware has been written to the disk, we will
catch it once AV catching is turned on and the user attempts to run or
otherwise access the file with the bad content.  However, if the
malware starts running, then regardless of whether the malware is
running with user privileges, or manages to get root privileges via
some buffer overflow that wasn't caught via
LSM/SELinux/AppArmor/whatever, this is out of scope of Eric's proposal.

Are we agreed on that?  There may be other components of the solution
such as LSM, SELinux, etc., that will very likely be useful in
protecting the system once the malware starts running.  But I thought
Eric's proposal proposed excluding that from the Threat Model for the
purposes of the interface we are trying to solve.  If that's not true,
let's deal with it now.

> Not to mention removable media - it might be old hat, but infected/malware
> files can come in on floppies, CDs or USB flash discs carelessly left on the
> pavement outside an office.

That's not a problem given the scanning model proposed by Eric; when
you insert removable media, it will get scanned when it is first
accessed.

	  	    	     	     	     - Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] TALPA - a threat model? well sorta.
Date: Fri, 15 Aug 2008 20:18:08 UTC
Message-ID: <fa.bPERXFfcvaZkGXC0lffvErEmuPo@ifi.uio.no>

On Fri, Aug 15, 2008 at 02:06:47PM -0400, Valdis.Kletnieks@vt.edu wrote:
> This problem is actually identical to "new file scanned, but you don't have
> the signature available yet so malware isn't detected".
>
> Those of us who have seen large mail servers pile up queues in the 10s of
> millions in the 45 minutes between when the worm went critical-mass and when
> we got a signature might disagree on it not being a big problem in practice.

For a mail server, I really think something specialized like ClamAV is
a much better solution than something in userspace, which will
probably decide it has to rescan every single file that gets written,
including your mail server logs.   :-)

A specialized solution for a mail server is *always* going to be
more efficient, more practical, and able to do
application-specialized things (such as refusing the e-mail while the
connection is still open, so you don't have to worry about being RFC
compliant about sending bounce mails when the SMTP return-path is most
likely bogus).

						- Ted

From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [malware-list] scanner interface proposal was: [TALPA] Intro
Date: Mon, 18 Aug 2008 14:25:49 UTC
Message-ID: <fa.F8inMQKQEBuX5OQMCxSjDBoTOik@ifi.uio.no>

On Mon, Aug 18, 2008 at 02:15:24PM +0100, tvrtko.ursulin@sophos.com wrote:
> Then there is still a question of who allows some binary to declare itself
> exempt. If that decision was a mistake, or it gets compromised security
> will be off. A very powerful mechanism which must not be easily
> accessible.  With a good cache your worries go away even without a scheme
> like this.

I have one word for you --- bittorrent.  If you are downloading a very
large torrent (say approximately a gigabyte), and it contains many
pdf's that are say a few megabytes apiece, and things are coming in
dribbles, having either an indexing scanner or an AV scanner wake up
and rescan the file from scratch each time a tiny piece of the pdf
comes in is going to eat your machine alive....
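
A back-of-the-envelope Python sketch of why.  The numbers are made up
for illustration (a 4 MiB pdf arriving in 16 KiB blocks), and the
model is optimistic: it assumes blocks arrive in order and that each
rescan only reads the bytes present so far.

```python
def bytes_rescanned(file_size: int, piece_size: int) -> int:
    """Total bytes read if the file-so-far is rescanned after every piece."""
    pieces = -(-file_size // piece_size)  # ceiling division
    total = 0
    for k in range(1, pieces + 1):
        total += min(k * piece_size, file_size)
    return total


def demo():
    size = 4 * 2**20    # 4 MiB file (assumed)
    piece = 16 * 2**10  # 16 KiB per block (assumed)
    total = bytes_rescanned(size, piece)
    return total, total / size
```

Under these assumptions the scanner reads 538,968,064 bytes to cover a
4,194,304 byte file, 128.5 times its size, where a single scan after
the download finishes would read it once.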

						- Ted
