Better Late Than Never: Linux Malware Detect 1.3

Today I have released Linux Malware Detect (LMD) 1.3, the first public stable release of my malware detection tool. The documentation is a little thin, but the details are on the project page and the README file should fill you in on anything you need to know; otherwise, you can post a comment at the bottom of the project page and I will assist where possible. Input on feature ideas, bugs and malware data is always welcome; see the --help output of LMD for the checkout feature, which uploads malware data to rfxn.com.

In October I detailed the concepts behind the then to-be-released LMD in a post. Although a lot has changed since then in how LMD operates, the gist of the post is still on point.

To those fortunate (or unfortunate?) enough to ride along on the closed testing, it certainly was a long road, and I thank everyone who over time submitted new malware data, bug reports and feature ideas. To say this is the most banged-on release of any of my projects would be an understatement, and I hope it shows in the end product.

So, what has changed since the first incarnations of LMD? First, I ditched the whole “chunked hash” concept for a simpler HEX-based pattern matching feature to find malware variants, which has proved far more accurate and easier to manage. I can see some scaling issues with the current implementation of the HEX scanner as the signature set grows, but this is something I expect to resolve in a future release. The basic MD5 hash scanning is still the stage-1 scanning component, and the HEX scanner then picks up as a stage-2 scanner if no MD5 hit was found.
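
To make the two-stage flow concrete, here is a minimal sketch in shell. It is not LMD's actual code: the function name scan_file and the "hash:name" and "hexpattern:name" signature file formats are assumptions made up for this example.

```shell
#!/bin/sh
# Sketch only: scan_file and both signature formats are hypothetical.
# md5 sigs: one "md5hash:name" per line; hex sigs: one "hexpattern:name" per line.
scan_file() {
    file=$1; md5_sigs=$2; hex_sigs=$3

    # Stage 1: exact match on the MD5 hash of the whole file.
    hash=$(md5sum "$file" | cut -d' ' -f1)
    hit=$(grep "^$hash:" "$md5_sigs" | head -n 1 | cut -d: -f2)
    if [ -n "$hit" ]; then
        echo "md5:$hit"
        return 0
    fi

    # Stage 2: only reached when stage 1 found nothing. Dump the file as one
    # continuous lowercase hex string and look for each pattern inside it.
    # (A naive substring match like this can also hit across byte boundaries.)
    hex=$(od -An -v -tx1 "$file" | tr -d ' \n')
    while IFS=: read -r pattern name; do
        [ -n "$pattern" ] || continue
        case $hex in
            *"$pattern"*) echo "hex:$name"; return 0 ;;
        esac
    done < "$hex_sigs"

    echo "clean"
    return 1
}
```

Calling it as ‘scan_file /path/to/file md5.sigs hex.sigs’ prints md5:NAME, hex:NAME or clean. Holding the whole hex dump in a shell variable would not scale to large files or signature sets, which is the same kind of scaling concern mentioned above.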

The kernel-based inotify real-time file creation/modification monitoring has been reworked and now more gracefully handles users of any type, in addition to monitoring the /dev/shm, /var/tmp and /tmp paths on execution of the monitoring component. The scanner now also batches through new/changed files every 30 seconds for the sake of efficiency, but this can easily be lowered in internals.conf to as little as a 1 second interval between scans of new/changed files.

The quarantine queue now stores each file’s original path, owner and mode to facilitate a --restore feature that allows any file to be restored to its original path with owner and file modes restored as well. This can be used to recover false-positive hits or to restore files after you have cleaned malware from within their contents (default quarantining of malware is now also disabled, see conf.maldet).
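
The idea of storing path, owner and mode next to the quarantined file can be sketched as follows. This is only an illustration and not LMD's real quarantine layout: the function names and the ".info" sidecar file are hypothetical, GNU stat is assumed, and it is meant to run as root.

```shell
#!/bin/sh
# Sketch only: function names and the ".info" sidecar format are hypothetical.

# quarantine_file FILE QDIR -> moves FILE into QDIR and prints the quarantine id
quarantine_file() {
    file=$1; qdir=$2
    id="$(basename "$file").$$"
    # Record original path, numeric owner:group and octal mode, one per line.
    {
        printf '%s\n' "$file"
        stat -c '%u:%g' "$file"
        stat -c '%a' "$file"
    } > "$qdir/$id.info"
    mv "$file" "$qdir/$id"
    chmod 000 "$qdir/$id"    # make the quarantined copy inert
    echo "$id"
}

# restore_file ID QDIR -> puts the file back with its original owner and mode
restore_file() {
    id=$1; qdir=$2
    { read -r path; read -r owner; read -r mode; } < "$qdir/$id.info"
    mv "$qdir/$id" "$path"
    chown "$owner" "$path"
    chmod "$mode" "$path"
    rm -f "$qdir/$id.info"
}
```

The file should be given as an absolute path so that the restore lands in the right place regardless of the current directory.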

The final notable change is that there is now a quarantine suspend account feature: the owning user account (UID>=500) can optionally be cPanel suspended or have its shell set to /bin/false on non-cPanel systems (configurable in conf.maldet). When cPanel users are suspended, the suspension will have a comment attached to it with the ‘maldet --report SCANID’ value so you can easily call up the report that suspended the user.

There have been many more changes to LMD but I certainly cannot list them all. Give it a spin and let me know how it goes. Happy malware hunting!

BFD 1.4: Important Security Fix

Today I have put up a new release of BFD, version 1.4, that addresses an issue with an unsanitized variable that is used on the command line. This is a serious issue and should be treated as such; if you currently have BFD installed, I would encourage you to update it immediately. The install.sh script in the BFD package will retain all your options and tracking data, so the update process is painless.

Current Release:
http://www.rfxn.com/downloads/bfd-current.tar.gz

Change Log:
[Fix] properly sanitized vars passed to the command line
[Fix] ignore.hosts is now updated with system addresses on each bfd run
[Note] thanks to [email protected] for invaluable input and pointers

wget http://www.rfxn.com/downloads/bfd-current.tar.gz
tar xvfz bfd-current.tar.gz
cd bfd-1.4/
./install.sh

Although this issue has many mitigating factors that lessen the severity of the potential impact, it is nevertheless very serious and it is best to err on the side of caution. I need to extend a special thanks to Jeff Petersen of webhostsecurity.com for identifying this issue in a very professional fashion and offering technical input.

Nginx: Caching Proxy

Recently I started to tackle a load problem on one of my personal sites. The issue was a poorly written but exceedingly MySQL-heavy application and the load it would induce on the SQL server when 400-500 people were hammering the site at once. Further compounding this was Apache’s inability to gracefully handle excessive requests on object-heavy pages (e.g. images). This left me with a site that was almost unusable during peak hours or, worse, one that would crash the MySQL server and take Apache with it thanks to frenzied F5ing from users.

I went through all the usual rituals in an effort to better the situation, from PHP APC then Eaccelerator, to mod_proxy+mod_cache, to tuning Apache timeouts/prefork settings and adjusting MySQL cache/buffer options. The extreme was setting up a MySQL replication cluster with MySQL-Proxy doing RW splitting/load balancing across the cluster and memcached, but this quickly turned into a beast to manage and memcached was eating memory at phenomenal rates.

Although I did improve things a bit, I had done so at the expense of vastly increased hardware demand and complexity. However, the site was still choking during peak hours and in a situation where switching applications and/or getting it reprogrammed is not at all an option, I had to start thinking outside the box or more to the point, outside Apache.

I have experience with lighttpd and the pound reverse proxy; they are both phenomenal applications, but neither directly handles caching in a graceful fashion (in pound’s case, not at all). This is when I took a look at nginx, which to date I had never tried but had heard many great things about. I fired up a new Xen guest running CentOS 5.4 with 2GB RAM and 2 CPU cores, and an hour later I had nginx installed, configured and proxy-caching traffic for the site in question.

The impact was immediate and significant — the SQL server loads dropped from an average of 4-5 down to 0.5-1.0 and the web server loads were near non-existent from previously being on the brink of crashing every afternoon.

Enough with my ramblings, let’s get into nginx. You can download the latest release from http://nginx.org and, although I could not find a binary version of it, compiling was straightforward with no real issues.

First up we need to satisfy some requirements for the configure options we will be using. I encourage you to look at the ‘./configure --help’ list of available options, as there are some nice features at your disposal.

yum install -y zlib zlib-devel openssl-devel gd gd-devel pcre pcre-devel

Once the above packages are installed we are good to go with downloading and compiling the latest version of nginx:

wget http://nginx.org/download/nginx-0.8.36.tar.gz
tar xvfz nginx-0.8.36.tar.gz
cd nginx-0.8.36/
./configure --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_image_filter_module --with-http_gzip_static_module
make && make install

This will install nginx into ‘/usr/local/nginx’; if you would like to relocate it, you can pass ‘--prefix=/path’ to configure. The path layout for nginx is very straightforward, and for the purpose of this post we are assuming the defaults:

[root@atlas ~]# ls /usr/local/nginx
conf  fastcgi_temp  html  logs  sbin

[root@atlas ~]# cd /usr/local/nginx

[root@atlas nginx]# ls conf/
fastcgi.conf  fastcgi.conf.default  fastcgi_params  fastcgi_params.default  koi-utf  koi-win  mime.types  mime.types.default  nginx.conf  nginx.conf.default  win-utf

The layout will be very familiar to anyone that has worked with Apache and true to that, nginx breaks the configuration down into a global set of options and then the individual web site virtual host options. The ‘conf/’ folder might look a little intimidating but you only need to be concerned with the nginx.conf file which we are going to go ahead and overwrite, a copy of the defaults is already saved for you as nginx.conf.default.

My nginx configuration file is available at http://www.rfxn.com/downloads/nginx.conf.atlas, be sure to rename it to nginx.conf or copy the contents listed below into ‘conf/nginx.conf’:

user  nobody nobody;

worker_processes     4;
worker_rlimit_nofile 8192;

pid /var/run/nginx.pid;

events {
  worker_connections 2048;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status  $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/nginx_access.log  main;
    error_log  logs/nginx_error.log debug;

    server_names_hash_bucket_size 64;
    sendfile on;
    tcp_nopush     on;
    tcp_nodelay    off;
    keepalive_timeout  30;

    gzip  on;
    gzip_comp_level 9;
    gzip_proxied any;

    proxy_buffering on;
    proxy_cache_path /usr/local/nginx/proxy levels=1:2 keys_zone=one:15m inactive=7d max_size=1000m;
    proxy_buffer_size 4k;
    proxy_buffers 100 8k;
    proxy_connect_timeout      60;
    proxy_send_timeout         60;
    proxy_read_timeout         60;

    include /usr/local/nginx/vhosts/*.conf;
}

Let’s take a moment to review some of the more important options in nginx.conf before we move along…

user nobody nobody;
If you are running this on a server with an Apache install or other software using the user ‘nobody’, it might be wise to create a user specifically for nginx (e.g. useradd nginx -d /usr/local/nginx -s /bin/false) and set that user in this option instead.

worker_processes 4;
This should reflect the number of CPU cores, which you can find by running ‘grep -c processor /proc/cpuinfo’. I recommend a setting of at least 2 but no more than 6; nginx is VERY efficient.
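
As a small sketch of that rule of thumb, the following counts the cores and clamps the result to the 2 to 6 range suggested above. The helper name suggest_workers is hypothetical, and the fallback to 1 when /proc/cpuinfo cannot be read is an assumption of this example.

```shell
#!/bin/sh
# suggest_workers CORES -> prints CORES clamped to the 2..6 range from the text
suggest_workers() {
    cores=$1
    if [ "$cores" -lt 2 ]; then
        echo 2
    elif [ "$cores" -gt 6 ]; then
        echo 6
    else
        echo "$cores"
    fi
}

# Count "processor" entries; fall back to 1 if /proc/cpuinfo is unavailable.
detected=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null || true)
echo "worker_processes $(suggest_workers "${detected:-1}");"
```

The last line prints a ready-to-paste worker_processes directive for the machine it runs on.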

proxy_cache_path /usr/local/nginx/proxy … inactive=7d max_size=1000m;
The ‘inactive’ option is how long content may sit in the cache without being requested before nginx removes it, and ‘max_size’ is the maximum on-disk size of the cache path. If you are serving up lots of object-heavy content such as images, you are going to want to increase max_size.

proxy_send|read_timeout 60;
These timeout values are important: if you run any long-running scripts through admin interfaces or other maintenance URLs, these values will cause the proxy to time them out. Increase them to sane values as appropriate, but anything more than 300 is probably excessive and you should consider running such tasks from cronjobs instead.

Apache style MaxClients
Finally, the maximum number of connections, or MaxClients, that nginx can accept is determined by worker_processes * worker_connections / 2 (2 fd per session), which works out to 4 * 2048 / 2 = 4096 MaxClients in our configuration.
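
Working the formula through with the values from the configuration above gives 4 * 2048 / 2 = 4096. Here is a minimal sketch that reads both values out of a config file and does the same arithmetic; the function name max_clients is hypothetical, and it assumes each directive appears once with a plain number.

```shell
#!/bin/sh
# max_clients CONF -> worker_processes * worker_connections / 2, read from CONF.
# Assumes both directives appear once, as a plain number followed by ";".
max_clients() {
    conf=$1
    procs=$(awk '$1 == "worker_processes"   { sub(/;/, "", $2); print $2; exit }' "$conf")
    conns=$(awk '$1 == "worker_connections" { sub(/;/, "", $2); print $2; exit }' "$conf")
    echo $(( ${procs:-0} * ${conns:-0} / 2 ))
}
```

Running ‘max_clients /usr/local/nginx/conf/nginx.conf’ against the configuration in this post prints 4096.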

Moving along, we need to create the paths that we defined in our configuration, namely the content caching folder and the folder where we will create our vhosts, along with the temp paths nginx uses. The -p flag keeps mkdir from complaining about fastcgi_temp, which already exists:

mkdir -p /usr/local/nginx/proxy /usr/local/nginx/vhosts /usr/local/nginx/client_body_temp /usr/local/nginx/fastcgi_temp /usr/local/nginx/proxy_temp

chown nobody:nobody /usr/local/nginx/proxy /usr/local/nginx/vhosts /usr/local/nginx/client_body_temp /usr/local/nginx/fastcgi_temp /usr/local/nginx/proxy_temp

Let’s go ahead and get our initial vhosts file created. My template is available from http://www.rfxn.com/downloads/nginx.vhost.conf and should be saved to ‘/usr/local/nginx/vhosts/myforums.com.conf’, the contents of which are as follows:

server {
    listen 80;
    server_name myforums.com www.myforums.com;

    access_log  logs/myforums.com_access.log  main;
    error_log  logs/myforums.com_error.log debug;

    location / {
        proxy_pass http://10.10.6.230;
        proxy_redirect     off;
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;


        proxy_cache               one;
        proxy_cache_key         backend$request_uri;
        proxy_cache_valid       200 301 302 20m;
        proxy_cache_valid       404 1m;
        proxy_cache_valid       any 15m;
        proxy_cache_use_stale   error timeout invalid_header updating;
    }

    location /admin {
        proxy_pass http://10.10.6.230;
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
    }
}

The obvious change you want to make is ‘myforums.com’ to whatever domain you are serving. You can append multiple names to the server_name string, separated by spaces, such as ‘server_name domain.com www.domain.com sub.domain.com;’. Now, let’s take a look at some of the important options in the vhosts configuration:

listen 80;
This is the port which nginx will listen on for this vhost. Unless you specify an IP address with it, nginx will bind port 80 on all local IPs; you can limit this by setting the value as ‘listen 10.10.3.5:80;’.

proxy_pass http://10.10.6.230;
Here we are telling nginx where to find our content aka the backend server, this should be an IP and it is also important to not forget setting the ‘proxy_set_header Host’ option so that the backend server knows what vhost to serve.

proxy_cache_valid
This allows us to define cache times based on HTTP status codes for our content; 99% of traffic will fall under the ‘200 301 302 20m’ value. If you are running a lot of dynamic content you may want to lower this from 20m to 10m or 5m; any lower defeats the purpose of caching. The ‘404 1m’ value ensures that not-found pages are not stored for long in case you are updating the site or have a temporary error, while still preventing 404s from choking up the backend server. Then the ‘any 15m’ value grabs all other responses and caches them for 15m; again, if you are running a very dynamic site you may want to lower this.

proxy_cache_use_stale
When the cache has stale content, that is, content which has expired but not yet been updated, nginx can serve this content in the event errors are encountered. Here we are telling nginx to serve stale cache data if there is an error/timeout/invalid header talking to the backend servers or if another nginx worker process is busy updating the cache. This is really useful in the event your web server crashes, as clients will still receive data from the cache.

location /admin
With this location statement we are telling nginx to take all requests to ‘http://myforums.com/admin’ and pass them directly to our backend server with no further interaction and no caching.

That’s it! You can start nginx by running ‘/usr/local/nginx/sbin/nginx’, it should not generate any errors if you did everything right! To start nginx on boot you can append the command into ‘/etc/rc.local’. All you have to do now is point the respective domain DNS records to the IP of the server running nginx and it will start proxy-caching for you. If you wanted to run nginx on the same host as your Apache server you could set Apache to listen on port 8080 and then adjust the ‘proxy_pass’ options accordingly as ‘proxy_pass http://127.0.0.1:8080;’.

Extended Usage:
If you wanted to have nginx serve static content instead of Apache, since Apache is so horrible at it, we need to declare a new location block in our vhosts/*.conf file. We have two options here: we can either point nginx to a local path with our static content, or have nginx cache our static content and retain it for longer periods of time. The latter is far simpler.

Serve static content from a local path:

        location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
            root   /home/myuser/public_html;
            expires 1d;
        }

In the above, we are telling nginx that our static content is located at ‘/home/myuser/public_html’; the request path is appended to this root, so files are looked up relative to it. When a user requests ‘http://www.mydomain.com/img/flyingpigs.jpg’, nginx will look for it at ‘/home/myuser/public_html/img/flyingpigs.jpg’. The expires option can have values in seconds, minutes, hours or days; if you have a lot of dynamic images on your site then you might consider a value like 2h or 30m, as anything lower defeats the purpose. Using this method has a slight performance benefit over the cache option below.
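
To see which requests such a location would pick up and where they map on disk, here is a rough sketch using grep as a stand-in for the nginx regex match. It assumes the dot before the extension group is meant literally (written as \.), shortens the extension list for readability, and uses the hypothetical helper names is_static and static_path.

```shell
#!/bin/sh
# Sketch only: is_static and static_path are hypothetical helpers.

# is_static URI -> succeeds if URI ends in a listed extension (case-insensitive,
# like the ~* modifier). The "\." requires a literal dot before the extension.
is_static() {
    printf '%s\n' "$1" | grep -Eiq '^.+\.(jpg|jpeg|gif|png|ico|css|js)$'
}

# static_path ROOT URI -> prints ROOT with URI appended, as the root option does
static_path() {
    is_static "$2" || return 1
    printf '%s%s\n' "$1" "$2"
}
```

With this, ‘static_path /home/myuser/public_html /img/flyingpigs.jpg’ prints ‘/home/myuser/public_html/img/flyingpigs.jpg’, while ‘/index.php’ is not matched and would fall through to the proxied location. A request such as ‘/statsjs’ is also not matched, which it would be if the dot were left unescaped and allowed to match any character.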

Serve static content from cache:

        location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
             proxy_cache_valid 200 301 302 120m;
             expires 2d;
             proxy_pass http://10.10.6.230;
             proxy_set_header Host $host;
             proxy_cache one;
        }

With this setup we are telling nginx to cache our static content just like we did with the parent site itself, except that we are defining an extended time period for which the content is valid/cached. Content is valid in the nginx cache for 2h (after which nginx updates its cache), and client browsers expire their copy and request it again every 2 days. Using this method is simple and does not require copying static content to a dedicated nginx host.

We can also do load balancing very easily with nginx. This is done by setting an alias for a group of servers; we then use this alias in place of addresses in our ‘proxy_pass’ settings. In the ‘upstream’ option shown below, we want to list all of our web servers that load should be distributed across:

  upstream my_server_group {
    server 10.10.6.230:8000 weight=1;
    server 10.10.6.231:8000 weight=2 max_fails=3  fail_timeout=30s;
    server 10.10.6.15:8080 weight=2;
    server 10.10.6.17:8081;
  }

This must be placed in the ‘http { }’ section of the ‘conf/nginx.conf’ file, then the server group can be used in any vhost. To do this we would replace ‘proxy_pass http://10.10.6.230;’ with ‘proxy_pass http://my_server_group;’. The requests will be distributed across the server group in a round-robin fashion, taking the weight values into account where they are set. If a request to one of the servers fails, nginx will try the next server until it finds a working server. In the event no working servers can be found, nginx will fall back to stale cache data and ultimately an error if that’s not available.
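
To get a feel for what the weights mean, here is a naive sketch of a weighted rotation in shell. It is not nginx's internal algorithm and it ignores max_fails and fail_timeout entirely; the function name pick_backend and the SERVER=WEIGHT argument format are made up for this example.

```shell
#!/bin/sh
# Sketch only: a naive weighted rotation, not nginx's internal algorithm.
# pick_backend N SERVER=WEIGHT... -> prints the server that request number N
# (counting from 0) would land on.
pick_backend() {
    n=$1; shift
    total=0
    for entry in "$@"; do
        total=$(( total + ${entry##*=} ))
    done
    slot=$(( n % total ))    # position inside one full rotation
    for entry in "$@"; do
        weight=${entry##*=}
        if [ "$slot" -lt "$weight" ]; then
            echo "${entry%%=*}"
            return 0
        fi
        slot=$(( slot - weight ))
    done
}
```

With the four servers above (weights 1, 2, 2 and the default of 1), every six requests are split once to 10.10.6.230, twice to 10.10.6.231, twice to 10.10.6.15 and once to 10.10.6.17. The real scheduler may interleave them differently, but the proportions are the point.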

Conclusion:
This has turned into a longer post than I had planned but oh well, I hope it proves to be useful. If you need any help on the configuration options, please check out http://wiki.nginx.org, it covers just about everything one could need.

Although I noted this nginx setup is deployed on a Xen guest (CentOS 5.4, 2GB RAM & 2 CPU cores), it proved to be so efficient, that these specs were overkill for it. You could easily run nginx on a 1GB guest with a single core, a recycled server or locally on the Apache server. I should also mention that I took apart the MySQL replication cluster and am now running with a single MySQL server without issue — down from 4.

IRSYNC & Limiting Passwordless SSH Keys

Anyone who has ever used SSH key-pairs to access more than a couple of servers (or hundreds in my case) will tell you they are an invaluable convenience. It is a natural progression and very common usage that SSH key-pairs are coupled with other common tasks or tools, where having a pass phrase attached to the key would be counterproductive to the task automation. So, what do we do despite our better judgment? We create key-pairs with absolutely no pass phrase. The implications are abundantly obvious: if the private key ever gets lost or stolen, any account that has the key-pair associated with it can be instantly compromised.

In the case of my recently released project Incremental Rsync (IRSYNC), one of the implementation hurdles at work was to have servers back up using a secure medium. This is easily handled with rsync’s -e option to have data transferred over ssh using a key-pair, but then the obvious question comes up: what if a client server ever gets compromised? Then the backup account on the backup server can be compromised (please don’t use root!@#!@#), allowing for backups to be deleted or, worse yet, data to be stolen for every server that backs up to said server/account.

A solution to this is to limit the commands that can be executed over SSH by a specific public key. Though this is not a perfect way to mitigate the threat, it does go a long way to help. For my backup server implementation I have set up the user ‘irsync’ on the backup server; this account has the usual ‘~irsync/.ssh/authorized_keys’ file where I place the public key. Where things differ is that you prefix the public key with the path of a script that is used to interpret commands sent over ssh, which looks something like this:

command="/data/irsync/validate-ssh.sh" ssh-dss AAAAB3NzaC1kc3MAAAC......87JVNLJ5nhaK1A== irsync@irsync

The ‘validate-ssh.sh’ script is basically a simple interpreter: it looks at the commands being passed over ssh and either allows them or denies them, with some logging thrown in for auditing purposes. The script can be downloaded from: http://www.rfxn.com/downloads/validate-ssh.sh. Please take note to edit the script’s ‘log_file=’ value to an appropriate path, usually the base backup path or user homedir.
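
For readers who want to see the general shape of such an interpreter, here is a minimal sketch. It is not the real validate-ssh.sh: the allow rule (only ‘rsync --server’ commands), the function names and the default log path are assumptions for illustration. What it does rely on is standard sshd behaviour, where a forced command receives the client's requested command in SSH_ORIGINAL_COMMAND and the client address in SSH_CLIENT.

```shell
#!/bin/sh
# Sketch only, not the real validate-ssh.sh: the allow rule, the function
# names and the default log path are assumptions for illustration.
log_file=${log_file:-/data/irsync/sshval.log}

# log_msg TEXT -> append a syslog-style line to $log_file
log_msg() {
    printf '%s %s sshval(%s): %s\n' \
        "$(date '+%b %d %H:%M:%S')" "$(uname -n)" "$$" "$1" >> "$log_file"
}

# check_command CMD -> succeeds only for commands we are willing to run
check_command() {
    cmd=$1
    client=${SSH_CLIENT:-unknown}    # sshd sets SSH_CLIENT to "ip port port"
    client=${client%% *}
    case $cmd in
        "")
            log_msg "interactive shell rejected from $client"
            return 1 ;;
        *';'*|*'&'*|*'|'*|*'`'*|*'$'*|*'<'*|*'>'*|*'('*)
            # shell metacharacters could be used to chain extra commands
            log_msg "ssh command rejected from $client: $cmd"
            return 1 ;;
        "rsync --server "*)
            log_msg "ssh command accepted from $client: $cmd"
            return 0 ;;
        *)
            log_msg "ssh command rejected from $client: $cmd"
            return 1 ;;
    esac
}

# main: what the forced command would run. SSH_ORIGINAL_COMMAND is empty
# when the client asks for an interactive login.
main() {
    if check_command "${SSH_ORIGINAL_COMMAND:-}"; then
        set -f                        # no glob expansion on the split words
        exec $SSH_ORIGINAL_COMMAND
    fi
    echo "sshval($$): request rejected" >&2
    exit 1
}
```

A script built this way would end with a call to main. Because the accepted command is run with exec rather than handed to a shell, the metacharacter check is an extra layer of caution rather than the only defence.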

An example of validate-ssh.sh in play would be as follows, first the client side view then the logs from $log_file:

root@praxis [~]# ssh -i /usr/local/irsync/ssh/id_dsa irsync@buserver3 "rm -rf /some/path"
sshval(13156): ssh command rejected from 192.168.3.33: rm -rf /some/path

root@praxis [~]# ssh -i /usr/local/irsync/ssh/id_dsa irsync@buserver3
sshval(13403): interactive shell rejected from 192.168.3.33

May 04 11:36:15 buserver3 sshval(13156): ssh command rejected from 192.168.3.33: rm -rf /some/path
May 04 11:40:03 buserver3 sshval(13403): interactive shell rejected from 192.168.3.33

On the flip side when a command is authorized, it gets recorded into the $log_file as follows:

May 04 05:29:08 buserver3 sshval(29993): ssh command accepted from 10.10.6.6: rsync --server -lHogDtprx --timeout=600 --delete-excluded --ignore-errors --numeric-ids . /data/irsync/mysql02.mynetwork.com.full

Take note that if you do choose to use validate-ssh.sh with irsync, you will need to create your own script to manage the snapshots as internally irsync uses the find command, piping results to xargs and rm which will not be authorized by validate-ssh.sh (for good reason!). This is actually a very simple task, although all your snapshots will have to use the same rotation age (whatever).

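#!/bin/sh
# Remove snapshot directories older than $age days from every snaps path.
age=14
bkpath=/data/irsync

for wd in "$bkpath"/*snaps*; do
    # skip anything that is not a directory (including an unmatched glob)
    [ -d "$wd" ] || continue
    # -mindepth 1 keeps find from matching (and deleting) $wd itself
    find "$wd" -mindepth 1 -maxdepth 1 -type d -mtime +"$age" -exec rm -rf {} +
done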

You can save this to /root/irsync_rotate.sh, chmod 750 it and run it as a daily cronjob by linking it into /etc/cron.daily/ (ln -s /root/irsync_rotate.sh /etc/cron.daily/) or you can add an entry into /etc/crontab as follows:

02 4 * * * root /root/irsync_rotate.sh >> /dev/null 2>&1

Although I detailed the use of validate-ssh.sh in the context of backups with irsync, this could easily be adapted to any usage when you want to restrict the commands executed over ssh with key pairs. You could even create your own script in perl or whatever floats your boat and use that instead — if you happen to go that route, please share with me what you created in the comments or by e-mail to ryan <at> rfxn.com.