We have a server running AlmaLinux 9.3.
By default (as with all current RHEL-like OSs) it has fapolicyd enabled.
There is also an application server (WildFly/JBoss/Java) running on that server.
The deployed application processes some data files (submitted by users) and it works fine in the standard situation.
However, there is currently a period of time when the application needs to process roughly 1000 files per minute.
In such a situation, the fapolicyd overhead uses ~15% of CPU, which we evaluated as too much.
I was unable to find anyone with a similar problem on the internet.
I'm also surprised there is no fapolicyd tag here on ServerFault.
Questions:
- Is there a way to optimize the fapolicyd configuration so that it decides faster whether to allow or deny a file access?
  - One thing that comes to my mind is the ordering of custom rules.
  - Maybe using wildcard vs. using literal rules.
- Any hints on how to evaluate how important fapolicyd is for us?
  - Whether we can just turn it off or whether it is really a good idea to have it running despite the huge overhead.
  - Whether other distributions also use something like fapolicyd or whether it is "just additional security" and SELinux is enough. (I know they are not the same.)
Sources:
Allow listing executed programs is among the most effective security features. Without it, a compromised user account could execute any arbitrary payload, or users could install programs to their home directory that they should not. It is, however, an optional feature that you decide to enable or not.
Inspecting all such file system calls has a performance hit, although the overhead can be minimized by optimizing the rules and database.
Measure if performance is acceptable from a user's perspective. A response time focused objective, something like "99.9% of application API calls will complete under 1 second", will detect real problems, not just trends in resource utilization.
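As a rough illustration of such a response time objective, here is a minimal Python sketch. The function names, the sample latencies, and the idea of feeding it a plain list of response times are all assumptions for the example; in practice the numbers would come from your application server's access logs or monitoring.

```python
from fractions import Fraction


def fraction_within(latencies_s, limit_s):
    """Return the exact fraction of calls that completed within limit_s seconds."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    within = sum(1 for t in latencies_s if t <= limit_s)
    return Fraction(within, len(latencies_s))


def meets_objective(latencies_s, limit_s=1.0, target=Fraction("0.999")):
    """True if at least `target` of the calls completed within limit_s seconds."""
    return fraction_within(latencies_s, limit_s) >= target


# Hypothetical response times in seconds, for illustration only.
latencies = [0.12] * 995 + [0.8] * 4 + [2.5]
print(meets_objective(latencies))
```

Running the same check with fapolicyd enabled and then in a test window without it would show whether the CPU overhead actually translates into slower responses for users.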
First for some background on fapolicyd note the performance introduction from the README:
Set `do_stat_report = 1` in config to enable the statistics report, then restart fapolicyd if it has not been restarted recently. Analyze `/var/log/fapolicyd-access.log` and note the patterns of which PIDs are opening which files.
Note the ratio of "hits" to "misses". A higher hit ratio is better, since accessing the in-memory database is much faster than file system access and processing. Increase `obj_cache_size` in config to the number of files your system has at once. A possible upper bound is the number of used inodes in the data file system, as shown in `df -i` output. That might be excessive, but if you have the memory, why not cache a couple hundred thousand entries?
Review configuration in fapolicyd.conf. `integrity` values other than `none` or `size` will compute a checksum and add overhead. Especially if you have lots of misses from processing new files, this could be a significant amount of CPU. `q_size` should be larger than the "max queue depth" on the access report; however, I doubt queue size needs to be increased.
Review rules, in compiled.rules from rules.d. RHEL and Fedora populate trusted files from rpm, do not allow execute of unknown files, do not allow the ld.so trick, and allow most opens. If you do modify rules, think about the performance impact of doing more things while that open syscall is waiting.
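The hit ratio and the inode-based upper bound mentioned above can be computed with a few lines of Python. This is only a sketch: the hit and miss counters below are made-up numbers that you would replace with the values from your own statistics report, and `/` is a placeholder for the data file system path.

```python
import os


def hit_ratio(hits, misses):
    """Fraction of cache lookups that were hits (0.0 when there were no lookups)."""
    total = hits + misses
    return hits / total if total else 0.0


def used_inodes(path):
    """Used inodes on the file system containing path (the IUsed column of df -i)."""
    st = os.statvfs(path)
    return st.f_files - st.f_ffree


# Hypothetical counters; read the real ones from the statistics report.
print(f"object cache hit ratio: {hit_ratio(63465, 17964):.1%}")

# "/" is a placeholder; point this at the file system holding the data files.
print("used inodes (possible upper bound for obj_cache_size):", used_inodes("/"))
```

Some file systems report zero total inodes, in which case this bound is not meaningful and the file count has to be estimated another way.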
And as always, you can profile what exactly is going on while troubleshooting. `perf top` will print which functions are on CPU, and is even better when debuginfo is installed. The bcc-tools package has some neat scripts: opensnoop and execsnoop list open and execve calls in real time.
Ultimately, it's your decision what controls to put in place to only allow execution of authorized programs. Allow listing immediately in the exec call, as fapolicyd does, is of course very powerful. A less comprehensive alternative could be to restrict shell access: not allow people interactive shells, and lock down permissions of home directories. Or, if a data file system should have no programs at all, consider mounting it noexec. A good security audit would not treat the checklist as immutable; rather, it would list alternative controls in place and why.
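If you go the noexec route for the data file system, a quick way to confirm the mount option is actually in effect is to inspect the mount flags. A minimal Python sketch, where `/` is only a placeholder for the directory that holds the uploaded files and the function names are invented for the example:

```python
import os


def has_noexec(f_flag):
    """True if the statvfs mount flags include ST_NOEXEC."""
    return bool(f_flag & os.ST_NOEXEC)


def is_mounted_noexec(path):
    """True if the file system containing path is mounted with noexec."""
    return has_noexec(os.statvfs(path).f_flag)


# "/" is a placeholder; use the data directory of the application instead.
print(is_mounted_noexec("/"))
```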