Question

Peter

Asked: 2018-10-18 07:19:03 +0800 CST

Prometheus alert not fired

I have set up 5 alerts in my Prometheus configuration. 3 of them work as expected, but 2 are never triggered. I am really confused and need some help here.

So, the 2 rules that do not work are:

alert: CriticalDiskSpace
expr: node_filesystem_free{filesystem!~"^/run(/|$)",fstype!~"tmpfs",job="{{
  $labels.job }}"} / node_filesystem_size{job="{{ $labels.job }}"} <
  0.25
for: 4m
labels:
  severity: critical
annotations:
  description: '{{ $labels.instance }} of job {{ $labels.job }} has less than 25%
    space remaining.'
  summary: Instance {{ $labels.instance }} - Critical disk space usage

alert: CriticalCPULoad
expr: (100
  * (1 - avg by(instance) (irate(node_cpu{job="{{ $labels.job }}",mode="idle"}[2m]))))
  > 75
for: 2m
labels:
  severity: critical
annotations:
  description: '{{ $labels.instance }} of job {{ $labels.job }} has Critical CPU load
    for more than 2 minutes.'
  summary: Instance {{ $labels.instance }} - Critical CPU load

When I run the rule expressions manually in Prometheus, I get the correct values. For example, for disk space, I have a test instance where the filesystem is at 79% usage, so the alert should fire.

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       50G   40G   11G  79% /

node_filesystem_free{filesystem!~"^/run(/|$)",fstype!~"tmpfs",fstype!~"rootfs", job="ec2_eu_west_1_discovery"} / node_filesystem_size{job="ec2_eu_west_1_discovery"} < 0.25

And of course, Prometheus has the correct value:

Element: {device="/dev/xvda1",fstype="xfs",instance="Grafana Test",job="ec2_eu_west_1_discovery",mountpoint="/"}
Value: 0.21932882130469517

1 Answer

Peter · Answer 1 · 2018-10-19T04:07:41+08:00

I have found a way to make the rule fire.

So, if I change the expression from this:

node_filesystem_free{filesystem!~"^/run(/|$)",fstype!~"tmpfs",job="{{
  $labels.job }}"} / node_filesystem_size{job="{{ $labels.job }}"} <
  0.25

to this:

node_filesystem_free{filesystem!~"^/run(/|$)",fstype!~"tmpfs"} / node_filesystem_size < 0.25

I get an alert. Now I need to understand why I can use {job="{{ $labels.job }}"} in the rules browser but not in the rules.yml file.
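
A likely explanation, as far as I can tell: Prometheus expands {{ ... }} templates only in a rule's labels and annotations, not in expr. Inside expr the matcher job="{{ $labels.job }}" is therefore compared as a literal string, which selects no series, so the rule never fires. The manual query shown in the question worked because it used a literal job name. Below is a minimal sketch of a rules.yml with both rules rewritten without the templated matcher. The group name node_alerts is a placeholder, and adding job to the by(...) clause in the CPU rule is my own change, not part of the original rules.

```yaml
# rules.yml (sketch). The group name "node_alerts" is a placeholder.
groups:
  - name: node_alerts
    rules:
      - alert: CriticalDiskSpace
        # expr is plain PromQL and is not templated, so no job="{{ ... }}" matcher here.
        expr: node_filesystem_free{filesystem!~"^/run(/|$)",fstype!~"tmpfs"} / node_filesystem_size < 0.25
        for: 4m
        labels:
          severity: critical
        annotations:
          # Templates are expanded here, using the labels of the firing series.
          description: '{{ $labels.instance }} of job {{ $labels.job }} has less than 25% space remaining.'
          summary: 'Instance {{ $labels.instance }} - Critical disk space usage'

      - alert: CriticalCPULoad
        # Assumption: "job" is added to by(...) so that {{ $labels.job }} is still
        # available in the annotations; by(instance) alone would drop the job label.
        expr: 100 * (1 - avg by(instance, job) (irate(node_cpu{mode="idle"}[2m]))) > 75
        for: 2m
        labels:
          severity: critical
        annotations:
          description: '{{ $labels.instance }} of job {{ $labels.job }} has Critical CPU load for more than 2 minutes.'
          summary: 'Instance {{ $labels.instance }} - Critical CPU load'
```

If the alerts should apply to one job only, a literal matcher such as job="ec2_eu_west_1_discovery" can be added to the selectors. Running promtool check rules rules.yml should catch syntax errors before the file is loaded.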
