UPSMF / uphrh-sb-devops
Commit 7945e1c4
Authored 5 years ago by Rajesh Rajendran
Committed by GitHub 5 years ago

Merge pull request #653 from RaniMounikaKotakadi/sunbird-monitoring-release-2.2.0

Updated alert rules with proper messages

Parents: 9da2e9f2, 95f7d542
sunbird-monitoring-release-3.5.0_RC1
sunbird-monitoring-release-3.3.0_RC1
sunbird-monitoring-release-2.6.0
sunbird-monitoring-release-2.6.0_RC6
sunbird-monitoring-release-2.6.0_RC5
sunbird-monitoring-release-2.6.0_RC4
sunbird-monitoring-release-2.6.0_RC3
sunbird-monitoring-release-2.6.0_RC2
sunbird-monitoring-release-2.6.0_RC1
sunbird-monitoring-release-2.5.0_484f884
sunbird-monitoring-release-2.5.0
sunbird-monitoring-release-2.5.0_RC3
sunbird-monitoring-release-2.5.0_RC2
sunbird-monitoring-release-2.5.0_RC1
sunbird-monitoring-release-2.3.0_9009005
sunbird-monitoring-release-2.3.0
sunbird-monitoring-release-2.3.0_RC4
sunbird-monitoring-release-2.3.0_RC3
sunbird-monitoring-release-2.3.0_RC2
sunbird-monitoring-release-2.3.0_RC1
sunbird-monitoring-release-2.2.0_RC1
secor-lag
Showing 1 changed file with 49 additions and 1 deletion

ansible/roles/stack-monitor-stateful/templates/alertrules.services.yml (+49 −1)
@@ -16,4 +16,52 @@ groups:
       severity: CRITICAL
     annotations:
       description: 'The service status has changed {% raw %}{{$value}}{% endraw %} times in last 2 minutes. Threshold is : 2'
-      summary: Health check is failing
+      summary: Health check is failing
+  - alert: too_many_server_side_http_errors_5xx_WARNING
+    expr: (sum(increase(nginx_http_requests_total{status=~"5.."}[1m])) / sum(increase(nginx_http_requests_total[1m]))) * 100 >= 0.075
+    for: 15m
+    labels:
+      severity: WARNING
+    annotations:
+      description: 'Server side http errors: {% raw %}{{$value}}{% endraw %}% has exceeded threshold of 0.075%'
+      summary: Too many server side http errors_5xx_WARNING
+  - alert: too_many_server_side_http_errors_5xx_CRITICAL
+    expr: (sum(increase(nginx_http_requests_total{status=~"5.."}[2m])) / sum(increase(nginx_http_requests_total[2m]))) * 100 >= 0.1
+    for: 2m
+    labels:
+      severity: CRITICAL
+    annotations:
+      description: 'Server side http errors: {% raw %}{{$value}}{% endraw %}% has exceeded threshold of 0.1%'
+      summary: Too many server side http errors_5xx_CRITICAL
+  - alert: too_many_server_side_http_errors_5xx_FATAL
+    expr: (sum(increase(nginx_http_requests_total{status=~"5.."}[5m])) / sum(increase(nginx_http_requests_total[5m]))) * 100 >= 0.1
+    for: 5m
+    labels:
+      severity: FATAL
+    annotations:
+      description: 'Server side http errors: {% raw %}{{$value}}{% endraw %}% has exceeded threshold of 0.1%'
+      summary: Too many server side http errors_5xx_FATAL
+  - alert: too_many_client_side_http_errors_4xx_WARNING
+    expr: (sum(increase(nginx_http_requests_total{status=~"4.."}[5m])) / sum(increase(nginx_http_requests_total[5m]))) * 100 >= 1
+    for: 15m
+    labels:
+      severity: WARNING
+    annotations:
+      description: 'Client side http errors: {% raw %}{{$value}}{% endraw %}% has exceeded threshold of 1%'
+      summary: Too many client side http errors_4xx_WARNING
+  - alert: too_many_client_side_http_errors_4xx_CRITICAL
+    expr: (sum(increase(nginx_http_requests_total{status=~"4.."}[5m])) / sum(increase(nginx_http_requests_total[5m]))) * 100 >= 2
+    for: 15m
+    labels:
+      severity: CRITICAL
+    annotations:
+      description: 'Client side http errors: {% raw %}{{$value}}{% endraw %}% has exceeded threshold of 2%'
+      summary: Too many client side http errors_4xx_CRITICAL
+  - alert: too_many_client_side_http_errors_4xx_FATAL
+    expr: (sum(increase(nginx_http_requests_total{status=~"4.."}[5m])) / sum(increase(nginx_http_requests_total[5m]))) * 100 >= 3
+    for: 15m
+    labels:
+      severity: FATAL
+    annotations:
+      description: 'Client side http errors: {% raw %}{{$value}}{% endraw %}% has exceeded threshold of 3%'
+      summary: Too many client side http errors_4xx_FATAL
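
Reviewer sketch (not part of the commit): each new rule divides the increase in 5xx or 4xx requests by the increase in all requests over the same window, multiplies by 100, and compares the result with a percentage threshold. The Python below mirrors only that arithmetic so the thresholds can be sanity-checked by hand. The function names `error_percentage` and `breached` are hypothetical; the rule names and thresholds are copied from the diff. It does not model the `for:` duration, which additionally requires the condition to hold continuously before the alert fires.

```python
# Illustrative only: reproduces the percentage math of the alert expressions,
# not Prometheus evaluation itself. Helper names are assumptions.

# (rule name, threshold in percent), copied from the diff.
RULES_5XX = [
    ("too_many_server_side_http_errors_5xx_WARNING", 0.075),
    ("too_many_server_side_http_errors_5xx_CRITICAL", 0.1),
    ("too_many_server_side_http_errors_5xx_FATAL", 0.1),
]
RULES_4XX = [
    ("too_many_client_side_http_errors_4xx_WARNING", 1),
    ("too_many_client_side_http_errors_4xx_CRITICAL", 2),
    ("too_many_client_side_http_errors_4xx_FATAL", 3),
]


def error_percentage(error_increase, total_increase):
    """Mirror (sum(increase(errors)) / sum(increase(total))) * 100."""
    if total_increase == 0:
        # PromQL division 0/0 yields NaN, and NaN never satisfies >=,
        # so a window with no traffic does not match any rule.
        return float("nan")
    return 100.0 * error_increase / total_increase


def breached(percentage, rules):
    """Return the names of the rules whose threshold the percentage meets."""
    return [name for name, threshold in rules if percentage >= threshold]


if __name__ == "__main__":
    # 2 server errors out of 1000 requests in the window is 0.2%.
    print(breached(error_percentage(2, 1000), RULES_5XX))
    # 25 client errors out of 1000 requests in the window is 2.5%.
    print(breached(error_percentage(25, 1000), RULES_4XX))
```

Under these assumptions, 0.2% of server errors meets all three 5xx thresholds, while 2.5% of client errors meets the 4xx WARNING and CRITICAL thresholds but not FATAL. The 5xx CRITICAL and FATAL rules share the 0.1% threshold and differ only in their window and `for:` duration.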