Commit 7c8c48cd authored 3 years ago by Tom Teichler
Update monitoring docs
parent b06835dc
1 merge request: !898 Re-merge handbook
docs/admin/05_monitoring.rst (104 additions, 27 deletions)

.. Removed by this commit (previous content of the file):

Monitoring
##########

Prometheus
**********

AlekSIS provides a metric endpoint at `/metrics`, so you can scrape metrics in
your Prometheus instance.

Available metrics
=================

The exporter provides metrics about responses and requests, e.g. statistics
about response codes, request latency and requests per view. It also
provides data about database operations.

Prometheus config to get metrics
================================

To get metrics of your AlekSIS instance, just add the following to your
`prometheus.yml`::

    - job_name: aleksis
      static_configs:
        - targets: ['my.aleksis-instance.com']
      metrics_path: /metrics

Grafana
*******

Visualise metrics with Grafana
==============================

If you want to visualise your AlekSIS metrics with Grafana, you can use one
of the publicly available Grafana dashboards, for example the following one,
or just write your own.

https://grafana.com/grafana/dashboards/9528

.. Added by this commit (new content of the file):

.. _sec:Monitoring:

Monitoring and health checks
============================

Configuration
-------------

Thresholds
~~~~~~~~~~

Thresholds for health checks can be configured via config file
(``/etc/aleksis``).

.. code:: toml

   [health]
   disk_usage_max_percent = 90
   memory_min_mb = 500

   [backup.database]
   check_seconds = 7200

   [backup.media]
   check_seconds = 7200

Status page
-----------

The AlekSIS status page shows information about the health of your AlekSIS
instance. You can visit it via the left navigation bar (Admin → Status).

The page shows information about debug and maintenance mode, a summary of
your health checks and the last exit status of your Celery tasks. This
page cannot be used as a health check: it always returns HTTP 200
if the site is reachable.

Health check
------------

The health check can be used to verify the health of your AlekSIS
instance. You can open it in a browser
(https://aleksis.example.com/health/), where it shows a summary of
your health checks. If something is wrong, it returns HTTP 500.

It is also possible to get a JSON response from the health check, for
example via ``curl``. To do so, send an ``Accept: application/json``
header with your request.
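
The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the helper names ``build_health_request`` and ``fetch_health`` are made up for this example, and it assumes that the endpoint answers with a JSON body even when it returns HTTP 500.

```python
import json
import urllib.error
import urllib.request


def build_health_request(base_url: str) -> urllib.request.Request:
    """Build a GET request for the health endpoint that asks for JSON."""
    url = base_url.rstrip("/") + "/health/"
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_health(base_url: str, timeout: float = 10.0) -> tuple[int, dict]:
    """Return (HTTP status, decoded JSON body) of the health endpoint."""
    request = build_health_request(base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as error:
        # A failing health check answers with HTTP 500. This sketch assumes
        # the error response still carries a JSON body.
        return error.code, json.load(error)
```

Calling ``fetch_health("https://aleksis.example.com")`` would then return the status code together with the decoded summary of the checks.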

The health check can also be executed via ``aleksis-admin``:

.. code:: shell

   $ aleksis-admin health_check

Monitoring with Icinga2
-----------------------

As already mentioned, there is a JSON endpoint at
https://aleksis.example.com/health/. You can use a JSON check plugin to
evaluate individual health checks separately, or just use an HTTP check
to verify that the site returns HTTP 200.
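
To illustrate what such a JSON check plugin does, here is a hedged sketch of the evaluation step in Python. It assumes that the JSON body is a flat mapping from check name to a status string and that a healthy check reports ``working``; both are assumptions, so compare them with the actual response of your instance. The function name ``evaluate_health`` is hypothetical. The exit codes follow the usual monitoring-plugin convention (0 = OK, 2 = CRITICAL).

```python
OK = 0        # monitoring-plugin exit code for a healthy service
CRITICAL = 2  # monitoring-plugin exit code for a failed service


def evaluate_health(checks: dict[str, str], healthy: str = "working") -> tuple[int, str]:
    """Map a health check payload to a plugin exit code and status line.

    `checks` is assumed to map check names to status strings, and
    `healthy` is the assumed status string of a passing check.
    """
    failed = {name: state for name, state in checks.items() if state != healthy}
    if failed:
        details = ", ".join(f"{name}: {state}" for name, state in sorted(failed.items()))
        return CRITICAL, f"CRITICAL - {details}"
    return OK, f"OK - {len(checks)} checks passed"
```

A plugin would print the returned status line and exit with the returned code, so that Icinga2 can report exactly which check failed instead of only the overall HTTP status.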

Performance monitoring with Prometheus
--------------------------------------

AlekSIS provides a Prometheus exporter. The exporter provides metrics
about responses and requests, e.g. statistics about response codes, request
latency and requests per view. It also provides data about database
operations.

Metrics endpoint
~~~~~~~~~~~~~~~~

The metrics endpoint can be found at
https://aleksis.example.com/metrics. In the default configuration it can
be scraped from anywhere. You might want to add some web server
configuration to restrict access to this URL.

To get metrics of your AlekSIS instance, add the following job to the
``scrape_configs`` section of ``prometheus.yml``:

.. code:: yaml

   - job_name: aleksis
     static_configs:
       - targets: ['my.aleksis-instance.com']
     metrics_path: /metrics

Rules for Prometheus Alertmanager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you are using the Prometheus Alertmanager, you can create alerting
rules so that an alert fires when, for example, your AlekSIS instance
responds slowly.

.. code:: yaml

   groups:
   - name: aleksis
     rules:
     - alert: HighRequestLatency
       expr: histogram_quantile(0.999, sum(rate(django_http_requests_latency_seconds_by_view_method_bucket{instance="YOUR-INSTANCE",view!~"prometheus-django-metrics|healthcheck"}[15m])) by (job, le)) > 30
       for: 15m
       labels:
         severity: page
       annotations:
         summary: High request latency for 15 minutes
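
The rule above compares an estimated 99.9th percentile of the request latency, in seconds, with a threshold. The sketch below shows in simplified form how such a quantile is estimated from cumulative histogram buckets by linear interpolation. It is an illustrative re-implementation, not Prometheus code, and the bucket values in the usage note are invented.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper bound, cumulative count) buckets.

    The bucket list must contain a final bucket with upper bound +Inf,
    as in a Prometheus histogram. Simplified: assumes positive bounds.
    """
    buckets = sorted(buckets)
    highest_bound, total = buckets[-1]
    if not math.isinf(highest_bound) or total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for index, (upper, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            break
    if math.isinf(upper):
        # The quantile lies beyond the last finite bound, so only that
        # bound can be reported.
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    in_bucket = cumulative - below
    if in_bucket == 0:
        return lower
    # Interpolate linearly inside the selected bucket.
    return lower + (upper - lower) * (rank - below) / in_bucket
```

With the made-up buckets ``[(0.5, 90), (1.0, 99), (math.inf, 100)]``, the 0.9 quantile falls at the upper edge of the first bucket, while the 0.999 quantile falls into the ``+Inf`` bucket and is therefore reported as the highest finite bound. This is why the bucket boundaries limit how precisely a latency threshold can be evaluated.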

Grafana dashboard
~~~~~~~~~~~~~~~~~

A Grafana dashboard for visualising these metrics is available at
https://grafana.com/grafana/dashboards/9528.