Usage¶
A quick guide how to install and configure awm.
Installation¶
virtual environment¶
awm can be installed into a python virtual environment:
$ virtualenv venv
$ source venv/bin/activate
$ pip install git+https://github.com/toabctl/awm.git
The available service (awm-crawler and awm-persister) are now available in $PATH and can be executed.
RPM packages¶
There are also prebuilt RPM packages (currently openSUSE only) on the OpenBuildService available:
https://build.opensuse.org/project/show/home:tbechtold:awm
The RPM packages contain a system user, systemd service files
and a configuration file in /etc/awm/config.json
Configure¶
awm-crawler and awm-persister need both a configuration file. The
default path is ~/.config/awm/config.json
. Here’s a
example configuration.
{
"kafka" : {
"servers": "HOST:PORT",
"topic_name": "awm-crawler",
"ssl": {
"enabled": true,
"cafile": "./cacert",
"certfile": "./certfile",
"keyfile": "./keyfile",
"password": "SECRET"
}
},
"persister": {
"postgres": {
"uri": "postgres://USERNAME:PASSWORD@HOST:PORT/DATABASE?sslmode=require"
}
},
"crawler": {
"interval": 5.0,
"urls": {
"https://toabctl.de": { "interval": 1.0, "regex": ".*html.*" },
"https://aiven.io": {},
"https://google.com": {}
}
}
}
Most of the kafka and persister options should be self-explanatory.
Note
the kafka topic configured with topic_name must already exist or kafka must be configured to automatically create new topics. awm will not create the topic.
Note
the database tables needed by awm-persister are automatically created but the database itself must already exist.
The crawler section contains the global check interval. It also contains a map of urls. Every url in that map will be periodically checked. There is also the possibility to do a regular expression check against the url response body. That’s optional.
Start¶
With the RPM packages, systemctl can be used to start the services:
systemctl start awm-crawler
systemctl start awm-persister
Contributing¶
Please use github pull requests against:
https://github.com/toabctl/awm
Make sure the tests and linters are passing. This is done via TravisCI but can also be executed locally:
$ tox -epy38 # for unittests
$ tox -elint # for linters (flake8, mypy)
$ tox -edocs # for documentation build