OPERAS Metrics System Overview
The software is designed to collect metrics from various sources and is divided into several sections. The most prominent is the Metrics-drivers-wrapper, which contains the packages called 'drivers'. These drivers serve as entry-point components, responsible for gathering data into the system. Next come the 'plugins', which normalize the collected data. Finally, the metrics are combined with the altmetrics and sent to the user interface, where they are displayed in a user-friendly JavaScript widget.
How it works
This system is divided into different sections. First, there are the drivers: the entry-point components responsible for gathering data into the system (see the architecture diagram above). In most cases we connect to the source's API to obtain the metrics, which is the preferred method. In two cases, however, we process a CSV file of metrics instead: 'Access Logs Local' and 'Google Books' (the latter is optional, as it can involve either web scraping Google Books or processing a CSV file uploaded by the user).
Next, we have the plugins, which are responsible for processing this data. Normally each plugin corresponds to a driver, except for 'JSTOR' and 'Access Logs', which obtain their data on their own: 'JSTOR' processes a CSV file supplied by the user, and 'Access Logs' calls Google Cloud directly, without any driver intervention. The resulting metrics are then saved to the database.
Last but not least, there is a second database that combines the metrics fetched by the drivers and plugins mentioned above with the altmetrics obtained from sites such as 'hypothes.is' and 'Wikipedia', among others. Finally, these combined metrics are sent to the frontend for display in a widget.
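Conceptually, the driver-to-plugin-to-database flow described above can be sketched in a few lines of Python. Every name below is illustrative only, not part of the actual codebase:

```python
# Illustrative sketch of the metrics pipeline; all names are hypothetical.

def driver_fetch(source: str) -> list[dict]:
    """A driver is the entry point: it gathers raw data from one platform."""
    # Most drivers call the source's API; 'Access Logs Local' and
    # 'Google Books' can instead read a CSV file.
    return [{"source": source, "isbn": "9781906924010", "views": 120}]

def plugin_process(raw: list[dict]) -> list[dict]:
    """A plugin normalizes a driver's output into common metric records."""
    return [{"uri": f"urn:isbn:{r['isbn']}",
             "measure": "views",
             "value": r["views"]} for r in raw]

def save(records: list[dict], db: list[dict]) -> None:
    """Persist normalized records (a stand-in for the metrics database)."""
    db.extend(records)

db: list[dict] = []
save(plugin_process(driver_fetch("google-books")), db)
```

The real modules are of course far richer, but each step of the sketch corresponds to one stage of the pipeline: gather, normalize, store.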
Section for each service/module
Metrics-drivers-wrapper
The most interesting modules are the Drivers and the Plugins, which are explained below:
- Drivers: independent modules that can be installed locally to collect and normalize data from a given platform. The drivers are packages published on PyPI, and you can install each one with a single command, e.g.:
pip install <package_name>
- Plugins: responsible for normalizing the data collected by the drivers. First, the variables are retrieved from a YAML file to aid in this normalization and filtering process. A translator is then used to convert the relevant data into an acceptable URI identifier, after which the data is saved to the database.
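A plugin's normalization step might look roughly like this. This is a hypothetical sketch: the settings dictionary stands in for the parsed YAML file, and the translate function stands in for the Identifier Translation Service:

```python
# Hypothetical sketch of a plugin's normalization step.
SETTINGS = {  # e.g. the parsed contents of the plugin's YAML config
    "measure": "downloads",
    "keep_countries": ["GB", "DE"],
}

def translate(identifier: str) -> str:
    """Stand-in for the Identifier Translation Service: map a raw
    identifier to an accepted URI form (here, a bare ISBN to a URN)."""
    return f"urn:isbn:{identifier}" if identifier.isdigit() else identifier

def normalize(raw_rows):
    """Filter rows using the YAML-derived settings, then translate IDs."""
    for row in raw_rows:
        if row["country"] not in SETTINGS["keep_countries"]:
            continue  # filtering step driven by the YAML variables
        yield {
            "uri": translate(row["id"]),
            "measure": SETTINGS["measure"],
            "value": row["count"],
        }

rows = [{"id": "9781906924010", "country": "GB", "count": 5},
        {"id": "9781906924010", "country": "US", "count": 2}]
records = list(normalize(rows))  # only the GB row survives the filter
```

The key idea is that the YAML file drives filtering, while the translator guarantees every record is keyed by an acceptable URI before it reaches the database.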
Identifier Translation Service and Tokens API
The Identifier Translation Service is used to normalize identifier data. It is a JSON REST API to a database of publication URIs. The translation service maps works (publications) to URIs (e.g. info:doi:10.11647/obp.0001, urn:isbn:9781906924010, https://www.openbookpublishers.com/product/3) to allow converting from one identifier to another.
Centrally-managed OPERAS Metrics
This is the final step: the data gathered by the drivers and plugins is combined with the altmetrics retrieved from the Crossref Relationships API, a separate service that aggregates results from sources such as Hypothes.is, Wikipedia and WordPress.
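A rough sketch of this combination step (purely illustrative; the real service stores the combined metrics in its own database before serving them to the widget):

```python
from collections import defaultdict

def combine(metrics, altmetrics):
    """Group driver/plugin metrics and altmetrics by work URI so the
    frontend widget can display them together (illustrative only)."""
    by_work = defaultdict(list)
    for record in metrics + altmetrics:
        by_work[record["uri"]].append((record["measure"], record["value"]))
    return dict(by_work)

combined = combine(
    [{"uri": "urn:isbn:9781906924010", "measure": "downloads", "value": 42}],
    [{"uri": "urn:isbn:9781906924010", "measure": "wikipedia-citations", "value": 3}],
)
# combined holds one work with both a usage metric and an altmetric attached
```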
How to install / configure the Metrics-Drivers project
There are two different ways of installing and configuring the system:
Production environment setup: use Docker
All the configuration files for this step are located in the docker directory. Run the system as follows:
Build the Docker image from the parent directory:
$ sudo docker build -t metrics-drivers -f docker/Dockerfile .
- Desired output:
...
Successfully built 20d1a201d6ed
Successfully tagged metrics-drivers:latest
Create an environment file, docker/env.docker, with the following content:
FLASK_APP=core
FLASK_ENV=development
CONFIG=DevConfig
DRIVERS_SETTINGS_SOURCE='TEST'
DB_USER=your_db_user
DB_PASSWORD=your_db_password
DB_HOST=your_db_host
DB_NAME=your_db_name
DB_PORT=your_db_port
TOKENS_KEY=your_tokens_key
TOKENS_EMAIL=your_tokens_email
TOKENS_PASSWD=your_tokens_passwd
ALTMETRICS_USER=your_altmetrics_user
ALTMETRICS_PASSWORD=your_altmetrics_password
TRANSLATION_API_BASE=your_translation_api_base
REDIS_HOST=your_redis_host
REDIS_PORT=your_redis_port
RMQ_AMQ_SCHEME=your_rmq_amq_scheme
RMQ_USER=your_rmq_user
RMQ_PASSWORD=your_rmq_password
RMQ_HOST=your_rmq_host
RMQ_PORT=your_rmq_port
RMQ_VHOST=your_rmq_vhost
CONSUL_HOST=your_consul_host
CONSUL_TOKEN=your_consul_token
SENTRY_DSN=your_sentry_dsn
METRICS_API_BASE=your_metrics_api_base
Execute the container:
- docker run --env-file docker/env.docker -it --rm -p 80:80 metrics-drivers
Install every component locally
The second option is meant for developers and involves installing everything locally, step by step; prior knowledge of Python, Flask and microservices is therefore ideal.
Please make sure you have gone through the system requirements specified in section 2 of this documentation before proceeding with the setup.
First, clone the repository metrics-drivers: https://gitlab.com/ubiquitypress/metrics-drivers-wrapper
Set up a virtual environment:
with Pyenv
- $ pyenv virtualenv 3.10.6 <name>
with Venv
- $ virtualenv myvenv
- $ source myvenv/bin/activate
Install requirements
pip install -r requirements.txt
Create a file ~/.bash_aliases with the content:
# ==========================
# METRICS-DRIVERS-DB-WRAPPER
# ==========================
alias go2metrics-service="cd ~/Documents/projects/metrics-drivers-wrapper; pyenv activate metrics-drivers"
alias with_md_env_="~/.bash_scripts/load_metrics_drivers_env.bash"
alias with_md_env_test="~/.bash_scripts/load_metrics_drivers_env_test.bash"
alias flask_metrics_="with_md_env_ flask"
alias flask_metrics_shell="with_md_env_ flask shell"
alias flask_metrics_run_tests="with_md_env_test python -m unittest discover core.tests -t . -v"
Create a directory ~/.bash_scripts.
Create a file called load_metrics_drivers_env.bash with the content below:
#!/bin/bash
export FLASK_APP=core
export FLASK_ENV=development
export CONFIG=DevConfig
export RMQ_HOST=localhost
export RMQ_VHOST=metrics
export RMQ_PASSWORD=password
export RMQ_USER=user
export DB_USER='your-user'
export DB_PASSWORD='your-password'
export DB_HOST='localhost'
export PORT='5432'
export DB_NAME='metrics-drivers'
# Live Values
export TOKENS_KEY=""
export TRANSLATION_API_BASE=""
export DRIVERS_SETTINGS_SOURCE='YAML'
# Run the command passed as arguments (e.g. `flask db upgrade`)
"$@"
Make the above script executable:
chmod u+x ~/.bash_scripts/load_metrics_drivers_env.bash
Add this line to your ~/.bashrc
file and restart the shell:
source ~/.bash_aliases
Run the database migrations (from your metrics-drivers-wrapper source directory):
flask_metrics_ db upgrade
- Sample output:
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
Execute the Flask shell to make sure you have access (check whether there are any errors):
flask_metrics_shell
- Sample output:
Python 3.10.13 (main, Sep 27 2023, 10:58:53) [GCC 13.2.1 20230801] on linux
IPython: 8.15.0
App: core
Instance: <your metrics-drivers directory>
In [1]:
Create a test script file ~/.bash_scripts/load_metrics_drivers_env_test.bash with the content below:
#!/bin/bash
export FLASK_APP=core
export FLASK_ENV=development
export CONFIG=TestConfig
export DB_USER='test-db-user'
export DB_PASSWORD='test-db-password'
export DB_HOST='localhost'
export PORT='5432'
export DB_NAME='metrics-drivers-test'
# Run the command passed as arguments
"$@"
Make the above script executable:
chmod u+x ~/.bash_scripts/load_metrics_drivers_env_test.bash
Finally, run your tests:
$ flask_metrics_run_tests
- Desired output:
----------------------------------------------------------------------
Ran 13 tests in 0.835s
OK