Configuration
The MedCAT Service can be configured with environment variables at startup.
Service environment variables
The following environment variables are available for tailoring the MedCAT Service gunicorn server:
- SERVER_HOST - specifies the host address (default: 0.0.0.0)
- SERVER_PORT - the port number used (default: 5000)
- SERVER_WORKERS - the number of workers serving the Flask app in parallel (default: 1; only used in production server)
- SERVER_WORKER_TIMEOUT - the maximum time (in seconds) to wait for a response from a worker (default: 300; only used with production server)
- SERVER_GUNICORN_MAX_REQUESTS - the maximum number of requests a worker will process before restarting (default: 1000)
- SERVER_GUNICORN_MAX_REQUESTS_JITTER - adds randomness to MAX_REQUESTS to avoid all workers restarting simultaneously (default: 50)
- SERVER_GUNICORN_EXTRA_ARGS - any additional Gunicorn CLI arguments you want to pass (default: none). Example value: "SERVER_GUNICORN_EXTRA_ARGS=--backlog 20"
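As a rough illustration of how these variables could map onto a Gunicorn command line, here is a minimal Python sketch. It is not the service's actual startup script: the helper name build_gunicorn_command and the app path wsgi:app are hypothetical, and only the defaults are taken from the list above. The Gunicorn flags used (--bind, --workers, --timeout, --max-requests, --max-requests-jitter) are standard Gunicorn CLI options.

```python
import shlex

# Defaults copied from the variable list above; EXTRA_ARGS defaults to empty.
DEFAULTS = {
    "SERVER_HOST": "0.0.0.0",
    "SERVER_PORT": "5000",
    "SERVER_WORKERS": "1",
    "SERVER_WORKER_TIMEOUT": "300",
    "SERVER_GUNICORN_MAX_REQUESTS": "1000",
    "SERVER_GUNICORN_MAX_REQUESTS_JITTER": "50",
    "SERVER_GUNICORN_EXTRA_ARGS": "",
}


def build_gunicorn_command(env, app="wsgi:app"):
    """Return the Gunicorn argv implied by the SERVER_* variables.

    `app` is a hypothetical module path, not the service's real entry point.
    """
    # Unset or empty variables fall back to the documented defaults.
    cfg = {name: env.get(name) or default for name, default in DEFAULTS.items()}
    cmd = [
        "gunicorn",
        "--bind", f"{cfg['SERVER_HOST']}:{cfg['SERVER_PORT']}",
        "--workers", cfg["SERVER_WORKERS"],
        "--timeout", cfg["SERVER_WORKER_TIMEOUT"],
        "--max-requests", cfg["SERVER_GUNICORN_MAX_REQUESTS"],
        "--max-requests-jitter", cfg["SERVER_GUNICORN_MAX_REQUESTS_JITTER"],
    ]
    # Extra arguments are split the way a shell would, so quoting is respected.
    cmd += shlex.split(cfg["SERVER_GUNICORN_EXTRA_ARGS"])
    cmd.append(app)
    return cmd
```

Calling build_gunicorn_command(os.environ) (after import os) would show the command implied by the current environment, which can help when checking a configuration before deploying it.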
The following environment variables are available for tailoring the MedCAT Service wrapper:
- APP_MODEL_NAME - an informative name of the model used by MedCAT (optional)
- APP_MODEL_CDB_PATH - the path to the model's concept database
- APP_MODEL_VOCAB_PATH - the path to the model's vocabulary
- APP_MODEL_META_PATH_LIST - the list of paths to meta-annotation models, each separated by a : character (optional)
- APP_BULK_NPROC - the number of threads used in bulk processing (default: 8)
- APP_MEDCAT_MODEL_PACK - the path to a MedCAT Model Pack. If this parameter has a value, IT WILL BE LOADED FIRST, OVER EVERYTHING ELSE (CDB, Vocab, MetaCATs, etc.) declared above
- APP_ENABLE_METRICS - enables Prometheus metrics collection, served on the path /metrics
- APP_ENABLE_DEMO_UI - enables the demo user interface for trying out models (default: False)
- APP_DEMO_UI_PATH - customises the path of the demo UI (default: /)
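The precedence between the model pack and the individual model files can be sketched as follows. This is an illustrative Python helper under stated assumptions, not the service's actual loading code: the function name resolve_model_source and the shape of the returned dictionary are hypothetical, while the variable names and the : separator come from the list above.

```python
def resolve_model_source(env):
    """Decide which model files would be used for a given environment.

    Hypothetical helper: the model pack, when set, wins over the
    separately declared CDB, vocabulary and meta-annotation models.
    """
    pack = (env.get("APP_MEDCAT_MODEL_PACK") or "").strip()
    if pack:
        return {"kind": "model_pack", "path": pack}

    # No model pack: fall back to the individual components.
    meta_list = env.get("APP_MODEL_META_PATH_LIST") or ""
    return {
        "kind": "components",
        "cdb": env.get("APP_MODEL_CDB_PATH"),
        "vocab": env.get("APP_MODEL_VOCAB_PATH"),
        # Paths are separated by ':'; empty entries are ignored.
        "meta_models": [p for p in meta_list.split(":") if p],
    }
```

With both APP_MEDCAT_MODEL_PACK and APP_MODEL_CDB_PATH set, this sketch returns the model pack entry and ignores the CDB path, mirroring the rule described above.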
Shared Memory (DOCKER_SHM_SIZE)
The MedCAT service uses PyTorch multiprocessing and memory-mapped models, which rely on Linux shared memory (/dev/shm).
By default, Docker limits this to 64 MB, which is insufficient for NLP models.
Use the environment variable DOCKER_SHM_SIZE to control the size of shared memory inside the container.
You can set this variable in the env/general.env file.
- Recommended: 8g for bulk inference (APP_BULK_NPROC > 1)
- Minimum: 1g for single-process inference (APP_BULK_NPROC=1)
Example:
DOCKER_SHM_SIZE=8g
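To check from inside a running container whether the shared memory mount is at least as large as intended, something like the following Python sketch could be used. It assumes Docker-style size suffixes (b, k, m, g); the helper names parse_size and shm_is_sufficient are hypothetical and are not part of the MedCAT Service.

```python
import shutil

# Binary multipliers for Docker-style size suffixes (assumed: b, k, m, g).
_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(value):
    """Convert a size string such as '8g' or '64m' into a number of bytes."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    # No suffix: treat the value as a plain byte count.
    return int(value)


def shm_is_sufficient(required, path="/dev/shm"):
    """Return True if the filesystem at `path` is at least `required` in size."""
    return shutil.disk_usage(path).total >= parse_size(required)
```

For example, shm_is_sufficient("8g") run inside the container reports whether /dev/shm has at least the size recommended for bulk inference.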
Telemetry
MedCAT Service supports exporting traces using OpenTelemetry.
To enable distributed tracing and telemetry in the MedCAT Service, several environment variables must be set. These can be configured in your environment files or exported in your startup scripts (see start_service_debug.sh and related files):
| Environment Variable | Description | Example Value |
|---|---|---|
| APP_ENABLE_TRACING | Enable OpenTelemetry tracing in the application. | True |
| OTEL_TRACES_EXPORTER | Exporter to use for traces (commonly otlp). | otlp |
| OTEL_SERVICE_NAME | Logical service name for your traces. | medcat-service |
| OTEL_EXPORTER_OTLP_ENDPOINT | URL for your OpenTelemetry collector. | http://localhost:4317 |
| OTEL_EXPORTER_OTLP_PROTOCOL | Protocol to use for OTLP exporter. | grpc |
| OTEL_METRICS_EXPORTER | Set to none to disable metrics export, or another value if metrics are enabled. | none |
| OTEL_PYTHON_FASTAPI_EXCLUDED_URLS | Comma-separated list of URLs to exclude from tracing and metrics (e.g., health/metrics endpoints). | /api/health,/metrics |
| OTEL_EXPERIMENTAL_RESOURCE_DETECTORS | Additional resource detectors to use (comma-separated). | containerid,os |
See https://opentelemetry-python.readthedocs.io/en/latest/sdk/environment_variables.html for the full list of OpenTelemetry environment variables.
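A quick pre-flight check of the tracing configuration might look like the Python sketch below. Treating these four OTEL_* variables as required is an assumption made for illustration (the OpenTelemetry SDK has its own defaults for some of them), and the helper name missing_tracing_vars is hypothetical.

```python
# Variables treated as required once tracing is on (an assumption for this
# sketch; the OpenTelemetry SDK defines its own defaults for some of them).
REQUIRED_WHEN_TRACING = (
    "OTEL_TRACES_EXPORTER",
    "OTEL_SERVICE_NAME",
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_EXPORTER_OTLP_PROTOCOL",
)


def missing_tracing_vars(env):
    """List the tracing variables that are unset while tracing is enabled."""
    enabled = (env.get("APP_ENABLE_TRACING") or "").strip().lower() == "true"
    if not enabled:
        # Tracing is off, so nothing else is needed.
        return []
    return [name for name in REQUIRED_WHEN_TRACING if not env.get(name)]
```

Running missing_tracing_vars(os.environ) (after import os) in a startup script gives an empty list when the configuration is complete, or the names that still need a value.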
Performance Tuning
A range of factors can affect the performance of this service. The most obvious are the size of the processed documents (the amount of text per document) and the resources of the machine on which the service runs.
The main settings for improving performance when querying large numbers of documents are SERVER_WORKERS (the number of Flask web workers that can handle requests in parallel) and APP_BULK_NPROC (the number of threads used for annotation processing).
MedCAT library
MedCAT parameters are defined in the selected envs/medcat* file.
For details on available MedCAT parameters please refer to the official GitHub repository.