Suppose you have constructed a biolink-compliant knowledge graph, and want to deploy it as a TRAPI endpoint with limited fuss.
Plater is a web server that automatically exposes a Neo4j or Memgraph instance through TRAPI-compliant endpoints.
Plater brings several tools together to achieve this. It uses Reasoner Pydantic models for frontend validation
and Reasoner transpiler for translating TRAPI to and from cypher when querying the Neo4j or Memgraph backend. The Neo4j or Memgraph database
can be populated using KGX upload, which can consume numerous graph input formats. By pointing Plater at a Neo4j or Memgraph instance,
we can easily stand up a Knowledge Provider that offers the "lookup" operation and meta_knowledge_graph, and that serves as a platform for
distributing common code implementing future operations across any endpoint built with Plater. In addition, with some configuration
options (x-trapi parameters, etc.) we can easily register the new instance with SmartAPI.
Another tool that comes in handy with Plater is Automat, which exposes multiple Plater servers at a single public URL and proxies queries to them.
Nodes are expected to have the following core structure:
- id: as a neo4j or memgraph node property with label id
- category: array of biolink types as neo4j or memgraph node labels. Every node is required to have at least the node label "biolink:NamedThing".
- Additional attributes can be added and will be exposed (more details in the "Matching a TRAPI query" section).
Edges need to have the following property structure:
- subject: as a neo4j or memgraph edge property with label subject
- object: as a neo4j or memgraph edge property with label object
- predicate: as a neo4j or memgraph edge type
- id: as a neo4j or memgraph edge property with label id
- Additional attributes will be returned in the TRAPI response attributes section (more details in the "Matching a TRAPI query" section).
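The structure above can be checked before loading data. The following is a minimal sketch, not part of PLATER: the function names and the EX: identifiers are made up for illustration, and nodes and edges are represented as plain Python values rather than database objects.

```python
REQUIRED_NODE_LABEL = "biolink:NamedThing"


def check_node(labels, properties):
    """Return a list of problems for one node (an empty list means it looks fine)."""
    problems = []
    if "id" not in properties:
        problems.append("missing 'id' property")
    if REQUIRED_NODE_LABEL not in labels:
        problems.append(f"missing required label {REQUIRED_NODE_LABEL}")
    return problems


def check_edge(edge_type, properties):
    """Return a list of problems for one edge (an empty list means it looks fine)."""
    problems = []
    for key in ("subject", "object", "id"):
        if key not in properties:
            problems.append(f"missing '{key}' property")
    if not edge_type:
        problems.append("missing edge type (used as the predicate)")
    return problems


# Hypothetical example data; the EX: identifiers are placeholders.
node_problems = check_node(
    labels=["biolink:SmallMolecule", "biolink:NamedThing"],
    properties={"id": "EX:chem1", "name": "example chemical"},
)
edge_problems = check_edge(
    edge_type="biolink:decreases_activity_of",
    properties={"subject": "EX:chem1", "object": "EX:gene1"},
)
print(node_problems)
print(edge_problems)
```

Here the node passes, while the edge is reported as lacking its id property.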
PLATER matches nodes in neo4j or memgraph using node labels. It expects nodes to be labeled with biolink types, and a node can have multiple labels. When looking up a node from an incoming TRAPI query graph, the node type(s) are extracted, and by traversing the biolink model, all subtypes and mixins that go with the query node type(s) are used to look up nodes.
It's recommended to encode the biolink class genealogy in the node labels. For instance, a node that is known to be
a biolink:SmallMolecule can be assigned all of these classes: ["biolink:SmallMolecule", "biolink:MolecularEntity", "biolink:ChemicalEntity", "biolink:PhysicalEssence", "biolink:NamedThing", "biolink:Entity", "biolink:PhysicalEssenceOrOccurrent"].
With this encoding, an incoming query can be more relaxed (ask for biolink:NamedThing) or more specific (ask for biolink:SmallMolecule, etc.), and PLATER can use the encoded label information to find matching node(s).
Similarly, for edges, edge types in neo4j or memgraph are used to perform edge lookup. The predicate hierarchy in biolink is consulted to find
subclasses of the query predicate type(s), and those are combined with OR to find results.
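The idea of expanding a query type into its descendants and combining them with OR can be sketched as follows. This is an illustration only: TOY_CHILDREN is a hand-written stand-in for the biolink model, the helper names are hypothetical, and the generated string is not the cypher that Reasoner transpiler actually emits.

```python
# Toy parent -> children table; an assumption standing in for the biolink model.
TOY_CHILDREN = {
    "biolink:NamedThing": ["biolink:ChemicalEntity"],
    "biolink:ChemicalEntity": ["biolink:MolecularEntity"],
    "biolink:MolecularEntity": ["biolink:SmallMolecule"],
    "biolink:SmallMolecule": [],
}


def expand(types, children=TOY_CHILDREN):
    """Return the given types plus all of their descendants, sorted."""
    seen = set()
    stack = list(types)
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(children.get(current, []))
    return sorted(seen)


def label_clause(variable, types):
    """Build an OR-combined, cypher-style label test for one query node."""
    return " OR ".join(f"{variable}:`{t}`" for t in expand(types))


print(expand(["biolink:ChemicalEntity"]))
print(label_clause("n", ["biolink:MolecularEntity"]))
```

The same expansion applies to predicates: a query predicate is replaced by itself plus its sub-predicates before matching edge types.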
Plater does subclass inference if subclass edges are encoded into the graph. For example, let A be a superclass of B and C, and let B and C be related to D and E respectively:
(A) <- biolink:subclass_of - (B) - biolink:decreases_activity_of -> (D)
    <- biolink:subclass_of - (C) - biolink:decreases_activity_of -> (E)
Querying for the graph structure (A) - [biolink:decreases_activity_of] -> (?) in TRAPI would give us back nodes D and E.
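The inference in this example can be reproduced on a toy edge list. The sketch below is not PLATER's implementation (PLATER does this through cypher against the database); it only mirrors the logic with in-memory triples, and the function names are hypothetical.

```python
SUBCLASS = "biolink:subclass_of"

# (subject, predicate, object) triples mirroring the example above.
EDGES = [
    ("B", SUBCLASS, "A"),
    ("C", SUBCLASS, "A"),
    ("B", "biolink:decreases_activity_of", "D"),
    ("C", "biolink:decreases_activity_of", "E"),
]


def with_subclasses(node, edges):
    """Return node plus every node that reaches it through subclass_of edges."""
    found = {node}
    frontier = [node]
    while frontier:
        parent = frontier.pop()
        for subj, pred, obj in edges:
            if pred == SUBCLASS and obj == parent and subj not in found:
                found.add(subj)
                frontier.append(subj)
    return found


def lookup(subject, predicate, edges):
    """Find objects related to the subject or to any of its subclasses."""
    subjects = with_subclasses(subject, edges)
    return sorted(obj for subj, pred, obj in edges if pred == predicate and subj in subjects)


print(lookup("A", "biolink:decreases_activity_of", EDGES))  # ['D', 'E']
```

Querying for B instead of A returns only D, since C is not a subclass of B.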
Plater tries to resolve attribute types and value types for edges and nodes in the following ways.
- attr_val_map.json: This file has the following structure:
  {
    "attribute_type_map": { "<attribute_name_in_neo4j>": "TRAPI_COMPLIANT_ATTRIBUTE_NAME" },
    "value_type_map": { "<attribute_name_in_neo4j>": "TRAPI_COMPLIANT_VALUE_TYPE" }
  }
  To explain this a little further, suppose we have an attribute called "equivalent_identifiers" stored in the graph. Our attr_val_map.json would be:
  {
    "attribute_type_map": { "equivalent_identifiers": "biolink:same_as" },
    "value_type_map": { "equivalent_identifiers": "metatype:uriorcurie" }
  }
  When nodes or edges that have equivalent_identifiers are returned, they would have:
  "MONDO:0004969": {
    "categories": [...],
    "name": "acute quadriplegic myopathy",
    "attributes": [
      {
        "attribute_type_id": "biolink:same_as",
        "value": [ "MONDO:0004969" ],
        "value_type_id": "metatype:uriorcurie",
        "original_attribute_name": "equivalent_identifiers",
        "value_url": null,
        "attribute_source": null,
        "description": null,
        "attributes": null
      }
    ]
  }
- In cases where there are attributes in the graph that are not specified in attr_val_map.json, PLATER will try to resolve a biolink class from the original attribute name using the Biolink model toolkit.
- If the above steps fail, the attribute will be presented with "attribute_type_id": "biolink:Attribute" and "value_type_id": "EDAM:data_0006".
- If there are attributes that are not needed for presentation through TRAPI, Skip_attr.json can be used to specify attribute names in neo4j or memgraph to skip. KGX loading adds the new attributes provided_by and knowledge_source to nodes and edges respectively, which hold the file name used to load the graph. By default, these are included in the skip list.
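The resolution order described in this list can be sketched as a small function. This is an illustration under assumptions, not PLATER's code: toolkit_lookup is a hypothetical stand-in for the Biolink model toolkit call, its one known entry is a toy value, and falling back to the default value type whenever value_type_map has no entry is an assumption.

```python
ATTR_VAL_MAP = {
    "attribute_type_map": {"equivalent_identifiers": "biolink:same_as"},
    "value_type_map": {"equivalent_identifiers": "metatype:uriorcurie"},
}
SKIP = {"provided_by", "knowledge_source"}


def toolkit_lookup(name):
    """Hypothetical stand-in for a Biolink model toolkit lookup; None if unknown."""
    known = {"publications": "biolink:publications"}  # toy data
    return known.get(name)


def resolve(name, value, attr_val_map=ATTR_VAL_MAP, skip=SKIP):
    """Build a TRAPI-style attribute, or return None for skipped attributes."""
    if name in skip:
        return None
    type_id = (
        attr_val_map["attribute_type_map"].get(name)  # 1. explicit mapping
        or toolkit_lookup(name)                       # 2. biolink lookup by name
        or "biolink:Attribute"                        # 3. generic fallback
    )
    # Assumption: use the generic value type unless one is mapped explicitly.
    value_type_id = attr_val_map["value_type_map"].get(name, "EDAM:data_0006")
    return {
        "attribute_type_id": type_id,
        "value": value,
        "value_type_id": value_type_id,
        "original_attribute_name": name,
    }


print(resolve("equivalent_identifiers", ["MONDO:0004969"]))
print(resolve("some_unknown_attribute", 42))
print(resolve("provided_by", "graph_file"))
```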
When the PROVENANCE_TAG environment variable is set to something like infores:automat.ctd, PLATER will return provenance information on edges.
To run the web server directly:
cd <PLATER-ROOT>
python<version> -m venv venv
source venv/bin/activate
pip install -r PLATER/requirements.txt
Populate the .env-template file with settings and save it as .env in the repo root directory.
WEB_HOST=0.0.0.0
WEB_PORT=8080
GRAPH_DB=neo4j
NEO4J_HOST=neo4j
NEO4J_BOLT_PORT=7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=<change_me>
GRAPH_QUERY_TIMEOUT=600
PLATER_TITLE='Plater'
PLATER_VERSION='1.5.1'
BL_VERSION='4.1.6'
./main.sh
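A quick check of the .env contents can catch a missing setting before the server starts. The sketch below is not shipped with PLATER; the list of required keys is an assumption based on the settings shown above, and the sample text is inlined so the example runs on its own.

```python
# Assumption: these are the keys needed for a neo4j-backed instance.
REQUIRED_KEYS = [
    "WEB_HOST", "WEB_PORT", "GRAPH_DB", "NEO4J_HOST",
    "NEO4J_BOLT_PORT", "NEO4J_USERNAME", "NEO4J_PASSWORD",
]


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blanks and comments, stripping quotes."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def missing_keys(settings, required=REQUIRED_KEYS):
    """Return required keys that are absent or empty."""
    return [key for key in required if not settings.get(key)]


# Sample with NEO4J_PASSWORD deliberately left out.
SAMPLE = """\
WEB_HOST=0.0.0.0
WEB_PORT=8080
GRAPH_DB=neo4j
NEO4J_HOST=neo4j
NEO4J_BOLT_PORT=7687
NEO4J_USERNAME=neo4j
PLATER_TITLE='Plater'
"""

settings = parse_env(SAMPLE)
print(missing_keys(settings))  # ['NEO4J_PASSWORD']
```

To check a real file, pass its contents instead, for example parse_env(open(".env").read()).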
Or build an image and run it.
cd PLATER
docker build --tag <image_tag> .
cd ../
docker run --env-file .env \
  --name plater \
  -p 8080:8080 \
  --network <network_where_neo4j_is_running> \
  <image_tag>
Clustering with Automat Server [Optional]
You can also serve several instances of Plater through a common gateway (Automat). For specific instructions, please refer to AUTOMAT's readme.
The /about endpoint can be used to present metadata about the current PLATER instance.
This metadata is served from the <repo-root>/PLATER/metadata/about.json file. You can edit the contents of
this file to suit your needs. In a containerized environment, we recommend mounting this file as a volume.
E.g.:
docker run -p 0.0.0.0:8999:8080 \
--env NEO4J_HOST=<your_neo_host> \
--env NEO4J_HTTP_PORT=<your_neo4j_http_port> \
--env NEO4J_USERNAME=neo4j \
--env NEO4J_PASSWORD=<neo4j_password> \
--env WEB_HOST=0.0.0.0 \
-v <your-custom-about>:/<path-to-plater-repo-home>/plater/about.json \
--network=<docker_network_neo4j_is_running_at> \
<image_tag>