1 Introduction

The aim of TKCat (Tailored Knowledge Catalog) is to facilitate the management of data from knowledge resources that are frequently used alone or together in research environments. In TKCat, knowledge resources are manipulated as modeled database (MDB) objects. These objects provide access to the data tables along with a general description of the resource and a detailed data model, generated with ReDaMoR, that documents the tables, their fields and their relationships. These MDBs are then gathered in catalogs that can be easily explored and shared. TKCat provides tools to easily subset, filter and combine MDBs and to create new catalogs suited for specific needs.

The TKCat R package is licensed under GPL-3.

This vignette describes how TKCat can be used with a ClickHouse database. Users should also refer to the general TKCat user guide. In addition, a specific vignette describes how to create and administer a TKCat ClickHouse instance, and another provides extended documentation of collections.

2 Connection

The connection to a chTKCat database is achieved by instantiating a chTKCat object.

k <- chTKCat(
   host="localhost", # default parameter
   port=9101L,       # default parameter
   user="default",   # default parameter
   password=""       # if not provided the password is requested interactively 
)

Connection parameters should be adapted according to the ClickHouse database setup which is documented in the web exploration interface (in the “System” tab).
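When the connection parameters are not yet known to be correct, it can help to fail gracefully instead of stopping a script. The following is a minimal sketch, assuming that chTKCat() raises a standard R error when the server cannot be reached; try_connect() is a hypothetical helper and not part of TKCat.

```r
library(TKCat)

# Hypothetical helper (not part of TKCat): attempt a connection and
# return NULL instead of stopping when the connection fails.
try_connect <- function(
   host="localhost", port=9101L, user="default", password=""
){
   tryCatch(
      chTKCat(host=host, port=port, user=user, password=password),
      error=function(e){
         message("Could not connect to chTKCat: ", conditionMessage(e))
         NULL
      }
   )
}

k <- try_connect()
if(is.null(k)){
   message("Check the host, port and credentials of the ClickHouse setup.")
}
```

Returning NULL keeps the decision about what to do next (retry, ask for other credentials, stop) in the calling code.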

3 Use

The chTKCat object can be used as a TKCat object.

list_MDBs(k)                     # List all the MDBs in the catalog
khpo <- get_MDB(k, "HPO")        # Get a specific MDB from the catalog
search_MDB_tables(k, "disease")  # Search for tables about "disease"
search_MDB_fields(k, "disease")  # Search for fields about "disease"
collection_members(k)            # Get collection members of the different MDBs

The khpo object in the example above is a chMDB object. The data of this type of MDB remain in the ClickHouse database until the user requests them. A chMDB can be used like any other MDB object.
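As an illustration, the sketch below applies a few of the generic MDB functions described in the general TKCat user guide to a chMDB. It assumes a reachable chTKCat instance exposing an "HPO" MDB, as in the examples above; the exact output depends on the content of that instance.

```r
library(TKCat)

# Assumes a reachable chTKCat instance exposing an "HPO" MDB.
k <- chTKCat(host="localhost", port=9101L, user="default", password="")
khpo <- get_MDB(k, "HPO")

db_info(khpo)      # General description of the resource
data_model(khpo)   # ReDaMoR data model: tables, fields and relationships
names(khpo)        # Names of the tables
dims(khpo)         # Dimensions of the tables
heads(khpo)        # First records of each table

# Assumption: as_memoMDB() pulls all the tables into memory, which may be
# slow or impractical for large resources.
mhpo <- as_memoMDB(khpo)
```

Only the calls that return records (here heads() and as_memoMDB()) should need to transfer table data from ClickHouse; the others rely on the description and the data model.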

Users can only access the chMDBs that their rights allow (see the operation manual).

4 Reconnecting

If the connection to the database is lost, the user can reactivate it with the db_reconnect() function, which works with both chTKCat and chMDB objects.
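A minimal sketch of a reconnection is given below. It assumes that k and khpo were created as in the previous sections, that db_reconnect() updates the object it is given in place, and that credentials may be requested again depending on the setup.

```r
library(TKCat)

# Assumes a reachable chTKCat instance exposing an "HPO" MDB.
k <- chTKCat(host="localhost", port=9101L, user="default", password="")
khpo <- get_MDB(k, "HPO")

# ... later, after the connection has been lost
# (for example after a server restart):
db_reconnect(k)      # Restore the connection of the catalog
db_reconnect(khpo)   # Restore the connection of a single chMDB

list_MDBs(k)         # The catalog can be queried again
```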

5 Pushing an MDB to a chTKCat instance

An MDB can be added to the ClickHouse database of a chTKCat object with the as_chMDB() function.

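lk <- TKCat(read_fileMDB(                  # Local catalog built from the example ClinVar MDB
   path=system.file("examples/ClinVar", package="TKCat")
))
names(lk)                                  # Names of the MDBs in the local catalog
as_chMDB(lk$ClinVar, k, overwrite=FALSE)   # Push ClinVar to the chTKCat instance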

This function works only if both of the following conditions are met (see the operation manual):

  • the ClinVar database has been initialized by the administrator of the ClickHouse database,
  • the user is allowed to write in the database.

6 Acknowledgments

This work was entirely supported by UCB Pharma (Early Solutions department).