
Wednesday, October 19th, 2022 2:07 PM

detailed DGC logging

Hi All, we would like to explore detailed logging in the DGC, i.e.

  1. API calls - It seems there is an option for that in the console, but where can the actual logs be found?

  2. Administrative function logging: if somebody changes the scope of a domain, or removes an attribute from an asset type, etc., where are these actions logged?
    Thanks!
    Tom Kuppens

1.2K Messages

2 years ago

  1. No idea, I’m very confused by the documentation: “Configure the logging of Collibra DGC API calls”. It says logs are written to a database and overwritten (so I guess they don’t use an immutable log), and the documentation belongs to the “DiagnosticFiles” section, so I imagine the purpose is only to create a diagnostic file? It does not look like a feature that is meant to be activated permanently.

  2. No such feature OOTB.

  • Collibra has developed a marketplace tool, OMRE (Operating Model Reverse Engineering), that is pretty nice. You run it on a regular basis in DEV only (not prod), and it copies the metamodel into assets, relations, domains, etc. This gives you full visibility/audit on who created what, when, etc. It’s slow as hell though, since every single action is performed individually rather than relying on mechanisms such as import/export. It takes about 1h30 for the metamodel alone, and I had to kill it after 12h when trying to also get metrics.
  • I have developed a small metadata tracking tool in Python (150 lines of code) that flattens the model into a set of relational tables, then performs SCD2 (slowly changing dimension type 2) on top of it to keep track of everything that happens. This runs in about 2 minutes.
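
To illustrate the SCD2 idea mentioned above, here is a minimal sketch (not the actual 150-line tool). It assumes each snapshot is a list of flat dicts and that one field is a stable business key; the function name `apply_scd2` and the `valid_from`/`valid_to` column names are hypothetical, chosen only for this example.

```python
from datetime import datetime, timezone


def apply_scd2(history, snapshot, key, now=None):
    """Merge a fresh snapshot into an SCD2 history (a list of dicts).

    Each history row carries 'valid_from' and 'valid_to'; the open
    (current) version of a record has valid_to set to None.
    """
    now = now or datetime.now(timezone.utc)
    # Index the currently open version of each record by its key.
    open_rows = {row[key]: row for row in history if row["valid_to"] is None}
    seen = set()

    for record in snapshot:
        k = record[key]
        seen.add(k)
        current = open_rows.get(k)
        if current is not None:
            tracked = {c: v for c, v in current.items()
                       if c not in ("valid_from", "valid_to")}
            if tracked == record:
                continue  # unchanged: keep the open version as it is
            current["valid_to"] = now  # changed: close the old version
        # New or changed record: open a new version.
        history.append({**record, "valid_from": now, "valid_to": None})

    # Records missing from the snapshot were deleted: close them.
    for k, row in open_rows.items():
        if k not in seen:
            row["valid_to"] = now
    return history
```

Run against a periodic export, every change then shows up as a closed row followed by a new open row, so you can see when something changed even if the source keeps no history of its own.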

The OMRE has really nice features, such as prebuilt diagrams that are super useful for questions such as “tell me all assignments that use this attribute/relation”
image

or “how are glossary domain types being used?”

41 Messages

2 years ago

On the API log, the documentation is indeed sparse. I was under the impression that the API call logging option under general settings would be more like a permanent logging feature. Under log settings, there is another “API call logging” setting (see below), which can only be enabled for a certain amount of time. Confusing. We raised a ticket for this, but no feedback yet.

I will dig a bit deeper into the OMRE, but it looks like an expensive solution for something that could be solved OOTB by simple event logging. It doesn’t have to be activity log stuff; I would settle for just low-level dgc.log. :slight_smile:
Yesterday a set of domains all of a sudden had its scope changed to a default scope. Many of our assets appeared empty. Everything is re-set now, and our users are happy again, but we have no clue what happened, no root cause analysis possible here.

Thanks for the insights, Arthur!

Tom

1.2K Messages

  1. Looking at the Collibra repo, it seems there is indeed a “component_call_logging” table, which I guess is the target of this setting. I think the main objective of API call logging is to measure how long Java API calls are taking.
    This is a very obscure functionality; it should either be properly documented or removed.

  2. Regarding scopes, you’re absolutely right!
    There is absolutely ZERO trace of scope changes (not in the logs, not activities, not even in the database).
    For info, there are currently no plans to fix that; the issue was proposed in 2021 and turned down.
    Audit Trail for ‘Settings’ Tab Changes | Ideation platform (collibra.com)


Even when looking into the repo, there are two tables, “scopes_communities” and “scopes_domains”, with no timestamp and no user modification info... just very simple association tables with “scope_id” and “domain_id/community_id”.
image

BUT changing the domains/communities of a scope will trigger an update of the scope object, so you can at least find the scope’s latest update time and “updated by” user.
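
Since the association tables themselves carry no history, one workaround is to compare two snapshots of them (for example, taken from two restored backups). A minimal sketch, assuming you have already loaded the (scope_id, domain_id) pairs from “scopes_domains” into Python; the function name `diff_scope_links` is hypothetical and the loading step is left out on purpose:

```python
def diff_scope_links(before, after):
    """Compare two snapshots of (scope_id, domain_id) pairs.

    Returns the links that appeared and the links that disappeared
    between the two snapshots.
    """
    before, after = set(before), set(after)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }


# Example with made-up identifiers: domain "d2" moved from scope "s1" to "s2".
yesterday = [("s1", "d1"), ("s1", "d2")]
today = [("s1", "d1"), ("s2", "d2")]
changes = diff_scope_links(yesterday, today)
```

This only tells you what changed between the two snapshots, not who changed it; for the "who", the update fields on the scope object are still the only lead.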

41 Messages

2 years ago

How do you get access to the repository? This looks mighty interesting!
Thanks!
Tom

1.2K Messages

2 years ago

The repo is a postgres database available in the backup zip (without password). You can restore it to a postgres server: https://datacitizens.collibra.com/forum/t/backup-recovery-issues-in-2022-08-2022-09/2333/4?u=arthur.burkhardt :wink:

2 years ago

Hi all! We are also interested in getting detailed logging, thank you for the insights! :+1:
However, a question to the Collibra team: are detailed statistics on the roadmap, such as:

  • search statistics - by userid/date/search pattern
  • view statistics - by userid/date/asset

1.2K Messages

There are navigation statistics in the repo, but I don’t know of any API or UI to get access to the data.
Someone has recommended Google Analytics in the past as well; we haven’t used it, but I’ve heard good things.
