FU

Monday, November 16th, 2020 3:22 PM

ER diagram in Collibra

Hi everybody,
I started using Collibra a few months ago, with the initial goal of delivering a data catalog to the data citizens in my company.
It would be useful for our data analysts/scientists to have an ER diagram, so that they can write SQL queries based on the information they find in Collibra. Is Collibra the right tool to implement this deliverable, or is this out of Collibra's scope?
Thank you very much in advance for your contribution.

Best regards,
Fabio

1.2K Messages

3 years ago

Too bad I can only give you 1 like, I would like to give you my whole likes budget. :joy:
Yes, yes, yes: ER models and queries are two artifacts that are immensely useful for understanding data and how to use it, but sadly there is no such functionality in Collibra at present.
It is possible to emulate it poorly with the lineage feature and custom assets, but I don’t think that is a suitable replacement.

ER diagram: I’m not sure this is on the Collibra roadmap. There is the “logical data dictionary” feature, but that’s about it.
Tools such as idera, dataedo or erwin are really meant for that feature.

Queries: I guess this is what lies behind “Compositing” (step 11 in the data intelligence journey): data virtualization that allows you to discover data, share queries, manage active metadata, etc. This is already a standard feature of many analytics data catalogs; hopefully it will come to Collibra in 2021.

1.2K Messages

Hi Arthur, thank you very much for your really helpful answer. The reference to “Compositing” (step 11 in the data intelligence journey) is great too. I’ll dig into it and, if you are available, come back to you if I have any doubts.
Thank you very much indeed.

1.2K Messages

3 years ago

Hi Fabio,

If your only focus is on Logical ER and physical queries, then best of breed tooling (ERwin et al) will be a good fit - noting that Collibra did have some ability to import models from some of these tools (via legacy Collibra Connect), see the Marketplace for any current integrations.

However, in the Data Governance space, there is a lot more to worry about than just an entity and its attributes. Yes, Collibra can do Conceptual / Logical / Physical modeling using the respective asset types and relationships (and you could add custom relationships for each of the cardinalities like 1-to-many). An excellent hidden gem is the Guided Stewardship Operating Model which has best practice advice on modeling. Collibra extends the data modeling in so many ways (policies, glossary, ownership, lineage & technical lineage, workflow, reference data, data quality, reports, data access etc. etc.), so the out of the box assets look remarkably complicated to begin with vs just an ER diagram. You still can’t generate SQL queries from a model in Collibra unfortunately.
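Since Collibra cannot generate SQL from a model, one workaround is to derive the query outside Collibra from the relationship metadata. The Python below is only a minimal sketch of that idea, not a Collibra feature or API: the `Relation` class, the `join_sql` helper, and the table and key names are all hypothetical, and it assumes the relationship (with its cardinality) has already been exported from the catalog by some other means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical record for one modeled relationship between two tables."""
    parent: str        # table on the "one" side
    child: str         # table on the "many" side
    parent_key: str    # key column in the parent table
    child_key: str     # foreign key column in the child table
    cardinality: str = "1-to-many"  # informational only in this sketch


def join_sql(rel, columns=("*",)):
    """Build a simple inner join from a single relationship (illustrative only)."""
    cols = ", ".join(columns)
    return (
        f"SELECT {cols} FROM {rel.parent} "
        f"JOIN {rel.child} "
        f"ON {rel.parent}.{rel.parent_key} = {rel.child}.{rel.child_key}"
    )


# Hypothetical example relationship: one customer has many orders.
customer_orders = Relation("customer", "orders", "id", "customer_id")
query = join_sql(customer_orders)
```

In practice an analyst would still need to review the generated join, since the cardinality alone does not say whether an inner or outer join is appropriate.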

Think about Collibra as giving you the ability to do top-to-bottom semantic mapping, from the highest level policy, through a business term, to the data model (conceptual > logical), and then to the lowest level column. Then, add left-to-right data lineage across the data flows right out to an attribute in a report. This takes a lot of work (including connectors to ingest schema), but unlocks a large number of governance use cases / benefits.

Are you reverse engineering an existing schema, or starting from a green field?

Regards,
JPS

1.2K Messages

Thank you very much John, your answer is really clear. Thanks also for the reference to the “Guided Stewardship Operating Model”. I’m trying to learn more about it on collibra.com. Is this a workflow that I can download into my Collibra test environment to see if it works for me?
Thank you very much indeed.

1.2K Messages

Is this a workflow that I can download into my Collibra test environment to see if it works for me?

Yes indeed. It is also on the Collibra product roadmap for January 2021, so I guess it will be available out of the box by then, with some enhancements.

1.2K Messages

3 years ago

Hi Fabio, one problem with showing ER diagrams in Collibra is that the traversal strategy is really better suited to lineage than to data models. This means that some nodes and edges you’d like to have traversed to get a complete ER diagram will be skipped.
In the diagram view editor, and also in the resulting diagram, you will see a dropdown for the traversal strategy.
The end2end, upstream and downstream options are meant for lineage.
You can try to use ‘complete’ for ER diagrams, but be very careful: this could lead to a gigantic diagram that includes too many nodes and edges.
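To illustrate why a lineage-style traversal can skip parts of a data model, here is a minimal Python sketch. It is not how Collibra implements diagrams: the graph, the table names and the `traverse` function are hypothetical, and only the strategy names are borrowed from the dropdown. It assumes that upstream and downstream follow edge direction from the start node, that end2end is the union of the two, and that ‘complete’ ignores direction altogether.

```python
from collections import deque

# Hypothetical directed edges (source -> target) between tables.
EDGES = [
    ("customer", "order"),
    ("product", "order_line"),
    ("order", "order_line"),
    ("order_line", "invoice"),
]


def traverse(start, strategy):
    """Return the set of nodes reached from `start` under the given strategy."""
    if strategy == "end2end":
        # Assumption: end2end = everything upstream plus everything downstream.
        return traverse(start, "upstream") | traverse(start, "downstream")

    forward, backward = {}, {}
    for src, dst in EDGES:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)

    def neighbours(node):
        if strategy == "downstream":
            return forward.get(node, [])
        if strategy == "upstream":
            return backward.get(node, [])
        if strategy == "complete":
            # Direction is ignored, so every connected node is eventually reached.
            return forward.get(node, []) + backward.get(node, [])
        raise ValueError(f"unknown strategy: {strategy}")

    # Plain breadth-first search over the chosen neighbour function.
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in neighbours(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


end2end_nodes = traverse("order", "end2end")
complete_nodes = traverse("order", "complete")
```

Starting from `order`, the end2end result leaves out `product`, because it is reachable only by going downstream to `order_line` and then back upstream. The ‘complete’ result picks it up, but on a real catalog the same direction-agnostic walk keeps expanding through every connected asset, which is where the gigantic diagrams come from.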

Hope this helps

ER/win Dral :slight_smile:
Product Manager Diagrams

1.2K Messages

Hi Erwin, thank you very much for your contribution. Yes, I understand. Thank you again.
