Data Engineering

"Initials" by "Florian Körner", licensed under "CC0 1.0". / Remix of the original. - Created with dicebear.comInitialsFlorian Körnerhttps://github.com/dicebear/dicebearDA
The job description vs reality
165
2
3 Key Takeaways from Airflow Summit 2023
www.astronomer.io
-1
0
Airflow Summit 2023 - Recordings Now Available
https://www.youtube.com/playlist?list=PLGudixcDaxY29qXIXhd90htHp_BFk-Bqf
0
0
Posted by nydas in Data Engineering
Unified Star Schema
https://towardsdatascience.com/the-new-unified-star-schema-paradigm-in-analytics-data-modeling-review-a245b2641dc8

Hi all, I was recently reading about the Unified Star Schema and the Puppini Bridge. I’m curious whether anyone here has experience with it and what their thoughts are. TIA

4
1
Who wants clean data?
19
1
Posted by bahmanm in Data Engineering
Visualising Kafka Internals

I plan to run a few tests to determine whether Kafka is suitable for a certain use case I have in mind. My idea is to run a local cluster of Kafka servers (either VMs or containers), produce and consume a series of messages, and observe a set of metrics (Prometheus & Grafana) along with custom business logic outcomes. What are some good tools to record and visualise the internals of a Kafka cluster? I'm looking for things like consumer lag, topic replication, possibly tracing messages, ... *Originally posted on https://mastodon.social/@bahmanm/110662538718523380*
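For anyone unsure what "consumer lag" means in the list above: per partition it is the gap between the log end offset and the consumer group's committed offset. A minimal sketch in plain Python, assuming the offsets have already been collected somewhere (the `consumer_lag` helper, the `orders` topic, and the numbers are all hypothetical, and treating a missing committed offset as 0 is an assumption, not Kafka's actual behaviour):

```python
from typing import Dict, Tuple

# (topic, partition) pair; a hypothetical stand-in for a real client's type.
TopicPartition = Tuple[str, int]


def consumer_lag(
    end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, int],
) -> Dict[TopicPartition, int]:
    """Return per-partition lag: log end offset minus committed offset."""
    lag = {}
    for tp, end in end_offsets.items():
        # Assumption: a partition with no committed offset counts from 0.
        committed = committed_offsets.get(tp, 0)
        # Clamp at zero in case the two readings were taken at different times.
        lag[tp] = max(end - committed, 0)
    return lag


# Hypothetical sample readings for a two-partition topic.
end_offsets = {("orders", 0): 120, ("orders", 1): 95}
committed = {("orders", 0): 100}

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

In a real test setup the two dictionaries would come from a Kafka client or a metrics exporter rather than being hard-coded.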

3
2
One big table vs a dimensional model

Hi fellow data engineers, I'm currently restructuring a pipeline written with PySpark on Databricks. Since it involves a lot of transformations, it results in an extensive DAG, but I'm fine with spending some extra processing resources to build a standard dimensional model (apart from the necessary transformations). I was wondering what real benefits you have seen from a star schema design compared with the "one big table" approach that I could preach to my team. (My main goal would be a smaller resulting Power BI model.) And as a side question, what tools do you use to create a dimensional model such as a star schema with code? Thanks a lot!
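To make the comparison concrete, here is a rough sketch of what splitting "one big table" into a dimension and a fact table involves. It uses plain Python rather than PySpark so it runs anywhere; the `split_star` helper, the column names, and the sample rows are hypothetical and only illustrate the idea:

```python
from typing import Dict, List, Tuple


def split_star(
    rows: List[dict], dim_cols: List[str], key_name: str
) -> Tuple[List[dict], List[dict]]:
    """Split flat rows into a dimension table and a fact table.

    Each distinct combination of dim_cols gets one dimension row with an
    integer surrogate key; fact rows keep the remaining columns plus that key.
    """
    dim_index: Dict[tuple, int] = {}
    dim_rows: List[dict] = []
    fact_rows: List[dict] = []
    for row in rows:
        natural = tuple(row[c] for c in dim_cols)
        if natural not in dim_index:
            # Assign surrogate keys in order of first appearance, starting at 1.
            dim_index[natural] = len(dim_index) + 1
            dim_rows.append(
                {key_name: dim_index[natural], **{c: row[c] for c in dim_cols}}
            )
        fact = {k: v for k, v in row.items() if k not in dim_cols}
        fact[key_name] = dim_index[natural]
        fact_rows.append(fact)
    return dim_rows, fact_rows


# Hypothetical flat table with customer attributes repeated on every order.
one_big_table = [
    {"order_id": 1, "customer_name": "Ada", "customer_country": "UK", "amount": 30.0},
    {"order_id": 2, "customer_name": "Ada", "customer_country": "UK", "amount": 12.5},
    {"order_id": 3, "customer_name": "Lin", "customer_country": "SG", "amount": 8.0},
]

dim_customer, fact_sales = split_star(
    one_big_table, ["customer_name", "customer_country"], "customer_key"
)
```

The repeated customer attributes are stored once in `dim_customer`, and `fact_sales` carries only a small integer key per row, which is where any size reduction in the downstream model would come from.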

3
1
Posted by nydas in Data Engineering
Free resource books
books.goalkicker.com

Thought I’d share this link. I’m not affiliated in any way.

8
2
Posted by nydas in Data Engineering
Data Vault 2.0 Advanced Material

Hey there community! Does anyone have any resources they could share relating to Data Vault 2.0, specifically the joining of SAL and PIT tables? The two main books on the architecture are very sparse in this area, which I would have thought would be a fairly key component for any mid-to-large organisation.

2
1
33 data offerings by AWS, Azure, Google Cloud

![](https://lemmy.ml/pictrs/image/9fd9e352-6cb8-413b-b6f9-1e40ab4d78c1.webp)

4
0
This one never gets old

![](https://lemmy.ml/pictrs/image/d703acc5-18c0-47a6-b4b1-4c5869b845e2.webp)

3
1
Data Engineering roadmap

What needs to be added for 2023?

12
6
Welcome to the community, want to join the mod team?

Fellow data engineers, we are looking forward to your contributions to and participation in the community. If you want to help manage the community, get in touch to join the team.

6
1