-
Notifications
You must be signed in to change notification settings - Fork 93
Naming things: Rename _kafka_msg_id
to _kafka__msg_id
#289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Unfortunately, CrateDB does not permit column names using leading underscores. However, if two underscores are included, it works. The patch improves storage alignment with the other column which gets materialized as `_kafka__data` also using two underscores.
We think the new field SHOW CREATE TABLE kafka_demo; CREATE TABLE IF NOT EXISTS "doc"."kafka_demo" (
"_kafka__partition" BIGINT,
"_kafka__topic" TEXT,
"_kafka__offset" BIGINT,
"_kafka__ts__type" BIGINT,
"_kafka__ts__value" TIMESTAMP WITH TIME ZONE,
"_kafka__data" TEXT,
"_kafka__msg_id" TEXT,
"__dlt_load_id" TEXT NOT NULL,
"__dlt_id" TEXT NOT NULL
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like a breaking change for people that are using ingestr with kafka. Could this be implemented as an adapter that modifies the data that is only registered for CrateDB?
Hi Burak.
That's true. Let's do it more conservatively to not introduce breaking changes without previously emitting warnings. So, the primary question is: Do you agree with the change in general, in order to align the fields on matters of "naming things"? If so, let's publish this change progressively, like any good software is doing it? I will be happy to bring in corresponding changes to this patch. If you don't like it at all, let's just close the PR.
Yes. We often use monkeypatching for such matters, only changing code paths in certain situations, optimally by using a context manager so the change doesn't pollute the runtime context permanently, but only within the scope of a specific invocation. In this case, because CrateDB currently uses a custom dlt destination adapter anyway per dlt-cratedb, we already added a corresponding adjustment per crate/dlt-cratedb@7c14632. In this spirit, we absolutely do not have an immediate need for this change at all, so the case here is effectively just meant as a long-term improvement, which leads to my previous question:
With kind regards, NB: I've also toggled the PR back into draft mode to give it more time to grow the patch into a better shape. |
We would use a dedicated URL parameter on the Kafka source URL to tune the output, when applicable. Silly idea? |
We submitted a patch that can satisfy new requirements (together with other features) while retaining backwards compatibility. |
About
Unfortunately, CrateDB does not permit column names using leading underscores, like
_kafka_msg_id
. However, if two underscores are included, like_kafka__msg_id
, it works.The patch improves storage alignment with the other already existing column
_kafka__data
, which is already using two underscores in its name.Thoughts
We certainly know the obstacle is on CrateDB, while it is not a problem for any other database destination covered by ingestr.
However, we didn't see many dependencies on this column name, so we are humbly asking if you could do us a favor to rename that column, so we will not need to find a different workaround in form of a runtime patch. 🙏
References