-
Notifications
You must be signed in to change notification settings - Fork 93
Tests: Adjust software tests for CrateDB #284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
24d5131
to
8466d1c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just adding a few comments at relevant spots to highlight the worst code smells.
# CrateDB needs an explicit flush to make data available for reads immediately. | ||
if dest_engine.dialect.name == "crate": | ||
dest_engine.execute(f"REFRESH TABLE {schema_rand_prefix}.output") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: Is it really needed? Doesn't the dlt adapter already flush the relevant table after concluding the load operation, or is it missing in ingestr?
assert res[0] == (1, "val1", as_datetime("2022-01-01")) | ||
assert res[1] == (2, "val2", as_datetime("2022-02-01")) | ||
|
||
# Compensate for CrateDB types and insert order. | ||
if dest_connection_url.startswith("cratedb://"): | ||
assert (1, "val1", 1640995200000) in res | ||
assert (2, "val2", 1643673600000) in res | ||
else: | ||
assert res[0] == (1, "val1", as_datetime("2022-01-01")) | ||
assert res[1] == (2, "val2", as_datetime("2022-02-01")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well. C'est la vie. Let me add a few thoughts about it.
Maybe improve this spot by patching as_datetime()
at runtime in the context of CrateDB instead, in order to reduce the cluttering on the actual data validation?
On the other hand, aren't there corresponding converter hooks in crate-python already, which could make it work fluently?
Is it even a concern of improving the SQLAlchemy dialect with relevant features?
# CrateDB: Compensate for "Type `date` does not support storage". | ||
updated_at_type = "DATE" | ||
if dest_connection_url.startswith("cratedb://"): | ||
updated_at_type = "TIMESTAMP" | ||
|
||
source_engine = sqlalchemy.create_engine(source_connection_url) | ||
with source_engine.begin() as conn: | ||
conn.execute(f"DROP SCHEMA IF EXISTS {schema_rand_prefix}") | ||
conn.execute(f"CREATE SCHEMA {schema_rand_prefix}") | ||
conn.execute( | ||
f"CREATE TABLE {schema_rand_prefix}.input (id INTEGER, val VARCHAR(20), updated_at DATE)" | ||
f"CREATE TABLE {schema_rand_prefix}.input (id INTEGER, val VARCHAR(20), updated_at {updated_at_type})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e1c80a6
to
4cd8559
Compare
45f2c7c
to
d20b837
Compare
Makefile
Outdated
set -a; source test.env; set +a; TESTCONTAINERS_RYUK_DISABLED=true pytest -n auto -x -rP -vv --tb=short --durations=10 --cov=ingestr --no-cov-on-fail | ||
set -a; source test.env; set +a; TESTCONTAINERS_RYUK_DISABLED=true pytest -n auto -x -rP -vv --tb=short --durations=10 --cov=ingestr --no-cov-on-fail -k "not cratedb" | ||
set -a; source test.env; set +a; TESTCONTAINERS_RYUK_DISABLED=true pytest -rP -vv --tb=short --durations=10 --cov=ingestr --no-cov-on-fail -k "cratedb" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discovered that the CrateDB Testcontainers implementation is not ready to be used together with xdist
yet, so we are invoking integration tests against CrateDB separately now.
db792c9
to
0fc85de
Compare
... so invoke software tests on GHA separately.
About
Retroactively add software tests for CrateDB by adjusting the existing test suite.
It is very unlikely the patch will get merged in its current form, so it will need to be maintained separately to provide quality assurance for the CrateDB source and destination adapters.
Thoughts
The patch clearly demonstrates where CrateDB is a different animal compared to other DBMS. We don't know yet how to compress or re-shape those anomaly-balancing boilerplate adjustments, in order to reduce the noise of the patch, but we hope to be able to resolve one or another obstacle for the better as we go.
Backlog
main_test.py
, expand intosources_test.py
anddestinations_test.py
as well.cratedb-toolkit
.References