Skip to content

Fixed #21961 -- added support for database level on delete cascade #8661

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 28 commits into from

Conversation

NickStefan
Copy link

Ticket https://code.djangoproject.com/ticket/21961

General Approach:

  • I've added a models.DB_CASCADE option for on_delete. Its essentially, DO_NOTHING, but the ForeignKey field sets a flag that affects the sql strings in /backends/base/schemas.

Caveats for models.DB_CASCADE :

  • does not support on_delete signals
  • will not cascade delete multiple inherited tables as expected
  • will not trigger CASCADE on another model. E.g. Model A points to model B, via -
    DB_CASCADE. Model B points to model C, via CASCADE. A will cascade delete B, B will django delete C, but deleting A will not delete C!
  • ManyToMany with db level cascade delete will require manually defining a through table and setting the ForeignKeys as DB_CASCADE

Existing Packages:

  • I originally proof of concept'd this as a postgres only external package https://github.com/NickStefan/django-db-cascade. I think we can actually support Postgres, MySQL, and even SQLite. With that scope, I believe its better as part of django.

Other relevant tickets if we wish to support SQLite ON DELETE CASCADE

@charettes
Copy link
Member

Haven't gone through a full review but I just wanted to mention you might want to avoid special casing DB_CASCADE (e.g. field.on_delete_db_cascade) so DB_SET_NULL and such can be easily added as well. I think you could do by defining a deletion class that makes __call__() a noop.

e.g.

class DatabaseOnDelete:
    def __init__(self, operation):
        self.operation = operation

    def __call__(self, collector, field, sub_objs, using):
        pass

    def as_sql(self, connection):
        return connection.ops.fk_on_delete_sql(self.operation)

DB_CASCADE = DatabaseOnDelete('CASCADE')
DB_SET_NULL = DatabaseOnDelete('SET NULL')
...

@NickStefan
Copy link
Author

NickStefan commented Jun 20, 2017

Thanks @charettes . I definitely see where thats heading and how it would be less of a special case. Ill take a stab at that approach.

@NickStefan
Copy link
Author

NickStefan commented Jun 22, 2017

RE: SQLite:

  • models.DB_CASCADE would only work with a new OPTIONS. Something like: settings.DATABASES.["default"]["OPTIONS"]["PRAGMA_foreign_keys"] = True (which then leads to the database wrapper running the correct SQL: https://sqlite.org/foreignkeys.html#fk_enable).
  • If I add the pragma flag to /tests/sqlite.py, the test suite fails quite a few tests (50+).
  • Therefore, SQLite PRAGMA foreign keys, seems to be a bigger and separate scope. It does have its own ticket here https://code.djangoproject.com/ticket/14204

For now, going to just focus on mysql and postgres support for models.DB_CASCADE, and revert any sqlite changes.

@NickStefan NickStefan changed the title WIP: Fixed #21961 -- add support for database level cascade delete Fixed #21961 -- add support for database level cascade delete Jun 22, 2017
@NickStefan
Copy link
Author

Besides squashing the commits down, this is ready for a review of some sort. I can do that after I do any suggested changes. Pinging @charettes , as chimed in earlier. Thanks!

@NickStefan
Copy link
Author

with #8678 , this PR could also be expanded to cover not just Postgres and mysql, but also sqlite

@timgraham
Copy link
Member

Yes, I'd like to merge that PR first, then you can rebase your branch and add SQLite support back. We'll also need Oracle support. See OracleTestSetup if you're willing to work on that, otherwise we can ask for some help there.

@NickStefan
Copy link
Author

@timgraham sounds good on adding SQLite and Oracle 👍

@NickStefan NickStefan changed the title Fixed #21961 -- add support for database level cascade delete WIP Fixed #21961 -- add support for database level cascade delete Jul 13, 2017
@NickStefan NickStefan changed the title WIP Fixed #21961 -- add support for database level cascade delete Fixed #21961 -- added support for database level on delete cascade Jul 23, 2017
@NickStefan
Copy link
Author

@timgraham I was able to add SQLite and Oracle support (in addition to PostgreSQL and MySQL). I also squashed down some of the commits so that this PR is easier to review.

@django django deleted a comment from NickStefan Jul 24, 2017
@django django deleted a comment from felixxm Jul 24, 2017
@timgraham
Copy link
Member

As you've taken different design decisions than what's suggested in the ticket (for example, no separate on_delete_db option as Carl suggested), it would be appropriate to write to the django-developers mailing list to explain the different design options and the choices you made so we can make sure there's consensus about the API.

@NickStefan
Copy link
Author

NickStefan commented Aug 31, 2017

@timgraham With your suggestion, I did solicit feedback from the django developer group about a month ago: https://groups.google.com/forum/#!topic/django-developers/GZZs3Nxk7sE . I explained the implementation differences between the original ticket and this PR. I also received some helpful feedback from one group member regarding cascade delete and concrete model inheritance.

The only thing left per the ticket / developer group was to implement check warnings for cases DB_CASCADE won't be supported. I've updated the PR with check warnings for:

  • GenericForeignKey + DB_CASCADE
  • concrete model inheritance + DB_CASCADE
  • model A DB_CASCADE --> model B CASCADE (wont get triggered) --> model C

Any further direction, regarding getting this merged, or further improved and merged, would be greatly appreciated. Thank you!

@yywing
Copy link

yywing commented Apr 13, 2018

How is it going? Would it be mergad?

@Pepedou
Copy link

Pepedou commented Nov 6, 2018

Any updates on this?

@NickStefan
Copy link
Author

NickStefan commented Nov 14, 2018

@Pepedou @yywing My understanding is that at one point I had the PR up to date, with new tests and check warnings. The initial review from maintainers was that they did not like this implementation as it differed from the one suggested in the ticket originally filed 4 years ago.

I pointed out that things have changed in those 4 years (mainly that because 2018 DBs can do this feature natively we can do this as on_delete=models.DB_CASCADE rather than add a new kwarg on_delete_db=models.DB_CASCADE. I just dont think a separate kwarg is necessary in 2018, but people are free to disagree. No hard feelings).

I never got a response. And now django 2.0 has come and gone and there's merge conflicts and the PR would need to be updated with Master...

@timgraham
Copy link
Member

If you're able to target it for Django 2.2, add a link from the ticket and check "Has patch". Since that wasn't done, the ticket didn't appear in the review queue.

@Pepedou
Copy link

Pepedou commented Nov 17, 2018

@NickStefan Thanks for your reply. That's too bad as what you describe seems like a great idea.

@NickStefan
Copy link
Author

Thanks all. Im going to try and resurrect this soon. Possibly for 2.2. I dont have an exact timeline as work has gotten a lot busier, but it's back on my radar.

@IvanAnishchuk
Copy link

IvanAnishchuk commented Nov 20, 2018

defining a deletion class
no separate on_delete_db option
we can do this as on_delete=models.DB_CASCADE rather than add a new kwarg on_delete_db=models.DB_CASCADE. I just dont think a separate kwarg is necessary in 2018

How about making those deletion operations ORable? So it would look like on_delete=CASCADE|DB_SET_NULL (but can just set on_delete=DB_CASCADE to mean on_delete=DO_NOTHING|DB_CASCADE) combining both options. It's kinda like how Q-objects work, seems to be appropriate here as well (and could be implemented even without any special classes and functions, with simple integer constants) and you can easily raise ImproperlyConfigured (or whatever necessary) when some incompatible combination is used. I do agree that adding a separate argument is unnecessary but I think it's better to have options where possible if it doesn't add much complications.

@jdufresne
Copy link
Member

@NickStefan Are you still planning to resume this PR? If you no longer have the time, I might be interested in giving it a go. Thanks for all the hard work so far!

@Pepedou
Copy link

Pepedou commented Oct 19, 2019

Please do, this'd be an awesome feature.

@NickStefan
Copy link
Author

To all interested, I've brought the PR back up to date with master. All tests and checks back to passing. Thanks all for your patience 😄 !

has_patch has been checked in the ticket (https://code.djangoproject.com/ticket/21961), so hopefully this will be in the code review queue for 3.0.

@ngnpope
Copy link
Member

ngnpope commented Oct 22, 2019

Welcome back @NickStefan! Thanks for picking this up again - it'll be nice addition.

Just a heads-up, there will need to be some changes as it looks as though PR #7364 is getting close to being landed which introduces RESTRICT to resolve ticket #27272.

I see that you've added DB_CASCADE which is the server-side equivalent to the application-side CASCADE. You've changed DO_NOTHING which previously just disabled any special handling in the application to actually do NO ACTION in the database. I'm not sure whether this will cause migrations to be generated in existing projects when upgrading Django. Perhaps we should leave DO_NOTHING as "let the database decide" and add DB_NO_ACTION explicitly?

So PostgreSQL supports NO ACTION, CASCADE, RESTRICT, SET NULL, and SET DEFAULT. (I haven't checked the other backends.) Django supports the following application-side actions: DO_NOTHING, CASCADE, PROTECT, SET, SET_NULL, and SET_DEFAULT.

When #7364 lands we should also add DB_RESTRICT to mirror RESTRICT. Other than that, we can probably do DB_SET_NULL to mirror SET_NULL. We currently can't add DB_SET_DEFAULT as that complicated by the way that Django manages column defaults application-side. However, there is some work ongoing to attempt to sort out default values managed in the database, at which point adding support for this may be useful, or indeed necessary. I don't think PROTECT or SET have any equivalent, at least not in PostgreSQL.

...so hopefully this will be in the code review queue for 3.0.

This should be targeted at 3.1 as 3.0 is in feature freeze.

@simonpercivall
Copy link

Sorry for the PR spam, but this would be amazing to get merged! What's the chance of that happening?

@thinkwelltwd
Copy link

Sorry for the PR spam, but this would be amazing to get merged!

I make the same apology -- and the same wish!

I have a project that desperately needs native support for cascades.

@NickStefan NickStefan requested a review from felixxm April 22, 2020 23:20
@Pepedou
Copy link

Pepedou commented Apr 23, 2020

It'd be great, I've been following this CR for 3 years now I think XD

@ngnpope
Copy link
Member

ngnpope commented Apr 23, 2020

@simonpercivall @thinkwelltwd @Pepedou

Please don't +1 comment, as it doesn't help progress anything - use the reactions buttons to express support at the top.

If you would like to see this land for Django 3.1, you can help make this possible by reviewing this changes in this pull request. In addition you can test it in a development environment on your projects that you wish to use this in to verify that the changes work as expected and don't regress anything else.

This would be incredibly helpful. You have until the feature freeze on the 11th May according to the roadmap to help make this happen. Don't leave it too late as the fellows will need time for their final review prior to merging.

Copy link
Member

@ngnpope ngnpope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @NickStefan

Thanks for your efforts over the years on this. I've tried to pick through for you so that we can try to get this into better shape.

Please also see #8661 (comment). One of my concerns here is that I see plenty about adding these ON DELETE clauses when creating tables, but not really anything related to altering tables. What happens in that case? Are there any tests?

What about DO_NOTHING - which is currently changed to be based of DatabaseOnDelete in this PR - does there need to be a DB_NO_ACTION? Should all of the application-level values be the equivalent of setting DB_NO_ACTION to remove the previous value? Will that generate unnecessary migrations on upgrade of Django or lead to inconsistent database states for existing projects? Do we need to reconsider having db_cascade separate?

How do we ensure that going from something like CASCADE to DB_CASCADE on an existing column works? And vice versa, because the database clause ought to be removed.

I'm sorry that is a lot of questions, but I think they need answering.

@thinkwelltwd
Copy link

In addition you can test it in a development environment on your projects that you wish to use this in to verify that the changes work as expected and don't regress anything else.

I ran my test database on this PR and migrated existing CASCADE relations. I examined the database and the CASCADE param was added to the Foreign-key constraint as expected.

I also added a receiver on all post_delete signals and only one signal was emitted when deleting a record with 17 other relations. The related rows were deleted as well.

What other feedback / tests would be useful?

@thinkwelltwd
Copy link

thinkwelltwd commented May 2, 2020

Was reviewing some questions from @pope1ni; apropo this:

How do we ensure that going from something like CASCADE to DB_CASCADE on an existing column works? And vice versa, because the database clause ought to be removed.

I suppose you meant whether there should be a test for that or not? I decided to try it and see if it works as expected.

Changed models.DB_CASCADE back to models.CASCADE and makemigrations noted the changes, generated the relevant migration file and applied it. The Foreign-key constraint no longer contains the CASCADE param.

@NickStefan
Copy link
Author

I'm currently working through the code review requested changes... will update soon hopefully...

@NickStefan
Copy link
Author

Hi @NickStefan

Thanks for your efforts over the years on this. I've tried to pick through for you so that we can try to get this into better shape.

Please also see #8661 (comment). One of my concerns here is that I see plenty about adding these ON DELETE clauses when creating tables, but not really anything related to altering tables. What happens in that case? Are there any tests?

What about DO_NOTHING - which is currently changed to be based of DatabaseOnDelete in this PR - does there need to be a DB_NO_ACTION? Should all of the application-level values be the equivalent of setting DB_NO_ACTION to remove the previous value? Will that generate unnecessary migrations on upgrade of Django or lead to inconsistent database states for existing projects? Do we need to reconsider having db_cascade separate?

How do we ensure that going from something like CASCADE to DB_CASCADE on an existing column works? And vice versa, because the database clause ought to be removed.

I'm sorry that is a lot of questions, but I think they need answering.

I've updated the PR according to the latest code reviews (thank you @pope1ni and @jdufresne ).

@pope1ni I specifically agreed with your suggestion to leave DO_NOTHING as original as possible. That should avoid this concern of creating unnecessary migrations:

Will that generate unnecessary migrations on upgrade of Django

However, I'm not sure that a DB_DO_NOTHING or DB_NO_ACTION is required to migrate back from DB_CASCADE. I believe one would simply set that model to on_delete=CASCADE and the migration would change the table SQL to no longer include the ON DELETE CASCADE language. From how I read the migrations code, when model definitions change, the migration serializes a new SQL description of the column/table. This could be wrong, but is my recollection from when I dug deeper into this (years ago now!).

We will 100% need more community testing of how this PR might play with existing projects.

Do we need to reconsider having db_cascade separate?

This was originally suggested back in 2014 when only some DBs supported ON DELETE semantics. It was suggested to have both an on_delete kwarg and a db_on_delete kwarg. It is my belief that this is an unnecessary distinction since all DBs now support ON DELETE. Any on_delete=DB_* can be considered equivalent to having def application_do_nothing + some extra SQL in the table definition. Any use of something like on_delete=CASCADE can be considered equivalent to an empty string being used in the place of ON DELETE XYZ. The cases can be treated as binary.

The cases don't have to be treated as binary though, and having two kwargs would create many different combinations of behavior. I've discussed this before in the PR, the google mailing list, and the ticket. I completely understand if others disagree. Some people like APIs that work like a Coke Machine and others prefer something that gives more possibilities and low level control.

Even with the binary behavior of either using application on_delete behavior or sql level on_delete behavior, there are still plenty of edge cases to deal with. I've tried to enumerate and test for these in the checks framework (db/models/fields/related.py):

  • Can't support GenericForeignKey relations
  • Can't support multi-table inheritance
  • Can't support field_A <- CASCADE <- field_B <- DB_CASCADE <- field_C

And then there's still more, like the broken test remaining where we really have to decide what to do when someone tries to point DB_* at an abstract or missing model. https://github.com/django/django/pull/8661/files#diff-3010fc5a498b7171c342520f34507968R898 .

invalid_models_tests.test_relative_fields.RelativeFieldTests.test_foreign_key_to_abstract_model
invalid_models_tests.test_relative_fields.RelativeFieldTests.test_foreign_key_to_missing_model

I'm not sure whether to raise a checks error there or not.

@NickStefan
Copy link
Author

After some further consideration, I do not have bandwidth to continue work on this ticket along with my full time work duties. I'm happy to respond to further questions about the PR, or catch up someone new.

I apologize to anyone who has been watching this PR and waiting. I should have let this go to someone with more free time back when that was a good option before the 3.1 target date. If someone new picks up the ticket and wants to continue, that's great; if they want to do everything different, that's also great, no hard feelings.

Again, I'm deeply sorry to anyone who's been counting on getting this feature soon :/.

@thinkwelltwd
Copy link

I apologize to anyone who has been watching this PR and waiting.

I'm one of those, but no apology necessary! Thank you for what you've done and what's undone has to wait.

As far as what remains undone. I'm not conversant in Django core enough to take this PR on to completion but I'm considering using it as-is for myself.

We will 100% need more community testing of how this PR might play with existing projects.

How can does the project go about getting this testing? What do the Django devs want from the community? Should I deploy this PR in my production code? Should I use @NickStefan's original branch at django-db-casecade?

Let me know if there's any assistance I can render at this point.

@felixxm felixxm closed this Oct 2, 2020
@pmdevita
Copy link
Contributor

pmdevita commented Oct 2, 2020

Sorry to ask but why was this closed? Is it because development halted or because the feature is being rejected or something else?

@felixxm
Copy link
Member

felixxm commented Oct 2, 2020

Nick doesn't have bandwidth to finish it, see comment. Feel-free to continue his work and prepare a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.