Skip to content

Feat/timeout for q & pull #484

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

baibhavbista
Copy link

@baibhavbista baibhavbista commented Mar 6, 2025

Adds a timeout to q and pull

Mostly inspired from the timeout approach in datalevin: https://github.com/juji-io/datalevin/blob/master/src/datalevin/timeout.clj (& original PR which adds it)

Is this approach correct, @tonsky ? (seems to work, based on my tests)

Copy link
Author

@baibhavbista baibhavbista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some clarifying comments

@@ -288,6 +288,10 @@ Some of the features are omitted intentionally. Different apps have different ne

### Testing

We have scripts in the `script` folder like `./script/test_clj.sh`
Copy link
Author

@baibhavbista baibhavbista Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added these to the docs because I couldn't get the below steps (the ones with kaocha) working

@@ -888,7 +893,10 @@
(recur (-collect-tuples acc rel len copy-map) (next rels) symbols))))

(defn collect [context symbols]
(into #{} (map vec) (-collect context symbols)))
(into #{}
(map #(do (timeout/assert-time-left)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change additional to datalevin timeout. Realized it was required here because this is the place where laziness ends

resolved
tuple))
;; realize lazy seq because this is the last step anyways, and because if we don't realize right now then binding for timeout/*deadline* does not work
doall)))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another change which is not present in datalevin timeout PR. This is needed otherwise pull within find spec of q do not obey timeout

true
(-post-process find (:qreturn-map parsed-q)))))
(let [parsed-q (lru/-get *query-cache* q #(dp/parse-query q))]
(binding [timeout/*deadline* (timeout/to-deadline (:qtimeout parsed-q))]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

main change here is just the addition of the binding in the middle of the let

@@ -0,0 +1,24 @@
(ns ^:no-doc datascript.timeout)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,51 @@
(ns datascript.test.query-deadline
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonsky
Copy link
Owner

tonsky commented Mar 7, 2025

Interesting! Can you talk a little when is this useful (I am not saying that it isn’t, just want to understand use-cases)

Implementation-wise:

  • timeout ns should be probably merged into util
  • I don’t understand why parse-timeout is needed. It looks like it supports some legacy formats but we don’t have any legacy? We can make any API we want
  • relation! shouldn’t have exclamation mark in its name (there are no side effects)
  • I don’t think (timeout/assert-time-left) should be part of relation!. When you do this, it just scatters checks in more or less random places. I’d prefer more deliberate placing. E.g. in -resolve-clause, -collect, aggregate?
  • (binding [timeout/*deadline* (timeout/to-deadline should probably be one macro (with-deadline?) that does binding and wraps the body

@baibhavbista
Copy link
Author

Sorry just seeing your reply now @tonsky !

The main usecase is when allowing end users to be able to run their own queries. Queries not properly written can cause combinatorial explosion of candidates, and lead either to OOM or to the application freezing for a non-allowably-long period of time (common use case of datascript running in the main thread in browser). In these cases, a timeout allows one to stop the query partway through if the time taken exceeds a value we deem reasonable.

Re: your implementation notes, I will go through them properly next week. BTW, as I mentioned in the PR description, most of your "why"s could be answered by "it was done that way in datalevin, and the relevant parts of the two codebases looked similar enough that I just recreated the same changes here in datascript". I will more properly think about your notes and reply/fix soon though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants