Use evaluation loop #1543


Draft: mberr wants to merge 20 commits into master

Conversation

mberr (Member) commented May 11, 2025

Update the code to use EvaluationLoop instead of Evaluator.evaluate.

EvaluationLoop uses a PyTorch dataset and dataloader to improve the overlap of communication and computation. In the filtered setting, EvaluationLoop precomputes the filter positions once and re-uses them in subsequent evaluations.

Also implement support for slicing.
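
In rough strokes, the intended usage looks like the following sketch (the import path and the exact constructor/`evaluate()` signatures are assumptions; the keyword arguments mirror the diff excerpt below):

```python
# sketch only: the LCWAEvaluationLoop name and import path are assumptions
# based on this PR; the keyword arguments are the ones visible in the diff
from pykeen.evaluation import LCWAEvaluationLoop

loop = LCWAEvaluationLoop(
    model=model_instance,                 # a trained PyKEEN model
    triples_factory=evaluation_factory,   # factory holding the evaluation triples
    evaluator=evaluator_instance,         # e.g., a RankBasedEvaluator
)
# the loop wraps the triples in a PyTorch Dataset/DataLoader, so batch
# preparation (incl. cached filter positions) can overlap with model scoring
results = loop.evaluate(batch_size=None)  # None: determine a maximal batch size automatically
```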

    model=model_instance,
    triples_factory=evaluation_factory,
    evaluator=evaluator_instance,
    # TODO: mode support?
cthoyt (Member) commented on the snippet above:

which mode is already considered the "default"?

maybe just make this an argument, and raise a NotImplementedError if it's set to anything else (for now)

mberr (Member, Author):

None is the default (which corresponds to the transductive setting).

cthoyt (Member):

on second look, the mode is passed through to the evaluator instance, so the way to support this is during construction of that instance

"""
self.model = model
self.evaluator = evaluator
self.dataset = dataset
if mode is not None:
raise NotImplementedError(f"non-transductive evaluation mode is not yet implemented: {mode}")
mberr (Member, Author):

The evaluation loop supports it. The only issue is that we don't expose the mode through the pipeline. However, this may be acceptable since adding support for inductive models to the pipeline could make it more difficult to determine which parameter combinations to provide.
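
For illustration, direct (non-pipeline) use could then look like this sketch (the mode literals follow PyKEEN's InductiveMode convention of "training"/"validation"/"testing"; all instance names are placeholders):

```python
# sketch: hand the mode to the loop directly instead of going through the
# pipeline; mode=None (the default) corresponds to the transductive setting
loop = LCWAEvaluationLoop(
    model=inductive_model,              # placeholder: an inductive model
    triples_factory=inference_factory,  # placeholder: inference triples
    evaluator=evaluator_instance,
    mode="testing",                     # assumed InductiveMode literal
)
results = loop.evaluate()
```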

cthoyt (Member) commented May 14, 2025:

thanks! I noticed I misunderstood this and updated it after the first commit.

There was definitely some discussion somewhere about how to document exposing the inductive mode via the pipeline (#1503 (comment), moved to its own issue in #1544).

I wouldn't block this PR on solving that problem.

@@ -169,12 +175,16 @@ def evaluate(
batch_size = determine_maximum_batch_size(
    batch_size=batch_size, device=self.model.device, maximum_batch_size=len(self.dataset)
)
# set upper limit for slice size
# TODO: if we knew the targets here, we could guess this better
slice_size = max(self.model.num_entities, self.model.num_relations)
cthoyt (Member) commented on the hunk above:

there are parts of the pipeline that actually try to pass slice_size into this function (I am doing an experiment with TypedDicts locally, which would make this PR much messier, but I found this)

so maybe slice_size should be an Optional[int], where this gets calculated if it's not passed directly (or does this blow up AMO?)
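
A sketch of what that could look like (signatures abridged; the fallback mirrors the hunk above, nothing here is final):

```python
from typing import Optional

def evaluate(self, batch_size: Optional[int] = None, slice_size: Optional[int] = None):
    """Evaluate, determining batch and slice sizes when not given."""
    batch_size = determine_maximum_batch_size(
        batch_size=batch_size, device=self.model.device, maximum_batch_size=len(self.dataset)
    )
    if slice_size is None:
        # upper limit as in the hunk above; with known targets, this could
        # be tightened to the relevant count only
        slice_size = max(self.model.num_entities, self.model.num_relations)
    ...
```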

mberr (Member, Author):

a7d9231

> (or does this blow up AMO?)

The function that is wrapped by AMO needs to be passed an int.
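
In other words, the Optional has to be resolved before the AMO-wrapped function is invoked; a sketch of that pattern (the inner method name is hypothetical):

```python
from typing import Optional

def evaluate(self, slice_size: Optional[int] = None):
    if slice_size is None:
        # resolve the Optional here: the AMO-wrapped function must always
        # receive a concrete int, which AMO can then shrink on OOM
        slice_size = max(self.model.num_entities, self.model.num_relations)
    return self._evaluate(slice_size=slice_size)  # hypothetical AMO-wrapped inner method
```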

cthoyt mentioned this pull request on May 16, 2025.