Bug Report
Description
For certain tables in which the `make()` function can take a long time, it is often desirable to ensure that only a small number of keys are handled before moving on to other tables.
DataJoint provides two ways to implement this, which can behave unexpectedly differently for reasons that can only be found by examining the DataJoint source code in detail:

- `table.populate(limit=x)`
- `table.populate(max_calls=x)`
I believe that the way the `limit` keyword currently operates aligns neither with its implied meaning (based on what the word "limit" means in the English language) nor with how it is described in the docstring (`:param limit: if not None, check at most this many keys`).
Within `AutoPopulate.populate` (https://github.com/datajoint/datajoint-python/blob/master/datajoint/autopopulate.py#L153), the handling goes as follows (sketched in code after this list):

- Determine the set of incomplete keys for `table` (L212)
- Fetch up to `limit` keys from this set
- If `reserve_jobs`, remove any keys from that group that already exist in the `jobs` table (L218)
- If `max_calls`, truncate the list of remaining keys to `max_calls`
- For each remaining key, call `table.make(key)` (L248)
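A simplified sketch of that ordering (illustrative names only, not the actual DataJoint source):

```python
# Illustrative sketch of the ordering inside populate(); `todo_keys`,
# `reserved_keys`, and `make` are stand-ins, not DataJoint internals.
def populate_sketch(todo_keys, reserved_keys, make,
                    limit=None, max_calls=None, reserve_jobs=False):
    keys = todo_keys[:limit]  # `limit` is applied FIRST, before the jobs-table check
    if reserve_jobs:
        keys = [k for k in keys if k not in reserved_keys]  # jobs-table filter
    keys = keys[:max_calls]   # `max_calls` is applied AFTER the jobs-table check
    for key in keys:
        make(key)

# With limit=1 and key 0 already reserved by another worker, nothing runs,
# even though keys 1..9 are available:
populate_sketch(list(range(10)), {0}, print, limit=1, reserve_jobs=True)
```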
When `limit` is small (equal to or smaller than the number of workers that may be executing `table.populate(reserve_jobs=True, limit=x)` simultaneously), this can result in a situation in which there are many potential keys to be processed, yet each additional worker is handed a set of keys that are all already in the `schema.jobs` table, so its set is reduced to empty before it ever reaches the `table.make(key)` part of the `.populate()` call.
The "correct" answer to this case is to use max_calls
instead, as the limit will then apply after checking whether keys are already in progress (or error, or any other form of unavailability due to the jobs table).
However, today marks the 4th time I have spent >1h chasing down weird behaviour where multiple workers were idling and ignoring available work because I misunderstood or misremembered the meaning of this keyword.
I have several suggestions for fixing this behaviour, depending on the degree to which it is intentional/desired:

1. Assuming it is unintentional, make `limit` and `max_calls` equivalent by moving the `limit` handling within `AutoPopulate.populate` to after the jobs-table check instead of before it (see the sketch after this list).
2. Change the logic for determining the set of available keys to take keys already in flight into account (i.e. keys listed in the jobs table with whatever status). This would combine well with issue #873 (Provide a way to list specific jobs), although in practice it has the same outcome as (1).
3. If it is intentional, change the keyword to something that better reflects its actual role. To me, the literal meaning of the word "limit" is much more closely aligned with the logical behaviour of the keyword `max_calls`, and vice versa. It is not intuitive to me that `limit` applies before ignoring keys that are already in progress - I anticipate (every single time!) that keys already errored/reserved are excluded from the set of keys available to be processed, at least when the `reserve_jobs` flag is set.
4. Change the docstring to better reflect the actual behaviour. This involves the smallest compatibility or behaviour change for everyone who already uses this logic, but it still falls victim to the point I make in (3). I would suggest something like `if not None, attempt to process this many keys, which may include keys already reserved`.
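For suggestion (1), a minimal sketch of the reordering, reusing the illustrative names from the sketch above (again, not a patch against the actual source):

```python
# Sketch of suggestion (1): apply `limit` after the jobs-table filter,
# so it caps the keys actually available for processing.
def populate_sketch_reordered(todo_keys, reserved_keys, make,
                              limit=None, max_calls=None, reserve_jobs=False):
    keys = todo_keys
    if reserve_jobs:
        keys = [k for k in keys if k not in reserved_keys]  # jobs-table filter first
    keys = keys[:limit]       # moved: `limit` now behaves like `max_calls`
    keys = keys[:max_calls]
    for key in keys:
        make(key)

# The same scenario as before now processes key 1 instead of idling:
populate_sketch_reordered(list(range(10)), {0}, print, limit=1, reserve_jobs=True)
```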
Reproducibility
Environment:
- Any OS
- Any Python version
- Any MySQL version
- DataJoint <= 0.14.3 (not tested with later versions, but the problematic logic still exists in `master` as of the posting date)
Create two tables: one listing keys, the other representing some long-running job:
```python
# my_test_schema.py
import time

import datajoint as dj

schema = dj.schema("test_schema")


@schema
class TableOne(dj.Manual):
    definition = """
    id: int
    ---
    """
    contents = [(i,) for i in range(10)]  # rows must be tuples, not bare ints


@schema
class TableTwo(dj.Computed):
    definition = """
    -> TableOne
    ---
    some_value: int
    """

    def make(self, key):
        print(f"working on {key}\n")
        time.sleep(30)  # simulate a long-running job
        self.insert1({**key, "some_value": key["id"]})
```
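Running the reproduction assumes a reachable MySQL server; connection settings along these lines are expected (host/user/password values here are placeholders):

```python
import datajoint as dj

# placeholder credentials - adjust to your environment
dj.config["database.host"] = "localhost"
dj.config["database.user"] = "root"
dj.config["database.password"] = "simple"
```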
Create a script to populate `TableTwo`:
```python
# my_test_script.py
import os
import time

from my_test_schema import *


def test():
    pid = os.getpid()
    while True:
        in_flight_keys = (schema.jobs & {"table_name": TableTwo.table_name}).fetch("key")
        available_keys = TableTwo.key_source - TableTwo - in_flight_keys
        print(f"{pid} : available keys: {len(available_keys)}")
        TableTwo.populate(reserve_jobs=True, limit=1)
        time.sleep(1)  # reduce screen spam


if __name__ == "__main__":
    test()
```
Run at least two copies of `my_test_script.py` simultaneously. The output will look something like this:
```
1 : available keys: 10
working on {"id": 0}
2 : available keys: 9
2 : available keys: 9
2 : available keys: 9
2 : available keys: 9
...
```
i.e. the second script will rapidly spin through the `while` loop, ignoring the 9 other available jobs while the first actually does something (in this test example, sleeping). There is no way to predict which process will pick up the second job once the first is complete, but the same pattern will play out - the other script idling.
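As noted above, switching the populate call in the script to `max_calls` avoids the idling, because that cap is applied after the jobs-table filter:

```python
# in my_test_script.py, cap after the jobs check instead of before it
TableTwo.populate(reserve_jobs=True, max_calls=1)
```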
Expected Behavior
I expect the `limit` keyword to behave the way that the `max_calls` keyword actually does.

Alternatively, I expect either a choice of keyword that better reflects the actual logical behaviour, or a more descriptive docstring that better explains it.
It should not be necessary to examine the source code of the `autopopulate.py` component of the DataJoint library to understand the difference between the `limit` and `max_calls` keywords and why one should choose one over the other.