Skip to content

DAOS-17446 common: Chunk dRPC messages larger than 1MB #16479

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: kjacque/hello-drpc
Choose a base branch
from

Conversation

kjacque
Copy link
Contributor

@kjacque kjacque commented Jun 5, 2025

In certain contexts it is possible to exceed the 1MB buffer size
limitation for dRPC messages. This is rare, but it renders the
payload impossible to read. This patch refactors dRPC to send
and receive messages in <1MB chunks by adding a header.

With this change, the chunk header is always expected, so client
libdaos and daos_agent must be at equivalent versions. This should
already be the case with the daos-client RPM package.

  • Implement chunking as the default behavior in both control and
    data plane dRPC implementations.
  • Add low-level integration testing of the dRPC call/response
    infrastructure for both C and Go implementations, to test the
    send/recv functionality in each implementation separately from
    higher-level DAOS tests.

Features: control

Signed-off-by: Kris Jacque kris.jacque@hpe.com

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

kjacque added 2 commits June 5, 2025 17:48
In certain contexts it is possible to exceed the 1MB buffer size
limitation for dRPC messages. This is rare, but it renders the
payload impossible to read. This patch refactors dRPC to send
and receive messages in <1MB chunks by adding a header.

With this change, the chunk header is always expected, so client
libdaos and daos_agent must be at equivalent versions. This should
already be the case with the daos-client RPM package.

- Implement chunking as the default behavior in both control and
  data plane dRPC implementations.
- Add low-level integration testing of the dRPC call/response
  infrastructure for both C and Go implementations, to test the
  send/recv functionality in each implementation separately from
  higher-level DAOS tests.

Features: control

Signed-off-by: Kris Jacque <kris.jacque@hpe.com>
Signed-off-by: Kris Jacque <kris.jacque@hpe.com>
@kjacque kjacque self-assigned this Jun 5, 2025
@kjacque kjacque requested review from mjmac and tanabarr June 5, 2025 18:05
@kjacque
Copy link
Contributor Author

kjacque commented Jun 5, 2025

Currently based on #16350. When that lands I will have to rebase, so just seeking preliminary reviews at this point.

@daosbuild3
Copy link
Collaborator

@daosbuild3
Copy link
Collaborator

Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16479/1/testReport/

@mjmac
Copy link
Contributor

mjmac commented Jun 5, 2025

Took a look at the new code... Looks like a nice update. One thing that occurred to me as I was looking at the Go side of it is that it might be better to make a libdrpc and write Go bindings for it. That way we don't have two separate implementations in different languages that have to interoperate.

I know that's quite a shift in direction, and you've already written the guts of the Go side. Would kinda suck to throw that out, but might still be useful as a framework for bindings. If we're going to do that, though, now would be a good time to do it, IMO.

@daosbuild3
Copy link
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants