Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove SYS_RESOURCE capability from launcher pod #2584

Merged
merged 6 commits into from Aug 30, 2019

Conversation

booxter
Copy link
Contributor

@booxter booxter commented Aug 9, 2019

Instead, set the (unlimited) limit for libvirtd from handler pod that
is already privileged.

Remove SYS_RESOURCE capability from SR-IOV attached VMI launcher pods

@kubevirt-bot kubevirt-bot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/XXL labels Aug 9, 2019
@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 9, 2019
@booxter
Copy link
Contributor Author

booxter commented Aug 9, 2019

/hold

Need to consider calculating the exact limit size in handler (the formula is not trivial but we can do a reasonable estimate).

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 9, 2019
@slintes
Copy link
Contributor

slintes commented Aug 9, 2019

heads up, we just merged a PR which introduces kubevirt specific SCCs instead of using the privileged SCC. The capability can be removed there as well I guess:

scc.AllowedCapabilities = []corev1.Capability{"NET_ADMIN", "SYS_NICE", "SYS_RESOURCE"}

@booxter booxter force-pushed the ulimit branch 3 times, most recently from 795ca3b to 87dae49 Compare August 16, 2019 16:19
@booxter
Copy link
Contributor Author

booxter commented Aug 16, 2019

Unit test coverage is somewhat down because the code dealing with processes / ulimits is not unit tested. But it should be enough to cover it for regressions with existing functional tests: as long as VMIs still start correctly and have SR-IOV interfaces inside the guest, it should be enough. There is little value in validating that the capability is indeed not present anymore because it's trivial to double check it's not referenced anywhere in the code anymore.

@booxter
Copy link
Contributor Author

booxter commented Aug 16, 2019

/hold cancel

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 16, 2019
@booxter
Copy link
Contributor Author

booxter commented Aug 16, 2019

@phoracek @SchSeba this PR should be ready to review and merge.

pkg/virt-handler/isolation/isolation.go Show resolved Hide resolved

func (s *socketBasedIsolationDetector) AdjustResources(vm *v1.VirtualMachineInstance) error {
// bump memlock ulimit for libvirtd
res, err := s.Detect(vm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use a more specific name than result?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(won't block the PR on this one, if you answer no no no to my comments, this PR is good to go)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's of "IsolationResult" type so I thought it's a good enough name. I am ok with renaming. What would be a better name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is being used only in this block and i don't really know. since the other comments are resolved, i wont bother

pkg/virt-handler/isolation/isolation.go Show resolved Hide resolved
Copy link
Member

@phoracek phoracek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2019
@booxter
Copy link
Contributor Author

booxter commented Aug 16, 2019

/retest

1 similar comment
@SchSeba
Copy link
Contributor

SchSeba commented Aug 18, 2019

/retest

Copy link
Contributor

@SchSeba SchSeba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a small comment

@@ -1286,6 +1288,24 @@ func (m *MockGracefulShutdown) TriggerShutdown(vmi *v1.VirtualMachineInstance) {
Expect(err).NotTo(HaveOccurred())
}

type MockIsolationDetector struct{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be auto generated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I've found a way to make it work without manually writing a fake.

Copy link
Member

@vladikr vladikr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍

@phoracek
Copy link
Member

/retest

@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 19, 2019
@SchSeba
Copy link
Contributor

SchSeba commented Aug 20, 2019

/retest

@vladikr
Copy link
Member

vladikr commented Aug 20, 2019

/test pull-kubevirt-e2e-k8s-multus-1.13.3

@vladikr
Copy link
Member

vladikr commented Aug 21, 2019

/approved
/lgtm

@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 22, 2019
@booxter
Copy link
Contributor Author

booxter commented Aug 22, 2019

/hold cancel

Now the ulimit adjustment happens only for VFIO attached VMIs. Also rebased the branch to latest master in hope that some job failures go away.

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 22, 2019
@booxter
Copy link
Contributor Author

booxter commented Aug 23, 2019

/retest

@slintes
Copy link
Contributor

slintes commented Aug 23, 2019

@booxter Travis is red, please run make generate

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 26, 2019
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 26, 2019
@booxter
Copy link
Contributor Author

booxter commented Aug 26, 2019

/retest

I couldn't reproduce the sriov job failure locally; let's see if it is consistent or a fluke.

@booxter
Copy link
Contributor Author

booxter commented Aug 27, 2019

/retest

booxter and others added 6 commits August 27, 2019 06:01
This syscall is implemented in libraries we use anyway, so we can
easily avoid dealing with unsafe pointers etc.
Instead, set the (unlimited) limit for libvirtd from handler pod that
is already privileged.
Instead of setting to unlimited, try to estimate the actual amount
libvirtd may need for the VM. The actual formula in libvirtd code is
very complex and hard to reproduce (it involves estimating necessary
resources based on NUMA topology, number of CPUs, platform specific
requirements for memory alignment etc.) We are not going to reproduce it
in kubevirt, instead making our best conservative guess and then
allowing libvirtd to set the actual calculated value (that should work
as long as the value used by libvirtd is lower than the limit we set in
kubevirt).
Libvirtd configures the limit in particular domain configurations
only. This limit adjustment is of no use for VMIs not attached to
VFIO.
@vladikr
Copy link
Member

vladikr commented Aug 28, 2019

/retest

@slintes
Copy link
Contributor

slintes commented Aug 28, 2019

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 28, 2019
@slintes
Copy link
Contributor

slintes commented Aug 28, 2019

/retest

@kubevirt-commenter-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

3 similar comments
@kubevirt-commenter-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-commenter-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-commenter-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-bot kubevirt-bot merged commit 05e84b0 into kubevirt:master Aug 30, 2019
@booxter booxter deleted the ulimit branch August 30, 2019 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants