Remove SYS_RESOURCE capability from launcher pod #2584

booxter · 2019-08-09T00:53:52Z

Instead, set the (unlimited) limit for libvirtd from handler pod that
is already privileged.

Remove SYS_RESOURCE capability from SR-IOV attached VMI launcher pods

booxter · 2019-08-09T00:55:29Z

/hold

Need to consider calculating the exact limit size in handler (the formula is not trivial but we can do a reasonable estimate).

slintes · 2019-08-09T10:02:12Z

Heads up: we just merged a PR that introduces KubeVirt-specific SCCs instead of using the privileged SCC. I guess the capability can be removed there as well:

kubevirt/pkg/virt-operator/creation/components/scc.go, line 77 in 548ed63:

    scc.AllowedCapabilities = []corev1.Capability{"NET_ADMIN", "SYS_NICE", "SYS_RESOURCE"}

booxter · 2019-08-16T18:45:31Z

Unit test coverage is down somewhat because the code dealing with processes and ulimits is not unit tested. The existing functional tests should be enough to catch regressions: as long as VMIs still start correctly and have SR-IOV interfaces inside the guest, the change works. There is little value in validating that the capability is no longer present, because it is trivial to double check that it is no longer referenced anywhere in the code.

booxter · 2019-08-16T18:45:36Z

/hold cancel

booxter · 2019-08-16T18:46:10Z

@phoracek @SchSeba this PR should be ready to review and merge.

phoracek · 2019-08-16T19:11:11Z

pkg/virt-handler/isolation/isolation.go

+
+func (s *socketBasedIsolationDetector) AdjustResources(vm *v1.VirtualMachineInstance) error {
+	// bump memlock ulimit for libvirtd
+	res, err := s.Detect(vm)


could we use a more specific name than result?

(won't block the PR on this one, if you answer no no no to my comments, this PR is good to go)

it's of "IsolationResult" type so I thought it's a good enough name. I am ok with renaming. What would be a better name?

It is only used in this block, and I don't really know. Since the other comments are resolved, I won't bother.

phoracek

LGTM

booxter · 2019-08-16T21:39:18Z

/retest

SchSeba · 2019-08-18T11:11:42Z

/retest

SchSeba

just a small comment

SchSeba · 2019-08-18T11:45:04Z

pkg/virt-handler/vm_test.go

@@ -1286,6 +1288,24 @@ func (m *MockGracefulShutdown) TriggerShutdown(vmi *v1.VirtualMachineInstance) {
 	Expect(err).NotTo(HaveOccurred())
 }

+type MockIsolationDetector struct{}


Shouldn't this be auto-generated?

OK I've found a way to make it work without manually writing a fake.

vladikr

Looks good to me 👍

phoracek · 2019-08-19T12:41:21Z

/retest

SchSeba · 2019-08-20T08:49:57Z

/retest

vladikr · 2019-08-20T14:08:42Z

/test pull-kubevirt-e2e-k8s-multus-1.13.3

vladikr · 2019-08-21T13:13:08Z

/approved
/lgtm

booxter · 2019-08-22T20:07:16Z

/hold cancel

Now the ulimit adjustment happens only for VFIO attached VMIs. Also rebased the branch to latest master in hope that some job failures go away.

booxter · 2019-08-23T13:41:15Z

/retest

slintes · 2019-08-23T13:49:47Z

@booxter Travis is red, please run make generate

booxter · 2019-08-26T21:05:16Z

/retest

I couldn't reproduce the sriov job failure locally; let's see if it is consistent or a fluke.

booxter · 2019-08-27T04:26:58Z

/retest

This syscall is implemented in libraries we use anyway, so we can easily avoid dealing with unsafe pointers etc.

Instead, set the (unlimited) limit for libvirtd from handler pod that is already privileged.

Instead of setting the limit to unlimited, try to estimate the amount libvirtd may actually need for the VM. The formula in the libvirtd code is complex and hard to reproduce: it estimates the necessary resources based on NUMA topology, number of CPUs, platform-specific memory alignment requirements, and so on. We are not going to reproduce it in kubevirt. Instead, we make a conservative best guess and then let libvirtd set the value it actually calculates, which works as long as that value does not exceed the limit we set in kubevirt.
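The conservative-estimate idea can be sketched as follows. This is an illustration under stated assumptions, not the formula used in the PR: the thread does not give the numbers, so the fixed 1 GiB allowance (vfioOverheadBytes) and both function names are hypothetical.

```go
package main

// vfioOverheadBytes is an assumed fixed allowance added on top of guest
// memory. The value is hypothetical and chosen only for illustration.
const vfioOverheadBytes uint64 = 1 << 30

// estimateMemlockBytes (hypothetical) returns a conservative ceiling for
// the memory lock limit: guest memory plus the fixed allowance.
func estimateMemlockBytes(guestMemoryBytes uint64) uint64 {
	return guestMemoryBytes + vfioOverheadBytes
}

// limitSuffices reports whether a value computed later by libvirtd fits
// under the ceiling set in advance. An unprivileged process can only pick
// a limit that does not exceed the ceiling it was given.
func limitSuffices(ceiling, libvirtValue uint64) bool {
	return libvirtValue <= ceiling
}
```

With these assumptions, a guest with 2 GiB of memory gets a 3 GiB ceiling, and any value libvirtd computes up to 3 GiB can still be applied without SYS_RESOURCE.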

Libvirtd configures the limit in particular domain configurations only. This limit adjustment is of no use for VMIs not attached to VFIO.
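A rough sketch of gating the adjustment on VFIO attachment is shown below. The Interface and VMI types are simplified stand-ins invented for this example, not the KubeVirt API types, and needsMemlockAdjustment is a hypothetical name.

```go
package main

// Interface is a simplified, hypothetical stand-in for a VMI network
// interface. SRIOV marks an interface backed by a VFIO passthrough device.
type Interface struct {
	Name  string
	SRIOV bool
}

// VMI is a simplified, hypothetical stand-in for a virtual machine instance.
type VMI struct {
	Interfaces []Interface
}

// needsMemlockAdjustment reports whether any interface uses SR-IOV, which
// is the only case in which the memory lock limit would be raised.
func needsMemlockAdjustment(vmi VMI) bool {
	for _, iface := range vmi.Interfaces {
		if iface.SRIOV {
			return true
		}
	}
	return false
}
```

A caller would check needsMemlockAdjustment first and skip the limit change entirely for VMIs without such an interface.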

vladikr · 2019-08-28T12:48:31Z

/retest

slintes · 2019-08-28T17:41:57Z

/lgtm

slintes · 2019-08-28T20:53:38Z

/retest

kubevirt-commenter-bot · 2019-08-29T02:29:05Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

kubevirt-commenter-bot · 2019-08-29T09:36:06Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

kubevirt-commenter-bot · 2019-08-30T02:53:07Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

kubevirt-commenter-bot · 2019-08-30T06:57:05Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

kubevirt-bot added the do-not-merge/release-note-label-needed and size/XXL labels Aug 9, 2019

kubevirt-bot requested review from stu-gott and vladikr August 9, 2019 00:54

kubevirt-bot added the release-note label and removed the do-not-merge/release-note-label-needed label Aug 9, 2019

kubevirt-bot added the do-not-merge/hold label Aug 9, 2019

booxter force-pushed the ulimit branch 3 times, most recently from 795ca3b to 87dae49 August 16, 2019 16:19

kubevirt-bot removed the do-not-merge/hold label Aug 16, 2019

phoracek requested changes Aug 16, 2019

kubevirt-bot assigned phoracek Aug 16, 2019

phoracek approved these changes Aug 16, 2019

kubevirt-bot added the lgtm label Aug 16, 2019

SchSeba reviewed Aug 18, 2019

vladikr approved these changes Aug 19, 2019

kubevirt-bot assigned vladikr Aug 19, 2019

kubevirt-bot removed the lgtm label Aug 19, 2019

booxter force-pushed the ulimit branch from f530b27 to 581db11 August 22, 2019 20:06

kubevirt-bot removed the lgtm label Aug 22, 2019

kubevirt-bot removed the do-not-merge/hold label Aug 22, 2019

booxter force-pushed the ulimit branch from 581db11 to 7bb7cd0 August 23, 2019 14:20

kubevirt-bot added the needs-rebase label Aug 26, 2019

booxter force-pushed the ulimit branch from 7bb7cd0 to 4100351 August 26, 2019 13:39

kubevirt-bot removed the needs-rebase label Aug 26, 2019

booxter and others added 6 commits August 27, 2019 06:01

Use syscall.Setrlimit instead of direct RawSyscall6

1034cd2

This syscall is implemented in libraries we use anyway, so we can easily avoid dealing with unsafe pointers etc.

Remove SYS_RESOURCE capability from launcher pod

dc7fa75

Instead, set the (unlimited) limit for libvirtd from handler pod that is already privileged.

Use generated mock for IsolationDetector instead of in-house

a3418f1

Don't adjust RLIMIT_MEMLOCK for libvirtd unless VFIO is attached

228587f

Libvirtd configures the limit in particular domain configurations only. This limit adjustment is of no use for VMIs not attached to VFIO.

Document how we estimate memory lock size for libvirtd

17db974

booxter force-pushed the ulimit branch from 4785bd5 to 17db974 August 27, 2019 13:37

kubevirt-bot assigned slintes Aug 28, 2019

kubevirt-bot added the lgtm label Aug 28, 2019

kubevirt-bot merged commit 05e84b0 into kubevirt:master Aug 30, 2019

booxter deleted the ulimit branch August 30, 2019 15:51
