Has anyone encountered skylight in Amazon WorkSpaces using extremely high CPU and memory?


Over the past few days my Amazon WorkSpace keeps freezing up (Session Interrupted). I am not positive, but the culprit seems to be skylight, which is consuming over 28.5GB of reserved memory and 148.3GB of virtual memory, as well as between 100% and 300% of CPU (on an 8 vCPU machine). I also see kswapd0 using a high level of CPU, but not constantly. The WorkSpace is running Amazon Linux 2.

I have updated every package that I can and rebooted the workspace (multiple times) and still see this high usage with no processes running and all docker containers stopped.

If anyone has had a similar problem and discovered a solution I would be very appreciative.



Silent Kay On

Yes, I am having this issue too. systemctl stop skylight-agent doesn't stop it, because the restart policy is set to "always", so the service just restarts when you stop or kill it. And if you systemctl disable the service, your workspace gets restarted.

As a workaround, I have limited its CPU and RAM usage by doing the following:

  1. systemctl status skylight-agent

  2. Look for "Loaded:", i.e. Loaded: loaded (/usr/lib/systemd/system/skylight-agent.service)

  3. sudo nano /usr/lib/systemd/system/skylight-agent.service

  4. Add the following to the file:

Snippet:

[Service]
ExecStart=/usr/sbin/skylight
Restart=always
StandardOutput=journal+console
KillMode=process
CPUQuota=50%
MemoryMax=5000M
MemoryLimit=5000M
MemoryHigh=4500M

MemoryMax and MemoryLimit should be the same, but depending on the version of your instance, one works and the other doesn't (MemoryLimit is the older cgroup v1 setting; MemoryMax is its cgroup v2 replacement). I just put both of them in, because it didn't feel like it respected MemoryMax.

  5. Save the file, and then run sudo systemctl daemon-reload
  6. Restart the service: sudo systemctl restart skylight-agent.service
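One possible variation on the steps above, sketched here and not tested on a WorkSpace: put the limits in a systemd drop-in so they live outside the packaged unit file, which a package update could overwrite. The drop-in directory follows the standard systemd convention for overrides; the function name write_skylight_limits and the file name limits.conf are made up for this example, and the values are simply the ones from the snippet above.

```shell
#!/bin/sh
# Sketch: write the resource limits as a systemd drop-in instead of editing
# the packaged unit file. write_skylight_limits and limits.conf are
# hypothetical names chosen for this example.

write_skylight_limits() {
    # $1: drop-in directory (defaults to the standard systemd override path)
    dropin_dir="${1:-/etc/systemd/system/skylight-agent.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # Both MemoryLimit and MemoryMax are written, as in the answer, because
    # only one of them may be honoured depending on the systemd version.
    cat > "$dropin_dir/limits.conf" <<'EOF'
[Service]
CPUQuota=50%
MemoryLimit=5000M
MemoryMax=5000M
MemoryHigh=4500M
EOF
}

# Usage (as root):
#   write_skylight_limits
#   systemctl daemon-reload
#   systemctl restart skylight-agent.service
```

The function only writes the file; reloading systemd and restarting the service are left as explicit manual steps so nothing is restarted by accident.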

When the MemoryHigh threshold is hit, the service should be throttled.

I had to put the above in place because the agent is halting my workspace. This has helped control the workload. Hope the above helps someone.

Robert Hunt On

Just to add, we're also seeing the same sort of issues with high resource usage on skylight-agent on one of our workspaces. We operate several but it only seems to be affecting one of them at the moment which is a PowerPro (8 vCPU, 32GB memory) running Amazon Linux 2. First occurrence was on Friday 15th March 2024 and we ended up rebuilding the workspace but it's also started happening again today (Thursday 21st March 2024).

We're trying the workaround by Silent Kay at the moment.

IT Robot On

This seems to be an issue with AWS Skylight agent 1.0-200592.0.x86_64, according to AWS support.

They provided us with this temp fix:

A temporary workaround would be to downgrade to 1.0-200586.0.x86_64:

$ sudo yum downgrade skylight-agent-1.0-200586.0.x86_64 --enablerepo=skylight

Kindly note that this is only a workaround, because rebooting will update skylight to version 1.0-200592.0 again.
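Since a reboot can bring the newer build back, it may help to check which build is installed before and after. A minimal sketch, assuming the package is named skylight-agent (as in the yum command above) and that rpm -q prints the usual name-version-release.arch string; is_affected_build is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether an installed skylight-agent package string matches
# the build that AWS support named as affected. is_affected_build is a
# hypothetical helper name.

AFFECTED_BUILD="1.0-200592.0"

is_affected_build() {
    # $1: package string as printed by `rpm -q skylight-agent`
    case "$1" in
        *"$AFFECTED_BUILD"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   if is_affected_build "$(rpm -q skylight-agent)"; then
#       echo "affected build installed"
#   fi
```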

I would just inform your users and either turn off the service for the time being or downgrade and ask them not to reboot.

Better to make a case with them, so that they reach back out to you to confirm once the fix has been implemented.

Silent Kay On

Fix found!

So it seems like the newest version of the Skylight agent, "200592", is stuck trying to delete log files. The log files differ from machine to machine, so yours may be unique to you. We can look at the logs and manually delete them.

Anyway, to find the offending area and fix it, do the following:

Run cat against /var/log/skylight/slwsconfigservice.log.* and look for a line like:

Error trying to remove the file 2024032201log.winbindd.old-2024032201 : remove /var/log/samba/old/TRANSMITTED/2024032201log.winbindd.old-2024032201: no such file or directory

The path /var/log/samba/old/TRANSMITTED/ was the culprit. After doing a rm -rf /var/log/samba/old/TRANSMITTED/*, I then had to restart the skylight-agent for it to release the RAM (it was hogging 25-30 GB). To do that: sudo systemctl restart skylight-agent.service.

After that, skylight was only using 3 MB of RAM and 0% CPU.
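Because the offending path can differ per machine, the log search can be scripted instead of reading the cat output by eye. This is a rough sketch that assumes every relevant line follows the format of the single error line quoted above; list_stuck_dirs is a hypothetical helper name, and it only prints directories without deleting anything.

```shell
#!/bin/sh
# Sketch: list the directories named in "Error trying to remove the file"
# lines. list_stuck_dirs is a hypothetical helper name; the message format
# is assumed from the one sample line quoted in the answer.

list_stuck_dirs() {
    # $@: log files to scan, e.g. /var/log/skylight/slwsconfigservice.log.*
    grep -h 'Error trying to remove the file' "$@" 2>/dev/null |
        # Keep only the directory part of the path after ": remove ".
        sed -n 's|.*: remove \(/.*\)/[^/]*: no such file or directory.*|\1|p' |
        sort -u
}

# Usage:
#   list_stuck_dirs /var/log/skylight/slwsconfigservice.log.*
```

Review the printed directories before running rm -rf against any of them, since that command is not reversible.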

That's it, good luck.