Your input wanted on device health monitoring

NeerajK
JumpCloud Employee

Hi there!

We're exploring the development of a device health monitoring and alerting solution. This tool would help you keep track of your devices' performance, posture, and potential issues, and would alert you when attention is needed.

We are super interested in hearing your insights and suggestions:

  1. What are your biggest pain points or challenges when it comes to monitoring your device's health?
  2. What features or capabilities would you find most useful in a monitoring and alerting tool?
  3. How and where would you prefer to receive alerts or notifications (e.g., email, messaging, ticketing solution)?
  4. What are the top metrics or components you'd want this tool to monitor (e.g., CPU, disk space, battery health, allowed/denied apps, processes, event logs, etc.)?
  5. Would you find value in auto-remediation capabilities? What types of issues would you feel comfortable letting the tool tackle automatically?

While we're picking your brains, we're also curious to hear your thoughts on receiving alerts beyond just device health monitoring. Would you be interested in getting notified about things like:

  • Potential security risks or policy violations related to identity and access?
  • Successful and failed login attempts across systems, SSO, web apps?

Would love to know if that unified approach resonates with you or if you'd prefer these capabilities to remain separate.

Appreciate any feedback/thoughts/comments you might have!

 

3 REPLIES

bwitzig_Zen
Novitiate III

- When diagnosing device health, it can be hard to determine the root cause of slowness, as there can be a lot of factors.

- CPU usage percentage is tricky: a machine can feel "slow" even at seemingly low overall CPU usage (8%, for example, on a 16-thread CPU). On low-core-count machines, Windows will also use up "idle" capacity to apply updates and repair or update indexes. Hybrid architectures becoming more common makes this even trickier to diagnose.

- For laptops, CPU usage % can also be tricky, as battery power saving and CPU throttling can cause inconsistent performance. Temperature can also be a key factor, but it's hard to measure because "turbo boost" is intended to handle bursty loads, which cause the temperature to rise quickly.

- Memory usage (active/compressed) will hit roughly 80%, then the OS will start moving over to swap, whereas a lot of people will assume there is still plenty of free memory. Apple offers memory pressure as a decent way to identify how burdened the system is.

- Network performance monitoring is also tricky. Should we use ICMP to major services (O365? Google? etc.)? Do we have a way to check for bufferbloat? If so, how do we get this to work without impacting the client's actual network performance or data usage (do we move bufferbloat/speed tests to an "on demand" flow)?

- For notifications, the admins' preferred chat channel and the JumpCloud admin portal, with the option of client notifications that include tips on what to do, could be useful. Major or longer-term issues (ongoing high memory/CPU usage, etc.) should be raised as a ticket.

- Disk usage is important, but notifications for S.M.A.R.T. alerts would also be amazing, as this usually isn't easily exposed to the client, or is exposed in a confusing way. Also, being over 80% usage on an SSD can impair the life of the drive.

- Setting up conditional access to have a "warning" section could also be useful. Admins usually don't have the time to comb through logs, so anything that helps pick out the logs that actually need attention is useful.

- I would suggest integrating some diagnostic options for clients in the JumpCloud app. Currently it allows you to reset/sync your password, but it could do so much more.

- "Service" checks to allow us to make sure specific software is running (AV/endpoint tools, etc.)

- List of applications detected on the machine with the ability to (attempt to) remove remotely

- Ability to identify device age to indicate when assets "age out"

This, combined with both per-machine and more global analytics, would help us make better hardware choices.
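
To make the CPU point above concrete, here is a minimal Python sketch (an illustration only, not an existing JumpCloud feature). It assumes an agent has already collected per-thread utilization samples; the function name `summarize_cpu`, the 90% "busy" threshold, and the 25% aggregate cutoff are all hypothetical values picked for the example.

```python
from statistics import mean


def summarize_cpu(per_thread_percent, busy_threshold=90.0, low_aggregate=25.0):
    """Summarize per-thread CPU utilization samples (each 0-100).

    busy_threshold and low_aggregate are assumed, illustrative cutoffs.
    """
    if not per_thread_percent:
        raise ValueError("need at least one per-thread sample")

    aggregate = mean(per_thread_percent)
    saturated = sum(1 for p in per_thread_percent if p >= busy_threshold)

    return {
        "aggregate": aggregate,
        "busiest": max(per_thread_percent),
        "saturated_threads": saturated,
        # Low overall usage with at least one pegged thread suggests a
        # single-threaded bottleneck rather than an idle machine.
        "hidden_bottleneck": saturated > 0 and aggregate < low_aggregate,
    }


if __name__ == "__main__":
    # One pegged thread on a 16-thread CPU, the rest nearly idle.
    sample = [100.0] + [2.0] * 15
    print(summarize_cpu(sample))
```

With one thread pegged and fifteen nearly idle, the aggregate lands around 8% even though one thread has no headroom left, which is why alerting on the overall percentage alone can miss a machine that feels slow.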

Joranna_Ng
Rising Star I

In addition to @NeerajK's questions, it would be great to hear if any of you are currently using a device health monitoring and alerting solution.
1. What are you using the solution for?
2. What's good and not so good about it?

Thanks!
Joranna

gus_feliciano
Novitiate I

- Metrics to understand memory and CPU consumption on macOS and Windows devices, for example, being able to evaluate which process or service is demanding the most from the machine.

- I think it would be interesting to have a dashboard to measure consumption, where the admin can filter by OS and hostname. It could show the average consumption of each machine, to generate a report and identify possible performance problems.

- Disk full alerts would be really cool.
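
As a rough idea of what a disk-full alert could look like, here is a minimal Python sketch using only the standard library. The names `disk_alert` and `check_path` are hypothetical, the 80% warning level is borrowed from the SSD comment earlier in this thread, and the 95% critical level is purely an assumption.

```python
import shutil


def disk_alert(total_bytes, used_bytes, warn_at=0.80, critical_at=0.95):
    """Classify disk usage as 'ok', 'warning', or 'critical'.

    warn_at and critical_at are assumed, illustrative thresholds.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")

    fraction = used_bytes / total_bytes
    if fraction >= critical_at:
        level = "critical"
    elif fraction >= warn_at:
        level = "warning"
    else:
        level = "ok"
    return level, fraction


def check_path(path):
    """Read real usage for the volume containing `path` and classify it."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return disk_alert(usage.total, usage.used)


if __name__ == "__main__":
    level, fraction = check_path(".")
    print(f"{level}: {fraction:.0%} used")
```

Keeping the classification separate from the `shutil.disk_usage` call means the thresholds can be tested without touching a real disk, and the result could be routed to whichever channel (chat, email, ticket) the admin prefers.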
