Tech

Technical insights and development updates

How we fixed CUDA Error 101: invalid device ordinal ... torch._C._cuda_getDeviceCount() > 0 🤯

This article details how a team fixed a server issue in which one of eight GPUs went offline because of a loose power connector. Attempts to work around the problem through configuration changes failed. What worked was unbinding the faulty GPU directly from the NVIDIA driver, a quick fix that got the server running again without a reboot. The story is a reminder that simple, direct fixes are often the most effective ones in troubleshooting.
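The summary does not give the exact commands used, so the following is only a minimal sketch of the general Linux mechanism for detaching one PCI device from a driver: writing the device's PCI address to the driver's `unbind` file under sysfs. The function name `unbind_gpu`, its parameters, and the address `0000:3b:00.0` are hypothetical and chosen for illustration; the real address would have to come from a tool such as `lspci` or `nvidia-smi`. The write needs root privileges, and the sketch defaults to a dry run so that nothing is touched by accident.

```python
import re
from pathlib import Path

# Full PCI address: domain:bus:device.function, e.g. 0000:3b:00.0
PCI_ADDR_RE = re.compile(
    r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$"
)


def unbind_gpu(pci_addr: str, driver: str = "nvidia",
               sysfs_root: str = "/sys", dry_run: bool = True) -> Path:
    """Ask the kernel to detach one PCI device from its driver.

    Hypothetical helper (not from the article). It relies on the standard
    sysfs file /sys/bus/pci/drivers/<driver>/unbind, which accepts a PCI
    address. Returns the path of the unbind file that was (or would be)
    written.
    """
    if not PCI_ADDR_RE.match(pci_addr):
        raise ValueError(f"not a full PCI address: {pci_addr!r}")
    unbind_file = (
        Path(sysfs_root) / "bus" / "pci" / "drivers" / driver / "unbind"
    )
    if not dry_run:
        # Requires root; the kernel detaches the device on write.
        unbind_file.write_text(pci_addr)
    return unbind_file


# Dry run with an assumed address: only reports which file would be written.
print(unbind_gpu("0000:3b:00.0"))
```

Passing `dry_run=False` performs the actual write. The `sysfs_root` parameter exists only so the helper can be exercised against a scratch directory instead of the live `/sys` tree.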

Saravana Rathinam

March 23, 2024

Puppet™ and Puppetry™ are trademarks of ELBO AI Inc.

© 2024 ELBO AI Inc. All rights reserved.