2022-12-06T10:54:07.864+00:00
Comments on Graham White: My Notes: Building Tensorflow GPU on Fedora Linux
Graham White (http://www.blogger.com/profile/03878311939940449093)

2019-09-13T09:24:08.860+01:00
OK, I finally got around to writing up my method at http://gibbalog.blogspot.com/2019/09/installing-tensorflow-gpu-on-fedora.html
Graham White

2019-09-05T19:30:23.790+01:00
> Basically, I'm using the CUDA libraries from the official NVidia repositories alongside graphics drivers from either the RPMFusion or Negativo repositories.

Interesting! Looking forward to seeing how you do it :-)

That would indeed be a nice way to get up and running.
Anonymous

2019-09-05T18:07:14.965+01:00
Thanks for sharing the video. I'll be interested to see your method when I get a chance to watch the full thing in due course.

I have another (updated) blog post in draft with my new method that I'll get around to publishing in the next week or two when I've finished writing the content. Basically, I'm using the CUDA libraries from the official NVidia repositories alongside graphics drivers from either the RPMFusion or Negativo repositories. Doing this allows me to essentially back-level CUDA to CUDA 10.0. Once you're on that level you can simply pip install the pre-compiled Tensorflow GPU module and it works just fine out of the box.
It's a much simpler method and far quicker to get a Fedora box up and running with Tensorflow but does mean you're not compiling explicitly for your own hardware so you may lose out on some run time optimisations.
Graham White

2019-09-05T16:40:03.807+01:00
Hi Graham,
finally got back to sorting out the problem and made a video about it :-)

https://www.youtube.com/watch?v=2ld-9gFyCYs&t=18s

So it works, but it needs some manual adjustments. I'll also create an issue in TF to see if they want to adapt the build rules accordingly (not literally, of course, but in some less hacky way ;-)))
Anonymous

2019-07-30T21:28:25.907+01:00
Best of luck. I'll update the post (or perhaps write another one) should I have another go at this at some point. This will involve me doing it on a different machine at work or buying a new GPU for my box at home though so it might be a while. For reference, it should be possible not to get stuck with the requirement on building in /usr but I've not tried with the very latest stack so can't offer much more sensible advice than the usual suggestions of checking you've got the various build prefixes set in the appropriate place(s).
Graham White

2019-07-30T21:26:01.535+01:00
Thanks! I had been on that page but now looking again I see

Tweak the /usr/local/cuda-9.2/targets/x86_64-linux/include/host_defines.h to accept the Fedora default compiler. (Not recommended).

That sounds like they install into /usr/local?
In that case I really might try that (hoping it helps with the source build)...
Anonymous

2019-07-30T21:22:14.450+01:00
I've just checked and it looks like CUDA 10.0 has been removed from Negativo17 so it won't be possible to downgrade. You could check to see if RPM Fusion offers a 10.0 install? They have guides for the NVidia driver (https://rpmfusion.org/Howto/NVIDIA) as well as CUDA (https://rpmfusion.org/Howto/CUDA).
Graham White

2019-07-30T21:14:36.753+01:00
Hi Graham,
thanks for responding, and I'm glad to hear you're still interested in this! :-)
I'm pretty sure they must have changed something in the last few weeks, as I could still successfully build in June (with CUDA 10.1 from negativo17).

But already then I found it weird to see that they were doing stuff below /usr that couldn't really be intended to be done there (for example, copying things like sudo, sudoedit etc. somewhere else).

And now the failure evidently is that they're trying to create a tempfile in /usr (which fails because of missing permissions).

So this is why I think the current build needs a separate CUDA in /usr/local.

Right now I've really run out of ideas what to do (I'm also not a build specialist so I can't just quickly tweak the cuda part of the build ...).

If you ever try again it would be great if you could let me know how it works for you now :-)
BTW CUDA 10.1 works fine when you build from source, it's just the PyPi wheels that need CUDA 10.0 ...

One question, as you mention that, would you happen to know how I can downgrade to CUDA 10.0 from negativo17, to at least run a PyPi wheel? dnf downgrade just gives me a prior version of 10.1.
Thanks!
Anonymous

2019-07-30T20:53:55.574+01:00
Hi Sigrid,
Thanks for letting me know (https://twitter.com/zkajdan/status/1156212116220760064) about your comment. I found a problem where Google wasn't notifying me of comments to the blog so I hope I've fixed that now - many thanks!

Building TF on Fedora does still work for me, yes. It's very similar to the process I've documented in the blog post, except that one or two bugs have been fixed and one or two other problems have been introduced. However, I'm now at the stage where the hardware on my work laptop is too old (CUDA compute capability is too low) for the supported versions of drivers that are easily installed. For example, Negativo17 tends to support the latest CUDA version whereas Tensorflow doesn't. This isn't necessarily a major issue as you can downgrade to an earlier CUDA from Negativo17 and still get that side of things going. Where I struggle now is that the holy grail of requirements just doesn't work on my laptop when trying to match up the CUDA version with the Tensorflow version and the compute capability I have.

So yes, I'm still very much interested and I really wish they'd make Fedora as much of a first-class citizen as they do Ubuntu but that seems unlikely as they favour using Docker images for GPU acceleration these days anyway.
Graham White

2019-07-30T15:32:55.313+01:00
Hi,
I'm happy to find a fellow TF-on-Fedora nerd ... May I ask if building TF on Fedora still works for you?
I switched from manual driver + cuda installation to negativo17 about 2 months ago (on a fresh install of F30), and at first I was able to build successfully (there were a few quirks though, see https://github.com/tensorflow/tensorflow/issues/29797).

Then a few weeks ago I got a new error - see https://groups.google.com/a/tensorflow.org/forum/#!topic/build/AB_nEXhUF0E - and I've not been able to fix that one. By now I think the current TF build does require a CUDA installation in /usr/local/cuda, because they are doing lots of stuff in /usr/local/cuda/bin which, if one has /usr as the CUDA root, just ends up being /usr/bin, which is pretty horrible ;-))

I was hoping for someone to react to my above mail to TF, but there was no answer... I do understand they have higher priorities, but I'd still like to be able to build from source, and I'd rather stay with negativo17 which overall works a lot better than the manual method (no dkms failures, etc.).

Just asking for your current experience (if you're still interested)? Thanks!
Anonymous
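
Two checks come up repeatedly in this thread: whether the CUDA root is /usr (so that its bin directory is the system /usr/bin), and whether the installed CUDA release is the 10.0 that the PyPI wheels are said to need. A minimal Python sketch of both follows. It rests on assumptions not in the thread: it guesses the CUDA root from the location of nvcc, which is not necessarily how the TF build locates CUDA, and it relies on `nvcc --version` printing a "release X.Y" string. The function names are hypothetical.

```python
import re
import shutil
import subprocess
from pathlib import PurePosixPath
from typing import Optional, Tuple

# From the thread: the PyPI wheels need CUDA 10.0.
WHEEL_CUDA = (10, 0)


def parse_nvcc_release(text: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_root_from_nvcc(nvcc_path: str) -> PurePosixPath:
    """Assumption: the CUDA root is the parent of the directory holding nvcc."""
    return PurePosixPath(nvcc_path).parent.parent


def describe(nvcc_path: str, version_text: str) -> str:
    """Summarise the two concerns raised in the thread for one installation."""
    root = cuda_root_from_nvcc(nvcc_path)
    release = parse_nvcc_release(version_text)
    notes = [f"CUDA root: {root}"]
    if root == PurePosixPath("/usr"):
        notes.append("root is /usr, so its bin directory is the system /usr/bin")
    if release is None:
        notes.append("could not parse the CUDA release")
    elif release != WHEEL_CUDA:
        notes.append(
            f"found CUDA {release[0]}.{release[1]}, but the PyPI wheels need CUDA 10.0"
        )
    return "; ".join(notes)


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH")
    else:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        print(describe(nvcc, result.stdout))
```

Run on the machine in question, this prints one summary line. It only reports the mismatch; it does not attempt the downgrade or the /usr/local install discussed in the thread.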