tag:blogger.com,1999:blog-2027699993235411172.comments2022-12-06T10:54:07.864+00:00Graham White: My NotesGraham Whitehttp://www.blogger.com/profile/03878311939940449093noreply@blogger.comBlogger168125tag:blogger.com,1999:blog-2027699993235411172.post-86538317262110870352021-03-18T08:59:02.340+00:002021-03-18T08:59:02.340+00:00No need for the excludes any longer. I'll pas...No need for the excludes any longer. I'll paste my repo file content below but note I leave them disabled by default (enabled=0) so I can pick and choose manually on the command line when I want to pull new bits from those repositories (using --enablerepo=nvidia-cuda --enablerepo=nvidia-ml)<br /><br />[nvidia-cuda]<br />name=NVidia Cuda<br />baseurl=http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64<br />enabled=0<br />gpgcheck=1<br />gpgkey=http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/7fa2af80.pub<br /><br />[nvidia-ml]<br />name=NVidia Machine Learning<br />baseurl=https://developer.download.nvidia.com/compute/machine-learning/repos/rhel8/x86_64/<br />enabled=0<br />gpgcheck=1<br />gpgkey=http://developer.download.nvidia.com/compute/machine-learning/repos/rhel8/x86_64/7fa2af80.pubGraham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-43277663165782565622021-03-18T04:52:50.228+00:002021-03-18T04:52:50.228+00:00For Fedora 33, did you still need to have exclude ...For Fedora 33, did you still need to have exclude lines in /etc/yum.repos.d/nvidia.repo? 
If so, do they need to be altered to make sure that the versions of tensorflow, cuda, and cuDNN all line up correctly?<br /><br />Thanks for posting this!Anonymoushttps://www.blogger.com/profile/09756393811225554917noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-80946280394530402022021-03-16T15:55:12.343+00:002021-03-16T15:55:12.343+00:00Just to confirm these notes are still working for ...Just to confirm these notes are still working for Fedora 33 and Tensorflow-gpu 2.4.1. However, you now need to "dnf install cuda-11-0.x86_64 libcudnn8 libnccl" and hook up your DNF to the RHEL 8 repositories [1][2] because these repositories are the ones with the correct current level of CUDA in them (i.e. CUDA 11.0)<br /><br />[1] http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64<br />[2] https://developer.download.nvidia.com/compute/machine-learning/repos/rhel8/x86_64/Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-47686433926965761382019-12-09T19:24:51.024+00:002019-12-09T19:24:51.024+00:00Ah yes, thanks Martin! That would be a cut and pa...Ah yes, thanks Martin! That would be a cut and paste error since I tend to leave it disabled on a day to day basis. Thanks for letting me know.Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-28384270419513598532019-12-09T19:22:16.499+00:002019-12-09T19:22:16.499+00:00A small note: your .repo file has 'enabled=0...A small note: your .repo file has 'enabled=0' in it. It took me a while to figure out that was why it wasn't installing anything. 
:)Martijn Faassenhttps://www.blogger.com/profile/11607525062261059367noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-3825142815983791892019-09-23T13:27:47.577+01:002019-09-23T13:27:47.577+01:00oh, and I have exclude=cuda* on the negativo17 rep...oh, and I have exclude=cuda* on the negativo17 repository in order to get the cuda packages from the NVidia repo instead.Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-31570023135752499862019-09-23T13:25:46.813+01:002019-09-23T13:25:46.813+01:00Yes, this process does work with negativo17 as wel...Yes, this process does work with negativo17 as well. I use the negativo17 approach on my work laptop so I have both working (with the RPM Fusion approach working at home). The only difference I can think of in addition to your observations is that the exclude line changes in the yum configuration you'll need. I have exclude=*cuda10.1* on the nvidia-machine-learning repository when using negativo17.Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-60619174754837173252019-09-23T13:09:03.912+01:002019-09-23T13:09:03.912+01:00Thanks for this post Graham, that is very useful!
...Thanks for this post Graham, that is very useful!<br />Basically it should work with negativo17 as well, right - only step 1 changes, while step 2 stays (basically, apart from package names) the same?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-80560788459233013032019-09-13T09:24:08.860+01:002019-09-13T09:24:08.860+01:00OK, I finally got around to writing up my method a...OK, I finally got around to writing up my method at http://gibbalog.blogspot.com/2019/09/installing-tensorflow-gpu-on-fedora.htmlGraham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-33385941622110824822019-09-05T19:30:23.790+01:002019-09-05T19:30:23.790+01:00> Basically, I'm using the CUDA libraries ...> Basically, I'm using the CUDA libraries from the official NVidia repositories alongside graphics drivers from either the RPMFusion or Negativo repositories. <br /><br />Interesting! Looking forward to seeing how you do it :-)<br /><br />That would indeed be a nice way to get up and running.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-13819713335614299832019-09-05T18:07:14.965+01:002019-09-05T18:07:14.965+01:00Thanks for sharing the video. I'll be interes...Thanks for sharing the video. I'll be interested to see your method when I get a chance to watch the full thing in due course.<br /><br />I have another (updated) blog post in draft with my new method that I'll get around to publishing in the next week or two when I've finished writing the content. Basically, I'm using the CUDA libraries from the official NVidia repositories alongside graphics drivers from either the RPMFusion or Negativo repositories. Doing this allows me to essentially back-level CUDA to CUDA 10.0. Once you're on that level you can simply pip install the pre-compiled Tensorflow GPU module and it works just fine out of the box. 
It's a much simpler method and far quicker to get a Fedora box up and running with Tensorflow but does mean you're not compiling explicitly for your own hardware so you may lose out on some run time optimisations.Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-60785908623234793152019-09-05T16:40:03.807+01:002019-09-05T16:40:03.807+01:00Hi Graham,
finally got back to sorting out the pr...Hi Graham,<br /><br />finally got back to sorting out the problem and made a video about it :-)<br /><br />https://www.youtube.com/watch?v=2ld-9gFyCYs&t=18s<br /><br />So it works, but it needs some manual adjustments. I'll also create an issue in TF to see if they want to adapt the build rules accordingly (not literally, of course, but in some less hacky way ;-)))Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-19680427079592312812019-08-06T10:32:56.894+01:002019-08-06T10:32:56.894+01:00Not happening for me, some directions lead to diff...Not happening for me. Some directions lead to different places or work differently over here, such as filling in fields at the new access point titled Bt, which can't be altered here. And gnubox is not recognizing the PC as a Bluetooth device, even though it is connected to the phone through Bluetooth. I keep bumping my head on this one, yet the back of my head won't let me quit. Maddening.G.https://www.blogger.com/profile/02566437179890099655noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-60023779769713678402019-07-30T21:28:25.907+01:002019-07-30T21:28:25.907+01:00Best of luck. I'll update the post (or perhap...Best of luck. I'll update the post (or perhaps write another one) should I have another go at this at some point. This will involve me doing it on a different machine at work or buying a new GPU for my box at home though so it might be a while. For reference, it should be possible not to get stuck with the requirement on building in /usr but I've not tried with the very latest stack so can't offer much more sensible advice than the usual suggestions of checking you've got the various build prefixes set in the appropriate place(s).Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-30984142981262078732019-07-30T21:26:01.535+01:002019-07-30T21:26:01.535+01:00Thanks! 
I had been on that page but now looking ag...Thanks! I had been on that page but now looking again I see<br /><br />Tweak the /usr/local/cuda-9.2/targets/x86_64-linux/include/host_defines.h to accept the Fedora default compiler. (Not recommended). <br /><br />That sounds like they install into /usr/local? In that case I really might try that (hoping it helps with the source build)...Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-51082748475069076512019-07-30T21:22:14.450+01:002019-07-30T21:22:14.450+01:00I've just checked and it looks like CUDA 10.0 ...I've just checked and it looks like CUDA 10.0 has been removed from Negativo17 so it won't be possible to downgrade. You could check to see if RPM Fusion offers a 10.0 install? They have guides for the <a href="https://rpmfusion.org/Howto/NVIDIA" rel="nofollow">NVidia driver</a> as well as <a href="https://rpmfusion.org/Howto/CUDA" rel="nofollow">CUDA</a>.Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-31069688991960539832019-07-30T21:14:36.753+01:002019-07-30T21:14:36.753+01:00Hi Graham,
thanks for responding, and I'm gla...Hi Graham,<br /><br />thanks for responding, and I'm glad to hear you're still interested in this! :-)<br />I'm pretty sure they must have changed something in the last few weeks, as I could still successfully build in June (with CUDA 10.1 from negativo17).<br /><br />But even then I found it weird to see that they were doing stuff below /usr that couldn't really be intended to be done there (like, copying stuff like sudo, sudoedit etc. some place else).<br /><br />And now the failure evidently is that they're trying to create a tempfile in /usr (which fails because of missing permissions). <br /><br />So this is why I think the current build needs a separate CUDA in /usr/local.<br /><br />Right now I've really run out of ideas about what to do (I'm also not a build specialist so I can't just quickly tweak the cuda part of the build ...).<br /><br />If you ever try again it would be great if you could let me know how it works for you now :-)<br />BTW CUDA 10.1 works fine when you build from source, it's just the PyPI wheels that need CUDA 10.0 ...<br /><br />One question, as you mention that, would you happen to know how I can downgrade to CUDA 10.0 from negativo17, to at least run a PyPI wheel? dnf downgrade just gives me a prior version of 10.1.<br />Thanks!Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-47859950460421170852019-07-30T20:53:55.574+01:002019-07-30T20:53:55.574+01:00Hi Sigrid,
Thanks for letting me know about your ...Hi Sigrid,<br /><br />Thanks for <a href="https://twitter.com/zkajdan/status/1156212116220760064" rel="nofollow">letting me know</a> about your comment. I found a problem where Google wasn't notifying me of comments to the blog so I hope I've fixed that now - many thanks!<br /><br />Building TF on Fedora does still work for me, yes. It's very similar to the process I've documented in the blog post except one or two bugs have been fixed but one or two other problems have been introduced as well. However, I'm now at the stage where the hardware on my work laptop is too old (CUDA compute capability is too low) for the supported versions of drivers that are easily installed. For example, Negativo17 tends to support the latest CUDA version whereas Tensorflow doesn't. This isn't necessarily a major issue as you can downgrade to an earlier CUDA from Negativo17 and still get that side of things going. Where I struggle now is that the holy grail of requirements just doesn't work on my laptop when trying to match up the CUDA version with the Tensorflow version and the compute capability I have.<br /><br />So yes, I'm still very much interested and I really wish they'd make Fedora as much of a first-class citizen as they do Ubuntu but that seems unlikely as they favour using Docker images for GPU acceleration these days anyway.Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-3255253469834478622019-07-30T15:32:55.313+01:002019-07-30T15:32:55.313+01:00Hi,
I'm happy to find a fellow TF-on-Fedora n...Hi,<br /><br />I'm happy to find a fellow TF-on-Fedora nerd ... May I ask if building TF on Fedora still works for you?<br />I switched from manual driver + cuda installation to negativo17 about 2 months ago (on a fresh install of F30), and at first I was able to build successfully (there were a few quirks though, see https://github.com/tensorflow/tensorflow/issues/29797).<br /><br />Then a few weeks ago I got a new error - see https://groups.google.com/a/tensorflow.org/forum/#!topic/build/AB_nEXhUF0E - and I've not been able to fix that one. By now I think the current TF build does require a CUDA installation in /usr/local/cuda, because they are doing lots of stuff in /usr/local/cuda/bin which if one has /usr as the CUDA root, just ends up to be /usr/bin which is pretty horrible ;-))<br /><br />I was hoping for someone to react to my above mail to TF, but there was no answer... I do understand they have higher priorities, but I'd still like to be able to build from source, and I'd rather stay with negativo17 which overall works a lot better than the manual method (no dkms failures, etc.).<br /><br />Just asking for your current experience (if you're still interested)? Thanks!Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-87653578666258029742018-06-04T17:52:55.972+01:002018-06-04T17:52:55.972+01:00You could have mentioned my solely uphill walk fro...You could have mentioned my solely uphill walk from QECP to join you all for that last stretch on Saturday! 
No mean feat for someone my age and with no training!<br /><br />TimAnonymoushttps://www.blogger.com/profile/09669547931173330586noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-72662598692635002472017-07-08T19:06:25.606+01:002017-07-08T19:06:25.606+01:00Thanks, it took me hours to go through this proced...Thanks, it took me hours to go through this procedure (due to kernel version mismatch) but it's the only way I could get rev. E of the chip (EU version) working for my RPi 2. Phew! ^^matt4054https://www.blogger.com/profile/04804304021692595393noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-72090051254675135792017-04-03T01:36:25.332+01:002017-04-03T01:36:25.332+01:00pi@raspberrypi ~/Download/rtl8192eu $ sudo make AR...pi@raspberrypi ~/Download/rtl8192eu $ sudo make ARCH=arm<br />make ARCH=arm CROSS_COMPILE= -C /lib/modules/4.4.13+/build M=/home/pi/Download/rtl8192eu modules<br />make[1]: Entering directory '/root/linux-9892c762a4f8a3f56356cba528104618d1222376'<br /> CC [M] /home/pi/Download/rtl8192eu/core/rtw_cmd.o<br />cc1: error: -Werror=date-time: no option -Wdate-time<br />scripts/Makefile.build:258: recipe for target '/home/pi/Download/rtl8192eu/core/rtw_cmd.o' failed<br />make[2]: *** [/home/pi/Download/rtl8192eu/core/rtw_cmd.o] Error 1<br />Makefile:1385: recipe for target '_module_/home/pi/Download/rtl8192eu' failed<br />make[1]: *** [_module_/home/pi/Download/rtl8192eu] Error 2<br />make[1]: Leaving directory '/root/linux-9892c762a4f8a3f56356cba528104618d1222376'<br />Makefile:1455: recipe for target 'modules' failed<br />make: *** [modules] Error 2mocshttps://www.blogger.com/profile/13138074994212711501noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-40273262785976798182016-08-29T10:01:19.021+01:002016-08-29T10:01:19.021+01:00Solved - problem was indeed kernel version - rpi-s...Solved - problem was indeed kernel version - rpi-source for some reason had downloaded 4.4.13 and I was running 
4.4.15Anonymoushttps://www.blogger.com/profile/04773770981096321736noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-85642146429757901092016-08-29T05:18:26.896+01:002016-08-29T05:18:26.896+01:00Graham, thanks for response had previously done th...Graham, thanks for the response. I had previously done the web search and tried "modprobe --force" without success. When I find an answer I will update here.Anonymoushttps://www.blogger.com/profile/04773770981096321736noreply@blogger.comtag:blogger.com,1999:blog-2027699993235411172.post-57005324018704448802016-08-26T09:15:53.315+01:002016-08-26T09:15:53.315+01:00Hi Ian,
This is a new one on me, I've been me...Hi Ian,<br /><br />This is a new one on me. I've been messing with Linux since the darker days when re-compiling kernels was fairly common and I've not seen this error before.<br /><br />A quick bit of searching around the web does reveal some hints. It appears there is likely to be some sort of versioning error somewhere in your build chain. It looks like you'll be able to load your module with "modprobe --force", which causes modprobe to ignore these version checks.<br /><br />Try a web search for this: modprobe "Exec format error"<br /><br />This was one of the results I found and looks quite useful... https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=73005Graham Whitehttps://www.blogger.com/profile/03878311939940449093noreply@blogger.com
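
The thread above traced the modprobe "Exec format error" to a kernel version mismatch: rpi-source had fetched 4.4.13 while 4.4.15 was running. A minimal Python sketch of how that could be checked before loading a module is shown here. It assumes the `modinfo` tool is installed and that the first field of a module's vermagic string is the kernel release it was built for. The function names and the sample vermagic text are illustrative and do not come from the original posts.

```python
import platform
import subprocess


def vermagic_release(vermagic: str) -> str:
    """Return the kernel release part (first field) of a vermagic string."""
    fields = vermagic.split()
    return fields[0] if fields else ""


def module_matches_kernel(vermagic: str, running_release: str) -> bool:
    """True when the module was built for exactly the running kernel release."""
    return vermagic_release(vermagic) == running_release


def read_vermagic(module_path: str) -> str:
    """Ask modinfo for the vermagic field of a .ko file (assumes modinfo is on PATH)."""
    result = subprocess.run(
        ["modinfo", "-F", "vermagic", module_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def check_module(module_path: str) -> bool:
    """Compare a built module against the running kernel and report the result."""
    running = platform.release()  # same string that `uname -r` prints
    built_for = vermagic_release(read_vermagic(module_path))
    if built_for == running:
        print(f"OK: module built for running kernel {running}")
        return True
    print(f"Mismatch: module built for {built_for}, but running {running}")
    return False


# Illustrative values only, mirroring the versions mentioned in the thread.
sample_vermagic = "4.4.13+ mod_unload modversions"
mismatch = not module_matches_kernel(sample_vermagic, "4.4.15+")
```

Calling `check_module()` with the path of the freshly built `.ko` file (the path depends on your build directory) before running modprobe would flag the mismatch early, which is usually a safer route than forcing the load.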