
Saturday, February 26, 2022

Port breaks kernel breaks port

So many of us chug happily along without fully realizing how much more complex some of the present FreeBSD build mechanisms have become.  Those who never need graphics and stay in a text-mode command-line interface for the duration of their use of FreeBSD would not know that there is now at least one situation which ties a port and the kernel together.  When everything is working perfectly, this would likely never come up, but a relatively small problem inflated itself to cause my kernel build to fail.

During this troubleshooting quest, I first tried the obvious things: rebuild world just to be sure it was ok, re-enable some possibly related things in my custom kernconf, and then build a GENERIC kernel instead of that kernconf, all to no avail.  After beating my figurative head against the wall for quite a while, I went to Twitter to see if @FreeBSDHelp had any ideas.

The details and comments in that discussion thread didn't solve my issue, as I could not comprehend how our kernel build was now in any way tied to the build of a port, though the comment that was made may not have explicitly indicated this.  So my next thought was to get a different, earlier version of /usr/src from git in some way, then rebuild the kernel from before the error seemed to appear in /usr/src.  I didn't know then that this was not the path to a solution, and I wasted far too much time trying to go backward with git to an earlier commit.  I am definitely not a particularly big fan of git, and this exertion didn't help me love it any more.

The reason for rebuilding at all, besides that it had been in excess of 10 days since the last time I rebuilt my kernel and world, was to get VirtualBox working, which needed bits from the kernel build which I didn't have, and those need to be the same version as the running OS.  This meant a long journey to rebuild my kernel and world (multiple times each) so that I could use VirtualBox to try the game Veloren, which, due to whatever is different than expected (FVWM3 and Radeon graphics probably, or similar), does build and install but does NOT run.  I am sure that if I could start up VirtualBox, put tinycore Linux in there, and install Veloren for Linux, I would probably succeed where I was prevented otherwise.  I could not install another OS in VirtualBox because I could not boot the iso, and this was due to the VirtualBox kernel object not having been loaded.  I couldn't load the needed kernel object since it needed to be built, and now you know why I got stuck down this rabbit hole.

It has been some time since I set that whole "Play Veloren in Virtualbox via tinycore Linux" idea on a back burner or in a box on a shelf somewhere.  One of my incomplete projects is to get my port attempt of Reshade rebuilt, and when I attempted it, the build ran into some conflicts with python items.  The various python things it needed were installed as version 3.10 while I already had version 3.9 of those same ports.  The only way forward was to remove each of the python 3.9 ports to let the reshade build then install what it needed as version 3.10.  Among the things removed as a consequence of this were vlc and firefox, both of which I use daily.  So I gave poudriere a gross list of everything I had installed on my system, let it build, and then discovered a number of things that failed.  The graphics/drm-fbsd13-kmod port was among the fairly long list of things that didn't get built, and it was in the smallish group of "lynchpin" ports, meaning that others failed (were skipped) because it failed.

And so, I thought, that's not a problem, I'll go investigate what happened with graphics/drm-fbsd13-kmod to make it fail.  I remembered that poudriere keeps logs of all (or almost all) it does, and I just had to find them.  Since I often end up trying to remember where some important thing is and its path, I have been keeping a directory of symbolic links with sometimes more descriptive names.  The appropriate one was,

root@ichigo:~ # ls -l Symbolic_Links/p-keg-logs_bulk_13amd64_latest-per-pkg
lrwxr-xr-x  1 root  wheel  66 May  5  2021 Symbolic_Links/p-keg-logs_bulk_13amd64_latest-per-pkg -> /usr/local/poudriere/data/logs/bulk/13amd64-default/latest-per-pkg

and from there I could do

root@ichigo:~ # tail -n 15 Symbolic_Links/p-keg-logs_bulk_13amd64_latest-per-pkg/drm-fbsd13-kmod-5.4.144.g20220223.log
===> Checking for items in STAGEDIR missing from pkg-plist
Error: Orphaned: %%KMODSRC%%/linuxkpi/dummy/include/linux/random.h
Error: Orphaned: %%KMODSRC%%/linuxkpi/dummy/include/linux/suspend.h
===> Checking for items in pkg-plist which are not in STAGEDIR
===> Error: Plist issues found.
*** Error code 1

Stop.
make: stopped in /usr/ports/graphics/drm-fbsd13-kmod
=>> Error: check-plist failures detected
=>> Cleaning up wrkdir
===>  Cleaning for drm-fbsd13-kmod-5.4.144.g20220223
build of graphics/drm-fbsd13-kmod | drm-fbsd13-kmod-5.4.144.g20220223 ended at Sat Feb 26 00:30:21 CST 2022
build time: 00:04:43
!!! build failure encountered !!!

Firstly, the build failure is due to my choice to be a bit more stringent on builds, to test for various things, so it is possible that this might not appear to most users, although it truly should be visible to all the port maintainers and various FreeBSD developers.  The log tells me, as did the small, concise, highlighted reason in the failed build list that poudriere printed after its run, that it is an issue with the pkg-plist.  This I correctly believed was a simple issue and easy to fix, since this process is something I have repeated many times with my own repos for FreeBSD Port Tree Leaf items such as Minetest-dev, which I wrote about in another blog post.
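
The orphaned entries can also be pulled straight out of the poudriere log instead of being read off by eye.  What follows is only a minimal sketch, assuming every report has the "Error: Orphaned: " form seen in the log above; the function name orphans_from_log is made up for illustration and is not part of poudriere or the ports tree.

```sh
#!/bin/sh
# orphans_from_log LOGFILE (hypothetical helper, not a poudriere command):
# print each path that check-plist reported as orphaned, one per line.
# Assumption: every report has the form "Error: Orphaned: <plist entry>".
orphans_from_log() {
    sed -n 's/^Error: Orphaned: //p' "$1"
}

# Usage, with the log file named above:
#   orphans_from_log Symbolic_Links/p-keg-logs_bulk_13amd64_latest-per-pkg/drm-fbsd13-kmod-5.4.144.g20220223.log
```

For the log shown here this would print the two %%KMODSRC%% lines, which are candidates to add to pkg-plist once each one has been checked by hand.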

What I needed to do was go to /usr/ports/graphics/drm-fbsd13-kmod, rename pkg-plist to pkg-plist-old, and then do a fresh build.  Once the build completes, I create a fresh pkg-plist by running make makeplist > pkg-plist in order to compare this new list with the old original list.  The comparison is done by running diff -y pkg-plist pkg-plist-old | more to step through the output, looking for anything that is present or absent in the newly generated pkg-plist as compared to the old one.  Since a pkg-plist that is in the ports tree may have %%text%% type tags which are often not generated by make makeplist, I usually modify pkg-plist-old to match the freshly generated one.  Once the edits are made, I rename pkg-plist-old to pkg-plist and then rebuild once more to prove no errors related to the file remain.
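
When the two lists are long, an order-insensitive comparison can be easier to scan than the side-by-side diff.  This is a sketch of an alternative to diff -y, not the method described above; plist_compare is a hypothetical name, and it compares lines literally, so any entry that carries a %%text%% tag in the old list will show up as a difference against its untagged counterpart in the generated list.

```sh
#!/bin/sh
# plist_compare NEW OLD (hypothetical helper): order-insensitive comparison.
# Prints "+ entry" for lines only in NEW and "- entry" for lines only in OLD.
plist_compare() {
    pc_new=$(mktemp /tmp/plist-new.XXXXXX) || return 1
    pc_old=$(mktemp /tmp/plist-old.XXXXXX) || { rm -f "$pc_new"; return 1; }
    # comm needs both inputs sorted the same way, so pin the locale.
    LC_ALL=C sort "$1" > "$pc_new"
    LC_ALL=C sort "$2" > "$pc_old"
    LC_ALL=C comm -23 "$pc_new" "$pc_old" | sed 's/^/+ /'
    LC_ALL=C comm -13 "$pc_new" "$pc_old" | sed 's/^/- /'
    rm -f "$pc_new" "$pc_old"
}

# Usage, from inside the port directory:
#   plist_compare pkg-plist pkg-plist-old | more
```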

Now that graphics/drm-fbsd13-kmod successfully builds and installs, a thought from the back of my mind was to try updating kernel and world, due to that vague mention of these two things being related.  World builds as expected, so I go on to the kernel, and then it fails.  It complains that kconfig.mk is missing.  I remember that this was one of the things that I had removed from the graphics/drm-fbsd13-kmod pkg-plist, for a reason I am already uncertain about now, and this is being written within hours of having done it.  I go back to that port tree directory, and either the pkg-plist-old was still present or I went through the steps to generate a fresh one and make the needed edits to fix it.  Whatever actually happened seems to have fallen out of my mind, but the result was "Hey! graphics/drm-fbsd13-kmod needs to be built in order for the kernel to build, gee that is weird."

I have been writing about all of this within a relatively short period after succeeding in building kernel and world, which had been broken since some week(s) ago.  My new kernel has not yet been installed and I have to build the virtualbox thing(s) that are dependent upon the source.  It is nice now to have this mess cleared up and better understood.  I'll be adding this nit to the lists of build issues for kernel or world, and adding emphasis on the relationship between this port and the kernel, which likely many of us had not known.  The kernel failure meant the virtualbox port for a kernel object couldn't be built, but the kmod graphics port is what broke the kernel.

Saturday, November 14, 2020

Now I can't boot

There have been plenty of times by now that I have made some sort of adjustment on my system and then it doesn't boot.  We know the usual suspects are /etc/rc.conf and /boot/loader.conf, but I'm sure there are others, possibly even a badly thought out custom kernel that was recently built and installed.  So now we are stuck: we have one box and it fails to boot, but the way to solve the problem is to get online from it.  The situation with the broken kernel might be sidestepped easily: simply choose the option from the boot menu to use a different kernel.  If there were also mistakes with the buildworld, and the lack of items means no booting, then there needs to be another way.

If you can get to single-user mode (another boot menu item), the changes will be easy to apply.  First run mount -u / and then, if your filesystem is ZFS rather than UFS, zfs mount -a, and now you can re-edit the typo out of your /boot/loader.conf or some other file.  But what if your situation is a bit more complex?

You can use a usb stick to boot and from there mount the drives in your pc, make needed adjustments and get everything back to normal again.  This is where it can be fun, and when I say fun I mean not quite a nightmare though it is a real special pain.  You can probably use any bootable BSD which offers shell access to the machine, but since discovering NomadBSD, it has become my preference.  What NomadBSD has is a complete system which is self-contained within the usb media.  So if you are short of time and cannot fix your system, you can use it to get online to do something important, such as check your work schedule.  Of course, this immediate need situation means that you previously set up a web browser, installed and configured an addon (such as blur by Abine) which stores your passwords, and made any other adjustments to suit your needs.  So, having gotten online, you have the solution to the problem, you've written it down, and now you need to fix whatever is wrong on the HDD of your system.

What you need to do is mount the HDD of your system into the usb media that is loaded.  There should be a directory /media present already; if not, create it with mkdir /media because this will be the mount point to reach inside your system HDD.  I would assume that you are already in a shell window (xterm perhaps) or you used shell access from the menu when you booted the usb.  We have two pieces of the puzzle, the solution and the system running with a shell; what we need to do in order to make the changes is get to the drive.

With ZFS, there is a special command which will do what we need; change zroot to the name of your pool.

zpool import -f -R /media zroot
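
If /media/etc and /media/boot turn out to be empty after the import, the root dataset may not have been mounted.  The sketch below is a variation on the command above, not a replacement for it, and rests on one loud assumption: that the pool uses the stock FreeBSD installer layout, where the root filesystem is zroot/ROOT/default and is marked canmount=noauto, so it has to be mounted by hand.  The -N flag imports the pool without mounting anything, so the root can go in first and the rest on top of it.  The name rescue_plan is hypothetical, and the function only prints the commands so they can be read before being run as root.

```sh
#!/bin/sh
# rescue_plan [POOL] [ALTROOT] [BOOTENV] (hypothetical helper): print, but do
# not run, an import sequence for a pool whose root dataset is not mounted
# automatically.
# Assumption: the installer's default layout, where POOL/ROOT/default holds
# the root filesystem. Check yours with "zfs list" after the import.
rescue_plan() {
    rp_pool=${1:-zroot}
    rp_altroot=${2:-/media}
    rp_bootenv=${3:-ROOT/default}
    # -N: import without mounting, so the root dataset can be mounted first.
    printf 'zpool import -N -f -R %s %s\n' "$rp_altroot" "$rp_pool"
    printf 'zfs mount %s/%s\n' "$rp_pool" "$rp_bootenv"
    # Then mount the remaining datasets on top of the root.
    printf 'zfs mount -a\n'
}

# Usage: review the printed plan, then type the commands yourself as root.
#   rescue_plan zroot /media
```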

Many times the mistake was made or is corrected in either /etc or /boot, so now to reach those directories or any others on your HDD, you would prefix the desired directory with /media such as below.  Work slowly, re-read the command you've typed before committing to it by pressing return or enter.  While your HDD is mounted to your usb stick, /media is the / (root) directory of your HDD, and / is the root directory of the usb stick itself.  Entering the specific directory in order to edit a file, such as rc.conf or loader.conf, may be better than remembering every time to prefix with /media, but always pay attention to your current working directory or path.

cd /media/etc

or

cd /media/boot

Only you can know what the problem and solution are.  Now that you have access to your HDD you can make the corrections and reboot.  That zpool import command only stays in effect until you reboot and does not need to be turned off or disabled.  We have not made any permanent changes to how your drives are mounted in order to fix the problem, unless of course your problem and solution specifically involve a permanent adjustment to how your drives are mounted.  Right now I do not have any examples handy of the dumb things I have done which resulted in being unable to boot along with how they were fixed.
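
Before rebooting, it can be worth a quick check that the edited rc.conf at least parses.  Since rc.conf is read by /bin/sh, sh -n will parse it without running anything.  This is a minimal sketch with a hypothetical function name, check_rc_conf; it only catches shell syntax slips such as an unclosed quote, not wrong values, and it does not apply to /boot/loader.conf, which is not a shell script.

```sh
#!/bin/sh
# check_rc_conf [ALTROOT] (hypothetical helper): parse, but do not run,
# ALTROOT/etc/rc.conf. Returns non-zero when sh cannot parse the file
# (or cannot read it at all).
check_rc_conf() {
    cr_file="${1:-/media}/etc/rc.conf"
    if sh -n "$cr_file"; then
        echo "ok: $cr_file"
    else
        echo "problem found in: $cr_file" >&2
        return 1
    fi
}

# Usage, while the HDD is still mounted under /media:
#   check_rc_conf /media
```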

While tinkering with your system in ways that can only truly be done when it is open source, because you use FreeBSD for your need of control over all of it, you are setting yourself up with the potential for mistakes.  There is nothing wrong with unintentionally doing something incorrectly; it is the surest way to learn.  We may read somewhere how to do something but unknowingly miss a step or configure something wrong or assume their technique will work on our system.  The worst of these experiences involve boot blockers, and unfortunately they can at least temporarily halt all further progress.
