
Friday, September 26, 2025

Broke GUI recovery

Recently I tested xlibre on my system without a safety net which tends to be my usual method.  As mentioned in the previous blog post, I was stuck again using scfb because of unknown reasons.  After many attempts to reinstall rebuilt ports or reinstall pkgs that I knew would be necessary to help me back to a usable GUI on a 4k screen, I gave up.  I wasn't making any progress, and even desktop-installer was no help this time, so I chose to start from nothing instead.  I issued pkg delete -a -f and then tried pkg info to prove that nothing lingered.  This was when pkg reminded me that it was removed also and needed to be bootstrapped back into the system.  This was easy, just had to answer 'y' to the prompt.
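A minimal sketch of that wipe, with one step added that the text above does not have: saving a list of the deliberately installed port origins first, so there is something to rebuild from afterward.  The function names, the DRY_RUN switch, and the prime-origins.txt file name are assumptions made for illustration.  By default it only prints what it would do.

```sh
#!/bin/sh
# Sketch: record what was installed on purpose, then wipe and re-bootstrap.
# DRY_RUN=1 (the default here) prints the commands instead of running them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

wipe_packages() {
    # Hypothetical location for the saved list; pass another path as $1.
    list="${1:-$HOME/prime-origins.txt}"

    # %a = 0 selects packages that were not installed automatically,
    # and %o prints each one's port origin (for example x11/xterm).
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ pkg query -e '%a = 0' '%o' > $list"
    else
        pkg query -e '%a = 0' '%o' > "$list"
    fi

    run pkg delete -a -f   # removes every package, pkg itself included
    run pkg bootstrap      # the base system /usr/sbin/pkg reinstalls pkg
}

# Print the plan; set DRY_RUN=0 as root to actually run it.
wipe_packages
```

Keep the saved list somewhere that survives the wipe, since it is the only record of what was there.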

Many of my troubles are self-inflicted, but some come from other things I run into, and I do not know whether they are unique to my situation.  I have hit issues in the past that I could not report, because my setup is non-standard or because I do not always do things the accepted official way.  All of this means that when I get myself into trouble, when I break my system softwarewise, I need to find my own way out of the forest; rarely are any easy breadcrumbs scattered on the ground.

I have been through this a few times, but this time I am on my newer box, which means the many files I created with lists of things and other notes are not present.  Those are possibly still on my former system, which at present is broken in both software and filesystem.  I know even from recent posts which things I need to reinstall.  What I didn't know about these things is all the stuff that each one installs as dependencies, so starting from nothing will look different than the reinstalls I did not long ago.

First was graphics/mesa-dri, which gave me graphics/mesa-libs and graphics/libdrm.  Those had been separate steps in the past; now I know that was redundant.  Along with those I got graphics/libglvnd and graphics/glslang as well as a long list of other things.

devel/autoconf
devel/autoconf-switch
devel/automake
devel/binutils
devel/bison
devel/bsddialog
security/ca_root_nss
devel/ccache
devel/cmake-core
sysutils/coreutils
textproc/expat2
devel/gettext-runtime
devel/gettext-tools
graphics/glslang
devel/gmake
math/gmp
misc/help2man
sysutils/htop
misc/hwdata
print/indexinfo
devel/jsoncpp
x11/libX11
x11/libXau
x11/libXdamage
x11/libXdmcp
x11/libXext
x11/libXfixes
x11/libXrandr
x11/libXrender
x11/libXv
x11/libXxf86vm
graphics/libdrm
devel/libedit
devel/libffi
graphics/libglvnd
converters/libiconv
dns/libidn2
archivers/liblz4
devel/libpciaccess
devel/libtextstyle
devel/libtool
devel/libunistring
devel/libuv
x11/libxcb
textproc/libxml2
x11/libxshmfence
textproc/libyaml
devel/llvm19
lang/lua53
lang/lua54
devel/m4
graphics/mesa-dri
graphics/mesa-libs
devel/meson
math/mpdecimal
math/mpfr
devel/ninja
security/openssl
devel/p5-Locale-gettext
devel/p5-Locale-libintl
converters/p5-Text-Unidecode
textproc/p5-Unicode-EastAsianWidth
devel/pcre2
lang/perl5.42
ports-mgmt/pkg
devel/pkgconf
ports-mgmt/portconfig
devel/py-babel
textproc/py-CommonMark
devel/py-Jinja2
textproc/py-alabaster
www/py-beaker
devel/py-build
devel/py-calver
security/py-certifi
textproc/py-charset-normalizer
lang/cython
textproc/py-docutils
devel/py-flit-core
devel/py-future
devel/py-hatchling
dns/py-idna
graphics/py-imagesize
devel/py-installer
textproc/py-mako
textproc/py-markdown
textproc/py-markdown-it-py
textproc/py-markupsafe
textproc/py-mdit-py-plugins
textproc/py-mdurl
textproc/py-myst-parser
devel/py-packaging
devel/py-pathspec
misc/py-pexpect
devel/py-pluggy
devel/py-ply
sysutils/py-ptyprocess
textproc/py-pygments
devel/py-pyproject-hooks
net/py-pysocks
textproc/py-pystemmer
devel/py-pyyaml
textproc/py-recommonmark
www/py-requests
devel/py-setuptools
devel/py-setuptools-scm
textproc/py-snowballstemmer
textproc/py-sphinx
textproc/py-sphinx-markdown-tables
textproc/py-sphinxcontrib-applehelp
textproc/py-sphinxcontrib-devhelp
textproc/py-sphinxcontrib-htmlhelp
textproc/py-sphinxcontrib-jsmath
textproc/py-sphinxcontrib-qthelp
textproc/py-sphinxcontrib-serializinghtml
devel/py-trove-classifiers
devel/py-typing-extensions
net/py-urllib3
devel/py-wheel
devel/py-wheel044
lang/python311
devel/readline
security/rhash
graphics/spirv-tools
devel/swig
print/texinfo
x11/xcb-proto
devel/xorg-macros
x11/xorgproto
x11/xtrans
archivers/zstd

After all of that was built, installed, and cleaned, I went to the next essential item, graphics/drm-kmod, which gave me all of the amd, intel, and radeon kmod files.  pkg origin, which is how I got these lists after the fact, did not show the specifics of the numerous chipset names.  There is no sense listing those multiple duplicates here, though they were actually different at install.

graphics/drm-66-kmod
graphics/drm-kmod
graphics/gpu-firmware-amd-kmod
graphics/gpu-firmware-intel-kmod
graphics/gpu-firmware-kmod
graphics/gpu-firmware-radeon-kmod

Next I should have been smarter about it, but since graphics was my main issue previously, I was too focused on that aspect and installed only the amdgpu graphics driver.  Later I installed x11-drivers/xorg-drivers after configuring it for what I specifically needed; a pkg install would include scfb and vesa and maybe intel drivers, all of which I do not want, do not use, and generally do not need.  So as I have mentioned before, I build xorg-drivers from source each time.  The one interesting thing about starting with very little already installed is that x11-drivers/xf86-video-amdgpu also installs xorg-server.  I hadn't noticed my error of not installing the other drivers until after I got fvwm3 reinstalled, but the list from the xorg-drivers install is below, not quite in the order pkg origin would have given.

archivers/brotli
devel/evdev-proto
print/freetype2
x11-fonts/libXfont2
x11/libXi
graphics/libepoxy
x11-fonts/libfontenc
security/libgcrypt
security/libgpg-error
devel/libudev-devd
devel/libunwind
x11/libxcvt
x11/libxkbfile
textproc/libxslt
x11/pixman
graphics/png
x11-drivers/xf86-video-amdgpu
x11/xkbcomp
x11/xkeyboard-config
x11-servers/xorg-server
x11/libXxf86vm
x11-drivers/xf86-input-keyboard
x11-drivers/xf86-input-libinput
x11-drivers/xf86-input-mouse

One note about the x11-drivers/xf86-input-libinput driver option: it leads to libinput, which I configure to untick the wacom option that is checked by default and which may also lead to webcamd; I do not need either of those.  Except for that, I mostly chose to accept all defaults.  After I installed the amdgpu driver I chose to install x11/xinit, which is where I get startx for later use.  The last thing I needed to get a usable desktop was x11-wm/fvwm3, though I generally use the much more recent fvwm3-dev which I keep updated on my own.  What I forgot again was that I did not have x11/xterm installed, and it is not added by fvwm3, so at the time I had to reset my pc to exit the gui because I only had the graphics driver installed (though I do not present it in this order above).  When I noticed xterm was needed after I added the input drivers, I still had to exit fvwm3 but could use the menu option.

Once I had xterm installed and was back in fvwm3, I could add the other missing items, feh, gkrellm2, umix, bluefish, and firefox.  A few things had an issue with cython installing as a dependency, so to save time, I installed them by pkg even though I keep my own feh-dev updated somewhat regularly.  Many things were still configured for the much smaller screen scfb would permit me to use, so I may want to adjust some of that, but aside from other things I may realize are not present I am done with recovery from my non-recommended xlibre test method.

Now a recap for the recovery from nothing, because these notes are more likely to remain than any hand-written paper.  Below includes a few configurations and a proper ordering of steps.  Note that any desktop environment such as KDE or similar is very extensive and may have a lot of the below as dependencies itself.

 1. pkg delete -a or pkg delete -a -f which will also remove pkg itself.
 2. make install clean for each of the following port origins, or pkg install if you like all defaults.
 3. graphics/mesa-dri provides mesa-libs, libdrm, libglvnd, glslang
 4. graphics/drm-kmod provides gpu-firmware-amd-kmod, gpu-firmware-kmod
 5. Configure x11-drivers/xorg-drivers for amdgpu (your preferred graphics driver), keyboard, libinput, mouse
 6. Configure x11/libinput to untick libwacom
 7. x11-drivers/xorg-drivers provides configured drivers and x11-servers/xorg-server
 8. x11/xinit provides startx
 9. x11/xterm
10. x11-wm/fvwm3 (your preferred minimalist window manager)
11. startx
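The recap above can be sketched as a small script that walks the ports in the same order.  This is a dry-run sketch only: the function names, the DRY_RUN switch, and the /usr/ports default for PORTSDIR are assumptions made here, and steps 1 and 11 are left out on purpose.

```sh
#!/bin/sh
# Dry-run sketch of the build order in the recap above.
# DRY_RUN=1 (the default here) prints each command instead of running it.
PORTSDIR="${PORTSDIR:-/usr/ports}"

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# port <origin> <make targets...>
port() {
    origin="$1"
    shift
    run make -C "$PORTSDIR/$origin" "$@"
}

rebuild_gui() {
    port graphics/mesa-dri        install clean   # step 3
    port graphics/drm-kmod        install clean   # step 4
    port x11-drivers/xorg-drivers config          # step 5: choose drivers
    port x11/libinput             config          # step 6: untick libwacom
    port x11-drivers/xorg-drivers install clean   # step 7
    port x11/xinit                install clean   # step 8
    port x11/xterm                install clean   # step 9
    port x11-wm/fvwm3             install clean   # step 10
}

# Print the plan; set DRY_RUN=0 as root to actually build.
rebuild_gui
```

Running the two config steps before the xorg-drivers build keeps the options dialogs from interrupting a long unattended build.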

Wednesday, February 7, 2024

Zpool-upgrade loader.efi fail

I recently bought a Sony Walkman android device, which I hoped would permit me to use an app to control my Deco S4 wifi devices.  The wifi hardware was added in hopes of improving local wifi from a single Xfinity device and in a more secure way by connecting via my OPNsense box.  Once I solved the wifi device app issue, I still had the Sony Walkman for audio.

Perhaps foolishly, I wanted to use larger, better quality flac files, and knowing I still had quite a large number of CDs to run through ripperX, I tried to change my music dataset to better compression.  The compression algorithms that seemed better to me were unavailable because my zfs needed an update.  I had done as many as two updates in the past, one of which switched me to 'feature flags', and each time I was warned about compatibility but never had an issue.

Without any further investigation, assuming that it would be just as easy a process as in the past, I upgraded zfs and then changed the compression of the dataset for my music files.  Everything ran fine, no issues.  Later, I was playing minetest and after some time in the midst of what I was attempting on the minetest server, my display, mouse, and pc froze up.  There has been an issue with something somewhere which has caused me a panic and reboot before, so I assumed all I had once more was the inconvenient interruption.  This time it didn't reboot itself, just seemed to remain frozen, so I rebooted.

This is when I discovered that something was wrong.  It started the boot but before it got very far I was greeted with errors including "ZFS: unsupported feature: com.klarasystems:vdev_zaps_v2" and mention that it couldn't find any bootable drives.  Luckily there was a specific term which I could search online for more information: klarasystems:vdev_zaps_v2.

One search result was FreeBSD forum post 14-0-release-zfs-features-gotcha.91085 but I also looked at the 14.0 release notes, and /usr/src/UPDATING:

20160708:
	The stable/11 branch has been created from head@r302406.

After branch N is created, entries older than the N-2 branch point are removed
from this file. After stable/14 is branched and current becomes FreeBSD 15,
entries older than stable/12 branch point will be removed from current's
UPDATING file.

COMMON ITEMS:

	General Notes
	-------------
	Sometimes, obscure build problems are the result of environment
	poisoning.  This can happen because the make utility reads its
	environment when searching for values for global variables.  To run
	your build attempts in an "environmental clean room", prefix all make
	commands with 'env -i '.  See the env(1) manual page for more details.
	Occasionally a build failure will occur with "make -j" due to a race
	condition.  If this happens try building again without -j, and please
	report a bug if it happens consistently.

	When upgrading from one major version to another it is generally best to
	upgrade to the latest code in the currently installed branch first, then
	do an upgrade to the new branch. This is the best-tested upgrade path,
	and has the highest probability of being successful.  Please try this
	approach if you encounter problems with a major version upgrade.  Since
	the stable 4.x branch point, one has generally been able to upgrade from
	anywhere in the most recent stable branch to head / current (or even the
	last couple of stable branches). See the top of this file when there's
	an exception.

	The update process will emit an error on an attempt to perform a build
	or install from a FreeBSD version below the earliest supported version.
	When updating from an older version the update should be performed one
	major release at a time, including running `make delete-old` at each
	step.

	When upgrading a live system, having a root shell around before
	installing anything can help undo problems. Not having a root shell
	around can lead to problems if pam has changed too much from your
	starting point to allow continued authentication after the upgrade.

	This file should be read as a log of events. When a later event changes
	information of a prior event, the prior event should not be deleted.
	Instead, a pointer to the entry with the new information should be
	placed in the old entry. Readers of this file should also sanity check
	older entries before relying on them blindly. Authors of new entries
	should write them with this in mind.

	ZFS notes
	---------
	When upgrading the boot ZFS pool to a new version (via zpool upgrade),
	always follow these three steps:

	1) recompile and reinstall the ZFS boot loader and boot block
	(this is part of "make buildworld" and "make installworld")

	2) update the ZFS boot block on your boot drive (only required when
	doing a zpool upgrade):

	When booting on x86 via BIOS, use the following to update the ZFS boot
	block on the freebsd-boot partition of a GPT partitioned drive ada0:
		gpart bootcode -p /boot/gptzfsboot -i $N ada0
	The value $N will typically be 1.  For EFI booting, see EFI notes.

	3) zpool upgrade the root pool. New bootblocks will work with old
	pools, but not vice versa, so they need to be updated before any
	zpool upgrade.

	Non-boot pools do not need these updates.

	EFI notes
	---------

	There are two locations the boot loader can be installed into. The
	current location (and the default) is \efi\freebsd\loader.efi and using
	efibootmgr(8) to configure it. The old location, that must be used on
	deficient systems that don't honor efibootmgr(8) protocols, is the
	fallback location of \EFI\BOOT\BOOTxxx.EFI. Generally, you will copy
	/boot/loader.efi to this location, but on systems installed a long time
	ago the ESP may be too small and /boot/boot1.efi may be needed unless
	the ESP has been expanded in the meantime.

	Recent systems will have the ESP mounted on /boot/efi, but older ones
	may not have it mounted at all, or mounted in a different
	location. Older arm SD images with MBR used /boot/msdos as the
	mountpoint. The ESP is a MSDOS filesystem.

	The EFI boot loader rarely needs to be updated. For ZFS booting,
	however, you must update loader.efi before you do 'zpool upgrade' the
	root zpool, otherwise the old loader.efi may reject the upgraded zpool
	since it does not automatically understand some new features.

	See loader.efi(8) and uefi(8) for more details.


and then to the manpage for loader.efi, specifically:

EXAMPLES
   Updating loader.efi on the ESP
       The  following  examples	 shows	how to install a new loader.efi	on the
       ESP.

       First, find the partition of type "efi":

	     # gpart list | grep -Ew '(Name|efi)'
	     1.	Name: nvd0p1
		type: efi
	     2.	Name: nvd0p2
	     3.	Name: nvd0p3
	     4.	Name: nvd0p4
	     1.	Name: nvd0

       The name	of the ESP on this system is nvd0p1.

       Second, let's mount the ESP, copy loader.efi to	the  special  location
       reserved	for FreeBSD EFI	loaders, and unmount once finished:

	     # mount_msdosfs /dev/nvd0p1 /boot/efi
	     # cp /boot/loader.efi /boot/efi/efi/freebsd/loader.efi
	     # umount /boot/efi

SEE ALSO
       loader(8), uefi(8)

Since I had experienced similar issues with my box which meant I had to adjust what was on the hard drive, I knew that I would need bootable media so that I could reach the hard drive.  This is what became a more significant and time-consuming problem, at least partly due to my own stubborn foolishness.  I had a number of micro sd cards and a usb reader.  Out of three which had the potential to function as I needed, only one had an old NomadBSD installed upon it.  My mistake was that I insisted upon upgrading it from 12.x to 14.0, which ran into storage constraints, then broke its normal startup process, and eventually left it unable to boot at all.

Later I noticed an old Kingston Digital DataTraveler SE9 64GB USB 2.0 which I discovered has FreeNAS installed.  I could boot it and get to shell which allowed me to get the img file.  It was good enough for some attempts to create a bootable micro sd card but it eventually would fail too quickly during an ftp transfer or dd write, so that I had to find another option.  I was pretty sure that I had some cdroms or dvds which have some kind of FreeBSD, so once I found my cache, I decided to use a PC-BSD 8.2rc2 dvd.

This is when I was finally tired of the whole process and possibly more rested or something, so I actually, after a chunk of two days of attempts, finally made progress.  I booted the PC-BSD disc and after some tries for the GUI, I realized I only truly needed a shell.  From the shell, I was able to mount my target micro sd card which I used to store the downloaded (via ftp -a download.freebsd.org) img file.  I could also format an SSD which I bought more than 5 years ago in a group of four for adding zfs cache devices to my box (obviously still not accomplished).  I shifted the img file to the SSD so I could dd it to the micro sd card.  Once I had the FreeBSD 14.0 installer on the micro sd card, I booted it and began the process of the repair.

My initial plan was to first test the process by updating the loader.efi on the micro sd card, but I got stymied.  I followed the steps above (in the loader.efi manpage example).  I discovered that the appropriate partition was ada2p1, which I mounted to /boot/efi and then found that there was no /boot/efi/efi/freebsd path.  What I had was /boot/efi/EFI/BOOT/BOOTxxx.EFI which I left alone.  Instead I created the needed freebsd directory in /boot/efi/EFI.  Once the path was set up, I could copy the loader.efi into the correct location.
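Those steps can be sketched as one function.  It follows the loader.efi manpage example quoted earlier, plus the mkdir for the missing freebsd directory; the function names and the DRY_RUN switch are assumptions made here, and ada2p1 is only the partition name from this particular box.  FAT filesystems do not distinguish case, so efi/freebsd and EFI/freebsd should land in the same place.

```sh
#!/bin/sh
# Sketch: install /boot/loader.efi onto the ESP, creating the FreeBSD
# loader directory if it is missing.
# DRY_RUN=1 (the default here) prints the commands instead of running them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_loader() {
    esp="${1:-}"              # ESP partition name, found via gpart list
    mnt="${2:-/boot/efi}"     # mount point for the ESP
    if [ -z "$esp" ]; then
        echo "usage: update_loader <esp-partition> [mountpoint]" >&2
        return 1
    fi
    run mount_msdosfs "/dev/$esp" "$mnt"
    run mkdir -p "$mnt/efi/freebsd"
    run cp /boot/loader.efi "$mnt/efi/freebsd/loader.efi"
    run umount "$mnt"
}

# Print the plan for the partition named in the text above.
update_loader ada2p1
```

The /boot/loader.efi that gets copied is the one from whatever system is currently booted, so when running from rescue media it should be at least as new as the pool being repaired.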

This was my very first experience with this sort of repair or upgrade, so I was not sure of success until my reboot.  At some point during or shortly after this repair, I decided to properly install FreeBSD 14.0 onto the SSD as a failsafe against future similar problems, and I kept the micro sd card as the FreeBSD 14.0 installer.  I connected the SSD using the presently unavailable cables2go version of Generic-Adapter-Converter-Optical-External.  Aside from all of the above, I was able to connect to my local network with my Sony Walkman to look for answers and help.  If I can keep my Walkman usable for such emergencies, this may be an easier method than keeping an entire network and a FreeBSD install functioning on another box; all I will likely need is wifi and access.  I have no plan to ever modify the Walkman away from its functional install.

Definite relief after I was able to properly boot up my box and use it on my last of a series of three days off from work.  I hope that you update your loader.efi BEFORE you upgrade zfs so that you can avoid the excitement of doing the repair above.

Saturday, February 26, 2022

Port breaks kernel breaks port

So many of us chug happily along without completely realizing or recognizing how some of the present FreeBSD build mechanisms have become a bit more complex.  Those who never have any need of graphics and remain in a text mode commandline interface for the duration of their use of FreeBSD would not know that there is indeed at least one situation, now, which ties a port and the kernel together.  When everything is working perfectly, this would likely never come up, but a relatively small problem inflated itself to cause my kernel build to fail.

During this troubleshooting quest, first I tried the obvious things, re-rebuild world just to be sure it was ok, then rebuild a GENERIC kernel instead of my custom kernconf and that after having re-enabled some possibly related things in that kernconf to no avail.  After beating my figurative head against the wall for quite a while, I went to twitter to see if @FreeBSDHelp had any ideas.

The details and comments on that discussion thread didn't solve my issue as I could not comprehend how our kernel build was now in any way tied to the build of a port, though the comment that was made may not have explicitly indicated this.  So my next thought was, if I could get a different, earlier version of the /usr/src from git in some way, then rebuild kernel from before the error seemed to appear in /usr/src.  I didn't know this was not the path to take to solve this, but I still wasted far too much time trying to go backward with git to an earlier commit.  I am definitely not a particularly big fan of git, and this exertion didn't help me love it any more.

The reason, besides that it had been in excess of 10 days since the last time I rebuilt my kernel and world, was to get virtualbox working which needed bits from the kernel build which I didn't have, and those need to be the same version as the running OS.  This meant my long journey to rebuild my kernel and world (multiple times each) so that I could use Virtualbox to try the game Veloren which due to whatever is different than expected (FVWM3 and Radeon graphics probably, or similar) does build and install but does NOT run.  I am sure that if I could startup Virtualbox, put tinycore Linux in there, and install Veloren for Linux, I would probably succeed where I was prevented otherwise.  I could not install another OS in Virtualbox because I could not boot the iso, and this due to the virtualbox kernel object not having been loaded.  I couldn't load the needed kernel object since it needed to be built, and now you know why I got stuck down this rabbit hole.

It has been some time since setting that whole "Play Veloren in Virtualbox via tinycore Linux" idea on a back burner or in a box on a shelf somewhere.  One of my incomplete projects is to get my port attempt of Reshade rebuilt, which I was attempting and it ran into some conflicts with python items.  The various python things it needed were installed as version 310 while I already had version 39 of those same ports.  The only way forward was to remove each of the python 3.9 ports to let the reshade build then install what it needed as version 3.10.  Among all of the things that were removed as a consequence of this, was vlc and firefox, both I use daily.  So I gave poudriere a gross list of everything I had installed on my system, let it build, and then discovered a number of things that failed.  The graphics/drm-fbsd13-kmod port was among the fairly long list of things that didn't get built, and it was in the smallish group of "lynchpin" ports, meaning that others failed (were skipped) because it failed.

And so, I thought that's not a problem, I'll go investigate what happened with graphics/drm-fbsd13-kmod to make it fail.  I remembered that poudriere keeps logs of all (or most everything) it does, and I just had to find it.  Since I often end up trying to remember where any certain important thing is and its path, I have been keeping a directory of symbolic links with sometimes more descriptive names.  The appropriate one was,

root@ichigo:~ # ls -l Symbolic_Links/p-keg-logs_bulk_13amd64_latest-per-pkg
lrwxr-xr-x  1 root  wheel  66 May  5  2021 Symbolic_Links/p-keg-logs_bulk_13amd64_latest-per-pkg -> /usr/local/poudriere/data/logs/bulk/13amd64-default/latest-per-pkg

and from there I could do

root@ichigo:~ # tail -n 15 Symbolic_Links/p-keg-logs_bulk_13amd64_latest-per-pkg/drm-fbsd13-kmod-5.4.144.g20220223.log
===> Checking for items in STAGEDIR missing from pkg-plist
Error: Orphaned: %%KMODSRC%%/linuxkpi/dummy/include/linux/random.h
Error: Orphaned: %%KMODSRC%%/linuxkpi/dummy/include/linux/suspend.h
===> Checking for items in pkg-plist which are not in STAGEDIR
===> Error: Plist issues found.
*** Error code 1

Stop.
make: stopped in /usr/ports/graphics/drm-fbsd13-kmod
=>> Error: check-plist failures detected
=>> Cleaning up wrkdir
===>  Cleaning for drm-fbsd13-kmod-5.4.144.g20220223
build of graphics/drm-fbsd13-kmod | drm-fbsd13-kmod-5.4.144.g20220223 ended at Sat Feb 26 00:30:21 CST 2022
build time: 00:04:43
!!! build failure encountered !!!

Firstly, the build failure is due to my choice to be a bit more stringent on builds, to test for various things, so it is possible that this might not appear to most users, although it truly should be visible to all the port maintainers and various FreeBSD developers.  It tells me, as did the small highlighted concise reason in the failed build list output from after I ran poudriere, that it is an issue with the pkg-plist.  This I correctly believed was a simple issue, and easy to fix since this process is something I have repeated many times with my own repos for FreeBSD Port Tree Leaf items such as for Minetest-dev which I wrote about in another blog post.

What I needed to do was go to /usr/ports/graphics/drm-fbsd13-kmod and rename the pkg-plist to pkg-plist-old, and then do a fresh build of it.  Once the build completes, I create a fresh pkg-plist by make makeplist > pkg-plist in order to do a comparison between this new fresh list and the old original list.  This is accomplished by diff -y pkg-plist pkg-plist-old | more to step through the output, looking for something that is present or absent in the newly generated pkg-plist as compared to the old one.  Since a pkg-plist that is in the ports tree may have %%text%% type tags which are often not generated by the make makeplist script, I usually modify the pkg-plist-old to match the one freshly generated.  Once the edits are made, I rename the pkg-plist-old to pkg-plist and then rebuild once more to prove no errors related to the file remain.
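As an alternative to paging through diff -y, the comparison step can be sketched with sort and comm, which print only the entries that differ.  This is a swapped-in technique, not what was used above, and the function name plist_compare is made up for illustration.  Entries carrying %%text%% tags in the old list will still show up as differences and need the same by-hand judgment.

```sh
#!/bin/sh
# Sketch: show entries present in only one of two pkg-plist files.
# Workflow it fits into, as described above:
#   mv pkg-plist pkg-plist-old
#   make                        # fresh build
#   make makeplist > pkg-plist
#   plist_compare pkg-plist-old pkg-plist

# Lines starting with '<' are only in the first file,
# lines starting with '>' are only in the second file.
plist_compare() {
    first_sorted=$(mktemp "${TMPDIR:-/tmp}/plist.XXXXXX")
    second_sorted=$(mktemp "${TMPDIR:-/tmp}/plist.XXXXXX")
    sort "$1" > "$first_sorted"
    sort "$2" > "$second_sorted"
    comm -23 "$first_sorted" "$second_sorted" | sed 's/^/< /'
    comm -13 "$first_sorted" "$second_sorted" | sed 's/^/> /'
    rm -f "$first_sorted" "$second_sorted"
}
```

Sorting first means the order of entries no longer matters, which helps when the generated list orders files differently from the hand-maintained one.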

Now that graphics/drm-fbsd13-kmod successfully builds and installs, I thought from the back of my mind, that I would try to update kernel and world, due to that vague mention of these two things being related.  World builds as expected, so I go on to the kernel, and then it fails.  It complains that kconfig.mk was missing.  I remember that that was one of the things that I had removed from the graphics/drm-fbsd13-kmod pkg-plist for a reason I am already uncertain about now-- and this is being written within hours of having done it.  I go back to that port tree directory and either the pkg-plist-old was still present or I went through the steps to generate fresh and make the needed edits to fix it.  Whatever actually happened seems to have fallen out of my mind but the result was "Hey! graphics/drm-fbsd13-kmod needs to be built in order for the kernel to build, gee that is weird."

I have been writing about all of this within a relatively short period after succeeding to build kernel and world when it had been broken some week(s) ago.  My new kernel has not yet been installed and I have to build the virtualbox thing(s) that are dependent upon the source.  It is nice now to have this mess cleared up and better understood.  I'll be adding this nit to the lists of build issues for kernel or world, and add emphasis on the relationship between this port and the kernel which likely many of us had not known.  The kernel failure meant the virtualbox port for a kernel object couldn't be built, but the kmod graphics port is what broke the kernel.

Saturday, November 14, 2020

Now I can't boot

There have been plenty of times by now that I have made some sort of adjustment on my system and then it doesn't boot.  We know the usual suspects are /etc/rc.conf and /boot/loader.conf but I'm sure there are others, possibly even a badly thought out recently built and installed custom kernel.  So now we are stuck, we have one box and it fails to boot but the way to solve the problem is to get online from it.  The situation with the broken kernel might be sidestepped easily, simply choose the option from the boot menu to use a different kernel.  If there were also mistakes with the buildworld, and lack of items means no booting, then there needs to be another way.

If you can get to single-user mode (another boot menu item), the changes will be easy to apply.  First, mount -u / and then if your filesystem is ZFS rather than UFS, zfs mount -a and now you can re-edit the typo out of your /boot/loader.conf or some other file, but what if your situation is a bit more complex?
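For the simple single-user case, the two commands above can be sketched as a function that takes the filesystem type.  The function names and the DRY_RUN switch are assumptions made here; by default it only prints the commands.

```sh
#!/bin/sh
# Sketch: make the filesystems writable from single-user mode.
# DRY_RUN=1 (the default here) prints the commands instead of running them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# single_user_rw [ufs|zfs]
single_user_rw() {
    fstype="${1:-ufs}"
    run mount -u /              # remount root read-write
    if [ "$fstype" = "zfs" ]; then
        run zfs mount -a        # mount the remaining ZFS datasets
    fi
}

# Print the plan for a ZFS system.
single_user_rw zfs
```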

You can use a usb stick to boot and from there mount the drives in your pc, make needed adjustments and get everything back to normal again.  This is where it can be fun, and when I say fun I mean not quite a nightmare though it is a real special pain.  You can probably use any bootable BSD which offers shell access to the machine, but since discovering NomadBSD, it has become my preference.  What NomadBSD has is a complete system which is self-contained within the usb media.  So if you are short of time and cannot fix your system, you can use it to get online to do something important, such as check your work schedule.  Of course, this immediate need situation means that you previously set up a web browser and installed and configured an addon (such as blur by Abine) which stores your passwords, and you had made any other needed adjustments to suit your needs.  So, getting online you have the solution to the problem and you've written it down and now you need to fix whatever is wrong on the HDD of your system.

What you need to do is mount the HDD of your system into the usb media that is loaded.  There should be a directory /media present already, if not, create via mkdir /media because this will be the mount point to reach inside your system HDD.  I would assume that you are already in a shell window (xterm perhaps) or you used shell access from the menu when you booted the usb.  We have two pieces of the puzzle, the solution and the system running with a shell, what we need to do in order to make the changes is get to the drive.

With ZFS, there is a special command which will do what we need; change zroot to the name of your pool.

zpool import -f -R /media zroot

Many times the mistake was made or is corrected in either /etc or /boot, so now to reach those directories or any others on your HDD, you would prefix the desired directory with /media such as below.  Work slowly, re-read the command you've typed before committing to it by pressing return or enter.  While your HDD is mounted to your usb stick, /media is the / (root) directory of your HDD, and / is the root directory of the usb stick itself.  Entering the specific directory in order to edit a file, such as rc.conf or loader.conf, may be better than remembering every time to prefix with /media, but always pay attention to your current working directory or path.

cd /media/etc

or

cd /media/boot

Only you can know what the problem and solution are.  Now that you have access to your HDD you can make the corrections and reboot.  That zpool import command is only viable until you reboot and does not need to be turned off or disabled.  We have not made any permanent changes to how your drives are mounted in order to fix the problem, unless of course your problem and solution specifically involves a permanent adjustment to how your drives are mounted.  Right now I do not have any examples handy of the dumb things I have done which resulted in being unable to boot along with how they were fixed.
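The import and the /media prefixing can be sketched together.  The function names, the ALTROOT variable, and the DRY_RUN switch are assumptions made here.  One more assumption is labeled in the code: on a default installer layout the root dataset may not mount by itself on import, in which case /media/etc looks empty until that dataset is mounted by hand, and zroot/ROOT/default is only the usual default name.

```sh
#!/bin/sh
# Sketch: import a pool under an alternate root and map paths into it.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
ALTROOT="${ALTROOT:-/media}"

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Map a path on the broken system to where it appears under the altroot.
alt_path() {
    case "$1" in
        /*) printf '%s%s\n' "$ALTROOT" "$1" ;;
        *)  printf '%s/%s\n' "$ALTROOT" "$1" ;;
    esac
}

rescue_import() {
    pool="${1:-zroot}"
    run zpool import -f -R "$ALTROOT" "$pool"
    # Assumption: the root dataset is $pool/ROOT/default and is set not to
    # mount automatically. If it is already mounted, this step just reports
    # that and can be ignored.
    run zfs mount "$pool/ROOT/default"
}

# Print the plan, then show where rc.conf would be found.
rescue_import zroot
alt_path /etc/rc.conf
```

Using alt_path in an editor command, such as vi "$(alt_path /boot/loader.conf)", avoids editing the usb stick's own copy by accident.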

While tinkering with your system in ways that can only truly be done when it is open source, because you use FreeBSD for your need of control over all of it, you are setting yourself up with the potential for mistakes.  There is nothing wrong with unintentionally doing something incorrectly, it is the surest way to learn.  We may read somewhere how to do something but unknowingly miss a step or configure something wrong or assume their technique will work on our system.  The worst of these experiences involve boot blockers, and unfortunately they can at least temporarily halt all further progress.
