OpenBSD Stories: The closest thing to cute kittens (OpenBSD/zaurus)

Original link: http://miod.online.fr/software/openbsd/stories/zaurus1.html

Sorry.

During the 1990s, some users of home computers wanted to be able to run a Unix-like operating system on their machines.

There was a group of people working on porting the BSD codebase to the Commodore Amiga, another to the Atari Falcon, and in the United Kingdom, another group working on porting BSD to the Acorn RiscPC. There were also similar efforts targeting Linux, rather than BSD.

Eventually, all these BSD porting efforts merged into NetBSD, where the port to the RiscPC, initially called RiscBSD and led by Mark Brinicombe, became NetBSD/arm32 at the end of January 1996.

At that time, the OpenBSD code was synced with the NetBSD code on an irregular basis, and OpenBSD obtained these arm32 bits shortly after; but no one among the OpenBSD developers was interested in making OpenBSD/arm32 a reality (probably because none of them had an Acorn RiscPC, as these were quite rare outside the UK).

Eventually, the vestigial ARM code in OpenBSD was removed in early 2001.

That's all folks! Stay tuned for another OpenBSD story next week!


Ok, the story was not quite finished. Far from it.

Although there was no support for ARM-based hardware in OpenBSD at that time, there was no reason not to work on it, should an interesting ARM platform appear, which could be decently usable under OpenBSD.

(Of course, some people will argue that the RiscPC fits in that category; after all, there was an Ethernet option board which would allow it to be put on a network, and if you were lucky enough to have a good video option, you could have a decent screen resolution and run X11 - very few people know this, but I used to own a RiscPC for which I also bought the Ethernet option, and would have been interested in running OpenBSD on it at some point, but even NetBSD failed to boot on that particular hardware, and I've since given that machine to a friend who managed to blow its power supply a few weeks later...)

A dream of many Unix people, from the dot-com Internet bubble onwards, was to be able to carry a small device, which would allow one to go on the Internet, and be able to fix any network problem from anywhere, at any time: the ultimate road warrior.

Nowadays, this device is your smartphone, and everyone can pretend to be a road warrior. But in the early 2000s, smartphones did not exist, Handheld PCs would not fit in your pockets unless you were André the Giant, and PDAs were not necessarily usable for that role, either lacking good displays, or good keyboards, or good expansion facilities allowing a GPRS modem to be used, or simply decent battery life.

Until the Sharp Zaurus appeared.

More specifically, the Zaurus SL series, released in 2002. These handheld computers would run Linux (thus, why not OpenBSD?), and feature a real keyboard, as well as CompactFlash and SD slots.

However, since it was a handheld machine, one would expect the porting effort to be difficult, with a twisted maze of GPIO pins, all different, and no easy debug facilities (although there was a debug serial port on the Zaurus, which later turned out to be extremely useful).

So, prior to working on a Zaurus port, another ARM-based port was needed to act as a solid foundation, on hardware easily available to the OpenBSD developers.

The obvious choice turned out to be the Chalice Technologies CATS board, distributed by Simtec Electronics: an ATX form-factor motherboard, with ISA and PCI slots, powered by a 233MHz Intel SA-110 StrongARM processor.

That board had been supported in NetBSD since the end of 1998, and would allow developers to use off-the-shelf memory, hard drives, and expansion boards, such as Ethernet boards.

OpenBSD senior developer Dale Rahn knew the ARM architecture inside-out after having written an ARM710 processor simulator at Motorola (although Motorola had their own competing processor lines, some of the Motorola mobile phones were built around ARM cores). He started to work on porting the NetBSD/cats codebase to OpenBSD during fall 2003.

I was not aware of this effort until Rahn mentioned it indirectly in the OpenBSD developers' chatroom on January 3rd, 2004.

<drahn> miod, was config_defer the bit you were trying to eliminate recently?
<miod> no, indirect config.
<drahn> ok, just ran into config_defer for isa on arm (cats)
<miod> oh, you've got yourself a CATS? I had considered buying one in the past,
       but they were priced too high for their worth
<deraadt> i am mailing simtec to get a deal for us.
<drahn> yup, running netbsd, hopefully for a short time.

Two days later, he was stuck and frustrated.

<drahn> no joy in mudville, loaded part of the kernel then 'Input/output error'
<deraadt> boot block doesn't like it?
<drahn> incorrect conversion of elf kernel to a.out (firmware only knows about
        a.out)
<deraadt> ah
<drahn> might be a firmware upgrade with groks elf haven't found it yet.
<deraadt> rom loads kernel? and groks ffs? ick.
<drahn> yup.
<drahn> might write a bootloader ;-)
[...]
<drahn> I have no idea what the API is on the cats FW, dont even know if it is
        openfirmware or what.
<kevlo> iirc, cats has latest version of firmware...
<drahn> the guy who gave me this board said their might be a newer (beta?)
        version out, but I have not found NetBSD's web pages to be up to find
        the links.
<kevlo> http://mail-index.netbsd.org/port-cats/2003/10/04/0000.html
<kevlo> hmm

The next day, Rahn received from Japan the Zaurus SL-C860 system he would later port OpenBSD to.

<drahn> sigh, of course customs  had to open my package from dragos.
<miod> which was full of prohibited material.
<millert> Someone told them zaurus was da domb...
<drahn> manual is fun, not that I can read it.

He had to play with the Zaurus a bit before returning to work (who wouldn't have?)

<drahn> heh now on from zaurus over openssh
<espie> zaurus OpenBSD ?
<millert> Yes, it is arm-based so we are starting an arm port
<drahn> still linux
<miod> but eventually this will be an arm-ored system!
<millert> Yes, we must arm-or-all
<miod> so after W^X, we'll have arm|all

In the meantime, Theo de Raadt was looking into getting CATS boards to developers, to help with the port once Rahn had made enough progress to allow other people to contribute.

<deraadt> OK, the cats vendor has given us a pretty good deal.
<drahn> cool.
<deraadt> 3/4 price.
<deraadt> for 4 machines; so i'll be asking austin to run that past him.
<deraadt> like the zaurus?
<drahn> seems cool, getting used to the key layout is a bit of work.
<drahn> the 80211 CF card requires one's hand to be in a different position.

When the CATS board was designed and built, in 1997-1998, it was fitted with a simple firmware called "Cyclone", which was able to use a serial console or a VGA-compatible PCI video board, and recognize IDE disks and NE2000-compatible PCI network cards.

Apparently, engineers at Simtec became dissatisfied with Cyclone and started to work on a different, more modular, firmware for the CATS board, called ABLE.

Interestingly, Simtec never referred to ABLE as a firmware, but always as a boot loader. Yet it was anything but A Boot Loader Extraordinaire; we just didn't know it yet.

Since it was clear Cyclone would not be maintained any further, it made sense to target ABLE. Rahn thus updated his own board and hoped that it would be able to load an ELF binary...

<drahn> [all caps expletive deleted], I just upgraded the cats box to ABLE
        firmware, then found this comment 'Not booting from ffs'
<drahn> ARGH.
<drahn> back to netboot to recover. shit.
<drahn> btw, ABLE firmware command line editor SUCKS!
<miod> you just didn't found [sic] the TECO mode, that's all.
<drahn> '?' char prints '/', ^U doesn't work, up arrow goes thru history, but
        left arrow doesn't do anything.

Meanwhile, I was a bit worried when I read my name mentioned as one of the developers who would receive a CATS, for practical mailbox size reasons.

Date: Wed, 7 Jan 2004 14:10:01 +0000
From: Miod Vallat
To: Theo de Raadt
Subject: cats

<deraadt> ok, we've got a few cats machines coming (miod, millert, kevlo, me)

Are they just mobos or complete machines (i.e. small parcel or parcel at
least the size of a computer case)?

If it's big, I'll better provide a different shipping address. However
it does not matter much if it's still get shipped to home.

De Raadt confirmed that the parcel would only contain the ATX motherboard, so I did not need to have it sent to my workplace.

Simtec, now aware of the OpenBSD project's interest in running on the CATS boards, offered some support to help us get going:

<drahn> Hmm, from CATs:
<drahn> Hi, I shall be your support engineer ;-) (OK so I am actually the lead
<drahn> software enginner because Gavin Simpson (CEO) seems very interested in
<drahn> the sucess of this project, so am I for that matter ;-)

On January 12th, Rahn reached a point where the kernel could be loaded, but did not run very far.

<drahn> whoo
<drahn> OpenBSD/cats booting ...
<drahn> panic: Incompatible magic number passed in boot args
<deraadt> neato.
<drahn> elf kernel.
<tedu> is this going to be one of those archs that requires elf2aout?
<deraadt> ah, you made it better?
<deraadt> elf2aout is wrong.
<drahn> tedu their new firmware supports elf, now we need to work with them to
        figure out what filesystem to load the kernel(bootloader) from.
<drahn> cats is ARM based machine.
<drahn> http://www.simtec.co.uk/products/EB110ATX/resources.html
<drahn> since some asked.
<deraadt> who's arms did they cut off?

Of course, it would have been too easy, had there not been means for the hardware to get in the way...

<drahn> [all caps expletive deleted] CATS box has the extra pin on the IDE
        header on the motherboard.
<kevlo> eh
<drahn> GRR, a 1998 40x CD drive could not read the FS under the new 'ABLE'
        firmwar, after ripping up a IDE cable in the process, switching the
        drive with a newer CD drive (ATAPI burner) it can.
<nate> I just normally rip that pin off the board.  (Though I have removed the
       wrong one before.  That kinda sucks.)
<drahn> sorry having to many problems with this machine to risk actual hardware
        damage.

But on the next day, we started to slowly realize that the ABLE firmware, or boot loader, call it whatever you want, wasn't an ordinary piece of software, but rather, an escape game.

<drahn> CATs people, beware ABLE version 184 it will not work with
        NetBSD/OpenBSD kernels.
<deraadt> that what they gave you?
<miod> is there an updated version available?
<drahn> 183 is on the web page, he emailed me a 184 for testing.
<miod> oh.
<deraadt> and he did not even test it?
<drahn> 183 'works'. 184 can boot the upgrader, but not *BSD
<deraadt> my point is, they did not even test their work?
<deraadt> these are poeple in the UK, who can be surprised.
<drahn> not well enough. Perhaps it can boot linux.
<miod> that's what those open source loonies are for!

The next day, the order for the CATS motherboards was completed.

<deraadt> our 6 cats motherboards have been paid for.
<miod> by whom?
<deraadt> by us.
<deraadt> me, austin, etc.
<deraadt> miod, millert, kevlo, deraadt, mcbride, and an extra in calgary
<miod> did you get a price cut?
<deraadt> yes, 125UK instead of 199UK
<miod> s/h included?
<deraadt> paid for that too
<kevlo> Don't know how long I can get it from UK.
<miod> guess i'll have to buy a case, some memory and a drive, before it gets
       here
<kevlo> I will be outof town between 1/21 and 1/26. You know, Chinese New Year.
<deraadt> it is pretty modern miod, but the clock is slow.  ok?
<miod> sigh... i guess this means sdram and ide
<miod> atx power supply too?
<deraadt> yes, atx
<deraadt> kicking and screaming eh?
<miod> i wonder if i can find a 150W or so power supply
<deraadt>  /m markus wait till he finds out the package is actually a amd64!
<deraadt> ooppsop!!!
<markus> :)
<miod> i have a 32MB sdram stick somewhere, this'll be more than enough

Later the same day, Rahn showed progress:

<drahn> I apologize in advance:
<drahn> OpenBSD 3.4-current (GENERIC) #12: Wed Jan 14 18:09:49 EST 2004
<drahn>     [email protected]:/b/src/sys/arch/cats/compile/GENERIC
<drahn> real mem  = 134217728 (131072K)
<drahn> avail mem = 119902208 (117092K)
<drahn> using 1433 buffers containing 6815744 bytes (6656K) of memory
<drahn> mainbus0 (root)
<drahn> cpu0 at mainbus0 irq -267840048 drq 0xf0312f50: SA-110 step S (SA-1 core)
<drahn> cpu0: DC enabled IC enabled WB enabled EABT
<drahn> cpu0: 16KB/32B 32-way Instruction cache
<drahn> cpu0: 16KB/32B 32-way write-back Data cache
<drahn> footbridge0 at mainbus0 irq -267840048 drq 0xf0312f50footbridge_attach called
<drahn> : DC21285 rev 3
<drahn> pci0 at footbridge0 bus 0
<drahn> vendor 0x10b9 product 0x1533 (class bridge subclass ISA, rev 0xc3) at pci0 dev 7 function 0 not configured
<drahn> ne0 at pci0 dev 9 function 0 vendor 0x1050 product 0x0940 rev 0x00footbridge_pci_intr_map: out of range interruptpin 1 line 2 (0x2)
<drahn> : couldn't map interrupt
<drahn> vga0 at pci0 dev 10 function 0 vendor 0x5333 product 0x8a01 rev 0x01
<drahn> wsdisplay0 at vga0
<drahn> wsdisplay0: screen 0-5 added (80x25, vt100 emulation)
<drahn> pciide0 at pci0 dev 16 function 0 vendor 0x10b9 product 0x5229 rev 0xc1: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
<drahn> pciide0: channel 0 channel interrupting at irq 14
<drahn> wd0 at pciide0 channel 0 drive >
<drahn> wd0: 128-sector PIO, CHS, 0MB, 0 cyl, 63 head, 13913 sec, 0 sectors
<drahn> wd1 at pciide0 channel 0 drive >
<drahn> wd1: 1-sector PIO, CHS, 0MB, 0 cyl, 63 head, 8224 sec, 0 sectors
<drahn> wd0(pciide0:0:0): using PIO mode 0
<drahn> wd1(pciide0:0:1): using PIO mode 0
<drahn> pciide0: channel 1 channel interrupting at irq 15
<drahn> atapiscsi0 at pciide0 channel 1 drive 0
<drahn> scsibus0 at atapiscsi0: 2 targets
<miod> crash.dalerahn.com heh
<miod> cpu0 irq is strange
<drahn> gross thing is that crash is now my /usr/src nfs server.
<drahn> after bob 'bob the builder' died.
<grange> actually only few ata devices need interrupts for probe, it's new
         promise raid controllers
<drahn> hmm, didn't configure the isa bridge... might be bad. still major
        progress.
<millert> Heh, you can tell Dale has kids
<miod> Todd, what do you mean wrt dale's kids?
<brad> the fact that he mentioned bob the builder.
<millert> "Bob the builder"
<miod> ah. unknown to me.
<millert> british kid's show
<miod> british? what a bad influence on kids!
<millert> Some kind of satanic thing I guess
<tom> huh? my 4-year-old daughter loves it!
<jolan> their website uses flash, satanic indeed
<miod> tom, and when she'll be 21, she'll vote Thatcher.
<tom> as long as she's left home by then :-)

(I have to mention, in the discussion above, that Tom lives in the UK.)

Rahn continued working silently on the port. On January 28th, de Raadt shared good news:

<deraadt> OK, our cats machines have shipped.
<drahn> damn, pressure is on ;-)
<miod> damn. will have to look for parts or even buy them.
<mickey> cats sheeping ...
<deraadt> miod, are you allowed to buy ATX cases?
<miod> and soon we'll have a herd of'em!
<miod> i own three.
<mickey> soon we'll have whole haggis of cats
<miod> and a fourth one without power supply in the attic. i think i still have
       it.

And Rahn shared even better news a bit later:

<drahn> ssh might work better with pty/tty having the right major...
<drahn> I had not fixed them all appearently.
<deraadt> in what?
<deraadt> Oh, in your tree, heh
<drahn> cats.
<deraadt> So you are ssh'ing into it now?
<deraadt> Major bugs now are FP?
<drahn> rebuilding kernel/device nodes.
<deraadt> exit and such bugs are mostly gone?
<drahn> not seen any exit bugs. run many things...
<drahn> need to debug the multiuser startup. could be these device node issues.
<drahn> touch the wrong device...
<drahn> Jan 28 23:35:34 noname init: /bin/sh on /etc/rc terminated abnormally, going to single user mode
<drahn> Jan 28 23:35:34 noname init: kernel security level changed from 1 to 0
<drahn> Enter pathname of shell or RETURN for sh:
<drahn> sh.core
<drahn> :-(
<drahn> almost MU
<deraadt> yuck

(MU above stands for "Multi-User".)

On the next day:

<drahn> $ uname -a
<drahn> OpenBSD cats 3.4 GENERIC#9 cats
<drahn> not really multi user yet, but sshed into it.

Rahn's work was committed to the OpenBSD repository on February 1st.

This allowed other developers to start contributing code cleanups and other adjustments, to begin with.

<deraadt> wow, mickey is fixing cats.  cut, pull, twist, tie, insert, sew, right?
<deraadt> meow meeeeooooow MRREOEAAWOOOWOWWWWWWWOWOOWOW
<deraadt> :-)
<mickey> always wonderred what's inside the cat making all those noises ...
<deraadt> being strangled by an arm inside
<mickey> who said cats do not have arms ?
<hshoexer> I hope fixing cats does not involve litter pans
<deraadt> it's just another architecture that can't do W^X
<deraadt> but i'll accept a non-W^X architecture in my pocket... for now..

On February 4th, the OpenBSD/cats port was starting to become a reality, with binaries in sight.

<drahn> Dragos, likely have a cats snapshot tonight/tomorrow, getting close to
        ready for zaurus ;-)

I received the CATS board on the 9th. Although sent to my home address, its delivery required a signature, and I was away (at work) when the postman came with the parcel; so I had to go to the post office in the morning on my way to work, in order to get it and sign the appropriate delivery forms.

<miod> ok, cats is there
<miod> well, in my car's mall, that is.
<mickey> you put cats in the trunk? how evil ...
<miod> i left milk.
<mickey> do you have mice in the car ?(:
<miod> i hope not
<miod> if i had mice in my car, I'd name them terry.
<mickey> or gerry ?
<miod> no, terry. because i'd have a mice terry car.
<mickey> and go on a magic mice terry tour ?
<miod> why not.
<mickey> or id'd be a female mouse could be named tress
<mickey> s/id/if

Of course, I had not paid attention to the ABLE documentation, which clearly states that it only handles PCI video and network cards, and painted myself into a corner, real quick.

<miod> damn, this cats firmware won't recognize my cheapo oldo isa ne2000 clone.
<miod> guess i'll have to burn a cd.
<henning> you're sick.
<millert> They recognize a PCI ne2000
<miod> i'm not sure i have a pci ne2000
<miod> oh yes! in the os/2 machine.

The engineers at Simtec had set up a dedicated mailing list for the OpenBSD porting effort, to which all of us with CATS boards had been subscribed.

On February 12th, Gareth Simpson of Simtec sent a mail apologizing for a few mistakes in the board configurations, due to them having been prepared a bit hastily.

Among other things, one of the boards had been shipped with the wrong clock oscillator: apparently, part of the board testing involved running it overclocked for a while, after which the nominal clock would be put back for proper (and safe) operation. Simtec people had noticed one fast clock was missing, having been left on one of the eight boards shipped to OpenBSD developers.

The mail gave precise instructions: where to look on the board, and what the oscillator was supposed to look like:

Date: Thu, 12 Feb 2004 18:27:55 +0000
From: Gareth Simpson
To: [email protected]
Subject: [OpenBSD] Fixes for hardware problems

Dear All,

With hearing of the problems experienced by some of you, I've
checked our records and found a lapse in our Quality checking here at
Simtec.

There are three problems associated with the batch of boards shipped to
the Open BSD team:

1) One of the boards was incorrectly fitted with a 4.91MHz reference
   oscillator
2) Boards were shipped with an older version of ABLE (V1.70)
3) Serial ports not outputting a stable baud rate

I hope that the new release of ABLE and the steps described
below solve the major issues you have come across and can only apologise
for the inconvenience caused by our mistake.  With the new, revised
systems we've introduced, this kind of problem should not occur again.

If this sorts out the hardware problems I'll go back to hiding in my
design cubicle and leave things to Vince and the others to tackle any
outstanding ABLE issues.


BRGDS,

Gareth



Solutions
---------


(1) Wrong oscillator

Can the person with the 4.91MHz oscillator module please identify
themselves so we can post you the correct 3.68MHz module.  The oscillator
can be identified as being the silver 8 pin module labelled U15 located
by the first PCI slot at the back of the machine.  In the meantime, if
you have access to a 3.68 or 3.57MHz 8 pin oscillator this can be fitted
until the replacement arrives.

With the default clock resistor settings, the 3.68MHz module gives a CPU
core clock of 228MHz.  The 4.91MHz module was used to clock processors
at 304MHz for test purposes.   Although some StrongARMs appear to work
at 300MHz, they are not reliable enough for general use. The 4.91MHz
module had escaped it's [sic] quarantine.




(2)(3) Boards shipped with old version of ABLE - wrong baud rate

These problems are related and stem from Simtec shipping an old version
of ABLE and it not being able to cope with a memory clock other than
50MHz.  Later versions of ABLE calibrate the system clock and set the
console baud rate divisors correctly for 38N1.  It appears that during
the final stages of test, the boards were "upgraded" with an old v1.70
version of ABLE.

To be able to use the serial port correctly with versions of ABLE
earlier than 1.79, the memory clock must be set to 50MHz (both LK4 and
LK5 links fitted).

Serial should work when 50MHz link settings are set and will fix the
baud rate issue so that ABLE can be upgraded.   Once ABLE has been
upgraded, the links can be returned to their old position.  The boards
were shipped set with a memory clock of 66MHz by default. (If slower
PC66 SDRAM is being used,  this might have to be reduced to 60MHz for
100% reliability). 66MHz = LK5-12 LK4-off, 60MHz = LK5-off LK4-12.


The links are located half way between the SST FLASH socket and the first
PCI socket.  LK5 is nearest the floppy connector and LK4 is towards the
back panel.  There is a small table printed on the PCB located between
the Floppy and DIMM socket PL2 listing the link settings for each of the
memory clock options.  When fitted, the link caps link between pins 1
and 2 running from left to right (parallel to the FDC connector).
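
The clock figures quoted in the mail above can be cross-checked with a little arithmetic. The following is only a back-of-the-envelope sketch in Python: the function names are invented for the illustration, and it assumes the ratio between reference oscillator and core clock stays fixed as long as the clock resistor settings are left at their defaults, which is what the mail implies.

```python
# Frequencies quoted in the Simtec mail, in MHz: (reference oscillator, CPU core).
NOMINAL = (3.68, 228.0)
TEST = (4.91, 304.0)


def implied_multiplier(reference_mhz, core_mhz):
    """Ratio between the core clock and the reference oscillator."""
    return core_mhz / reference_mhz


def overclock_percent(nominal_core_mhz, actual_core_mhz):
    """How far above the nominal core clock the board is running, in percent."""
    return (actual_core_mhz / nominal_core_mhz - 1.0) * 100.0


for label, (ref, core) in (("nominal", NOMINAL), ("test", TEST)):
    ratio = implied_multiplier(ref, core)
    print(f"{label}: {ref} MHz x {ratio:.1f} = {core:.0f} MHz")

print(f"overclock: {overclock_percent(NOMINAL[1], TEST[1]):.0f}%")
```

Both oscillators work out to a ratio of roughly 62, so the test module simply runs the same board about a third faster than intended.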

I checked on the board I had received, and bingo! I was the winner loser!

<miod> ha ha ha excellent!
<miod> i am the one who got the overclocked cats board!
<miod> might explain the usb trouble.
<drahn> hmm, was thinking that.
<deraadt> this rivalry between england and france has been going on for quite
          some time
<drahn> I suspect that kevlo did as well, would explain the serial garbage.
<miod> the guy also said "unstable serial baud rate" as a distinct problem
<miod> (just catching up)

Date: Thu, 12 Feb 2004 22:28:58 +0000
From: Miod Vallat
To: Gareth Simpson
Cc: [email protected]
Subject: Re: [OpenBSD] Fixes for hardware problems

> 1) One of the boards was incorrectly fitted with a 4.91MHz reference
>    oscillator

This would be mine.
[...]
I've just checked, the board here indeed comes with a socketed 4.91 MHz
chip in U15. The good news is that shipping to Europe is less expensive
than shipping too abroad... (-:
[...]

I was then told I would get a new oscillator shipped to my address, but in the meantime there was no reason for me not to try and set up the board. I downloaded Rahn's binaries and booted the installer, which quickly dropped me into the kernel debugger.

<miod> wow, and bsd.rd.cats from ~drahn is in ddb now.
<miod> dale, check for a spurious Debugger() call
<drahn> first line it prints?
<miod> it's not a panic.
<miod> a Debugger() before copyright is printed.
<miod> say "cont" and you boot.
<drahn> there was a problem where setargs default is something like drive=hd0,
        ro foobdebe or something, and that saw the 'd' and enabled debugger.
<drahn> I thought I fixed that by requiring a '-' at the beginning of the
        options.
<miod> mayhaps it picks the d from bsd.rd
<drahn> next time you reset, run 'showargs'
<miod> i have never set them.
<drahn> firmware defaults it to something screwy for linux.
<miod> >showargs
<miod> current: setargs root=/dev/hda3 ro video=dram:1024K
<miod> aack
<drahn> 'd' of dram -> debugger.
<miod> bwahahaha!
<miod> >setargs
<miod> >showargs
<miod> current: setargs ?>?=/dev/hd?>?
<miod> and then!
<miod> >setargs ""
<miod> Freeing already free block (block f0053edc)
<drahn> bleh
<pval> wow
<pval> does this suck or what
<drahn> top quality software.
<miod> time to post on bugtraq, buf oflow in able (-;
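
The misfire discussed above is easy to reproduce in miniature. What follows is only a sketch in Python, not the cats kernel code (which is C and is not shown here): the function names and the flag table are made up for the illustration, with the letters borrowed from the usual BSD boot flags. It contrasts a parser that accepts a flag letter anywhere in the argument string with one that only looks at words starting with '-', which is the kind of check Rahn mentions.

```python
# Hypothetical flag table, modelled on the usual BSD boot flags (-a, -c, -d, -s).
FLAG_LETTERS = {
    "a": "askname",
    "c": "config",
    "d": "debugger",
    "s": "single",
}


def parse_naive(args):
    """Treat any flag letter found anywhere in the string as an option."""
    return {FLAG_LETTERS[ch] for ch in args if ch in FLAG_LETTERS}


def parse_strict(args):
    """Only honour flag letters inside whitespace-separated words starting with '-'."""
    found = set()
    for word in args.split():
        if word.startswith("-"):
            found.update(FLAG_LETTERS[ch] for ch in word[1:] if ch in FLAG_LETTERS)
    return found


# The Linux-oriented default that ABLE reported through showargs.
LINUX_DEFAULT = "root=/dev/hda3 ro video=dram:1024K"

print("naive: ", sorted(parse_naive(LINUX_DEFAULT)))
print("strict:", sorted(parse_strict(LINUX_DEFAULT)))
print("strict with -d:", sorted(parse_strict("-d")))
```

With the naive version, the stray 'd' characters in the Linux default are enough to request the debugger; the strict version ignores the whole string because none of its words starts with '-'.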

It appeared that everyone had a bad experience with ABLE that day...

<kevlo> Great. Using the firmware version 1.88, output via serial port is ok,
        but ps/2 keyboard doens't work
<pval> it's either one or the other, you have to pick which firmware :P
<miod> kevlo, yup, this hit me too, and 1.89 is supposed to correct this
<kevlo> ok, PS/2 keyboard works using 1.89 on cats
<miod> good
<miod> have you tried accessing ffs partitions from the firmware?
<kevlo> It seems to me that it can't access ffs :(
<kevlo> I'm trying to boot from cdrom. "boot (cd0)bsd.rd"
<kevlo> Error obtaing the file format of file (cd0)bsd.rd
<kevlo> :(
<miod> can you ls (cd0)?
<miod> maybe it only tries iso9660 on cd-rom
<kevlo> No, I can't.
<miod> was the disk in the drive before you powered up the machine?
<miod> if not, put the disk in the tray, and hardware reset the board.
<miod> check it finds a (cd0) alias for (hdb) or whatever your cdrom drive is
<kevlo> Mine output is:
<kevlo> hda: ST320014A: diagnosing drive: ok
<kevlo> hdc: ATAPI CDROM: NEC                 CD-ROM DRIVE:282
<kevlo> (hda) 18GB
<kevlo> (hd0) on (hda1)
<kevlo> BSD disklabel within MS-DOS partition (hda1)
<kevlo> (hd1) on (hda1p1)
<kevlo> (hdc) Drive Empty
<kevlo> >ls
<kevlo> (hd1) (aliases to hda1p1)
<kevlo> (hd0) (aliases to hda1)
<kevlo> (nvram0) (aliases to ds1687)
<kevlo> (hda1p1)
<kevlo> (hda1)
<kevlo> (hda)
<kevlo> (tftpboot)
<kevlo> (romfs)
<kevlo> (console)
<kevlo> (ds1687)
<kevlo> Something's wrong?
<miod> (hdc) Drive Empty
<kevlo> That's weird. bsd.rd is on cd
<miod> open the tray, close the tray, wait for the cd-rom drive to settle, and
       reset the board...
<kevlo> Damn, still the same. I also changed three CDROMs *sigh*

The "fun" continued the next day.

<drahn> I did dislike their comment, "able is a bootloader, not firmware"
<drahn> I screwed up and assumed the first arg of setargs would be device,
        fixing very shortly.
<deraadt> no kidding; we need to educate them a bit
<deraadt> If they want to be more than the PC BIOS, then they need to learn more
          about real machines instead of crafting a new kind of balony.
[...]
<drahn> hmm they didn't quite answer my question, is there a difference between
        'boot (hd0)bsd' and '(hd0)bsd'
<miod> yes.
<miod> boot ignore further args but picks setargs.
<miod> without boot you execute the entire command line, setargs ignored.
<miod> [at least that's what i understood]
[...]
<miod> >ls (hd0)
<miod> ffs_mount: magic number is wrong (is 000005a0)
<miod> perror: stat failed (2 - No such file or directory)
<miod> ls: (hd0): No such file or directory
<miod> damn again.
<deraadt> requires a reset eh
<miod> 1.91 cold reset.
<deraadt> wow.
<miod> i am booting the kernel off the cd, then enter root device and, well, it
       kinda works
[...]
<matthieu> It looks to me like the simtec guys should give us the soruce of
           their "bios" and let Miod, Dale and Kevlo fix it.
<miod> i'm sure this would come faster.
<drahn> matthieu, I have been thinking I should work on firmware...
<deraadt> If their code is that bad, imagine their development environment.
<miod> i'm quite sure they don't use any form of cvs
<drahn> there was a 'cvs -n up' line in one of the emails.
<mickey> they probably even do know about make
<miod> otherwise they would have spotted disabling the keyboard driver in 1.88,
       for example.

Despite all the ABLE troubles, I ended up being able to set up my system, and booted the freshly-installed system for the first time.

<miod> generating keys...
[18 minutes later]
<miod> yawn. dale, how much time does ssh-keygen take?
<drahn> dont recall, didn't think it took too long, but it ran before I had FP
        fixed, when openssl ran 6 times faster.
<miod> fixing causes a 6 times loss?
<miod> i wouldn't call this fixing )-:
<drahn> well, when the machine had floating point issues. i had a run of openssl
        (with incorrect flags) and it was much faster than what I have now.
<drahn> IIRC openssl should not use floating point.
<miod> bah, as long as it's faster than sun3...
[12 more minutes later]
<miod> it's starting to get the 68060 look fast wrt DSA key generation.
<drahn> It is my opinion that the machine is faster than this, but something is
        screwed up.
<miod> i share this opinion.
[20 more minutes later]
<miod> ok, it's slower than a 68040 running 12 times slower... impressive.
[48 more minutes later]
<miod> i wonder if the arm slowness wrt ssh-keygen could simply be a very bad
       entropy gathering.
<drahn> is it still running?
<miod> yes.
<drahn> did you install from net or cd?
<miod> from ftp.
<miod> same as theo: could not mount cd
<miod> two line diff i can't test because i don't want to burn too many cds.
<drahn> cd generated how? 'mkhybrid -r -o cd.iso cd'? disk I generated 4 hours
        ago works fine.
<miod> simple mkisofs -r -R -J

Somehow, I was the only one with ssh-keygen spinning.

Eventually I interrupted the process to complete the multiuser boot; once logged in at the console, I could rerun ssh-keygen under gdb, but everything looked correct: the process was running and busy deep in the bowels of the crypto library.

It did not occur to me at the moment that this was a consequence of the machine being overclocked - but when I received the proper clock and replaced it, the problem immediately disappeared.

Date: Wed, 18 Feb 2004 18:11:29 +0000
From: Miod Vallat
To: [email protected]
Subject: Re: [OpenBSD] Fixes for hardware problems

> I've just checked, the board here indeed comes with a socketed 4.91 MHz
> chip in U15. The good news is that shipping to Europe is less expensive
> than shipping too abroad... (-:

The correct chip was in the mail today, and I'm glad to announce that my
board suddenly became much more stable - a lot of puzzling issues
(especially process spinning) are gone for good with the correct clock.

So I'll happily confirm that the processor is not stable at 304 MHz.

Do you want me to return the 4.91MHz chip?

Miod

This ssh-keygen problem, however, was the least of my worries, as I was unable to get ABLE to boot the kernel from disk; I had to burn the kernel binary on a CD-R, and boot from CD-R, telling the kernel to use the IDE disk as its root device.

Date: Mon, 16 Feb 2004 22:47:29 +0000
From: Miod Vallat
To: [email protected]
Subject: Unable to boot off ffs

The disk setup I have here has its partition recognized since 1.89 but I
can not boot from them. From the error messages and the partial disk
dump provided below, I suppose ABLE expects something about the label
which is not respected by my layout (no MBR, disklabel in logical sector
#1).

Can anybody shed light on this?

Miod

>version
ABLE: 1.92 (cats(vga-x86),footbridge) (ben@kira) Mon Feb 16 21:17:15 GMT 2004
>ls
(cd0) (aliases to (hdb1):iso9660)
(hd8) (aliases to (hda9):ffs)
(hd7) (aliases to (hda8):ffs)
(hd6) (aliases to (hda7):ffs)
(hd5) (aliases to (hda6):ffs)
(hd4) (aliases to (hda5):ffs)
(hd3) (aliases to (hda4):ffs)
(hd2) (aliases to (hda3):ffs)
(hd1) (aliases to hda2)
(hd0) (aliases to (hda1):ffs)
(nvram0) (aliases to ds1687)
(hdb1)
(hdb)
(hda9)
(hda8)
(hda7)
(hda6)
(hda5)
(hda4)
(hda3)
(hda2)
(hda1)
(hda)
(tftpboot)
(romfs)
(console)
(ds1687)
>ls (hd0)
ffs_mount: magic number is wrong (is 000005a0)
perror: stat failed (2 - No such file or directory)
ls: (hd0): No such file or directory
>ls (hd2)
ffs_mount: magic number is wrong (is 000005a0)
perror: stat failed (2 - No such file or directory)
ls: (hd2): No such file or directory
>

hexdump -C -n 16384 /dev/rwd0c (note 0x000005a0 at 0x2558)

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  57 45 56 82 05 00 00 00  45 53 44 49 2f 49 44 45  |WEV.....ESDI/IDE|
00000210  20 64 69 73 6b 00 00 00  51 55 41 4e 54 55 4d 20  | disk...QUANTUM |
00000220  46 49 52 45 42 41 4c 4c  00 02 00 00 3f 00 00 00  |FIREBALL....?...|
00000230  0f 00 00 00 b4 14 00 00  b1 03 00 00 74 6c 4c 00  |....´...±...tlL.|
00000240  00 00 00 00 00 00 00 00  10 0e 01 00 00 00 00 00  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000280  00 00 00 00 57 45 56 82  26 1c 10 00 00 20 00 00  |....WEV.&.... ..|
00000290  00 00 01 00 e2 df 01 00  00 00 00 00 00 08 00 00  |....âß..........|
000002a0  07 08 a0 00 85 fe 03 00  e2 df 01 00 00 08 00 00  |.. ..þ..âß......|
000002b0  01 08 10 00 74 6c 4c 00  00 00 00 00 00 00 00 00  |....tlL.........|
000002c0  00 00 00 00 27 41 01 00  67 de 05 00 00 08 00 00  |....'A..gÞ......|
000002d0  07 08 60 00 09 21 03 00  8e 1f 07 00 00 08 00 00  |..`..!..........|
000002e0  07 08 e0 00 09 21 03 00  97 40 0a 00 00 08 00 00  |..à..!...@......|
000002f0  07 08 e0 00 af c0 08 00  a0 61 0d 00 00 08 00 00  |..à.¯À.. a......|
00000300  07 08 40 01 98 7e 16 00  4f 22 16 00 00 08 00 00  |..@..~..O"......|
00000310  07 08 40 01 37 40 10 00  e7 a0 2c 00 00 08 00 00  |..@.7@..ç ,.....|
00000320  07 08 40 01 56 8b 0f 00  1e e1 3c 00 00 08 00 00  |..@.V....á<.....|
00000330  07 08 40 01 00 00 00 00  00 00 00 00 00 00 00 00  |..@.............|
00000340  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  00 00 00 00 00 00 00 00  08 00 00 00 10 00 00 00  |................|
00002010  18 00 00 00 68 02 00 00  10 00 00 00 f0 ff ff ff  |....h.......ðÿÿÿ|
00002020  88 18 31 40 f8 77 00 00  8f 75 00 00 01 00 00 00  |..1@øw...u......|
00002030  00 40 00 00 00 08 00 00  08 00 00 00 05 00 00 00  |.@..............|
00002040  00 00 00 00 3c 00 00 00  00 c0 ff ff 00 f8 ff ff  |....<....Àÿÿ.øÿÿ|
00002050  0e 00 00 00 0b 00 00 00  03 00 00 00 00 10 00 00  |................|
00002060  03 00 00 00 02 00 00 00  00 10 00 00 00 fc ff ff  |.............üÿÿ|
00002070  0a 00 00 00 00 10 00 00  80 00 00 00 04 00 00 00  |................|
00002080  00 00 00 00 3f 00 00 00  01 00 00 00 00 00 00 00  |....?...........|
00002090  3a cc 0e 01 3d bc f2 9b  68 02 00 00 00 08 00 00  |:Ì..=¼ò.h.......|
000020a0  00 20 00 00 0f 00 00 00  3f 00 00 00 b1 03 00 00  |. ......?...±...|
000020b0  82 00 00 00 a0 00 00 00  00 25 00 00 a8 93 00 00  |.... ....%..¨...|
000020c0  25 00 00 00 41 07 00 00  6f 1f 00 00 0c 00 00 00  |%...A...o.......|
000020d0  00 00 00 02 2f 00 00 00  00 00 00 00 00 00 00 00  |..../...........|
000020e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002340  00 00 00 00 00 00 00 00  00 00 00 00 a0 74 0e f1  |............ t.ñ|
00002350  00 00 13 f1 00 08 13 f1  20 00 00 00 00 00 00 00  |...ñ...ñ .......|
00002360  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000024a0  00 00 00 00 00 00 00 00  00 00 00 00 00 40 00 00  |.............@..|
000024b0  40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
000024c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002520  00 00 00 00 03 00 00 00  3c 00 00 00 02 00 00 00  |........<.......|
00002530  ff ff 02 04 40 00 04 00  ff 3f 00 00 00 00 00 00  |ÿÿ..@...ÿ?......|
00002540  ff 07 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |ÿ...............|
00002550  01 00 00 00 60 05 00 00  a0 05 00 00 54 19 01 00  |....`... ...T...|
00002560  00 00 1e 00 3c 00 59 00  77 00 94 00 b2 00 cf 00  |....<.Y.w...².Ï.|
00002570  ed 00 0a 01 28 01 45 01  63 01 80 01 9e 01 bb 01  |í...(.E.c.....».|
00002580  d9 01 f7 01 14 02 32 02  4f 02 6d 02 8a 02 a8 02  |Ù.÷...2.O.m...¨.|
00002590  c5 02 e3 02 00 03 1e 03  3b 03 59 03 76 03 94 03  |Å.ã.....;.Y.v...|
000025a0  01 01 01 01 01 01 01 01  01 01 01 01 01 01 01 01  |................|
[...]
00002940  01 01 01 01 01 01 01 01  01 01 01 01 01 01 01 01  |................|
00002950  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00004000

disklabel wd0:

# /dev/rwd0c:
[...]
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 15
sectors/cylinder: 945
cylinders: 5300
total sectors: 5008500
[...]
16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   122850        0    4.2BSD     2048 16384   160   # (Cyl.    0 - 129)
  b:   261765   122850      swap                        # (Cyl.  130 - 406)
  c:  5008500        0    unused        0     0         # (Cyl.    0 - 5299)
  d:    82215   384615    4.2BSD     2048 16384    96   # (Cyl.  407 - 493)
  e:   205065   466830    4.2BSD     2048 16384   224   # (Cyl.  494 - 710)
  f:   205065   671895    4.2BSD     2048 16384   224   # (Cyl.  711 - 927)
  g:   573615   876960    4.2BSD     2048 16384   320   # (Cyl.  928 - 1534)
  h:  1474200  1450575    4.2BSD     2048 16384   320   # (Cyl. 1535 - 3094)
  i:  1065015  2924775    4.2BSD     2048 16384   320   # (Cyl. 3095 - 4221)
  j:  1018710  3989790    4.2BSD     2048 16384   320   # (Cyl. 4222 - 5299)
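
The note about 0x000005a0 at 0x2558 in the mail above can be checked by hand. The following is a minimal sketch, assuming the usual FFS layout: superblock at byte offset 8192, `fs_magic` at offset 0x55c within it, and a magic value of 0x011954. These three constants come from the BSD `fs.h` header, not from the mail itself, and the helper name `word_at` is made up for illustration. It decodes the dump row at 0x2550 as little-endian 32-bit words:

```python
import struct

# Row 0x00002550 of the hexdump above, copied verbatim.
ROW_OFFSET = 0x2550
ROW = bytes.fromhex("01000000" "60050000" "a0050000" "54190100")

# Assumptions taken from the BSD FFS headers, not from the mail:
SBOFF = 8192             # byte offset of the FFS superblock on the partition
FS_MAGIC_OFFSET = 0x55C  # offset of fs_magic inside the superblock
FS_MAGIC = 0x011954      # expected FFS magic number


def word_at(offset):
    """Return the little-endian 32-bit word at an absolute offset inside ROW."""
    return struct.unpack_from("<I", ROW, offset - ROW_OFFSET)[0]


reported = word_at(0x2558)               # the word ABLE complained about
expected_pos = SBOFF + FS_MAGIC_OFFSET   # where fs_magic should live
actual = word_at(expected_pos)

print(f"word at 0x2558:           {reported:#010x}")
print(f"word at {expected_pos:#x} (fs_magic): {actual:#010x}")
print("magic matches FS_MAGIC:", actual == FS_MAGIC)
```

Under these assumptions the expected magic sits four bytes after the word ABLE printed, so the superblock on disk looks intact and the value in the error message is simply the neighbouring word.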
A lot of exchanges happened over the next few days, with me trying different disks or disk setups, and Todd Miller reporting that he had been lucky enough to end up with a working setup, using an 80GB Maxtor disk.

I was also testing new ABLE builds almost daily; since ABLE could not correctly read from the on-disk filesystems, the only way to update the firmware was from CD, and I had to burn a CD-R for every new firmware.

In one of the exchanges with Simtec engineers, I mentioned that:

I'll wait for the "boot from ffs" problems I have are fixed before I
play with this, as I am burning too many cd-r with just firmware updates
by now...

After having been prodded by Rahn about the on-board IDE controller failing to work in the DMA modes it was supposed to handle, the Simtec engineers investigated and noticed that the factory settings for some board jumpers controlling various timings in bus operation were not correct.

On February 23rd, we received this email from them...

Date: Mon, 23 Feb 2004 22:30:59 +0000
From: Ben Dooks
To: [email protected]
Subject: [OpenBSD] DMA Issue

The DMA issues with your boards have been traced. The newer revision
boards have been shipped with incorrect link settings for LK12, LK13
and LK14.

The modifications to fix this are as follows:

        LK12 - link to be set with cap on 1-2
        LK13 - link to be set with cap on 1-2
        LK14 - link to be set with cap on 1-2

The links are located between the SA-110, DC21285 and the rear
Parallel/Serial socket set.

We are very sorry for the inconvenience this has caused you.

We will issue a Field-Change-Notice for this modification.

It seems that the older revision AA footbridges ignore these
links when configured for PCI Central function, and that the
move to AB revision silicon has caused the problem to surface.

We will be issuing a new copy of ABLE once we have sorted out
the current development issues and a small set of bugs. This
should be V1.94 and be issued tomorrow (24th Feb 2004).


Link Descriptions
-----------------

LK12:   2-3 = footbridge test mode
        1-2 = normal footbridge mode

LK13:   2-3 = footbrigde returns PCI class 0x0E00001
        1-2 = footbridge returns PCI class 0xB400001

LK14:   2-3 = footbridge in boot-rom program mode
        1-2 = footbridge in normal run mode
The resource page was indeed updated with a ``Product Change Notice''.

Quoting from it:

Reason For Change

To ensure correct operation of PCI bus masters;this affects DMA bus master
transfers and in some configurations access of PCI devices to the SDRAM BAR.

This issue is caused because the three links LK12,LK13 and LK14 were shipped set
to test mode.

Description of Changes

LK12,LK13 and LK14 should be moved to position marked 1-2.This will ensure these
links are set for correct operation and not test modes.

<drahn> hey, they found the cats dma problem.
<deraadt> oh??
<millert> Ah, the cats people found the dma issue; good
<deraadt> wd0a: DMA error reading fsbn 16 of 16-31 (wd0 bn 16; cn 0 tn 0 sn 16), retrying
<deraadt> wd0: transfer error, downgrading to Ultra-DMA mode 1
<deraadt> wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 1
<deraadt> wd0a: DMA error reading fsbn 16 of 16-31 (wd0 bn 16; cn 0 tn 0 sn 16), retrying
<deraadt> wd0: soft error (corrected)
<millert> Cool, I get udma 1 now on cats
<grange> why it can't do udma2?
<millert> Chip bug
The correct clock and the jumper changes would however not help fix my boot-from-disk issue.
Date: Tue, 24 Feb 2004 14:10:08 +0000
From: Miod Vallat
To: [email protected]
Subject: More on the ffs reading problem

I have no idea whether this can help or not, but I have compared my disk
layout against dale's (which work) and found that our disk geometries
were different.

Dale's disk has a canonical /16/63 geometry:

bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008

while the disk I am using doesn't:

bytes/sector: 512
sectors/track: 63
tracks/cylinder: 15
sectors/cylinder: 945

Dale's partition starts at sector 2:

16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   409246        2    4.2BSD     2048 16384   328   # (Cyl.    0*- 405)

Mine started at sector 0, so I changed this as well:

16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   122850        0    4.2BSD     2048 16384   160   # (Cyl.    0 - 129)

of course, I am still unable to read any of my ffs partitions from ABLE.

Could the mismatch be caused somehow by the disk geometry?

Miod
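
The geometry figures quoted in this mail and in the earlier disklabel are plain arithmetic, and can be cross-checked with a few lines. This is only a sketch; the function names are made up for illustration.

```python
def sectors_per_cylinder(tracks_per_cylinder, sectors_per_track):
    """Sectors in one cylinder for a given CHS geometry."""
    return tracks_per_cylinder * sectors_per_track


def last_cylinder(offset, size, spc):
    """Cylinder holding the last sector of a partition."""
    return (offset + size - 1) // spc


dale_spc = sectors_per_cylinder(16, 63)   # the "canonical" /16/63 geometry
miod_spc = sectors_per_cylinder(15, 63)   # the disk that would not boot

print("Dale:", dale_spc, "sectors/cylinder")
print("Miod:", miod_spc, "sectors/cylinder")

# Total size of the failing disk: 5300 cylinders of 945 sectors.
print("Miod total sectors:", 5300 * miod_spc)

# Cylinder ranges of the two 'a' partitions, as printed by disklabel.
print("Dale a: Cyl. 0 -", last_cylinder(2, 409246, dale_spc))
print("Miod a: Cyl. 0 -", last_cylinder(0, 122850, miod_spc))
```

Both labels are internally consistent, so the arithmetic alone does not single out either layout as malformed.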
While the CATS port was making progress, de Raadt was still thinking about the Zaurus...
<deraadt> cats is there for a reason.  so we can put openbsd in our pockets.
<drahn> dont want a scorched pocket.
<beck> I always have openbsd in my pocket.
<beck> but it's on a usb key :)
<krw> "is that openbsd in your pocket or are you happy to see me?"
<cloder> pocketbsd?
<beck> nah. more like "I'm not logging in from that - here let me reboot it" :)
Things started to settle down as March started.
<millert> Has there been any mail from the cats people in the past few days?
<miod> no
<miod> mine is off until i get a working firmware
At this point, I had given up on being able to boot from disk.
Date: Wed, 3 Mar 2004 01:10:17 +0000
From: Miod Vallat
To: [email protected]
Subject: (not) booting from ffs...

Hello,

  since there has been no progress on the ffs boot issue recently, and
since I am now concentrating my work towards bug fixing and
documentation updates only, for the next OpenBSD release, I probably
won't spend much time on arm hardware.

  Do you guys think it would make things easier for you if I were to
ship the hard drive I am using to UK, so that you can perform live tests
on it? You'd ship it back once the issue is fixed.

Miod
This triggered an interesting answer:
Date: Wed, 3 Mar 2004 11:31:46 +0000
From: Ben Dooks
To: [email protected]
Subject: Re: [OpenBSD] (not) booting from ffs...

On Wed, Mar 03, 2004 at 01:10:17AM +0000, Miod Vallat wrote:
> Hello,
>
>   since there has been no progress on the ffs boot issue recently, and
> since I am now concentrating my work towards bug fixing and
> documentation updates only, for the next OpenBSD release, I probably
> won't spend much time on arm hardware.

We hope to have a release today, I was not very well from thursday-sunday
last week, so all work has suffered a delay.

We think that your hard-drive problem is down to multi-sector operations,
and there has been a change in the code that reads the multi-sector
capabilites from the hard-drive. We have also added an option to force
a maximum limit on multi-sector reads in case the drive is just reporting
a strange value.

>   Do you guys think it would make things easier for you if I were to
> ship the hard drive I am using to UK, so that you can perform live tests
> on it? You'd ship it back once the issue is fixed.

If the next release does not address your problems, then we can look
at this an options, thanks.

-- 
Ben

Q:      What's a light-year?
A:      One-third less calories than a regular year.
They were right: once the number of sectors read at a time was limited, the latest ABLE version was finally able to access my disk correctly.
Date: Thu, 4 Mar 2004 17:44:44 +0000
From: Miod Vallat
To: [email protected]
Subject: Re: [OpenBSD] ABLE 1.94r3 and the state of play

> Attached is ABLE version 1.94r3. This should fix the issue with elf
> loading debug segments and provide a configuration value
> (ide.multi-limit) to limit the multi sector reading
>
> This may fix Miod's problem with his unusual disc (set it to the value
> "1" which will only perform single sector reads despite what the drive
> reports)

I confirm that, with ide.multi-limit set to 1, ABLE will now access the
failing disk correctly. I'll play with different values in order to pick
the best one still allowing me to boot.

Now if only your command line interface would handle control-U
correctly, I think I'd be happy for a while.

Thank you very much for your work guys.

Miod

Date: Fri, 5 Mar 2004 00:55:46 +0000
From: Miod Vallat
To: [email protected]
Subject: Re: [OpenBSD] ABLE 1.94r3 and the state of play

> I confirm that, with ide.multi-limit set to 1, ABLE will now access the
> failing disk correctly. I'll play with different values in order to pick
> the best one still allowing me to boot.

For the record, setting the value to 2 also works. Any other value
results in an error accessing the ffs filesystems.

Miod
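
The idea behind a setting like ide.multi-limit is easy to sketch: a large read is cut into transfers of at most a given number of sectors. ABLE's actual code is not shown anywhere in these mails, so the following is only an illustration under that assumption; `split_read` and its parameters are hypothetical names.

```python
def split_read(start_sector, count, multi_limit):
    """Yield (first_sector, n_sectors) transfers of at most multi_limit sectors."""
    if multi_limit < 1:
        raise ValueError("multi_limit must be at least 1")
    sector = start_sector
    remaining = count
    while remaining > 0:
        n = min(remaining, multi_limit)
        yield sector, n
        sector += n
        remaining -= n


# Reading 16 sectors starting at sector 16, with the two limits
# reported as working in the mails above.
for limit in (1, 2):
    transfers = list(split_read(16, 16, limit))
    print(f"limit {limit}: {len(transfers)} transfers, first {transfers[0]}, last {transfers[-1]}")
```

A smaller limit means more, shorter transfers: slower, but less dependent on the multi-sector capability value the drive reports.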

<miod> http://www.simtec.co.uk/products/SWOPENBSD/
<miod> "Simtec are happy to be able to assist the OpenBSD team by providing
       hardware and software support to assist their project"
[...]
<drahn> miod it was 1.92 that mostly worked, right? 1.93 was the 'bad' version?
<miod> no 1.94 was bad
<miod> 1.93 was ok for me
<miod> (except for the disk)
<miod> 1.94rc3 apparently has no regressions here so far
[...]
<drahn> weird... with new rom, I get
<drahn> Warning: PCI Class code is wrong, possible LK12-LK14 setting error
<drahn> *******************************************************************************
<drahn> ERROR: Link LK14 is configured for blank program mode
<drahn> ERROR: PCI Bus mastering may not work in this configuration
<drahn> *******************************************************************************
<drahn> I moved the jumpers, and the message disappeared. machine sees to be
        booting still.
<miod> they wrote in their mail that they had added detection of the jumpers in
       order to warn... guess it works.
<miod> time to update the cats documentation to reflect these changes.

From then on, the CATS situation was quite stable. All boards would run, the kernel ran reliably, and ports people were (not) delighted to have to fix 3rd-party software to correctly build and run on OpenBSD/arm systems.

It was time to work on the real objective: the Zaurus port.

To be continued...
