Wednesday, December 24, 2008

The UDEV Problem

So another ongoing problem I'm having is the issue of UDEV not starting properly. When I boot, the process will hang at "Starting UDEV...." and sit there for about 4 minutes, until it times out.

The boot process will continue at this point, until it gets to "Loading HAL", at which point it will hang again for another 2 minutes.

At one point, the problem seemed to go away automagically, and the computer booted about as fast as a moderately bloated Windows boot -- which for Linux, is lightning. But then I made the mistake of running the system update. In addition to breaking a bunch of other things, it also blew up my UDEV groove, and the computer was back to taking around 10 minutes to reach a usable desktop.

It would be nice if there was some way to track what UDEV is doing during the boot process, but there isn't.

Problem: UDEV hangs at "Starting UDEV", waits for timeout, stalls again at "Starting HAL daemon"

Reproducible: always
How to Reproduce: Boot the computer
Workaround: Sit there and wait

Steps to troubleshoot:

1. Read piles of man pages and combed through forums. Wrote a few questions into forums, which were ignored. Read more man pages. Googled progressively, trying new searches every time I learned a new word.

2. Downloaded and learned to use a program called "bootchart" (http://www.bootchart.org/). After wrangling with it for several hours, I learned from it that there was a long delay while UDEV "waited for devices to settle".

There was not much information available about what this means, or how to tamper with this process. Most of the results from googling "waiting for devices to settle" were forum posts by people who were also exasperated with UDEV for some reason or other.

There are also some tools available for managing UDEV here: http://linux.die.net/man/8/udevadm. But these are all intended to be used after the computer has performed the glacial task of starting up. There are no tools for looking into UDEV during the boot process.

3. Tried to look around the computer for some explanation of what was happening.

There is a good page on how to write UDEV rules here: http://www.reactivated.net/writing_udev_rules.html. The guy who wrote this is apparently the Lao Tzu of UDEV, since most forum posts about problems with UDEV will refer readers to this guide. Too bad it isn't much help for resolving out-of-the-box UDEV problems.

There is a lot of info about how udev works in general. But to the average user, not much of it means anything. And again, the screed assumes that the reader is trying to create new problems for himself, not solve existing ones. There is no identifiable information about what UDEV is doing while it boots.

It did give me the idea of going through /etc/udev/rules.d and renaming the files one by one in the hope of finding the problem rules file. After about 2 hours I had managed to test about 4 files and got bored. I then copied the entire rules.d directory to a usb stick, and deleted everything except the files that were necessary.

Specifically, I was after 60-pcmcia.rules, for one. I read a lot of posts pointing to it as a likely culprit.
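
Testing the rules files one at a time is slow. If exactly one file is to blame, halving the set on each reboot gets there in far fewer boots. Here is a rough Python sketch of that idea. Everything in it is an assumption for illustration: the file names are made up apart from 60-pcmcia.rules, and is_slow() is a stand-in for "enable only these files, reboot, and see whether UDEV still hangs". On a real system any files that are required for booting would have to stay enabled the whole time.

```python
from typing import Callable, List, Sequence


def find_culprit(files: Sequence[str],
                 is_slow: Callable[[Sequence[str]], bool]) -> str:
    """Bisect a list of rules files down to the single one that causes the hang.

    Assumes exactly one culprit, and that is_slow(subset) returns True
    whenever the culprit is part of the enabled subset.
    """
    candidates = list(files)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Enable only the first half. If the boot is still slow, the culprit
        # is in that half; otherwise it must be in the remainder.
        candidates = half if is_slow(half) else candidates[len(half):]
    return candidates[0]


# Illustrative file list, not a listing from the actual machine.
rules = [
    "05-udev-early.rules",
    "40-alsa.rules",
    "50-udev-default.rules",
    "60-pcmcia.rules",
    "60-persistent-storage.rules",
    "95-udev-late.rules",
]

tests_run: List[List[str]] = []


def fake_is_slow(enabled: Sequence[str]) -> bool:
    """Stand-in for a timed reboot; pretends 60-pcmcia.rules is the culprit."""
    tests_run.append(list(enabled))
    return "60-pcmcia.rules" in enabled


culprit = find_culprit(rules, fake_is_slow)
print(culprit, "found after", len(tests_run), "simulated boots")
```

With six files this takes two or three boots instead of up to six. It only works if a single file is responsible; if the delay comes from several files together, or from something outside rules.d, the bisection will point at the wrong file.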

More later ...

More:

1. It seems likely that any problem manifested as a UDEV error happened somewhere else first, since UDEV gets things passed on to it from other programs. I read about this in Borders last night and can't remember the details.

More on WLAN Investigation

I tried a few other distros after a Fedora update broke my system. Thing is, Linux Mint had the same ZD1211 problem. *

It turns out that the bootcode error is not an uncommon problem, and as usual, there is no documentation for the driver from the vendor, nor is there anything to be gained from spending hours combing through forums. People who have this problem are just screwed.

I wonder if this is part of my udev issue.

Anyway, I did see in one Debian thread that it's possible several versions of the driver are loading at once and colliding. Of course, there is no information on how to get this to stop happening, but it's a start.

Maybe it says something about how hard information is to come by that when I Google this problem, the first result is my own ramblings about how there is no information.




-----------------
*Ubuntu is a dud. I give up on them. I've tried 5 versions and my lappy barfed them all up.

Saturday, December 13, 2008

A WLAN Investigation

My onboard wifi has stopped working. It will not connect automatically. For a while, I could get sporadic manual connections by:

1. Restarting the computer or
2. Restarting Network Manager (-->Services -->network-manager)

If I go into Network Configuration (-->System --> Administration --> Network Device Control), the onboard wifi is shown as "inactive". If I highlight it, the "Activate" button on top grays out (unusable).

Under the hardware tab in this screen, the wifi chip is shown as ASUSTek Wireless WL 159g (this is correct).

Under Device Tab, double clicking the adapter:
--> General: Controlled by network manager ticked, Activate device when computer starts ticked. Nickname is wlan0. Automatically obtain ip address with DHCP selected. Nothing else is ticked.

--> Hardware: MAC Address is 00:13:d4:18:74:b8
--> Wireless settings mode = Master


The adapter is detected in lsusb:

lsusb -v :

Bus 001 Device 003: ID 0b05:170c ASUSTek Computer, Inc. WL-159g
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 255 Vendor Specific Class
bDeviceSubClass 255 Vendor Specific Subclass
bDeviceProtocol 255 Vendor Specific Protocol
bMaxPacketSize0 64
idVendor 0x0b05 ASUSTek Computer, Inc.
idProduct 0x170c WL-159g
bcdDevice 48.02
iManufacturer 16 ASUS
iProduct 32 USB2.0 WLAN
iSerial 0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 46
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0x80
(Bus Powered)
MaxPower 500mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 4
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 0
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x01 EP 1 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x82 EP 2 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x83 EP 3 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 1
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x04 EP 4 OUT
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 1
Device Qualifier (for other device speed):
bLength 10
bDescriptorType 6
bcdUSB 2.00
bDeviceClass 255 Vendor Specific Class
bDeviceSubClass 255 Vendor Specific Subclass
bDeviceProtocol 255 Vendor Specific Protocol
bMaxPacketSize0 64
bNumConfigurations 1
Device Status: 0x0000
(Bus Powered)
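
The useful part of all that output is the ID pair on the first line, since that is what drivers and hardware databases key on. A minimal Python sketch for pulling it out follows. The function names are my own, and the Windows-style ID format is inferred from the "USB\VID_...&PID_..." strings that hardware lookup sites use in their search URLs, not from any documentation of lsusb.

```python
import re

# Matches the one-line summary that lsusb prints for each device.
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d+) Device (?P<dev>\d+): "
    r"ID (?P<vid>[0-9a-fA-F]{4}):(?P<pid>[0-9a-fA-F]{4})\s*(?P<name>.*)"
)


def parse_lsusb_line(line):
    """Return bus, device, vendor/product IDs and name, or None if no match."""
    m = LSUSB_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "bus": int(m.group("bus")),
        "device": int(m.group("dev")),
        "vendor_id": m.group("vid").lower(),
        "product_id": m.group("pid").lower(),
        "name": m.group("name"),
    }


def windows_hardware_id(info):
    """Build the USB\\VID_xxxx&PID_xxxx string used by hardware lookup sites."""
    return "USB\\VID_{}&PID_{}".format(
        info["vendor_id"].upper(), info["product_id"].upper()
    )


info = parse_lsusb_line(
    "Bus 001 Device 003: ID 0b05:170c ASUSTek Computer, Inc. WL-159g"
)
print(info["vendor_id"], info["product_id"])
print(windows_hardware_id(info))
```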

==================================

But dmesg had this to say about it:


hub 3-0:1.0: 3 ports detected
usb 1-5: configuration #1 chosen from 1 choice
usb 1-5: New USB device found, idVendor=0b05, idProduct=170c
usb 1-5: New USB device strings: Mfr=16, Product=32, SerialNumber=0
usb 1-5: Product: USB2.0 WLAN
usb 1-5: Manufacturer: ASUS


Then:
zd1211rw 1-5:1.0: phy0

Then:

firmware: requesting zd1211/zd1211_ub
usb 1-5: firmware version 0x4330 and device bootcode version 0x4802 differ
firmware: requesting zd1211/zd1211_ur
usb 1-5: USB control request for firmware upload failed. Error number -32
zd1211rw 1-5:1.0: couldn't load firmware. Error number -32
firmware: requesting zd1211/zd1211_ub
usb 1-5: firmware version 0x4330 and device bootcode version 0x4802 differ
firmware: requesting zd1211/zd1211_ur
usb 1-5: USB control request for firmware upload failed. Error number -32
zd1211rw 1-5:1.0: couldn't load firmware. Error number -32

===================

The wifi would not connect. What's more, the device was not shown when clicking the nm panel applet. My network was also not listed.

When I tried to set up "connect to other network," choosing my own network already listed under "Auto", the "Disconnected" flag came up immediately.

=============================

But another USB plug-in adapter worked fine:

dmesg [snip]

usb 1-6: new high speed USB device using ehci_hcd and address 4
usb 1-6: configuration #1 chosen from 1 choice
usb 1-6: New USB device found, idVendor=050d, idProduct=905b
usb 1-6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-6: Manufacturer: Belkin
phy1: Selected rate control algorithm 'pid'
Registered led device: rt73usb-phy1:radio
Registered led device: rt73usb-phy1:assoc
Registered led device: rt73usb-phy1:quality
usbcore: registered new interface driver rt73usb
firmware: requesting rt73.bin
ADDRCONF(NETDEV_UP): wlan1: link is not ready
wlan1: authenticate with AP 00:17:3f:62:1d:c5
wlan1: authenticated
wlan1: associate with AP 00:17:3f:62:1d:c5
wlan1: RX AssocResp from 00:17:3f:62:1d:c5 (capab=0x421 status=0 aid=1)
wlan1: associated
ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
wlan1: no IPv6 routers present
wlan1: disassociating by local choice (reason=3)
wlan1: authenticate with AP 00:17:3f:62:1d:c5
wlan1: authenticated
wlan1: associate with AP 00:17:3f:62:1d:c5
wlan1: RX AssocResp from 00:17:3f:62:1d:c5 (capab=0x421 status=0 aid=1)
wlan1: associated
wlan1: disassociating by local choice (reason=3)
wlan1: authenticate with AP 00:17:3f:62:1d:c5
wlan1: authenticated
wlan1: associate with AP 00:17:3f:62:1d:c5
wlan1: authenticate with AP 00:17:3f:62:1d:c5
wlan1: authenticated
wlan1: associate with AP 00:17:3f:62:1d:c5
wlan1: RX ReassocResp from 00:17:3f:62:1d:c5 (capab=0x421 status=0 aid=1)
wlan1: associated
wlan1: disassociating by local choice (reason=3)
wlan1: authenticate with AP 00:17:3f:62:1d:c5
wlan1: authenticated
wlan1: associate with AP 00:17:3f:62:1d:c5
wlan1: RX AssocResp from 00:17:3f:62:1d:c5 (capab=0x421 status=0 aid=1)
wlan1: associated
wlan1: disassociating by local choice (reason=3)
wlan1: authenticate with AP 00:17:3f:62:1d:c5
wlan1: authenticate with AP 00:17:3f:62:1d:c5
wlan1: authenticated
wlan1: associate with AP 00:17:3f:62:1d:c5
wlan1: RX ReassocResp from 00:17:3f:62:1d:c5 (capab=0x421 status=0 aid=1)
wlan1: associated

========================

Right now, under lsmod, both rt73usb (the Belkin USB external) and zd1211rw (the onboard wifi) are loaded:

lsmod:

rt73usb 31232 0
rt2x00usb 17792 1 rt73usb
rt2x00lib 43776 2 rt73usb,rt2x00usb
rfkill 17316 1 rt2x00lib


Then:

zd1211rw 54280 0

=================================

But zd1211rw has no info in the "used by" column.
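
To compare the "used by" columns side by side, the lsmod listing can be parsed into a small table. This is only a sketch: parse_lsmod is a name I made up, and the sample text is the lsmod output pasted above. Worth noting from the same listing: rt73usb also shows 0 with nothing after it, and that adapter works, so an empty column on its own may not mean much.

```python
def parse_lsmod(text):
    """Parse lsmod output into {module: {"size", "refcount", "used_by"}}."""
    modules = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the "Module Size Used by" header.
        if len(parts) < 3 or not parts[1].isdigit():
            continue
        users = []
        if len(parts) > 3:
            users = [u for u in parts[3].split(",") if u]
        modules[parts[0]] = {
            "size": int(parts[1]),
            "refcount": int(parts[2]),
            "used_by": users,
        }
    return modules


# The lsmod lines quoted in this post, plus the usual header line.
SAMPLE = """\
Module                  Size  Used by
rt73usb 31232 0
rt2x00usb 17792 1 rt73usb
rt2x00lib 43776 2 rt73usb,rt2x00usb
rfkill 17316 1 rt2x00lib
zd1211rw 54280 0
"""

mods = parse_lsmod(SAMPLE)
# Modules that nothing else depends on.
unused = sorted(name for name, m in mods.items() if not m["used_by"])
print(unused)
```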


++++++++++++++++++++++++++++++++++

This is important:


firmware: requesting zd1211/zd1211_ub
usb 1-5: firmware version 0x4330 and device bootcode version 0x4802 differ
firmware: requesting zd1211/zd1211_ur
usb 1-5: USB control request for firmware upload failed. Error number -32
zd1211rw 1-5:1.0: couldn't load firmware. Error number -32
firmware: requesting zd1211/zd1211_ub
usb 1-5: firmware version 0x4330 and device bootcode version 0x4802 differ
firmware: requesting zd1211/zd1211_ur
usb 1-5: USB control request for firmware upload failed. Error number -32
zd1211rw 1-5:1.0: couldn't load firmware. Error number -32


Searching forums yields no information other than that this error and problems with ZD1211 are common: a lot of people ask, no one answers.
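
For what it's worth, the messages themselves carry two concrete facts: the firmware file and the device bootcode report different versions, and the upload fails with error -32. The kernel reports errors as negative errno values, and on Linux errno 32 is EPIPE, which for a USB control transfer generally means the device stalled (refused) the request. The Python sketch below just extracts those values from the log text; the function name and regular expressions are my own and only match the exact wording shown above.

```python
import errno
import re

MISMATCH = re.compile(
    r"firmware version (0x[0-9a-fA-F]+) and device bootcode version "
    r"(0x[0-9a-fA-F]+) differ"
)
FAILURE = re.compile(r"couldn't load firmware\. Error number (-?\d+)")


def summarize_firmware_log(log):
    """Return (set of (firmware, bootcode) version pairs, list of error codes)."""
    mismatches = set()
    errors = []
    for line in log.splitlines():
        m = MISMATCH.search(line)
        if m:
            mismatches.add((int(m.group(1), 16), int(m.group(2), 16)))
        f = FAILURE.search(line)
        if f:
            errors.append(int(f.group(1)))
    return mismatches, errors


# One repetition of the dmesg lines quoted above.
LOG = """\
firmware: requesting zd1211/zd1211_ub
usb 1-5: firmware version 0x4330 and device bootcode version 0x4802 differ
firmware: requesting zd1211/zd1211_ur
usb 1-5: USB control request for firmware upload failed. Error number -32
zd1211rw 1-5:1.0: couldn't load firmware. Error number -32
"""

mismatches, errors = summarize_firmware_log(LOG)
for fw, boot in sorted(mismatches):
    print("firmware 0x%04x vs bootcode 0x%04x" % (fw, boot))
for code in errors:
    # errno.errorcode maps positive errno numbers to their symbolic names.
    print(code, errno.errorcode.get(-code, "unknown"))
```

That does not fix anything, but "zd1211rw firmware upload EPIPE" or the two version numbers may make better search terms than the full error text.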


=================================

Here is additional information about the ASUSTeK WLAN, from:

http://www.modem-help.co.uk/ASUSTeK/WL-159g-USB-2-0-Wireless-Network-Adapter.html



http://www.modem-help.co.uk/search.php?id=USB\VID_0B05%26PID_170C#results

http://www.modem-help.co.uk/ZyDAS/chipset.types/802-11-WLAN-Controllerless.html


1. There are files in /lib/firmware for zd1211, but I can't look at them in gedit.
2. Yum shows zd1211-firmware is installed. The "Info" tab in Yumex and the dates of the /lib files match, and both show the driver version number is 1.4.