Monthly Archive for July, 2007

Rant Time: Domain Names

I’ve been looking to buy a domain name for some time now. I could be boring and just register joeyjwc.com, but I don’t really want to do that. When I give out my email address verbally, most people get confused and try to send an email to joeywc; they forget that I repeat my first initial. When I register at new forums now, I try to take the username JWC, which is far easier to remember. Of course, I didn’t expect a three-letter domain name to be free. In fact, jwc.com has been taken by a legitimate business, and I don’t have any problem with that. What I do have a problem with is that nearly every other three-letter domain name has been taken by businesses that buy domain names in bulk and then sell them off at very high prices. If you work out the numbers, the return is extremely high given the risk. Sorry, but I’m not paying $6000 for a domain name. In fact, 3la.org tells us that every single three-letter .com domain name has been taken. I would imagine that most of them are owned by such companies.

So then, just for the heck of it, I decided to look up random words from the dictionary in a WHOIS database. Guess what? Even the most obscure words have been registered to various businesses seeking to auction off their domain names. Maybe I’m just extremely unoriginal, or maybe these companies are getting away with murder. While nobody cares about a personal website, think about business websites. I wonder how many small businesses decide to scrap the whole website thing because their domain name has been taken and its owner wants $10,000 for it. I just don’t see how this is economically helpful.

I’m not arguing that resale is always bad. The web hosting business has long been made up of resellers, and that system works very well. Big server companies sell their servers and colocation services to large webhosting companies, which offer their clients dedicated servers, virtual servers, and other large packages. In turn, these clients often include smaller webhosting companies that can charge their own clients a small amount of money for an appropriate shared hosting package (usually geared towards personal websites and small businesses). Each party ends up making money: shared hosting clients may use their websites to make money via AdSense or other such programs (or because those clients are small businesses), the shared hosting webhosts make money from their clients, the large hosting companies make money from a few customers that pay high prices for large hosting packages, and the server farms make money from housing the servers. The domain name industry doesn’t work like that. It just costs some businesses enormous amounts of money while making the domain name owners incredibly rich. ICANN’T believe that ICANN continues to let this occur. Bob Parsons, of GoDaddy.com, brought up a similar topic, domain tasting and kiting, in a recent blog post of his.

The bottom line is: if I see another “What you need, when you need it” slogan again, I’m going to scream.

XML Parsing in PHP Revisited

After ranting about how PHP lacks a simple XML parser (and skips straight to a somewhat buggy and much more complex one), I began to think about a better method for XML parsing. Theoretically, it’s quite possible to split an entire XML document into a giant array tree. For example, here is a simple XML document.

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
         <generator>J-Tech RSS Generator</generator>
         <title>J-Tech Site News</title>
         <description>J-Tech News</description>
         <link>http://joeyjwc.x3fusion.com/</link>
         <item>
                  <title>Title</title>
                  <pubDate>Wed, 1 Jan 3000 00:00:00 GMT</pubDate>
                  <description>
                           This is an example post.
                  </description>
                  <guid>http://joeyjwc.x3fusion.com/news/index.php?a=10000</guid>
         </item>
</channel>
</rss>

It’s very easy to see how this could become an array:

[__XML_PROPS__] =>
         [version] => 1.0
         [encoding] => UTF-8
 
[rss] =>
         [__PROPS__] =>
                  [version] => 2.0
         [channel] =>
                  [generator] =>
                           [__TEXT__] => J-Tech RSS Generator
                  [title] =>
                           [__TEXT__] => J-Tech Site News
                  [description] =>
                           [__TEXT__] => J-Tech News
                  [link] =>
                           [__TEXT__] => http://joeyjwc.x3fusion.com/
                  [item] =>
                           [title] =>
                                    [__TEXT__] => Title
                           [pubDate] =>
                                    [__TEXT__] => Wed, 1 Jan 3000 00:00:00 GMT
                           [description] =>
                                    [__TEXT__] => This is an example post.
                           [guid] =>
                                    [__TEXT__] => http://joeyjwc.x3fusion.com/news/index.php?a=10000

This example illustrates a few things. First, note that there are a few special variables, namely __XML_PROPS__, __PROPS__ and __TEXT__. __XML_PROPS__ only exists for the <?xml?> tag and describes its attributes. More generally, __PROPS__ describes the attributes of an arbitrary tag (in this case, it is the <rss> tag). Finally, __TEXT__ describes the text content (also described as simple content) of a tag. This variable is needed because tags can have mixed content, meaning that they contain both text and other tags. For example:

<tag1>This is some text.
         <tag2>This is some more text</tag2>
</tag1>

<tag1> has mixed content.

So, what about repeated tags in the same family? For example, how could this scheme handle another <item> tag in the XML above? One way would be to label the item tags with unique identifiers: item_0, item_1, item_2, and so on. Then, we could add another property that records how many instances of each tag exist. For example, the [rss][channel] array could contain another element called [__item_NUMBER__] (where item is the repeated tag) that gives the number of instances of the item tag in that family. It would then be easy to write a quick for loop that extracts the data in order.

This is what I’ve got so far. But I’ll keep thinking about it. It definitely needs to be optimized a bit before it becomes a useful algorithm, but it’s getting there.
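The array tree described above can be sketched in a few lines. The sketch below is in Python rather than PHP, purely for illustration, and it is only one possible reading of the scheme: the function names element_to_tree and parse_xml are hypothetical, the standard library's ElementTree parser does not expose the <?xml?> declaration (so __XML_PROPS__ is left out), and any text that follows a child tag in mixed content is ignored.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_to_tree(elem):
    """Convert one element into a nested dict using __PROPS__ and __TEXT__."""
    node = {}
    if elem.attrib:
        node["__PROPS__"] = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        node["__TEXT__"] = text

    # Count how often each child tag occurs so repeated tags can be numbered.
    counts = Counter(child.tag for child in elem)
    seen = Counter()
    for child in elem:
        if counts[child.tag] == 1:
            node[child.tag] = element_to_tree(child)
        else:
            # Repeated tags become item_0, item_1, ... in document order.
            node[f"{child.tag}_{seen[child.tag]}"] = element_to_tree(child)
            seen[child.tag] += 1

    # Record the number of instances of each repeated tag.
    for tag, count in counts.items():
        if count > 1:
            node[f"__{tag}_NUMBER__"] = count
    return node


def parse_xml(source):
    """Parse an XML string into a dict keyed by the root tag."""
    root = ET.fromstring(source)
    return {root.tag: element_to_tree(root)}


if __name__ == "__main__":
    doc = (
        "<rss version='2.0'><channel><title>J-Tech Site News</title>"
        "<item><title>First</title></item>"
        "<item><title>Second</title></item>"
        "</channel></rss>"
    )
    channel = parse_xml(doc)["rss"]["channel"]
    # The quick for loop over repeated tags, in order.
    for i in range(channel["__item_NUMBER__"]):
        print(channel[f"item_{i}"]["title"]["__TEXT__"])
```

Numbering only the repeated tags keeps the common case (a single <title> or <link>) addressable by its plain name, at the cost of the key changing if a second instance ever appears.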

Fun with Laptop Keyboards

After doing some research, I discovered a general method for hooking a laptop keyboard up to any computer using a PS/2 port. It is possible to think of a “keyboard” (as in a keyboard for your desktop computer or something) as having two major elements: the keyboard matrix and the keyboard controller. The keyboard matrix consists of a circuit board (either a traditional rigid PCB or a flexible one, depending on the keyboard), the keys, and the materials that let the keys make a connection on the board. Some keyboards use a rubber sheet or a bunch of rubber buttons to make the connections. Others use a scissor mechanism. The keyboard controller takes input data from the keyboard matrix and converts it to a set of instructions readable by computers. Laptops have keyboard controllers built into their motherboards. External keyboards, however, have keyboard controllers in the frame of the keyboard itself. For example, I have a PS/2 keyboard that I use for Sophos right now. After taking it apart (it was held together by 17 screws in all), I discovered that the matrix pinout consisted of two flat cables (well, actually, they were part of the plastic sheet that constituted the matrix) with brush-style mylar connectors that connected to the keyboard controller.

Once I find a suitable controller, setting up the keyboard should be a snap, with two small exceptions. First, I will have to remap the keys, as I doubt that many of the keycodes will be correct. Second, actually connecting the keyboard to the controller will be a bit of a challenge. I’m thinking of etching copper traces onto a PCB and clamping it to both the ribbon cable and the keyboard controller. This might be the only way. However, I’ll see if I can find a pre-made clamp first.
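To make the matrix/controller split and the remapping step concrete, here is a small simulation in Python. It is only a sketch under stated assumptions: a real controller does this in hardware or firmware, and the function names, the 2x3 matrix size, and the key names are all made up for illustration.

```python
def scan_matrix(closed_switches, rows, cols):
    """Return the (row, col) positions that read as pressed.

    closed_switches is a set of (row, col) pairs standing in for the
    electrical connections made by pressed keys (an assumption of this sketch).
    """
    pressed = []
    for row in range(rows):        # drive one row line at a time
        for col in range(cols):    # then sample every column line
            if (row, col) in closed_switches:
                pressed.append((row, col))
    return pressed


def decode(pressed, keymap):
    """Translate matrix positions into key names using a remappable table."""
    return [keymap.get(position, "UNKNOWN") for position in pressed]


if __name__ == "__main__":
    # Hypothetical keymap: remapping means editing this table until every
    # position on the laptop matrix reports the key printed on its keycap.
    keymap = {(0, 0): "ESC", (0, 1): "F1", (1, 2): "A"}
    print(decode(scan_matrix({(1, 2), (0, 0)}, rows=2, cols=3), keymap))
```

Keeping the scan and the lookup table separate mirrors the problem described above: the borrowed controller already knows how to scan, so only the table needs to change.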

New Blog Layout

As if I don’t have enough stuff to do, I’ve decided to start redesigning my blog’s layout. I’m thinking of doing something like this, but I’m not quite sure yet. Obviously, that mockup needs quite a bit of work before I can turn it into HTML, but it has a few elements that I really like.

First of all, it uses a simple, grayscale theme. I have been playing around with colorless graphics for a little while now. A few of my tests are actually bordering on mediocre, so that’s pretty good. With the appropriate shading, it is possible to achieve a very nice effect. For example, in the grayscale test image, I used a bit of blurring to create a halo-like shadow in the upper half of the image (light background) and also to create a glowing effect in the bottom half of the image (dark background). Interestingly, these two effects can even be combined to achieve a layered effect. For example, in the blog layout mockup, I put a dark shadow behind the white main content frame. This 3D effect immediately draws your eye to the center of the page. I had trouble deciding which column would have a dark background (with light text) and which would have a light background (with dark text). So far, I have decided to make the main column’s background dark. After some experimentation, I discovered that my eye tends to wander very quickly to the darker background because it is bolder. However, I’m not quite sure that white text on a dark background is as easy to read as dark text on a light background. Unfortunately, the mockup image doesn’t do the layout justice because Inkscape’s anti-aliasing is extremely strong. Also, I plan to add some very light textures to the backgrounds to make things less bland.

In other news, I really need to find a PS/2 keyboard in order to rip the keyboard controller out of it so that I can start experimenting with my laptop keyboard.  It will be incredibly cool when I get it to work!

Gentoo is Up and Running.

After installing well over 3000 different packages from Portage, I finally have a usable Gentoo installation. I settled on a partition scheme like this:

  • /dev/hda1 – /boot – EXT2 – 100MB
  • /dev/hda2 – SWAP – SWAP – 1GB
  • /dev/hda3 – / – Reiser4 – 159GB or so
  • /dev/hdb1 – /mnt/xp – NTFS – 10GB
  • /dev/hdb2 – /mnt/vista – NTFS – 50GB
  • /dev/hdb3 – /mnt/share – NTFS – 97GB

As for my USB drives, I’ve got two right now. I borrowed a 320GB drive so that I could back up all of my data and selectively copy it over to my new system. I also have an 80GB drive, which I will use for system backups (I’ll get to that in a second).

I decided to make hdb3 an NTFS partition because I wanted one partition where I could back up all of the files in my home directory and also store music, videos, and other things that I want to share between Linux and Windows. This was the best way to do that.

I have read numerous warnings about the dangers of Reiser4. I have also read documents that stated that XFS was much better. Still, some of the Reiser4 benchmarks are absolutely fantastic. Even better, Reiser4 is modular, so it can be easily updated to make it even faster. So, I decided to go with it. I don’t intend to just leave my data to the whims of a single file system, though. I intend to keep at least two or three copies of my most important data on different filesystems and different drives at all times. But that’s not all.

I’m going to design my own backup system. At its core, it will be very simple. Although there may well be nothing wrong with backup systems that only update changed files, I don’t really feel secure unless I copy the entire directory tree over. Considering that my most important files can be condensed down to a couple of gigabytes, I’m just going to use a bzipped tar backend. Every week, a Python script will launch and ask me to back up my data. At this point, it will archive my entire home directory (minus the VirtualBox virtual machines) and copy the archive to /mnt/share. I call this type of backup a “Stage1” backup. I suspect that I’ll keep around 5 revisions at a time.
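A minimal sketch of what such a Stage1 step might look like is below. It is not the actual script: the function name stage1_backup, the archive naming pattern, the stamp parameter, and the assumption that the virtual machines live in a .VirtualBox directory inside the home directory are all illustrative, and the weekly prompt is left out.

```python
import tarfile
import time
from pathlib import Path


def stage1_backup(home, dest, exclude=(".VirtualBox",), keep=5, stamp=None):
    """Archive `home` into a bzipped tar under `dest`, keeping `keep` revisions.

    keep should be at least 1. `exclude` names directories directly under
    `home` to skip (assumed here to be where the virtual machines live).
    """
    home, dest = Path(home), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"stage1-{stamp}.tar.bz2"

    # Member names inside the archive start with the home directory's name.
    skipped = tuple(f"{home.name}/{name}" for name in exclude)

    def drop_excluded(info):
        for prefix in skipped:
            if info.name == prefix or info.name.startswith(prefix + "/"):
                return None  # returning None leaves this member out
        return info

    with tarfile.open(str(archive), "w:bz2") as tar:
        tar.add(str(home), arcname=home.name, filter=drop_excluded)

    # Timestamped names sort chronologically, so the oldest come first.
    revisions = sorted(dest.glob("stage1-*.tar.bz2"))
    for old in revisions[:-keep]:
        old.unlink()
    return archive
```

Called as stage1_backup("/home/joey", "/mnt/share") (the home path is a guess), this writes one full archive per run and deletes the oldest once more than five exist, which matches the copy-everything approach rather than an incremental one.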

The “Stage2” backup is a little more interesting. Every 3 weeks or so, the script will ask me to initiate this type of backup. First, it will run a Stage1 backup. Then, it will ask me to connect and mount my external 80GB USB drive. It will then copy all of the data located in /mnt/share to the USB drive. Unfortunately, given the amount of data and the amount of space left on the USB drive, I may only be able to keep 2 or 3 revisions at a time.
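The copy-to-USB half of a Stage2 backup could be sketched along the following lines. Again, this is an illustration under assumptions, not the real script: the function name stage2_backup, the stage2-<timestamp> directory naming, and the require_mount switch are invented here, and the Stage1 run and the prompts that precede the copy are omitted.

```python
import os
import shutil
import time
from pathlib import Path


def stage2_backup(share, usb_mount, keep=3, stamp=None, require_mount=True):
    """Copy everything under `share` to a new revision directory on the USB drive.

    keep should be at least 1. Revision directories are named stage2-<stamp>.
    """
    share, usb_mount = Path(share), Path(usb_mount)
    if require_mount and not os.path.ismount(usb_mount):
        # Refuse to fill the root filesystem if the drive is not mounted.
        raise RuntimeError(f"{usb_mount} is not a mounted filesystem")

    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = usb_mount / f"stage2-{stamp}"
    shutil.copytree(share, target)

    # Keep only the newest `keep` revisions; names sort chronologically.
    revisions = sorted(p for p in usb_mount.glob("stage2-*") if p.is_dir())
    for old in revisions[:-keep]:
        shutil.rmtree(old)
    return target
```

The mount check is the one design choice worth keeping in any version: without it, a forgotten mount would silently copy tens of gigabytes onto the root partition instead of the external drive.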

A “Stage3” backup is reserved for when I get a 500GB drive or something similar, so that I can just copy entire disk images over to the drive. This is the ultimate form of backup because it preserves everything exactly.

I’m not sure what a “Stage4” backup would be. Perhaps a copy of every single hard drive that I own. That would be several terabytes large.

KDE and Gnome required a full night of compiling each. Then, I realized that I had forgotten to enable the xinerama USE flag for KDE, so I had to recompile a whole bunch of packages again. After I did so, however, TwinView worked very well. My desktop is set up much like my old one. However, KDE is much more responsive thanks to excellent kernel tweaks courtesy of kamikaze-sources. I was originally using viper-sources, as recommended by the Conrad Installation Guide, but they were a bit old (2.6.21 instead of 2.6.22). The kamikaze-sources also provided some tweaks for CFQ, which was nice. I tried using the Dynamic Ticks feature of the kernel, but found that it slowed down my system quite a bit. It also kept Compiz from working properly with games (the frame rate dropped to around 20 FPS). I am also using the SLUB allocator right now, which seems to have some improvements over the existing SLAB model.

So far, I have had only two major issues. DBUS didn’t seem to want to work correctly. It was failing when Compiz tried to load (which might have been what kept Compiz from running smoothly), and it also wasn’t playing nicely with HAL, so my drives were not being automounted. I later discovered that the security policy was contradicting itself: it first instructed DBUS to allow certain connections and then later denied them. This was a quick fix; I just needed to stick comment tags around the offending block.

The other issue involved Gamin, a file alteration monitor. Apparently, the issue that I was having occurs with both Gamin and its predecessor FAM. After using Konqueror for a while, I noticed that the system started to slow down severely. I later noticed that Konqueror was showing 100% CPU usage. After some research, I discovered that the file alteration monitor, which does exactly what you might expect (it watches for changes in the filesystem), was reporting numerous errors at extremely high rates, causing my CPU usage to spike. After a few tests, I found that I was able to stop the issue by sticking three lines into /etc/gamin/gaminrc:

none /var/log/*
fsset ext2 notify
fsset ext3 notify

I didn’t need to add extra fsset lines for ntfs-3g and reiser4 because the issue mainly occurred with the EXT3 USB drive that I had borrowed to back my data up.  After fixing this problem, my CPU usage returned to something normal.

So, Gentoo seems to be running fine right now.  I still have more copying to do before my system is complete, but I’m getting there.

As for Sophos, I have more or less put that project on hold. It’s going quite well, but I still need a touch screen. I did buy a laptop keyboard, however. There is very little information about modding laptop keyboards to work with PS/2 connectors, but I have found a few resources. This blog post shows how someone connected his laptop keyboard to a spare keyboard controller from another keyboard and then used some software to read the resulting keycodes. It’ll be a bit of work (especially finding a keyboard controller with the same matrix dimensions), but I’m up for it.

Windows Vista Speech Recognition, Gentoo, etc.

I’m going to try to write this entire blog post using the speech recognition feature of Vista. The system isn’t too bad, but it needs some work. If you are using it by itself, it will be much slower than using the mouse and keyboard, but in combination with the mouse and keyboard, it is actually quite efficient. For example, I can easily start programs without having to open the Start Menu just by saying “open” and the name of the program. Then a dialog box will come up showing me the choices that I have. It’s also very easy to switch from program to program with speech recognition. I only need to say “switch to” and then the program name in order to switch to the program that I want. In terms of accuracy, it seems to be better than the speech recognition feature found in early versions of Office. With surprisingly little training, it was able to understand me quite well.

My Gentoo installation is coming out fine. I have X installed but have yet to install KDE fully. I’m quite surprised at how fast the system loads up. The Conrad installation method works fairly well, but I had to fix a couple of small problems involving linkers and shared libraries, as well as USE flags. I also had to remove a CFLAG (-freorder-blocks-and-partition) because it was causing libgmp to fail. KDE takes forever to build, so I’m going to wait until I have a lot of free time to do that.

For some reason, the Windows Vista bootloader doesn’t seem to want to load Windows XP. Perhaps I can try to trick my XP installation CD into installing a separate bootloader just for XP so that I can access both Vista and XP from GRUB. BCDEDIT is overly confusing, in my opinion, although I’m becoming much more proficient at it.

(I gave up on recognition by the last paragraph.  I’m a much faster typist.)

Huh? It isn’t working!? Ha! It’s working! @#$%! Now it’s not working!!!

Instead of doing all of the things that I probably should be doing, I have decided to wipe my main computer’s hard drives and install everything over again. My justification for doing this lies in the simple fact that my previous configuration was slow, bloated, and far too precarious for my liking.

I have twin 160 gig hard drives. In reality, that’s probably about the right size for me. I don’t really have the ton of music, videos, and other stuff that many other people have. Between all the Linux ISOs that I have downloaded and the rather small music collection that I do have, I’d say that the total amount of data that I have (not including operating system data, etc.) is about 115 gigs. Of that, my most important documents, code, and so forth probably total about 30 gigs.

Well, that’s wonderful, but how do I spread my data across my partitions? It used to be something like this:
hda
–hda1 : Windows XP, 100 gigs or so
–hda2 : Linux EXT3, 40-something gigs.
–hda3 : Linux SWAP, 1 gig.

hdb
–hdb1 : NTFS partition for storage, 120 gigs
–hdb2 : Another Windows XP installation

My older music and documents are located in hda1. Newer documents, pictures, code, etc. are located in hda2. Various videos and stuff, as well as Linux ISOs, are located in hdb1. hdb2 was for a specific project that I was working on.

In order to write to my NTFS partitions with Linux, I used the NTFS-3G driver. It’s absolutely spectacular, but sometimes, it causes my CPU usage to shoot up to 100%. So, I decided that for a new installation, I would move all of my documents and music to a Linux partition to fix this kind of problem.

hda1’s Windows XP installation was extremely slow, mainly because there were so many programs installed. So, sometime back in January, I promised myself that I would wipe my disks clean and start over.

So, I did. Or, at least, I’m trying to.

First, I borrowed a 320 gig external hard drive and backed up everything. It’s kind of sad, and kind of funny, how my entire life can fit onto a hard drive.

Next, I repartitioned my hard drives.
hda
–hda1 : /boot, 100 megs
–hda2 : SWAP, 1 gig
–hda3 : Reiser4 (!)

hdb
–hdb1 : NTFS, 10 gigs
–hdb2 : NTFS, 50 gigs
–hdb3 : EXT3

Next, I installed an nLite’d version of Windows XP onto hdb1. It’s fast and rather pleasant. It’ll be used just in case I need XP for something.

Then, I bit the bullet and installed a copy of Windows Vista. Gosh, Joey, why are you installing the operating system that you are constantly ranting against? Well, to tell you the truth, I wanted to take advantage of the incredibly powerful features of Vista’s speech recognition software.

Here’s where my title comes into play. At first, the Vista installer refused to even load from the DVD because I had two hard drives connected and had set the boot flag on hdb2. The only way to install it was to disconnect my first hard drive. I’m not sure how it’s going to fare once I reconnect that drive. Vista’s bootloader is needlessly complicated, but I might have to figure out how to use it in order to get it to boot from the second hard drive instead of the first when I chainload it from GRUB.

So finally, I got Vista installed. And I even was able to install many drivers for it. Everything was working nicely, for once! Then, Windows Update was so kind as to ask me to install some updates. I then restarted my computer and discovered that I just wasted an hour or two of my life installing Vista. Apparently, something must have gotten corrupted, because instead of being greeted by the little Windows logo, I was being greeted by a beautiful blue screen instead…

I attempted to correct the corrupted files, but I failed miserably. So, I’m reinstalling Vista now.

After I finish reinstalling Vista, I can continue with my Gentoo installation. I’m using the Conrad installation method, with a few minor changes. I have yet to recompile the toolkit. I’m waiting for a day when I can just leave my computer alone and let it compile everything. Hopefully, that day will be Thursday.

One last note: I studied the benchmark results for Reiser4 and EXT4 and discovered that Reiser4 seems to perform better in many cases. However, it does have a few disadvantages, including its huge cache size. But overall, it is an excellent filesystem. (I’m not going to go into the whole “is Hans Reiser innocent?” thing.)

I can’t wait to have my computer back up and running.

Edit: Well, looks like the error was with some nVidia drivers that were included in Windows Update. I reinstalled everything, so the OS is running smoothly now. I’m quite impressed. Now, I have to install Gentoo and hope that GRUB’s chainloading techniques will work.