GreenReaper Studios
Assorted skinning stuff from Laurence "GreenReaper" Parry

Google thinks they were working recently...

Mar 21, 2024 4:50 PM by Discussion: WinCustomize Talk

Until recently it was possible to use subdomains of the form greenreaper.wincustomize.com to get to your user page. Indeed this is the link provided on Google for "GreenReaper wincustomize".

 

However, this is now giving a 404. Perhaps the domain redirector is broken?

 

My account is indeed still there:

https://www.wincustomize.com/users/2172/GreenReaper

11 Replies Reply 4 Referrals

Now works on Firefox 3!

Dec 8, 2008 4:02 PM by Discussion: Community

I dropped by the WinCustomize Wiki yesterday to spruce things up a little. The skin now works properly on Firefox 3, and I also upgraded the wiki engine and added a few anti-bot extensions to keep the spam away.

Over time we've built up a fair amount of content, including almost 100 tutorials, but there's plenty of room for more. It's your wiki, though - what do you want to see there?

0 Replies Reply 7 Referrals

A short review of Windows Vista build 5536

Aug 27, 2006 12:02 AM by Discussion: Windows Vista

As the developer in charge of Stardock's Vista labs, I'm one of the few who gets to "play" with the new builds right away. Up until now this has meant several hours of reinstalling software over the top of a fresh install. This time I tried an upgrade from 5472 to 5536, and as it's the way many of you will be introduced to Vista I thought I'd share the results with you. I also wanted to see whether or not I agreed with blogger Robert McLaws, one of those who has been playing around with the interim builds and who has been predicting great improvements ( http://www.longhornblogs.com/robert/archive/2006/08/24/Windows_Vista_Pre_RC1_Is_Available_Now.aspx ). Is he right? Read on for my take . . .

Impressions of 5536...

The setup has started to include the "info cards" - in this case, little messages promising that you, too, can be a great director, famous (PowerPoint) presenter, or maybe even pilot the space shuttle with Windows Vista. Again, Microsoft is trying to push the "experience" on you - and giving you something to look at while you wait for its performance ratings to complete. I'm told a clean install is not that bad, but you have to wonder how many end users are going to be doing a clean install. In truth, the upgrade didn't take more than around 45 minutes for me, though I've had others say it took them over an hour. I could see upgrading from XP taking longer, if only because most upgrade candidates will have big registries and more cruft for the installer to sort through.

So what's the score once you're upgraded? On my dual-core E1505 laptop (labeled "Windows Vista Capable" by Dell, though that's pushing it for their lower models), it takes about 30 seconds from the start of the Windows booting process, 50 to the desktop and 1:00 to the Welcome screen . . . and after that it's hard to tell because other things kick in, but you can start working straight away. It's not slow, though I suspect this relies significantly on having half a gig of memory around to throw at the boot process. My total boot load was a shade over 550MB, which compares less-than-favourably with the 230MB of XP on the same laptop. Admittedly, I'm not running the Tablet PC service on that (nor will regular Vista users have to), and it trimmed about 50MB off the working set over time. Still, I wouldn't want to actually use Vista in less than 1GB, particularly since every single open window carries the cost of the DWM's buffering.

Two of the things that really do keep going are Windows Defender and the Windows Firewall. They appear to make significant disk accesses totaling (on various boots) 50-100MB in the first five minutes after the desktop loads. Security! The joke that the second core was added to check up on the first one is getting a little too close to truth, though the real cost is waiting for the disk. Good thing I got a 7200 rpm disk. On the whole, though, performance is definitely far closer to that which I'd expect from an operating system that's meant to be released in, yes, two months.

There are a few nice little user interface tweaks that make things just a little bit more friendly; for example, the way the "other logoff options" button is actually large enough to hit this time around. Progress remains to be made: on my laptop's high-DPI screen (an upgrade which I readily recommend) all the column headers were incorrectly sized, making it harder to see what was actually being displayed. If Microsoft expects its ISVs to expend the effort to solve such problems, it needs to get its own software following best practices first. And yes, Windows Media Center looked great - except I didn't have a cursor!

Some things seem set to remain obscure to most users with Vista, like how badly your disk is fragmented (requires use of a command-line tool in administrator mode), and exactly how much help that USB key is as a ReadyBoost device. Perhaps that's for the best, considering Vista's main target market, but "you don't need to know" still rankles to a techie like myself. Worse, I've heard they want to make the logon sound mandatory ( http://scobleizer.wordpress.com/2006/08/24/the-startup-sound-in-vista/ ). Guess what, Microsoft? It's our computer, and you're the guest. Learn to live with that restriction on your branding efforts, and put in a usable "off" switch, or we'll do it for you.

Drivers remain an issue, too. At least now there are drivers for most components, though some features are lacking (including Vista-compatible help for Device Manager itself). OpenGL still isn't all the way there, even though my X1400 drivers were built just 10 days ago. This is partly Microsoft's fault, because they didn't finalize a high-performance interface for OpenGL until it was almost too late to have one at all. Maybe they were concentrating too closely on DirectX 10, overlooking all the great consumer applications that make use of the competition - like, oh, Second Life, which still bombs out in this build.

That wasn't the worst flaw. On a hunch, I shut the lid. 7 seconds to sleep - not bad, though it could be better. I opened it back up, and . . . whoops. The screen powers up, but it's not showing anything, and the system is non-responsive. Scratch one for the RC - this is a laptop, it needs to be able to sleep. Others here at Stardock have had other serious problems that appear to be linked with display drivers, and it's clear we're going to continue to need to see significant improvement in this area. I'm sure the driver crews are working flat-out at NVIDIA and ATI/AMD.
Edit: As of 2 September, ATI has released drivers that fix this problem.

The company has been dragging its heels for a while, pushing little features into the product to make up for all the big names (remember WinFS?) that didn't quite make the cut. It seems they've mostly gotten past that, though you know they're going to want to spring a feature or two on us for the RC (Virtual PC Express, for a start - http://blogs.zdnet.com/BTL/index.php?p=2649 ). Speaking on behalf of the development community, I'm glad to be entering the finishing straight. We need a stable set of features to build our own programs on.

The Verdict

So are the latest builds really any better? Despite the problems, I'd have to say a qualified yes. It's about time - there's precious little of it left to fix the very real bugs that remain, let alone the "features" being forced in by Microsoft Marketing. We're going to need a release candidate out soon in order to find all the niggling little compatibility issues. That requires Vista to be solid enough for beta testers to want to use it as their main OS, and from my experiences it's not quite there yet.

Microsoft's developers have a little over two months to deliver on what they've promised - a next-generation operating system that can provide a solid base for years to come. This build is a sign that they may finally be in gear: but I worry that it may be too little, too late. Will they rise to the challenge, and deliver something they can be proud of? For all our sakes, I hope so.

21 Replies Reply 107 Referrals

What will this mean for DesktopX?

Jul 25, 2005 4:17 AM by Discussion: Windows Software
It's big news. Looks like Yahoo has decided to take a position in the widget market, and it's doing it in a big way, by giving away Konfabulator. No doubt they intend to use it as a tool to cement their web presence:

Yahoo is counting on the widgets to make users more curious about certain topics, services or events, ultimately driving more traffic to its Web site so it can serve up more moneymaking ads and expand its current base of 10.1 million subscribers who pay for premium services, Schneider said.

But what will this mean for DesktopX, and similar products? Heck, what will it mean for Konfabulator? This sort of change often means interesting goings-on in the near future . . . and for the long-term.

Of course, similar transactions have happened before. Some of you may remember eFX from a long, long time ago. It used to be a competitor to WindowBlinds, and was quite highly regarded at the time, but it never got beyond 0.40 because it was bought up by a company, and disappeared without trace after that. Then there was the story of the rise and slow decline of Winamp, bought by AOL. But perhaps it will be different this time.

Will Yahoo do any better with Konfabulator? Or will this be another example of a popular application bought by a huge corporation and then crushed by it? Can Arlo and his team keep cranking out the widgets? I guess we'll have to see . . .

At least now the picture on the front of the Konfabulator website makes sense. ;)
15 Replies Reply 20 Referrals

Browser plugins help you find skins, articles and forum posts

Jun 29, 2005 5:00 AM by Discussion: Developer Journals
Ever wanted to do a quick search for a skin? I've always found it a bit of a pain - you have to go to WinCustomize, wait for it to load, scroll down to the search box, select the library from that dropdown box, type the word in and wait for the results to come back. Wouldn't it be nice if there were fewer steps in that?

One obvious solution is the WinCustomize SkinBrowser. If you find yourself repeatedly searching through the library, it's a good idea to grab a copy of that by subscribing to WinCustomize. But what if you just want to look for a skin every few days? It's a hassle having to start the SkinBrowser up - it's probably faster to use the website, unless you want to do several searches.

Enter search plugins. These little babies integrate into the search box on your browser - you just have to select the plugin you want to use for a particular search, and then type in your search term, and it searches that library or forum for you!

(For those of you who don't yet have a browser with a search box, I advise you to get Firefox at this point.)

So, how to get these plugins? Well, you'll be glad to know I've done all the hard work for you. Just go to this page and click the appropriate links for the search plugins that you want to install. They should then show up in your search box to select. I've thrown in some for the GalCiv libraries and forums (even the GC2 ones that don't actually have a search yet), as well as one for JoeUser. If you like the idea and want one for another popular site, you might be able to find what you're looking for here.

Please note that they are not official tools and are liable to stop working if the method used to search changes. Having said that, I hope you find them helpful!

For any developers wishing to do something similar, I suggest you look at the Mozilla overview and the Apple docs.
32 Replies Reply 50 Referrals

Distributed identity, but where's the catch?

Jun 28, 2005 3:02 AM by Discussion: Virtual Communities
OpenID looks interesting. It's a method of identification without explicit registration (except on your home site - one you choose to trust, rather than one you have to trust). It's in the process of being rolled out on LiveJournal and other sites that use its software. I suspect it could come into use elsewhere, if it's simple enough to set up.

I wonder, will I someday sign myself GreenReaper of WinCustomize.com?

PS: My article about WinCustomize search plugins is more useful than this one. ;)
1 Reply Reply 32 Referrals

A Tale of Two Character Encodings

Jun 12, 2005 2:00 PM by Discussion: Software Development

This Friday I finished up some work on a new version of one of Stardock's products, which'll probably see the light shortly after the company finishes moving to Plymouth. So what do geeks do with their down-time? Well, in my case, it's often pretty much the same as what I do for money, only for the communities I'm interested in. Recently, a lot of my time has been spent on the Creatures Community, the group of people who've played the Creatures series of artificial life games. When I'm not contributing to the Creatures Wiki, I'll be writing some sort of tool, like this sprite thumbnail viewer, or polishing the next version of JRNet. But enough about my projects, as this one is actually about someone else's . . .

GEL is a genetic editor for Creatures 2. It is used to edit the genes for the various creatures (Norns, Ettins and Grendels). There are other editors, but people get attached to their favourite programs, and GEL is no different.

The trouble is, although GEL worked great on Windows 98, it didn't seem to want to work on XP. OK, so that wasn't great, but the real problem was that the source code - the words that the programmer types and gives to the compiler to turn into a program - had been lost in a hard disk accident, and so he couldn't fix the problem. Without the source code, you just have the "compiled" version, and it's very hard to make any changes to that.

There were several people upset about this, though, and it's always a shame to lose a useful program, so I decided to see if I could do something about it. Overall it took about two days of work to get it back up and running, which I thought was pretty good. I figured it would be kinda neat to tell you how I did it, and show you some of the different tools I used, so I wrote this article. Just skip over the bits that get too technical!

Let's start with what I got when I installed the program and tried starting it up myself:

OK. I got this error, and then when I pressed OK it closed on me. That didn't work out that well. I'm sure you've seen similarly confusing errors on your own computers! Turns out, it's not always easy for programmers to figure out what it means, either . . .

So, we have a problem. But where? "Path not found" isn't a very helpful message - it doesn't tell you what path, for a start! I decided this would be the first thing to try and find out, so I started up FileMon, a utility that monitors what files are accessed by running programs. I was looking for any "not found" messages, and there were a few, but they all turned out to be dead ends.

By now it was clear that it wasn't going to be as easy as a missing file or a permission problem. The next thing I tried was another of the Sysinternals tools, RegMon. This does much the same thing as FileMon - monitor what's happening - but for the Windows Registry, so you can see what settings are being written and read. I consider both of these tools essential if you want to know what's really going on.

This was the last registry read before the error

As it happens, RegMon did turn up something - the last thing that GEL read before it all started to go wrong was the main path of Creatures 2. The thing was, this registry read didn't fail. This just happened to be the last thing that it did with the registry that I couldn't narrow down to other causes. I did try modifying this value in the registry, but this just resulted in slightly different errors.

After that, I briefly tried using another utility called API Monitor in order to see what calls the program was making to the operating system. This program is rather like a general version of RegMon and FileMon - while they monitor specific things, API Monitor "hooks" pretty much every system function that there is and records their use. Unfortunately, I couldn't find what I was looking for; I later found out that it didn't even start sending messages until a window had been created.

A small aside . . .

One thing I did notice using API Monitor was the amazing number of calls that were made - almost ten thousand of them - just to display an error box on the screen.

Of course, it wasn't just displaying the box, it was:

  • Loading the code to process the error
  • Looking up the error message
  • Playing the error sound, which meant:
    • Loading sound libraries
    • Checking what sound devices were available
    • Figuring out what sound was the error sound
    • Loading and playing the sound
  • Loading up the screen reader library in case it needed to read the message to me
  • Lots of other things that might have been useful if it had actually done anything after showing the error

There's a simple reason CPUs keep getting faster - they have to, because if they didn't, there's no way we'd be able to use all the things our computers do for us! The really crazy thing is that displaying such a box (with all the above features) just takes one line of code; something like this:

MsgBox "Hi there!" & vbNewLine & "This is the message", vbInformation, "This is the title!"

Truly, we live in an age of wonders.

To recap: I'd found it wasn't a case of failed registry entries or a file not being there. It was time to bring out the big guns.

My first tool is recognizable to pretty much all Windows programmers, even if they don't use it themselves - Microsoft's Visual Studio. This is the number one tool for Windows development, and although it has its detractors, it's pretty good as development environments go. I would use this to run the program and stop it halfway, examining and changing the memory that it used.

The second might be a little less familiar to most programmers - IDA, the Interactive Disassembler. A disassembler is a program that turns compiled programs back one step into "assembly code", the last point at which it can be considered remotely readable. Few programmers actually write code at this level - most use a higher-level language like C++, Pascal, Java or Visual Basic - but it is usually possible to get a good idea of how parts of a program work through reading them in assembly.

Disassembling programs (also known as reverse-engineering) is something of a shady activity - one of IDA's most popular uses (though not one they advertise) is to figure out how to get around serial code checks, and this is one reason why disassembly is forbidden in most software licenses. However, all tools have their uses, and when you need to know exactly what a program is doing in order to fix it, but don't have the source, a good disassembler is a requirement.

Anyway, I started the program running in the Visual Studio debugger - a mode in which you can control exactly how a program executes, and modify the variables it is using - and ran through the code to see where the problem occurred. It was pretty easy to see what part the error was in - a file called glsupcts.dll that came with the program. To see exactly what the code did, I set IDA running on it; after a few minutes it had an assembly listing of the code ready for me to read.

A note about DLLs

DLLs are not that much different from EXEs - they're both files that contain "code" (and sometimes other things like icons or embedded sounds). The main difference is that the EXE files contain the bit of code that starts the whole thing going, whereas DLLs tend to get called up by those EXEs to do their share of the work.

Of course, the assembly code wasn't actually all that easy to read. Something that made it even more difficult was that the program had been written in Visual Basic, a language that I like which has a very easy to use system of programming, but which is often more general than required. As a result, it often did things in an odd way, and the code made a lot of calls to functions in the Visual Basic library. Of course, since these library calls were not documented, I ended up having to decompile this library as well, just to figure out what the program was doing! Hopefully nobody from Microsoft who cares is reading this.

Reading through the IDA output, I found the check for the registry value just before the error occurred. It certainly seemed these were linked in some way. Then I found a reference to "AllChemicals.str", a file that contains the names of chemicals in Creatures. It made sense that GEL would try to load this file, so that it knew what each of the chemicals was called!

Now I had a clue - since I knew from reading the FileMon output that it never actually managed to load that file, it was probably failing while trying to. Using Visual Studio to look at the memory when the program crashed, I saw there was something odd about the path it had given to the "open file" function. It started off fine, but the end didn't look at all right. Here was my problem!

The system had used part of the memory given to it to work out the path (see the end for details), and GEL had thought this was part of the path itself. It was all clear now - the buffer was not being trimmed of the working copy, and this was getting left after the path name, so when the program put "AllChemicals.str" on the end, the middle of the path was invalid. This was the reason it wasn't showing up on FileMon - it didn't even get to the point where it looked for the file on the disk.

So what could I do? Well, I knew it was trimming off the last part of the string - the trouble was, it thought it was twice as big as it actually was, so it was keeping twice as much as it should. The length had to be stored as a number somewhere. Eventually I found the number being returned from a call to a function called vbaLenBstr - which naturally calculated the length of the incorrect path. Now I just needed to divide it by two and it would only use the correct portion of the string.

Remembering my computer operations, I knew that the best way to divide by two was to shift the number to the right. What does this mean? Well, you can think of numbers inside a computer as being like a group of people all standing in a row, with flags with numbers on - starting from the right, they'd go 1, 2, 4, 8, 16 . . . all the powers of 2. When you shift right, the people all look at whoever's holding the next-highest flag, and do what they're doing. It looks like this:

Flag num 128  64  32  16   8   4   2   1
Before:    1   0   1   0   1   0   1   1 = 128 + 32 + 8 + 2 + 1 = 171
After:     0   1   0   1   0   1   0   1 = 64 + 16 + 4 + 1 = 85
Voila - division by two! Of course, you lose any remainder, since there's no 0.5 flag. Fortunately there's no such thing as half a character.
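For anyone who wants to try the flag trick themselves, here is a minimal sketch in Python (not the assembly used in the actual patch). The function name halve is made up for illustration; the >> operator is the language's built-in right shift.

```python
def halve(length: int) -> int:
    """Divide by two by shifting every bit one place to the right.

    Hypothetical helper for illustration only; any remainder is dropped.
    """
    return length >> 1


before = 0b10101011      # flags 128, 32, 8, 2 and 1 are raised: 171
after = halve(before)    # every flag moves one place to the right: 85

print(f"Before: {before:08b} = {before}")
print(f"After:  {after:08b} = {after}")
```

The lost remainder is the same one described above: 171 is odd, so the "1" flag simply falls off the end.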

Of course, I'm not a whiz at assembly, so I had to look up exactly how to do the shift - I actually found the one I needed elsewhere in the code, so I could just use that. Now I had my instruction, and I knew where it had to go. It should be simple from here, right?

Well, no. The trouble is, you can't just add another instruction to the middle of a compiled program, moving all the others along. It would be like rearranging pages in a book and not updating the index (which is regenerated each time you "compile" a book). Worse, since machine instructions usually take more than one byte, moving them means instructions would start in the wrong place, changing their whole meaning - imagine what would happen if you kept all the spaces in a book in the same place but moved each letter along one position! When things get out of order in a computer, programs crash.

One thing I could have done would be to overwrite what was there already (perhaps something that didn't matter much). It had been enough trouble figuring out what one piece of assembly did, though - I didn't want to have to go through that all over again!

Fortunately, I didn't have to try that, because there was a convenient area of NOP instructions nearby. NOP stands for "no op" - it's an instruction that does nothing but move onto the next instruction. This seems useless, but it can in fact be useful for various things.

In this case, it was useful because it meant I had some space to work with. Because this space was free, I could fill it with more code. I needed to add just one instruction, but to get to that instruction I needed to put a jump instruction in. I looked these up, and it turned out the one I needed was a whole five bytes long, including the place to jump to.

That meant I had to move the code it replaced down into the section of NOPs as well, after my right shift. I then needed to jump back up to the point after the first jump instruction, so that the code could continue as if nothing had happened.
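The jump-out-and-back trick can be sketched roughly like this in Python, working on a plain array of bytes. This is a simplified illustration, not the actual patch applied to GEL: the function names, offsets and sample bytes are all made up. The only real-world details are the standard x86 encodings (E9 for a five-byte relative jump, 90 for NOP, D1 E8 for "shr eax, 1"). It also assumes the displaced bytes are whole instructions that don't care where they live in memory.

```python
import struct

JMP_REL32 = 0xE9                  # x86: jump by a signed 32-bit relative offset
NOP = 0x90                        # x86: do nothing
SHR_EAX_1 = bytes([0xD1, 0xE8])   # x86: shift EAX right by one bit


def jump(from_offset: int, to_offset: int) -> bytes:
    """Encode a five-byte jump; the offset counts from the end of the jump."""
    return bytes([JMP_REL32]) + struct.pack("<i", to_offset - (from_offset + 5))


def patch_via_nops(code: bytearray, site: int, displaced_len: int,
                   cave: int, extra: bytes) -> None:
    """Insert `extra` before the code at `site` without moving anything else.

    Hypothetical helper: `site` is where the new instruction is wanted,
    `displaced_len` is how many bytes of whole instructions get moved out
    of the way, and `cave` is the start of a run of NOPs.
    """
    if displaced_len < 5:
        raise ValueError("need at least five bytes to hold the jump")
    payload_len = len(extra) + displaced_len + 5
    if cave + payload_len > len(code):
        raise ValueError("NOP area runs off the end of the code")
    if any(b != NOP for b in code[cave:cave + payload_len]):
        raise ValueError("NOP area is too small")

    displaced = bytes(code[site:site + displaced_len])

    # 1. Replace the original instructions with a jump into the NOP area,
    #    padding any leftover bytes so the following code stays aligned.
    code[site:site + displaced_len] = (
        jump(site, cave) + bytes([NOP]) * (displaced_len - 5)
    )

    # 2. Fill the NOP area: new instruction, the displaced code, jump back.
    back_from = cave + len(extra) + displaced_len
    code[cave:cave + payload_len] = (
        extra + displaced + jump(back_from, site + displaced_len)
    )


# Made-up example: two 3-byte instructions, a RET, then 16 spare NOPs.
code = bytearray(bytes.fromhex("8945fc8b4df8c3") + bytes([NOP]) * 16)
patch_via_nops(code, site=0, displaced_len=6, cave=7, extra=SHR_EAX_1)
print(code.hex(" "))
```

Note that the file stays exactly the same size, which is the whole point: nothing after the patch moves, so nothing else needs to be updated.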

So, after that, the big question is did it work? . . . Yes! It finally loads!

For those who've read this far, congratulations! I hope you found this little view into my world educational.

This is a bit more than you'd usually have to do when debugging a problem, but it's pretty representative of what most programmers do in real life - it's not all fast cars, mansions and stock options! Sure, you don't usually have to go as hardcore as writing assembly-code patches for broken DLLs (I'm sure I'm going to get nasty comments from the real hardcore folks out there who do stuff like this every day), but a lot of the time you're figuring out problems with existing code, not just writing new code.

Often it's not our code, either - it'll be written by someone else (who left six months ago) in a way that seems totally nonsensical. Sometimes you're right to think that, other times you just don't understand it yet; either way, you have to fix it, and probably add a few new things, too! Ahh, well, all in a day's work . . .

Bonus! (warning: very techy)

If the program was buggy, why did it work on Windows 98? Well, the difference is the way in which the operating system works with text. On Windows 98, the standard is to have one byte per character (an "ANSI" code page), which is nice and fast, but which means you only get 256 different characters to choose from at any one moment; not enough for many languages. In Windows XP, the standard (called UTF-16) is to have two bytes, which gives you 65536 characters, which is good enough for most purposes.

Asking for a value from the registry is done by preparing a memory buffer - an area of memory to hold the result - and telling the operating system how you want the data. Because GEL had to work on both operating systems, it used the Windows 98 method, asking for an ANSI string (step 1 on the text diagram), and expanded the text later (step 2). But on Windows XP, this meant that the operating system had to reduce the size of the text, since it stored everything in UTF-16.

What was the easiest way to do that? Why, to read the output into a buffer and then collapse it down to ANSI by copying each character back just the right amount (computers rarely "move" data, since it costs twice the time it takes just copying it - deletion is an extra step) - first 0 steps, then 1, then 2 . . .

Where should this conversion take place? Since the memory space to store the text in was already available, it used that, safe in the knowledge that any extra text in the buffer should be ignored by the program, which was told how much had been returned. At least, that's how it should have worked . . . but instead, the program used the whole string and just counted the length of it rather than relying on the value it was given. On Windows 98, that space was never filled with text, as it was ANSI to start with. Although the bug was still there, it had no actual effect, since counting the length of the text came up with the right answer.

By the way, this explains why the "extra" text is quadruple-spaced - it was UTF-16 to start with, and it was left there because it was the second half of the text, which was not overwritten by the ANSI version of itself. It was then incorrectly read as part of the ANSI string, and expanded again by GEL, ending up at four bytes per character.

The irony is that if it had been read straight from the registry as UTF-16, no conversion would have been necessary, and the application would probably have worked. Such are the ways of code!
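To make the leftover-text problem concrete, here is a simplified model of it in Python. It is only a sketch: none of this is the real Windows or GEL code, all three function names are invented, and the "measure the whole buffer" step is an assumption standing in for whatever GEL actually did (modelled here as trimming the empty padding). It does show how collapsing two-byte text in place leaves the second half of the working copy behind.

```python
def query_single_byte(text: str, buffer: bytearray, stored_as_utf16: bool) -> int:
    """Hypothetical stand-in for the registry read.

    Fills `buffer` with one-byte-per-character text and returns how many
    bytes of it are valid.
    """
    if 2 * len(text) > len(buffer):
        raise ValueError("buffer too small for the working copy")
    if stored_as_utf16:
        # XP-style: the two-byte working copy lands in the caller's buffer...
        wide = text.encode("utf-16-le")
        buffer[:len(wide)] = wide
        # ...and is collapsed in place: character i is copied back i bytes.
        for i in range(len(text)):
            buffer[i] = buffer[2 * i]
    else:
        # 98-style: the text was one byte per character all along.
        narrow = text.encode("latin-1")
        buffer[:len(narrow)] = narrow
    return len(text)


def careful_read(buffer: bytearray, count: int) -> str:
    """Trust the length that was returned."""
    return bytes(buffer[:count]).decode("latin-1")


def buggy_read(buffer: bytearray) -> str:
    """Ignore the returned length and measure the buffer instead."""
    return bytes(buffer).rstrip(b"\x00").decode("latin-1")


path = "C:\\C2\\"   # made-up install path

for label, utf16 in (("98", False), ("XP", True)):
    buf = bytearray(32)
    count = query_single_byte(path, buf, utf16)
    print(label, repr(careful_read(buf, count)), repr(buggy_read(buf)))
```

In the "98" case both reads agree, so the bug stays hidden. In the "XP" case the buggy read comes back nearly twice as long, with the tail of the old two-byte text stuck on the end, which is where "AllChemicals.str" would then get attached.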

8 Replies Reply 62 Referrals

A Tale of Two Character Encodings

Jun 12, 2005 2:00 PM by Discussion: Software Development

This friday I finished up some work on a new version of one of Stardock's products, which'll probably see the light shortly after the company finishes moving to Plymouth. So what do geeks do with their down-time? Well, in my case, it's often pretty much the same to what I do for money, only for the communities I'm interested in. Recently, a lot of my time has been spent on the Creatures Community, the group of people who've played the Creatures series of artificial life games. When I'm not contributing to the Creatures Wiki, I'll be writing some sort of tool, like this sprite thumbnail viewer, or polishing the next version of JRNet. But enough about my projects, as this one is actually about someone else's . . .

GEL is a genetic editor for Creatures 2. It is used to edit the genes for the various creatures (Norns, Ettins and Grendels). There are other editors, but people get attached to their favourite programs, and GEL is no different.

The trouble is, although GEL worked great on Windows 98, it didn't seem to want to work on XP. OK, so that wasn't great, but the real problem was that the source code - the words that the programmer types and gives to the compiler to turn into a program - had been lost in a hard disk accident, and so he couldn't fix the problem. Without the source code, you just have the "compiled" version, and it's very hard to make any changes to that.

There were several people upset about this, though and it's always a shame to lose a useful program, so I decided to see if I could do something about it. Overall it took about two days of work to get it back up and running, which I thought was pretty good. I figured it would be kinda neat to tell you how I did it, and show you some of the different tools I used, so I wrote this article. Just skip over the bits that get too technical!

Let's start with what I got when I installed the program and tried starting it up myself:

OK. I got this error, and then when I pressed OK it closed on me. That didn't work out that well. I'm sure you've seen similarly confusing errors on your own computers! Turns out, it's not always easy for programmers to figure out what it means, either . . .

So, we have a problem. But where? "Path not found" isn't a very helpful message - it doesn't tell you what path, for a start! I decided this would be the first thing to try and find out, so I started up FileMon, a utility that monitors what files are accessed by running programs. I was looking for any "not found" messages, and there were a few, but they all turned out to be dead ends.

By now it was clear that it wasn't going to be as easy as a missing file or a permission problem. The next thing I tried was another of the Sysinternals tools, RegMon. This does much the same thing as FileMon - monitor what's happening - but for the Windows Registry, so you can see what settings are being written and read. I consider both of these tools essential if you want to know what's really going on.

This was the last registry read before the error

As it happens, RegMon did turn up something - the last thing that GEL read before it all started to go wrong was the main path of Creatures 2. The thing was, this registry read didn't fail. This just happened to be the last thing that it did with the registry that I couldn't narrow down to other causes. I did try modifying this value in the registry, but this just resulted in slightly different errors.

After that, I briefly tried using another utility called API Monitor in order to see what calls the program was making to the operating system. This program is rather like a general version of RegMon and FileMon - while they monitor specific things, API Monitor "hooks" pretty much every system function that there is and records their use. Unfortunately, I couldn't find what I was looking for; I later found out that it didn't even start sending messages until a window had been created.

A small aside . . .

One thing I did notice using API Monitor was the amazing number of calls that were made - almost ten thousand of them - just to display an error box on the screen.

Of course, it wasn't just displaying the box, it was:

  • Loading the code to process the error
  • Looking up the error message
  • Playing the error sound, which meant:
    • Loading sound libraries
    • Checking what sound devices were available
    • Figuring out what sound was the error sound
    • Loading and playing the sound
  • Loading up the screen reader library in case it needed to read the message to me
  • Lots of other things that might have been useful if it had actually done anything after showing the error

There's a simple reason CPUs keep getting faster - they have to, because if they didn't, there's no way we'd be able to use all the things our computers do for us! The really crazy thing is that displaying such a box (with all the above features) just takes one line of code; something like this:

MsgBox "Hi there! This is the message", vbInformation, "This is the title!"

Truly, we live in an age of wonders.

To recap: I'd found it wasn't a case of failed registry entries or a file not being there. It was time to bring out the big guns.

My first tool is recognizable to pretty much all Windows programmers, even if they don't use it themselves - Microsoft's Visual Studio. This is the number one tool for Windows development, and although it has its detractors, it's pretty good as development environments go. I would use this to run the program and stop it halfway, examining and changing the memory that it used.

The second might be a little less familiar to most programmers - IDA, the Interactive Disassembler. A disassembler is a program that turns compiled programs back one step into "assembly code", the last point at which it can be considered remotely readable. Few programmers actually write code at this level - most use a higher-level language like C++, Pascal, Java or Visual Basic - but it is usually possible to get a good idea of how parts of a program work through reading it in assembly.

Disassembling programs (also known as reverse-engineering) is something of a shady activity - one of IDA's most popular uses (though not one they advertise) is to figure out how to get around serial code checks, and this is one reason why disassembly is forbidden in most software licenses. However, all tools have their uses, and when you need to know exactly what a program is doing in order to fix it, but don't have the source, a good disassembler is a requirement.

Anyway, I started the program running in the Visual Studio debugger - a mode in which you can control exactly how a program executes, and modify the variables it is using - and ran through the code to see where the problem occurred. It was pretty easy to see which part the error was in - a file called glsupcts.dll that came with the program. To see exactly what the code did, I set IDA running on it; after a few minutes it had an assembly listing of the code ready for me to read.

A note about DLLs

DLLs are not that much different from EXEs - they're both files that contain "code" (and sometimes other things like icons or embedded sounds). The main difference is that the EXE files contain the bit of code that starts the whole thing going, whereas DLLs tend to get called up by those EXEs to do their share of the work.

Of course, the assembly code wasn't actually all that easy to read. Something that made it even more difficult was that the program had been written in Visual Basic, a language that I like and which has a very easy-to-use system of programming, but which is often more general than required. As a result, it often did things in an odd way, and the code made a lot of calls to functions in the Visual Basic library. Of course, since these library calls were not documented, I ended up having to decompile this library as well, just to figure out what the program was doing! Hopefully nobody from Microsoft who cares is reading this.

Reading through the IDA output, I found the check for the registry value just before the error occurred. It certainly seemed these were linked in some way. Then I found a reference to "AllChemicals.str", a file that contains the names of chemicals in Creatures. It made sense that GEL would try to load this file, so that it knew what each of the chemicals was called!

Now I had a clue - since I knew from reading the FileMon output that it never actually managed to load that file, it was probably failing while trying to. Using Visual Studio to look at the memory when the program crashed, I saw there was something odd about the path it had given to the "open file" function. It started off fine, but the end didn't look at all right. Here was my problem!

The system had used part of the memory given to it to work out the path (see the end for details), and GEL had thought this was part of the path itself. It was all clear now - the buffer was not being trimmed of the working copy, and this was getting left after the path name, so when the program put "AllChemicals.str" on the end, the middle of the path was invalid. This was the reason it wasn't showing up on FileMon - it didn't even get to the point where it looked for the file on the disk.

So what could I do? Well, I knew it was trimming off the last part of the string - the trouble was, it thought it was twice as big as it actually was, so it was keeping twice as much as it should. The length had to be stored as a number somewhere. Eventually I found the number being returned from a call to a function called vbaLenBstr - which naturally calculated the length of the incorrect path. Now I just needed to divide it by two and it would only use the correct portion of the string.

Remembering my computer operations, I knew that the best way to divide by two was to shift the number to the right. What does this mean? Well, you can think of numbers inside a computer as being like a group of people all standing in a row, with flags with numbers on - starting from the right, they'd go 1, 2, 4, 8, 16 . . . all the powers of 2. When you shift right, the people all look at whoever's holding the next-highest flag, and do what they're doing. It looks like this:

Flag num 128  64  32  16   8   4   2   1
Before:    1   0   1   0   1   0   1   1 = 128 + 32 + 8 + 2 + 1 = 171
After:     0   1   0   1   0   1   0   1 = 64 + 16 + 4 + 1 = 85

Voila - division by two! Of course, you lose any remainder, since there's no 0.5 flag. Fortunately there's no such thing as half a character.
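
The same shift can be tried out in a few lines of Python. This is only an illustration of the arithmetic (the real fix was a single assembly instruction), and the helper name shift_right is made up for the example:

```python
def shift_right(value, places=1):
    """Shift the bits of a non-negative integer right, dropping the low bits."""
    return value >> places

before = 0b10101011            # 171: the 128, 32, 8, 2 and 1 flags are up
after = shift_right(before)    # every flag copies its next-highest neighbour

print(f"{before:08b} = {before}")
print(f"{after:08b} = {after}")
```

The lowest flag simply falls off the end, which is where the remainder goes.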

Of course, I'm not a whiz at assembly, so I had to look up exactly how to do the shift - I actually found the one I needed elsewhere in the code, so I could just use that. Now I had my instruction, and I knew where it had to go. It should be simple from here, right?

Well, no. The trouble is, you can't just add another instruction to the middle of a compiled program, moving all the others along. It would be like rearranging pages in a book and not updating the index (which is regenerated each time you "compile" a book). Worse, since machine instructions usually take more than one byte, moving them means instructions would start in the wrong place, changing their whole meaning - imagine what would happen if you kept all the spaces in a book in the same place but moved each letter along one position! When things get out of order in a computer, programs crash.
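
To see why shuffling bytes along is so dangerous, here is a rough sketch using a completely made-up instruction set (the opcodes, names and lengths below are hypothetical and have nothing to do with real x86). A jump in the program points at byte 4; once one extra byte is inserted earlier on, byte 4 is no longer the start of the instruction it used to be:

```python
# Hypothetical toy instruction set: opcode byte -> (name, total length in bytes)
TOY_ISA = {0x01: ("LOAD", 2), 0x02: ("ADD", 2), 0x03: ("JUMP", 2),
           0x04: ("RET", 1), 0x05: ("SHR", 1)}

def decode(code, start=0):
    """Read instruction names from `start` to the end of the code."""
    names = []
    pos = start
    while pos < len(code):
        name, length = TOY_ISA.get(code[pos], ("???", 1))
        names.append(name)
        pos += length
    return names

# LOAD 0x2A; JUMP to byte 4; ADD 0x01; RET
program = bytes([0x01, 0x2A, 0x03, 0x04, 0x02, 0x01, 0x04])

# Squeeze a one-byte SHR in after the LOAD, pushing everything else along by one
shifted = program[:2] + bytes([0x05]) + program[2:]

print(decode(program, 4))   # what the jump used to land on
print(decode(shifted, 4))   # what it lands on now
```

In the shifted version the jump lands on what used to be its own operand byte, which happens to decode as a different instruction entirely.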

One thing I could have done would be to overwrite what was there already (perhaps something that didn't matter much). It had been enough trouble figuring out what one piece of assembly did, though - I didn't want to have to go through that all over again!

Fortunately, I didn't have to try that, because there was a convenient area of NOP instructions nearby. NOP stands for "no op" - it's an instruction that does nothing but move on to the next instruction. This seems useless, but it can in fact be useful for various things.

In this case, it was useful because it meant I had some space to work with. Because this space was free, I could fill it with more code. I needed to add just one instruction, but to get to that instruction I needed to put a jump instruction in. I looked these up, and it turned out the one I needed was a whole five bytes long, including the place to jump to.

That meant I had to move the code it replaced down into the section of NOPs as well, after my right shift. I then needed to jump back up to the point after the first jump instruction, so that the code could continue as if nothing had happened.
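
The shape of that patch can be sketched in Python on a toy block of bytes. Everything specific here is an assumption for illustration, not the actual GEL patch: the byte encodings are the usual 32-bit x86 ones (E9 plus a four-byte offset for the jump, D1 E8 for "shr eax, 1", 90 for NOP), the offsets are invented, and the five "displaced" bytes are stand-ins. In a real patch the displaced bytes must be whole instructions that still work when run from a different address.

```python
import struct

NOP = 0x90
SHR_EAX_1 = bytes([0xD1, 0xE8])   # x86 encoding of "shr eax, 1"
JMP_SIZE = 5                      # opcode E9 plus a 4-byte relative offset

def jmp_rel32(src, dst):
    """Encode a jump placed at offset `src` that lands on offset `dst`."""
    # The offset is measured from the end of the jump instruction itself
    return b"\xE9" + struct.pack("<i", dst - (src + JMP_SIZE))

def patch_with_cave(code, hook, cave):
    """Detour the 5 bytes at `hook` through a run of NOPs starting at `cave`."""
    displaced = code[hook:hook + JMP_SIZE]
    jump_back_at = cave + len(SHR_EAX_1) + len(displaced)
    body = SHR_EAX_1 + displaced + jmp_rel32(jump_back_at, hook + JMP_SIZE)
    region = code[cave:cave + len(body)]
    if len(region) < len(body) or any(b != NOP for b in region):
        raise ValueError("not enough NOPs at the chosen spot")
    patched = bytearray(code)
    patched[hook:hook + JMP_SIZE] = jmp_rel32(hook, cave)   # jump out to the cave
    patched[cave:cave + len(body)] = body                   # shift, old code, jump back
    return bytes(patched)

# Toy "program": five one-byte stand-in instructions, a RET, then 16 NOPs
original = bytes([0x40, 0x41, 0x42, 0x43, 0x46, 0xC3]) + bytes([NOP] * 16)
patched = patch_with_cave(original, hook=0, cave=6)
print(patched.hex(" "))
```

The file stays exactly the same length, so nothing else in it has to move.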

So, after that, the big question is did it work? . . . Yes! It finally loads!

For those who've read this far, congratulations! I hope you found this little view into my world educational.

This is a bit more than you'd usually have to do when debugging a problem, but it's pretty representative of what most programmers do in real life - it's not all fast cars, mansions and stock options! Sure, you don't usually have to go as hardcore as writing assembly-code patches for broken DLLs (I'm sure I'm going to get nasty comments from the real hardcore folks out there who do stuff like this every day), but a lot of the time you're figuring out problems with existing code, not just writing new code.

Often it's not our code, either - it'll be written by someone else (who left six months ago) in a way that seems totally nonsensical. Sometimes you're right to think that, other times you just don't understand it yet; either way, you have to fix it, and probably add a few new things, too! Ahh, well, all in a day's work . . .

Bonus! (warning: very techy)

If the program was buggy, why did it work on Windows 98? Well, the difference is the way in which the operating system works with text. On Windows 98, the standard is to have one byte per character (strictly an 8-bit "ANSI" code page rather than true UTF-8, but I'll call it UTF-8 here for short), which is nice and fast, but which means you only get 256 different characters to choose from at any one moment; not enough for many languages. In Windows XP, the standard (called UTF-16) is to have two bytes per character, which gives you 65536 characters, which is good enough for most purposes.
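
The size difference is easy to see in Python. In this small sketch the path is a made-up example, and Python's "cp1252" codec stands in for the one-byte-per-character text that Windows 98 programs used:

```python
text = "C:\\Creatures 2\\"           # hypothetical path, for illustration only

narrow = text.encode("cp1252")      # one byte per character
wide = text.encode("utf-16-le")     # two bytes per character for text like this

print(len(text), len(narrow), len(wide))
print(wide[:4])                     # every other byte is zero for plain English text
```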

Asking for a value from the registry is done by preparing a memory buffer - an area of memory to hold the result - and telling the operating system how you want the data. Because GEL had to work on both operating systems, it used the Windows 98 method, asking for a UTF-8 string (step 1 on the text diagram), and expanded the text later (step 2). But on Windows XP, this meant that the operating system had to reduce the size of the text, since it stored everything in UTF-16.

What was the easiest way to do that? Why, to read the output into a buffer and then collapse it down to UTF-8 by copying each character back just the right amount (computers rarely "move" data, since it costs twice the time it takes just copying it - deletion is an extra step) - first 0 steps, then 1, then 2 . . .

Where should this conversion take place? Since the memory space to store the text in was already available, it used that, safe in the knowledge that any extra text in the buffer should be ignored by the program, which was told how much had been returned. At least, that's how it should have worked . . . but instead, the program used the whole string and just counted the length of it rather than relying on the value it was given. On Windows 98, that space was never filled with text, as it was UTF-8 to start with. Although the bug was still there, it had no actual effect, since counting the length of the text came up with the right answer.
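
Here is a toy model of that behaviour in Python. It is a sketch under assumptions, not the real registry API: the function read_path, the buffer size, the path and the way the buggy caller counts its length (everything up to the trailing zero padding) are all invented to show the idea.

```python
def read_path(path, wide_system, buffer_size=32):
    """Toy model of a one-byte registry read. Returns (buffer, reported_length)."""
    buf = bytearray(buffer_size)
    narrow = path.encode("cp1252")
    if wide_system:
        wide = path.encode("utf-16-le")
        buf[:len(wide)] = wide       # the two-byte text lands in the caller's buffer first
    buf[:len(narrow)] = narrow       # then the one-byte version is written over the front
    return bytes(buf), len(narrow)

def buggy_length(buf):
    """Ignore the reported length and count everything before the zero padding."""
    return len(buf.rstrip(b"\x00"))

path = "C:\\C2\\"                     # hypothetical six-character path

buf98, len98 = read_path(path, wide_system=False)
bufxp, lenxp = read_path(path, wide_system=True)

print(len98, buggy_length(buf98))    # both agree when nothing was left behind
print(lenxp, buggy_length(bufxp))    # the leftovers nearly double the count
print(bufxp[:lenxp])                 # trusting the reported length gives the real path
```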

By the way, this explains why the "extra" text is quadruple-spaced - it was UTF-16 to start with, and it was left there because it was the second half of the text, which was not overwritten by the UTF-8 version of itself. It was then incorrectly read as part of the UTF-8 string, and expanded again by GEL into UTF-32.
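
The quadruple spacing can be reproduced in Python too. This is just a sketch of the byte arithmetic, with "latin-1" standing in for the one-byte reading of the leftover text and "C2" as a sample fragment:

```python
leftover = "C2".encode("utf-16-le")      # two bytes per character, left in the buffer
misread = leftover.decode("latin-1")     # each byte wrongly taken as its own character
widened = misread.encode("utf-16-le")    # expanded again: four bytes per original character

print(leftover)
print(widened)
print(widened == "C2".encode("utf-32-le"))
```

For plain English text the double expansion ends up byte-for-byte the same as UTF-32, which is why the leftovers look so spread out in memory.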

The irony is that if it had been read straight from the registry as UTF-16, no conversion would have been necessary, and the application would probably have worked. Such are the ways of code!

8 Replies Reply 62 Referrals

Join the Stardock/WinCustomize.com Folding@Home team today!

Dec 19, 2004 3:41 PM by Discussion: Personal Computing
Does your computer sit there doing nothing most of the time? Worried that using a $1000 computer just for web-browsing is a bit wasteful? Are you thinking that SETI@Home or distributed.net may be fruitless endeavours? Well, you just might be interested in Folding@Home! Have your computer spend its time doing some useful calculations for medical research (see the FAQ) rather than letting those cycles go to waste, and download a client now!

I've made a Stardock/WinCustomize.com team - all you have to do is enter team ID number 41029 when asked by the Folding@Home client. And that's it! It will automatically get new blocks of work from the internet every so often and send results back, but you shouldn't have to touch it unless you want to. I chose to have it installed as a service, and the only way I know it's running is that the CPU is at 100% or thereabouts all the time. It's all idle use, so it's not stealing the cycles from anything I want to run, and I have the satisfaction of knowing that my computer is working on something useful . . . even when I'm asleep. Give it a go!
41 Replies Reply 45 Referrals

 