Projects: Disassembling and Debugging
Assembly and C Programs that
Display ASCII Characters Onscreen
(Exercises in x86 Assembly Language and MS-DEBUG)


Here I'll explain how to use DEBUG to disassemble and step through a few 8086 Assembly programs, and also comment on how complex the code produced by "C compilers" can become compared with small .COM programs, whose authors often write the Machine code themselves in Assembly Language.

If you have no experience whatsoever with DEBUG, I suggest that you first work through my Guide to DEBUG (make sure to work on the DEBUG program listed under the ENTER [e] command as it concerns displaying all the Extended ASCII characters on screen), and then study the Detailed Step-by-step Analysis of the EICAR Program to gain experience in using more DEBUG commands before finally coming back to this page.

Since our first Assembly program is only 69 bytes long, you can simply "Copy and Paste" the following Enter Data commands into DEBUG:

e 100 b8 00 02 ba 00 00 b9 16 00 50 52 51 e8 13 00 59
e 110 5a 58 cd 21 42 50 52 51 e8 15 00 59 5a 58 e2 e9
e 120 eb 1e b8 00 02 ba 2e 00 b9 03 00 cd 21 e2 fc c3
e 130 e8 ef ff ba 3b 01 b4 09 cd 21 c3 0d 0a 24 90 90
e 140 b8 00 4c cd 21
Then type these commands at the DEBUG prompts to create the program file called disp22.com (it will be created in the same folder DEBUG was started from):
-n disp22.com
-rcx
CX 0000
:45             [ 69 bytes in decimal ]
-w
-q
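
If pasting into DEBUG is awkward on your system, the same 69 bytes can be assembled into a file by other means. The following is only a minimal Python sketch, not part of the original exercise: the function names `build_image` and `write_com` are made up for illustration, and the script simply replays the Enter commands listed above into a byte buffer that starts at offset 100h.

```python
# The Enter commands exactly as listed above.
ENTER_COMMANDS = """\
e 100 b8 00 02 ba 00 00 b9 16 00 50 52 51 e8 13 00 59
e 110 5a 58 cd 21 42 50 52 51 e8 15 00 59 5a 58 e2 e9
e 120 eb 1e b8 00 02 ba 2e 00 b9 03 00 cd 21 e2 fc c3
e 130 e8 ef ff ba 3b 01 b4 09 cd 21 c3 0d 0a 24 90 90
e 140 b8 00 4c cd 21
"""


def build_image(commands=ENTER_COMMANDS, origin=0x100):
    """Replay DEBUG-style 'e ADDR BYTE...' lines into a .COM image.

    A .COM file is loaded at offset 100h, so address 100h is file offset 0.
    """
    image = bytearray()
    for line in commands.splitlines():
        fields = line.split()
        if not fields or fields[0].lower() != "e":
            continue  # ignore anything that is not an Enter command
        offset = int(fields[1], 16) - origin
        data = bytes(int(item, 16) for item in fields[2:])
        end = offset + len(data)
        if len(image) < end:
            image.extend(b"\x00" * (end - len(image)))
        image[offset:end] = data
    return bytes(image)


def write_com(path="disp22.com"):
    """Write the image to disk in binary mode (hypothetical helper)."""
    with open(path, "wb") as handle:
        handle.write(build_image())


if __name__ == "__main__":
    image = build_image()
    # Should report 69 bytes, i.e. the 45h that goes into CX before 'w'.
    print(len(image), "bytes =", format(len(image), "X") + "h")
```

Calling `write_com()` would create the file in the current folder; the length check is the same number you enter at the `-rcx` prompt.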
disp22.com simply displays the bytes 00h through 15h (a total of 22 characters), each on a separate line with three dots on either side of the character itself.
Note: Some of these bytes will move the cursor position or perform some other action instead of displaying a character on your screen.
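
To make the description above concrete, here is a small Python model of the byte stream disp22.com hands to DOS. It is a sketch of the output only, under the assumption that each line is exactly three dots, the byte, three dots, then a carriage return and line feed; it does not imitate how the Assembly code is organized, and the function names are invented for this example.

```python
def line_for(value):
    """One output line: three dots, the raw byte, three dots, CR LF."""
    return b"..." + bytes([value]) + b"..." + b"\r\n"


def disp22_output():
    """All 22 lines, for the bytes 00h through 15h."""
    return b"".join(line_for(value) for value in range(0x16))


if __name__ == "__main__":
    for value in range(0x16):
        # repr() shows control codes as escapes instead of acting on them.
        print(format(value, "02X") + "h", repr(line_for(value)))
```

Printing with `repr()` sidesteps the problem mentioned in the Note: on a real DOS screen, codes such as 07h, 08h, 09h, 0Ah and 0Dh are typically acted upon (beep, backspace, tab, line feed, carriage return) rather than drawn.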

PROJECT: Once you've run the program to see what the output looks like, use DEBUG's U (unassemble) command to disassemble it into its Assembly Language code for reference. You can make a one-pass disassembly listing quite easily by creating and saving a text file with the following lines (and then following my instructions below):

n disp22.com
l
u 100 13a
d 13b 13d
u 13e 144
q


(Make sure you press the ENTER key once or even twice after typing this 'q'; if not, DEBUG will 'lock up' waiting for a RETURN you'll never be able to enter.
NOTE: I saved you the trouble of having to figure out later that the bytes from 13b to 13d are DATA, not instructions. Commercial disassemblers often make many passes through code trying to determine the difference between Code and Data elements.)
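
As a cross-check on the script above, the short Python sketch below rebuilds the same six lines from a list of ranges, confirms that the three U/D ranges cover the program with no gaps or overlaps, and ends the text with two line breaks so the final 'q' is always followed by a RETURN. The names `RANGES`, `script_bytes` and `bytes_covered` are hypothetical, and using CR LF line endings is an assumption about what DEBUG expects from a redirected file.

```python
# (command, first address, last address) exactly as in the script above.
RANGES = [("u", 0x100, 0x13A), ("d", 0x13B, 0x13D), ("u", 0x13E, 0x144)]


def script_lines(name="disp22.com"):
    """The DEBUG script: name the file, load it, list each range, quit."""
    lines = ["n " + name, "l"]
    for command, first, last in RANGES:
        lines.append("%s %x %x" % (command, first, last))
    lines.append("q")
    return lines


def script_bytes(name="disp22.com"):
    """Script text with DOS line endings and a spare RETURN after 'q'."""
    return ("\r\n".join(script_lines(name)) + "\r\n\r\n").encode("ascii")


def bytes_covered():
    """Total bytes listed, after checking the ranges are back to back."""
    for (_, _, last), (_, first, _) in zip(RANGES, RANGES[1:]):
        assert first == last + 1, "gap or overlap between ranges"
    return sum(last - first + 1 for _, first, last in RANGES)


if __name__ == "__main__":
    print(script_bytes().decode("ascii"))
    print(bytes_covered(), "bytes covered")  # the program is 69 bytes long
```

Writing the result of `script_bytes()` to disp22.dsf in binary mode would give the same file you could have typed by hand.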

You can name this file anything you want, but I'll use the name, disp22.dsf, here. Run this Debug script file in the same folder as disp22.com from a command line prompt like this:

  C:\temp>debug < disp22.dsf > disp22.asm

which redirects the normal DEBUG screen output into the file disp22.asm. (Unfortunately, and I have no idea why, the file this creates has many spaces at the end of each line, and sometimes you need to add more RETURNS after saving it in Notepad.) Clean up the file as best you can, then try separating the Subroutines (sections of code that are pointed to by CALL instructions) from the rest of the code and data.
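
The trailing spaces can be removed by hand, or with a few lines of script. Here is a minimal Python sketch of that cleanup, assuming all you want is to trim the end of every line, drop empty lines at the very top and bottom, and keep DOS-style line endings so Notepad displays the file properly. The function name `clean_listing` and the sample line (including its segment value) are made up for illustration.

```python
def clean_listing(text):
    """Trim trailing blanks from each line of a redirected DEBUG listing."""
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop empty lines at the start and end, but keep any in the middle.
    while lines and not lines[0]:
        lines.pop(0)
    while lines and not lines[-1]:
        lines.pop()
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Hypothetical sample; the segment (0B3F) will differ on your machine.
    sample = "\r\n0B3F:0100 B80002        MOV     AX,0200          \r\n   \r\n"
    print(repr(clean_listing(sample)))
```

To apply it to the real file, read disp22.asm and write the result back with `open(path, "w", newline="")` so Python does not alter the line endings a second time.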

Open the program disp22.com in DEBUG, and try stepping through it using the T (Trace) and P (Proceed) commands (referring to your Assembly listing as you go) until you understand how it operates. [CAUTION: Always use the Proceed command to execute any 'INT' instruction, or you'll find yourself trying to trace through a huge section of the computer's DOS and BIOS code instead of just this little program!] You should be able to place some comments in your Assembly listing describing how various instructions affect the program and/or its output, or labeling the names of the BIOS or DOS INTerrupt(s) that are used to display data on the screen. You can send me questions or comments about the code using this online Form.
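
While stepping, it helps to know how the destination of a CALL, JMP or LOOP is worked out from its bytes: the 8086 stores a signed displacement that is added to the address of the next instruction. The Python sketch below does that arithmetic for the relative branches in disp22.com. The addresses and bytes come from the Enter commands given earlier; the function name `branch_target` is hypothetical, and the sketch only handles the forms that appear here (E8 with a 16-bit displacement, EB and E2 with an 8-bit one).

```python
def branch_target(address, code):
    """Destination of a relative CALL/JMP/LOOP located at `address`.

    `code` is the whole instruction: one opcode byte followed by a
    little-endian signed displacement of one or two bytes.
    """
    length = len(code)
    bits = 8 * (length - 1)
    displacement = int.from_bytes(code[1:], "little")
    if displacement >= 1 << (bits - 1):   # top bit set: negative value
        displacement -= 1 << bits
    # Offsets wrap around within the 64 KB segment.
    return (address + length + displacement) & 0xFFFF


BRANCHES = [
    (0x010C, bytes.fromhex("e81300")),  # CALL
    (0x0118, bytes.fromhex("e81500")),  # CALL
    (0x011E, bytes.fromhex("e2e9")),    # LOOP
    (0x0120, bytes.fromhex("eb1e")),    # JMP (short)
    (0x012D, bytes.fromhex("e2fc")),    # LOOP
    (0x0130, bytes.fromhex("e8efff")),  # CALL
]

if __name__ == "__main__":
    for address, code in BRANCHES:
        target = branch_target(address, code)
        print("%04X  %-6s -> %04X" % (address, code.hex().upper(), target))
```

The targets should agree with the addresses DEBUG prints after each CALL, JMP and LOOP in your listing, which is a quick way to confirm you have separated the Subroutines correctly.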

FOR EXTRA CREDIT:

The screenshot captioned below shows the output from another simple Assembly program similar to the one discussed above. But this program first displays the Hexadecimal value of each byte before attempting to display that byte's character on the screen. Can you write an Assembly program that produces the same output?

If you're interested in doing this, but have problems along the way, I'll try to help you without giving away too much of how to do it.

[Screenshot: DISP32.COM running under a .PIF file set to display in a DOS-Window of 43 lines per page using a 7 x 12 Bitmap Font.]



HINT: There are many different ways to write programs that can all produce the same output. I just happened to use LOOP and INC instructions to create the Hexadecimal numbers similar to disp22.com's routines.
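
One piece of the puzzle is turning a byte into its two Hexadecimal digits. Without giving away the Assembly itself, here is one common approach modelled in Python; it is not necessarily how DISP32.COM does it, and the function names are invented for this sketch. Each 4-bit half of the byte has 30h added to reach '0' through '9', and values above 9 get 7 more to skip the characters between '9' and 'A' in the ASCII table.

```python
def nibble_to_ascii(nibble):
    """ASCII code for one hex digit (0..15 -> '0'..'9', 'A'..'F')."""
    code = nibble + 0x30      # 0..9 land on '0'..'9' (30h..39h)
    if code > 0x39:           # 10..15 would land on ':' .. '?'
        code += 7             # so move them up to 'A'..'F' (41h..46h)
    return code


def byte_to_hex(value):
    """Two-character hex string for a byte, high digit first."""
    high = value >> 4         # like shifting right four times
    low = value & 0x0F        # like AND with 0Fh
    return chr(nibble_to_ascii(high)) + chr(nibble_to_ascii(low))


if __name__ == "__main__":
    for value in (0x00, 0x09, 0x0A, 0x15, 0xFF):
        print(byte_to_hex(value))
```

In Assembly the same steps map onto shift, AND, ADD and a compare-and-jump, after which each digit can be displayed the same way disp22.com displays its characters.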




Looking Inside a Simple C Program


The following program was created with a "C Language" Source Compiler and Linker. I want you to see how much extra code is added to what is relatively simple source code: all kinds of routines, from checking what version of DOS you are running to various Memory Allocation calls, add thousands of bytes that seem to have nothing to do with what I wrote in my Source code file.

After running Chartype.exe once or twice on your computer, open the program in NOTEPAD and under the 'Edit' menu, select 'WordWrap' before proceeding... then press the 'CTRL + END' keys to go to the end of the file where you'll see a lot of the program's text. Note that there's an extra line of text near the bottom that has nothing to do with what you saw in the program's output: "COMPAQ print scanf : floating point formats not linked" and moving up to the beginning of this text section, you'll also find a phrase that's purely for identifying the type of compiler/linker I used, "Borland C++ - Copyright 1991 Borland Intl."

If you are running Windows 9x/ME (the NOTEPAD in 2000/XP doesn't have this problem): When exiting NOTEPAD, MAKE SURE YOU click on the 'NO' button in answer to the question: Do you want to save the changes? For some weird reason, if you save it this way (even though you simply Word-wrapped the file), Notepad will convert every single 00-byte of a binary file to a space character (20h), making the executable completely useless! I suggest that all Windows 9x/ME users obtain TheGUN.exe from my FreeTools page (to replace NOTEPAD); you'll be able to open files of any size with it, and never have to worry about this word-wrap nuisance!
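
The text inside an executable can also be listed without opening it in an editor at all, which removes any chance of saving a damaged copy. The following Python sketch pulls out runs of printable ASCII characters from a block of bytes. It is an illustration only: the function name `ascii_strings` and the minimum length of 4 are arbitrary choices, and the sample bytes are a stand-in built around the Borland phrase quoted above, not a real excerpt of Chartype.exe.

```python
import re


def ascii_strings(data, min_len=4):
    """Return every run of at least `min_len` printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Stand-in data; a real run would read the file in binary mode, e.g.
    #   with open("Chartype.exe", "rb") as handle: data = handle.read()
    data = b"\x00\x00Borland C++ - Copyright 1991 Borland Intl.\x00\x03\x01"
    for text in ascii_strings(data):
        print(text)
```

Because the file is only ever opened for reading, none of its 00-bytes can be altered.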

Although it's possible to open Chartype.exe in MS-DEBUG and begin stepping through the code with the Proceed or Trace commands, you'll most likely become bored very quickly since Borland's compiler added lots of extra 'housekeeping' routines concerning DOS handles and Memory allocation right at the beginning of the code... If you really want to work through the relevant parts of this program, here's a time-saving tip: You can immediately skip to the instruction beginning at CS:0291 with the command: g 291 to bypass all that Borland stuff. But even then, there are lots of lines of code that seem quite wasteful compared to an Assembly program... Some important subroutines are found at 055D and 1A94, both of which call many other subroutines, which will put your head into a spin unless you take the time to disassemble the whole program before trying to find the very few lines of code that actually call a BIOS Video INTerrupt to display the characters on your screen! (Note: Locations 0CB9 thru 0CE8 and 0F45 thru 0F68 are all DATA locations, not CODE, even though they are found in the Code Section of the program! Anything found in the Data Section should always be DATA though.) There are at least five different video functions used in this program (due to some 'convoluted' programming there are actually others), and they're all found in one subroutine that's 161 bytes long.
Can you tell me where it's located and/or what the five explicit video functions are called?

If you have any questions about these programs or discussions, please use my online feedback form here: Comments/Questions for The Starman.

[ The Starman. Revised: 27 OCT 2001.]
Last Update: 27 JUN 2003.


