Ben Collver
2025-02-24 01:03:51 UTC
============================
SYS.COM corrupted a NetDrive image. But why?
Posted: 2025-02-22
Tags: DOS, NetDrive, ForgotToCheckReturnCode
In ye olden days to make a diskette bootable you had to format it
using the /s option of the FORMAT command. That works fine for blank
disks, but software vendors had a small problem--they would sell you
a disk with their software on it but they couldn't include the DOS
files needed to make it bootable because they were not selling you
DOS. To get around this they would leave space available on the disk
and have you use the SYS command, which copied the magic boot loader
code from your DOS disk onto their disk. There were some
restrictions on where the free space was located and how much was
required, but it generally worked to allow you to make a diskette
bootable without having to format it.
Last year somebody reported a problem with the DOS 3.3 SYS.COM
command when used with NetDrive. They started with a valid FAT12
image, ran SYS.COM to make it bootable, and then they were not able
to mount the image using NetDrive again. Running SYS.COM against the
image had broken something.
Besides copying the operating system's hidden files to the target
drive letter, SYS.COM also copies some boot code into the first
sector of the disk. In general it does not make sense to run it
against a NetDrive image because you already had to boot DOS to mount
the image, but it should not hurt anything. So I decided to have a
look at what was going on.
The first step was to recreate the problem. I created a 10MB FAT12
disk image using NetDrive. The first few bytes of the first sector
(the volume boot record) are shown below:
The FAT12 NetDrive image when first created:
00000000 EB 3C 90 4E 45 54 44 52 49 56 45 00 02 08 01 00 .<.NETDRIVE.....
00000010 02 00 02 00 50 F8 60 00 00 00 00 00 00 00 00 00 ....P.`.........
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<... snip ...>
000001E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000001F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000200 F8 FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Dissecting that we find:
Offset  Bytes                    Description
------  -----------------------  ------------------------------------
0x00    EB 3C 90                 Jump to executable code
0x03    4E 45 54 44 52 49 56 45  OEM ID ("NETDRIVE")
0x0B    00 02                    Bytes per sector (512)
0x0D    08                       Sectors per cluster (8)
0x0E    01 00                    Reserved sectors (1)
0x10    02                       Number of File Allocation Tables (2)
0x11    00 02                    Root directory entries (512)
0x13    00 50                    Sectors (20480)
0x15    F8                       Media Descriptor (hard drive)
0x16    60 00                    Sectors per FAT (96)
That is a minimal BIOS Parameter Block (BPB) as defined by DOS 2.0
but also recognizable to later versions of DOS. Later versions of DOS
have extended it a few times.
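The same dissection can be done mechanically. The Python sketch below is
only an illustration: parse_bpb and the field names are made up here, and
the input is just the first 24 bytes of the dump above.

```python
import struct

# First 24 bytes of the newly created image, copied from the dump above.
SECTOR0 = bytes.fromhex(
    "EB 3C 90 4E 45 54 44 52 49 56 45 00 02 08 01 00 "
    "02 00 02 00 50 F8 60 00"
)

# Illustrative field names; the layout is the DOS 2.0 BPB at offset 0x0B.
BPB_FIELDS = (
    "bytes_per_sector", "sectors_per_cluster", "reserved_sectors",
    "fat_count", "root_entries", "total_sectors",
    "media_descriptor", "sectors_per_fat",
)


def parse_bpb(sector: bytes) -> dict:
    """Unpack the eight little-endian BPB fields starting at offset 0x0B."""
    values = struct.unpack_from("<HBHBHHBH", sector, 0x0B)
    return dict(zip(BPB_FIELDS, values))


if __name__ == "__main__":
    for name, value in parse_bpb(SECTOR0).items():
        print(f"{name:20} {value:6} (0x{value:X})")
```

Every multi-byte field is little-endian, so the bytes 00 50 decode to
0x5000, or 20480 sectors of 512 bytes each, which matches the 10MB image.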
At offset 0x200 you see the start of the first File Allocation Table
(FAT). The first byte 0xF8 is the media descriptor byte, which should
be the same as the one in the BPB. This is FAT12 so entries are 12
bits in size; the first entry is actually 0xFF8 and the second entry
is 0xFFF. Ignoring the media descriptor entry and the second entry,
which is reserved, this FAT is completely empty.
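To make the 12-bit packing concrete, here is a small Python sketch (the
helper name fat12_entry is hypothetical) that pulls individual entries
out of the bytes at offset 0x200. Two entries share three bytes, so even
and odd entries are extracted differently.

```python
# Start of the first FAT in the new image, copied from the dump above.
EMPTY_FAT = bytes.fromhex("F8 FF FF 00 00 00 00 00")


def fat12_entry(fat: bytes, index: int) -> int:
    """Return the 12-bit FAT entry at the given index."""
    offset = index + index // 2              # each entry is 1.5 bytes
    pair = fat[offset] | (fat[offset + 1] << 8)
    # Even entries use the low 12 bits, odd entries the high 12 bits.
    return pair >> 4 if index & 1 else pair & 0x0FFF


if __name__ == "__main__":
    for i in range(4):
        print(f"entry {i}: 0x{fat12_entry(EMPTY_FAT, i):03X}")
```

The low byte of entry 0 is the media descriptor, and entries 2 and up
read as 0x000, meaning every cluster is free.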
I mounted the new image under IBM PC DOS 3.3 and ran SYS.COM against
it. That looked normal. However, when I disconnected the image and
tried to mount it again NetDrive complained that it was a bad image:
C:\MTCP>sys e:
System transferred
C:\MTCP>netdrive disconnect e:
mTCP NetDrive by M Brutman (***@gmail.com) (C)opyright 2008-2025
Version: Feb 17 2025
NetDrive device opened, IOCTL_read return code: 8 0A1E:0020 0A1E:1768
Drive disconnected from network.
Server session (54559) ended
C:\MTCP>netdrive connect calculon:2002 small.dsk e:
mTCP NetDrive by M Brutman (***@gmail.com) (C)opyright 2008-2025
Version: Feb 17 2025
NetDrive device opened, IOCTL_read return code: 8 0A1E:0020 0A1E:1768
Resolving calculon, press [ESC] to abort.
Server ip address is: 192.168.2.101
Next hop address: 98:90:96:C3:14:70
Error opening virtual hard drive: small.dsk,
bad BPB and file size doesn't match a 170K or 320K DOS 1.x disk
C:\MTCP>_
Well, the BPB started off correctly but now it seems bad. Let's look
at the first sector now that SYS.COM has altered it:
00000000 EB 34 90 49 42 4D 20 20 33 2E 33 00 02 3B C1 75 .4.IBM  3.3..;.u
00000010 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 00 .........+.t....
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 12 ................
00000030 00 00 00 00 01 00 FA 33 C0 8E D0 BC 00 7C 16 07 .......3.....|..
00000040 BB 78 00 36 C5 37 1E 56 16 53 BF 2B 7C B9 0B 00 .x.6.7.V.S.+|...
<... snip ...>
000001C0 0D 0A 44 69 73 6B 20 42 6F 6F 74 20 66 61 69 6C ..Disk Boot fail
000001D0 75 72 65 0D 0A 00 49 42 4D 42 49 4F 20 20 43 4F ure...IBMBIO  CO
000001E0 4D 49 42 4D 44 4F 53 20 20 43 4F 4D 00 00 00 00 MIBMDOS  COM....
000001F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 AA ..............U.
00000200 F8 FF FF 03 40 00 05 60 00 07 F0 FF 09 A0 00 0B ....@..`........
00000210 C0 00 0D E0 00 0F F0 FF 00 00 00 00 00 00 00 00 ................
Offset  Bytes                    Description
------  -----------------------  -------------------------------------
0x00    EB 34 90                 Jump to executable code
0x03    49 42 4D 20 20 33 2E 33  OEM ID ("IBM  3.3")
0x0B    00 02                    Bytes per sector (512)
0x0D    3B                       Sectors per cluster (59)
0x0E    C1 75                    Reserved sectors (30145)
0x10    1A                       Number of File Allocation Tables (26)
0x11    8B 16                    Root directory entries (5771)
0x13    DC 09                    Sectors (2524)
0x15    8B                       Media Descriptor (unknown)
0x16    0E DE                    Sectors per FAT (56846)
It makes sense for the jump instruction and OEM ID to change. And the
bytes per sector field is correct. But the rest of the BPB is
garbage. Something corrupted it.
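A few range checks are enough to tell the two boot sectors apart. The
Python sketch below is a guess at the kind of test involved, not
NetDrive's actual validation code; bpb_problems and the specific rules
are assumptions chosen for illustration.

```python
import struct

# First 24 bytes of the boot sector before and after running SYS.COM.
GOOD = bytes.fromhex(
    "EB 3C 90 4E 45 54 44 52 49 56 45 00 02 08 01 00 "
    "02 00 02 00 50 F8 60 00"
)
BAD = bytes.fromhex(
    "EB 34 90 49 42 4D 20 20 33 2E 33 00 02 3B C1 75 "
    "1A 8B 16 DC 09 8B 0E DE"
)


def bpb_problems(sector: bytes) -> list:
    """Return reasons the BPB at offset 0x0B looks implausible."""
    (bytes_per_sector, sectors_per_cluster, _reserved, fat_count,
     _root, _total, media, _spf) = struct.unpack_from("<HBHBHHBH", sector, 0x0B)
    problems = []
    if bytes_per_sector not in (128, 256, 512, 1024, 2048, 4096):
        problems.append("odd sector size")
    # A valid cluster size is a non-zero power of two.
    if sectors_per_cluster == 0 or sectors_per_cluster & (sectors_per_cluster - 1):
        problems.append("sectors per cluster is not a power of two")
    if fat_count not in (1, 2):
        problems.append("unusual number of FATs")
    if media != 0xF0 and media < 0xF8:
        problems.append("unknown media descriptor")
    return problems


if __name__ == "__main__":
    print("before SYS:", bpb_problems(GOOD))
    print("after SYS: ", bpb_problems(BAD))
```

The original sector passes every check. The altered one fails three of
them, even though its sector size field is still correct.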
Looking at the rest of the sector, the boot code starts at offset 0x36
(where the EB 34 jump lands) and that looks reasonable. There is also
the bootable partition signature (0xAA55) at offset 0x1FE, and the FAT
shows some additional entries for the two hidden files that were
copied over.
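Those new FAT entries can be followed as cluster chains. This Python
sketch (fat12_entry and follow_chain are hypothetical helper names)
decodes the FAT bytes from the dump above and walks both chains.

```python
# First 32 bytes of the FAT after SYS.COM ran, copied from the dump above.
FAT_AFTER_SYS = bytes.fromhex(
    "F8 FF FF 03 40 00 05 60 00 07 F0 FF 09 A0 00 0B "
    "C0 00 0D E0 00 0F F0 FF 00 00 00 00 00 00 00 00"
)


def fat12_entry(fat: bytes, index: int) -> int:
    """Return the 12-bit FAT entry at the given index."""
    offset = index + index // 2
    pair = fat[offset] | (fat[offset + 1] << 8)
    return pair >> 4 if index & 1 else pair & 0x0FFF


def follow_chain(fat: bytes, start: int) -> list:
    """Collect cluster numbers from start until an end-of-chain marker."""
    chain = []
    cluster = start
    # Values below 2 or from 0xFF0 up are not usable cluster numbers.
    while 2 <= cluster < 0xFF0 and cluster not in chain:
        chain.append(cluster)
        cluster = fat12_entry(fat, cluster)
    return chain


if __name__ == "__main__":
    print(follow_chain(FAT_AFTER_SYS, 2))
    print(follow_chain(FAT_AFTER_SYS, 8))
```

The first chain covers clusters 2 through 7 and the second covers
clusters 8 through 15. At 8 sectors per cluster that is 24576 and 32768
bytes of allocated space for the two hidden files.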
I tried it again, this time with a diskette image mounted using
NetDrive, and it did everything perfectly. Which implies that the
problem is not in NetDrive, but in the difference between hard drive
images and floppy disk images.
So the DOS 3.3 SYS command added the boot code and updated the FAT
correctly, but it clobbered the BPB. But only on the NetDrive hard
drive image. Why?
DOS 3.2 added a function called "Generic IOCTL" which allows DOS to
query a device to get its geometry, write a track, read a track,
format a track, etc. It also added code to handle these additional
calls for the devices supported by the BIOS. For example, here is a
call to "Get Device Parameters" (Generic IOCTL, sub function 0x60)
for drive C:
C:\DOS>debug
-a
18CE:0100 mov ax,440d
18CE:0103 mov bl,03
18CE:0105 mov cx,0860
18CE:0108 mov dx,0400
18CE:010B int 21
18CE:010D int 20
18CE:010F
-g=100 10d
AX=0001 BX=0003 CX=0860 DX=0400 SP=FFEE BP=0000 SI=0000 DI=0000
DS=18CE ES=18CE SS=18CE CS=18CE IP=010D NV UP EI PL NZ NA PO NC
18CE:010D CD20 INT 20
-g
Program terminated normally
-d 400 l 30
18CE:0400 89 05 01 00 41 00 00 00-02 04 01 00 02 00 02 B1 ....A...........
18CE:0410 FF F8 40 00 3F 00 10 00-3F 00 FA 89 45 06 8B 46 ..@.?...?...E..F
18CE:0420 FC 89 05 8B 46 EC 89 EC-5D 5F 5E C3 51 56 55 89 ....F...]_^.QVU.
-
Note that at the first breakpoint (after the IOCTL call) the Carry
Flag (NC) is not set. This means the call was successful and the data
returned is reliable.
The documentation says that the DEVICEPARAMS data structure has a BPB
starting at offset 0x07, which here shows:
Offset  Bytes  Description
------  -----  ------------------------------------
0x07    00 02  Bytes per sector (512)
0x09    04     Sectors per cluster (4)
0x0A    01 00  Reserved sectors (1)
0x0C    02     Number of File Allocation Tables (2)
0x0D    00 02  Root directory entries (512)
0x0F    B1 FF  Sectors (65457)
0x11    F8     Media Descriptor (hard drive)
0x12    40 00  Sectors per FAT (64)
That makes sense for a 32MB C: drive.
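As a cross-check, the buffer DEBUG displayed can be unpacked in Python.
This is a sketch under assumptions: the header field names (dpSpecFunc,
dpDevType, dpDevAttr, dpCylinders, dpMediaType) are recalled from the
MS-DOS Programmer's Reference, and the BPB is taken to start at byte 7,
right after that 7-byte header.

```python
import struct

# First 32 bytes DEBUG displayed at DS:0400 after the call for drive C:.
DEVICEPARAMS = bytes.fromhex(
    "89 05 01 00 41 00 00 00 02 04 01 00 02 00 02 B1 "
    "FF F8 40 00 3F 00 10 00 3F 00 FA 89 45 06 8B 46"
)

# Header: dpSpecFunc, dpDevType, dpDevAttr, dpCylinders, dpMediaType.
spec_func, dev_type, dev_attr, cylinders, media_type = struct.unpack_from(
    "<BBHHB", DEVICEPARAMS, 0
)

# BPB: the DOS 2.0 fields plus sectors per track and number of heads.
(bytes_per_sector, sectors_per_cluster, reserved, fat_count, root_entries,
 total_sectors, media, sectors_per_fat, sectors_per_track, heads) = (
    struct.unpack_from("<HBHBHHBHHH", DEVICEPARAMS, 7)
)

if __name__ == "__main__":
    print("cylinders:", cylinders, "heads:", heads, "spt:", sectors_per_track)
    print("total sectors:", total_sectors, "sectors per FAT:", sectors_per_fat)
    print("size in bytes:", total_sectors * bytes_per_sector)
```

The geometry supports the decoding: 65 cylinders x 16 heads x 63 sectors
per track is 65520 sectors, exactly one track (63 sectors) more than the
65457 sectors the BPB reports for the volume.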
Let's run that code again against the NetDrive drive image, with the
boot sector returned to what it was before it was corrupted:
DS=18CE ES=18CE SS=18CE CS=18CE IP=010D NV UP EI PL NZ NA PO NC
18CE:010D CD20 INT 20
-g
Program terminated normally
-d 400 l 30
18CE:0400 89 05 01 00 41 00 00 00-02 04 01 00 02 00 02 B1 ....A...........
18CE:0410 FF F8 40 00 3F 00 10 00-3F 00 FA 89 45 06 8B 46 ..@.?...?...E..F
18CE:0420 FC 89 05 8B 46 EC 89 EC-5D 5F 5E C3 51 56 55 89 ....F...]_^.QVU.
-a 103
18CE:0103 mov bl,05
18CE:0105
-g=100 10d
AX=0001 BX=0005 CX=0860 DX=0400 SP=FFEE BP=0000 SI=0000 DI=0000
DS=18CE ES=18CE SS=18CE CS=18CE IP=010D NV UP EI PL NZ NA PO CY
18CE:010D CD20 INT 20
-g
Program terminated normally
-d 400 l 30
18CE:0400 89 05 01 00 41 00 00 00-02 04 01 00 02 00 02 B1 ....A...........
18CE:0410 FF F8 40 00 3F 00 10 00-3F 00 FA 89 45 06 8B 46 ..@.?...?...E..F
18CE:0420 FC 89 05 8B 46 EC 89 EC-5D 5F 5E C3 51 56 55 89 ....F...]_^.QVU.
-_
I changed one instruction to change to the NetDrive drive number and
ran the code again, but this time at the first breakpoint the Carry
Flag (CY) is set. This means there was an error, and AX holds the
error code. Value 0x0001 means "ERROR_INVALID_FUNCTION" which makes
sense because the NetDrive device driver doesn't support this
function. (It is not required to be supported.)
If we dig around inside of SYS.COM we can see a call to Generic IOCTL:
C:\DOS>debug sys.com
-u 9a5
18EC:09A5 8A1E2209 MOV BL,[0922]
18EC:09A9 BAC003 MOV DX,03C0
18EC:09AC B444 MOV AH,44
18EC:09AE B00D MOV AL,0D
18EC:09B0 B508 MOV CH,08
18EC:09B2 B160 MOV CL,60
18EC:09B4 CD21 INT 21
18EC:09B6 8BDA MOV BX,DX
18EC:09B8 8D7707 LEA SI,[BX+07]
18EC:09BB 46 INC SI
18EC:09BC 46 INC SI
18EC:09BD 83FEFF CMP SI,-01
18EC:09C0 7445 JZ 0A07
18EC:09C2 83FEFE CMP SI,-02
-
Here we see it getting the drive number from storage, setting AX to
0x440D, setting CH to 0x08 (a block device) and CL to 0x60 (get
device parameters). DS:DX will be the pointer to the parameter block
to fill in.
Let's run the code!
C:\DOS>debug sys.com e:
-u 9a5
18EC:09A5 8A1E2209 MOV BL,[0922]
18EC:09A9 BAC003 MOV DX,03C0
18EC:09AC B444 MOV AH,44
18EC:09AE B00D MOV AL,0D
18EC:09B0 B508 MOV CH,08
18EC:09B2 B160 MOV CL,60
18EC:09B4 CD21 INT 21
18EC:09B6 8BDA MOV BX,DX
18EC:09B8 8D7707 LEA SI,[BX+07]
18EC:09BB 46 INC SI
18EC:09BC 46 INC SI
18EC:09BD 83FEFF CMP SI,-01
18EC:09C0 7445 JZ 0A07
18EC:09C2 83FEFE CMP SI,-02
-g 9b6
AX=0001 BX=0005 CX=0860 DX=03C0 SP=0192 BP=0000 SI=0000 DI=0001
DS=1923 ES=1923 SS=1994 CS=1923 IP=0436 NV UP EI PL ZR NA PE CY
1923:0436 CC INT 3
-
At the breakpoint after the Generic IOCTL we see that the Carry Flag
(CY) is set and the error code is set to ERROR_INVALID_FUNCTION, just
as it was above. And here is the bug ... nothing is checking the
Carry Flag to see if there was an error after the Generic IOCTL call.
The Generic IOCTL writes the DEVICEPARAMS structure at DS:DX, assuming
the call did not fail. If the call fails, as it does here, we'll just
see whatever was already in that storage. As before, the BPB
structure will be at offset 0x07. Here is what we got back in that
data structure:
18EC:09B0 B508 MOV CH,08
18EC:09B2 B160 MOV CL,60
18EC:09B4 CD21 INT 21
18EC:09B6 8BDA MOV BX,DX
18EC:09B8 8D7707 LEA SI,[BX+07]
18EC:09BB 46 INC SI
18EC:09BC 46 INC SI
18EC:09BD 83FEFF CMP SI,-01
18EC:09C0 7445 JZ 0A07
18EC:09C2 83FEFE CMP SI,-02
-g 9b6
AX=0001 BX=0005 CX=0860 DX=03C0 SP=0192 BP=0000 SI=0000 DI=0001
DS=1923 ES=1923 SS=1994 CS=1923 IP=0436 NV UP EI PL ZR NA PE CY
1923:0436 CC INT 3
-d ds:3c0
1923:03C0 1E 09 09 B4 40 CD 21 72-E4 3B C1 75 1A 8B 16 DC ....@.!r.;.u....
1923:03D0 09 8B 0E DE 09 2B CA 74-D4 8B 1E 19 09 B4 40 CD .....+.t......@.
1923:03E0 21 72 CA 3B C1 74 C6 F9-C3 B8 23 19 8E D8 A0 22 !r.;.t....#...."
1923:03F0 09 FE C8 BA 00 00 E8 90-00 72 0E 81 3E 82 0F 55 .........r..>..U
1923:0400 AA 75 06 BE 91 0D EB 57-90 B4 32 8A 16 22 09 CD .u.....W..2.."..
1923:0410 21 8A 47 16 0E 1F 2C F8-98 8B D8 D1 E3 8B B7 E7 !.G...,.........
1923:0420 0B 0B F6 75 18 8A 1E 22-09 BA C0 03 B4 44 B0 0D ...u...".....D..
1923:0430 B5 08 B1 60 CD 21 CC DA-8D 77 07 46 46 83 FE FF ...`.!...w.FF...
-_
(Note that the segment registers changed causing the instruction
pointer to shift ... we are still in the same code though, just using
aliased memory locations. Thanks segmented x86!)
IOCTL BPB:
72 E4 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 19 09 B4 40 CD
Corrupted BPB:
00 02 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 00 00 00 00 00
Except for the first two bytes (the sector size) and the last five
bytes (the top byte of the hidden sector count plus dpHugeSectors)
that lines up perfectly. So the failure to check the Carry Flag wound
up causing bad BPB data to be written to the volume boot record.
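The comparison is easy to automate. This short Python sketch (variable
names are illustrative) lists the positions where the two 25-byte
strings above disagree.

```python
# The two byte strings shown above, 25 bytes each.
IOCTL_BPB = bytes.fromhex(
    "72 E4 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 19 09 B4 40 CD"
)
BOOT_SECTOR_BPB = bytes.fromhex(
    "00 02 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 00 00 00 00 00"
)

# Indexes (relative to the start of the BPB) where the bytes differ.
differing = [
    i for i, (a, b) in enumerate(zip(IOCTL_BPB, BOOT_SECTOR_BPB)) if a != b
]

if __name__ == "__main__":
    print(differing)
```

Only indexes 0, 1 and 20 through 24 differ. The 18 bytes in between,
from sectors per cluster onward, are identical in the stale buffer and
in the damaged boot sector.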
So what are those bytes? It looks like code to me, and we can confirm
that by just disassembling it:
-d ds:3c0
1923:03C0 1E 09 09 B4 40 CD 21 72-E4 3B C1 75 1A 8B 16 DC ....@.!r.;.u....
1923:03D0 09 8B 0E DE 09 2B CA 74-D4 8B 1E 19 09 B4 40 CD .....+.t......@.
1923:03E0 21 72 CA 3B C1 74 C6 F9-C3 B8 23 19 8E D8 A0 22 !r.;.t....#...."
1923:03F0 09 FE C8 BA 00 00 E8 90-00 72 0E 81 3E 82 0F 55 .........r..>..U
1923:0400 AA 75 06 BE 91 0D EB 57-90 B4 32 8A 16 22 09 CD .u.....W..2.."..
1923:0410 21 8A 47 16 0E 1F 2C F8-98 8B D8 D1 E3 8B B7 E7 !.G...,.........
1923:0420 0B 0B F6 75 18 8A 1E 22-09 BA C0 03 B4 44 B0 0D ...u...".....D..
1923:0430 B5 08 B1 60 CD 21 CC DA-8D 77 07 46 46 83 FE FF ...`.!...w.FF...
-u 3c0
1923:03C0 1E PUSH DS
1923:03C1 0909 OR [BX+DI],CX
1923:03C3 B440 MOV AH,40
1923:03C5 CD21 INT 21
1923:03C7 72E4 JB 03E7
1923:03C9 3BC1 CMP AX,CX
1923:03CB 751A JNZ 03E7
1923:03CD 8B16DC09 MOV DX,[09DC]
1923:03D1 8B0EDE09 MOV CX,[09DE]
1923:03D5 2BCA SUB CX,DX
1923:03D7 74D4 JZ 03AD
1923:03D9 8B1E1909 MOV BX,[0919]
1923:03DD B440 MOV AH,40
1923:03DF CD21 INT 21
-
Great, where did it come from? Using the search feature of DEBUG.COM
we can find those bytes, and they appear right before the suspect
code:
C:\DOS>debug sys.com e
-r
AX=0000 BX=0000 CX=129E DX=0000 SP=FFFE BP=0000 SI=0000 DI=0000
DS=18EC ES=18EC SS=18EC CS=18EC IP=0100 NV UP EI PL NZ NA PO NC
18EC:0100 E90912 JMP 130C
-s cs:100 129e 09 09 b4 40 cd 21 72 e4
18EC:0940
-u cs:940
18EC:0940 1E PUSH DS
18EC:0941 0909 OR [BX+DI],CX
18EC:0943 B440 MOV AH,40
18EC:0945 CD21 INT 21
18EC:0947 72E4 JB 092D
18EC:0949 3BC1 CMP AX,CX
18EC:094B 751A JNZ 0967
18EC:094D 8B16DC09 MOV DX,[09DC]
18EC:0951 8B0EDE09 MOV CX,[09DE]
18EC:0955 2BCA SUB CX,DX
18EC:0957 74D4 JZ 092D
18EC:0959 8B1E1909 MOV BX,[0919]
18EC:095D B440 MOV AH,40
18EC:095F CD21 INT 21
-
I am a little bit freaked out by that because the pointer to the
buffer is set before the IOCTL call; the code knowingly sets a
pointer to a buffer into what looks like its code area. Let's hope
they knew they were done with that part of the code, or it's just
another interesting bug to dissect.
So SYS.COM clearly doesn't work on hard drive images mounted with
NetDrive, but it did work on a floppy image. What is the difference
and why did it work?
The answer requires us to look inside of the BPB again. The BPB has a
field called the "media descriptor byte" which is used to describe
the layout of the image. This single byte has a limited range of
valid values:
Value  Description
-----  -------------------------------------------------------------
F0     3.5 inch, 2 sides, 18 sectors per track, 80 tracks, 1440KB or
       3.5 inch, 2 sides, 36 sectors per track, 80 tracks, 2880KB or
       5.25 inch, 2 sides, 15 sectors per track, 80 tracks, 1.2MB
F8     Hard disk, any geometry
F9     3.5 inch, 2 sides, 9 sectors per track, 80 tracks, 720KB or
       5.25 inch, 2 sides, 15 sectors per track, 80 tracks, 1220KB
FA     5.25 inch, 1 side, 8 sectors per track, 40 tracks, 160KB
FB     3.5 inch, 2 sides, 8 sectors per track, 80 tracks, 640KB
FC     5.25 inch, 1 side, 9 sectors per track, 40 tracks, 180KB
FD     5.25 inch, 2 sides, 9 sectors per track, 40 tracks, 360KB or
       8 inch, 2 sides, single density, 500KB
FE     5.25 inch, 1 side, 8 sectors per track, 40 tracks, 160KB or
       8 inch, 1 side, single density, 250KB or
       8 inch, 2 sides, double density, 1220KB
FF     5.25 inch, 2 sides, 8 sectors per track, 40 tracks, 320KB
You can tell this wasn't well thought out. The media descriptor byte
is often not enough to tell you what you are working with; you need
to combine it with knowledge of the physical drive type too.
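The ambiguity is easy to see once the geometries are turned into sizes.
The Python sketch below covers only a subset of the table: the PC
floppy rows that list a full geometry. The dictionary and function
names are illustrative, and 512-byte sectors are assumed throughout.

```python
SECTOR_SIZE = 512  # assumed for every format listed here

# (sides, sectors per track, tracks) per media descriptor byte.
# The 8 inch rows are left out because the table gives no geometry for them.
FLOPPY_GEOMETRIES = {
    0xF0: [(2, 18, 80), (2, 36, 80), (2, 15, 80)],
    0xF9: [(2, 9, 80), (2, 15, 80)],
    0xFB: [(2, 8, 80)],
    0xFC: [(1, 9, 40)],
    0xFD: [(2, 9, 40)],
    0xFE: [(1, 8, 40)],
    0xFF: [(2, 8, 40)],
}


def capacity_kb(sides: int, sectors_per_track: int, tracks: int) -> int:
    """Raw capacity in KB (1024 bytes) for one geometry."""
    return sides * sectors_per_track * tracks * SECTOR_SIZE // 1024


def candidate_sizes(media_descriptor: int) -> list:
    """All capacities in KB that a descriptor byte could stand for."""
    geometries = FLOPPY_GEOMETRIES.get(media_descriptor, [])
    return [capacity_kb(*g) for g in geometries]


if __name__ == "__main__":
    for descriptor in sorted(FLOPPY_GEOMETRIES):
        print(f"{descriptor:02X}: {candidate_sizes(descriptor)}")
```

Within this subset FD maps to a single size, while F0 and F9 each map to
more than one, so the byte alone cannot identify the media.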
When I run SYS.COM against a floppy image mounted using NetDrive, the
breakpoint after the Generic IOCTL call does not even get hit:
Server ip address is: 192.168.2.101
Next hop address: 98:90:96:C3:14:70
Session (52981) started, virtual hard drive opened: floppy.dsk
Packet driver connected at interrupt: 0x63
NetDrive packet driver shim connected at: 0x65
Drive size: 368640, Media descriptor byte from FAT: FD
C:\MTCP>
C:\MTCP>debug sys e:
File not found
-q
C:\MTCP>cd \dos
C:\DOS>debug sys.com e:
-r
AX=0000 BX=0000 CX=129E DX=0000 SP=FFFE BP=0000 SI=0000 DI=0000
DS=18EC ES=18EC SS=18EC CS=18EC IP=0100 NV UP EI PL NZ NA PO NC
18EC:0100 E90912 JMP 130C
-g 9b6
System transferred
Program terminated normally
-
I am pretty certain that it used the media descriptor byte from the
BPB and did not bother making the Generic IOCTL call. I used a 360KB
disk image with a media descriptor byte of FD, which is very common.
And no IBM PC ever shipped from the factory with an 8 inch drive so
it is not ambiguous. So as an experiment I used a 2880KB disk image
which has a media descriptor byte of F0, which is also shared with
1440KB diskettes. Sure enough SYS.COM tried to make the Generic IOCTL
call on that image, failed, and corrupted that BPB.
So in short:
* DOS block device drivers do not need to implement the Generic IOCTL
call.
* DOS 3.3 SYS.COM will work correctly with a device driver that does
not implement Generic IOCTL if you are using a media type where
there is no ambiguity about what the media actually is. (For example,
a 360KB disk image that is 2 sided, 40 tracks, and 9 sectors per
track.)
* When in doubt about what it is working with (which is always true
of a hard drive) DOS 3.3 SYS.COM will try to use the Generic IOCTL
call to get the BPB of the device instead of just reading it
directly from the first sector.
* DOS 3.3 SYS.COM has a bug where it fails to check the return code
from Generic IOCTL. This causes it to write garbage to the BPB of
the media when it happens.
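The last point can be modeled in a few lines. This Python sketch is a
simulation, not SYS.COM's code: fake_generic_ioctl is a made-up stand-in
for INT 21h AX=440Dh that returns the carry flag and AX, and the stale
buffer holds the first 32 bytes seen at DS:03C0 above.

```python
ERROR_INVALID_FUNCTION = 0x0001

# Whatever happened to be in the buffer before the call (code bytes).
STALE = bytes.fromhex(
    "1E 09 09 B4 40 CD 21 72 E4 3B C1 75 1A 8B 16 DC "
    "09 8B 0E DE 09 2B CA 74 D4 8B 1E 19 09 B4 40 CD"
)


def fake_generic_ioctl(buffer: bytearray, supported: bool):
    """Hypothetical stand-in for Get Device Parameters: returns (carry, ax)."""
    if not supported:
        return True, ERROR_INVALID_FUNCTION   # buffer is left untouched
    buffer[7:9] = (512).to_bytes(2, "little")  # pretend the driver filled it in
    return False, 0


def read_bpb_unchecked(buffer: bytearray, supported: bool) -> bytes:
    """Mirrors the bug: the result of the call is ignored."""
    fake_generic_ioctl(buffer, supported)
    return bytes(buffer[7:32])


def read_bpb_checked(buffer: bytearray, supported: bool):
    """Tests the carry flag before trusting the buffer."""
    carry, _ax = fake_generic_ioctl(buffer, supported)
    if carry:
        return None   # caller could fall back to the BPB in the boot sector
    return bytes(buffer[7:32])


if __name__ == "__main__":
    print(read_bpb_unchecked(bytearray(STALE), supported=False).hex(" "))
    print(read_bpb_checked(bytearray(STALE), supported=False))
```

With an unsupporting driver the unchecked version returns the stale
bytes starting at 72 E4, while the checked version returns nothing and
leaves the decision to the caller.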
The bug was probably introduced in DOS 3.2. I'm pretty sure that it
is still present in DOS 4.0, as the code is still not checking the
Carry Flag after the Generic IOCTL call.
SYS1.ASM from DOS 4.0 code
<https://github.com/microsoft/MS-DOS/blob/
2d04cacc5322951f187bb17e017c12920ac8ebe2/v4.0/src/
CMD/SYS/SYS1.ASM#L3198>
Created February 23rd, 2025
(C)opyright Michael Brutman, mbbrutman at gmail dot com
From:
<https://www.brutman.com/Adventures_In_Code/DOS_33_SYS_Bug_Hunt/
DOS_33_SYS_Bug_Hunt.html>