view dxa.1 @ 14:84c0facfc43c

Merge changes from upstream v0.1.4.
author Matti Hamalainen <ccr@tnsp.org>
date Thu, 14 Oct 2021 01:38:52 +0300
parents 4410c9c7750d
children
line wrap: on
line source

.TH DXA "1" "31 January 2019"

.SH NAME
dxa \- 6502/R65C02 disassembler

.SH SYNOPSIS
.B dxa
[\fIOPTION\fR]... \fIFILE\fR

.SH DESCRIPTION
.B dxa
is the semi-official disassembler option for the
.BR xa (1)
package, a weakly patched version of Marko Mäkelä's
.B d65
disassembler that generates output similar to the de facto coding
conventions used for
.BR xa (1).
The package is designed to intelligently(?) scan arbitrary code and (with
hints) can identify the difference between data and valid machine code,
generating a sane looking, "perfect" disassembly with data and code portions.
.LP
Perfect, in this case, means that you can take what
.B dxa
spits out and feed it right back into
.BR xa (1),
and get the exact same object file you started with, even if sometimes
.B dxa
can't identify everything correctly. With a few extra options, you can tease
and twist the output to generate something not quite so parseable, or even
more like true assembler source.
.SH OPTIONS
For historical and compatibility reasons, the long options (--) only exist
if
.B dxa
were compiled with LONG_OPTIONS enabled in options.h.
.TP
.B \--datablock xxxx-yyyy
.TP
.B \-b xxxx-yyyy
Defines the memory range xxxx to yyyy (hexadecimal, inclusive) to be a data
block. The memory range can be further specified:
.RS
.IP *
If the range is preceded by ! (an exclamation point), such as
.BR !c000-cfff ,
then it is 
further defined to be a data block with no vectors in it either.
.IP *
If the range is preceded by ? (a question mark), then it is further
defined to be a data block that is completely unused and therefore
no valid routine may contain instructions whose parameter lie in this
range.  Useful for providing enhanced protection against misinterpreting
data to be program code, but be careful, or some code may be listed as
data.  For instance, the Commodore 64 firmware uses the base address $CFFF
when initializing the video chip, and the BASIC interpreter uses the
addresses $9FEA and $9FEB while loading the instruction pointers.
In addition to this, there are a number of BIT instructions used for
skipping the next instruction.  Thus, you must allow addresses like $1A9,
$2A9 and so on.
.RE
.TP
.B \--datablocks filename
.TP
.B \-B filename
Reads data blocks from file
.B filename
as if they had been specified on the command
line, one per line (such as xxxx-yyyy, ?xxxx-yyyy, etc.).
.TP
.B \--labels filename
.TP
.B \-l filename
Causes label names to be read from file
.BR filename .
This file format is the same as the labelfile/symbol table file generated
by
.BR xa (1)
with the
.B \-l
option. The
.B \-l
was chosen on purpose for consistency with
.BR xa (1).
.TP
.B \--routine xxxx
.TP
.B \-r xxxx
Specifies an address (in hexadecimal) that is declared to be a valid routine.
.B It is strongly recommended
that you specify the initial execution address as a routine.
For example, for a Commodore 64 binary with a BASIC header that performs
.B SYS
.BR 2064 ,
specify
.B \-r0810
so that disassembly starts at that location (or use the
.B \-U
option, which can automatically do this for you). Note that specifying this
manually may have interactions with datablock detection
.RB ( \-d ).
.TP
.B \--routines filename
.TP
.B \-R filename
Causes a list of routines to be read from file
.BR filename ,
one per line as if they had been specified on the command line.
.TP
.B \--addresses option
.TP
.B \-a option
Determines if and what kind of address information should be dumped with the
disassembly, if any. Note that this may make your output no longer intelligible
to
.BR xa (1).
The valid options are:
.RS
.TP
.B disabled
Dump source only with no address information. This is the default.
.TP
.B enabled
Write the current address at the beginning of each line.
.TP
.B dump
Write the current address at the beginning of each line, along with a hexdump
of the bytes for which the statement was generated.
.RE
.TP
.B \--colon-newline
.TP
.B \-N
.TP
.B \--no-colon-newline
.TP
.B \-n
A purely cosmetic option to determine how labels are emitted. Many people,
including myself, prefer a listing where the label is given, then a tab,
then the code
.RB ( \-n ).
Since this is
.I my
preference, it's the default. On the other hand, there are also many who
prefer to have the label demarcated by a colon and a newline, and the code
beginning indented on the next line. This is the way
.B d65
used to do it, and is still supported with
.BR \-N .
.TP
.B \--processor option
.TP
.B \-p option
Specify the instruction set. Note that specifying an instruction set that
permits and disassembles illegal and/or undocumented
NMOS opcodes may make your output unintelligible to
.BR xa (1).
Only one may be specified. The valid options are:
.RS
.TP
.B standard-nmos6502
Only official opcodes will be recognized. This is the default.
.TP
.B r65c02
Opcodes specific to the Rockwell 65C02 (R65C02) will also be allowed.
.TP
.B all-nmos6502
Allows all 256 NMOS opcodes to be disassembled, whether documented or
undocumented.
Note that instructions generated by this mode are not guaranteed to work on
all NMOS 6502s.
.TP
.B rational-nmos6502
Only allows "rational" undocumented instructions. This excludes ANE, SHA,
SHS, SHY, SHX, LXA and LAXS. This is a judgment call.
.TP
.B useful-nmos6502
Only allows "useful" undocumented instructions. This excludes ANE, SHA,
SHS, SHY, SHX, LXA, LAXS, NOOP and STP. This is a judgment call.
.TP
.B traditional-nmos6502
Only allows the most widely accepted undocumented instructions based on
combinations of ALU and RMW operations. This excludes ANE, SHA, SHS, SHY,
SHX, LXA, LAXS, NOOP, STP, ARR, ASR, ANC, SBX and USBC.
This is a judgment call.
.RE
.TP
.B \--get-sa
.TP
.B \-G
.TP
.B \--no-get-sa xxxx
.TP
.B \-g xxxx
Enables or disables automatic starting address detection. If enabled (the
default),
.B dxa
looks at the first two bytes as a 16-bit word in 6502 little-endian format
and considers that to be the starting address for the object, discarding them
without further interpretation. This is very useful for Commodore computers in
particular. If your binary does not have a starting address, you must
specify one using
.B \-g
or
.B \--no-get-sa
followed by a hexadecimal address. The starting address will then be encoded
into the output using
.BR "* =" .
.TP
.B \--no-word-sa
.TP
.B \-q
.TP
.B \--word-sa
.TP
.B \-Q
Only relevant if automatic starting address detection is enabled. If so,
the default is to also emit the starting address as a
.B \&.word
pseudo-op before the starting address indicated with
.B * =
so that it will be regenerated on re-assembly
.RB ( \-Q ).
Otherwise, if this option is disabled, the starting address word will not be
re-emitted and
will need to be tacked back on if the target requires it. If you specify an
address with
.BR \-g ,
then that address will be used here too.
.TP
.B \--no-detect-basic
.TP
.B \-u
.TP
.B \--detect-basic
.TP
.B \-U
If the starting address is recognized as a typical BASIC entry point
(currently supported for Commodore computers), then
.B dxa
will attempt to see if a BASIC header is present, and if so, determine its
length and mark the section as a completely dead
datablock not eligible for further disassembly or referencing. If the
first line is a construct such as
.B 10 SYS
.BR 2061 ,
then 
.B dxa
will additionally parse the provided address and mark it as a valid routine
if the address is within the boundaries of the disassembled file.
Note that although its heuristics
are designed to be permissive, it may nevertheless misinterpret certain files
with intentionally pathologic line link addresses, and unusual applications
where the linked machine code is designed to actually
.I modify
the BASIC text may not
disassemble correctly with this option. These are highly atypical situations,
so this option will likely become the default in a future release.
.TP
.B \--verbose
.TP
.B \-v
Enables verbose output, which may or may not be useful in the same way that
Schroedinger's Cat may or may not be dead.
.TP
.B \--help
.TP
.B \-?
.TP
.B \-V
A quick summary of options.
.LP
The following options control how program code is scanned and determined
to be a valid (or invalid) portion of a putative routine.
.TP
.B \--datablock-detection option
.TP
.B \-d option
This controls how the program automatically detects data blocks for addresses
where no previous hints are specified. Only one method may be specified.
The valid options are:
.RS
.TP
.B poor
As much as the object as possible will be listed as program code, even if
there are illegal instructions present. This is the default.
.TP
.B strict
Assumes that all declared routines call and execute only valid instructions. If
any portion of code declared as a routine leads to an address block containing
illegal opcodes, a consistency error will occur and disassembly will stop.
.TP
.B skip-scanning
Program addresses that are not referenced by any routine will not be scanned
for valid routines (thus data a priori).
.RE
.TP
.B \--no-external-labels
.TP
.B \-e
.TP
.B \--external-labels
.TP
.B \-E
Controls whether labels should be generated for addresses outside of the
program itself. The default is not to (i.e., leave the addresses absolute).
.TP
.B \--address-tables option
.TP
.B \-t option
Controls detection of address tables/dispatch tables. The following options
are available:
.RS
.TP
.B ignore
Don't attempt to detect address tables.
.TP
.B detect-all
Address tables referencing any label will be detected.
.TP
.B detect-internal
Address tables with labels whose addresses lie within the program's address
range will be detected. This is the default.
.RE
.TP
.B \--no-suspect-jsr
.TP
.B \-j
.TP
.B \--suspect-jsr
.TP
.B \-J
These options indicate whether JSRs are always expected to return to the
following instruction or not. This will affect how routines are parsed. For
example, the Commodore 128 KERNAL has a routine called PRIMM that prints a
null-terminated string directly following the JSR instruction, returning after
the null byte. In this case,
.B \-J
should be specified to alert the disassembler that this is possible. The
default is to expect "normal" JSRs (i.e.,
.BR \-j ).
.TP
.B \--no-one-byte-routines
.TP
.B \-o
.TP
.B \--one-byte-routines
.TP
.B \-O
These options permit or inhibit a single RTS, RTI or BRK instruction (or STP
if enabled by the instruction set), or a conditional branch, from being
automatically identified as a routine. The default is to inhibit this; specific
cases may be selectively overridden with the
.B \-r
option.
.TP
.B \--no-stupid-jumps
.TP
.B \-m
.TP
.B \--stupid-jumps
.TP
.B \-M
These options consider jumps or branches to the current address (such as
JMP *, BCC *) to be invalid or valid code depending on which is specified.
Note that BVC * is always accepted as the V flag can sometimes be toggled
by an external hardware signal. The default is to consider them
invalid otherwise.
.TP
.B \--no-allow-brk
.TP
.B \-w
.TP
.B \--allow-brk
.TP
.B \-W
These options control if BRK (or STP if enabled by the instruction set) should
be treated as a valid exit from a routine, just like RTS or RTI. The default
is not to do so.
.TP
.B \--no-suspect-branches
.TP
.B \-c
.TP
.B \--suspect-branches
.TP
.B \-C
These options are rarely needed, but account for the case where a program may
intentionally obfuscate its code using branches with unusual destination
addresses like LDA #2:BEQ *-1. In the default case, this would be considered
to be invalid and not treated as a routine
.RB ( \-c );
if
.B \-C
is specified, it would be accepted as valid.
.SH BUGS/TO-DO
There are probably quite a few bugs yet to be found.
.LP
65816 opcodes are not (yet) supported.
.LP
There are a few options Marko created that aren't hooked up to anything (and
are not documented here on purpose). I might finish these later.

.SH "SEE ALSO"
.BR xa (1),
.BR file65 (1),
.BR ldo65 (1),
.BR printcbm (1),
.BR reloc65 (1),
.BR uncpk (1)

.SH AUTHOR
This manual page was written by Cameron Kaiser <ckaiser@floodgap.com>.
.B dxa
is based on
.B d65
0.2.1 by Marko Mäkelä.
Original package (C)1993, 1994, 2000 Marko Mäkelä. Additional changes
(C)2006-2019 Cameron Kaiser.
.B dxa
is maintained independently.

.SH WEBSITE
http://www.floodgap.com/retrotech/xa/