GnuWin
If you wish to compile yourself: get the source package <package>-<version>-src.zip.
You need GNU Bash, GNU Make and Mingw32 GCC and BinUtils.
In these notes it is assumed that you are familiar with Bash,
Make and GCC. Win32 implementations of Bash can be found in the CygWin tools, in the Msys tools, and in the
DJGPP tools. Win32
implementations of Make can be found in CygWin, Msys, DJGPP,
and Mingw. CygWin Bash and Make work quite
well. Note that when you mix CygWin or Msys Bash with a native
Make, problems may occur because CygWin and Msys each have their own
notation for absolute filenames (for example, c:/tools becomes
/cygdrive/c/tools in CygWin and /c/tools in Msys).
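The filename mapping described above can be sketched in C as follows. This is only an illustration: the function name drive_to_posix is hypothetical and is not part of CygWin, Msys or GnuWin, and real CygWin installations may use a mount prefix other than /cygdrive.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: rewrite "c:/tools" as "<prefix>/c/tools".
   Pass "/cygdrive" as prefix for the CygWin form, "" for the Msys form.
   Returns 0 on success, -1 if the name has no drive letter or the
   output buffer is too small. */
int drive_to_posix(const char *win, const char *prefix, char *out, size_t size)
{
    size_t i;
    int n;

    if (!isalpha((unsigned char) win[0]) || win[1] != ':')
        return -1;
    n = snprintf(out, size, "%s/%c%s", prefix,
                 tolower((unsigned char) win[0]), win + 2);
    if (n < 0 || (size_t) n >= size)
        return -1;
    for (i = 0; out[i] != '\0'; i++)    /* backslashes become slashes */
        if (out[i] == '\\')
            out[i] = '/';
    return 0;
}
```

Called with "c:/tools" and the prefix "/cygdrive" it writes /cygdrive/c/tools; with an empty prefix it writes /c/tools.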
You can install Cygwin and its basic utilities (Autoconf, Automake, Bash,
Bison, Coreutils, Diffutils, Findutils, Flex, Gawk, Grep, Libtool, M4,
Make, Patch, Sed, Which) from any Cygwin mirror by using the
setup program.
Then install Mingw; you'd best use the latest
regular release ("Current"). Mingw can be downloaded from its
Sourceforge site. You'll
need GCC, Binutils and Windows API. Do not install these into the Cygwin
directory. Make sure the directory with the GCC and Binutils executables comes
before the Cygwin ones in your Path. You cannot use the
Cygwin GCC and Binutils, because the executables they create are not native
Windows ones, but depend on the Cygwin emulation layer (cygwin1.dll).
If you use the sources from GnuWin, then these have already been patched
and configured and there is no need to execute configure. Remove any .deps
directories, because they contain the dependencies, mostly header files,
for the sources and these may be different for your machine; then execute
./config.status to recreate the default .deps directories.
If you use the original sources, the configuration and ad hoc changes needed
to compile are done in Makefile.mingw; type make -f Makefile.mingw
at the Bash prompt. General configure options have been set in a config.site
file; make sure that the environment variable CONFIG_SITE points to this file.
If there is no Makefile.mingw, then type ./configure. When configure has
finished, type make. Sometimes you need additional libraries and include
files. Usually the line export LIBS = ... and other lines with -l... in
Makefile.mingw show which additional libraries are needed. If you have these
libraries, then you will also have the include files needed by
these libraries. Rarely you need more include files; if on
compiling you get an error message about a missing include file,
then these might be found somewhere in the CygWin, Msys, or DJGPP distributions, but be
careful not to replace any native declarations. If you make from the original
sources, then you may need to apply patches from the patches
directory in the GnuWin sources, in particular when make exits prematurely
with an error message.
In Makefiles, you may have to change ln -s to cp, or use a version of ln
that actually copies instead of making soft links.
More and more packages use LibTool for compiling, linking and
installing. When installing into a directory with ~ in its name, such as
c:/progra~1, LibTool gets confused; so change all occurrences of ~ in
libtool into some other character, e.g. !
There have been reports that GnuWin
executables have crashed on systems with processors other than
Intel, e.g. on systems with an AMD processor. These crashes can
be avoided by compiling with options specific to Win32 systems, e.g.
by using -mms-bitfields -march=i386 as options to GCC.
Several packages use functions that are standard on Unix, for example for obtaining the user name. Some have MS-Windows equivalents, others don't. Where one is missing, you will have to provide an MS-Windows equivalent that does something sensible; usually a dummy that does nothing also works. Equivalents for several functions are in the LibGw32C library, which is an extension of the Msup and Mstubs libraries. Source code, e.g. from LibGw32C, for the needed functions can be copied to the package sources; you'll also have to adapt your own Makefiles and include files. Examples of code conversion between Unix and MS-Windows can also be found in Chapter 9 of the Unix Application Migration Guide on MSDN.
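As an illustration of such a replacement, here is a minimal sketch of a stand-in for a Unix user-name lookup. The function name fallback_user_name is hypothetical (it is not taken from LibGw32C), and it assumes that the environment variable USERNAME is set on MS-Windows and USER on Unix-like shells.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a Unix user-name lookup such as getlogin().
   Never returns NULL: when nothing is known it returns a dummy name. */
const char *fallback_user_name(void)
{
    const char *name = getenv("USERNAME");   /* usually set on MS-Windows */

    if (name == NULL || *name == '\0')
        name = getenv("USER");               /* usually set by Unix shells */
    if (name == NULL || *name == '\0')
        name = "unknown";                    /* dummy that still "works" */
    return name;
}
```

A dummy of this kind is often enough when the package only prints the name or uses it in a file name.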
Packages that contain a library usually build only a static
library (with extension .a). A dynamic link library (DLL)
with corresponding import library can be built from this static
library with the linker ld, by dlltool or by dllwrap
(provided in the Mingw BinUtils collection). The shell scripts a2dll
and o2dll show more details.
If a package has originally been configured by means of autoconf
(shown by the existence of the file configure.in or configure.ac),
then it might be reconfigured to make dynamic libraries, but very
often this does not seem to be worth the trouble.
When you have built the DLL, you can rebuild the executables
such that they use the DLL. Delete or rename the executable first.
Since often the Makefile calls the library explicitly (for
example ../.libs/foo.a) rather than with the -L/-l options
(in the example: -L ../.libs -lfoo), either change the Makefile
or temporarily rename the import library to the name of the
static library. Then run make again.
For libraries that are called in the standard way with -L/-l,
Mingw automatically chooses the import library for the DLL rather
than the static library if the import library has extension .dll.a.
For packages that use LibTool, this will not work, since
LibTool then remakes the static library. Instead change in libfoo.la
(in the directory just above the .libs directory that
contains libfoo.a), the term libfoo.a to libfoo.dll.a,
and run make again. In principle, LibTool will build
dynamic libraries if the option --enable-shared to configure
has been set, but in practice only the latest versions of LibTool
can handle this and even then you may still end up with a static
library. Some helper scripts, latool and rctool,
may be used instead of LibTool when dynamic
libraries are to be created.
It is possible to create import libraries for use with MSVC and BCC.
On Unix it is common practice to add a version number to the names of shared libraries; releases of a shared library that have the same version number have compatible interfaces, i.e. functions are called in the same way. On MS-Windows this also seems useful, so GnuWin dynamic libraries have a version number attached, usually computed from the LibTool interface version number.
Be careful not to mix different versions of the same library, since this may
lead to crashes. In particular do not mix different run-time libraries, such as
crtdll.dll, msvcrt.dll, msvcrtnn.dll, where nn
denotes the version number (20, 30, ...); see the MS Knowledge Base and MSDN.
Nor should you mix CygWin dlls and native dlls.
Mingw versions 2.95.3-3 and earlier cannot import static data
from a DLL in the standard way, i.e. by using the extern
declaration. This shows as an auto-import warning when
linking an executable that uses the DLL: Warning: resolving
vvv by linking to __imp__vvv (auto-import), where vvv
is the name of the static variable. It may also show as an Undefined
reference to _nm__vvv or Undefined reference to dllname_dll_a_iname,
where dllname is the name of the dll to
be created. If this occurs you may have to change some include
files that declare these static data. Include at the start of the
source:
#ifndef __GNUC__
# define __DLL_IMPORT __declspec(dllimport)
#else
# define __DLL_IMPORT __attribute__((dllimport)) extern
#endif
#if defined (BUILD_ddd_DLL) || !defined (__WIN32__)
# define DLL_IMPORT extern
#else
# define DLL_IMPORT __DLL_IMPORT
#endif
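Replace extern by DLL_IMPORT in all relevant
places in the include files, and add -DBUILD_ddd_DLL=1,
where ddd indicates the DLL, as a flag to the compiler
when compiling the code of the DLL itself; code that imports from the DLL
is compiled without this flag, so that DLL_IMPORT expands to the
dllimport declaration.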
Versions 2.95.3-4 and higher circumvent this auto-import problem
when the option --enable-auto-import is given to the
linker; for versions 2.95.3-6 and higher this is the default
behaviour, so you need not set the option. Very occasionally you
still get an error message; solve this in the above manner (see
also the documentation of the GNU linker ld, section 2.1.1).
On MS-Windows there is a difference between
text filemode and binary filemode. Normal text files are files where CR-LF
signifies a line ending. Text files with LF as line endings can also be
read correctly by the input functions of the runtime library; the
only error occurs with ftell in the last part of the
file (see below).
Unless you are sure that a file is always a text file, it is
best to open it in binary mode; so add "b" to the mode
when using fopen and O_BINARY when using open.
For O_BINARY to be defined, you may have to include fcntl.h.
After a file has been opened, its mode may be changed by calling setmode
before any output or input has occurred.
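The two ways of requesting binary mode mentioned above might be sketched as follows. The function names are hypothetical, and the fallback definition of O_BINARY as 0 is a common convention for systems that make no text/binary distinction, not something prescribed by these notes.

```c
#include <fcntl.h>
#include <stdio.h>
#ifdef _WIN32
# include <io.h>
#else
# include <unistd.h>
#endif

#ifndef O_BINARY
# define O_BINARY 0   /* assumed fallback: no text/binary distinction */
#endif

/* Count the bytes of a file exactly as stored, using open(). */
long count_bytes_open(const char *path)
{
    char buf[512];
    long total = 0;
    int n;
    int fd = open(path, O_RDONLY | O_BINARY);

    if (fd < 0)
        return -1;
    while ((n = (int) read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}

/* The same count, using fopen() with "b" in the mode string. */
long count_bytes_fopen(const char *path)
{
    long total = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        total++;
    fclose(fp);
    return total;
}
```

In binary mode both functions count every CR and LF in the file as a separate byte.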
Standard input, output and error can be opened in binary mode by adding
#include <fcntl.h>
int _CRT_fmode = _O_BINARY;
to the beginning of the main program file, or by including stdbin.h.
Alternatively, you can compile stdbin.h into a small library and link it
to the executable.
Similarly, all other files will be opened in binary mode, even
when "b" has not been specified in the mode parameter of fopen, when
#include <fcntl.h>
int _fmode = _O_BINARY;
is added to the beginning of the main program file, when binmode.h
is included, or when binmode.h has been compiled into a library and
linked to the executable.
The result of ftell when a file with LF characters as
line endings is opened as a text file may differ from the result
when the same file is opened as a binary file. When a file
containing CR-LF characters is opened as a text file, the CRs are
deleted while reading; this is done when characters from the file
are transferred to the read buffer. ftell correctly computes the
number of bytes for a position in this file by doubling the
number of LFs that are still in the read buffer. When a file
with LFs as line endings is opened as a text file, then ftell
again doubles the number of LFs still in the read buffer
when computing the number of bytes, but now this is of course
incorrect. Because of the particular way the CRs in a CR-LF text
file are deleted, this error only matters when the last part of
the file is in the read buffer, so that normally positions in the
last 512 bytes of the file are incorrectly determined by ftell.
This does not matter when the result of ftell is only used as
input for fseek to return to a previous position, but it does
matter when ftell is used to determine the absolute position in a
file.
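A minimal sketch of the two uses of ftell distinguished above: the value is used both as a bookmark for fseek and as an absolute offset. The file is opened in binary mode so that the offset is exact. The function name second_line_offset is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns the byte offset at which the second line starts, or -1 on
   any error. The same offset is used as a bookmark: the second line
   is read, the stream is repositioned with fseek, and the line is
   read again to check that both reads agree. */
long second_line_offset(const char *path)
{
    char skip[256], first[256], again[256];
    long mark = -1;
    FILE *fp = fopen(path, "rb");       /* binary: offsets are exact */

    if (fp == NULL)
        return -1;
    if (fgets(skip, sizeof skip, fp) != NULL         /* skip line 1 */
        && (mark = ftell(fp)) >= 0                   /* bookmark line 2 */
        && fgets(first, sizeof first, fp) != NULL
        && fseek(fp, mark, SEEK_SET) == 0            /* back to bookmark */
        && fgets(again, sizeof again, fp) != NULL
        && strcmp(first, again) == 0) {
        fclose(fp);
        return mark;
    }
    fclose(fp);
    return -1;
}
```

If the file were opened in text mode instead, the bookmark use would still work, but the returned number could not be relied upon as an absolute position.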
The path separator on Unix is the colon (:) and the directory
separator is the forward slash (/); on MS-Windows these are the
semicolon (;) and the backslash (\). Filenames with forward slashes are
understood by MS-Windows, but you will have to change colons to semicolons
when used as path separator. Tests for absolute filenames (on Unix filenames
starting with /, on MS-Windows filenames starting with x:/ or \\) must also
be changed, as well as absolute filenames such as /tmp/..., /usr/...,
/dev/..., /etc/.... These filename issues may also occur in shell scripts
provided with the package. Often they are also the cause of failure in tests
or checks with make test or make check.
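The separators and the absolute-filename tests described above might be written as follows. This is a sketch: the macro and function names are hypothetical, and the MS-Windows test also accepts x:\ and // as variants of the forms named in the text.

```c
#include <ctype.h>

#ifdef _WIN32
# define PATH_LIST_SEP ';'      /* separates directories in a search path */
#else
# define PATH_LIST_SEP ':'
#endif

#define IS_DIR_SEP(c) ((c) == '/' || (c) == '\\')

/* MS-Windows test: "x:/..." (or "x:\...") or a "\\server" style prefix. */
int is_absolute_win32(const char *name)
{
    if (isalpha((unsigned char) name[0]) && name[1] == ':'
        && IS_DIR_SEP(name[2]))
        return 1;
    return IS_DIR_SEP(name[0]) && IS_DIR_SEP(name[1]);
}

/* Unix test: the name starts with "/". */
int is_absolute_unix(const char *name)
{
    return name[0] == '/';
}
```

A port would typically replace each hard-coded test for a leading / by a call that selects the appropriate function for the platform.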
Temporary file names may either be hardcoded (/tmp/...)
or created with the help of an environment variable, usually TMP
or TMPDIR. On Windows, the temporary file directory is Temp
or Windows/Temp; and on Win9x the corresponding
environment variable is TEMP. You will have to change
the Unix names, set the Unix environment variables, or adapt the
source to look also for the Windows environment variable TEMP.
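Adapting the source to look also for TEMP could be done along these lines. The function name find_temp_dir, the order in which the variables are tried and the final default of /tmp are assumptions made for this sketch.

```c
#include <stdlib.h>

/* Return the first non-empty value of TMPDIR, TMP or TEMP; fall back
   to "/tmp" (an assumed default) when none of them is set. */
const char *find_temp_dir(void)
{
    static const char *const vars[] = { "TMPDIR", "TMP", "TEMP" };
    size_t i;

    for (i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *dir = getenv(vars[i]);
        if (dir != NULL && *dir != '\0')
            return dir;
    }
    return "/tmp";
}
```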
Wildcards on the command-line are expanded by the command-line interpreter. If you wish to disable this filename globbing, then add
int _CRT_glob = 0;
to the beginning of the main program file.
On Unix, executables usually are installed into /usr/bin
and implementation-independent files, such as configuration and
language files, in /usr/share or in /usr/etc,
whose names are often hard-coded in the executable; see the File System
Hierarchy Standard. On MS-Windows there is no default location, and
instead most packages go into a directory of their own, e.g.
E:/Program Files/<package>. When the name of the
implementation-independent directory is hard-coded in the
program, packages with implementation-independent files must be
installed in their default installation directory, which for GnuWin is
always C:/Progra~1/<package>.
It is not very difficult to change a program such that it also
looks into the implementation-independent directory relative to
the directory where the executable is installed; for example,
when the program has been installed into D:/Applic/<package>,
it looks for its configurations in say C:/Progra~1/<package>/share
and when nothing has been found there, it looks in D:/Applic/<package>/share.
This solution has been followed in the later ports on GnuWin, which
thus may be installed in any directory provided the subdirectory
structure is maintained. Native language support (NLS) in LibIntl has also been adapted in
this way. An alternative solution would have been to let the
program read an initialization file in its program directory or
let it read the registry.
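The second step of that lookup, deriving the share directory from the location of the executable, might look as follows. This is a sketch only and not the Gnulib relocation code: the function name share_dir_from_exe is hypothetical, and it assumes that the executable lives in a subdirectory (such as bin) directly below the package directory.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: from ".../<package>/bin/prog.exe" derive
   ".../<package>/share". Returns 0 on success, -1 if the name has
   fewer than two directory separators or the buffer is too small. */
int share_dir_from_exe(const char *exe, char *out, size_t size)
{
    const char *end = exe + strlen(exe);
    int seps = 0;
    int n;

    /* walk back over the file name and the directory that holds it */
    while (end > exe && seps < 2) {
        end--;
        if (*end == '/' || *end == '\\')
            seps++;
    }
    if (seps < 2)
        return -1;
    n = snprintf(out, size, "%.*s/share", (int) (end - exe), exe);
    if (n < 0 || (size_t) n >= size)
        return -1;
    return 0;
}
```

A program would first try the hard-coded default directory and only use the derived directory when nothing is found there.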
For this so-called run-time relocation it is best to use
Gnulib. You'll need the source files error.c, progname.c, progreloc.c,
relocatable.c, and the header files areadlink.h, error.h, progname.h,
relocatable.h. Add the additional
source files to the files to be compiled either in the package library, usually
in the directory gl or lib, or to the sources, usually in the directory
src. You must also define the macros INSTALLPREFIX
equal to the original installation directory, INSTALLDIR equal to
the original installation directory of the executables, EXEEXT
equal to the extension of the executable, as well as NO_XMALLOC
(unless you have a function xmalloc, in which case you must use
xreadlink.h instead of areadlink.h). In the language of Autoconf,
this usually amounts to
-DINSTALLPREFIX="$(prefix)" -DINSTALLDIR="$(bindir)" -DEXEEXT="$(EXEEXT)"
-DNO_XMALLOC
In the source files you must replace each occurrence of filenames to be
relocated by relocate(<filename>); in each source file where you do
this, you must include the header file relocatable.h, preferably in the form
#ifdef ENABLE_RELOCATABLE
# include <relocatable.h>
#else
# define relocate(path) (path)
#endif
In the main source file, usually main.c, you
must add the statement set_program_name(argv[0]); and include the
header file progname.h. If in main.c a variable program_name
has already been declared,
you must remove this declaration as well as the statement that assigns a value
to program_name, usually argv[0].
Normally the functions of the MS-Windows C-runtime library (msvcrt.dll)
can access files up to 2^31-1 bytes, i.e. 2 GB. In
particular this holds for the group of stat and seek
functions: stat, fstat, seek, fseek, lseek, tell, and ftell, as well as the
related types ino_t and off_t. Special msvcrt functions
and types, indicated by the addition of i64 to their
name, can access files up to 2^63-1 bytes, i.e. about 9 EB (exabyte)
= 9,000,000 TB (terabyte) = 9,000,000,000 GB. Large-file support
(LFS) has been implemented by redefining the stat and seek
functions and types to their 64-bit equivalents. For fseek and
ftell, separate LFS versions, fseeko and ftello,
based on fsetpos and fgetpos, are provided in LibGw32C.
More information about LFS on Unix can be found at Freshmeat, in
the Single Unix Specification, and the documents of the
Large File Summit.
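As an illustration of the i64 variants, the following sketch obtains a file size as a 64-bit number. It is not the LibGw32C implementation: the function name file_size64 is hypothetical, and on other systems it simply falls back to stat, whose off_t may or may not be 64 bits wide depending on the platform and compile flags.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Return the size of a file as a 64-bit number, or -1 on error. */
int64_t file_size64(const char *path)
{
#ifdef _WIN32
    struct _stati64 st;             /* msvcrt variant with 64-bit st_size */

    if (_stati64(path, &st) != 0)
        return -1;
#else
    struct stat st;                 /* width of st_size is platform-dependent */

    if (stat(path, &st) != 0)
        return -1;
#endif
    return (int64_t) st.st_size;
}
```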
fork is the function that implements subprocesses on Unix. It
does not exist on MS-Windows, and has to be replaced by a series
of different API calls, such as spawn or CreateProcess.
Chapter 9 of the Unix Application Migration Guide, topics
Interprocess Communication and Appendixes E and F,
gives some examples.
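For the common case where fork is immediately followed by exec and a wait, a replacement might be sketched like this. The wrapper name run_and_wait is hypothetical; the MS-Windows branch assumes the _spawnvp function and the _P_WAIT mode from process.h, as provided by msvcrt.

```c
#ifdef _WIN32
# include <process.h>
#else
# include <sys/types.h>
# include <sys/wait.h>
# include <unistd.h>
#endif

/* Run a program and wait for it to finish.
   Returns its exit status, or -1 if it could not be run. */
int run_and_wait(char *const argv[])
{
#ifdef _WIN32
    /* no fork here: the child is started and awaited in one call */
    return (int) _spawnvp(_P_WAIT, argv[0], (const char *const *) argv);
#else
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached when exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
#endif
}
```

Uses of fork where parent and child both continue in the same program cannot be converted this mechanically and usually need restructuring.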
The MS-Windows equivalent of the Unix inode number is the FileIndex
from the BY_HANDLE_FILE_INFORMATION structure, returned by the Win32
API function GetFileInformationByHandle.
The FileIndex is a 64-bit number that on WinNT systems (NT, 2000,
XP, 2003, Vista, 2008) indicates the position of the file in the
Master File Table (MFT).
On Windows XP and higher, one can also obtain this number by using the
command fsutil usn readdata <path>. It is stable between successive starts
of the system, provided the MFT does not overflow and therefore has to be
rebuilt. It is not stable for
files on network drives; successive calls to GetFileInformationByHandle
return different values. For FAT file systems, the MSDN documentation for
BY_HANDLE_FILE_INFORMATION says: "In
the FAT file system, the file ID is generated from the first cluster of the
containing directory and the byte offset within the directory of the entry for
the file. Some defragmentation products change this byte offset. (Windows in-box
defragmentation does not.) Thus, a FAT file ID can change over time. Renaming a
file in the FAT file system can also change the file ID, but only if the new
file name is longer than the old one." Because of this, on FAT systems the
file index for directories is zero. Note that in the
Windows FileId API Library, the file index is named FileId.
The FileIndex consists of two
parts: the low 48 bits are the so-called file reference number and contain the
actual index in the MFT; the high 16 bits are the so-called sequence number:
each time an
entry in the MFT is reused for another file, the sequence number is increased by
one. This behavior of the sequence number can be observed by creating a file,
printing its FileIndex, deleting it, creating a new file and printing its
FileIndex; the FileIndex of the newest file is equal to that of the first file,
with the sequence number, in the leftmost part of the FileIndex, increased by
one. So the file reference number appears to be the
equivalent of the Unix inode.
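The split of the FileIndex into its two parts can be expressed with a few lines of portable C. The macro and function names are hypothetical; the two 32-bit halves correspond to the nFileIndexHigh and nFileIndexLow members of BY_HANDLE_FILE_INFORMATION.

```c
#include <stdint.h>

#define SEQNUM_BITS 16
#define REFNUM_MASK ((~UINT64_C(0)) >> SEQNUM_BITS)   /* low 48 bits */

/* Combine the two 32-bit halves into one 64-bit FileIndex. */
uint64_t file_index_make(uint32_t high, uint32_t low)
{
    return ((uint64_t) high << 32) | low;
}

/* File reference number: the index in the MFT (low 48 bits). */
uint64_t file_index_refnum(uint64_t index)
{
    return index & REFNUM_MASK;
}

/* Sequence number: how often the MFT entry was reused (high 16 bits). */
unsigned file_index_seqnum(uint64_t index)
{
    return (unsigned) (index >> (64 - SEQNUM_BITS));
}
```

For a FileIndex with high half 0x00030000 and low half 5, the file reference number is 5 and the sequence number is 3.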
Linux-NTFS has some
documentation about NTFS as well as some programs that can be used to
investigate the MFT and which show the described behavior of the FileIndex.
For example, the docs say, and the programs confirm this, that the root
directory of a volume always has a file reference number of 5, because that
is its index in the MFT.
An inode number for regular files, and for directories on WinNT, might be created as follows:
#include <sys/stat.h>
#include <io.h>
#include <stdint.h>
#include <windows.h>

#define LODWORD(l) ((DWORD)((DWORDLONG)(l)))
#define HIDWORD(l) ((DWORD)(((DWORDLONG)(l)>>32)&0xFFFFFFFF))
#define MAKEDWORDLONG(a,b) ((DWORDLONG)(((DWORD)(a))|(((DWORDLONG)((DWORD)(b)))<<32)))
#define INOSIZE (8*sizeof(ino_t))
#define SEQNUMSIZE (16)

ino_t getino (char *path)
{
    BY_HANDLE_FILE_INFORMATION FileInformation;
    HANDLE hFile;
    uint64_t ino64, refnum;
    ino_t ino;

    if (!path || !*path) /* path is NULL or empty */
        return 0;
    if (access (path, F_OK)) /* path does not exist */
        return -1;
    /* obtain handle to "path"; FILE_FLAG_BACKUP_SEMANTICS is used to open
       directories */
    hFile = CreateFile (path, 0, 0, NULL, OPEN_EXISTING,
                        FILE_FLAG_BACKUP_SEMANTICS | FILE_ATTRIBUTE_READONLY,
                        NULL);
    if (hFile == INVALID_HANDLE_VALUE) /* file cannot be opened */
        return 0;
    ZeroMemory (&FileInformation, sizeof(FileInformation));
    if (!GetFileInformationByHandle (hFile, &FileInformation)) {
        /* cannot obtain FileInformation */
        CloseHandle (hFile);
        return 0;
    }
    ino64 = (uint64_t) MAKEDWORDLONG (
        FileInformation.nFileIndexLow, FileInformation.nFileIndexHigh);
    refnum = ino64 & ((~(0ULL)) >> SEQNUMSIZE); /* strip sequence number */
    /* transform the 64-bit ino into a 16-bit one by hashing */
    ino = (ino_t) (
          ( (LODWORD(refnum)) ^ ((LODWORD(refnum)) >> INOSIZE) )
        ^ ( (HIDWORD(refnum)) ^ ((HIDWORD(refnum)) >> INOSIZE) )
        );
    CloseHandle (hFile);
    return ino;
}
An inode for fstat can be implemented similarly, by obtaining the handle
from the file descriptor:
/* obtain handle to file descriptor "fd" */
hFile = (HANDLE) _get_osfhandle (fd);
Do not close the handle after obtaining the FileInformation, since otherwise
fd will also be closed.
For directories on Win9x and for network files, one might use a hashed value of the full path of the file.
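A hashed value of the full path could be computed as in the following sketch. The choice of the FNV-1a hash and the folding to 16 bits are assumptions made for illustration; the notes do not prescribe a particular hash, and the function names are hypothetical. The caller is assumed to pass a full path in a consistent spelling, since different spellings of the same file would hash differently.

```c
#include <stdint.h>

/* 32-bit FNV-1a hash of a string. */
uint32_t fnv1a32(const char *s)
{
    uint32_t h = UINT32_C(2166136261);      /* FNV offset basis */

    while (*s != '\0') {
        h ^= (unsigned char) *s++;
        h *= UINT32_C(16777619);            /* FNV prime */
    }
    return h;
}

/* Fold the hash of a full path into a 16-bit inode-like number. */
unsigned short path_to_ino16(const char *full_path)
{
    uint32_t h = fnv1a32(full_path);

    return (unsigned short) ((h >> 16) ^ (h & 0xFFFFu));
}
```

Unlike a real inode number, such a value changes when the file is renamed, and two different paths may occasionally receive the same number.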
For cross-compiling on a Linux system, see Volker Grabsch's cross-compiling pages.