Example usage of file | |
---|---|
Developer(s) | AT&T Bell Laboratories |
Initial release | 1973, as part of Unix Research Version 4; open-source reimplementation in 1986 |
Repository | github |
Written in | C |
Operating system | Unix, Unix-like, Plan 9, IBM i |
Platform | Cross-platform |
Type | File type detector |
License | BSD license, CDDL; Plan 9: MIT License |
Website | darwinsys |
The file command is a standard program of Unix and Unix-like operating systems for recognizing the type of data contained in a computer file.
The original version of file first appeared in Unix Research Version 4[1] in 1973. System V brought a major update with several important changes, most notably moving the file type information into an external text file rather than compiling it into the binary itself.
Most major BSD and Linux distributions use a free, open-source reimplementation written from scratch in 1986–87 by Ian Darwin;[2] it keeps file type information in a text file with a format based on that of the System V version. It was expanded by Geoff Collyer in 1989 and has since had input from many others, including Guy Harris, Chris Lowth and Eric Fischer; from late 1993 onward its maintenance has been organized by Christos Zoulas. The OpenBSD system has its own subset implementation written from scratch, but it still uses the Darwin/Zoulas collection of type information in magic file format.
The file command has also been ported to the IBM i operating system.[3]
The Single UNIX Specification (SUS) specifies that a series of tests are performed on the file specified on the command line:
1. If the file cannot be read, or its file type cannot be determined, the file program will indicate that the file was processed but its type was undetermined.
2. file must be able to determine the types directory, FIFO, socket, block special file, and character special file.
3. Zero-length files are identified as such.
4. An initial part of the file is considered, and file is to use position-sensitive tests.
5. The entire file is considered, and file is to use context-sensitive tests.
6. Otherwise, the file is identified as a data file.

file's position-sensitive tests are normally implemented by matching various locations within the file against a textual database of magic numbers (see the Usage section). This differs from other simpler methods such as file extensions and schemes like MIME.
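The order of the tests can be illustrated with a short Python sketch. It is not how any real file implementation is written: the function name classify, the two-entry POSITION_TESTS table, and the crude "ASCII text" check are assumptions made for illustration, and the zero-length step follows the SUS requirement that empty files be reported as such.

```python
import os
import stat

# Hypothetical, tiny stand-in for the magic database: (offset, bytes, description).
POSITION_TESTS = [
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"\x7fELF", "ELF"),
]

# Bytes accepted by the toy context-sensitive "is this text?" test.
TEXT_BYTES = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def classify(path):
    """Apply the SUS-style tests in order and return a short description."""
    # 1. Unreadable file or undeterminable status.
    try:
        st = os.stat(path)
    except OSError:
        return "cannot open (type undetermined)"

    # 2. File types that are decided from the file status alone.
    mode = st.st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"

    # 3. Zero-length files.
    if st.st_size == 0:
        return "empty"

    try:
        with open(path, "rb") as handle:
            data = handle.read()
    except OSError:
        return "cannot open (type undetermined)"

    # 4. Position-sensitive tests at fixed offsets.
    for offset, magic, description in POSITION_TESTS:
        if data[offset:offset + len(magic)] == magic:
            return description

    # 5. Context-sensitive test over the whole file.
    if all(byte in TEXT_BYTES for byte in data):
        return "ASCII text"

    # 6. Fallback when nothing else matched.
    return "data"
```

Device and FIFO checks come before any read, so the sketch never opens a special file; only regular files reach the content tests.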
In the System V implementation, the Ian Darwin implementation, and the OpenBSD implementation, the file command uses a database to drive the probing of the lead bytes. That database is implemented in a file called magic, which is usually located at /etc/magic, /usr/share/file/magic or a similar location.
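The idea of driving the probing from an external text database can be sketched in Python as follows. The layout used here (offset, type, test value, message) is a heavily simplified imitation of a magic file that handles only string tests; the sample entries, MAGIC_TEXT, parse_magic and probe are hypothetical names chosen for this illustration, not part of any real implementation.

```python
# Hypothetical database text in a simplified magic-like layout:
# offset, type, test value, message. Only "string" tests are handled here.
MAGIC_TEXT = r"""
# offset  type    test      message
0         string  \x7fELF   ELF
0         string  %PDF-     PDF document
257       string  ustar     POSIX tar archive
"""


def parse_magic(text):
    """Parse the simplified database into (offset, pattern, message) rules."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        offset, kind, test, message = line.split(None, 3)
        if kind != "string":
            continue  # other test types are outside this sketch
        # Turn escapes such as \x7f into the raw bytes they stand for.
        pattern = test.encode("ascii").decode("unicode_escape").encode("latin-1")
        rules.append((int(offset), pattern, message))
    return rules


def probe(data, rules):
    """Return the message of the first rule whose bytes match, else None."""
    for offset, pattern, message in rules:
        if data[offset:offset + len(pattern)] == pattern:
            return message
    return None
```

Because the rules live in text rather than in the program, new file types can be recognized by editing the database alone, which is the change the System V version introduced.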
The SUS[4] mandates the options -d, -h, -i, -M file and -m file.
Other Unix and Unix-like operating systems may add options beyond these. Ian Darwin's implementation adds -s ('special files'), -k ('keep-going') and -r ('raw') (examples below), among many others.[5]
The command tells only what the file looks like, not what it is (in the case where file looks at the content). It is easy to fool the program by placing a magic number at the start of a file whose content does not match it. Thus the command is not usable as a security tool other than in specific situations.
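A small Python sketch makes the limitation concrete. It compares a check that looks only at the two leading gzip magic bytes with one that actually attempts decompression; the helper names looks_like_gzip and is_valid_gzip are invented for this example.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # leading bytes of a gzip stream


def looks_like_gzip(data):
    """Magic-number check only: inspects the first two bytes."""
    return data[:2] == GZIP_MAGIC


def is_valid_gzip(data):
    """Stricter check: actually try to decompress the data."""
    try:
        gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return False
    return True


# A forged file: correct magic number, arbitrary content after it.
fake = GZIP_MAGIC + b"this is not a gzip stream"
# A genuine gzip stream for comparison.
real = gzip.compress(b"payload")
```

Here looks_like_gzip accepts both fake and real, while is_valid_gzip rejects fake, which is why a magic-number match alone should not be treated as proof of a file's format.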
$ file file.c
file.c: C program text
$ file program
program: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), stripped
$ file /dev/hda1
/dev/hda1: block special (0/0)
$ file -s /dev/hda1
/dev/hda1: Linux/i386 ext2 filesystem
Note that -s is a non-standard option available only on the Ian Darwin branch, which tells file to read device files and try to identify their contents rather than merely identifying them as device files. Normally file does not try to read device files since reading such a file can have undesirable side effects.
$ file -k -r libmagic-dev_5.35-4_armhf.deb # (on Linux)
libmagic-dev_5.35-4_armhf.deb: Debian binary package (format 2.0)
- current ar archive
- data
With Ian Darwin's non-standard -k option, the program does not stop at the first match but continues looking for other matching patterns. The -r option, which is available in some versions, causes the unprintable newline character to be displayed in its raw form rather than in its octal representation.
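The difference between stopping at the first match and continuing can be sketched in Python. The two rules below are hypothetical and rely only on the fact that a .deb file is an ar archive whose first member is named debian-binary; the function identify and its keep_going parameter are illustrative names, not the actual implementation.

```python
# Hypothetical ordered rules: (offset, bytes, description).
RULES = [
    (8, b"debian-binary", "Debian binary package"),
    (0, b"!<arch>\n", "current ar archive"),
]


def identify(data, keep_going=False):
    """Return matching descriptions; stop at the first unless keep_going."""
    hits = []
    for offset, pattern, description in RULES:
        if data[offset:offset + len(pattern)] == pattern:
            hits.append(description)
            if not keep_going:
                break
    return hits


# Start of a .deb file: the ar magic followed by the first member name.
sample = b"!<arch>\ndebian-binary   "

# Join the hits with a raw newline, similar in spirit to combining -k and -r.
print("\n- ".join(identify(sample, keep_going=True)))
```

Without keep_going the sketch reports only the first, most specific description; with it, every rule that matches contributes a line.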
$ file compressed.gz
compressed.gz: gzip compressed data, deflated, original filename, `compressed', last modified: Thu Jan 26 14:08:23 2006, os: Unix
$ file -i compressed.gz # (on Linux)
compressed.gz: application/x-gzip; charset=binary
$ file data.ppm
data.ppm: Netpbm PPM "rawbits" image data
$ file /bin/cat
/bin/cat: Mach-O universal binary with 2 architectures
/bin/cat (for architecture ppc7400): Mach-O executable ppc
/bin/cat (for architecture i386): Mach-O executable i386
$ file /usr/bin/vi
/usr/bin/vi: symbolic link to vim
Identification of symbolic links is not available on all platforms, and symbolic links are dereferenced if -L is passed or POSIXLY_CORRECT is set.
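The rule for symbolic links can be sketched in Python as follows. This is only an approximation of the behaviour described above: follow_links, describe and the dash_l parameter are hypothetical names, and the environment is passed in explicitly so that the effect of POSIXLY_CORRECT can be shown without changing the real environment.

```python
import os


def follow_links(dash_l=False, environ=None):
    """Decide whether symbolic links should be dereferenced."""
    environ = os.environ if environ is None else environ
    return dash_l or "POSIXLY_CORRECT" in environ


def describe(path, dash_l=False, environ=None):
    """Report a symbolic link as such unless links are being followed."""
    if not follow_links(dash_l, environ) and os.path.islink(path):
        return "symbolic link to " + os.readlink(path)
    # os.path.isdir and os.path.isfile follow symbolic links.
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "regular file"
    return "other"
```

For a link vi pointing at vim, describe reports the link itself by default and the link's target when dash_l is true or POSIXLY_CORRECT appears in the supplied environment.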
As of version 4.00 of the Ian Darwin/Christos Zoulas version of file, the functionality of file is incorporated into a libmagic library that is accessible via C (and C-compatible) linking;[7][8] file is implemented using that library.[9][10]
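Because libmagic exposes a C interface, it can be reached from other languages through C-compatible linking. The following Python sketch uses ctypes and the library's magic_open, magic_load, magic_file and magic_close functions; the wrapper name describe_with_libmagic is hypothetical, and the sketch assumes that a shared libmagic and its default database are installed, returning None when they are not.

```python
import ctypes
import ctypes.util
import os

MAGIC_NONE = 0  # default flags value, as defined in magic.h


def describe_with_libmagic(path):
    """Return libmagic's description of path, or None if unavailable."""
    libname = ctypes.util.find_library("magic")
    if libname is None:
        return None
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None

    # Declare the C signatures so pointers are not truncated to int.
    lib.magic_open.restype = ctypes.c_void_p
    lib.magic_open.argtypes = [ctypes.c_int]
    lib.magic_load.restype = ctypes.c_int
    lib.magic_load.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    lib.magic_file.restype = ctypes.c_char_p
    lib.magic_file.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    lib.magic_close.restype = None
    lib.magic_close.argtypes = [ctypes.c_void_p]

    cookie = lib.magic_open(MAGIC_NONE)
    if not cookie:
        return None
    try:
        # Passing NULL asks libmagic to load its default database.
        if lib.magic_load(cookie, None) != 0:
            return None
        result = lib.magic_file(cookie, os.fsencode(path))
        if result is None:
            return None
        return result.decode("utf-8", "replace")
    finally:
        lib.magic_close(cookie)
```

Calling describe_with_libmagic on a path should yield a description string similar to what the file command prints after the file name, although the exact wording depends on the installed database.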
Original source: https://en.wikipedia.org/wiki/File (command).