LINUX GAZETTE

September 2003, Issue 94       Published by Linux Journal



Linux Gazette Staff and The Answer Gang

TAG Editor: Heather Stern
Senior Contributing Editor: Jim Dennis
Contributing Editors: Ben Okopnik, Dan Wilder, Don Marti

TWDT 1 (gzipped text file)
TWDT 2 (HTML file)
are files containing the entire issue: one in text format, one in HTML. They are provided strictly as a way to save the contents as one file for later printing in the format of your choice; there is no guarantee of working links in the HTML version.
Linux Gazette[tm], http://www.linuxgazette.com/
This page maintained by the Webmaster of Linux Gazette, webmaster@linuxgazette.com

Copyright © 1996-2003 Specialized Systems Consultants, Inc.

LINUX GAZETTE
...making Linux just a little more fun!
News Bytes
By Michael Conry

Selected and formatted by Michael Conry

Submitters, send your News Bytes items in PLAIN TEXT format. Other formats may be rejected without reading. You have been warned! A one- or two-paragraph summary plus URL gets you a better announcement than an entire press release. Submit items to bytes@linuxgazette.com


 September 2003 Linux Journal

The September issue of Linux Journal is on newsstands now. This issue focuses on Community Networks.

All articles in issues 1-102 are available for public reading at http://www.linuxjournal.com/magazine.php. Recent articles are available on-line for subscribers only at http://interactive.linuxjournal.com/.


Legislation and More Legislation


 European Software Patents

On the 1st of September 2003, the European Parliament will hold a vote which may have very far-reaching and long-lasting effects on the software industry and community within the European Union. The central issue being addressed in this vote is the patentability of software. In the past, there has been some vagueness in the attitude of the European Patent Office towards the patenting of software. Although official regulations appeared to make software, mathematics, algorithms and business methods essentially unpatentable, working practice in the EPO has been to bypass the legal framework intended to constrain it and to allow such innovations to be patented. The new directive on the patentability of computer-implemented inventions is supposed to be a measure aimed at resolving this confusion by regularising the rules regarding patentability. However, what the EU blurb glosses over is that the provisions in the new directive significantly alter the legislation currently governing software patentability. Rather than vindicating the existing legal situation, the legislation is being recast in the image of the current operations of the EPO. This is strikingly borne out by some research performed by the FFII. The FFII intended to show that the infamous "one click" Amazon.com patent would be acceptable under the proposed new regulations. During the course of these investigations, it emerged that Amazon.com had already been granted a closely related patent covering computerised methods of gift delivery.

Of course, when considering these changes we must ask ourselves whether perhaps these changes may be desirable. Though there are naturally those who support the initiative, there is a very broad constituency that strongly opposes this move towards European software patents. An unscientific measure of the opposition to the software patent proposals can be obtained by doing a search on Google News for the terms "european software patents". The vast majority of headlines are hostile or gloomy regarding the proposal. There is a striking absence of outright support, all the more striking given that this is a search of news outlets rather than personal or lobby-group websites. This scepticism is shared by many economists who fear that the legal changes will lead to a reduction in innovation and cutbacks in R&D expenditure. These fears are felt very acutely among small and medium size software companies who have perhaps the most to lose. Equally, open source developers may be left in a vulnerable position by these proposed changes. As has been seen in the operation of software patents in the United States, the patent system tends to work best for parties with large financial resources, such as multinational corporations. Such deep pockets allow an organisation to acquire a stock of patents, and then to defend the patents through the courts. A well resourced holder of even a very spurious patent can thus intimidate would-be competitors out of the market simply by virtue of the differences in scale. The only group which will benefit to a greater degree than large corporations is the legal fraternity.

It remains to be seen whether the protests and lobbying organised by anti-patent groups will prove to be effective. Though actions such as closing down websites make an impact online, the real world effect can be quite small. As was pointed out by the Register, even though open source produces great code, it does not necessarily produce great lobbying. The key for open-source groups elsewhere and in the future is to share information about what works and does not work in the political sphere, and to apply this information in future struggles.


 SCO

Writing an article on the SCO lawsuit(s) is getting steadily more difficult as the volume of material on the subject mounts up. Much of it is simply noise and it will not be until the case is dropped or reaches court that we will have a chance to properly judge the true nature of SCO's plans. This is especially true given SCO's reluctance to release any of the source code they claim is infringing their "intellectual" property (the words "SCO" and "intellectual" seem more mutually exclusive to me each day). Perhaps to impress investors, SCO did deign to display a couple of samples at their annual reseller show. This was very nice of them and illustrates why they should perhaps release more of the "disputed" code. Analysis done by Linux Weekly News and by Bruce Perens indicated that the origination of the code was entirely legal and did not infringe on SCO's property. SCO spokesman Blake Stowell's rather pointless response was to show a typically SCO-like disdain for facts and to assert that "at this point it's going to be his [Perens'] word against ours". Unfortunately for Blake, Perens' word is backed up by verifiable documentation and historical record not to mention the fact that people who worked on and remember the code are still alive. Meanwhile, SCO's assertions are, at least at this stage, no more than random bleatings.

Reaction to the SCO case has been mostly muted, though it is likely that some more-cautious corporate types are somewhat reluctant to engage further with Open Source and Free Software under the shadow of the court case. Few though are likely to be so nervous as to stump up the licence fees requested by SCO. The advice of Australian lawyer John Collins sounds about right:

"If you don't know whether or not you have a valid license because there is uncertainty as to the providence of the software and who actually owns the copyright, then to walk up and drop your pants to the person who is likely to sue you sounds a little counter-intuitive and a bit uncommercial,"

Some have speculated that the true purpose of SCO's actions may be connected to the (mostly positive) effect on its share price these developments have had. An example of these arguments can be found in the writings of Tim Rushing, though ultimately everybody is still speculating. Further analyses can be found at GrokLaw and at sco.iwethey.org, though keeping up with the twists and turns, not to mention the irrational behaviour of SCO execs, is rather taxing on the grey matter.


Linux Links

ActiveState has made freely available the ActiveState Field Guide to Spam. This is a living compilation of advanced tricks used by spammers to hide their messages from spam filters.

Ernie Ball guitar string company dumps Microsoft for Linux after BPA audit.

Linus says SCO is smoking crack.

The Register reported on the launch of Open Groupware.org, an application which claims to complete the OpenOffice productivity software set.

Bruce Perens analyzes SCO's code samples in detail.

Debian Weekly News highlighted an article by Ian Murdock arguing that Linux is a process, not a product.


Upcoming conferences and events

Listings courtesy Linux Journal. See LJ's Events page for the latest goings-on.

LinuxWorld UK
September 3-4, 2003
Birmingham, United Kingdom
http://www.linuxworld2003.co.uk

Linux Lunacy
Brought to you by Linux Journal and Geek Cruises!
September 13-20, 2003
Alaska's Inside Passage
http://www.geekcruises.com/home/ll3_home.html

Software Development Conference & Expo
September 15-18, 2003
Boston, MA
http://www.sdexpo.com

PC Expo
September 16-18, 2003
New York, NY
http://www.techxny.com/pcexpo_techxny.cfm

COMDEX Canada
September 16-18, 2003
Toronto, Ontario
http://www.comdex.com/canada/

IDUG 2003 - Europe
October 7-10, 2003
Nice, France
http://www.idug.org

Linux Clusters Institute Workshops
October 13-18, 2003
Montpellier, France
http://www.linuxclustersinstitute.org

Coast Open Source Software Technology (COSST) Symposium
October 18, 2003
Newport Beach, CA
http://cosst.ieee-occs.org

LISA (17th USENIX Systems Administration Conference)
October 26-30, 2003
San Diego, CA
http://www.usenix.org/events/lisa03/

HiverCon 2003
November 6-7, 2003
Dublin, Ireland
http://www.hivercon.com/

COMDEX Fall
November 17-21, 2003
Las Vegas, NV
http://www.comdex.com/fall2003/

Southern California Linux Expo (SCALE)
November 22, 2003
Los Angeles, CA
http://socallinuxexpo.com/

Linux Clusters Institute Workshops
December 8-12, 2003
Albuquerque, NM
http://www.linuxclustersinstitute.org

Storage Expo 2003, co-located with Infosecurity 2003
December 9-11, 2003
New York, NY
http://www.infosecurityevent.com/


News in General


 GNU Server breach

It emerged over the past month that the main file servers of the GNU project were compromised by a malicious cracker in mid-March. Although the breach was only noticed in July, it appears that no source code was tampered with. Nonetheless, it is important that individuals and organisations who may have downloaded from the compromised server verify for themselves that the code they received was intact and untainted. This incident should also bring home to users the importance of keeping up to date with patches and software updates, and the need to have established security procedures and backups in place.


 Alan Cox Sabbatical

Kerneltrap reported that Alan Cox is to take a one year sabbatical. He plans to spend his year studying for an MBA and learning Welsh.


 GNU/Linux Security Certification

Slashdot recently highlighted the story that IBM has succeeded in getting Linux certified under the Common Criteria specification. This has implications for government bodies considering Linux when making purchasing decisions. The Inquirer reports that this has been a bit of a black-eye for Red Hat, whose certification effort is stalled, held up indefinitely by the UK-based testing laboratory Red Hat selected to do the work.


Distro News


 Ark

Tux Reports have taken a look at Ark Linux. This RPM based distribution particularly aims to provide a comprehensive and useful desktop environment.


 Debian

Debian Weekly News linked to Jan Ivar Pladsen's document which describes how to install Debian GNU/Linux on Indy.


On August 16th, the Debian Project celebrated its 10th birthday. Linux Planet published a Debian 10-year retrospective to mark the occasion.


 Knoppix

Klaus Knopper describes the Philosophy behind Knoppix.


 Libranet

Linuxiran has reviewed Libranet GNU/Linux 2.8. Evidently they were impressed: "Only one word can describe Libranet's installer: 'awesome...'" (Courtesy Linux Today).


 Mepis

As highlighted by DWN, Mepis Linux is a LiveCD derived from Debian GNU/Linux. LinuxOnline has some articles on this distribution, including an overview, a full review and an interview with Mepis creator Warren Woodford.


 SuSE

SGI and SuSE Linux today announced plans to extend the Linux OS to new levels of scalability and performance by offering a fully supported 64-processor system running a fully supported, enterprise-grade Linux operating system. Expected to be available in October, SGI will bundle SuSE Linux Enterprise Server 8 on SGI Altix 3000 servers and superclusters.


Siemens Business Services has decided to use SuSE Linux Enterprise Server 8 to underpin its mySAP HR management system, processing payrolls for more than 170,000 employees worldwide. The open source operating system and the platform independence of the SAP R/3 software enable an easy migration to an open, powerful, and efficient Intel architecture. Linux-based application servers can be operated independently alongside existing Unix-based servers. Thus, the RM systems can continue to run until they are amortized and gradually replaced by Linux servers.


Software and Product News


 Biscom Announces Linux FAXCOM Server

Biscom, a provider of enterprise fax management solutions, has announced the market release of its Linux FAXCOM Server. The new product integrates the reliability and efficiency of the Windows FAXCOM Server with the stability and security of the Linux operating system. Linux FAXCOM Server has been thoroughly tested and is currently available for market release. Linux FAXCOM Server features support for multiple diverse document attachments via on-the-fly document conversion, and up to 96 ports on one fax server. Expanded fax routing destination options for inbound faxes include: fax port, dialed digits, sender's Transmitting Station Identifier (TSID) and Caller ID. Furthermore, if appropriate, the same fax may be routed to multiple destinations, including one or more printers.


 GNU Scientific Library 1.4 released

Version 1.4 of the GNU Scientific Library is now available at:

ftp://ftp.gnu.org/gnu/gsl/gsl-1.4.tar.gz
and from mirrors worldwide (see http://www.gnu.org/order/ftp.html).

The GNU Scientific Library (GSL) is a collection of routines for numerical computing in C. This release is backwards compatible with previous 1.x releases. GSL now includes support for cumulative distribution functions (CDFs) contributed by Jason H. Stover. The full NEWS file entry is appended below.


 Mod_python 3.1.0 Alpha

The Apache Software Foundation and The Apache HTTP Server Project have announced the 3.1.0 ALPHA release of mod_python.

Mod_python 3.1.0a is available for download from: http://httpd.apache.org/modules/python-download.cgi


 Samba

Linux Today has carried the news that Samba-3.0.0 RC2 is now available for download

 

Mick is LG's News Bytes Editor.

Born some time ago in Ireland, Michael is currently working on a PhD thesis in the Department of Mechanical Engineering, University College Dublin. The topic of this work is the use of Lamb waves in nondestructive testing. GNU/Linux has been very useful in this work, and Michael has a strong interest in applying free software solutions to other problems in engineering. When his thesis is completed, Michael plans to take a long walk.


Copyright © 2003, Michael Conry. Copying license http://www.linuxgazette.com/copying.html
Published in Issue 94 of Linux Gazette, September 2003

Ecol
By Javier Malonda

The Ecol comic strip is written for escomposlinux.org (ECOL), the web site that supports es.comp.os.linux, the Spanish USENET newsgroup for Linux. The strips are drawn in Spanish and then translated to English by the author.

These images are scaled down to minimize horizontal scrolling. To see a panel in all its clarity, click on it.

[cartoon]
[cartoon]
[cartoon]

All Ecol cartoons are at tira.escomposlinux.org (Spanish), comic.escomposlinux.org (English) and http://tira.puntbarra.com/ (Catalan). The Catalan version is translated by the people who run the site; only a few episodes are currently available.

These cartoons are copyright Javier Malonda. They may be copied, linked or distributed by any means. However, you may not distribute modifications. If you link to a cartoon, please notify Javier, who would appreciate hearing from you.

 


Copyright © 2003, Javier Malonda. Copying license http://www.linuxgazette.com/copying.html
Published in Issue 94 of Linux Gazette, September 2003

From C To Assembly Language
By Hiran Ramankutty




1. Overview

What is a microcomputer system made up of? The typical answer: a microprocessor unit (MPU), a bus system, a memory subsystem, an I/O subsystem and the interfaces among these components.

This is only the hardware side. Every microcomputer system also requires software to direct the hardware components as they perform their tasks. Computer software can be divided into system software and user software.

User software may include built-in and user-created libraries, in the form of subroutines, that are needed when preparing programs for execution.

System software may encompass a variety of high-level language translators, an assembler, a text editor, and several other programs that aid in the preparation of other programs. There are three levels of programming: machine language, assembly language and high-level language.

Machine language programs are programs that the computer can understand and execute directly (think of programming a microprocessor kit). Assembly language instructions match machine language instructions on a more or less one-for-one basis, but are written using character strings so that they are more easily understood. High-level language instructions are much closer to English and are structured so that they naturally correspond to the way programmers think. Ultimately, an assembly language or high-level language program must be converted into machine language by programs called translators: an assembler for assembly language, and a compiler or interpreter for a high-level language.

Compilers for high-level languages like C and C++ can translate high-level language into assembly code. The -S option of the GNU C and C++ compiler generates assembly code equivalent to the corresponding source program. Knowing how rudimentary constructs like loops, function calls and variable declarations are mapped into assembly language is one way to master C internals. To follow the material presented here, you should be familiar with computer architecture and Intel x86 assembly language.

2. Getting Started

To begin with, write a small program in C to print hello world and compile it with the -S option. The output is assembly code for the input file specified. By default, GCC makes the assembler file name by replacing the suffix `.c' with `.s'. Try to interpret the few lines at the end of the assembler file.

The 80386 and above family of processors have myriads of registers, instructions and addressing modes. A basic knowledge about only a few simple instructions is sufficient to understand the code generated by the GNU compiler.

Generally, any assembly language instruction includes a label, a mnemonic, and operands. An operand's notation is sufficient to decipher the operand's addressing mode. The mnemonics operate on the information contained in the operands. In fact, assembly language instructions operate on registers and memory locations. The 80386 family has general purpose registers (32 bit) called eax, ebx, ecx etc. Two registers, ebp and esp are used for manipulating the stack. A typical instruction, written in GNU Assembler (GAS) syntax, would look like this:

movl $10, %eax

This instruction stores the value 10 in the eax register. The prefix `%' to the register name and `$' to the immediate value are essential assembler syntax. It is to be noted that not all assemblers follow the same syntax.

Our first assembly language program, stored in a file named first.s is shown in Listing 1.

#Listing 1
.globl main
main:
  movl $20, %eax
  ret

This file can be assembled and linked to generate an a.out by giving the command cc first.s. The GNU compiler front end cc recognizes the `.s' extension as marking an assembly language file and invokes the assembler and linker, skipping the compilation phase.

The first line of the program is a comment. The .globl assembler directive serves to make the symbol main visible to the linker. This is vital as your program will be linked with the C startup library which will contain a call to main. The linker will complain about 'undefined reference to symbol main' if that line is omitted (try it). The program simply stores the value 20 in register eax and returns to the caller.

3. Arithmetic, Comparison, Looping

Our next program is Listing 2 which computes the factorial of a number stored in eax. The factorial is stored in ebx.

#Listing 2
.globl main
main: 
	movl $5, %eax
	movl $1, %ebx
L1:	cmpl $0, %eax		//compare 0 with value in eax
	je L2			//jump to L2 if 0==eax (je - jump if equal)
	imull %eax, %ebx	// ebx = ebx*eax
	decl %eax		//decrement eax
	jmp L1			// unconditional jump to L1
L2: 	ret

L1 and L2 are labels. When control flow reaches L2, ebx contains the factorial of the number originally stored in eax (eax itself has been counted down to zero).
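For comparison, the control flow of Listing 2 can be sketched in C. This is only an illustrative rendering, not compiler output: the function name factorial_like_listing2 and the variables eax and ebx are hypothetical stand-ins for the registers, the starting value is taken as a parameter instead of the constant 5, and goto is used so that each statement lines up with one instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of Listing 2; eax and ebx stand in for the registers. */
uint32_t factorial_like_listing2(uint32_t n)
{
    uint32_t eax = n;      /* movl $5, %eax (here the value is a parameter) */
    uint32_t ebx = 1;      /* movl $1, %ebx */

    assert(n <= 12);       /* 13! no longer fits in 32 bits */
L1:
    if (eax == 0)          /* cmpl $0, %eax ; je L2 */
        goto L2;
    ebx *= eax;            /* imull %eax, %ebx */
    eax--;                 /* decl %eax */
    goto L1;               /* jmp L1 */
L2:
    return ebx;            /* the listing leaves the result in ebx */
}
```

Calling the function with 5 follows the same path as the listing: the loop body runs five times and the product accumulates in ebx.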

4. Subroutines

When implementing complicated programs, we split the work into smaller tasks and write a subroutine or function for each one, to be called whenever required. Listing 3 illustrates subroutine call and return in assembly language programs.

#Listing 3
.globl main
main:
	movl $10, %eax
	call foo
	ret
foo:
	addl $5, %eax
	ret

The instruction call transfers control to subroutine foo. The ret instruction in foo transfers control back to the instruction after the call in main.

Generally, each function defines the scope of variables it uses in each call of the routine. To maintain the scopes of variables you need space. The stack can be used to maintain values of the variables in each call of the routine. It is important to know the basics of how the activation records can be maintained for repeated, recursive calls or any other possible calls in the execution of the program. Knowing how to manipulate registers like esp and ebp and making use of instructions like push and pop which operate on the stack are central to understanding the subroutine call and return mechanism.

5. Using The Stack

A section of your program's memory is reserved for use as a stack. The Intel 80386 and above microprocessors contain a register called the stack pointer, esp, which stores the address of the top of the stack. Figure 1 below shows three integer values, 49, 30 and 72, stored on the stack (each integer occupying four bytes), with the esp register holding the address of the top of the stack.

Figure 1

Unlike a pile of bricks, which grows upwards, the stack on Intel machines grows downwards, towards lower addresses. Figure 2 shows the stack layout after the execution of the instruction pushl $15.

Figure 2

The stack pointer register is decremented by four and the number 15 is stored as four bytes at locations 1988, 1989, 1990 and 1991.

The instruction popl %eax copies the value at top of stack (four bytes) to the eax register and increments esp by four. What if you do not want to copy the value at top of stack to any register? You just execute the instruction addl $4, %esp which simply increments the stack pointer.

In Listing 3, the instruction call foo pushes the address of the instruction after the call in the calling program on to the stack and branches to foo. The subroutine ends with ret which transfers control to the instruction whose address is taken from the top of stack. Obviously, the top of stack must contain a valid return address.
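The push and pop behaviour described in this section can be modelled with a small C sketch. Everything here is a toy: the array mem, the variable esp and the helper names push32, pop32 and discard32 are hypothetical, and the starting value 1992 is simply the address implied by the text (15 ends up at locations 1988 to 1991 after esp is decremented by four).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 2048        /* size of the toy memory, chosen arbitrarily */

uint8_t  mem[MEM_SIZE];      /* toy byte-addressable memory */
uint32_t esp = 1992;         /* assumed value of esp before the push in Figure 2 */

/* pushl: decrement esp by four, then store the value at the new top of stack. */
void push32(uint32_t value)
{
    assert(esp >= 4);
    esp -= 4;
    memcpy(&mem[esp], &value, sizeof value);
}

/* popl: load the value at the top of stack, then increment esp by four. */
uint32_t pop32(void)
{
    uint32_t value;

    assert(esp + 4 <= MEM_SIZE);
    memcpy(&value, &mem[esp], sizeof value);
    esp += 4;
    return value;
}

/* addl $4, %esp: discard the top of stack without copying it anywhere. */
void discard32(void)
{
    assert(esp + 4 <= MEM_SIZE);
    esp += 4;
}
```

In this model, call foo amounts to push32 of the return address followed by a jump, and ret amounts to a pop32 whose result becomes the next instruction address.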

6. Allocating Space for Local Variables

It is possible to have a C program manipulating hundreds and thousands of variables. The assembly code for the corresponding C program will give you an idea of how the variables are accommodated and how the registers are used for manipulating the variables without causing any conflicts in the final result that is to be obtained.

The registers are few in number and cannot be used for holding all the variables in a program. Local variables are allotted space within the stack. Listing 4 shows how it is done.

#Listing 4
.globl main
main:
	call foo
	ret
foo:
	pushl %ebp
	movl %esp, %ebp
	subl $4, %esp
	movl $10, -4(%ebp)
	movl %ebp, %esp
	popl %ebp
	ret

First, the value of the stack pointer is copied to ebp, the base pointer register. The base pointer is used as a fixed reference to access other locations on the stack. In the program, ebp may be used by the caller of foo also, and hence its value is copied to the stack before it is overwritten with the value of esp. The instruction subl $4, %esp creates enough space (four bytes) to hold an integer by decrementing the stack pointer. In the next line, the value 10 is copied to the four bytes whose address is obtained by subtracting four from the contents of ebp. The instruction movl %ebp, %esp restores the stack pointer to the value it had after executing the first line of foo and popl %ebp restores the base pointer register. The stack pointer now has the same value which it had before executing the first line of foo. The table below displays the contents of registers ebp, esp and stack locations from 3988 to 3999 at the point of entry into main and after the execution of every instruction in Listing 4 (except the return from main). We assume that ebp and esp have values 7000 and 4000 stored in them and stack locations 3988 to 3999 contain some arbitrary values 219986, 1265789 and 86 before the first instruction in main is executed. It is also assumed that the address of the instruction after call foo in main is 30000.

Table 1
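The same sequence can be replayed in a toy simulation. The sketch below is an assumption-laden model, not real machine behaviour: struct machine and the helpers store32, load32, push32 and pop32 are hypothetical names, memory is a plain byte array, and the starting values (ebp = 7000, esp = 4000, return address 30000) are the ones assumed in the text. The comments give the register values after each step.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy machine state; the layout and names are illustrative assumptions. */
struct machine {
    uint8_t  mem[8192];
    uint32_t esp, ebp, eip;
};

static void store32(struct machine *m, uint32_t addr, uint32_t v)
{
    assert(addr + 4 <= sizeof m->mem);
    memcpy(&m->mem[addr], &v, sizeof v);
}

static uint32_t load32(const struct machine *m, uint32_t addr)
{
    uint32_t v;

    assert(addr + 4 <= sizeof m->mem);
    memcpy(&v, &m->mem[addr], sizeof v);
    return v;
}

static void push32(struct machine *m, uint32_t v)
{
    m->esp -= 4;
    store32(m, m->esp, v);
}

static uint32_t pop32(struct machine *m)
{
    uint32_t v = load32(m, m->esp);

    m->esp += 4;
    return v;
}

/* Replays main's call to foo and the body of foo from Listing 4.
   Returns the value of the local variable while it was live. */
uint32_t run_listing4(struct machine *m)
{
    uint32_t local;

    m->ebp = 7000;                  /* starting values assumed in the text */
    m->esp = 4000;
    push32(m, 30000);               /* call foo: push return address (esp = 3996) */
    push32(m, m->ebp);              /* pushl %ebp          (esp = 3992) */
    m->ebp = m->esp;                /* movl %esp, %ebp     (ebp = 3992) */
    m->esp -= 4;                    /* subl $4, %esp       (esp = 3988) */
    store32(m, m->ebp - 4, 10);     /* movl $10, -4(%ebp)  (writes to 3988) */
    local = load32(m, m->ebp - 4);
    m->esp = m->ebp;                /* movl %ebp, %esp     (esp = 3992) */
    m->ebp = pop32(m);              /* popl %ebp           (ebp = 7000, esp = 3996) */
    m->eip = pop32(m);              /* ret                 (eip = 30000, esp = 4000) */
    return local;
}
```

After the function returns, esp and ebp are back at 4000 and 7000, which is the point of the prologue and epilogue: the caller sees its registers unchanged.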

7. Parameter Passing and Value Return

The stack can be used for passing parameters to functions. We will follow a convention (which is used by our C compiler) that the value stored by a function in the eax register is taken to be the return value of the function. The calling program passes a parameter to the callee by pushing its value on the stack. Listing 5 demonstrates this with a simple function called sqr.

#Listing 5
.globl main
main:
	movl $12, %ebx
	pushl %ebx
	call sqr
	addl $4, %esp       //adjust esp to its value before the push
	ret
sqr:
	movl 4(%esp), %eax
	imull %eax, %eax    //compute eax * eax, store result in eax 
	ret

Read the first line of sqr carefully. The calling function pushes the content of ebx on the stack and then executes a call instruction. The call will push the return address on the stack. So inside sqr, the parameter is accessible at an offset of four bytes from the top of stack.

8. Mixing C and Assembler

Listing 6 shows a C program and an assembly language function. The C function is defined in a file called main.c and the assembly language function in sqr.s. You compile and link the files together by typing cc main.c sqr.s.

The reverse is also pretty simple. Listing 7 demonstrates a C function print and its assembly language caller.

#Listing 6
//main.c
#include <stdio.h>

int sqr(int x);		//defined in sqr.s

int main(void)
{
	int i = sqr(11);
	printf("%d\n",i);
	return 0;
}

//sqr.s
.globl sqr
sqr:
	movl 4(%esp), %eax
	imull %eax, %eax
	ret

#Listing 7
//print.c
#include <stdio.h>

void print(int i)
{
	printf("%d\n",i);
}

//main.s
.globl main
main:
	movl $123, %eax
	pushl %eax
	call print
	addl $4, %esp
	ret

9. Assembler Output Generated by GNU C

I guess this much reading is sufficient for understanding the assembler output produced by gcc. Listing 8 shows the file add.s generated by gcc -S add.c. Note that add.s has been edited to remove many assembler directives (mostly for alignments and other things of that sort).

#Listing 8
//add.c
int add(int i,int j)
{
	int p = i + j;
	return p;
}

//add.s
.globl add
add:
	pushl %ebp
	movl %esp, %ebp
	subl $4, %esp		//create space for integer p
	movl 8(%ebp),%edx	//8(%ebp) refers to i
	addl 12(%ebp), %edx	//12(%ebp) refers to j
	movl %edx, -4(%ebp)	//-4(%ebp) refers to p
	movl -4(%ebp), %eax	//store return value in eax
	leave			//equivalent to movl %ebp, %esp; popl %ebp
	ret

The program makes sense once you see that the C statement add(10,20) gets translated into the following assembler code:

pushl $20
pushl $10
call add

Note that the second parameter is passed first.

10. Global Variables

Space is created for local variables on the stack by decrementing the stack pointer and the allotted space is reclaimed by simply incrementing the stack pointer. So what is the equivalent GNU C generated code for global variables? Listing 9 provides the answer.

#Listing 9
//glob.c
int foo = 10;
main()
{
	int p = foo;
}

//glob.s
.globl foo
foo:
	.long 10
.globl main
main:
	pushl %ebp
	movl %esp,%ebp
	subl $4,%esp
	movl foo,%eax
	movl %eax,-4(%ebp)
	leave
	ret

The statement foo: .long 10 defines a block of 4 bytes named foo and initializes the block with the value 10. The .globl foo directive makes foo accessible from other files. Now try this out. Change the statement int foo = 10 to static int foo = 10 and see how it is represented in the assembly code. You will notice that the assembler directive .globl is missing. Try this out for different types and qualifiers (double, long, short, const etc.) as well.
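A small file for trying the experiment is sketched below. The names calls, read_foo and call_count are hypothetical and not part of the original listings. When the file is compiled with the -S option, the expectation is a .globl directive for foo and none for calls; the exact directives vary with compiler version and options.

```c
#include <assert.h>

int foo = 10;            /* external linkage: visible to other files */
static int calls = 0;    /* internal linkage: private to this file */

/* Returns foo and counts the call; modifying calls gives the
   compiler a reason to keep the variable in its output. */
int read_foo(void)
{
    calls++;
    return foo;
}

/* Reports how many times read_foo has been called. */
int call_count(void)
{
    assert(calls >= 0);  /* sanity check: the counter only ever increases */
    return calls;
}
```

The two functions have external linkage as well, so they too should appear with .globl directives in the generated file.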

11. System Calls

Unless a program is just implementing some math algorithms in assembly, it will deal with such things as getting input, producing output, and exiting. For this it will need to call on OS services. In fact, programming in assembly language is quite the same in different OSes, unless OS services are touched.

There are two common ways of performing a system call in Linux: through the C library (libc) wrapper, or directly.

Libc wrappers are made to protect programs from possible system call convention changes, and to provide POSIX compatible interface if the kernel lacks it for some call. However, the UNIX kernel is usually more-or-less POSIX compliant: this means that the syntax of most libc "system calls" exactly matches the syntax of real kernel system calls (and vice versa). But the main drawback of throwing libc away is that one loses several functions that are not just syscall wrappers, like printf(), malloc() and similar.

System calls in Linux are done through int 0x80. Linux differs from the usual Unix calling convention, and features a "fastcall" convention for system calls. The system function number is passed in eax, and arguments are passed through registers, not the stack. There can be up to six arguments, in ebx, ecx, edx, esi, edi and ebp respectively. If there are more arguments, they are passed through a structure whose address is given as the first argument. The result is returned in eax, and the stack is not touched at all.
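The difference between a libc wrapper and a direct call can be seen from C without writing any assembly. The sketch below assumes a Linux system with glibc: it uses glibc's generic syscall() function and the SYS_getpid constant from <sys/syscall.h>, neither of which appears in the original text, and the function name direct_getpid is hypothetical. Note that syscall() uses whatever kernel entry mechanism the platform's libc chooses, which is not necessarily int 0x80.

```c
#define _GNU_SOURCE          /* exposes the syscall() declaration in glibc */
#include <assert.h>
#include <sys/syscall.h>     /* SYS_getpid */
#include <sys/types.h>       /* pid_t */
#include <unistd.h>          /* syscall(), getpid() */

/* Asks the kernel for the process ID by system call number,
   bypassing the dedicated getpid() wrapper. */
pid_t direct_getpid(void)
{
    long result = syscall(SYS_getpid);

    assert(result > 0);      /* getpid has no failure case */
    return (pid_t)result;
}
```

Comparing the return value with that of getpid() shows that the wrapper and the direct call reach the same kernel service.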

Consider Listing 10 given below.

#Listing 10
//fork.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
	fork();
	printf("Hello\n");
	return 0;
}

Compile this program with the command cc -g fork.c -static. Then load the resulting executable into gdb and type the command disassemble fork to see the assembly code used for fork in the program. The -static option tells GCC to link statically (see the man page). You can repeat this for other system calls and see how the actual functions work.

There have been several attempts to write up-to-date documentation of the Linux system calls, and I am not going to make this another of them.

11. Inline Assembly Programming

GNU C supports the x86 architecture quite well, and includes the ability to insert assembly code within C programs, such that register allocation can be either specified or left to GCC. Of course, the assembly instructions are architecture dependent.

The asm statement allows you to insert assembly instructions into your C or C++ programs. For example, the statement:

asm ("fsin" : "=t" (answer) : "0" (angle));

is an x86-specific way of coding this C statement:

answer = sin(angle);
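The same operand notation works for integer instructions. Here is a minimal sketch, with a made-up function name add_with_asm, that adds two integers with addl. The asm path is compiled only on x86; elsewhere the sketch falls back to plain C so that it still builds.

```c
/* Hypothetical helper: returns a + b. */
static int add_with_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
	int sum;
	/* %0 is the output sum; "0" puts a in the same register as %0; %2 is b.
	   __asm__ is the spelling of asm that also works in strict -std= modes. */
	__asm__ ("addl %2, %0" : "=r" (sum) : "0" (a), "r" (b) : "cc");
	return sum;
#else
	return a + b;   /* fallback for non-x86 targets (assumption of this sketch) */
#endif
}
```

Here GCC chooses the registers itself; the constraints only say that each operand must live in some general-purpose register.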

Notice that, unlike ordinary assembly code instructions, asm statements permit you to specify input and output operands using C syntax. Asm statements should not be used indiscriminately. So, when should we use them?

#Listing 11
#Name : bit-pos-loop.c 
#Description : Find bit position using a loop

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
	long max = atoi (argv[1]);
	long number;
	long i;
	unsigned position;
	volatile unsigned result;

	for (number = 1; number <= max; ++number) {
		for (i=(number>>1), position=0; i!=0; ++position)
			i >>= 1;
		result = position;
	}
	return 0;
}

#Listing 12
#Name : bit-pos-asm.c
#Description : Find bit position using bsrl

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
	long max = atoi(argv[1]);
	long number;
	unsigned position;
	volatile unsigned result;

	for (number = 1; number <= max; ++number) {
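		/* The cast keeps the operand 32 bits wide, as bsrl requires, even where long is 64 bits. */
		asm("bsrl %1, %0" : "=r" (position) : "r" ((unsigned) number));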
		result = position;
	}
	return 0;
}

Compile the two versions with full optimizations as given below:

$ cc -O2 -o bit-pos-loop bit-pos-loop.c
$ cc -O2 -o bit-pos-asm bit-pos-asm.c

Measure the running time for each version by using the time command and specifying a large value as the command-line argument to make sure that each version takes at least a few seconds to run.

$ time ./bit-pos-loop 250000000

and

$ time ./bit-pos-asm 250000000

The results will vary from machine to machine. However, you will notice that the version that uses the inline assembly executes a great deal faster.
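Before trusting the timings, it is worth confirming that the two approaches compute the same value. The sketch below wraps each in a function; the names bit_pos_loop and bit_pos_fast are invented here, and the non-x86 branch uses GCC's __builtin_clz as an assumed substitute for bsrl. Both fast paths are undefined for an input of zero.

```c
#include <limits.h>   /* CHAR_BIT */

/* Position of the most significant set bit, found by shifting (as in Listing 11). */
static unsigned bit_pos_loop(unsigned number)
{
	unsigned position = 0;

	for (number >>= 1; number != 0; number >>= 1)
		++position;
	return position;
}

/* Same result from a single instruction (as in Listing 12); number must be nonzero. */
static unsigned bit_pos_fast(unsigned number)
{
#if defined(__i386__) || defined(__x86_64__)
	unsigned position;

	__asm__ ("bsrl %1, %0" : "=r" (position) : "r" (number) : "cc");
	return position;
#else
	/* Fallback for other targets (assumption of this sketch): count leading zeros. */
	return (unsigned) (sizeof number * CHAR_BIT - 1) - (unsigned) __builtin_clz(number);
#endif
}
```

A loop in main() that compares the two functions for every number from 1 up to some limit makes a quick sanity check.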

GCC's optimizer attempts to rearrange and rewrite a program's code to minimize execution time, even in the presence of asm expressions. If the optimizer determines that an asm's output values are not used, the instruction will be omitted unless the keyword volatile occurs between asm and its arguments. (As a special case, GCC will not move an asm without any output operands outside a loop.) Any asm can be moved in ways that are difficult to predict, even across jumps. The only way to guarantee a particular assembly instruction ordering is to include all the instructions in the same asm.

Using asm statements can restrict the optimizer's effectiveness because the compiler does not know their semantics. GCC is forced to make conservative guesses that may prevent some optimizations.
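The ordering rule above can be sketched as follows. A hypothetical helper, twice_sum, keeps three dependent instructions inside a single asm so that the compiler cannot separate or reorder them, and volatile keeps the statement from being dropped. The asm path is compiled only on x86; the plain C fallback is an assumption of this sketch.

```c
/* Hypothetical helper: returns (a + b) * 2. */
static int twice_sum(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
	int result;

	__asm__ volatile (
		"movl %1, %0\n\t"   /* result = a */
		"addl %2, %0\n\t"   /* result += b */
		"addl %0, %0"       /* result += result */
		: "=&r" (result)    /* & marks an early clobber: written before b is read */
		: "r" (a), "r" (b)
		: "cc");
	return result;
#else
	return (a + b) * 2;     /* fallback for non-x86 targets */
#endif
}
```

Without the early-clobber mark, GCC would be free to place result in the same register as b, and the first movl would then overwrite b before it is used.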

12. Exercises

  1. Interpret the assembly code for the C program in Listing 6. Modify the program to eliminate the warnings reported when it is compiled with the -Wall option. Compare the two versions of the assembly code. What changes do you observe?
  2. Compile several small C programs with and without optimization options (like -O2). Read the resulting assembly code and identify some common optimization tricks used by the compiler.
  3. Interpret the assembly code for a switch statement.
  4. Compile several small C programs with inline asm statements. What differences do you observe in the assembly code for such programs?
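  5. A nested function is defined inside another function (the "enclosing function"), such that the nested function can access the enclosing function's variables and is itself local to the enclosing function.

    Nested functions can be useful because they help control the visibility of a function.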

    Consider Listing 13 given below:

    #Listing 13
    /* myprint.c */
    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
    	int i;
    	void my_print(int k)
    	{
    		printf("%d\n",k);
    	}
    	scanf("%d",&i);
    	my_print(i);
    	return 0;
    }

    Compile this program with cc -S myprint.c and interpret the assembly code. Also try compiling the program with the command cc -pedantic myprint.c. What do you observe?

 

I have just completed my final year B.Tech examinations in Computer Science and Engineering, and I am a native of Kerala, India.


Copyright © 2003, Hiran Ramankutty. Copying license http://www.linuxgazette.com/copying.html
Published in Issue 94 of Linux Gazette, September 2003