Neeraj Gupta: April 2010

Friday, April 30, 2010

Song

Thursday, April 22, 2010

How TCP/IP selects a source IP address

How TCP/IP selects a source IP address

TCP/IP uses the following sequence to select the source IP address for an outbound packet. For detailed information about the TCP/IP profile (PROFILE.TCPIP) and configuration statements, see z/OS Communications Server: IP Configuration Reference.

Sendmsg that specifies the source address in the ancillary IPV6_PKTINFO data
If this is a UDP or RAW socket, and IPV6_PKTINFO ancillary data is specified on sendmsg() with a nonzero source IP address, this address is used as the source IP address.

Setsockopt IPV6_PKTINFO
If this is a UDP or RAW socket, and the IPV6_PKTINFO socket option is set and it contains a nonzero source IP address, this address is used as the source IP address.

Explicit bind to a specific local address
If the socket is already bound to a specific local IP address other than INADDR_ANY or |in6addr_any, TCP/IP uses this specific local IP address.

PORT profile statement with the BIND parameter
If the socket is bound to a port and to the INADDR_ANY or |in6addr_any IP address, and there is a corresponding PORT profile statement with the BIND parameter specified, TCP/IP uses the address specified by the BIND parameter.

SRCIP profile statement
If this is a TCP socket and either the socket is not yet bound or the socket is bound to the INADDR_ANY or |in6addr_any IP address, TCP/IP checks the job name and destination IP address against the SRCIP entries in the following order:

JOBNAME entries, other than JOBNAME *
DESTINATION entries
JOBNAME * entries
If a match is found, TCP/IP uses the designated source in the |most specific matching entry to provide the source IP address to be used.

TCPSTACKSOURCEVIPA parameter on the IPCONFIG or IPCONFIG6 profile statement
All of the following conditions must be met:

This is a TCP socket.
SOURCEVIPA is enabled on IPCONFIG or IPCONFIG6.
SOURCEVIPA is not disabled for the socket.
The application has not issued a specific bind for this socket, even to the INADDR_ANY or |in6addr_any IP address.
The address specified on the TCPSTACKSOURCEVIPA parameter is a static VIPA or active dynamic VIPA.
If these conditions are met and this is an IPv4 packet, TCP/IP uses the address specified on the TCPSTACKSOURCEVIPA parameter.

If these conditions are met and this is an IPv6 packet, TCP/IP uses the default source address selection algorithm to select one of the addresses configured for the VIPA interface referenced by the TCPSTACKSOURCEVIPA parameter. For information about the default source address selection algorithm, see z/OS Communications Server: IPv6 Network and Application Design Guide.

Guideline: Because the SRCIP profile statement provides all of the functionality of the TCPSTACKSOURCEVIPA parameter and additional granularity, consider using the SRCIP statement instead of specifying the TCPSTACKSOURCEVIPA parameter. Specifying JOBNAME * in a SRCIP profile statement provides the same result as specifying the TCPSTACKSOURCEVIPA parameter for implicit bind scenarios, and also applies to applications that issue a bind to the INADDR_ANY or |in6addr_any IP address.

SOURCEVIPA: static VIPA address from the HOME list (IPv4) or from the SOURCEVIPAINTERFACE parameter (IPv6)
All of the following conditions must be met:

SOURCEVIPA is enabled on IPCONFIG or IPCONFIG6.
SOURCEVIPA is not disabled for the socket.
Either the socket is not yet bound, or the socket is bound to the INADDR_ANY or |in6addr_any IP address.
If these conditions are met, TCP/IP determines the interface over which the initial packet will be sent.

For an IPv4 packet, TCP/IP does the following:
Locates that interface in the HOME list.
Searches backward in the HOME list for a static VIPA.
If a static VIPA is found in the HOME list, TCP/IP uses the first static VIPA found as the source IP address.
For an IPv6 packet, TCP/IP does the following:
Determines whether a SOURCEVIPAINTERFACE parameter was specified for the selected interface.
If so, uses the default source address selection algorithm to select one of the addresses configured for the VIPA interface that is referenced by the SOURCEVIPAINTERFACE parameter. For information about the default source address selection algorithm, see z/OS Communications Server: IPv6 Network and Application Design Guide.
HOME IP address of the link over which the packet is sent
For an IPv4 packet, TCP/IP uses the HOME IP address of the link over which the initial packet is sent.

For an IPv6 packet, TCP/IP uses the default source address selection algorithm to select one of the addresses configured for the interface over which the initial packet is sent. For information about the default source address selection algorithm, see z/OS Communications Server: IPv6 Network and Application Design Guide.

Wednesday, April 21, 2010

Dalhousie

Hotel :
http://www.grandviewdalhousie.in/gallery.html
http://www.hotelmountview.com/default.asp

MSS (http://tools.ietf.org/html/rfc879)

The TCP Maximum Segment Size Option

TCP provides an option that may be used at the time a connection is
established (only) to indicate the maximum size TCP segment that can
be accepted on that connection. This Maximum Segment Size (MSS)
announcement (often mistakenly called a negotiation) is sent from the
data receiver to the data sender and says "I can accept TCP segments
up to size X". The size (X) may be larger or smaller than the
default. The MSS can be used completely independently in each
direction of data flow. The result may be quite different maximum
sizes in the two directions.

The MSS counts only data octets in the segment, it does not count the
TCP header or the IP header.

A footnote: The MSS value counts only data octets, thus it does not
count the TCP SYN and FIN control bits even though SYN and FIN do
consume TCP sequence numbers.

I need to point out that the name “maximum segment size” is in fact misleading. The value actually refers to the maximum amount of data that a segment can hold—it does not include the TCP headers. So if the MSS is 100, the actual maximum segment size could be 120 (for a regular TCP header) or larger (if the segment includes TCP options).

Maximum Segment Size Selection

The selection of the MSS is based on the need to balance various competing performance and implementation issues in the transmission of data on TCP/IP networks. The main TCP standard, RFC 793, doesn't discuss MSS very much, which opened the potential for confusion on how the parameter should be used. RFC 879 was published a couple of years after the TCP standard to clarify this parameter and the issues surrounding it. Some issues with MSS are fairly mundane; for example, certain devices are limited in the amount of space they have for buffers to hold TCP segments, and therefore may wish to limit segment size to a relatively small value. In general, though, the MSS must be chosen by balancing two competing performance issues:

Overhead Management:

The TCP header takes up 20 bytes of data (or more if options are used); the IP header also uses 20 or more bytes. This means that between them a minimum of 40 bytes are needed for headers, all of which is non-data “overhead”. If we set the MSS too low, this results in very inefficient use of bandwidth. For example, suppose we set it to 40; if we did, a maximum of 50% of each segment could actually be data; the rest would just be headers. Many segment datagrams would be even worse in terms of efficiency.

IP Fragmentation: TCP segments will be packaged into IP datagrams. As we saw in the section on IP, datagrams have their own size limit issues: the matter of the maximum transmission unit (MTU) of an underlying network. If a TCP segment is too large, it will lead to an IP datagram is too large to be sent without fragmentation. Fragmentation reduces efficiency and increases the chances of part of a TCP segment being lost, resulting in the entire segment needing to be retransmitted.

Note: The exchange of MSS values during setup is sometimes called MSS negotiation. This is actually a misleading term, because it implies that the two devices must agree on a common MSS value, which is not the case. The MSS value used by each may be different, and there is in fact no negotiation at all.

Maximum segment size
From Wikipedia, the free encyclopedia
The maximum segment size (MSS) is the largest amount of data, specified in bytes, that a computer or communications device can handle in a single, unfragmented piece. For optimum communications, the number of bytes in the data segment and the headers must not add up to more than the number of bytes in the maximum transmission unit (MTU).
The MSS is an important consideration in Internet connections. As data is routed over the Internet, it must pass through multiple gateway routers. Ideally, each TCP segment can pass through every router without being fragmented. If the data segment size is too large for any of the routers through which the data passes, the oversized segments are fragmented. This slows down the connection speed as seen by the computer user, in some cases dramatically. The likelihood of such fragmentation can be minimized by keeping the MSS as small as reasonably possible. For most computer users, the MSS is set automatically by the operating system.

The Relationship between IP Datagram and TCP Segment Sizes

The relationship between the value of the maximum IP datagram size
and the maximum TCP segment size is obscure. The problem is that
both the IP header and the TCP header may vary in length. The TCP
Maximum Segment Size option (MSS) is defined to specify the maximum
number of data octets in a TCP segment exclusive of TCP (or IP)
header.

To notify the data sender of the largest TCP segment it is possible
to receive the calculation of the MSS value to send is:

MSS = MTU - sizeof(TCPHDR) - sizeof(IPHDR)

On receipt of the MSS option the calculation of the size of segment
that can be sent is:

SndMaxSegSiz = MIN((MTU - sizeof(TCPHDR) - sizeof(IPHDR)), MSS)

Friday, April 16, 2010

Mutex / semaphore

why do we need mutex if we have semaphore (binary semaphore)

Mutexes can be applied only to threads in a single process and do not work between processes as do semaphores.

Provide a level of safety for mutual exclusion, not possible with counting or binary semaphores

by Ralph Moore
smx Architect

Introduction

In the following sections, we will discuss the problems caused by using a counting semaphore or a binary semaphore for protection, and show how the mutual exclusion semaphore ("mutex") solves them.

Protection Failure

A typical counting semaphore has an internal counter, which can be counted higher than one by signals. This is useful when counting events or regulating access to multiple resources. However, it can backfire when using a counting semaphore for access protection for a single, non-reentrant resource. If for some reason, an extra signal (spurious signal) is sent to the counting semaphore, then it will allow two tasks to access such a resource, thus failing to do its job.

Both a binary semaphore and a mutex have only two states: free and owned. When in the free state, spurious signals are ignored. Hence, safety is improved, because two tasks will not accidentally be given access to a protected resource just because a spurious signal occurred.

Task Lockup

Consider the following code example, which uses a counting or binary semaphore:

functionA()
{
SemWait(semA, INF);
functionX();
SemSignal(semA);
}

FunctionX()
{
SemWait(semA, INF);
...
SemSignal(semA);
}
In this example, both functionA and functionX need access to a resource that is protected by semA. functionA, while accessing this resource, calls functionX. When this happens the current task (which is running functionA) is suspended indefinitely [1] on semA because functionX tests semA a second time. This happens because semA does not know that the current task already owns it.

For this, a mutex has an advantage over a counting or binary semaphore. When owned, the mutex knows which task owns it. A second test by the same task will not fail and the task will not be suspended. Hence, the above code, which could easily occur when using a thread-safe library, will not cause the current task to freeze.

Another kernel provides a resource semaphore, which is the same as a mutex, except that it gives an error if a task tests it a second time. This is intended to catch unmatched tests vs. signals. However, this is not the more likely problem - it is natural to put matched tests and signals into a routine. Non-matches can be found simply by searching for SemSignal and SemTest for the same semaphore - the two counts should be equal. The more difficult problem to detect is the one described above because the user of a library may not be aware that one function calls another and that both test the same mutex.

Premature Release

We are still not out of the woods with respect to the above example. It is not acceptable for semA to be released by the signal in functionX, because functionA, which called functionX, is still in its critical section. For every test, there must be a matching signal, before the mutex can become free. This is assured by having an internal nesting counter in the mutex. This counter is incremented by every test from the owner task and decremented by every signal from the owner task. Only tests and signals from the owner task can change the nesting counter. When it reaches zero, the mutex is freed.

Unbounded Priority Inversion

The next problem with using counting or binary semaphores for controlling access to critical resources is called unbounded priority inversion. Priority inversion occurs when a low-priority task owns a semaphore, and a high-priority task is forced to wait on the semaphore until the low-priority task releases it. If, prior to releasing the semaphore, the low priority task is preempted by one or more mid-priority tasks, then unbounded priority inversion has occurred because the delay of the high-priority task is no longer predictable. This defeats Deadline Monotonic Analysis (DMA) because it is not possible to predict if the high-priority task will meet its deadline.

Sharing a critical resource between high and low priority tasks is not a desirable design practice. It is better to share a resource only among equal priority tasks or to limit resource accesses to a single resource server task. Examples are a print server task and a file server task. We have long advocated this practice. However, with the layering of increasingly diverse and complicated middleware onto RTOSs, it is becoming impractical to enforce such simple strategies. Hence, in the interest of safety, it is best to implement some method of preventing unbounded priority inversion.

Task Priorities

Dealing with priority inversion is not a simple matter. In the following sections, we will discuss the two principal means for implementing priority promotion, followed by the approach chosen for smx, and then we will discuss the equally difficult problem of priority demotion.

Priority Inheritance

The most common approach is priority inheritance. Since the mutex knows its current owner, it is possible to promote the priority of the owner whenever a higher-priority task starts waiting on the mutex. The current owner temporarily assumes the priority of the highest priority task waiting on the mutex. This allows the owner to resume running if it was preempted by a mid-priority task, and to continue running should a mid-priority task become ready to run. When the owner releases the mutex, it drops back to its original priority.

Priority inheritance is used by most RTOSs that offer protection against unbounded priority inversion.

Problems with Priority Inheritance

Unfortunately, priority inheritance has some problems:

Since promotion of low-priority tasks to higher priority levels is caused by random sequences of events, it is not possible to predict how long mid-priority tasks will be blocked by lower-priority promoted tasks. Hence, Deadline Monotonic Analysis cannot guarantee that mid-priority tasks will meet their deadlines.
Priority inheritance can cause extra task switching. For example, if L, M, and H are three tasks of priorities low, medium, and high, respectively, and L(x) is task L promoted to priority x, then the following task switches could occur: L —> M —> L(M) —> H —> L(H) —> H —> M —> L = 7 task switches. More mid-level priorities could result in even more task switches. If L were boosted to H when it first got the mutex, then only 4 task switches would occur: L —> L(H) —> H —> M —> L.
If a promoted task is waiting for another mutex, then that task's owner must have its priority promoted. This is called priority propagation and it must be implemented for priority inheritance to be effective, if tasks can own multiple mutexes simultaneously. However it increases complexity and reduces determinacy. Thus many RTOSs do not implement it.
The extra task switching shown in #2 can become even worse if an even higher-priority task preempts the low-priority task and waits for the mutex. Then the low-priority task's priority must be raised again and possibly propagated again.
As is clear from the foregoing, complex systems consisting of many mutexes and many tasks that can own several mutexes at once, can experience significant reductions in determinacy. This is not wrong, but it should be taken into account.
Priority inheritance increases the likelihood of deadlocks, because priorities change unpredictably.
Priority Ceiling Promotion

An alternative to priority inheritance is priority ceiling promotion. With it, a priority ceiling is specified for a mutex, when it is created. The ceiling is set equal to the priority of the highest-priority task that is expected to get the mutex. When a lower-priority task obtains the mutex, its priority is immediately boosted to the mutex's ceiling. Hence, a mid-priority task cannot preempt as long as the low-priority task owns the mutex, nor can any task preempt that wants the mutex. Interestingly, priority ceiling is a simply an automatic method for forcing tasks to be of the same priority while using a resource - i.e. it enforces good design practice.

Priority ceiling permits Deadline Monotonic Analysis because any task may be blocked only once by a lower priority task, before it can run. Task switching is also reduced. The following task switches would occur for the previous example: L —> L(H) —> H —> M —> L = 4 task switches vs. 7, maximum, for priority inheritance.

If all mutexes, in a group, have the same ceiling, then once a task gains access to one mutex it has access to all mutexes in the group. (This is because: 1. All other tasks wanting these mutexes are blocked from running, and 2. This task could not be running if any other task already owned one of the mutexes.) This reduces task switching and improves determinacy.

Problems with Priority Ceiling Promotion

Unfortunately, priority ceiling also has some problems:

It causes a low-priority task to be elevated to higher priority whenever it owns a resource shared by a higher- priority task. If the higher-priority task seldom uses the resource and the low-priority task frequently uses the resource, this needlessly blocks mid-priority tasks from running, when they are ready. In this situation, priority inheritance is more attractive.
A mutex ceiling is static. It is possible that a user task may have had its priority increased by the programmer, or increased dynamically, to a value above the mutex's ceiling. Then priority ceiling promotion is not working fully.
A Combined Solution

To achieve the best results, smx mutexes implement a combination of both methods. This is done as follows: A mutex is created with a ceiling, ceil, and priority inheritance enabled or disabled by a flag in the mutex. If the priority of a new owner is less than ceil, its priority is immediately promoted to ceil. If, another task waits on the mutex, its priority exceeds ceil, and priority inheritance is enabled, then the owner's priority is promoted to that of the new waiting task.

This approach permits the main advantages of ceiling priority for most mutexes, yet allows priority inheritance to be used when appropriate for best performance or protection. If the mutex ceiling is set to the priority of the highest task that can possibly get it, then inheritance is effectively disabled and the mutex operates purely in ceiling mode. Alternatively, ceiling mode may be disabled by setting the ceiling to zero, when the mutex is created. Then the mutex will operate purely in inheritance mode, if inheritance is enabled. If ceiling and inheritance are not enabled, the mutex will operate without priority promotion.

In the event that a waiting task is promoted, by some other means, to a priority above the mutex ceil, priority promotion will still occur for the task due to priority inheritance, if enabled. Hence, safety is improved by the judicious combination of ceiling and inheritance priority promotion.

Staggered Priority Demotion

Priority demotion, when a task releases a mutex, is a non-trivial operation. Ideally the task's priority should not be higher than necessary, else high priority tasks may be needlessly blocked. But, each mutex owned by the task may require a different promotion level and mutexes can be released in any order. If the task's priority is demoted too far too soon, the protection of priority promotion will be lost for one or more of the remaining mutexes.

Further complicating demotion, smx supports two other mechanisms for dynamic priority change: bump_task() and message reception from a pass exchange. Either of these can occur while a task owns one or more mutexes.

In order to deal with these problems, each task maintains a list of the mutexes that it owns. When releasing a mutex, the task priority is demoted to the greatest of: (1) the highest ceiling among remaining owned mutexes, (2) the highest waiting task priority among remaining owned mutexes with inheritance enabled, or (3) the task's normal priority, normpri. The latter stores the task's original priority, before any mutexes were acquired, as modified, subsequently, by bump_task() or message reception from a pass exchange.

Staggered priority demotion will be implemented in the next release of smx. Currently, a task's priority is maintained at its highest level until all mutexes have been released. Then it is lowered to the task's normpri.

Tuesday, April 13, 2010

free online audio books

One of my favorite things to do is curl up with a good book (or even better a great book!). However, with a long commute and a couple of young, energetic boys (ages 2 & 3) it’s rare that I get time to kick back and read. That’s why I love audio books so much – they allow me to enjoy books while I’m on the go.

I typically listen to 3-4 audio books per month. Two of these audio books I get because I’m a member of Audible (the world’s largest provider of digital audio books for download). In the past I also downloaded the other 1-2 audio books from Audible, but eventually I got tired of paying extra for these and decided to see what free audio books were available online for download. I was pleasantly surprised by what I found.

1. Public Domain Audio Books
The first category of audio books available for free download are those in the public domain. This means that no one holds a copy right on these books and therefore anyone is free to distribute them. The following sites are the best places to download free online audio books in the public domain:

LibriVox has a huge selection of audio books available for download. According to the site, their goal is to make all public domain books available as free audio books. I don’t, however, find the site particularly user-friendly which is why I like the following site.

Librophile provides a clean, simple interface for browsing audio books available from LibriVox.

Project Gutenberg has a decent selection of both human-read and computer-generated audio books (I suggest you don’t waste your time with the latter).

Learn Out Loud offers more than 2,000 free audio and video titles. In addition to audio books, Learn Out Loud has lectures, speeches, sermons and interviews available for download. The site is very user friendly, allowing you to both filter results and sort by most popular, member rating, etc.

Free Classic Audio Books offers a small selection of classic audio books.

It’s worth noting that public domain audio books are typically read by volunteers. For this reason the quality of the narration and production can vary. That’s not to say there aren’t good ones out there, rather you just need to be prepared to look a little harder.

2. Audio Books Shared Under a Creative Commons License
PodioBooks offers serialized audio books distributed via RSS, much like a podcast. There are over 300 audio books available which are spread through a wide variety of genres including chick lit, fantasy, humor, magical realism and thrillers. Unlike the above free audio book websites which offer recordings of public domain books, audio books available from Podiobooks are recently written works that the authors are freely sharing under a Creative Commons license. The audio books are typically recorded by the author so, like audio books distributed in public domain, the quality can vary.

3. Audio Books from Your Local Library
The third way to download audio books for free is via your local library. Many libraries have an agreement with either NetLibrary or OverDrive. These companies provide the infrastructure for libraries to distribute digital content (ie not just audio books, but also eBooks, music, and video). The best way to look further into this is to visit the website of your local library. Another way is to use OverDrive’s Library Search Engine that allows you to search for a particular audio book or library. I couldn’t find an equivalent service offered by NetLibrary – their site is geared towards libraries and publishers.

The great thing about this way is that you can find the latest best sellers and the audio books are usually professionally narrated and produced. Often the audio book you want won’t be available for immediate download, but all you need to do is place a hold on the audio book and you will receive an email when it’s available. You can then download the title and listen to it on your computer, or transfer it onto a portable listening device (eg iPod). Normally you can loan the audio book for a period of 14 days. At the end of this loan period, titles will expire and be automatically ‘returned’ to the library (this means you will never accrue late fees with titles!).

* * *

Do have any questions about free audio books? Or would you like some audio book suggestions? Let me know in the comments below.

Peter

Neeraj Gupta