Thursday, 7 July 2016

Linux Memory management

Introduction

This article concentrates on some useful commands that come in handy while analyzing memory related performance issues on Linux.

The article is inspired from this video. I've listed down all the commands mentioned in that video. The video is pretty long and this article is an attempt to create a gist of items mentioned there.

Commands:-


1. vmstat:-

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 4698476 172856 1552720    0    0     3    12   82   79  2  1 98  0  0


2. dstat --top-oom:- This command helps us to find out which process would be terminated first in case of high memory consumption. In such a case, the kernel terminates the highest memory consumer. This command helps us find that out.


3. getconf PAGE_SIZE:- This returns the default page size on the machine. On my Ubuntu 14.04 64 bit, I see this as follows:-

mukund@mukund-desktop:~$ getconf PAGE_SIZE
4096

4. pmap:- This returns the address mapping of a process. Here, in the output, we can see the virtual address and the location of the actual file. The 'anon' are heap and stack memory.

mukund@mukund-desktop:~$ ps -ae | grep term
 2205 ?        00:00:20 gnome-terminal
mukund@mukund-desktop:~$ pmap 2205
2205:   gnome-terminal
0000000000400000    284K r-x-- gnome-terminal
0000000000646000      4K r---- gnome-terminal
0000000000647000     12K rw--- gnome-terminal
0000000000b93000   8400K rw---   [ anon ]
00007f309c000000    136K rw---   [ anon ]
00007f309c022000  65400K -----   [ anon ]
00007f30a4000000    132K rw---   [ anon ]
00007f30a4021000  65404K -----   [ anon ]
00007f30a958d000      4K -----   [ anon ]
00007f30a958e000   8192K rw---   [ anon ]
00007f30a9d8e000  35272K r---- icon-theme.cache
00007f30ac000000    136K rw---   [ anon ]
00007f30ac022000  65400K -----   [ anon ]
00007f30b0000000    324K rw---   [ anon ]
00007f30b0051000  65212K -----   [ anon ]
00007f30b4113000      4K -----   [ anon ]
00007f30b4114000   8192K rw---   [ anon ]
00007f30b4914000    356K r-x-- libibus-1.0.so.5.0.505
00007f30b496d000   2048K ----- libibus-1.0.so.5.0.505
00007f30b4b6d000      8K r---- libibus-1.0.so.5.0.505
00007f30b4b6f000      4K rw--- libibus-1.0.so.5.0.505
00007f30b4b70000      4K rw---   [ anon ]
00007f30b4b71000     24K r-x-- im-ibus.so
00007f30b4b77000   2048K ----- im-ibus.so
00007f30b4d77000      4K r---- im-ibus.so
00007f30b4d78000      4K rw--- im-ibus.so
00007f30b4d79000    132K r-x-- liblzma.so.5.0.0
00007f30b4d9a000   2044K ----- liblzma.so.5.0.0
00007f30b4f99000      4K r---- liblzma.so.5.0.0
00007f30b4f9a000      4K rw--- liblzma.so.5.0.0
00007f30b4f9b000   1392K r-x-- libxml2.so.2.9.1
00007f30b50f7000   2048K ----- libxml2.so.2.9.1
...
 total           662024K

5. top:- This command can prove really useful at times. Here, 'VIRT' shows the amount of virtual memory being used and 'RES' shows the amount of memory present in RAM. There's a catch with the 'RES' memory. There are some libraries, like libC that are used by several processes. There's a single copy of this lib in physical memory, which is mapped into virtual memory of all processes. However, top counts this per process. As a result, the sum of all 'RES' memory is generally greater than the actual physical memory.


PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
2003 mukund    20   0 2210808 1.009g  89752 S  42.9 13.0  88:17.77 firefox

1151 root      20   0  359768  68188  32756 S   5.7  0.8  18:06.65 Xorg
1887 mukund    20   0 1630664 182452  75164 S   3.0  2.2  10:08.64 compiz
1305 mukund    20   0  379444  11852   7008 S   1.0  0.1   0:40.29 ibus-daemon
1678 mukund     9 -11  513332  12108   9728 S   1.0  0.1   1:04.37 pulseaudio
2205 mukund    20   0  662024  34672  24640 S   1.0  0.4   0:21.38 gnome-terminal


Example:-

Here, we see libc mapped as follows, for PID 2003

mukund@mukund-desktop:~$ pmap 2003 | grep libc
00007f9277d1d000   1772K r-x-- libc-2.19.so
00007f9277ed8000   2044K ----- libc-2.19.so
00007f92780d7000     16K r---- libc-2.19.so
00007f92780db000      8K rw--- libc-2.19.so


For PID 2205, we see the same libc mapped as follows:-

mukund@mukund-desktop:~$ pmap 2205 | grep libc
00007f30c6580000   1772K r-x-- libc-2.19.so
00007f30c673b000   2044K ----- libc-2.19.so
00007f30c693a000     16K r---- libc-2.19.so
00007f30c693e000      8K rw--- libc-2.19.so


So, we can see that there is only one physical copy of the file, however mapped into the virtual address space of two different processes. However, top counts this as two separate 'RES' memory.
This script can be used to determine a pretty accurate picture of the actual physical memory being used.

6. cat /proc/<PID>/smaps:- This command can be used to find the clean, dirty pages per process.

7.  cat /proc/meminfo: This can be used to find the total memory related information in the system. This below command can be used to find inactive and dirty pages.

mukund@mukund-desktop:~/Desktop$ cat /proc/meminfo | egrep -i "inactive|dirty"
Inactive:         725508 kB
Inactive(anon):    16384 kB
Inactive(file):   709124 kB
Dirty:                88 kB


The anonymous memory is the memory not mapped onto files, like the heap or the stack memory. This memory can only be moved to the swap space to page out.

8. min_free is the amount of memory after which the kernel(kswapd0 therad) starts seeing OOM and starts freeing memory.

mukund@mukund-desktop:~/Desktop$ ps -aux|grep swap
root        52  0.0  0.0      0     0 ?        S    Jul06   0:00 [kswapd0]


mukund@mukund-desktop:~/Desktop$ sudo sysctl -a | grep -i min_free
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
vm.min_free_kbytes = 11350


9. sysq-trigger:- This is an interesting command to do special requests to the Linux kernel. The documentation can be found here.

Example:-

echo f > sysrq-trigger

This triggers the OOM killer.

10. lscpu:- This is a nice utility to get great information about your CPU.


mukund@mukund-desktop:~/Desktop$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 42
Stepping:              7
CPU MHz:               2964.828
BogoMIPS:              5587.24
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0-3



11.ps -o min_flt,maj_flt <pid>: This command can be used to find the major and minor page faults of a process.


Sunday, 19 May 2013

Linear Algebra - 6

Hi,

So far, we have covered little bits about vectors, matrices, etc. Now, let's discuss about functions. These are the ideas that later lead us to transformations. So, let's begin!

What are functions?

In simple terms, Functions are basically a mapping from one set to another. If we take examples like sin, cos, log, etc, these are basically functions that take an input and give an output. So, it takes a number and gives a number. And if you're familiar with computer programming, here is what a C function looks like:

int sample_function(int num)
{
    return num * num;
}
 

This takes a number, and gives the square of it. Mathematically, this can be written as:$$f(x) = x ^ 2$$

Scalar valued and Vector valued functions:

Functions like sin, cos and the above example are all called as scalar valued functions. What this means is, these functions return a value in R. So, the output is generally a real number. But, functions are not restricted to just this. Functions can be what we call as vector valued functions. These are functions that return tuples of Rn. For example: $$f(x_1, x_2, x_3) = (x_1 + x_2, x_1 + x_3)$$
Now, this is a perfectly valid vector valued function. These functions generally do a mapping from$$R^m -> R^n$$ 
In the above example, it is a mapping from R3 to R2. These functions are called as Transformations. It is just a different name given to the same concept we had for scalar valued functions. The general notation used for transformation is T.


Linear Transformations

A linear transformation is a function(or transformation) iff $$T(\vec a + \vec b) = T(\vec a) + T(\vec b)$$ $$T(c\vec a) = cT(\vec a)$$
If a transformation obeys the above two points, it is a linear transformation. A linear transformation generally would involve linear combinations of components.
Now, let us take an example. Let $$T(x_1, x_2) = (x_1 + x_2, 3x_1)$$
Now, let $$\vec a = (a_1, a_2)   and   \vec b = (b_1, b_2)$$ $$T(\vec a) + T(\vec b) = (a_1 + a_2, 3a_1) + (b_1 + b_2, 3b_1) = (a_1 + a_2 + b_1 + b_2, 3a_1 + 3b_1)$$
 Now,$$T(\vec a + \vec b) = (a_1 + b_1 + a_2 + b_2, 3(a_1 + b_1)$$
Clearly, both these are same. Similarly, we can prove that $$T(c\vec a) = cT(\vec a)$$
 Thus, the above transformation is a linear transformation. Now, it can be shown that $$T(x_1, x_2) = (x_1^2, 0)$$ is a non-linear transformation. So, generally, transformations would be linear if they have linear combinations of components.




Linear Algebra - 5

Hi,

Now that we have seen a little about matrices, let's now discuss about matrix multiplication. I will not be discussing how matrix multiplication is done. That is essentially a method. However, i will try to express what matrix multiplication essentially is. Please correct me if there are mistakes in my posts.

Matrix Multiplication by a vector

Before we dive into multiplication of matrices with vectors, let's closely examine what truly a matrix is.
Now, a matrix is essentially a bunch of vectors. We can visualize a matrix in two ways. 

a) Matrix as a bunch of row vectors:$${\begin{pmatrix} \vec r_1  \\ \vec r_2 \end{pmatrix}}$$

Here,$$ \vec r_1 = (a_1, a_2, a_3..a_n) $$$$ \vec r_2 = (b_1, b_2, b_3..b_n) $$  

Thus, the matrix is an  2 x n matrix. since it has two rows and n columns. 

b)Matrix as a bunch of column vectors
Now, the same matrix can be expressed in another way:$${\begin{pmatrix} \vec c_1 \vec c_2 .... \vec c_n \end{pmatrix}}$$
where $$c_1 = (a_1, b_1)$$$$c_2 = (a_2, b_2)$$$$c_n = (a_n, b_n)$$

Thus, we have two ways of looking at the same matrix!Now, let's take a look at matrix multiplication. A matrix multiplication with a vector is essentially the dot product of the vector with each of the row vectors of the matrix. Or, it can also be seen as the sum of the scalar products or the column vectors.

Let's take an example. Let's take a 2x2 matrix $$A = {\begin{pmatrix} 1   2  \\ 3   4 \end{pmatrix}}$$ or
$$A = {\begin{pmatrix} \vec r_1  \\ \vec r_2 \end{pmatrix}}$$
where $$ \vec r_1 = (1, 2)$$$$\vec r_2 = (3, 4)$$
Now, let $$\vec x = (x_1, x_2) $$Now, A.X can be written as $${\begin{pmatrix} \vec r_1   dot  \vec x\\ \vec r_2  dot   \vec x \end{pmatrix}}$$The same can also be written as:$$x_1 * \vec c_1 + x_2 * \vec c_2$$ 

where c1 and c2 are column vectors of A, i.e c1 = (1, 3) and c2 = (2, 4).Thus, this is an interesting way in which matrices can be looked at.
 

Wednesday, 8 May 2013

Linear Algebra - 4

Hi,

So, we have covered some stuff about vectors and their linear combinations, let us now take a look at matrices. Matrices are mathematical objects used to represent arrays of numbers in a neat way. They are mostly used to represent a system of linear transformations. Please note that these are brief overview of concepts and not an in-depth analysis containing proofs. As mentioned in Part-1 of these blogs, these are meant as "notes" for the Khan Academy videos. Please watch the videos for detailed explanation of these concepts.

For example, $$ a * \vec x + b * \vec y = \vec c  |  \vec a, \vec b \in R^2$$$$ a * (x_1, x2_) + b * (y_1, y_2) = (c_1, c_2)  |  \vec a, \vec b \in R^2$$

This can be represented using matrices as, and let AB be the product. 
 

$${\begin{pmatrix} x_1 & y_1  \\  x_2 & y_2 \end{pmatrix}*\begin{pmatrix} a \\ b \end{pmatrix}=AB}$$

Now the right hand side of the equation previous to the above can be represented as,$${\begin{pmatrix} c_1 \\  c_2 \end{pmatrix}=AB}$$

Thus, using matrices, we can neatly represent the above linear system of equations.Operations that can be performed on matrices are:
- Addition
- Subtraction
- Scalar Multiplication
- Matrix Multiplication


Null Space of  a Matrix:

Since we have seen briefly what a matrix is, let us define the null space of matrix. We can visualise a matrix as a set of row vectors or as a set of column vectors.

Here, we can visualize the matrix as consisting of vectors(x1, x2) and (y1, y2) or as (x1, y1) and (x2, y2) $${\begin{pmatrix} x_1 & y_1  \\  x_2 & y_2 \end{pmatrix}}$$

Now, the Null-Space of a matrix A is the set of all vectors of the matrix such that
$$A \vec X = \vec 0 $$
If the above condition is true, then vector X belongs to the null-space of matrix A.
 
It is interesting to note that if the column vectors of a matrix are linearly independent, then 0 vector is the only element of the null-space of the matrix. 

Column Space of  a Matrix:

The column space of a matrix is the set of all possible combinations of all the column vectors of the matrix. Hmm, linear combinations of vectors? Ring a bell?
Yes, it is the span. Thus, column space of a matrix is nothing but the span of all the column vectors of A.

Consider a matrix$${A=\begin{pmatrix} x_1 & y_1  \\  x_2 & y_2 \end{pmatrix}}$$
Thus, $$C(A) = span(x, y) $$

Note: if the column vectors of a matrix are linearly independent, then they can be the basis for the column space of A.

Linear Algebra - 3

Hi,

Now that we have seen what span of a set of vectors is, let us see some more concepts. 

What is linear dependence of vectors?

Vectors are linearly dependent iff one vectors can be obtained from another(one vector is the scalar product of the other). For example, (3, 0) can be obtained from (1, 0). Thus, these two vectors are linearly dependent. There are different ways to check for linear dependency of vectors.

One way is as follows: 
Consider a linear combination of vectors a and b as follows:
$$ c_1 * \vec a+ c_2 * \vec b  = \vec 0 $$  
Now, if c1 or c2 is non-zero, the a and b are linearly dependent.

Linear Subspace of Rn

The linear subspace of Rn is defined as follows:

S is the linear subspace of Rn iff:

  1. S contains the 0 vector.
  2. For any vector X in S, c * X is also in S. (Closure under scalar multiplication).
  3. For any 2 vectors a and b, (a + b) is in S. (Closure under addition).
Thus, if any set S in Rn satisfies the above 3 conditions, the it is a linear subspace of Rn.

Let us consider an example:

Is 0 vector a valid linear subspace under Rn?

Now, let us verify the 3 conditions:

- Does it have the 0 vector? yes
- Is it closed under scalar addition?
      Let us check: c * 0vec = 0vec. Thus c * 0vec belongs to the set. So, yes
- Is it closed under addition?
      0vec + 0vec = 0vec. Thus, it is closed under addition. So, yes

Thus, 0vector is a valid linear subspace under Rn.


Basis of a (linear) Subspace:


Consider a set:$$V=span( V_1 , V_2, V_3 ... V_n  |  V_i \in R^n)$$

Now, if this set is linearly independent, then (V1, V2, .. Vn) is a basis for V. 

So, what this means is, we can obtain any vector in the span of (V1, V2, .. Vn), by using a linear combination of the vectors (V1, V2, .. Vn).
Also, note that the basis should be a minimum set such that it covers the subspace.

Let us take an example and see this: Now, let $$ S = span( (1, 0) , (0, 1) ) $$
Clearly,  (1, 0) , (0, 1) are linearly independent. So, this can be a basis for span((1, 0) , (0, 1)). Now, let us take another example:


$$ S = span( (1, 0) , (0, 1), (4, 0)) $$
 Here,  (1, 0) , (0, 1), (4, 0) are not linearly independent since one of the vectors here is redundant(in that one can be generated from the other). Although
span((1, 0) , (0, 1), (4, 0)) would be a valid subspace, this is not the minimum set with which we can cover the subspace. Thus, in this case, S cannot be the basis for the span((1, 0) , (0, 1), (4, 0)).

Note: It is quite interesting to note that although (1, 0) and (0, 1) are chosen as the most common basis vectors for R2, there can be several other vectors that can be the basis for Rn!
 

 

Linear Algebra - 2

Hi,

In the last post, we say some basics of vectors. Now, let us dive a little more into the topic.  Now, we saw that vectors can be added, subtracted and multiplied(cross and dot). Vectors can also be multiplied by constants.

So,$$c*V=\{ c*X_1 , c* X_2,  c*X_3 ... c*X_n  |  c_i \in R \}$$ 

Now, a linear combination of entities is an expression obtained by multiplying each entity by a constant and adding results.

eg: ax + by(where a and b are constants) is a linear combination of x and y.


What is a linear combination of vectors?

A linear combination of vectors is defined as follows:

$$\{ c_1*X_1  + c_2 *  X_2 + c_3 * X_3 + ... c_n * X_n  |   c_i \in R \}$$ 

This represents all the  linear combinations of vectors (X1...Xn).


Span of a set of vector(s)

Now that we saw what it means by a linear combination of vectors, now let us define an interesting concept called as span. The span of a set of vectors is defined as follows:

$$span(X_1, X_2, ... X_n)=\{ c_1*X_1  + c_2 *  X_2 + c_n * X_3 + ... c_n * X_n \}$$ 

Thus, the span of a set of vectors is the set of all linear combination of the vectors.

Now, if we take the example. $$span( (1, 0), (0, 1) )$$
So, by definition, span is the set of all the linear combination of (1, 0) and (0, 1).  Now, we can make an interesting question.


Can the span of vectors constitute Rn?

What this mean is, can the span cover the entire space? Now, let us see our above example of (1, 0) and (0, 1). Clearly, the span gives the following:

$$span( (1, 0), (0, 1) )=\{ c_1 * (1, 0)  + c_2 *  (0, 1) + c_3 * (1, 0) + c_4 * (0, 1) + ... \}$$

So, we can have all sorts of constants, any number of times. Now, let us see if this covers the entire two dimensional space R2. Can we derive an arbitrary vector (x, y) 
from these two vectors? Let us see.

Now, if we need to derive any vector (x, y) from the above two vectors, then we have
$$ c_1 * (1, 0) + .. +  c_m * (1, 0) + c_2 * (0, 1) + ... + c_n * (0, 1) = (x, y) $$
$$ c_1 + ... + c_m = x $$
$$ c_2 + ... + c_n  = y $$

Now, since all of the constants "c"s belong to R, it is definitely possible to obtain a solution for x and y, as long as there is at least one non-zero constant. Thus, if we obtain a non-zero solution for the equation above, this means that the span( (1, 0), (0, 1) ) indeed covers R2.


Can span() of any vectors constitute Rn?

No, the span() would constitute Rn iff every vector of Rn can be obtained by the vectors. Now, clearly, span((1, 0), (0, 1)) constitutes R2. However, take the following example: $$span((1, 0), (2, 0))$$

Clearly, any linear combination of (1, 0) and (2, 0) cannot generate for example, the vector (1, 1). Thus, the span of these two vectors cannot constitute Rn.







Linear Algebra - 1

Hi,

I have been watching online video lectures about Linear Algebra, and i thought i can make some "notes" about it. So, i will make small notes in multiple blog entries.

The videos are by the Khan Academy(taught by Sal Khan). These videos are just awesome and please share it with everyone you know who might be interested in this.


Link

Of course, i will not make notes for every video. I will just be making some important points.

Here we go:
Now, let's start with basics and define every concept.

So, what is Linear Algebra?

Quoting Wikipedia,

Linear algebra is the branch of mathematics concerning vector spaces, often finite or countably infinite dimensional, as well as linear mappings between such spaces.

 Now, let us go in depth and describe this. 

What is a vector?


A vector can be defined as an ordered set of Real numbers.
So, simply stated, $$V=\{ X_1 , X_2, X_3 ... X_n | X_i \in R \}$$ 

What this means is, a vector is an n-component entity where each of the components belong to the set of real numbers.

For example, consider the 3-Dimensional vector {1, 0, 0}. Now, this is part of the 3-Dimensional space. Now, let us define what a vector space is:


What is a vector space?

A vector space is defined as:$$R^n=\{ (x_1 , x_2, x_3 ... x_n)  |  x_i \in R \}$$ 

This is the set of all "n-component entities" such that for each component, the definition above for a vector holds good. In other words, this is the set of all n-component vectors. The "n" represents the dimensionality of the space.

For example the 3-Dimensional space can be represented as: $$R^3=\{ (x_1 , x_2, x_3 )|  x_i \in R \}$$  
Thus, this set represents all 3-Dimensional vectors.


Now, vectors are entities that are independent of an origin. i.e the vector (1, 0, 0) can be from (0, 0, 0) to (1, 0, 0) or it can be a vector from (2, 0, 0) to (1, 0, 0). They represent an entity that has a magnitude and a direction. 

Vectors can be defined with the following operations:
- Addition
- Subtraction
- Multiplication(Cross and Dot product)


Let us cover some other interesting aspects of  vectors in the next blog entry.
I will try to make sure not to fill one blog entry with too many concepts.