Debugging with git bisect

This post is an appreciation of “git bisect”, one of the most powerful tools for finding the root cause of a broken build or a broken branch.

Here is a simple example of how “git bisect” can be used to find a bad commit.

Let's assume that we have a git repository with hundreds of commits and that the HEAD of the master branch is currently broken. Our objective is to find out which commit introduced the bug into the code base.

Before starting the git bisect process we need to know a couple of things. First, we need to know a good commit, i.e. an older commit at which the code worked as expected. This is usually not difficult to find, as it is most likely the last release of the code. We also need to know the steps to test the code and reproduce the issue, so that we can tell whether a given commit is good or bad during the bisect process.

Git bisect runs a binary search over the commits between the good and the bad commit to find the commit that introduced the bug.

Here are the commands to start the git bisect workflow. Let's call the current HEAD commit the “original HEAD”.

$ git bisect start

$ git bisect bad

$ git bisect good  <commit ID>

Bisecting: 130 revisions left to test after this (roughly 7 steps)

Once the above commands are executed, git bisect checks out a commit roughly in the middle of the range between the “original HEAD” and the good commit. Read about binary search if you want to know how it decides which commit to move HEAD to.

At this point we are expected to test the code and check whether we can reproduce the issue. After testing, we tell git bisect whether this is a bad commit, i.e. we can reproduce the issue (see below), or a good one.

$ git bisect bad
Bisecting: 65 revisions left to test after this (roughly 6 steps)

Or

$ git bisect good
Bisecting: 65 revisions left to test after this (roughly 6 steps)

We need to repeat this process a few times, and git bisect will then report the commit which introduced the bug. Once you are done, “git bisect reset” takes you back to the original HEAD.
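The manual good/bad loop can also be automated with “git bisect run”, which runs a script at every step and reads the verdict from its exit code: 0 means good, 125 means the commit cannot be tested and should be skipped, and any other code from 1 to 127 means bad. Below is a minimal sketch of a helper for such a script. The function name check_commit and the “make” / “make test” commands are placeholders for whatever builds and tests your project.

```shell
# check_commit BUILD_CMD TEST_CMD
# Maps the result of a build command and a test command to the exit codes
# that "git bisect run" understands.
check_commit() {
    build_cmd=$1
    test_cmd=$2

    # If the commit does not even build we cannot judge it: ask git to skip it.
    if ! sh -c "$build_cmd" >/dev/null 2>&1; then
        return 125
    fi

    # The build is fine, so the test result decides: 0 = good, 1 = bad.
    if sh -c "$test_cmd" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# Typical use: save this as an executable ./bisect-check.sh whose last line is
#     check_commit "make" "make test"
# and then run:
#     git bisect start HEAD <good commit ID>
#     git bisect run ./bisect-check.sh
#     git bisect reset
```

Because the script's exit status is that of its last command, the return value of check_commit is what git bisect sees.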

In my experience I always get to the commit which introduced the issue in 4 to 5 steps of git bisect, which I think is awesome.

So go ahead and try git bisect if you have not tried it yet, and do not forget to use it when you have broken builds.

A Docker Workshop

At FUDCon Pune 2015 we conducted an introductory Docker workshop. It was well attended and we got positive feedback about it. While preparing for the Linux container track at FUDCon, we decided to put all the documentation on GitHub. The idea was to keep the content open for collaboration so that others can contribute to it and reuse it. Thanks to Neependra for the idea.

Here is the GitHub link which has the workshop content.

The GitHub project also contains some other useful material, e.g. Hands on Kubernetes, which you might find useful.

The workshop takes around 3 hours to complete. It is really useful if you are new to Docker and want to learn by doing some hands-on exercises.

Cherry pick a PR (pull request) from github

Sometimes you might want to test a pull request (from GitHub) on your local machine by cherry-picking it. This usually happens before it gets merged in the upstream repo and released by the project.

I searched the internet but did not find a good reference on how to do it. After a little bit of trial and error I came up with the steps below.

Cherry picking a pull request:

For example, say you want to cherry-pick https://github.com/fgrehm/vagrant-cachier/pull/164
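One way to do this, as a sketch: GitHub publishes every pull request under the read-only ref pull/<number>/head, so it can be fetched into a local branch and its commits cherry-picked from there. The remote name origin, the local branch name pr-<number> and the helper names pr_refspec and cherry_pick_pr are assumptions made for illustration. A pull request that contains merge commits would need extra care.

```shell
# pr_refspec PR_NUMBER
# Builds the refspec that maps GitHub's pull/<number>/head ref to a
# local branch named pr-<number>.
pr_refspec() {
    printf 'pull/%s/head:pr-%s\n' "$1" "$1"
}

# cherry_pick_pr REMOTE PR_NUMBER
# Fetches the pull request into the local branch pr-<number> and applies
# the commits it adds (those not yet in HEAD) on top of the current branch.
cherry_pick_pr() {
    remote=$1
    pr=$2
    git fetch "$remote" "$(pr_refspec "$pr")" &&
        git cherry-pick "HEAD..pr-$pr"
}

# Example, run inside a clone of fgrehm/vagrant-cachier:
#     cherry_pick_pr origin 164
```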

Cherry picking a commit:

vagrant-cachier in Fedora 23 with KVM Libvirt

Vagrant cachier is a very useful plugin for Vagrant users. It helps to reduce the time spent and the amount of packages downloaded from the internet after each “vagrant destroy”.

For example, say you are using a CentOS 7 image in your Vagrant setup and want to update it with the latest packages every time you start working in the guest. The usual workflow is then “vagrant up” -> “vagrant ssh” -> “sudo yum update -y” -> “Do your stuff” -> “vagrant destroy”. But the amount of packages downloaded during yum update, and the time it takes, is undesirable.

vagrant-cachier keeps the downloaded packages in the file system of the host machine and uses this as a cache for the guest. The yum update in the guest gets the packages from the cache, so the time and internet usage are drastically reduced. Which is really cool!

I tried to install vagrant-cachier on my Fedora 23 laptop with KVM and libvirt and ran into the issue below.

Issue:

[root@dhcp35-203 ~]# vagrant plugin install vagrant-cachier
Installing the 'vagrant-cachier' plugin. This can take a few minutes...
Bundler, the underlying system Vagrant uses to install plugins,
reported an error. The error is shown below. These errors are usually
caused by misconfigured plugin installations or transient network
issues. The error from Bundler is:

An error occurred while installing ruby-libvirt (0.5.2), and Bundler cannot continue.
Make sure that `gem install ruby-libvirt -v '0.5.2'` succeeds before bundling.

Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

/usr/bin/ruby -r ./siteconf20151027-20676-13hfub7.rb extconf.rb
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of necessary
libraries and/or headers. Check the mkmf.log file for more details. You may
need configuration options.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
extconf.rb:73:in `<main>': libvirt library not found in default locations (RuntimeError)

extconf failed, exit code 1

Gem files will remain installed in /root/.vagrant.d/gems/gems/ruby-libvirt-0.5.2 for inspection.
Results logged to /root/.vagrant.d/gems/extensions/x86_64-linux/ruby-libvirt-0.5.2/gem_make.out

After installing the “libvirt-devel” package the issue was resolved.

[root@dhcp35-203 ~]# dnf install libvirt-devel

[root@dhcp35-203 ~]# vagrant plugin install vagrant-cachier
Installing the 'vagrant-cachier' plugin. This can take a few minutes...
Installed the plugin 'vagrant-cachier (1.2.1)'!

However, the “vagrant up” command then failed too. First I initialised a CentOS 7 box:

$ vagrant init centos/7

Then we need to modify the Vagrantfile, as vagrant-cachier by default uses NFS to mount the host file system into the guest.

$ cat Vagrantfile
Vagrant.configure(2) do |config|
  config.vm.box = "centos/7"
  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.scope = :box

    config.cache.synced_folder_opts = {
      type: :nfs,
      mount_options: ['rw', 'vers=3', 'tcp', 'nolock']
    }
  end

end

The next step was:

$ vagrant up
xxxxxxxxxxxxxxxxxxxx
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

mount -o 'rw,vers=3,tcp,nolock' 192.168.121.1:'/home/lmohanty/.vagrant.d/cache/fedora/23-cloud-base' /tmp/vagrant-cache

Stdout from the command:

Stderr from the command:

mount.nfs: Connection timed out

After a little troubleshooting it turned out to be a firewall (iptables) issue: iptables on the host was blocking the NFS service. As a temporary workaround I flushed all the iptables rules on the host.

$ iptables -F

After that “vagrant up” worked fine and I could see the changes vagrant-cachier made in the guest to make the caching work.
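Flushing every rule is a blunt instrument. A narrower alternative, as a rough sketch, is to open only the NFS-related services in firewalld. This is untested against the setup above; the helper name open_nfs_for_guests and the zone name are assumptions, while nfs, mountd and rpc-bind are standard firewalld service names.

```shell
# open_nfs_for_guests ZONE
# Prints the firewall-cmd calls that allow NFSv3 traffic (the Vagrantfile
# mounts with vers=3) in the given firewalld zone, instead of flushing
# every rule. The rules are runtime-only, so they vanish on reload.
open_nfs_for_guests() {
    zone=$1
    for service in nfs mountd rpc-bind; do
        printf 'firewall-cmd --zone=%s --add-service=%s\n' "$zone" "$service"
    done
}

# Example (check your zone with "firewall-cmd --get-active-zones"):
#     open_nfs_for_guests public | sudo sh
```

Printing the commands rather than running them keeps the helper safe to try and easy to review before anything touches the firewall.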

Here are the things vagrant-cachier does for the caching to work.

  • Mounts ~/.vagrant.d/cache/<guest-name> from the host in the guest on /tmp/vagrant-cache/
  • In the guest
    • It enables yum caching, i.e. sed -i 's/keepcache=0/keepcache=1/g' /etc/yum.conf
    • It creates a symlink from /var/cache/yum to /tmp/vagrant-cache/yum
[vagrant@localhost ~]$ ls -l /var/cache
total 8
drwx------. 2 root root 4096 Nov 15 00:08 ldconfig
drwxr-xr-x. 2 root root 4096 Jun  9  2014 man
lrwxrwxrwx. 1 root root   22 Nov 15 00:06 yum -> /tmp/vagrant-cache/yum
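To make the two guest-side changes concrete, here is a small sketch that reproduces them by hand. It is not the plugin's own code (vagrant-cachier is written in Ruby); the function name enable_yum_cache is made up for illustration, and the paths are passed in so it can be tried safely on scratch files.

```shell
# enable_yum_cache YUM_CONF CACHE_DIR YUM_CACHE_LINK
# Mimics the two guest-side changes listed above:
#   1. switch keepcache on in the yum configuration
#   2. point yum's cache directory at the shared cache via a symlink
enable_yum_cache() {
    yum_conf=$1
    cache_dir=$2
    link=$3

    sed -i 's/keepcache=0/keepcache=1/g' "$yum_conf"
    mkdir -p "$cache_dir/yum"

    # Keep an existing real cache directory by moving it aside.
    if [ -d "$link" ] && [ ! -L "$link" ]; then
        mv "$link" "$link.orig"
    fi
    ln -sfn "$cache_dir/yum" "$link"
}

# On a real guest the arguments would be:
#     enable_yum_cache /etc/yum.conf /tmp/vagrant-cache /var/cache/yum
```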

vagrant-cachier works fine with CentOS 7 guests. However, I found an issue with Fedora 23 guests, as the default package manager there is dnf instead of yum. I have filed an issue with vagrant-cachier and am also working on a fix.

Bangalore CentOS Dojo, 2014

The first CentOS Dojo in India took place in Bangalore on Saturday, 15th November 2014, at the Red Hat Bangalore office. Red Hat sponsored the event.

I was a co-organizer of the Dojo along with Dominic and Karanbir Singh. Around 90 people RSVPed for the event, but around 40 (mostly system administrators and new users) attended.

The first talk was by Aditya Patawari on “An introduction to Docker and Project Atomic”. The talk included a demo and introduced the audience to Docker and Atomic host. Most of the attendees had questions on Docker, as they had used it or had heard about it. There were also some questions about the differences between CoreOS and Project Atomic. The slides are available at http://www.slideshare.net/AdityaPatawari/docker-centosdojo. Overall this talk gave a fair idea about Docker and Project Atomic.

The second talk was “Be Secure with SELinux Gyan” by Rejy M Cyriac. This session was about troubleshooting SELinux issues and an introduction to creating custom SELinux policy modules. Rejy made the talk interesting by distributing SELinux stickers to attendees who asked interesting questions or answered questions. Slides can be found here.

After these two talks we took a lunch break of around 1 hour. During the lunch break we distributed the CentOS T-shirts and got a chance to socialize with the attendees.

The first session post lunch was “Scale out storage on CentOS using GlusterFS” by Raghavendra Talur. The talk introduced the audience to GlusterFS and its important high-level concepts, and included a demo using packages from the CentOS Storage SIG. Slides can be found at slideshare.

The next session was “Network Debugging” by Jijesh Kalliyat. This talk covered almost all the basic concepts and the network diagnostic tools required to troubleshoot a network issue. It also included a demo of using Wireshark and tcpdump to debug network issues. Slides are available here.

Before the next talk we took a short break and clicked some group pictures of everyone present at the Dojo.

The last session was “Systemd on CentOS” by Saifi Khan. The talk covered a lot of areas, e.g. a comparison between SysVinit and systemd, concurrency at scale, how systemd is more scalable than other available init systems, some similarity of design principles with CoreOS, and how it is better suited for Linux container technology. Saifi also talked about how systemd has saved his system from becoming unusable. His liking for systemd was quite evident from the talk and his enthusiasm.

Overall it was an awesome experience participating in the Dojo, as it covered a wide variety of topics which are important for deploying CentOS for various purposes.

Bangalore Dojo link: http://wiki.centos.org/Events/Dojo/Bangalore2014

Group Photo. You can see happy faces there 🙂

[Photo: Bangalore Dojo, 2014]

GlusterFS VFS plugin for Samba

Here are the topics this blog is going to cover.

  • Samba Server
  • Samba VFS
  • Libgfapi
  • GlusterFS VFS plugin for Samba and libgfapi
  • Without GlusterFS VFS plugin
  • FUSE mount vs VFS plugin

About Samba Server:

Samba server runs on Unix and Linux/GNU operating systems. Windows clients can talk to Linux/GNU/Unix systems through Samba server; it provides interoperability between Windows and Linux/Unix systems. Initially it was created to provide printer sharing and file sharing between Unix/Linux and Windows. By now the Samba project does much more than just file and printer sharing.

Samba server works as a semantic translation engine. Windows clients talk in Windows syntax, e.g. the SMB protocol, while Unix/Linux/GNU file systems understand requests in POSIX. Samba converts Windows syntax to *nix/GNU syntax and vice versa.

This article is about Samba integration with GlusterFS. For specific details I have taken the example of GlusterFS deployed on Linux/GNU.

If you have never heard of the Samba project before, you should read more about it before going further into this blog.

Here are some important pointers for further study:

  1. What is Samba?
  2. Samba introduction

Samba VFS:

Samba code is very modular in nature. The Samba VFS code is divided into two parts, i.e. the Samba VFS layer and the VFS modules.

The purpose of the Samba VFS layer is to act as an interface between the Samba server and the layers below it. When the Samba server gets requests from Windows clients over the SMB protocol, it passes them to the Samba VFS modules.

A Samba VFS module, i.e. plugin, is a shared library (.so) which implements some or all of the functions that the Samba VFS layer, i.e. the interface, makes available. Samba VFS modules can be stacked on top of each other (if they are designed to be stacked).

For more about the Samba VFS layer, please refer to http://unix4.com/w/writing-a-samba-vfs-richard-sharpe-2-oct-2011-e6438-pdf.pdf

The Samba VFS layer passes the request to the VFS modules. If the Samba share is on a native Linux/Unix file system, the call goes to the default VFS module, which forwards it to the system layer, i.e. the operating system. For a user space file system like GlusterFS, the VFS layer calls are implemented through a VFS module, i.e. the VFS plugin for GlusterFS. The plugin redirects the requests (i.e. fops) to the GlusterFS API, i.e. libgfapi. It implements or maps all VFS layer calls using libgfapi.

libgfapi:

libgfapi (i.e. the GlusterFS API) is a set of APIs which can talk to GlusterFS directly. libgfapi is another access method for GlusterFS, like NFS, SMB and FUSE. libgfapi bindings are available for C, Python, Go and more programming languages. Applications can be developed which use GlusterFS directly, without a GlusterFS volume mount.

GlusterFS VFS plugin for Samba and libgfapi:

Here is the schematic diagram of how communication works between the different layers.

[Diagram: gluster-samba-vfs-plugin]

Samba Server: This represents the Samba server and the Samba VFS layer.

VFS plugin for GlusterFS: This implements or maps the relevant VFS layer fops to libgfapi calls.

glusterd: Management daemon of a GlusterFS node, i.e. server.

glusterfsd: Brick process of a GlusterFS node, i.e. server.

The client requests come to the Samba server, which redirects the calls to the GlusterFS VFS plugin through the Samba VFS layer. The VFS plugin calls the relevant libgfapi functions. libgfapi acts as a client: it contacts glusterd for the vol file information (i.e. information about the gluster volume, translators and involved nodes), then forwards requests to the appropriate glusterfsd, i.e. brick processes, where the requests actually get serviced.

If you want to know the specifics of the setup to share a GlusterFS volume through the Samba VFS plugin, refer to the link below.

https://lalatendumohanty.wordpress.com/2014/02/11/using-glusterfs-with-samba-and-samba-vfs-plugin-for-glusterfs-on-fedora-20/

Without GlusterFS VFS plugin: 

Without the GlusterFS VFS plugin, we can still share a GlusterFS volume through the Samba server. This can be done through the native glusterfs mount, i.e. FUSE (file system in user space). We need to mount the volume using FUSE, i.e. the glusterfs native mount, on the same machine where the Samba server is running, and then share the mount point using the Samba server. As we are not using the VFS plugin for GlusterFS here, Samba will treat the mounted GlusterFS volume as a native file system. The default VFS module will be used and the file system calls will be sent to the operating system. The flow is the same as for any native file system shared through Samba.

FUSE mount vs VFS plugin:

If you are not familiar with file systems in user space, please read about FUSE, i.e. file system in user space.

For FUSE mounts, file system fops from the Samba server go to the user space FUSE mount point -> kernel VFS -> /dev/fuse -> GlusterFS, and come back along the same path. Refer to the diagrams below for details. Consider the Samba server as an application which runs on the FUSE mount point.

[Diagram: FUSE mount architecture for GlusterFS]

You can observe the context switches that happen between user space and kernel space in the above architecture. This is going to be a key differentiating factor when compared with the libgfapi-based VFS plugin.

For the Samba VFS plugin implementation, see the diagram below. With the plugin, Samba calls get converted to libgfapi calls and libgfapi forwards the requests to GlusterFS.

[Diagram: libgfapi architecture for GlusterFS]

The above pictures are copied from this presentation:

Advantages of the libgfapi-based Samba plugin vs FUSE mount:

  • With libgfapi, there are no kernel VFS layer context switches. This results in performance benefits compared to a FUSE mount.
  • With a separate Samba VFS module, i.e. plugin, features (e.g. more NTFS functionality) which native Linux file systems do not support can be provided in GlusterFS and supported through Samba.

Using GlusterFS with the GlusterFS Samba VFS plugin on Fedora

This blog covers the steps and implementation details for using the GlusterFS Samba VFS plugin.

Please refer to the link below if you are looking for architectural information about the GlusterFS Samba VFS plugin, or for the difference between a FUSE mount and the Samba VFS plugin.

https://lalatendumohanty.wordpress.com/2014/04/20/glusterfs-vfs-plugin-for-samba/

I have set up a two-node GlusterFS cluster with Fedora 20 (minimal install) VMs. Each VM has 3 separate XFS partitions of 100GB each.
One of the Gluster nodes is used as the Samba server in this setup.

I originally tested this with Fedora 20, but this example should work fine with later Fedora releases, i.e. F21 and F22.

GlusterFS Version: glusterfs-3.4.2-1.fc20.x86_64

Samba version:  samba-4.1.3-2.fc20.x86_64

Post installation, the “df -h” output looked like below in the VMs
$ df -h
Filesystem                            Size  Used Avail Use% Mounted on
/dev/mapper/fedora_dhcp159-242-root   50G  2.2G   45G   5% /
devtmpfs                              2.0G     0  2.0G   0% /dev
tmpfs                                 2.0G     0  2.0G   0% /dev/shm
tmpfs                                 2.0G  432K  2.0G   1% /run
tmpfs                                 2.0G     0  2.0G   0% /sys/fs/cgroup
tmpfs                                 2.0G     0  2.0G   0% /tmp
/dev/vda1                             477M  103M  345M  23% /boot
/dev/mapper/fedora_dhcp159-242-home   45G   52M   43G   1% /home
/dev/mapper/gluster_vg1-gluster_lv1           100G  539M  100G   1% /gluster/brick1
/dev/mapper/gluster_vg2-gluster_lv2           100G  406M  100G   1% /gluster/brick2
/dev/mapper/gluster_vg3-gluster_lv3           100G   33M  100G   1% /gluster/brick3

You can use the following commands to create the XFS partitions:
1. pvcreate /dev/vdb
2. vgcreate VG_NAME /dev/vdb
3. lvcreate -n LV_NAME -l 100%PVS VG_NAME /dev/vdb
4. mkfs.xfs -i size=512 LV_PATH

The following steps need to be performed, and packages installed, on each node (Fedora 20 in my case).

#Change SELinux to either “permissive” or “disabled” mode

# To put SELinux in permissive mode
$ setenforce 0

#To see the current mode of SELinux

$ getenforce

SELinux policy rules for Gluster are present in recent Fedora releases, e.g. F21, F22 or later, so SELinux should work fine with Gluster there.

#Remove all iptables rules, so that they do not interfere with Gluster

$ iptables -F

$ yum install glusterfs-server
$ yum install samba-vfs-glusterfs
$ yum install samba-client

#samba-vfs-glusterfs RPMs for CentOS, RHEL, Fedora19/18 are available at http://download.gluster.org/pub/gluster/glusterfs/samba/

#To start glusterd and auto start it after boot
$ systemctl start glusterd
$ systemctl enable glusterd
$ systemctl status glusterd

#To start smb and auto start it after boot
$ systemctl start smb
$ systemctl enable smb
$ systemctl status smb

#Create the gluster volume and start it. (Run the commands below from Server1_IP)

$ gluster peer probe Server2_IP
$ gluster peer status
Number of Peers: 1

Hostname: Server2_IP
Port: 24007
Uuid: aa6f71d9-0dfe-4261-a2cd-5f281632aaeb
State: Peer in Cluster (Connected)
$ gluster v create testvol Server2_IP:/gluster/brick1/testvol-b1 Server1_IP:/gluster/brick1/testvol-b2
$ gluster v start testvol

#Modify smb.conf for the Samba share

$ vi /etc/samba/smb.conf

#
[testvol]
comment = For samba share of volume testvol
path = /
read only = No
guest ok = Yes
kernel share modes = No
vfs objects = glusterfs
glusterfs:loglevel = 7
glusterfs:logfile = /var/log/samba/glusterfs-testvol.log
glusterfs:volume = testvol

#For debug logs you can change the log level to 10, e.g. “glusterfs:loglevel = 10”

# Do not miss “kernel share modes = No”, else you won't be able to write anything to the share

#Verify that your changes are correctly understood by Samba
$ testparm -s
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[homes]"
Processing section "[printers]"
Processing section "[testvol]"
Loaded services file OK.
Server role: ROLE_STANDALONE
[global]
workgroup = MYGROUP
server string = Samba Server Version %v
log file = /var/log/samba/log.%m
max log size = 50
idmap config * : backend = tdb
cups options = raw

[homes]
comment = Home Directories
read only = No
browseable = No

[printers]
comment = All Printers
path = /var/spool/samba
printable = Yes
print ok = Yes
browseable = No

[testvol]
comment = For samba share of volume testvol
path = /
read only = No
guest ok = Yes
kernel share modes = No
vfs objects = glusterfs
glusterfs:loglevel = 10
glusterfs:logfile = /var/log/samba/glusterfs-testvol.log
glusterfs:volume = testvol

#Restart the Samba service. This is not a compulsory step, as Samba picks up the latest smb.conf for new connections. But to make sure it uses the latest smb.conf, restart the service.
$ systemctl restart smb

#Set smbpasswd for root. This will be used for mounting the volume/Samba share on the client
$ smbpasswd -a root

#Mount the cifs share using the following command and it is ready for use 🙂
$ mount -t cifs -o username=root,password=<smbpassword> //Server1_IP/testvol /mnt/cifs

GlusterFS volume tuning for volume shared through Samba:

  • The Gluster volume needs to have this option set: “gluster volume set volname server.allow-insecure on”
  • In /etc/glusterfs/glusterd.vol on each Gluster node,
    add “option rpc-auth-allow-insecure on”
  • Restart glusterd on each node.
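The glusterd.vol edit can be scripted so that it is applied the same way on every node. A minimal sketch, assuming the file ends with the usual “end-volume” line; the function name allow_insecure_rpc is made up for illustration.

```shell
# allow_insecure_rpc GLUSTERD_VOL
# Adds "option rpc-auth-allow-insecure on" just before the closing
# "end-volume" line of glusterd.vol, unless the option is already there.
allow_insecure_rpc() {
    vol_file=$1

    # Running the function twice must not add the option twice.
    if grep -q 'rpc-auth-allow-insecure' "$vol_file"; then
        return 0
    fi

    tmp_file=$(mktemp)
    awk '/^end-volume/ { print "    option rpc-auth-allow-insecure on" } { print }' \
        "$vol_file" > "$tmp_file" && cat "$tmp_file" > "$vol_file"
    rm -f "$tmp_file"
}

# On each Gluster node (testvol is the volume created earlier):
#     gluster volume set testvol server.allow-insecure on
#     allow_insecure_rpc /etc/glusterfs/glusterd.vol
#     systemctl restart glusterd
```

Writing the result back with cat, rather than moving the temporary file over the original, keeps the ownership and permissions of glusterd.vol unchanged.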

For setups where the Samba server and the Gluster nodes are on different machines:

# Put “glusterfs:volfile_server = <server name/ip>” in the smb.conf settings for the specific volume

e.g.:

[testvol]
comment = For samba share of volume testvol
path = /
read only = No
guest ok = Yes
kernel share modes = No
vfs objects = glusterfs
glusterfs:loglevel = 7
glusterfs:logfile = /var/log/samba/glusterfs-testvol.log

glusterfs:volfile_server = <server name/ip>
glusterfs:volume = testvol
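If you share more than one volume, the section can be generated instead of typed by hand. A small sketch that prints the same settings as the [testvol] example; the function name gluster_share, the volume name datavol and the host gluster1.example.com are hypothetical.

```shell
# gluster_share VOLUME [VOLFILE_SERVER]
# Prints an smb.conf section for a Gluster volume, using the same settings
# as the [testvol] example. VOLFILE_SERVER is only needed when Samba and
# Gluster run on different machines.
gluster_share() {
    volume=$1
    volfile_server=${2:-}

    printf '[%s]\n' "$volume"
    printf 'comment = For samba share of volume %s\n' "$volume"
    printf 'path = /\n'
    printf 'read only = No\n'
    printf 'guest ok = Yes\n'
    printf 'kernel share modes = No\n'
    printf 'vfs objects = glusterfs\n'
    printf 'glusterfs:loglevel = 7\n'
    printf 'glusterfs:logfile = /var/log/samba/glusterfs-%s.log\n' "$volume"
    if [ -n "$volfile_server" ]; then
        printf 'glusterfs:volfile_server = %s\n' "$volfile_server"
    fi
    printf 'glusterfs:volume = %s\n' "$volume"
}

# Example: append a share for a second volume and let Samba validate it.
#     gluster_share datavol gluster1.example.com >> /etc/samba/smb.conf
#     testparm -s
```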

#Here are the packages that were installed on the nodes

rpm -qa | grep gluster
glusterfs-libs-3.4.2-1.fc20.x86_64
glusterfs-api-3.4.2-1.fc20.x86_64
glusterfs-3.4.2-1.fc20.x86_64
glusterfs-cli-3.4.2-1.fc20.x86_64
glusterfs-server-3.4.2-1.fc20.x86_64
samba-vfs-glusterfs-4.1.3-2.fc20.x86_64
glusterfs-devel-3.4.2-1.fc20.x86_64
glusterfs-fuse-3.4.2-1.fc20.x86_64
glusterfs-api-devel-3.4.2-1.fc20.x86_64

[root@dhcp159-242 ~]# rpm -qa | grep samba
samba-client-4.1.3-2.fc20.x86_64
samba-4.1.3-2.fc20.x86_64
samba-vfs-glusterfs-4.1.3-2.fc20.x86_64
samba-libs-4.1.3-2.fc20.x86_64
samba-common-4.1.3-2.fc20.x86_64

Note: The same smb.conf entries should work with CentOS6 too.

Is the open source/community model the better way?

After reading my previous blogs (blog1 and blog2), you might be wondering whether the open source/free software/community development model helps to create better software. I am going to shed some light on that in this post.

Before going further, I want to talk about the community development model. In a community development model anybody can participate in the software development, irrespective of race, religion, nationality, gender, educational qualification and social status. Anybody who uses the software, develops it, fixes bugs, creates documentation, maintains the infrastructure for the project or otherwise contributes to its success is part of the community. In a community project, the community decides the road map for the project. This actually changes the nature of the project, and we will see how later in this post. However, for the community, i.e. everybody, to participate in the project, the source code must be made available to them, and this is how source code availability becomes a bare necessity. Source code access, i.e. open source, is a precondition for the community development model. Without access to the source code, we can't follow a community development model.

I am not sure if all of you know how software is developed in a company. If you do, you can skip the paragraph below. Otherwise, let's first discuss how a software product typically gets developed in a proprietary company. Then we will compare how it is different from community development.

A company sells software to solve a problem or a set of problems, or to offer a better solution than an existing one. Before selling the product, they have to develop it. As part of the development process, they hire people to do market research: what competing products are available for the problem they are trying to solve? What should be their approach to the problem? Then they hire software engineers and put them in an R&D lab or “development lab” to create it. These engineers are responsible for writing and testing the code for the product. They are not allowed to share information about the product or its code with the outside world. When the product is ready, the company starts selling it to its customers. After the first version of the software, there might be requirements for new features and improvements in subsequent versions, to make the product better or more competitive with similar products, so that the company can make more profit selling it.

An open source/community project, however, usually gets started by an individual or a group of people taking the initiative to solve a problem for themselves. They make the source code available in the belief that it will be helpful for others too. If others find it useful, they use it. When people use a piece of software, they might find issues with it. They report the issues to the developer group or fix them themselves. Some of them add new features according to their needs. Out of gratitude for the initial help they received in the form of the software, they merge the new code/features into the original software and make it available for others to use. Gradually a community is formed. The people with the most interest in and knowledge of the project take up the role of maintaining it. A maintainer is essentially a project leader whose responsibility is to oversee the project's growth, to collaborate with community members and to understand the community's expectations of the project, among lots of other things. As the project grows, community members decide in a democratic way which features need to go into the software, which hardware they want to run it on and what the future road map should be. This leads to the development of the features people need most, i.e. the ones which solve their problems, not some fancy feature which some company executive thought would be useful for them. It also leads to better support for a wide range of hardware, as it is easier for community members to port the code to different hardware when the source code is available. Proprietary companies, in contrast, support selected hardware which gives them the maximum user base and profit. In a community the aim is to support everybody's hardware so that everybody benefits from it, rather than to maximise money or profit.

Most of the time, open source/community software has better interoperability with other open source software, because the goal is to collaborate and benefit from each other, which ultimately benefits the community. This leads to better integration between different software projects and results in a better product or software ecosystem. This is not the case with proprietary software, where such decisions depend on profit margin, future scope, the relationship between the companies involved (i.e. if the software comes from different companies) and so on. Have you started seeing the difference? 🙂

Even though community projects start with the minimum required features, they gradually become an incubation ground for innovation and new ideas. Researchers, academicians, computer scientists, corporations and governments use existing open source projects to develop something new for their own purposes. Let's take an example. A computer scientist doing research on distributed computing comes up with a new algorithm which improves distributed computing. Now he wants to implement and test his algorithm. Does he need to develop a new distributed system to implement his idea, or can he put his algorithm into an existing open source distributed system? The answer is pretty simple. He takes the source code from an open source project (something like Linux/GNU here) and implements his algorithm in it. It is up to him whether he merges the code into the existing code base and makes it available for others or keeps it to himself. But in most cases people give it back to the source from which they took the initial code. Giving code away for free doesn't mean they are not gaining anything. Code in a popular community project gives the author far more credibility, popularity, reach and respect, along with his research publication, and if he wants he can still make money out of it. There are lots of examples of PhD papers/subjects becoming famous community/open source/free software projects.

The graph shows how community-developed software overtakes proprietary software in terms of innovation in the long run.

[Graph: CommunityDevelopedSoftware]

I copied the lines below from Debian Linux/GNU's about page [1]

You may be wondering: why would people spend hours of their own time to write software, carefully package it, and then give it all away? The answers are as varied as the people who contribute. Some people like to help others. Many write programs to learn more about computers. More and more people are looking for ways to avoid the inflated price of software. A growing crowd contribute as a thank you for all the great free software they’ve received from others. Many in academia create free software to help get the results of their research into wider use. Businesses help maintain free software so they can have a say in how it develops — there’s no quicker way to get a new feature than to implement it yourself! Of course, a lot of us just find it great fun

When you are in a culture where others help you without any selfish motive, your attitude towards others also changes. You become helpful to others too. However, not everybody is kind enough to give back the enhancements they make to the source code. For those cases we have open source licenses like the GPL [2], which require them to give the enhancements back to the community that gave them the initial source code if they sell/commercialize the software with those enhancements.

Sometimes organisations contribute to or start community projects, e.g. Linux/GNU, Mozilla Firefox, Fedora, openSUSE, Chrome, OpenStack and Xen virtualization, because they understand the benefits of the community development model. We also have examples of individuals or groups of people/companies starting open source projects.

Following are the positive sides of a community driven/free software/opensource project.

  • More choice of hardware and platform. Many open source projects support a very wide range of hardware.
  • The life span of the software tends to be very long, as it is easier to fix a bug or contribute a feature than to create a new project from scratch.
  • It is easier to customize open source software according to your needs and taste. You can remove unwanted features, which keeps its IT footprint optimal.
  • It is far less likely to carry viruses or spyware, as the source code is available for everyone to see, so suspicious code rarely gets into the project and can be easily removed if it does.
  • Better interoperability, as it is easier to integrate it with other software.
  • The quality of the code in open source projects is often better than in closed source ones, as the code is reviewed and read by more people. The source also tends to be more modular because of the distributed way it is developed.
  • It helps to spread knowledge, as source code is a great source of knowledge. You can learn from others’ work.
  • It helps to avoid vendor lock-in. A company giving you commercial support for open source software can’t hold a monopoly over it. You are always free to move the support to some other company or hire engineers to support the software, as the source code is publicly available.
  • The cost of commercial support is usually lower for community-driven software. This helps organisations cut down their IT costs, which in turn lowers the cost of their products or services.
  • Minimizes software piracy. The model allows everybody to use the community version of the software at no cost, so there is no need for piracy.
  • Does not take away the freedom of users regarding how or where they want to use it.
  • Helps to create a better culture, where collaboration with others plays a key role.
  • Encourages innovation, as there is no need to reinvent the wheel and we can focus on new things.

I am quoting Linus Torvalds on open source. He has summarized it nicely.

“Me, I just don’t care about proprietary software. It’s not “evil” or “immoral,” it just doesn’t matter. I think that Open Source can do better, and I’m willing to put my money where my mouth is by working on Open Source, but it’s not a crusade – it’s just a superior way of working together and generating code.

It’s superior because it’s a lot more fun and because it makes cooperation much easier (no silly NDA’s or artificial barriers to innovation like in a proprietary setting), and I think Open Source is the right thing to do the same way I believe science is better than alchemy. Like science, Open Source allows people to build on a solid base of previous knowledge, without some silly hiding.

But I don’t think you need to think that alchemy is “evil.” It’s just pointless because you can obviously never do as well in a closed environment as you can with open scientific methods”

The topic is a very big one and it is hard to cover it in a single blog post. It is quite possible that I have missed some obvious points, so if you have any suggestions, kindly put them in the comments. I would be happy to pick them up and add them to the post.

[1] http://www.debian.org/intro/about#what

[2] http://www.gnu.org/licenses/gpl.html

How much open source software do we really use?

You might have guessed from my last post that I love free and open source software. Let me tell you that you are absolutely correct. When I think about the journey of free software and open source, I feel mesmerized. I bet you will feel the same when you realize how a revolution started by a single person, Richard Matthew Stallman (yes, we call it the free software revolution), now sets the rules and standards for technology.
It is not an overstatement but a fact. The story so far has been incredible. The journey is about how the idea of free software has changed our world. The saying “you can’t stop an idea whose time has come” holds true for free and open source software today.
I will talk about this revolution in future posts for sure. But here I want to talk about a very basic question.

Before you go further, you should know that in free software the word “free” means freedom, not free of cost. So when you read “free software”, read it as “freedom software”.

In my last post I talked about open source and free software and how good they are for our society and for our freedom. I think some of you liked the idea and appreciate it.

By now it is not a new concept. There are thousands of open source projects. Oh wait, let me correct that: millions of open source projects are currently being used or developed. Interesting, isn’t it? Take a look at GitHub [1], a code hosting and sharing website mainly used for open source community projects. As per the latest stats [2] it hosts more than 6 million repositories, and most of them are open source projects. GitHub [1] is built around git [3], which is itself open source software.

So, “How much open source software do we really use?” or “What is the reach of open source projects other than Linux, Mozilla Firefox and Android?” Some of you might have thought about this question. Let’s discuss it.

How many of you have used Wikipedia? You must be thinking I am joking, right? The answer is that we use Wikipedia every day. It is an inseparable part of today’s internet. Anybody who has used the internet to look something up must have used Wikipedia. Wikipedia is not just a simple website. It is the world’s biggest encyclopaedia. You can find information about almost any topic in it. Millions of people access it every day, and a huge number of pages get updated every day. The web application that runs Wikipedia is called “MediaWiki” [4][5], and it is developed by the Wikimedia Foundation. You might have already guessed it: MediaWiki is an open source project. You can get the source code, deploy it at your home or office and create a small Wikipedia for yourself. You can put articles into it and it will behave just like Wikipedia does. Awesome, isn’t it? Many organisations love to have a local wiki for their company knowledge base, and they can just deploy MediaWiki. They don’t have to develop software for it. Even if they did, there is no guarantee that it would be as good as MediaWiki, and it takes a hell of a lot of time to fully develop a web application like that.

In my engineering days we used to watch lots of movies (which I think is true for the majority of engineering students ;)), and they came in many different video formats. Sometimes we couldn’t play some of the formats in Windows Media Player (at that time it was Windows XP, not sure how it is in Windows 8). So we relied on the “vlc” [6] media player to play the files. It was the de facto media player for everybody using Windows. Whenever we formatted a PC, vlc was one of the first pieces of software we installed on that machine. I used to be amazed at how vlc was able to play any video format we threw at it without a hiccup. I didn’t have an answer at that time. Now I know the answer: it is an open source project, developed by a community and not by any particular company. People from all over the world contribute to vlc development.

So if you are an engineering student with an interest in video encoding and you want to play with media streaming technology, you can get the source code for vlc, hack on it and learn more about how it works. If you have ideas for improvement, you can try them out in the code. That means if you want some improvement in the media player, you don’t have to develop a player from scratch; you can just add the necessary code and you are done :). It would be your contribution to the development of vlc. Of course you will not be paid for it, but you will get the satisfaction of contributing to a player that is used by millions of people. How good is that? For me nothing compares to the satisfaction of doing something which will benefit so many people. Now take a pause and think about it. It is a simple video application, but the example it sets has a tremendous impact. We can apply the same idea to almost every aspect of social life. We can solve bigger problems by coming together, with each of us contributing what we are good at.

There are other open source media players too; I have just taken vlc as an example.

Let’s talk about WordPress, on which my current blog runs. WordPress is now the world’s leading blogging platform. Millions of people write blogs on WordPress every day, and it is a pretty good website for writing a blog. WordPress is also a very profitable business. Its revenue is improving every year. If we take into account the future of the internet and smartphone penetration, its financial future looks very bright. WordPress is also an open source project and you can find the source code at wordpress.org. If you are a web developer who wants to know about the engineering work and take a look at the code, you are most welcome. You can deploy a WordPress instance at your home, office or anywhere you want. It follows a community development model, and you can also contribute to WordPress development.

There is another piece of software which I want to talk about: the Google Chrome browser. Chrome has very quickly become the favourite browser of lots of people. You might have noticed the pace of Chrome’s growth. So how did they develop Chrome at such a pace? It is an interesting story. Let’s begin :). It started with Chrome OS. Google, being a dominant web company, wanted to develop a web-based operating system. The OS would use cloud technologies, and the applications would be accessed through a browser. That means users don’t have to update the applications and the operating system. That would be taken care of by Google. As for computing power, it needs less of it, as it just has to run a browser. Chrome OS is very important for Google’s future plans, as we expect the world to move to cloud-based technologies. So they wanted a browser for Chrome OS. A good browser was also needed for the Android ecosystem. Like every company, they wanted it fast. They could have developed a browser from scratch, but they didn’t want to reinvent the wheel: open source browsers were already available, and building one from scratch would have taken a lot of time. Eventually they used existing open source modules and frameworks to build a web browser and started a project named Chromium. Chromium follows a community development model and it is an open source project. Google does some more testing on the Chromium browser, packages it for various operating systems and gives it to you as the Chrome browser. The initial versions of Chrome OS have been released, and the Chrome browser is an integral part of it. According to recent reports, Chrome OS based laptops topped Amazon’s chart of best-selling laptops for 119 days in 2012. That is an achievement given that the OS is in its early days and has not yet reached its full potential.

Now take a step back and think about all of this together: the open source operating system Linux, which powers websites like Google Search, Facebook, LinkedIn and Twitter; your favourite browsers like Firefox and Chrome; your favourite mobile operating system, Android; Wikipedia; players like vlc; and many more. If you have seen the video I posted with my last post, you know that more than 90% of the world’s supercomputers run Linux. These supercomputers are being used for analysis of satellite data, weather forecasting, genome mapping, research on climate and global warming, molecular modelling, solving complex mathematical problems for scientific research, and military research and applications. How do you feel now about the question we started the discussion with? Yup, open source touches our lives one way or another.

Before concluding the discussion I want to tell you that, if you deal with programming tasks, you will love to know that most of your favourite programming languages are open source :).
Maybe you haven’t realized that before. Here are some examples of open source programming languages and databases.

Java

Python

Perl

PHP

MongoDB

MySQL/MariaDB

PostgreSQL and many more 🙂

[1] https://github.com/

[2] http://thenextweb.com/insider/2013/04/11/code-sharing-site-github-turns-five-and-hits-3-5-million-users-6-million-repositories/

[3] http://git-scm.com/

[4] http://www.mediawiki.org/

[5] https://en.wikipedia.org/wiki/MediaWiki

[6] http://www.videolan.org/vlc/index.html

Open Source

There is a difference between free software and open source software, but in the context of my blog you can treat them as similar. So wherever I have used “open source”, read it as “open source or free software”, and vice versa.

What is Open Source or Free Software?

If you are clueless about the terms in the above question, let’s read about them:

http://www.gnu.org/philosophy/free-sw.html

http://en.wikipedia.org/wiki/Open_source

http://opensource.org/docs/osd

Now that you have some background information, let’s start the discussion 🙂

Let me give an analogy first. What do you do when you want to know the recipe for a dish? You ask somebody who knows it. You get the recipe and prepare the dish. While preparing it, or afterwards, you change the dish according to your taste. You can increase, decrease or change the ingredients according to your taste or requirement. Isn’t it awesome? Now let’s think the other way around. How would you feel if the person who knows the dish refused to give you the recipe, or gave it to you only on the precondition that you can’t change the recipe at all and can’t share it with anybody? That means if the dish is a little sweet, people like me who don’t like sweet food at all won’t have any choice. My South Indian friends who love spicy food can’t make the dish spicier. And even if your friends and family like the dish, you can’t share the recipe with them, which is pretty bad. It feels like somebody has restricted our freedom. Even if you can buy the dish from that particular person every time you want it, it feels bad when you can’t know how it is prepared, can’t prepare it at home and can’t modify it according to your taste.

A similar thing happens when you use proprietary software. When you install proprietary software, you can’t modify it according to your needs or give it to somebody else. If you buy proprietary software for yourself, the license is tied only to you, and even if you like the software and want to give it to somebody else, that would be illegal. Obviously you can’t see what is inside the software either :P. So if somebody sells you proprietary software along with a virus, spyware or malware, you won’t be able to know. Free software is exactly the opposite of proprietary software in this respect.

When you share the code of a piece of software along with the freedom to modify and redistribute it, it opens up infinite possibilities. We will talk about those infinite possibilities in a little while, because it is not a small topic :-).

Let’s get back to the principle behind why we think open source or free software is the right thing to do. Our society evolved on knowledge and know-how passed from our ancestors to us, and it is being passed on to future generations too. Our knowledge grows when we share it, and one idea gives birth to another. When our knowledge grows, we grow, and new ideas are born. When you share an idea with others, they also contribute to it, and it becomes stronger and better. Every invention draws ideas or inspiration from the past. We wouldn’t have the electric bulb without electricity, or microprocessors without the transistor. When you apply these principles to source code, a free software or open source project is born :).

Some people argue that even with proprietary software, companies have still come up with better software. That’s correct: software can be developed from scratch and may turn out better. But think about the time we could have saved by just enhancing or fixing the existing software rather than recreating it from scratch. It’s like reinventing the wheel.

There is another argument that if we make the source code available for free, then we can’t make money out of it or build a sustainable business model around it. Do you think it is a strong argument? At first it looks like one. But surprisingly, lots of money can be made out of open source or free software. You want me to give an example 🙂 Sure, there are many. How many of you have Android cell phones? I am sure you have one. What is interesting is that Android is an open source project and Google makes billions out of it. How many of you use the Firefox web browser? It is also an open source project, and Mozilla makes money out of it. Have you heard about Linux/GNU? It is one of the biggest open source projects, and companies like IBM, Oracle, Google, Red Hat, Novell and thousands of others make money out of it and also make our lives easier with the help of Linux. Interesting, isn’t it? You must be wondering why companies then continue to create proprietary software, and why people buy it too. That is something for you to think about.

Now let’s talk about the infinite possibilities of an open source project. We can compare open source projects to seeds that can grow into a huge tree, which can in turn give rise to many more trees. Let’s take an example. Linus Torvalds started a small hobby OS project when he could not (he was not allowed to) modify the code of a proprietary operating system called Minix. I would love to quote a few lines from his initial mail to the Minix newsgroup.

Hello everybody out there using minix –

I’m doing a (free) operating system (just a hobby, won’t be big and
professional like gnu) for 386(486) AT clones.  This has been brewing
since april, and is starting to get ready.  I’d like any feedback on
things people like/dislike in minix, as my OS resembles it somewhat
(same physical layout of the file-system (due to practical reasons)
among other things).

Soon after this mail he also shared the code of his hobby project, and the rest, my friends, is history!! :). You can understand from his mail that he never expected the project to grow to this magnitude, where more than 90% of supercomputers run Linux, and where the biggest datacenters, the majority of cell phones (Android is based on Linux) and web companies like Google, Facebook, LinkedIn and Twitter run on Linux. Recently, almost all the technologies being invented for cloud computing are based on Linux as well. Amazing, isn’t it?

But there is more to the story :). As we know, Android is based on Linux and it is an open source project. That means anybody can get the source code of the world’s most popular mobile operating system. So companies that want to make smartphones but don’t have the resources to create an OS can use the Android source code rather than reinventing the wheel. This in turn brings smartphone prices down, and people in developing countries like India who can’t afford expensive phones like the Apple iPhone can use cheaper smartphones. You can imagine the socio-economic impact when billions of people are connected to the internet. It can’t get better than this, where technology is helping people change their lives for the better. This is just one example and there are many 🙂

Here is a video about Linux to get you more interested in it.

I hope this article has increased your understanding of open source software in general.