Being Scouts


Hi Friends,


The Boy Scouts have a rule: “Always leave the campground cleaner than you found it.” What if we followed a similar rule in our code: “Always check a module in cleaner than when you checked it out.” No matter who the original author was, what if we always made some effort, no matter how small, to improve the module. What would be the result? - Uncle Bob

We all know how software with any significant code base deteriorates as time passes, because we focus on writing new code and fix only the minimum code responsible for an issue.

If your code base is clean and you follow all the relevant best practices to keep it beautiful and easy to maintain, you can skip this post.

When most people fix the code for an issue, they are afraid to touch anything else. The change might backfire because they miss some impact they should have checked, or it might take more time than it is worth, because so many people have worked on the code base for so long. And in reality, very few teams follow full TDD with tests that cover all possible scenarios.

But why can’t we make zero-risk changes in Java that move the code towards clean code and also take very little time?

I propose making the following changes whenever we come across code we need to understand while fixing an issue:

Bad naming:

What’s in a name? If Shakespeare had been born in today’s era of software, he perhaps wouldn’t have said that. Naming is that important in software maintenance: as the experts say, we should write code for humans, not for machines, considering the amount of time and work that goes into maintenance.

This is the most important of all the suggestions. Whenever you find a variable name or private method name that doesn’t match its intention, change it to something meaningful. In Eclipse it is simple to rename it along with all its references. If a variable is an immutable constant (static and final), its name should be in upper case to convey that intention. The following is also worth considering: Naming Tips.

Outdated comments:

I’ve already talked about how bad comments can be. When you see a comment that does not match the code, which is common in any long-running maintenance project, remove it. If you see commented-out code, remove it.

Unused Code:

Whenever you see unused variables or unused imports, remove them. They are very easy to detect in Eclipse.

Set of constants:

If you see a private set of related constants, use a Java enum instead.
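As a rough sketch of this change, assume a class that tracks a status with private int constants. The names OrderStatus and Order are made up for illustration only.

```java
// Hypothetical example: OrderStatus and Order are illustrative names.

// Before: private int constants let any int value slip in.
//   private static final int STATUS_PENDING = 0;
//   private static final int STATUS_SHIPPED = 1;
//   private int status = STATUS_PENDING;

// After: an enum restricts the value to the declared set.
enum OrderStatus {
    PENDING, SHIPPED, DELIVERED
}

class Order {
    private OrderStatus status = OrderStatus.PENDING;

    void ship() {
        status = OrderStatus.SHIPPED;
    }

    OrderStatus getStatus() {
        return status;
    }
}
```

Because the constants were private, the change stays inside the class, which keeps the risk low.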

Generics:

When you see a collection (private or local) that does not use generics and you know the type of data it holds, use Java generics.
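A minimal sketch of what this looks like, assuming a private list that is known to hold only strings. NameRegistry is a hypothetical class used just for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class.
class NameRegistry {
    // Before: private List names = new ArrayList();  (raw type, reads need a cast)
    // After: the element type is part of the declaration.
    private final List<String> names = new ArrayList<>();

    void register(String name) {
        names.add(name);
    }

    String first() {
        // No cast needed; the compiler knows the list holds Strings.
        return names.get(0);
    }
}
```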

for-each loops:

If you find a traditional loop that only iterates over elements and does not depend on the loop index for any other work, change it to an enhanced for loop (for-each).
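To illustrate, here is a small before/after sketch. LoopCleanup and its methods are hypothetical names; both methods compute the same result.

```java
import java.util.List;

// Hypothetical example class.
class LoopCleanup {
    // Before: the index is used only to fetch the element.
    static int totalLengthIndexed(List<String> words) {
        int total = 0;
        for (int i = 0; i < words.size(); i++) {
            total += words.get(i).length();
        }
        return total;
    }

    // After: the enhanced for loop states the intent directly.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }
}
```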

Bad Format:

Formatting is so easy and customizable in Eclipse, but we often ignore it. Just apply it to the files you worked on before any check-in/commit. Ideally it should be part of the automated build process.

This is not to say that we should improve only these things, but it would be a good start if each developer on the team followed the Boy Scout rule.

If your code is easy to understand, it is easy to maintain and improve further.

Let me know if I’ve missed something.

Happy learning,

Vishal

Comment on Comments: why I removed my own comments


Hello Friends,

Recently I learned many things about comments that made me ashamed of my own code and comments. Many people may already know a lot about comments; if you are not among them, read on.

Initially I did not write comments in my code, thinking they were not required; only occasionally would I write one so that I could understand the code myself. We usually want to finish coding as soon as possible, as that is what matters for the program’s output. Later, when I heard that comments are very important so that others can also understand the code, I started putting comments in my code liberally. But when I recently came across some articles and a book about comments, I was shocked at how bad comments can be and wondered why I hadn’t thought of it earlier. Below are the lessons I want to share.

“Comments are easier to write poorly than well, and commenting can be more damaging than helpful.” - Code Complete

Why are comments bad?

Difficult to maintain: Comments are difficult to maintain and very easy to ignore. They can easily get outdated.

//path is considered matched if the name matches with pattern and if it is folder
if(pattern == null || file.getName().matches(pattern))
{
if(type == null || ((file.isDirectory() && type.equals("Folder"))
 || (file.isFile() && type.equals("File")) || type.equals("Both")))
		matchedPaths.add(file.getPath());
}

Especially in the maintenance phase, we tend to forget to update comments as the code changes: when we change the code we know it (well.. a little bit), and comments don’t contribute to its behavior. We write code in the way that is easy to write, not easy to read. Even if we think of others and update the comments whenever we touch the code, others may not do the same. And as comments are not part of the code, available tools like static code analyzers cannot detect discrepancies between code and comments.

Some comments can be especially hard to maintain.

// Variable Meaning
// -------- -------
// xPos .......... XCoordinate Position (in meters)
// yPos .......... YCoordinate Position (in meters)
// ndsCmptng...... Needs Computing (= 0 if no computation is needed,
// = 1 if computation is needed)
// ptGrdTtl....... Point Grand Total
// ptValMax....... Point Value Maximum
// psblScrMax..... Possible Score Maximum

Introduces redundancy: Many times I’ve seen code with comments like these (including mine :|)

// set product to "base"
product = base;

// loop from 2 to "num"
for ( int i = 2; i <= num; i++ ) {
// multiply "base" by "product"
product = product * base;
}
System.out.println( "Product = " + product );

Not only does it waste space, it can add confusion if it is not accurate. The code itself should be clear enough to understand. This was the main reason why I removed my earlier comments.

Crutches for bad code: We should write code as if comments did not exist, so that the code itself is easy to follow. Someone said “Comments are like deodorant for stinky code”. Some people use them as crutches for their bad code.

// write out the sums 1..n for all n from 1 to num
cur = 1;
pre = 0;
s = 1;
for ( int i = 0; i < num; i++ ) {
System.out.println( "Sum = " + s );
s = cur + pre;
pre = cur;
cur = s;
}

Hinders readability: Some comments are so dominant that they obscure the real code.

/*************************************************
' Name: CopyString
'
' Purpose: This routine copies a string from the source
' string (source) to the target string (target).
'
' Algorithm: It gets the length of "source" and then copies each
' character, one at a time, into "target". It uses
' the loop index as an array index into both "source"
' and "target" and increments the loop/array index
' after each character is copied.
'
' Inputs: input The string to be copied
'
' Outputs: output The string to receive the copy of "input"
'
' Interface Assumptions: None
'
' Modification History: None
'
' Author: Dwight K. Coder
' Date Created: 10/1/04
' Phone: (555) 222-2255
' SSN: 111-22-3333
' Eye Color: Green
' Maiden Name: None
' Blood Type: AB-
' Mother's Maiden Name: None
' Favorite Car: Pontiac Aztek
' Personalized License Plate: "Tek-ie"
'*************************************************/

The beauty should be in the code, not in the comments. I’ve also seen old code lying around commented out in many places. If it is not required now, it should be removed rather than left lying around to confuse others.

Then what?

Good names: Rather than comments explaining things, the code should have good, meaningful, self-explanatory names for variables, methods and classes.

What do the variables below mean?

private String myString;

public Foo(String theString) {
  myString = theString;

  int aPos1 = 0;
  int aPos2;
  // ...
}

Extraction to methods/classes: Rather than having a comment for a piece of code, extract the code into a method with a descriptive name. The code should be refactored from time to time and properly modularized into components, classes and methods.

For example:

private double squareRootApproximation(double num, double tolerance) {
  double root = num / 2;
  while ( Math.abs( root - (num/root) ) > tolerance ) {
    root = 0.5 * ( root + (num/root) );
  }
  return root;
}

instead of:

....
// square root of n with Newton-Raphson approximation
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
  r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );
....

Relying on version control system: To record the history of code, rely on the version control system, not on commented-out code. Many people keep the old code commented out while writing new code, which is totally unnecessary nowadays.

/* public void doSomething(String pre, String post)
{
  ....
}*/
public void doSomething(String pre, String mid, String post)
{
  //new functioning
  ....
}

Where beneficial?

Intentions: You should state your intention while writing code, for example when you choose one of several alternatives. Explain the why, not the what or how, of the code. You can also mention important TODOs in the comments.

//Using technique A because of limitations of B and C
....
....
//get current employee information
....
....
try{
...
}catch(ABCException ignored)
{
//not handled because ...
}

Consequences: You should warn about the important consequences of using some code.

//be careful, below code may impact ... as ...
....

Names insufficient: Use a comment when a function name cannot explain something because otherwise_it_will_be_very_long_like_a_sentence.

Legal notices: Legal notices, like copyrights can be used as comments in the code.

 /*
  * Copyright (c) 1997, 2011, Oracle and/or its affiliates. All rights reserved.
  *
  ...

Standard documentation: For example Javadocs should be used in public APIs.

/**
  * Swaps the elements at the specified positions in the specified list.
  * (If the specified positions are equal, invoking this method leaves
  * the list unchanged.)  
  *
  * @param list The list in which to swap elements.
  * @param i the index of one element to be swapped.
  * @param j the index of the other element to be swapped.
  * @throws IndexOutOfBoundsException if either i or j
  *         is out of range (i < 0 || i >= list.size()
  *         || j < 0 || j >= list.size()).
  * @since 1.4
  */

Unchangeable code: If you can’t change some code, for example the results of a library call, but you want to add a clarification, use a comment.

//this gets the info related to ...
....

You may also find the articles below interesting:

http://www.codinghorror.com/blog/2008/07/coding-without-comments.html

http://www.codeodor.com/index.cfm/2008/6/18/Common-Excuses-Used-To-Comment-Code-and-What-To-Do-About-Them/2293

http://pointlessprogramming.wordpress.com/2011/03/14/on-commenting-source-code-why-commenting-is-not-bad-practice/

Happy learning, bye.

Mind it: Synchronization is Risky


Hello Friends,

Most developers know the benefits of threads (responsiveness, exploiting multiple cores, etc.), and most also know the risks (data inconsistency, deadlocks, context-switch overhead, etc.), but not all of them know how to minimize the risks while retaining the benefits. So here’s my humble attempt to simplify the understanding in the context of Java.

We know that thread synchronization is needed when multiple threads access some mutable data and at least one of them might change it. This mechanism in Java enables us to enforce two important things: atomicity and visibility. Without atomicity (i.e. performing multiple actions as a single action without interference from others in between), we may have race conditions, which occur when the correctness of a computation depends on the relative timing or interleaving of multiple threads by the runtime. Memory visibility is also very important: without synchronization, a thread may not see the latest value of a variable changed by some other thread (the compiler, runtime and processor are allowed this freedom so that they can apply optimizations like instruction reordering or caching of variable values).

But here I’m going to suggest NOT using synchronization (or at least not using locks) where possible. Why? Because it can open a can of dangerous worms if you use it without the utmost care. I’m not against the proper use of synchronization, but that is very hard to achieve, and explicit synchronization (i.e. lock-based, using the synchronized keyword) should be used as a last resort (in my opinion).

To start with, if you synchronize a big part of your code (like making all methods synchronized), you may not get the benefit of concurrency at all, as threads will execute it one after the other, and scalability suffers badly. If you reduce the lock scopes too much and use too many locks, you increase the performance overhead due to frequent lock acquisition and context switching. Applying synchronization may also prevent various optimizations done by the compiler and runtime (like caching and reordering of instructions), thus limiting performance.

Then there are some serious liveness problems, like deadlocks, where there’s no way to recover other than aborting the application. The indiscriminate use of locking may result in lock-ordering deadlocks: for example, two threads trying to acquire the same locks in a different order (T1: lockA -> lockB, and T2: lockB -> lockA) create a cyclic locking dependency and may thus deadlock. Just as threads can deadlock when each is waiting for a lock that the other holds and will not release, they can also deadlock when waiting for resources.

Such programs are also more complex and difficult to understand (which in turn may cause other problems). And last but highly important, such programs are very hard to test for correctness and performance.

So what are the alternatives or better ways?
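To make the lock-ordering hazard concrete, here is a minimal sketch of one common remedy when two locks really are needed: always acquire them in one fixed global order. The Account class, its id field and the transfer method are hypothetical and assume every account has a unique id.

```java
// Hypothetical example: Account and transfer are illustrative names.
class Account {
    private final int id;   // assumed unique; used only to define a global lock order
    private long balance;

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    long getBalance() {
        synchronized (this) {
            return balance;
        }
    }

    // Always lock the account with the smaller id first. Two opposite
    // transfers (a -> b and b -> a) then request the locks in the same
    // order, so the cyclic wait described above cannot form.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```

Had transfer simply locked from and then to, the calls transfer(a, b, x) and transfer(b, a, y) on two threads could each grab one lock and wait forever for the other.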

Thread Confinement

The best way to avoid coordination between threads is not to share. If an object is confined to a single thread, it is automatically safe from all those hazards. This is not just a theoretical idea; the model has been intentionally implemented in systems like Swing, because the problems sharing could create (deadlocks in GUI toolkits) are just too costly for those systems. Rather than sharing, we can use local variables (scoped within a method) as far as possible, as each thread has its own copy of them on its stack, avoiding any risk of sharing (but take care not to let local objects escape from the method, for example by assigning them to an instance variable). Java also provides the ThreadLocal class, which makes it easy to use a variable across threads as it internally manages a separate copy of the variable for each thread. We can share a single ThreadLocal variable and use its get/set methods without worrying about which value is associated with which thread (the get method always returns the value associated with the current thread). Check Thread Confinement.
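A small sketch of the ThreadLocal idea, assuming we want each thread to carry its own "current user" value. RequestContext and its methods are hypothetical names chosen for this example.

```java
// Hypothetical example class.
class RequestContext {
    // One shared ThreadLocal object; each thread sees only its own value.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    static void setUser(String user) {
        CURRENT_USER.set(user);
    }

    // Returns the value set by the calling thread, or null if it set none.
    static String getUser() {
        return CURRENT_USER.get();
    }

    // Drops the calling thread's value; useful when threads are reused.
    static void clear() {
        CURRENT_USER.remove();
    }
}
```

A value set on one thread is not visible from another: a new thread calling getUser() gets null until it calls setUser() itself.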

Immutability

The other best way is to share but make the shared objects immutable. If threads cannot change the state of an object, there is no risk. Initially its applicability may sound very limited, but it’s not so. The popularity of Functional Programming rests on it, as immutable data has no chance of side effects. You can also create a new immutable object when a change is required; think of String in Java. Making an object immutable isn’t just about making every field final, though, as the object a final field refers to may itself be mutable. Immutable objects are those whose state cannot be changed once they are constructed. We also need to be careful that the this reference does not escape during construction. Check To mutate or not to mutate?. If you do not want to or cannot make your classes immutable (I still suggest making parts of them immutable as far as possible to reduce side effects), you can make deep copies of the object and pass them to the threads (if the memory consumption is affordably small), and merge those copies at the end if required. (Recently in my current project, I wanted to achieve concurrency for some tasks, so rather than synchronizing I identified the minimum changeable data required and provided new copies of it on entering the concurrent tasks, and when they finished, I merged those copies into a single one as before to move ahead in the process flow. In the end I was very happy with its simplicity.)
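The points about final fields and mutable referents can be sketched as follows. Playlist is a hypothetical class; the defensive copy and the unmodifiable wrapper are one common way to keep a final field that refers to a list from being changed from outside.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example of an immutable class.
final class Playlist {
    private final String name;
    private final List<String> tracks;

    Playlist(String name, List<String> tracks) {
        this.name = name;
        // Defensive copy: later changes to the caller's list do not reach us.
        this.tracks = Collections.unmodifiableList(new ArrayList<>(tracks));
    }

    String getName() {
        return name;
    }

    // Safe to hand out: the returned list rejects modification.
    List<String> getTracks() {
        return tracks;
    }

    // A "change" produces a new object, as String does in Java.
    Playlist withTrack(String track) {
        List<String> copy = new ArrayList<>(tracks);
        copy.add(track);
        return new Playlist(name, copy);
    }
}
```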

Volatile Variables

In simple cases, like a shared status flag where you do not require atomicity of compound operations, you can use a volatile variable. A read of a volatile variable always returns the latest write by any thread (i.e. threads always see the latest value, which is not guaranteed in Java without volatile or synchronization). It acquires no locks, so it has none of those hazards, but its use is limited. Check Managing volatility.
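A minimal sketch of the status-flag case, assuming a worker loop that another thread asks to stop. StoppableWorker is a hypothetical name.

```java
// Hypothetical example class.
class StoppableWorker implements Runnable {
    // volatile: a write by the requesting thread becomes visible to the worker.
    private volatile boolean stopRequested = false;

    // Touched only by the worker thread, so it needs no synchronization here.
    private long iterations = 0;

    void requestStop() {
        stopRequested = true;
    }

    @Override
    public void run() {
        while (!stopRequested) {
            iterations++; // stand-in for real work
        }
    }
}
```

Without volatile, the worker would not be guaranteed to ever see the updated flag. Note that volatile would not make something like a shared counter++ safe, because that is a compound read-modify-write action.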

Built-in Concurrent Collections

Java (5.0+) provides some very useful collection classes specially designed for concurrency. They offer good performance and scalability with very little risk compared to the Collections.synchronizedXxx methods, Vector, Hashtable, or your own lock-based synchronization. They use finer-grained locking mechanisms (like lock striping) and add support for some useful common compound actions like put-if-absent, replace and conditional remove. Some important classes are ConcurrentHashMap, CopyOnWriteArrayList (creates a new copy of the underlying array every time it is modified; well, thanks to immutability), ConcurrentLinkedQueue, and LinkedBlockingQueue (a blocking queue makes insertion and retrieval operations wait when the queue is full or empty respectively). Check Concurrent Collections and more.
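The compound actions mentioned above can be sketched like this. SessionCache and its methods are hypothetical; putIfAbsent and the two-argument remove are standard ConcurrentMap methods.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical example class.
class SessionCache {
    private final ConcurrentMap<String, String> sessions = new ConcurrentHashMap<>();

    // Put-if-absent as one atomic step: returns the token that ends up
    // stored for the user, whether it was ours or an earlier one.
    String register(String userId, String token) {
        String existing = sessions.putIfAbsent(userId, token);
        return existing != null ? existing : token;
    }

    // Conditional remove: removes the entry only if it still maps to this token.
    boolean expire(String userId, String token) {
        return sessions.remove(userId, token);
    }
}
```

With a plain HashMap, the same logic would need a check followed by a put, and another thread could slip in between the two steps.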

Built-in Synchronizers/ Coordinators

A synchronizer is an object that coordinates the control flow of threads based on its state. Java (5.0+) identified some common synchronization patterns and provided classes for them, like latches, semaphores and barriers. A blocking queue is a special collection in that it also provides coordination in the producer-consumer pattern (through blocking). The benefit is that they use the minimum synchronization required and are well tested, so we can rely on them. A latch allows threads to wait until a certain number of events have occurred. The events could be the initialization of certain resources, the starting of certain services or the readiness of certain users, on which threads want to wait before proceeding. A barrier is like a latch, but rather than waiting for events, it waits for other threads to arrive at a certain point (the barrier). With a barrier, all threads must reach the barrier point before any of them can proceed. Barriers are useful, for example, when we want to execute one step’s tasks in parallel but all of them must complete before the next step’s tasks start in parallel (because the combined results of step one are required in the next step), a kind of MapReduce. A counting semaphore is useful when you want to implement a resource pool or put a bound on a collection. For example, you can implement a database connection pool that blocks when the pool is empty and unblocks when it becomes non-empty. Similarly, we can use a semaphore to convert a collection into a blocking bounded collection, e.g. a bounded HashSet. Check Synchronization Utilities. You can also check ReentrantLock and Thread Pools.

Atomic Variables and Nonblocking Synchronization
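As a rough illustration of the latch case, the sketch below waits until a number of services have finished starting. StartupCoordinator and startServicesAndWait are hypothetical names, and the "service start" is only a counter increment standing in for real initialization.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class.
class StartupCoordinator {
    // Starts serviceCount stand-in services in parallel and returns the
    // number that started, but only after every one has reported ready.
    static int startServicesAndWait(int serviceCount) throws InterruptedException {
        CountDownLatch ready = new CountDownLatch(serviceCount);
        AtomicInteger started = new AtomicInteger();

        for (int i = 0; i < serviceCount; i++) {
            new Thread(() -> {
                started.incrementAndGet(); // stand-in for real initialization
                ready.countDown();         // one "event" has occurred
            }).start();
        }

        ready.await(); // blocks until countDown() was called serviceCount times
        return started.get();
    }
}
```

The waiting thread needs no locks or wait/notify code of its own; the latch carries all the coordination.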

Many of the java.util.concurrent classes are significantly better in performance and scalability as they use atomic variables and nonblocking synchronization. Atomic variables (like AtomicInteger, AtomicReference, etc.) are like volatile variables but also provide some useful methods (like incrementAndGet(), compareAndSet(), etc.) to update them atomically without locking. Nonblocking algorithms use low-level atomic machine instructions (like compare-and-swap) instead of locks to ensure data integrity. They offer high scalability and liveness advantages but are hard to design and implement. Using atomic variables in Java 5.0+, it is possible to build efficient nonblocking algorithms. Check Going atomic and Intro to nonblocking algorithms.

I know all these alternatives also have some trade-offs, but it is always better to be aware of them so that we can use the right thing in the right context. Also, if you’ve still not read the wonderful book Java Concurrency in Practice, stop exploring the internet and read it (a must for every Java developer).
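Here is a small sketch of both styles of atomic update: a counter using incrementAndGet(), and a compare-and-set retry loop that tracks a maximum. HitCounter and PeakTracker are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: a lock-free counter.
class HitCounter {
    private final AtomicInteger hits = new AtomicInteger();

    int recordHit() {
        return hits.incrementAndGet(); // atomic read-modify-write, no lock
    }

    int getHits() {
        return hits.get();
    }
}

// Hypothetical example: a lock-free running maximum.
class PeakTracker {
    private final AtomicInteger peak = new AtomicInteger(Integer.MIN_VALUE);

    void record(int value) {
        while (true) {
            int current = peak.get();
            if (value <= current) {
                return; // nothing to update
            }
            if (peak.compareAndSet(current, value)) {
                return; // our update won
            }
            // Another thread changed peak in between; read again and retry.
        }
    }

    int getPeak() {
        return peak.get();
    }
}
```

The retry loop is the basic shape of most nonblocking algorithms: read, compute, and publish only if nobody else changed the value in the meantime.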

I hope it helps somebody. Let me also know if it can be improved.

Happy simplicity, bye.

Factory Method vs Abstract Factory (again?)


Hello Friends,

Recently one of my friends asked me about the difference between the Factory Method and Abstract Factory design patterns, and I could not convince him easily. Of course I’d already read Head First, GoF, Pattern Hatching and Refactoring to Patterns, but not all of them deeply, and not recently. So I decided to explore more and tried to simplify the differences, and here’s my attempt.

I assume that you are already familiar with both patterns, so I’ll focus on where most people get confused about the differences. Let’s revisit the definitions and their structure first:

Factory Method: Define an interface for creating an object, but let subclasses decide which class to instantiate. Factory Method lets a class defer instantiation to subclasses.

[Figure: Factory Method structure diagram]

Abstract Factory: Provide an interface for creating families of related or dependent objects without specifying their concrete classes.

[Figure: Abstract Factory structure diagram]

Differences

One product vs Set of many

This is perhaps the easiest difference, but certainly an important one. Factory Method is used to create one product only, while Abstract Factory is about creating families of related or dependent products.

Inheritance vs Composition

This is perhaps the most confusing one (as both seem to use inheritance). Factory Method depends on inheritance to decide which product is created. The Creator class (in the structure diagram) also has other, implemented methods that manipulate the product; they use the only abstract method, FactoryMethod(), to create the product, and it can only be implemented or changed by subclasses.

Here the Creator class also acts as the Client, which depends on a single method to create a product. We have to subclass the whole Creator/Client class to create a different Product. There’s no separate, dedicated class for creating a Product; had there been one, we could have used composition, passing a factory object to the client so that the client can use it without getting into an inheritance hierarchy.

On the other side, in Abstract Factory there’s a separate class dedicated to creating a family of related/dependent Products, and an object of any concrete factory subclass can be passed to the client that uses it (composition). Here the Client gets a separate object (the concrete factory) to create the Products, instead of creating them itself (e.g. using factoryMethod() and forcing inheritance), and thus uses composition.

If we think of just a product creation facility and the client that uses it, it is clear that in Factory Method we are restricted to inheritance (class based), while in Abstract Factory we have the flexibility of composition (object based) to create specific Products.

//Factory Method
 class Client {
   public void anOperation() {
     Product p = factoryMethod();
     p.doSomething();
   }
   protected Product factoryMethod() {//or it can be abstract as well
     return new DefaultProduct();
   }
 }
 class NewClient extends Client {
   protected Product factoryMethod() {//overriding
     return new SpecificProduct();
   }
 }
 //Abstract Factory
 class Client {
   private Factory factory;
   public Client (Factory factory) {
     this.factory = factory;
   } 
   public void anOperation() {
     ProductA p = factory.createProductA();
     p.doSomething();//other products and operations as well
   }
 }
 interface Factory {
   ProductA createProductA();
   ProductB createProductB();
 }
 //concrete factories also, implementing Factory interface
 

Method vs full class (object)

Factory Method is just a method while Abstract Factory is an object. The purpose of a class that has a factory method is not just to create objects; it does other work as well, and only one method is responsible for creating an object. In Abstract Factory, the whole purpose of the class is to create a family of objects.

Level of abstraction


Abstract Factory is one level higher in abstraction than Factory Method. Factory Method abstracts the way objects are created, while Abstract Factory also abstracts the way factories are created which in turn abstracts the way objects are created.

One inside another


As Abstract Factory is at higher level in abstraction, it often uses Factory Method to create the products in factories.
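To show how the two can combine, here is a minimal sketch in which each creation method of the abstract factory is itself a factory method overridden by the concrete factories, while the client uses the factory through composition. All names (Button, Checkbox, WidgetFactory, Dialog and the light/dark variants) are hypothetical.

```java
// Hypothetical product family.
interface Button { String render(); }
interface Checkbox { String render(); }

class LightButton implements Button {
    public String render() { return "light button"; }
}
class LightCheckbox implements Checkbox {
    public String render() { return "light checkbox"; }
}
class DarkButton implements Button {
    public String render() { return "dark button"; }
}
class DarkCheckbox implements Checkbox {
    public String render() { return "dark checkbox"; }
}

// Abstract Factory: one object that creates a whole family of products.
interface WidgetFactory {
    Button createButton();     // each of these is a factory method
    Checkbox createCheckbox(); // implemented by the concrete factories
}

class LightWidgetFactory implements WidgetFactory {
    public Button createButton() { return new LightButton(); }
    public Checkbox createCheckbox() { return new LightCheckbox(); }
}

class DarkWidgetFactory implements WidgetFactory {
    public Button createButton() { return new DarkButton(); }
    public Checkbox createCheckbox() { return new DarkCheckbox(); }
}

// The client is composed with a factory instead of subclassing a creator.
class Dialog {
    private final WidgetFactory factory;

    Dialog(WidgetFactory factory) {
        this.factory = factory;
    }

    String render() {
        return factory.createButton().render() + ", " + factory.createCheckbox().render();
    }
}
```

Switching the whole family is a matter of passing a different factory object, for example new Dialog(new DarkWidgetFactory()), with no change to Dialog itself.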

I also believe that we should not be obsessed with design patterns; they are all built on good basic design principles and are often mixed when used in the real world.

I hope it helps somebody else also. Let me also know if it can be improved.

Happy patterns, bye.

When to Extend Which Extension Point in Eclipse Plugin


Hello Friends,

There’s no need to say how popular and widely used the Eclipse platform is. Its huge success lies in its extensibility.

It is extensible by means of extension points (well-defined, exposed places/hooks where others can provide extended functionality) and plugins (which provide extended functionality using existing extension points and optionally expose new extension points). Eclipse itself is made up of a great many plugins built around a small core runtime engine capable of dynamically discovering, loading, and running plug-ins.

In this post I’m not going to provide a hello-world plugin tutorial or an introduction to the Eclipse platform (you can find many good ones on the net). I’m going to share the difficulty I faced when I started plugin development. After reading some introductory tutorials, I wanted to know quickly which extension points I needed for particular tasks. Once we know the name of an extension point, we can find the details in the official Eclipse documentation (Extension Points Reference) or, even better, within the platform itself (Plugin Development Environment): when we try to add an extension for it, it shows the description and a sample implementation (if available).

So I’ll focus on a quick introduction (with links to help you go further) to some common extension points and when we can use them (not how). The release considered is Eclipse Juno. If you need an introduction to Eclipse plugin development, you can first read http://www.vogella.com/articles/EclipsePlugIn/article.html, for example, or search for other ones.

But before understanding the various extension points, we need to understand the Platform structure and a few terms.

[Figure: Eclipse platform structure]

The various subsystems (on top of Platform Runtime) as described above define many extension points to provide the additional related functionality.

Below are the various components of the workbench UI.

[Figure: components of the workbench UI]

The list below describes the common functionality you may want to extend or customize, and which extension points are your friends.

Add menus/buttons:


declare an abstract semantic behavior of an action (command) with optional default handler: org.eclipse.ui.commands

add menus, menu items and toolbar buttons in UI: org.eclipse.ui.menus

add specific handlers to a command: org.eclipse.ui.handlers

declare key bindings (or sets of them, called schemes) for commands: org.eclipse.ui.bindings

More help at: Configuring and adding menu items in Eclipse V3.3

Add Views:


define additional views  for the workbench: org.eclipse.ui.views

More help at: creating an Eclipse view (old)

Add Editors:


add new editors to the  workbench: org.eclipse.ui.editors

More help at: Eclipse Editor Plugin Tutorial

Configure the launching of Applications:


define/configure a type for launching applications : org.eclipse.debug.core.launchConfigurationTypes

associate an image with a launch configuration type: org.eclipse.debug.ui.launchConfigurationTypeImages

define group of tabs for a launch configuration dialog: org.eclipse.debug.ui.launchConfigurationTabGroups

define shortcut for launching application based on current selection/ perspective: org.eclipse.debug.ui.launchShortcuts

More help at: Launching Framework in Eclipse

Add resource markers:


add marker (additional information to tag to Resources like Projects, Files, Folders): org.eclipse.core.resources.markers

More help at: Using markers to tell users about problems and tasks

Add Preferences:


add pages to the preference dialog box: org.eclipse.ui.preferencePages

More help at: Eclipse Preferences Tutorial

Add wizards:


add wizard to create a new resource: org.eclipse.ui.newWizards

add wizard to import a resource: org.eclipse.ui.importWizards

add wizard to export a resource: org.eclipse.ui.exportWizards

More help at: Creating Eclipse Wizards Tutorial

Contribute Help:


add help for an individual plugin: org.eclipse.help.toc

More help at: Contributing a little help

Define specialized searches:

register search pages: org.eclipse.search.searchPages

register search result pages: org.eclipse.search.searchResultViewPages

More help at: Custom Search Page

I hope it helps someone with a similar need. Let me know if it is useful or if you have any suggestions for improvement.

Happy Eclipse, Bye