Wednesday, December 24, 2008

Refactoring can be easier in Groovy than in Java

I love Groovy for the productivity, but I am sceptical about maintaining and refactoring large code bases in Groovy. Without type safety, you cannot detect refactoring errors at compile time, so you need very high unit test coverage.

But the other day, I experienced something that got me thinking: I have a few Java classes and some Groovy scripts that call the Java code something like this:
def var = javaObject.func()

I changed the return type of the Java method, and the Groovy code just kept on working. I didn't have to change it at all. Why? Because the Groovy code did not know the type I was returning. The dynamic nature of Groovy made refactoring easier, not harder!

How can type safety make refactoring harder? Because type safety leads to duplication of information. In Java, the code above would look like this:
String var = javaObject.func();

The return type "String" would be duplicated in calls to this method, in interfaces that specified this method, in subclasses, in variables, and so on. Not so in Groovy. The return type is declared only in one place.

I have always wondered why we design programming languages differently from databases. In databases, we try to avoid duplication of information. Duplication is a maintenance nightmare. But in programming, we duplicate information all the time. Not only with return types, but with boilerplate code, design patterns and class, variable and method names. In databases, we define immutable primary keys instead of referring to rows by values that could be modified. But in programming languages, we refer to everything by name. So when we want to change a name, we need a complicated "refactoring" tool to replace this name everywhere it occurs.

I dream of a programming tool that lets me type code like "x = myObject.func1()", but under the hood creates references to "myObject" and "func1" instead of resolving the names at compile time. Then there would be no "rename refactoring": just change the name, and the reference would still be valid. The text would then be updated automatically everywhere the reference is used.

Thursday, December 4, 2008

How to test Java Persistence API in 5 minutes

I am the author of HiberObjects and want to give an example of how useful it can be.

Yesterday, someone asked me a Java Persistence API (JPA) question: If an object is removed from a bidirectional relation, do both objects have to be saved to update the database correctly?

I thought I knew the answer, but better test it first. This is how I did it:

1) Design a class diagram with 2 persistent classes User and Membership and 1 unit test UserPersistentTest:



2) Initialize the database with some objects that I can test against: Right-click on the UserPersistentTest class and select Design Objects, then design objects like this:



3) Save the diagram. This will generate JPA classes, a unit test and a helper class. The unit test will initialize the database with objects like this:
   private void initObjects()
{
EntityManager entityManager = persistenceHelper.getEntityManager();
EntityTransaction tx = entityManager.getTransaction();
tx.begin();
user = new User();
membership = new Membership();
user.setName("Lars");
membership.setX(100);
entityManager.persist(user);
entityManager.persist(membership);
user.getMembership().add(membership);
membership.getUser().add(user);
tx.commit();
entityManager.close();
}
4) Implement the test method:
   public void test1() throws Exception
{
EntityManager em = persistenceHelper.getEntityManager();
em.getTransaction().begin();
User user1 = (User) em.createQuery("from User").getSingleResult();
assertEquals(1, user1.getMembership().size());

Membership membership1 = (Membership) em.createQuery("from Membership").getSingleResult();
assertEquals(1, membership1.getUser().size());

assertSame(membership1, user1.getMembership().iterator().next());
assertSame(user1, membership1.getUser().iterator().next());

user1.removeMembership(membership1);
assertEquals(0, user1.getMembership().size());
assertEquals(0, membership1.getUser().size());
em.getTransaction().commit();

em.getTransaction().begin();
User user2 = (User) em.createQuery("from User").getSingleResult();
assertEquals(0, user2.getMembership().size());
Membership membership2 = (Membership) em.createQuery("from Membership").getSingleResult();
assertEquals(0, membership2.getUser().size());
em.getTransaction().commit();
}
5) Run as JUnit test:



This demonstrates the answer to the question: You don't need to save any of the objects explicitly if you change them in a transaction.

Saturday, July 19, 2008

My favorite Eclipse view

Have you discovered the Display view in the Eclipse debugger? It is my personal favorite. In it, you can execute any Java code in the context of a running debugger.

To open this view, select Window > Show View > Display.

When the debugger hits a breakpoint, you can type some Java code in the Display view. The Java code can use the local variables in the currently selected frame in the Debug view.



To execute the code and display the returned value, push the button with a "J":



If you just want to execute some code that doesn't return a value, push the button with an arrow ">" and a "J":



The standard output will be printed to the Console view:



This is incredibly useful! You can, for instance, run complex debugging code against the state of the suspended program, or prototype some code for a new method.

Sunday, July 13, 2008

World War 2, Rationalism and Agile Development

One of the central values of the Agile Manifesto is Responding to change over following a plan. This doesn't mean that it's not agile to make a plan. You should have a plan, but you should realize the limitations of the plan.

Are there any parallels to this in history?

The Soviet Union had 5-year plans. Stalin was quite successful in industrializing the rural country in a short span of time. But his methods did not work well on the battlefield. In World War 2, the Soviet Union suffered severe losses in the beginning.

You can plan how to build and manage a factory, because the complexity is limited. But a war is far more complex. You cannot predict what the enemy is going to do. This doesn't mean that you should go to war without a plan. Far from it, you should plan as much as possible, but be prepared to improvise when something unexpected happens. As Dwight D. Eisenhower, a successful American general in World War 2, said: "Plans are nothing; planning is everything."

Building software is also complex. But some managers try to run a software development organization as a factory. They treat their employees like cog-wheels. They think that Employee #1093 can take over the responsibilities of Employee #835 with just some training. Or that 4 programmers can do in 3 months what one programmer can do in 1 year. It doesn't work like that. If software development were that predictable, then robots could do it. But they can't.

I don't like being treated like a cog-wheel. I am not Employee #1093, I am me. I am unique with my own weaknesses and strengths. If I am treated like a replaceable part, I may start to behave like one. Then, I will only do as I'm told, nothing more, nothing less. If nobody listens to my ideas, I will stop thinking creatively, and just do things the way we always did.

Since I don't want to become mediocre, I avoid environments like that. I seek challenging environments where my ideas are heard and I am given responsibility to work as I think is best.

I think that Churchill, the Prime Minister of Great Britain during WW2, was agile too. He was not afraid to change. He said: "To improve is to change; to be perfect is to change often."

He also said, "However beautiful the strategy, you should occasionally look at the results." That sounds like iterative development to me. You make a plan, and then review the results now and then to see how it works. (See more fantastic quotes of Churchill here.)

So how about Hitler? How agile was he? Not agile at all. He had a great vision and a plan that seemed to work very well in the beginning. But as we all know, he failed. I think this is similar to a waterfall project: He was 80% finished with conquering Western Europe, and 75% finished with Eastern Europe. If he had run the war more iteratively, he should have completed Western Europe first, and then started on Eastern Europe. Thank God, he didn't!

What did Hitler and Stalin have in common? Both nazis and communists were rationalists with a strong belief in their own enlightened minds. I dare say that Churchill and Eisenhower had a more humble attitude, accepting that they could not make perfect plans. I would also say that Churchill was empirical in that he wanted to review the results of his plan.

Rationalists think they can plan everything. Empiricists believe in experimental evidence. I think that waterfall development is a result of rationalist thinking, but it doesn't work well, because our minds are limited. We cannot predict everything, but we are capable of planning to a large degree. Therefore, I believe iterative development, where you make a plan but also review the results regularly, is the most effective way.

PS: I am neither a historian nor a philosopher, so please, enlighten me if I got something wrong.

Saturday, May 17, 2008

Use examples to make your code easier to understand

Programmers are used to abstract thinking. To program is to generalize: A method is a general specification of what to do during execution. A class is a general specification of objects. A superclass is a generalization of several classes.

Although our minds are capable of abstract thinking, concrete thinking is much easier, and concrete examples are the foundation for abstractions. For instance, when we were children, our parents didn't try to teach us about cars by explaining to us what cars are and what they can do. Instead, they just pointed at a car that was driving by and said ”Look, a car!” When they had done that a number of times, we knew what a car was.

Another example is prejudice. We all have prejudices, because this is the way our minds work. If we have met a few people from Denmark in our lives, and those people were friendly, we ”know” that Danes are friendly. And this works even more strongly for negative prejudices.

My point is that we learn by examples. Einstein is said to have remarked that example is not another way to teach, it is the only way.

As programmers, we are highly skilled in abstract thinking and expressing ourselves in programming languages. But when the code becomes too complex, we try to comprehend it by making examples in our minds of what the code would do in different circumstances. We imagine or write down the values of different variables and what the behavior of different tests and loops will be with those values.

And when it becomes too hard to imagine what is actually going on by reading the abstract code, we use debuggers for help. A debugger is not abstract. It gives a concrete description of what is actually happening at runtime. It gives information on how methods execute in specific instances and which objects exist at a specific moment. This concrete information is so much easier to understand than the abstract code.

Therefore, it's a good idea to design concrete examples of objects and scenarios before implementing the abstract classes.

Unit tests are exactly such examples. They make it easier to understand what happens at runtime. And a well-designed unit test is the best way to explain to somebody what the code does. If you ever had to take over somebody else's code, you know how hard it can be to understand it. Where do you start? Unit tests are a good place to start.

So, my suggestion is to think of unit tests as documentation. Write unit tests that are easy to read and that explain the intent of the code.

Saturday, March 8, 2008

I believe in Java for future web applications

Will Java become the dominant platform for web applications? Actually, I think it has a good chance.

What will future web applications look like? I think we can expect the following:
* Highly interactive
* Complex, more work will be done on the client
* Collaborative, not only for games, but also for other applications
* Mobile
* Location aware

AJAX is fantastic compared to "web 1.0", and technology like GWT (Java) makes it relatively easy to develop complex user interfaces. But HTML and HTTP still suck as an application platform, something they were never meant to be. I believe that if Microsoft hadn't managed to kill the Applet, Java would be the dominant platform for more complex web applications today.

I think Consumer JRE can be that platform. Java can be used to make complex, interactive applications that communicate with each other client-to-client.

This doesn't mean we have to implement in Java. When consumers have a JRE ready, there is nothing stopping us from developing applications in Groovy, JRuby and Scala and delivering them over the web.

I don't know much about Flash, but I am sure it is at least as good as Java for making flashy applications. But when it comes to complex collaborative and mobile applications, I think Java has the upper hand.

And Java has one more powerful force behind it: The community. The Java community has developed hundreds of libraries and frameworks. Too many, perhaps. But this community is capable of revolutionizing the way web applications are developed in ways we cannot predict now.

Tuesday, February 26, 2008

Passwords in phpBB 3

Here is a port of phpBB3's password handling to Java.

import java.io.UnsupportedEncodingException;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;

/**
* Port of phpBB3 password handling to Java.
* See phpBB3/includes/functions.php
*
* @author lars
*/
public class PHPBB3Password {
private static final int PHP_VERSION = 4;
private String itoa64 =
"./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

public String phpbb_hash(String password) {
String random_state = unique_id();
String random = "";
int count = 6;

if (random.length() < count) {
random = "";

for (int i = 0; i < count; i += 16) {
random_state = md5(unique_id() + random_state);
random += pack(md5(random_state));
}
random = random.substring(0, count);
}

String hash = _hash_crypt_private(
password, _hash_gensalt_private(random, itoa64));
if (hash.length() == 34)
return hash;

return md5(password);
}

private String unique_id() {
return unique_id("c");
}

// global $config;
// private boolean dss_seeded = false;

private String unique_id(String extra) {
// TODO Generate something random here.
return "1234567890abcdef";
}

private String _hash_gensalt_private(String input, String itoa64) {
return _hash_gensalt_private(input, itoa64, 6);
}

private String _hash_gensalt_private(
String input, String itoa64, int iteration_count_log2) {
if (iteration_count_log2 < 4 || iteration_count_log2 > 31) {
iteration_count_log2 = 8;
}

String output = "$H$";
output += itoa64.charAt(
Math.min(iteration_count_log2 +
((PHP_VERSION >= 5) ? 5 : 3), 30));
output += _hash_encode64(input, 6);

return output;
}

/**
* Encode hash
*/
private String _hash_encode64(String input, int count) {
String output = "";
int i = 0;

do {
int value = input.charAt(i++);
output += itoa64.charAt(value & 0x3f);

if (i < count)
value |= input.charAt(i) << 8;

output += itoa64.charAt((value >> 6) & 0x3f);

if (i++ >= count)
break;

if (i < count)
value |= input.charAt(i) << 16;

output += itoa64.charAt((value >> 12) & 0x3f);

if (i++ >= count)
break;

output += itoa64.charAt((value >> 18) & 0x3f);
} while (i < count);

return output;
}

String _hash_crypt_private(String password, String setting) {
String output = "*";

// Check for correct hash
if (!setting.substring(0, 3).equals("$H$"))
return output;

int count_log2 = itoa64.indexOf(setting.charAt(3));
if (count_log2 < 7 || count_log2 > 30)
return output;

int count = 1 << count_log2;
String salt = setting.substring(4, 12);
if (salt.length() != 8)
return output;

String m1 = md5(salt + password);
String hash = pack(m1);
do {
hash = pack(md5(hash + password));
} while (--count > 0);

output = setting.substring(0, 12);
output += _hash_encode64(hash, 16);

return output;
}

public boolean phpbb_check_hash(
String password, String hash) {
if (hash.length() == 34)
return _hash_crypt_private(password, hash).equals(hash);
else
return md5(password).equals(hash);
}

public static String md5(String data) {
try {
byte[] bytes = data.getBytes("ISO-8859-1");
MessageDigest md5er = MessageDigest.getInstance("MD5");
byte[] hash = md5er.digest(bytes);
return bytes2hex(hash);
} catch (GeneralSecurityException e) {
throw new RuntimeException(e);
} catch (UnsupportedEncodingException e) {
throw new RuntimeException(e);
}
}

static int hexToInt(char ch) {
if(ch >= '0' && ch <= '9')
return ch - '0';

ch = Character.toUpperCase(ch);
if(ch >= 'A' && ch <= 'F')
return ch - 'A' + 0xA;

throw new IllegalArgumentException("Not a hex character: " + ch);
}

private static String bytes2hex(byte[] bytes) {
StringBuffer r = new StringBuffer(32);
for (int i = 0; i < bytes.length; i++) {
String x = Integer.toHexString(bytes[i] & 0xff);
if (x.length() < 2)
r.append("0");
r.append(x);
}
return r.toString();
}

static String pack(String hex) {
StringBuffer buf = new StringBuffer();
for(int i = 0; i < hex.length(); i += 2) {
char c1 = hex.charAt(i);
char c2 = hex.charAt(i+1);
char packed = (char) (hexToInt(c1) * 16 + hexToInt(c2));
buf.append(packed);
}
return buf.toString();
}
}
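
The unique_id placeholder in the port above returns a constant, so every hash it produces would share the same salt. Below is a minimal sketch of a random replacement. The class name RandomUniqueId and the use of SecureRandom are my assumptions and are not taken from phpBB3; the only property carried over is the shape of the result, 16 lowercase hex characters like the placeholder string.

```java
import java.security.SecureRandom;

/**
 * Hypothetical stand-in for the unique_id() placeholder.
 * Not a port of phpBB3's own unique_id(); it only produces a
 * random string of the same shape (16 lowercase hex characters).
 */
public class RandomUniqueId {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    public static String uniqueId() {
        // 8 random bytes become 16 hex characters.
        byte[] bytes = new byte[8];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(16);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0x0f]); // high nibble
            sb.append(HEX[b & 0x0f]);        // low nibble
        }
        return sb.toString();
    }
}
```

In the port, the body of unique_id(String extra) could then return RandomUniqueId.uniqueId() instead of the constant. Checking existing hashes with phpbb_check_hash is unaffected, because the salt is read back from the stored hash.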

Friday, February 22, 2008

Agile practices don't primarily solve problems

But they sure help detect problems early. And that's extremely valuable.

For instance, iterative development with deliveries early in the project will test the technical feasibility and the development speed. But perhaps most important, we will get feedback on the functionality from the clients, to verify if we have understood their problem correctly.

Failing in any of these areas will lead to project failure. And it is better to discover this when 20% of the budget is used than when 90% of the budget is used. First, you reduce your losses. And even better, you have a fair chance of fixing the project and turning it into success.

Many agile projects succeed not because they are more productive, but because they discover problems in early iterations and then reduce the requirements.

Unit testing is another practice that makes you aware of problems early. This makes it cheaper to fix problems, but it also gives confidence, so you dare to restructure code when necessary.

Another agile practice is close communication with business experts, preferably face to face. The non-agile alternative is written requirements. The problem is that developers misunderstand what the business experts want. That's inevitable. What we need are mechanisms to discover misunderstandings. That mechanism is called feedback. The business expert explains what they want, the developer explains back how they understood it, and the business expert confirms or corrects that understanding. That's easier to do face to face than with written documents.

Monday, February 11, 2008

Hungarian notation and Groovy

Comments and types are metadata and strictly not necessary to make a program, according to a popular post. Yes, they are not necessary to communicate with the computer, but that's not what we are trying to do here. For all but the smallest projects, we are communicating with people. The code should be understandable to others. And it doesn't hurt if it's understandable to ourselves either.

Hungarian notation was popular in C as a way to show the type of a variable in its name, for instance:
int nSize;
char cChar;

I never liked this, because if I change the type of a variable I would also have to change its name everywhere. But with dynamically typed languages, I sometimes get in trouble because I forget what type a variable is. So, I started to wonder whether Hungarian notation could be useful after all.

For example, consider the following dynamically typed Groovy code:
(1) def x = "100"

the following statically typed Java code:
(2) String x = "100";

and this statically typed Java code:
(3) int x = "100";

Which is easiest to understand?
(3) is easily recognized by both humans and the compiler as an error.
(1) and (2) are valid code. But 20 lines down the code, you have forgotten what type x is and write:
x++

Then (2) will give you a compiler error, but (1) won't be detected until runtime.

Does this mean I'm against dynamic typing? No. But if we remove the static type checking, I think we should use other means to communicate that information. For instance, it is a bad idea to call a String x. This would be much better:
def s = "100"

In Groovy, I often use maps instead of creating a new class:
def map = [:]
map.x = 100
map.y = 200

How am I supposed to remember what the map contains down the road?

I don't want to remember, I want the map to remind me! I have enough things to think about, I don't want to remember the type of every variable too!

I have a lame technique I use to not forget important things: I try to make them remind me instead of having to remember them. For instance, if I take off the lid to fill gas in the car, I put the car keys by the lid. That way, it's impossible for me to forget it. I cannot drive off without seeing the lid and remembering to put it on. I use similar techniques at work. This reduces stress and helps me to concentrate, because I don't need to remember so much.

Applying this technique to Groovy gives this code:
def point = [:]
point.x = 100
point.y = 200

So, I am not advocating Hungarian notation, but good naming conventions that communicate what we need to know.

Programming is more about communication than how fast you can crank out code.

Wednesday, February 6, 2008

The Pessimistic Programmer

I decided to change the title of this blog to "The Pessimistic Programmer". Why? Am I a depressed person that thinks nothing will work? No, I am an optimist in life. Something good is going to happen today :-) But in programming, something will surely go wrong.

I don't actually view this as pessimism, but as realism. I want to be prepared for the worst that can possibly happen. Hope for the best, prepare for the worst. But my wife explained to me that pessimists always say that they are just being realistic. So, I might as well face it: I am a pessimist.

I think a good programmer needs to be pessimistic; always thinking about what can go wrong and how to prevent it. I don't say that I am a good programmer myself. No, I make far too many mistakes for that. But I have learnt how to manage my mistakes with testing and double checking.

Über-programmers can manage well without being pessimistic. They have total overview of the code and all consequences of changes. I'm talking about us mere mortals. But if you are an über-programmer, you should be pessimistic about what the next guy will do to your code. Place some comments and unit tests in there, to keep him out of trouble!

An optimistic programmer doesn't see the need for unit tests. He will run the program and be satisfied when it runs one time without errors. He will say things like "What can possibly go wrong?" Or if something goes wrong anyway, "It must be an error in the input from that other module".

Haven't we all been there? Suddenly, I feel so old...

Passwords in SMF 1.1.4

I'm posting this in case someone else is struggling with SMF password hashing like I did.

SMF 1.1.4 uses SHA-1 with a salt. You would think that the passwordSalt in the database is used as the salt, but it isn't. That's probably a field that was used in old versions. Instead, the membername is used as the salt, but *before* the password, not *after*, as most search results indicated.

I finally found this link to a php file that is *not* in the distribution.
function smf_registerMember(
[...]
$register_vars = array(
'memberName' => "'$username'",
'realName' => "'$username'",
'passwd' => '\'' . sha1(strtolower($username) . $password) . '\'',
'passwordSalt' => '\'' . substr(md5(rand()), 0, 4) . '\'',


So I tried this Java code:
sha1(username.toLowerCase() + password);
That worked!

Here is the source for the sha1 method:
   public static String sha1(String data)
{
byte[] bytes = data.getBytes();
try
{
MessageDigest md5er = MessageDigest.getInstance("SHA-1");
byte[] hash = md5er.digest(bytes);
return bytes2hex(hash);
}
catch (GeneralSecurityException e)
{
throw new RuntimeException(e);
}
}

private static String bytes2hex(byte[] bytes)
{
StringBuffer r = new StringBuffer(32);
for (int i = 0; i < bytes.length; i++)
{
String x = Integer.toHexString(bytes[i] & 0xff);
if (x.length() < 2)
r.append("0");
r.append(x);
}
return r.toString();
}
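
For reference, here is a self-contained sketch that combines the pieces above into one class. The class name SmfPassword, the matches helper, the UTF-8 encoding and the use of Locale.ROOT for lowercasing are my assumptions; only the rule sha1(lowercase username + password) comes from the PHP snippet. PHP's strtolower and sha1 work on raw bytes, so results may differ for non-ASCII user names or passwords.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Sketch of SMF 1.1.4 style hashing: sha1(lowercase username + password). */
public class SmfPassword {

    public static String hash(String username, String password) {
        // The lowercased member name is prepended to the password as the salt.
        String salted = username.toLowerCase(Locale.ROOT) + password;
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            // Assumption: the forum stores its text as UTF-8.
            byte[] hash = digest.digest(salted.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical helper: compares a login attempt with the stored passwd column. */
    public static boolean matches(String username, String password, String storedPasswd) {
        return hash(username, password).equals(storedPasswd);
    }
}
```

A call like SmfPassword.matches(memberName, typedPassword, passwdFromDatabase) would then be enough to check a login against the SMF members table.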

Friday, January 25, 2008

Example Driven Development and Unit testing

In a project I once worked on, we were required by company standards to write formal test specifications for manual testing. It was a lot of overhead to write these Word documents and get them through the bureaucracy. Finally, we proposed to write the test specifications as comments inside functional unit tests. That way, we could maintain the test documents easily. We saved a lot of work, and the documents got higher quality because it was easier to update them. And it really helped us to write good unit tests that covered the functionality.

In this post, I will take this approach one step further to show how unit tests and manual tests can be unified.

In an earlier post Example driven development, I argued that a few simple examples can be used for requirements, manual testing and unit testing. I don't say that a few examples are sufficient as a requirements specification, but they may be in relatively simple projects. And it is far better than nothing. Examples help you to think clearly, and to communicate accurately with others. Don't we all use examples when we try to explain something? That's the best way to explain something anyway.

The problem is that it is hard to keep the examples up to date, and then they lose the value they had for communication with the client and manual testing.

But if we implement the examples as unit tests, we completely avoid this problem! As long as the unit tests pass, the examples will be in sync with the code. And if the client changes the requirements, we modify the unit tests, and then implement the changes until the unit tests pass.

Unit tests are hard to read for non-programmers, but if we put a lot of effort into it, we can make them readable. They don't need to be writable, as programmers will write them.

Here is an example of a test of a servlet that generates licenses:
   public void testGenerateLicense() throws Exception
   {
      // Call the servlet.
      InputParams params = new InputParams();
      params.productId = "123456";
      params.quantity = 1;
      params.firstName = "Lars";
      params.lastName = "Høidahl";
      params.email = "lars@mycompany.com";
      params.company = "Object Generation";
      params.country = "Sweden";
      File licenseFile = servlet.generateLicense(params);
      
      // Check the returned file.
      assertTrue("Attachment is a license file",
            licenseFile.getName().endsWith(".lic"));
      
      // Check that 1 user was created in the database.
      List<User> users = userDatabase.getAllUsers();
      assertEquals("Users", 1, users.size());
      User user = users.get(users.size()-1);
      assertEquals("User name", "Lars", user.getUsername());
      assertEquals("Email", "lars@mycompany.com", user.getEmail());
      assertEquals("Country", "Sweden", user.getCountry());
   }

   public void testGenerateMultipleLicenses() throws Exception
   {
      // Call the servlet.
      InputParams params = new InputParams();
      params.productId = "123456";
      params.quantity = 3;
      params.firstName = "Lars";
      params.lastName = "Høidahl";
      params.email = "lars@mycompany.com";
      params.company = "Object Generation";
      params.country = "Sweden";
      File licenseFile = servlet.generateLicense(params);
      
      // Check the returned file.
      assertTrue("Attachment is a zip file",
            licenseFile.getName().endsWith(".zip"));
      ZipFile zipFile = new ZipFile(licenseFile);
      assertEquals("Number of entries", 3, zipFile.size());
      
      // Check that 3 users were created in the database.
      List<User> users = userDatabase.getAllUsers();
      assertEquals("Users", 3, users.size());
      User user = users.get(users.size()-1);
      assertEquals("User name", "Lars", user.getUsername());
      assertEquals("Email", "lars@mycompany.com", user.getEmail());
      assertEquals("Country", "Sweden", user.getCountry());
   }

Wednesday, January 23, 2008

Database dump with Java

I need to update a database that is created by PHP. The problem is that I am not a PHP coder, but a Java coder, and I need to use some other Java libraries to get the job done. So how can I find out exactly which tables to update and how? It would take me weeks to search the PHP code, and I still wouldn't be sure if I got it right.

The first step is to install a clean application on my computer. There is no user data in the database, so if I perform commands like creating a user in the web application, I can look at what changed in the database. I'm sure that could be done in MySQL, but I'm not an expert on that either. When the only tool you have is a hammer, everything looks like a nail. So, I'll use Java for that too.

So, I wrote a small Java application that produces exactly the output that I need. It reads metadata from the database to find all tables and columns, lists that metadata and the content of all the rows.

Here it is:
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Properties;

public class DumpDatabase {
private Connection conn;
private PrintWriter out;

public static void main(String[] args) {
try {
DumpDatabase app = new DumpDatabase();
String timestamp = new SimpleDateFormat("MM-dd-HH-mm-ss")
.format(new Date());
String fileName = "dumpdatabase-" + timestamp + ".txt";
System.out.println("Creating " + fileName);
app.dumpDatabase(fileName);
System.out.println("Done.");
} catch(Exception e) {
e.printStackTrace();
System.exit(1);
}
}

public DumpDatabase() throws IOException,
ClassNotFoundException, SQLException {
// Get database connection from hibernate.properties.
// Or hard-code your own JDBC connection if desired.
InputStream in = getClass().getResourceAsStream(
"/hibernate.properties");
Properties properties = new Properties();
properties.load(in);
String driver = properties.getProperty(
"hibernate.connection.driver_class");
String url = properties.getProperty(
"hibernate.connection.url");
String user = properties.getProperty(
"hibernate.connection.username");
String password = properties.getProperty(
"hibernate.connection.password");

Class.forName(driver);
conn = DriverManager.getConnection(url, user, password);
}

public void dumpDatabase(String fileName)
throws FileNotFoundException, SQLException {
out = new PrintWriter(fileName);
listAll();
out.close();
conn.close();
}

public void listAll() throws SQLException {
DatabaseMetaData metadata = conn.getMetaData();
String[] types = { "TABLE" };
ResultSet rs = metadata.getTables(
null, null, null, types);
while(rs.next()) {
String tableName = rs.getString("TABLE_NAME");
listTable(tableName);
}
}

private void listTable(String tableName) throws SQLException {
PreparedStatement statement = conn.prepareStatement(
"select * from " + tableName + " a");
out.println("----" + tableName + "----");
int rowNo = 0;
ResultSet rs = statement.executeQuery();
while(rs.next()) {
if (rowNo == 0)
printTableColumns(rs);
printResultRow(rs);
rowNo++;
}
}

private void printTableColumns(ResultSet rs)
throws SQLException {
ResultSetMetaData metaData = rs.getMetaData();
for(int i = 0; i < metaData.getColumnCount(); i++) {
int col = i + 1;
out.println(metaData.getColumnName(col) + " "
+ metaData.getColumnTypeName(col) + " " + "("
+ metaData.getPrecision(col) + ")");
}
out.println("");
}

private void printResultRow(ResultSet rs) throws SQLException {
ResultSetMetaData metaData = rs.getMetaData();
for(int i = 0; i < metaData.getColumnCount(); i++) {
String column = metaData.getColumnName(i + 1);
try {
String value = rs.getString(column);
if (value != null && !value.equals("null")
&& !value.equals("") && !value.equals("0"))
out.print(column + ": " + value + ", ");
}
catch(SQLException e) {
out.print(column + ": " + e.getMessage());
}
}
out.println("");
}
}

The application reads the JDBC properties from a hibernate.properties file, for example like this:
hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
hibernate.connection.driver_class=com.mysql.jdbc.Driver
hibernate.connection.url=jdbc:mysql://localhost/users
hibernate.connection.username=<username>
hibernate.connection.password=<password>
hibernate.show_sql=true


It creates a file like dumpdatabase-01-23-10-30-38.txt:
----tb_role----
----tb_user----
user_id BIGINT (20)
username VARCHAR (255)
password VARCHAR (255)
first_name VARCHAR (255)
last_name VARCHAR (255)
email VARCHAR (255)

----tb_user_role----

Then, I create a user and run the program again to get a new dump, dumpdatabase-01-23-10-32-05.txt:
----tb_role----
----tb_user----
user_id BIGINT (20)
username VARCHAR (255)
password VARCHAR (255)
first_name VARCHAR (255)
last_name VARCHAR (255)
email VARCHAR (255)

user_id: 1, username: albert, password: pw, first_name: albert,
last_name: albertson, email: albert@dot.com,
----tb_user_role----

Of course, in the real application the dumps are thousands of lines long, but with Eclipse "Compare with each other", it's still easy to see what happened, so I know exactly what my Java code needs to do.
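If Eclipse is not at hand, the comparison can be roughly approximated in code. The following is only a minimal sketch: the DumpDiff class and its addedLines method are hypothetical names, not part of DumpDatabase, and it reports lines that are new in the second dump rather than computing a real line-by-line diff. It assumes both dumps were written with the platform default charset, as PrintWriter does above.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of DumpDatabase: lists the lines that
// appear in the newer dump but nowhere in the older one.
class DumpDiff {

    // Returns the lines of "after" that do not occur in "before",
    // in the order they appear in "after".
    static List<String> addedLines(List<String> before, List<String> after) {
        Set<String> seen = new HashSet<>(before);
        List<String> added = new ArrayList<>();
        for (String line : after) {
            if (!seen.contains(line)) {
                added.add(line);
            }
        }
        return added;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java DumpDiff <older dump> <newer dump>");
            return;
        }
        // Read both files with the default charset, matching PrintWriter(fileName).
        Charset charset = Charset.defaultCharset();
        List<String> before = Files.readAllLines(Paths.get(args[0]), charset);
        List<String> after = Files.readAllLines(Paths.get(args[1]), charset);
        for (String line : addedLines(before, after)) {
            System.out.println("+ " + line);
        }
    }
}
```

Run against the two dumps above, this would print only the row for the new user, since every other line already exists in the first dump. A row that merely changed shows up as a new line too, but deleted rows are not reported, so a real diff tool is still the better choice for anything beyond a quick check.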

Monday, January 21, 2008

Refactor Groovy code?

I love Groovy! It's great for getting things done quickly. From Groovy, I can call Ant and XSLT easily. I can build and parse XML. I can call external applications, read file content and traverse a file hierarchy with very little code. And I can modify my program at runtime, without compiling and deploying it again.

But this dynamism comes at a cost. For instance, I really miss code completion. Sometimes, I go to a Java editor, create an object of some class and push CTRL-SPACE to find out the parameters for a method. I also miss having the editor flag an error when I misspell a function or variable name.

And I always wondered how well a dynamic language would scale. Now, I think I know the answer: It doesn't scale well! I have a Groovy template of 500 lines, and it's almost impossible to modify it. And today, I have been trying to refactor a Groovy class of 800 lines, and it's really, really hard.

How hard would it be to modify a Java class of 500-800 lines? Not hard at all. Unfortunately, the Java version of these classes would probably be several thousand lines long, so there is no perfect solution that I know of. And I can probably blame myself for hacking these unmaintainable scripts together. But that's why I loved Groovy in the first place! The ability to produce something quickly.

Actually, I sometimes write code quickly in Java too, but I never have any problems refactoring it later. Maybe the problem is that there is no refactoring support in Groovy. I am so spoiled by Eclipse for Java in this area. I really hope this is the problem, and that refactoring support and auto completion come to my editor soon. Because I love the productivity that Groovy gives.

Until then, my advice is this: Use Groovy for relatively small scripts or to replace several other tools with one. For larger programs, I think it's better to stick with Java for now. Unless you know the domain, so you write code right the first time, and never need to refactor it.

Sunday, January 20, 2008

Defensive Programming

After thinking some more about what I wrote in Small trick with Copy & Paste, I saw that it is just one example of an attitude to programming. We can call it defensive programming.

When I program, I always try to think "What can possibly go wrong here? And how can I make sure that doesn't happen?". One example is that I mess up variable names.

Another example is that copy and paste is a source of problems. If I duplicate code and later need to change it, I may forget to change every copy. So, it's usually better to generalize it.

Another is that I change how a method is supposed to be used. For example, let's say I have a function:
    void sendEmail(String to, String cc,
                   String subject, String body)
But then I change it to
    void sendEmail(String to, String subject,
                   String body, String attachment)
Both versions take four Strings, so existing calls still compile even though their arguments now mean something else. All callers have to change their parameters for this to work properly.

To be sure this doesn't break, it's better to rename the method:
    void sendEmailWithAttachment(String to, String subject,
                                 String body, String attachment)
Then, the compiler will make sure it works. At least in a static language like Java. In Groovy or another dynamic language, you wouldn't detect this until runtime. That's why I'm not totally sold on dynamic languages, although I like Groovy a lot. If we use a dynamic language, we have to be defensive by using more unit tests.
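To make the risk concrete, here is a small sketch of what happens when the method keeps its name. The Mailer class, its return value and the cc address are invented for illustration; the method only formats a description instead of sending mail. A call written for the old (to, cc, subject, body) order still compiles against the new signature, and the arguments quietly end up in the wrong parameters.

```java
// Hypothetical Mailer used only to illustrate the signature change;
// it describes the mail it would send instead of sending it.
class Mailer {

    // New signature: "cc" was removed and "attachment" was added.
    // The parameter types are unchanged, so old calls still compile.
    static String sendEmail(String to, String subject,
                            String body, String attachment) {
        return "to=" + to + " subject=" + subject
                + " body=" + body + " attachment=" + attachment;
    }

    public static void main(String[] args) {
        // A call written for the old order (to, cc, subject, body).
        // The cc address (a made-up value) becomes the subject,
        // and the body becomes the attachment.
        String result = sendEmail(
                "john@dotcom.com", "boss@dotcom.com", "License", "See attachment");
        System.out.println(result);
    }
}
```

Had the method been renamed to sendEmailWithAttachment and the old sendEmail removed, the stale call in main would fail to compile instead of sending a mail with the wrong subject.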

A final example is fixing a bug. There is always a risk that you or someone else will revert the change. How can we be sure this doesn't happen? The best way is to reproduce the bug with a unit test before fixing it. If this is too much work, at least write a comment explaining the fix.

It all comes down to admitting that I make mistakes and trying to prevent them.

Friday, January 18, 2008

Example Driven Development

This article is the first in a series about Test Driven Development. A test is an example of what a program should do. Examples make the requirements concrete, so they become easier to understand. I will show how examples can be used for requirements, manual testing and unit testing.

I am currently working on automatic order handling for a client. I will implement a servlet with the following tasks:
1. Decode input parameters from a web page.
2. Generate an encrypted license with some of those parameters.
3. Store the license in a database. I can model this database as I want.
4. Store customer data in another database. This database schema is "carved in stone".
5. Send the license as an email to the user.

This is a small project, where I will implement, test and deploy the servlet myself. If it doesn't work, my client will lose customers, and I will be responsible.

So how can I make sure this doesn't happen?

I don't trust myself to produce code of such quality that there won't be any problems. So, I need to do thorough testing. I will make unit tests, but how do I make good ones?

Unit testing is important, but it's not enough. There are always some issues that don't show up until manual testing. I also need to test the integration between my program and other components manually. But how do I make good manual tests? Do I need to write a formal test specification for this simple program?

I think I do. I need to discuss the requirements with my client, and a few examples of what the program is supposed to do are the best way to do that. By making a few concrete examples, I can make sure there are no misunderstandings, and that no requirements are missing. I'm sure there will be some misunderstandings anyway, but that's why I'm going to test it.

I'll use this test specification both for manual testing and as a basis for the unit tests. That way, I'll have good unit tests that cover the functionality well, and there hopefully shouldn't be too many surprises when it comes to integration testing.

I start by writing down 2 examples that the servlet should handle. For each example, I will list the input parameters, output values and any changes to the databases. It doesn't take long to write these down on a piece of paper, and it will be very much worth the effort.

Example 1: A customer orders 1 license of type "Enterprise"

Input
product_id   = (ask the client what the product id is)
quantity     = 1
company_name = Dot Com Inc
email        = john@dotcom.com

I have already identified a very important question I need to ask my client: What are the product ids that the servlet will receive?

License Content
type         = Enterprise
company name = Dot Com Inc
email        = john@dotcom.com
expires      = 3/25/2009

License Database
Insert a license with the following values:
type         = Enterprise
company name = Dot Com Inc
email        = john@dotcom.com
date         = 3/25/2008
expires      = 3/25/2009

Customer Database
(I need to ask the client about this.)

Email
to         = john@dotcom.com
subject    = License
content    = Please find attached an Enterprise license to
             Dot Com Inc and email john@dotcom.com.
attachment = enterprise.lic

Example 2: A customer orders 1 license of type "Basic"

Input
product_id   = (Ask the client)
quantity     = 1
company_name = Dot Com Inc
email        = john@dotcom.com

License Content
type         = Basic
company name = Dot Com Inc
email        = john@dotcom.com
expires      = 3/25/2009

License Database
Insert a license with the following values:
type         = Basic
company name = Dot Com Inc
email        = john@dotcom.com
date         = 3/25/2008
expires      = 3/25/2009

Customer Database
(Ask the client.)

Email
to         = john@dotcom.com
subject    = License
content    = Please find attached a Basic license to
             Dot Com Inc and email john@dotcom.com.
attachment = basic.lic

Now, I need to go through these examples with my client.

When I talked to the client, he pointed out that it is possible to order multiple licenses of the "Enterprise" product at once. I hadn't thought about that. Together, we discussed how to handle this.
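To show how a written example can turn into a check, here is a rough sketch of Example 1 as code. All names in it (License, LicenseFactory, Example1Check) are hypothetical and not taken from the real servlet. It assumes, based only on the two examples, that a license expires one year after the order date. The mapping from product_id to license type is left out, because the product ids are still an open question for the client.

```java
import java.time.LocalDate;

// Hypothetical names throughout; this only shows how Example 1
// could be written down as an executable check.
class Example1Check {
    public static void main(String[] args) {
        License license = LicenseFactory.create(
                "Enterprise", "Dot Com Inc", "john@dotcom.com",
                LocalDate.of(2008, 3, 25));

        // Expected values copied from the "License Database" part of Example 1.
        check("type", "Enterprise", license.type);
        check("company name", "Dot Com Inc", license.companyName);
        check("email", "john@dotcom.com", license.email);
        check("date", LocalDate.of(2008, 3, 25), license.date);
        check("expires", LocalDate.of(2009, 3, 25), license.expires);
        System.out.println("Example 1 passed");
    }

    static void check(String field, Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError(
                    field + ": expected " + expected + " but was " + actual);
        }
    }
}

// Plain value holder for the fields listed in the examples.
class License {
    final String type;
    final String companyName;
    final String email;
    final LocalDate date;
    final LocalDate expires;

    License(String type, String companyName, String email,
            LocalDate date, LocalDate expires) {
        this.type = type;
        this.companyName = companyName;
        this.email = email;
        this.date = date;
        this.expires = expires;
    }
}

class LicenseFactory {
    // Assumption inferred from the examples: expiry is one year after the order date.
    static License create(String type, String companyName,
                          String email, LocalDate orderDate) {
        return new License(type, companyName, email,
                orderDate, orderDate.plusYears(1));
    }
}
```

Example 2 would be the same check with "Basic" as the type, which suggests the two examples could share one parameterized test.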

In the next article, I will write about how to implement and unit test the servlet.

Small Java trick with copy - paste

Here is a small trick I use to avoid getting into problems.

Sometimes, I need to copy a block with a variable, for instance:
      Address addr = list.get(0);
      assertEquals("city", "Uppsala", addr.getCity());
and paste it to create something similar like this:
      Address addr2 = person.getAddress();
      assertEquals("city", "Uppsala", addr.getCity());
Did you see the bug?

The trick I use to avoid bugs like this is to change the variable name in the original block:
      Address addr1 = list.get(0);
      assertEquals("city", "Uppsala", addr1.getCity());
      Address addr2 = person.getAddress();
      assertEquals("city", "Uppsala", addr.getCity());
Then the bug becomes obvious: addr no longer exists, so the compiler rejects the stale reference.

I can't tell you how many times this has saved me.

Saturday, January 12, 2008

Use cases and robustness analysis

Use cases or user stories? Use cases are bigger and require more work. User stories are quicker and easier, more agile. That may be ok if you have other agile practices in place. If you have a customer or functional expert available to the team at all times, user stories may work quite well. But if you don't, you may need something heavier, like use cases.

It also depends on the complexity of the requirements. If the requirements are simple, use cases are not necessary. But for complex requirements, I think use cases are better suited than user stories. They help the customer to think through the requirements earlier, and they give the developers better understanding of what the system is supposed to do.

But use cases are not agile! Yes, they may be. Agile doesn't mean lightweight, it means rightweight. It means you adapt the process to the project. A large, complex project needs a heavier process than small and simple ones.

RUP had an activity called robustness analysis to analyze model, view and controller objects from use cases. I never really understood the purpose of this until I recently read a book called Use Case Driven Object Modeling with UML. I strongly recommend this book for everyone using use cases and/or object oriented analysis. It argues against the popular opinion that use cases should be abstract and not deal with the user interface. This kind of abstract use case doesn't give the developers the information they need, leaving them to make their own assumptions about how the system should work.

This doesn't mean that the use cases should include user interface design, but some specification of the user interface is necessary. Not only is this necessary to keep the developers on track, it also helps to make the use cases better.

The book explains how to drive the object design from the use cases, resulting in a good object model that is consistent with the requirements.

Monday, January 7, 2008

Why Agile Methods Reduce Cost

The greatest cost of software development is complexity:
  1. The problems we are trying to solve are complex. The customers often don't understand the requirements completely before the software development starts.
  2. The mapping between the requirements and the software is also complex, so that misunderstandings between the customers and the developers are the rule rather than the exception.
  3. The software itself is complex. It is hard for the developers to get an overview of what it does, and the consequences of modifying the software are hard to predict.
This complexity adds risk to software development projects, so it is actually uncommon for software projects to complete on time and on budget. Complexity also results in poor quality, since errors occur throughout the development process: in the requirements, while mapping the requirements to the implementation, and in the implementation.

The risk and the low quality obviously increase the development cost. The cost is also increased by the need for more developers and experts who know how to handle the complexity, and the need for substantial testing to find and fix errors.

So how can we deal with this complexity?

Requirements
We cannot reduce the complexity of the requirements, but I believe a good domain model is important to understand complex requirements. Make the model concrete by using examples. Model the object instances for a use case instance, and go through the variations and changes that can happen with the customer.

Communication
The number one success factor in development projects is communication. We need to improve communication between customers and developers, and within the development team.

Agile methods do this through a number of key practices:
  • Iterative development gives rapid feedback on misunderstandings between customers and developers.
  • Ideally, a customer should be available to the developers all the time.
  • Daily stand-up meetings improve the communication within the team.
  • Pair programming.
Also, the domain model is very valuable as a means for communicating within the team.

Software Complexity
By complexity of software I mean how many thoughts the developer has to hold in his head at once. Early Java frameworks separated the code into lots of loosely connected classes and XML files. The purpose was reuse and easy configuration, but it nevertheless made the program fragmented and hard to understand.

I personally hate it when I have to jump from a Java class to struts-config.xml to find the name of a form Java class. I just want to push F3. I find it ironic that software developers who design user interfaces for a living can come up with something as awkward as XML configuration files.

It is said that the human mind can hold between 5 and 9 pieces of information at once. If you need to hold 1 action class, 1 form, 1 DAO, 1 JSP page, 2 XML files and 1 Javascript file in your head, that doesn't leave much space for the feature you are trying to develop.

Luckily, this situation is being addressed by a new generation of programming languages and frameworks such as Ruby on Rails.

I believe the cost of switching to a simpler framework will be paid back several times over by the improved speed that comes from being able to focus on what you are actually trying to do.