New things to learn

A nice article about frameworks we should all learn. Although I already know a few of them, this list of classics certainly has a few that I should pick up before the end of the year.

https://hackernoon.com/12-frameworks-java-web-developers-should-learn-in-2018-edae59315244

Which one are you learning?

Apache Commons Configuration and Map

Until today, Apache Commons Configuration has been a very helpful quick configuration utility in many projects. For loading simple configuration items it is fine.

I was trying to find a way to store and read a Map holding some configuration, and there was no easy way that I could find. I came up with a crusty solution for now, but I am hoping to find something more elegant.

GitHub Repo with example code.
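
To show the kind of workaround I mean (this is a simplified sketch, not the code from the repo), the idea is to flatten the map into prefixed keys in a PropertiesConfiguration and rebuild it when reading. The class name, file name and key prefix below are only illustrative.

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.apache.commons.configuration.ConfigurationException;
import org.apache.commons.configuration.PropertiesConfiguration;

public class MapConfigExample {

	private static final String PREFIX = "mymap";

	public static void main(String[] args) throws ConfigurationException {
		PropertiesConfiguration config = new PropertiesConfiguration();

		// Store: flatten each map entry into a "mymap.<key>" property.
		Map<String, String> settings = new HashMap<String, String>();
		settings.put("host", "localhost");
		settings.put("port", "8080");
		for (Map.Entry<String, String> entry : settings.entrySet()) {
			config.setProperty(PREFIX + "." + entry.getKey(), entry.getValue());
		}
		config.save("app.properties");

		// Read: rebuild the map from every key that starts with the prefix.
		Map<String, String> loaded = new HashMap<String, String>();
		Iterator<?> keys = config.getKeys(PREFIX);
		while (keys.hasNext()) {
			String key = String.valueOf(keys.next());
			loaded.put(key.substring(PREFIX.length() + 1), config.getString(key));
		}
		System.out.println(loaded);
	}
}

It works, but managing the key prefix by hand is exactly the part I would like to get rid of.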

Jackson and mapping fields

I was trying to use the useragentstring.com API to query browsers and map the results to my custom object, and I did not want to name my fields agent_type, agent_name and agent_version.

The simple solution I found on this page was to annotate the properties in my class with:

@JsonProperty("agent_type")
@JsonProperty("agent_name")
@JsonProperty("agent_version")

The mapping then happens automatically.
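
As a rough sketch, the annotated class might look something like this (the class and Java field names are my own; only the JSON property names come from the API):

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;

// Maps the snake_case JSON fields returned by the API to camelCase Java fields.
@JsonIgnoreProperties(ignoreUnknown = true)
public class Agent {

	@JsonProperty("agent_type")
	private String agentType;

	@JsonProperty("agent_name")
	private String agentName;

	@JsonProperty("agent_version")
	private String agentVersion;

	// getters and setters omitted for brevity
}

Deserializing then comes down to new ObjectMapper().readValue(json, Agent.class).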

Very simple.

Using log4j for email alerts in an application

I want to explore not just whether using log4j for email alerts in an application is feasible, but how well it works in a real application.

The idea came from the fact that a team I work with changes the alert rules on a regular basis, depending on the work and the situations they find themselves in. It is clear that we have to stop changing the code every time to accommodate the new rules they give us.

Log4j is a possibility and I found this blog post that talks about it:

http://www.srccodes.com/p/article/18/send-logs-by-email-notification-using-apache-log4j-smtpappender
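
From a first read, the log4j 1.x side of it would look roughly like the configuration below. The SMTP host and addresses are placeholders, and the SMTPAppender also needs the JavaMail jar on the classpath.

log4j.rootLogger=INFO, email
log4j.appender.email=org.apache.log4j.net.SMTPAppender
log4j.appender.email.SMTPHost=smtp.example.com
log4j.appender.email.From=alerts@example.com
log4j.appender.email.To=team@example.com
log4j.appender.email.Subject=Application alert
# Only ERROR (and above) events trigger an email
log4j.appender.email.Threshold=ERROR
# Number of buffered log events included in each email
log4j.appender.email.BufferSize=10
log4j.appender.email.layout=org.apache.log4j.PatternLayout
log4j.appender.email.layout.ConversionPattern=%d %-5p %c - %m%n

The appeal is that changing who gets alerted, and on what, becomes a configuration change instead of a code change.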

Impact of the scope on the maven assembly plugin

For a Spark project I needed to bundle some dependencies, and I found a few Stack Overflow answers suggesting adding this section to the pom.xml:

<plugin>
	<artifactId>maven-assembly-plugin</artifactId>
	<executions>
		<execution>
			<phase>package</phase>
			<goals>
				<goal>single</goal>
			</goals>
		</execution>
	</executions>
	<configuration>
		<descriptorRefs>
			<descriptorRef>jar-with-dependencies</descriptorRef>
		</descriptorRefs>
	</configuration>
</plugin>

But once I packaged my application I ended up with a huge jar file: it contained every dependency the application referenced, even ones it did not need to bundle.

My mistake was that the pom.xml did not contain the proper scope for each dependency and because of that they were all getting bundled into the jar.

Specifying that a few of them were provided (<scope>provided</scope>) reduced the jar size considerably.
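
For illustration, marking a dependency as provided looks like this (spark-core here is just an example, and the version is only indicative):

<dependency>
	<groupId>org.apache.spark</groupId>
	<artifactId>spark-core_2.10</artifactId>
	<version>1.2.0</version>
	<scope>provided</scope>
</dependency>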

How a small missing piece can change other behaviours drastically.

jmap notes

If you search for how to get a memory dump of a Java process, many will recommend jmap.

On a server, jmap is the tool to use since jvisualvm requires a UI, and that is rarely available in our setup.

The command to use is simple:

jmap -dump:format=b,file=heap.bin <pid>

The problem is that it "freezes" your Java application while the heap is being dumped, and when the process uses a fair amount of memory it takes a long time (a very long time). The app I tried to get information from was using 2.3 GB, and after 30 minutes it was still not done writing heap.bin. Had to abandon.

Random Password Generator in Java

I needed a random password generator in Java, and while reading a few articles here and there I did not find the solution I was looking for. Piecing a few of them together, I came up with some code.

One of the requirements that made this a bit more complicated is that I needed a special character in the password, but only from a limited list of allowed special characters.
I also needed to make sure that the password respected some basic complexity rules.

It has a dependency on the Apache commons-lang3 library, but since I already had it in my project it was easy to accommodate.

Maven dependency:

<dependency>
	<groupId>org.apache.commons</groupId>
	<artifactId>commons-lang3</artifactId>
	<version>3.3.1</version>
</dependency>

Java code:

package com.halogensoftware.hosting.example;

import org.apache.commons.lang3.RandomStringUtils;

public class Random {

	public static void main(String[] args) {
		String pwd = "";
		for (int i = 0; i < 1000; i++) {
			// Keep generating 12-character candidates from the allowed alphabet
			// until one satisfies the complexity rules below.
			do {
				pwd = RandomStringUtils.random(12, "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890%#^*:");
			} while (!valid(pwd));
			System.out.println(i + ".valid pwd: " + pwd);
		}
	}

	// A password is valid when it contains at least one digit, one lowercase letter,
	// one uppercase letter and one of the allowed special characters.
	private static boolean valid(String pwd) {
		return pwd.matches(".*[0-9].*") && pwd.matches(".*[a-z].*")
				&& pwd.matches(".*[A-Z].*") && pwd.matches(".*[%#^*:].*");
	}
}

Installing the Oracle JDK on Fedora

Had to learn this one from this site.

In my use case I only want to compile and run Hadoop applications, so I have not completed all the steps for the browser setup.

Short version:

  1. Download the JDK of your choice; I picked 1.7.0_51
  2. sudo rpm -Uvh /tmp/jdk-7u51-linux-x64.rpm
  3. sudo alternatives --install /usr/bin/java java /usr/java/latest/bin/java 200000
  4. sudo alternatives --install /usr/bin/javac javac /usr/java/latest/bin/javac 200000
  5. sudo alternatives --install /usr/bin/jar jar /usr/java/latest/bin/jar 200000
  6. sudo alternatives --config java

The last step was to activate the new installation I added. I selected option 2.

As simple as that and running java -version shows me the Oracle JVM version.

Starting with Hadoop – 2

I created a page for my Hadoop notes and will keep it up to date as I experiment.

I will post short articles on what I have done and where I am facing challenges.

I think that using OpenJDK is a mistake, so I am testing with the Oracle JVM to see if it fixes some of the issues I am facing.

I have also upgraded to Fedora 20, which should not change much in how Hadoop works. The only issue I have noticed is an error because the temp directory is gone. I will have to investigate why that is preventing the namenode from starting. I might have to move the temp directory outside of /tmp to avoid this issue.
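
If I do move it, the change should be a single property in core-site.xml, along these lines (the path is just a placeholder I would pick):

<property>
  <name>hadoop.tmp.dir</name>
  <value>/var/hadoop/tmp</value>
</property>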

Starting with Hadoop

Trying to find simple and authoritative documentation for Hadoop is harder than I expected. With the many versions out there, it is easy to end up with documentation for the wrong version and never find what really needs to be done.

Versions:

  • Hadoop 2.2.0
  • OpenJDK 1.7.0_51
  • Fedora 19

I have set my environment variables in my .bash_profile:

export JAVA_HOME=/usr/lib/jvm/java
export HADOOP_HOME=/opt/hadoop-2.2.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

Configuration file:
$HADOOP_HOME/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>hdfs://localhost:54311</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>8</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
</configuration>

First few Hadoop commands:

hadoop namenode -format
hadoop namenode

Things to resolve:

$HADOOP_HOME/sbin/start-all.sh does not work at all; it throws a lot of errors