Fedora 23 in VirtualBox 5.x

That took a bit more work than usual.

There were a few missing components, and I did not find a single source that explained how to address all of the issues.

I had a hard time installing the VirtualBox Guest Additions in Fedora 23. I kept getting errors about missing kernel headers, missing gcc, and a warning that the X.Org support was experimental, so the installation could not complete.

The first few issues were easy to address:

sudo dnf install kernel-headers

The only catch was that my installation was running kernel 4.2.3, but since the update channel had the 4.2.8 headers available, the latest version got installed, which did not match the kernel I was running.

Lesson learned: update everything before installing new packages.

The second component to add was gcc:

sudo dnf install gcc

If you are up to date there is nothing to worry about; otherwise you may get a mix of dependency versions that will not work together.

The solution to the last issue was found in /var/log/vbox-install.log:

cd /usr/src/kernels/4.2.8-300.fc23.x86_64
sudo make oldconfig

After that I ran the Guest Additions installation again and it completed without error. After a restart, everything was working between the guest OS and the host.

Spring Rest Mapping to an Object

I just made a rookie mistake that cost me a few hours.

If you want Spring to be able to create an object from the JSON you are sending to a controller, make sure you do not define a constructor with parameters without also providing a default constructor.
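A minimal sketch of why this happens (the class names here are hypothetical, not from the project): declaring any parameterized constructor suppresses the compiler-generated no-arg constructor, and Jackson, which Spring uses to deserialize the request body, instantiates the object through that no-arg constructor.

```java
import java.lang.reflect.Constructor;

public class DefaultCtorDemo {

	// Hypothetical POJO that only declares a parameterized constructor:
	// the compiler no longer generates the implicit no-arg one.
	static class Broken {
		private final String version;
		Broken(final String version) { this.version = version; }
	}

	// Same POJO with the default constructor explicitly kept alongside.
	static class Fixed {
		private String version;
		Fixed() { }
		Fixed(final String version) { this.version = version; }
	}

	// Mappers like Jackson look up the no-arg constructor via reflection.
	static boolean hasDefaultCtor(final Class<?> clazz) {
		try {
			Constructor<?> ctor = clazz.getDeclaredConstructor();
			return ctor != null;
		} catch (final NoSuchMethodException e) {
			return false;
		}
	}

	public static void main(final String[] args) {
		System.out.println("Fixed:  " + hasDefaultCtor(Fixed.class));  // true
		System.out.println("Broken: " + hasDefaultCtor(Broken.class)); // false
	}
}
```

When `hasDefaultCtor` returns false, the mapper cannot build the object, and Spring reports it as a 400 Bad Request rather than a deserialization error, which is what makes it so puzzling.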

I simply removed my constructor since it was not used and everything started to work as expected.

The 400 Bad Request stopped puzzling me.

The POJO is simple:

package com.cinq.example.v1;

public class Request {

	private String version;
	private String template;
	private int lifespan;

	public int getLifespan() {
		return lifespan;
	}

	public String getVersion() {
		return version;
	}

	public String getTemplate() {
		return template;
	}

	public void setVersion(final String version) {
		this.version = version;
	}

	public void setTemplate(final String template) {
		this.template = template;
	}

	public void setLifespan(final int lifespan) {
		this.lifespan = lifespan;
	}
}
The controller that takes the json to create the object is also very simple:

package com.cinq.example.v1;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

/**
 * API for something
 */
@Controller
@RequestMapping(value = "/api/v1")
public class Management {

	private final static Logger LOGGER = LoggerFactory.getLogger(Management.class);

	@RequestMapping(value = "/something", consumes = MediaType.APPLICATION_JSON_VALUE, method = RequestMethod.POST)
	@ResponseBody
	public Response creation(@RequestBody final Request request) {
		LOGGER.info("Will create something with " + request.getVersion());
		try {
			Thread.sleep(5000);
		} catch ( final InterruptedException exception ) {
			LOGGER.error("Could not sleep for 5 secs.");
		}
		return new Response("test001", "s1", "sp1", "u1", "up1", 100L);
	}
}
The Response object is as simple as the Request one and has many String fields.

Lesson Learned

If you need to create a constructor with parameters, make sure you define the default one as well.

No web.xml

Servlet 3.1 allows web apps without a web.xml, but Maven was giving me errors when trying to package the application.

I had to add this to my pom.xml:
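The snippet itself did not survive this post; assuming the maven-war-plugin (the plugin that raises the packaging error), the usual fix is to tell it not to fail on a missing web.xml:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-war-plugin</artifactId>
  <version>2.6</version>
  <configuration>
    <failOnMissingWebXml>false</failOnMissingWebXml>
  </configuration>
</plugin>
```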


Apache Spark DataFrame Numeric Column (Again)

There is nothing like finally having one way that works to make you find more ways.

I found the .cast() method for columns I want to use as numeric values, which avoids using a UDF to transform them.

I now prefer this way… until I find another, simpler…

package com.cinq.experience;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;

import java.io.UnsupportedEncodingException;

public class Session {

	public static void main(String[] args) throws UnsupportedEncodingException {
		SparkConf conf = new SparkConf().setAppName("SparkExperience").setMaster("local");
		JavaSparkContext jsc = new JavaSparkContext(conf);
		SQLContext sqlContext = new SQLContext(jsc);

		// the load() call did not survive this post; this assumes the spark-csv
		// package and an illustrative file name
		DataFrame df = sqlContext.read()
				.format("com.databricks.spark.csv")
				.option("header", "true")
				.load("data.csv");

		DataFrame crazy = df.select(df.col("x-custom-a"),
				df.col("x-custom-count").cast(DataTypes.LongType).alias("x-custom-count"));
		crazy.groupBy(crazy.col("x-custom-a")).avg("x-custom-count").show();
	}
}

Apache Spark DataFrame Average

We had some trouble doing the math on a column with DataFrames, even though the method is readily available.

We kept getting an error that the column was not a numeric value.

After a bit of reading I figured out that I needed to use a UDF to transform the string column into a numeric column.

package com.cinq.experience;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

public class DataFrameAvg {

	public static void main(String[] args) {
		SparkConf conf = new SparkConf().setAppName("DataFrameAvg").setMaster("local");
		JavaSparkContext jsc = new JavaSparkContext(conf);
		SQLContext sqlContext = new SQLContext(jsc);

		// the load() call did not survive this post; this assumes the spark-csv package
		DataFrame df = sqlContext.read()
				.format("com.databricks.spark.csv")
				.option("header", "true")
				.load("numericdata.csv");

		sqlContext.udf().register("toInt", new UDF1<String, Integer>() {
			public Integer call(String s) throws Exception {
				System.out.println("Parsing: " + s);
				return Integer.parseInt(s);
			}
		}, DataTypes.IntegerType);

		// the SQL below references the allData table, so the DataFrame must be registered
		df.registerTempTable("allData");
		DataFrame withNumber = sqlContext.sql("SELECT toInt(number) FROM allData");
		withNumber.show();
	}
}
and the content of the numericdata.csv is very simple:
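The listing itself was lost; an illustrative file consistent with the toInt(number) query above (the values here are made up) would look like:

```
number
1
2
3
42
```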


Maven Generate

Why did I only discover this lately?

Because the archetype:create was deprecated in Maven 3.0.5 and you should use the archetype:generate from now on. A bit odd to do this in a .0.5 release. I must be missing something about the reasoning behind this change.

So from now on when I need the default directory structure:
mvn archetype:generate -DgroupId=com.cinq.example -DartifactId=example1 -DinteractiveMode=false
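For reference, running that command (with the quickstart archetype it defaults to) produces the standard Maven layout; the package directories follow the groupId given above:

```
example1/
├── pom.xml
└── src
    ├── main
    │   └── java
    │       └── com
    │           └── cinq
    │               └── example
    │                   └── App.java
    └── test
        └── java
            └── com
                └── cinq
                    └── example
                        └── AppTest.java
```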

Minimal log4j.xml

Too often I copy my log4j.xml from one project to another, so I figured I would post it here as a template.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="false">
  <appender name="default.console" class="org.apache.log4j.ConsoleAppender">
    <param name="target" value="System.out" />
    <param name="threshold" value="debug" />
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d{ISO8601} %-5p [%c{1}] - %m%n" />
    </layout>
  </appender>
  <logger name="com.halogensoftware.hosting" additivity="false">
    <level value="debug" />
    <appender-ref ref="default.console" />
  </logger>
  <root>
    <priority value="info" />
    <appender-ref ref="default.console" />
  </root>
</log4j:configuration>

HDFS file listing

Using only the Hadoop libraries, I can list all the files in a subdirectory with this:

 // list all sites we have data for
 FileSystem fs = FileSystem.get(new Configuration());
 FileStatus[] status = fs.listStatus(new Path("hdfs:///dir/subdir/"));
 for ( FileStatus s : status ) {
     try {
         FileStatus[] metricFile = fs.listStatus(new Path(s.getPath().toString() + "/file.json"));
         logger.info("File: " + metricFile[0].getPath().toString());
     } catch ( IOException e ) {
         // there is no metric file for this site
     }
 }
Since I use Spark for most of the applications I write, I prefer this way of dealing with it:

SparkConf sc = new SparkConf().setAppName("Learning");
JavaSparkContext jsc = new JavaSparkContext(sc);
JavaPairRDD<String, String> allMetricFiles = jsc.wholeTextFiles("hdfs:///dir/subdir/*/file.json");
for ( Tuple2<String, String> each : allMetricFiles.toArray() ) {
	logger.info("Only metric file: " + each._1);
}