Tuesday, 24 September 2013

Playing with Thrift and Java

Thrift is an interface definition language that is used to define and create services for numerous languages, including Java. It is used as a remote procedure call (RPC) framework and was developed at Facebook.

At Intelliseq we use Thrift as the communication protocol for our genequery NoSQL database. In this post I will describe a basic Thrift server and a test client.

First, we need to download and install Thrift. See http://thrift.apache.org/.

Next we need to design our interface. This is what I like about Thrift: you have to design the interface first, which is my favorite approach to software architecture. Below is a simple service interface with an enum, a struct, an exception and a few methods. I didn't include collections, imports, one-way methods and a few other less important features of Thrift. You can read more about them in the great "Thrift: The Missing Guide".
namespace java pl.intelliseq.largedata.thrift

struct Message {
 1: required string message
}

struct User {
 1: required string firstname
 2: required string lastname
}

enum Wrong {
 FIRST = 1,
 SECOND = 2
}

exception InvalidFirst {
 1: string why
}

service ExampleService {
 bool isAlive(),
 Message getHello(1:User user),
 void getError(1: Wrong wrong) throws (1:InvalidFirst ouch)
}

We can autogenerate the code with the command below, assuming that our Thrift file is in the thrift subdirectory:
thrift --gen java thrift/Exampleservice.thrift
All classes are generated and placed in the gen-java directory. We can set this directory as a source directory. Next, we will implement all the methods:
package pl.intelliseq.largedata.thrift;

import org.apache.thrift.TException;

public class ThriftHandler implements ExampleService.Iface {

 @Override
 public boolean isAlive() throws TException {
  return true;
 }

 @Override
 public Message getHello(User user) throws TException {
  Message message = new Message();
  message.setMessage("Hello " + user.getFirstname() + " " + user.getLastname());
  return message;
 }

 @Override
 public void getError(Wrong wrong) throws InvalidFirst, TException {
  // == is null-safe for enums, unlike wrong.equals(...)
  if (wrong == Wrong.FIRST) throw new InvalidFirst();
 }

}
We also have to set up our server. We need to run it in a separate thread if we want to test it:
package pl.intelliseq.largedata.thrift;

import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.server.TServer;
import org.apache.thrift.server.TThreadedSelectorServer;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TNonblockingServerSocket;
import org.apache.thrift.transport.TNonblockingServerTransport;
import org.apache.thrift.transport.TTransportException;

public class ThriftServer implements Runnable {

 TServer server;
 
 public void init() throws InterruptedException, TTransportException {
  System.out.println("Starting server on port 9090 ...");
  ThriftHandler handler = new ThriftHandler();
  ExampleService.Processor processor = new ExampleService.Processor(
    handler);
  TNonblockingServerTransport trans = new TNonblockingServerSocket(9090);
  TThreadedSelectorServer.Args args = new TThreadedSelectorServer.Args(trans);
  args.transportFactory(new TFramedTransport.Factory());
  args.protocolFactory(new TBinaryProtocol.Factory());
  args.processor(processor);
  args.selectorThreads(4);
  args.workerThreads(32);
  server = new TThreadedSelectorServer(args);

  new Thread(this).start();

  while (!server.isServing()) { Thread.sleep(1); }
  System.out.println("Serving on port 9090 ...");
 }
 
 public void run() {
  server.serve();
 }
}
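The `while (!server.isServing())` loop above waits forever if the server never starts, for example when port 9090 is already taken. Below is a minimal sketch of a bounded wait in plain Java, with no Thrift dependency. `StartupWait`, `awaitTrue` and the timeout values are hypothetical names and numbers chosen for illustration; in the class above the condition would be `server::isServing`.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of Thrift): polls a condition until it
// becomes true or the timeout elapses.
final class StartupWait {

    // Returns true if the condition became true within the timeout,
    // false if the timeout elapsed or the thread was interrupted first.
    static boolean awaitTrue(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the condition is still false
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for server::isServing: becomes true after about 50 ms.
        boolean started = awaitTrue(() -> System.currentTimeMillis() - start >= 50, 2000);
        System.out.println("started: " + started);
        // A condition that never holds is abandoned after the timeout.
        boolean never = awaitTrue(() -> false, 100);
        System.out.println("never: " + never);
    }
}
```

With a helper like this, init() could report a failed start instead of hanging the test run.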
And finally, we will write our test:
package pl.intelliseq.largedata.thrift;

import static org.junit.Assert.*;

import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.apache.thrift.transport.TTransportException;
import org.junit.After;
import org.junit.Before;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;
import org.junit.runner.RunWith;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration({"file:conf/thrift-conf.xml"})
public class ThriftServerTest {

 TTransport transport;
 ExampleService.Client client;
 
 @Before
 public void init() throws TTransportException {
  transport = new TFramedTransport(new TSocket("localhost", 9090));
  transport.open();
  TProtocol protocol = new TBinaryProtocol(transport);
  client = new ExampleService.Client(protocol);
 }
 
 @After
 public void destroy() {
  transport.close();
 }

 @Rule
 public ExpectedException exception = ExpectedException.none();
 
 @Test
 public void isAliveTest() throws TException {
  assertTrue(client.isAlive());
 }
 
 @Test
 public void getHelloTest() throws TException {
  User user = new User();
  user.setFirstname("John");
  user.setLastname("Doe");
  // JUnit expects the arguments in the order (expected, actual)
  assertEquals("Hello John Doe", client.getHello(user).getMessage());
 }
 
 @Test
 public void getErrorTest() throws TException {
  exception.expect(InvalidFirst.class);
  Wrong wrong = Wrong.FIRST;
  client.getError(wrong);
 }
 
 @Test
 public void getErrorSecondTest() throws TException {
  Wrong wrong = Wrong.SECOND;
  client.getError(wrong);
 }

}
That's it. Works like a charm. You can clone the working project with: git clone https://github.com/marpiech/largedatablog.git -b thrift thrift-and-java

Problems:

How to get rid of ugly warnings caused by autogenerated Thrift code (in Eclipse)?
Create a new source directory, e.g. src/thrift, and copy the autogenerated code into it. Then right-click on the directory -> Build Path -> Use as Source Folder. Right-click again -> Build Path -> Configure Build Path... -> Ignore optional compile problems -> Toggle.

org.apache.thrift.TApplicationException: [your method] failed: unknown result
Your server implementation returned null, or see: http://stackoverflow.com/questions/4244350/how-thread-safe-is-thrift-re-i-seem-to-have-requests-disrupting-one-another

TNonblockingServer.java [line number] Read an invalid frame size of [integer number]. Are you using TFramedTransport on the client side?
Wrap your TSocket into TFramedTransport (i.e. new TFramedTransport(new TSocket("localhost", 9090)))
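The error above comes from the way framed transports delimit messages: each message is preceded by its length as a 4-byte big-endian integer, and the non-blocking servers read that length first. If the client writes unframed bytes, the server interprets the first four bytes of the payload as a frame size. Below is a rough sketch of the idea in plain Java with no Thrift dependency. `FramingSketch`, `frame` and `readFrameSize` are hypothetical names, the payload is an arbitrary string, and this is an illustration rather than Thrift's own code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of length-prefixed framing.
final class FramingSketch {

    // Prefixes the payload with its length as a 4-byte big-endian integer
    // (ByteBuffer is big-endian by default).
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Reads the first four bytes as the frame size, as a framed reader would.
    static int readFrameSize(byte[] wire) {
        return ByteBuffer.wrap(wire).getInt();
    }

    public static void main(String[] args) {
        byte[] payload = "getHello".getBytes(StandardCharsets.US_ASCII);

        // Framed client: the reader sees the real payload length.
        System.out.println(readFrameSize(frame(payload))); // prints 8

        // Unframed client: the reader takes the first four payload bytes
        // for a length and gets a meaningless frame size.
        System.out.println(readFrameSize(payload));
    }
}
```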

Thursday, 13 December 2012

Spring examples #2 - Testing Controller using Spring MVC with session scope

Sometimes you need to store user session data in a session object. There are two ways to do that: we can give session scope to a specific object or to the whole controller. For testing purposes it is easier (though not better) to choose the second option. Let us add some annotations to the controller:
@Controller
@Scope("session")
public class MyController {

 @Autowired
 SomeSingletonFromContext singleton;

 SessionObject sessionObject = new SessionObject();

 @RequestMapping("/my.htm")
 public void myRequest(HttpServletRequest request, HttpServletResponse response) throws IOException {
  ServletOutputStream out = response.getOutputStream();
  response.setContentType("application/json");
  response.setHeader("Cache-Control", "no-cache");
  out.print("Hello Session User");
 }
}
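Conceptually, session scope means that the container keeps one instance of the bean per HTTP session instead of one per application. Below is a minimal sketch of that idea in plain Java. It is not Spring's implementation; `SessionScopeSketch`, `getForSession` and the session ids are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical illustration: one bean instance per session id.
final class SessionScopeSketch<T> {

    private final Map<String, T> instances = new ConcurrentHashMap<>();
    private final Supplier<T> factory;

    SessionScopeSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    // Returns the instance bound to the session, creating it on first use.
    T getForSession(String sessionId) {
        return instances.computeIfAbsent(sessionId, id -> factory.get());
    }

    public static void main(String[] args) {
        SessionScopeSketch<StringBuilder> scope =
                new SessionScopeSketch<>(StringBuilder::new);
        StringBuilder a1 = scope.getForSession("session-A");
        StringBuilder a2 = scope.getForSession("session-A");
        StringBuilder b = scope.getForSession("session-B");
        System.out.println(a1 == a2); // true: same session, same instance
        System.out.println(a1 == b);  // false: another session, another instance
    }
}
```

This is why a field such as sessionObject in the controller above is safe to keep per user: each session gets its own controller instance.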
Now for the tricky part: enabling session scope for testing. The context configuration below is our "access point" for the controller test class. Note the 'import' tag pointing to the default servlet configuration.



  
      
          
              
                  
              
          
      
  

  


In the servlet-context we need to scan for controller:



Finally we have to set up our test:
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration({
 "file:path/to/test-context-as-above.xml"})
public class MyControllerTest {
 
    private MockHttpServletRequest request;
    private MockHttpServletResponse response;
    
    @Autowired
    private MyController myController;
 
    @Before
    public void setUp() {
     request = new MockHttpServletRequest();
        response = new MockHttpServletResponse();        
    }

    @Test
    public void testSubmit() throws Exception {

        request.setMethod("POST");
        request.setRequestURI("/my.htm");
        request.addParameter("param", "value");

        // pass the autowired controller field (myController) to the adapter
        @SuppressWarnings("unused")
        final ModelAndView mavBack = new AnnotationMethodHandlerAdapter()
            .handle(request, response, myController);

        assertEquals("application/json", response.getContentType());

    }
}
That's all.

Wednesday, 25 July 2012

Ohloh code

When I heard that Google Code Search was going to be shut down, I started to look for an alternative. I used it routinely to look for Spring solutions, GWT tricks or Hibernate XML configurations.

The best alternative was koders.com, but it lacked some features, for example searching XML files(!).

Now the Koders project is merging with Ohloh in the form of code.ohloh.net. Ohloh Code is great. I can look for GWT examples, Spring XML configuration files or Hibernate tricks. The left panel is perfect, the code preview is just right for me, and the tool looks clean and nice.

Monday, 23 July 2012

Installing latest R version under Ubuntu

Ubuntu does not ship the most recent R version. Below is the solution. Note that you can choose a different CRAN mirror; I use the Wroclaw (Poland) mirror. My Ubuntu version is 11.10 (oneiric), so the commands below use oneiric. For 12.04 use precise instead.
sudo apt-get install python-software-properties
sudo add-apt-repository "deb http://r.meteo.uni.wroc.pl/bin/linux/ubuntu oneiric/"
sudo apt-get update
You will get the following warning: W: GPG error: http://r.meteo.uni.wroc.pl oneiric/ Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 51716619E084DAB9. Import the missing key and install R:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
sudo apt-get install r-base-dev 
And we are done.

Tuesday, 17 July 2012

Spring examples #1 - beginning with spring


I decided to prepare for Spring certification. Therefore, in the 'Spring examples' series of posts I will introduce and tidy up the most important Spring concepts.

The source code for this post is available at:
git clone git@bitbucket.org:marpiech/iseqspringcore.git

Prerequisites:
- Eclipse: download 'Eclipse IDE for Java EE Developers' from http://www.eclipse.org/downloads/
- Maven: download it from http://maven.apache.org/download.html or install it with sudo apt-get install maven2
- m2eclipse: in Eclipse, Help -> Marketplace -> Maven integration for Eclipse -> install
- SpringSource Tool Suite: in Eclipse, Help -> Marketplace -> STS -> install

First, we have to create the directory structure:
mkdir iseqspringcore
cd iseqspringcore
mkdir -p src/{test,main}/{java,resources}
Next, we need to set up maven.
nano pom.xml
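and paste:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 <modelVersion>4.0.0</modelVersion>

 <groupId>com.intelliseq</groupId>
 <artifactId>spring-tutorial</artifactId>
 <version>1.0</version>

 <name>Spring Tutorial Part 1</name>
</project>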
Now, we have to prepare the Eclipse project:
mvn install
mvn eclipse:eclipse

Let's switch to eclipse.
File -> Import -> General -> Existing Projects into workspace
RightClick on Project -> Configure -> Convert to maven project

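Add the Maven repository to pom.xml:
 <repositories>
  <repository>
   <id>central</id>
   <name>Maven Repository Switchboard</name>
   <layout>default</layout>
   <url>http://repo1.maven.org/maven2</url>
   <snapshots>
    <enabled>false</enabled>
   </snapshots>
  </repository>
 </repositories>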

RightClick on Project -> Maven -> Add Dependency -> org.springframework, spring-context
Have a look at pom.xml and at the Maven Dependencies library set in the project.

In src/main/java create the package com.intelliseq.springexamples.core. Create application-context.xml in the com.intelliseq.springexamples package. There are two ways to instantiate a bean through setters: the classic one and the modern one (see below).


 
  
  
 

 

 
  
 
Let's create Person class in the com.intelliseq.springexamples.core package.
package com.intelliseq.springexamples.core;

public class Person {
 
 private String firstName;
 private String familyName;
 
 public void setFirstName(String firstName) {
  this.firstName = firstName;
 }

 public void setFamilyName(String familyName) {
  this.familyName = familyName;
 }

 @Override
 public String toString() {
  return firstName + " " + familyName;
 }
}
And finally let's create application runner class SpringApp in the com.intelliseq.springexamples.core package. The SpringApp class uses application-context through ClassPathXmlApplicationContext class.
package com.intelliseq.springexamples.core;

import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class SpringApp {

 public static void main (String[] args) {
  ApplicationContext context =
       new ClassPathXmlApplicationContext(new String[] {"com/intelliseq/springexamples/application-context.xml"});
  Person person = (Person) context.getBean("person");
  System.out.println(person);
  Person modernPerson = (Person) context.getBean("person-modern");
  System.out.println(modernPerson);
 }
 
}
Run Spring App and voila.
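What the container does for a setter-injected bean can be approximated by hand: create the object with its no-argument constructor, call one setter per configured property, and register the result under the bean id. Below is a minimal sketch in plain Java, not Spring's actual mechanism. `SetterInjectionSketch` and `buildContext` are hypothetical names, and the values "John" and "Doe" are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of setter injection done by hand.
final class SetterInjectionSketch {

    // Same shape as the Person class above.
    static final class Person {
        private String firstName;
        private String familyName;

        public void setFirstName(String firstName) {
            this.firstName = firstName;
        }

        public void setFamilyName(String familyName) {
            this.familyName = familyName;
        }

        @Override
        public String toString() {
            return firstName + " " + familyName;
        }
    }

    // Stand-in for the application context: bean id -> configured instance.
    static Map<String, Object> buildContext() {
        Person person = new Person();   // 1. instantiate with the no-arg constructor
        person.setFirstName("John");    // 2. one setter call per property
        person.setFamilyName("Doe");
        Map<String, Object> beans = new HashMap<>();
        beans.put("person", person);    // 3. register under the bean id
        return beans;
    }

    public static void main(String[] args) {
        Person person = (Person) buildContext().get("person");
        System.out.println(person); // prints "John Doe"
    }
}
```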

Wednesday, 13 June 2012

Getting linked exons BED from UCSC tables

Today I needed a bigBed file to visualize transcript positions in a browser that is based on bigBed and bigWig files. I found a discussion where Katrina Learned from the UCSC Genome Bioinformatics Group posted a very useful script. I am not going to reinvent the wheel.
nano genePredToBed
and paste this
#!/usr/bin/awk -f

#
# Convert genePred file to a bed file (on stdout)
#
BEGIN {
     FS="\t";
     OFS="\t";
}
{
     name=$1
     chrom=$2
     strand=$3
     start=$4
     end=$5
     cdsStart=$6
     cdsEnd=$7
     blkCnt=$8

     delete starts
     split($9, starts, ",");
     delete ends
     split($10, ends, ",");
     blkStarts=""
     blkSizes=""
     for (i = 1; i <= blkCnt; i++) {
         blkSizes = blkSizes (ends[i]-starts[i]) ",";
         blkStarts = blkStarts (starts[i]-start) ",";
     }

     print chrom, start, end, name, 1000, strand, cdsStart, cdsEnd, 0, blkCnt, blkSizes, blkStarts
}
We are almost ready to make our bed file:
chmod +x genePredToBed
wget http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/knownGene.txt.gz
gzip -d knownGene.txt.gz
cat knownGene.txt | ./genePredToBed > known.bed
Now we can use Jim Kent's bedToBigBed and we are done.
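The block arithmetic in the awk script above can be checked on a tiny hand-made example: BED block sizes are exon end minus exon start, and BED block starts are exon starts relative to the transcript start. Below is a small sketch in Java. `BedBlocksSketch` and its methods are hypothetical names, and the coordinates are made up rather than taken from knownGene.

```java
// Hypothetical re-implementation of the block arithmetic from the awk script.
final class BedBlocksSketch {

    // exonStarts and exonEnds are comma-separated lists with a trailing comma,
    // as in genePred tables. Returns the BED blockSizes column.
    static String blockSizes(String exonStarts, String exonEnds, int count) {
        String[] starts = exonStarts.split(",");
        String[] ends = exonEnds.split(",");
        StringBuilder sizes = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sizes.append(Long.parseLong(ends[i]) - Long.parseLong(starts[i])).append(',');
        }
        return sizes.toString();
    }

    // Returns the BED blockStarts column: exon starts relative to txStart.
    static String blockStarts(String exonStarts, long txStart, int count) {
        String[] starts = exonStarts.split(",");
        StringBuilder relative = new StringBuilder();
        for (int i = 0; i < count; i++) {
            relative.append(Long.parseLong(starts[i]) - txStart).append(',');
        }
        return relative.toString();
    }

    public static void main(String[] args) {
        // Made-up transcript starting at 100 with exons [100,200) and [300,450).
        System.out.println(blockSizes("100,300,", "200,450,", 2)); // prints 100,150,
        System.out.println(blockStarts("100,300,", 100, 2));       // prints 0,200,
    }
}
```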

Getting GTF from UCSC with proper gene_id

When you download a GTF file (knownGenes or ensemblGenes) from the UCSC Table Browser, the output has one serious issue: gene_id is set to the transcript_id, so in fact there is no real gene_id. Below is a simple solution to this problem (for the mm9 genome):

# prerequisite: mysql client
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/genePredToGtf
chmod +x genePredToGtf
# put the tool on the PATH; the link target must be an absolute path
sudo ln -s "$PWD/genePredToGtf" /usr/local/bin/genePredToGtf
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -N -e "select * from ensGene;" mm9 | cut -f2- | genePredToGtf file stdin ensGene.gtf