I was born and raised in Ghana, Africa, and this is my fifth year living in the USA.
Hello, my name is Asher Sampong, a 21-year-old from Alexandria, Virginia. I am currently a senior at Fort Valley State University, majoring in Mathematics and on course to graduate in May 2015. I have an older sister attending Virginia Tech. I am a sports-minded person and always try to incorporate some sort of physical or sport activity into my daily routine. In my leisure time, I like to learn more about some of the new advances being made in engineering and how they will affect our lives in the future. I also love watching criminal investigation shows and movies, with my favorites being Sherlock Holmes and NCIS. I like trying new food recipes. After completing my Math degree, I plan on attending the Pennsylvania State University to attain a degree in Petroleum Engineering.
Gene interaction graphs are a key data structure in systems biology. While a handful of open source biological visualization tools exist, such as Cytoscape and Walrus, they do not scale to millions of nodes or thousands of graphs, nor do they efficiently incorporate biological metadata for data mining in a biological context. Research question: Provide a baseline visualization solution, for hundreds of plant gene interaction networks currently in development, that can be incorporated into a website.
In the first week of this REU, I had my first meeting with my project team, where we discussed the aim of the project and how to achieve it. To bring me up to date, I was given the data Professor Feltus had already supplied to the rest of the team. I am having to learn new software I have not used before, such as the Linux operating system, JavaScript, D3.js and Python. These tools are essential to completing my project and will take some time for me to understand as a beginner. My role in this project also became clear to me as the week went by, and my first task was to produce a JSON file from the data given in a text document. This JSON file will later be loaded into this D3.js force directed graph layout to see if it is possible to produce a graph layout from the data I was given. I was also given a tour of the Palmetto cluster, which might play a part in my project later on.
I spent most of my time this week experimenting with and learning more about Python. I started writing code to produce a JavaScript Object Notation (JSON) file from the data given in a text file. By using a combination of online tutorials, trial and error, and the help of my mentor Dr Levine, I was able to get a JSON file in a format suitable for producing a graph layout. This link contains one of the many JSON files I was able to produce with the Python code. At the end of the week I was working on the JavaScript code to read the JSON file into D3. During the course of the week, I also attended lectures on Linux and Tableau.
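The text-to-JSON step can be sketched roughly as follows. This is only an illustration, not the code I actually wrote: it assumes each line of the text file holds two whitespace-separated gene names forming one edge, and it assumes the "nodes" and "links" layout that D3 force directed examples commonly use. The function name `edges_to_graph` and the sample gene names are made up for the example.

```python
import json


def edges_to_graph(lines):
    """Build a node/link dictionary from lines of the form 'source target'.

    Assumption: each line is one edge between two gene names. The real
    data file may have more columns or a different separator.
    """
    index = {}  # gene name -> position in the nodes list
    nodes = []
    links = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source, target = fields[0], fields[1]
        for name in (source, target):
            if name not in index:
                index[name] = len(nodes)
                nodes.append({"name": name})
        # Links refer to nodes by their position in the nodes list.
        links.append({"source": index[source], "target": index[target]})
    return {"nodes": nodes, "links": links}


# Hypothetical sample data standing in for the text file.
sample = ["geneA\tgeneB", "geneB\tgeneC", ""]
graph = edges_to_graph(sample)
print(json.dumps(graph, indent=2))
```

To work from a real file, the list `sample` could be replaced by an open file object, since iterating over a file also yields one line at a time.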
My main focus during this week was to write JavaScript code that could successfully load a JSON file into D3. Building on the JavaScript code in this force directed graph layout, I was able to adapt it to output a graph layout as seen here. After successfully loading my data into D3, my previous mentor Dr Levine and I discovered that D3 was not going to be a useful tool for visualizing the data because it took a lot of time and computer resources to output the graph layout. As such, we had to look for a visualization tool that would accept the data and output it as a graph layout in a web browser without taking a long time or a lot of resources. After some research we came across VivaGraph.js. By following the tutorials provided on the website, I was able to come up with a graph layout of the rice network. The next challenge after this breakthrough is to write Python code to combine the rice and maize networks and then load the result into the graph layout.
I spent the majority of this week learning new methods for editing the Python code I have, so that it could create a JSON file of the rice and maize networks combined. By the end of the week I was able to achieve this goal and produce this JSON file. Using the same HTML code, I loaded this JSON file and was able to produce this graph, which shows the maize network and the rice network together on one webpage. The next challenge is to have edges drawn between the rice network and the maize network, which would represent the common genes that serve similar functions in both networks.
I encountered some trouble during week 5 due to my internal hard drive crashing on Sunday. On Wednesday, I successfully replaced the hard drive with a larger one. Through a friend I obtained a hard drive transfer cable that I hoped would allow me to access the files on the old hard drive. I tried using Windows 8 to access my files, but it did not work well due to some corrupted Windows OS files on the old hard drive. After some research, I found that Linux was able to read files from the crashed hard drive much more effectively than Windows. I then installed Ubuntu alongside Windows 8, which allowed me to access my old hard drive and transfer the files on it to my current hard drive. With this problem out of the way, I was able to continue my work. On Thursday, I gave a midterm presentation of my work and research to my fellow interns, a few professors and mentors. I explained the research I was doing, the visualization tools I am currently using, the challenges I have been facing and my schedule for the upcoming weeks. By the end of the week, I had started working on Python code to read the Rice, Maize and Rice-Maize mapping text files.
During this week, I thought of several ways to write Python code that reads both the Rice and Maize network files and the mapping text file into a JSON file. I initially thought of editing the code I already had to incorporate the mapping file. I tried this method but ran into a lot of errors and syntax problems, so I abandoned that idea in favor of a more suitable option. My next idea was to write new Python code that would work on any number of files and perform the same operations on each of them. After extensive research, I discovered the fileinput module and sys.argv in Python. Using a combination of these two allows the user to supply any number of files when calling the Python script on the command line. I first tried the fileinput module on a separate piece of code I wrote for reading the rice and maize nodes. After correcting a few errors, the code worked as I wanted: it prints out a list of unique nodes from any number of files and places them into different groups. The next step was to have the edge list read from the files already passed in through sys.argv. I accomplished this by incorporating a piece of existing code into the current code, and it worked. This code is flexible because it is not difficult to add attributes to the JSON file that can later be used in the HTML for various functions. I incorporated the resulting JSON file into the existing HTML code and produced this graph. This graph contains the network layout of both the Rice and Maize networks as well as the edges connecting common genes in both networks. With the JSON aspect now complete, the next task is to use the various attributes in the JSON file to make the graph layouts easier to understand.
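The fileinput and sys.argv approach described above can be sketched as follows. This is a simplified illustration rather than my actual script: it assumes every input file, including the mapping file, holds one edge per line as two whitespace-separated gene names, and it assigns each node the group of the first file it appears in. The function name `build_graph` and the "group" attribute are names chosen for this example.

```python
import fileinput
import json
import sys


def build_graph(paths):
    """Read edge lists from any number of files into one node/link dictionary.

    Assumption: each line is 'source target'. A node's group is the
    position (0, 1, 2, ...) of the first file in which it appears, and
    each link records the file it came from in the same way.
    """
    nodes, links = [], []
    if not paths:
        # fileinput would fall back to sys.argv or stdin for an empty list.
        return {"nodes": nodes, "links": links}
    group_of_file = {path: i for i, path in enumerate(paths)}
    index = {}  # gene name -> position in the nodes list
    with fileinput.input(files=paths) as stream:
        for line in stream:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            group = group_of_file[stream.filename()]
            source, target = fields[0], fields[1]
            for name in (source, target):
                if name not in index:
                    index[name] = len(nodes)
                    nodes.append({"name": name, "group": group})
            links.append({"source": index[source],
                          "target": index[target],
                          "group": group})
    return {"nodes": nodes, "links": links}


if __name__ == "__main__":
    # Every command-line argument after the script name is an input file.
    print(json.dumps(build_graph(sys.argv[1:]), indent=2))
```

Saved under a hypothetical name such as build_graph.py, it could be called as `python build_graph.py rice.txt maize.txt mapping.txt > network.json`, where the file names are placeholders. Because the group is stored as an attribute in the JSON, the HTML side can later use it for things like colouring each network differently.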
I attended the 2014 Extreme Science and Engineering Discovery Environment (XSEDE) Conference in Atlanta, Georgia during this week. XSEDE is a collection of some of the most powerful computing resources that scientists and engineers around the world can use to further their research and develop new products and innovations. The conference brings together people from various backgrounds who use XSEDE resources, and allows researchers and innovators to present their findings and discoveries. This four-day conference exposes undergraduates to research involving advanced digital resources and to some of the researchers who use or develop these machines. At the conference I participated in various events, such as the Student & Student Mentors Dinner, and attended different tutorials and presentations, including a series of lightning talks from researchers with varying educational backgrounds. The most interesting and helpful tutorial I attended was titled Secure Coding Practices. It was easy for me to follow because the material was presented from a beginner's point of view. The topics covered included understanding how code can be exploited by a hacker, how to train yourself to think like a hacker, and how to prevent openings where your code can be exploited. I found this presentation useful because it gave me entry-level exposure to how code is hacked. Although I cannot incorporate the things I learned from this presentation into my current research, I hope to use them in the future. On Wednesday, I participated in the BOF: Listening to Under-represented Students about REUs: A BOF for Advocates of Increasing Participation in the Computational Sciences. I introduced myself and explained how I found the Visualization Research at Clemson University. I was expecting to answer a lot of questions relating to the topic, but few people showed up to the event.
I also found the lightning talks useful because they showed me how to present research to a large audience in a few minutes while still covering every aspect and making it interesting to the audience. Overall I enjoyed the conference and learnt new information, but I believe I would have benefited more if I had known more about some of the subjects that were presented or discussed, as they seemed to be at an advanced level that I have not yet taken courses on. During the week, I was also able to make further progress on my research. The graph layout of the Rice and Maize gene networks is now coloured to make it easier to differentiate between the two gene networks.
The final week of the REU was busy. I created a poster for my research, finalised my abstract and Python code, and wrote various reports and summaries for the events I participated in during this internship. I also gave my final presentation on Friday to some professors, researchers and my fellow interns. Overall, I have had an awesome learning experience taking part in this REU. I have learnt a lot of new information that will be beneficial to me in both my undergraduate studies and my future career. I would first like to thank NSF for funding this REU, and Dr Vetria Byrd for selecting me to participate and for all the support and help that she gave me during my participation. I also want to extend my gratitude to Dr Alex Feltus, Dr Melissa Smith, Dr Joshua Levine, Anagha Joshi, Karan Sapra and Christian Weeks for all their support, help and co-operation during these 8 weeks. Finally, I want to thank the Advanced Visualization Division at Clemson University for offering me the opportunity to do research on their campus and use the many resources they have. It was great meeting friendly people all over the campus. I would recommend Clemson University to anyone who wants to further their education, take part in research or tour a beautiful campus.
Last updated: 07/24/2014