Hali Gallagher

Coastal Carolina University

Clemson Research Experiences for Undergraduates

Collaborative Data Visualization Applications Summer 2014

Home Institution

Coastal Carolina University
Conway, SC
hfgallagh@g.coastal.edu

Clemson Research Mentor

Jill Gemmill
Computer Science

Clemson Visualization Mentor

Gabriel Hankins
English

About Me

A few of my interests are programming, pizza, writing, cats, reading, and the internet. My favorite genre is comedy, and I think urban legends are really interesting even though I am not superstitious. That said, I will read just about anything if it is well written.
Some of my hobbies are swimming, fishing, and gaming. I enjoy swimming at the beach. The waves at Myrtle Beach are usually too small to surf, so I mostly bodyboard and skimboard.

I also fish with my family sometimes. The pier that my family visits is about a mile's walk away, which really is not that bad, except that the fishing equipment can get heavy after a bit.

I would not exactly call myself a gamer, but I do enjoy many games by Nintendo. I have been a Pokémon fan since I was a little kid. Those who know little about Pokémon think of it as a game for kids, and that is partly true, but for serious players it is a ridiculously strategic game. I am still a kid at heart though, so I mostly just play it for fun. My favorite console at the moment is the 3DS.

The first thing people usually declare a few minutes after meeting me is, "Stop talking so much, Hali!" However, this is only sarcasm. It is a bit strange that I do not really picture myself as the quiet type even though others do. I am always thinking about something; I just do not always blab about it. No offense to the extra extroverts.

I am an Information Systems major and an English minor. It probably does not come as a surprise that I am a former English major (even though my grammar probably is not up to hole-in-one), so most of the classes I have taken so far have been English classes, but I have also taken a few introductory computer classes. I am also a proud member of the Numbers and Bytes club at my college. Numbers and Bytes is a club for those interested in technology and gaming.

I was a regional award winner in the National History Day competition for a research paper I wrote on the Dred Scott case. Last year, I was recognized for excellent service as a volunteer actor at Dr. Screams Haunted Mansion in October. I am also a repeat member of the Dean's list at my college.

Project Description

The project I am working on requires visualizing the metadata from the letters of modernist writers, such as F. Scott Fitzgerald and Ernest Hemingway. The data will be collected from the letters and then put into CSV and table formats for later visualization. The purpose of visualizing these letters is to gain insight into the metadata that was not previously uncovered. We want to discover what common topics were being discussed and which important figures were being written to during the modernist period.

Week 1


For week one, I gathered background information on the following modernist writers: F. Scott Fitzgerald, Ernest Hemingway, Virginia Woolf, and Katherine Mansfield. I already knew a little about Fitzgerald. I was mostly interested in him because of his romance with Zelda Fitzgerald. The story sounded to me a lot like his most famous book, The Great Gatsby: they fall in love, but the girl fears he's not rich enough to support her, so the guy tries to get rich to win her love. What I didn't previously know was that the character Daisy, from The Great Gatsby, was actually based on Zelda Fitzgerald, which I thought was interesting.

I also learned about the visualization tool called Palladio and how to use it. I made visuals with the data that Palladio offers its new users. I think the most interesting function it offers is the map function. It plots coordinates for you if you enter the latitude and longitude of the destination.

Afterward, I began to create a people list of all of Fitzgerald's contacts. This includes all the people he sent letters to and all the people he got letters from.

Week 2


For week two, I organized the metadata from the letters into a table. This included recording the letter ID, author of the letter, letter recipient, date the letter was written, destination place, author place, letter type, where the original letter is archived, and the reference page in the book that I found the letter in.
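As a rough illustration, a table with these columns could be written out as CSV with a short Python script. This is only a sketch: the column names are assumed spellings of the fields listed above, and the sample row is a made-up placeholder, not a real letter from the collection.

```python
import csv
import io

# Assumed header names for the fields listed above (not the project's actual headers).
FIELDS = [
    "letter_id", "author", "recipient", "date",
    "destination_place", "author_place", "letter_type",
    "archive", "reference_page",
]

def write_letters(rows, handle):
    """Write letter-metadata dictionaries to an open text handle as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)

# Placeholder row for illustration only; none of these values come from the letters.
sample = [{
    "letter_id": "EX-001", "author": "Author A", "recipient": "Recipient B",
    "date": "1920-01-01", "destination_place": "Place X",
    "author_place": "Place Y", "letter_type": "letter",
    "archive": "Example Archive", "reference_page": "1",
}]

buffer = io.StringIO()
write_letters(sample, buffer)
csv_text = buffer.getvalue()
```

Keeping one row per letter and one column per field is what lets the same file be loaded into different visualization tools later.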

I thought it was interesting how I was able to see Fitzgerald go through the same writing process that I see many writers go through today. In one of his letters, he talks about reusing some of his old characters for his new book but just giving them a new name or personality. Jack becomes Jill, or Kate and Bob fuse together to become John. It's the process of taking the idea of a character and polishing it out to improve upon their personality. It's not exactly shocking that writers still use this technique, but it is kind of funny how vastly a character can change in the writing process.

It's also interesting seeing the goofy side of Fitzgerald, infamously known by nobody as Geo Washington, according to the signature he left on one of the letters. It's a side of Fitzgerald that you can't really see from being forced to read The Great Gatsby in the 10th grade. It kind of made me wonder if some college kid will be reading my Facebook messages in a hundred years and if they'll say, "I never knew Gallagher had such witty humor."

Week 3


For week three, I organized the metadata from Hemingway's letters written between 1919 and 1922. This included recording the letter ID, author of the letter, letter recipient, date the letter was written, destination place, author place, letter type, where the original letter is archived, and the reference page in the book that I found the letter in.

I also enjoyed reading all of the letters. Hemingway seems to enjoy fishing, drinking, and gambling. His choice of words is a bit more colourful than Fitzgerald's. Here we can see Hemingway ignoring the directions of a standard postcard he was given to send while he was admitted to the hospital:

Hemingway mentions gambling on the World Series of 1919, a well-known event in which the Chicago White Sox were said to have let the Cincinnati Reds win the series. He seemed disappointed to later learn the series was rigged. I just found it interesting to see an event he lived through mentioned in his letter. He also mentions being inspired by the book Tarzan of the Apes. I've never read the book before, but now I am kind of interested in doing so. It just seemed like an odd thing for him to be inspired by.

I have also been working on making visuals in Palladio with the metadata I have so far. Even making visuals out of part of the data helps me catch minor mistakes such as typos. Here is a visual I made in Palladio with data from Hemingway. The size of each circle indicates how many letters he mailed from that location.


Most of his letters have been sent from Paris and Illinois. This would make sense as his family lived in Illinois and many famous authors were in Paris at the time giving him a chance to help his career as a writer.
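The counts behind a map like this can also be checked outside of Palladio. The sketch below tallies letters per origin place in Python; the `author_place` field name and the rows are hypothetical placeholders, not the real Hemingway data.

```python
from collections import Counter

def letters_per_place(rows, field="author_place"):
    """Count how many letters were sent from each place, skipping blank places."""
    return Counter(row[field] for row in rows if row.get(field))

# Placeholder rows for illustration; the place names are made up.
rows = [
    {"author_place": "Place X"},
    {"author_place": "Place Y"},
    {"author_place": "Place X"},
    {"author_place": ""},  # a letter with no known origin is left out of the tally
]

counts = letters_per_place(rows)
top_places = counts.most_common(2)  # the two most frequent origins
```

A quick tally like this is also a cheap way to spot typos, since a misspelled place name shows up as its own entry with a count of one.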

For next week, I shall be gathering data past 1922 up to 1925 on either Fitzgerald or Hemingway as the two did not meet prior to 1922. This will further help me in connecting the data between these two modernist writers.

Week 4


For week 4, I continued to organize the metadata from Fitzgerald's letters past 1922 up until 1925. It was decided to continue from 1922-1925 as these were the years in which Fitzgerald and Hemingway became acquainted. I experimented with the visualization tool Gephi to see if a different tool might display the data better. I also continued making visualizations with Palladio.

Below is an example of a possible string graph created with Palladio. The data displays all of Fitzgerald's male correspondents. The graph is only meant to be an example and not a final visualization product.






Fitzgerald seems to be getting many letters reviewing The Great Gatsby and thanking him for the copies that he sent out to many of his friends. He also thanks them for giving their input on the book. I thought it was interesting that Daisy was based on Zelda Fitzgerald but Gatsby was based on a man named Max Gerlach. I always thought Gatsby's catchphrase of "old sport" got annoying pretty quickly in the story and that nobody would actually ever say that to anyone, but apparently Max Gerlach would say, "Old sport." It just seems kind of odd to base one character on your wife and another on your friend. For those of you who haven't read The Great Gatsby, Gatsby is in love with Daisy. Maybe Fitzgerald found Gerlach to be a more interesting character to write about than himself.

Week 5


For week 5, I continued entering metadata from Hemingway's letters from 1922-1925 into Excel. I've also begun to work with a tool called Google Refine. It's necessary because some of the data contains special characters that Palladio and Gephi can't recognize, so the visualizations aren't as good as they could be unless the data is refined. However, the special characters need to stay in the original data to keep it accurate. For example, brackets around a year indicate that the exact year is unknown but is inferred to be around that year.
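The same idea, keeping the original value while producing a cleaned copy for the tools, can be sketched in Python. This is not how Google Refine works internally, just an illustration; the `date` field name and the added `_clean` and `_inferred` columns are assumptions for the example.

```python
import re

# Matches a value that is wrapped entirely in square brackets, e.g. "[1923]".
BRACKETED = re.compile(r"^\[(.+)\]$")

def clean_year(raw):
    """Return (clean_value, inferred) for a year such as '[1923]' or '1923'."""
    text = raw.strip()
    match = BRACKETED.match(text)
    if match:
        return match.group(1).strip(), True   # brackets mean the year was inferred
    return text, False

def add_clean_columns(row, field="date"):
    """Copy the row, leaving the original field untouched and adding cleaned columns."""
    new_row = dict(row)
    clean, inferred = clean_year(row[field])
    new_row[field + "_clean"] = clean
    new_row[field + "_inferred"] = inferred
    return new_row
```

The visualization tools would read the cleaned column, while the bracketed original stays in the table as the accurate record.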

Week 6


For week 6, I organized metadata from the letters of Virginia Woolf from the years 1919-1922. I think it's been interesting to see how much gossip she was able to obtain despite often being sick, which would have confined her to her house much of the time.

I have also begun using the metadata from both Hemingway and Fitzgerald to create visuals that show their common contacts. So far I have been able to do this successfully with Palladio. Although the graph is not as organized as I would like it to be, it still displays the common contacts. The following is a graph I created with Palladio that displays data from both Hemingway and Fitzgerald:


The following is part of a network graph I have been working on in Gephi of Hemingway and Fitzgerald's contacts:

For future visualization, I will create a clearer version of this graph in Gephi. The final visualization will contain all four literary figures: F. Scott Fitzgerald, Ernest Hemingway, Katherine Mansfield, and Virginia Woolf. With Gephi I will be able to colour code the graph, which will make each relationship in the network clearer.
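The notion of a common contact can be stated precisely as a set intersection. The following Python sketch is only an illustration of that idea, not the Palladio or Gephi workflow; the `author` and `recipient` field names are assumed, and the single-letter names in the usage note are placeholders.

```python
def contacts(rows, writer):
    """Everyone the writer sent letters to or received letters from."""
    people = set()
    for row in rows:
        if row["author"] == writer:
            people.add(row["recipient"])
        elif row["recipient"] == writer:
            people.add(row["author"])
    return people

def common_contacts(rows, writer_a, writer_b):
    """People who corresponded with both writers, excluding the writers themselves."""
    shared = contacts(rows, writer_a) & contacts(rows, writer_b)
    return shared - {writer_a, writer_b}
```

For example, if placeholder writers "A" and "B" both exchanged letters with "C", then `common_contacts(rows, "A", "B")` would contain "C" even if A and B also wrote to each other.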

Week 7


For week 7, I was busy participating in XSEDE14, but I managed to continue entering the data from Katherine Mansfield's collections of letters. The letters from Katherine Mansfield were from 1918-1921, unlike most of the other collections. These letters also included only her address and not the address of her recipient. It seems kind of odd to leave that out, but perhaps the editors of the collection didn't deem it an important element to include. I've noticed that, unlike Fitzgerald and Hemingway, Woolf and Mansfield don't move around as much. Their addresses are pretty consistent compared to Hemingway's and Fitzgerald's. This could be due to Hemingway and Fitzgerald's involvement in the war.

Week 8


For week 8, I worked on refining all the data I have collected, which includes making sure the entries are correct. I am also refining my visualizations so that they are easier to understand. Most of my time has gone into the poster for the upcoming presentation, which includes an abstract, introduction and background, methods and tools, results, discussion, conclusion, future work, acknowledgements, and references.

Project Summary

Final Report

My project required the visualization of textual data from the letters of F. Scott Fitzgerald, Ernest Hemingway, Virginia Woolf, and Katherine Mansfield. Over 2000 letters from these writers were entered into table format over the course of 8 weeks. The letter years varied: Fitzgerald's and Hemingway's letters were from 1919-1925, Woolf's letters were from 1919-1922, and Mansfield's letters were from 1918-1921. While gathering the data, I also worked with the visualization tools Palladio and Gephi to create many different types of visualizations.

I wanted to find out who these writers were talking to and what common contacts they had. I found that Fitzgerald and Hemingway's common contacts were Sherwood Anderson, Burton Rascoe, Zelda Fitzgerald, Edmund Wilson, and Maxwell Evarts Perkins. I then looked into which of their common contacts both writers sent the most letters to, and found that both Fitzgerald and Hemingway wrote Gertrude Stein the most out of their common contacts. This was determined by whom both writers wrote to, not by who received the most letters overall. For example, it would make sense that Fitzgerald wrote to his wife the most out of his common contacts, but Hemingway only wrote Zelda Fitzgerald once. I also found that Woolf and Mansfield's common contacts were Sydney Waterlow, Lytton Strachey, and S.S. Koteliansky, and that both Woolf and Mansfield wrote S.S. Koteliansky the most.

Gertrude Stein was a prominent writer of her time and S.S. Koteliansky was known as a publisher. From this data, it was concluded that these writers made connections to other influential literary figures of their time through a relatively small set of shared connections.

Last updated: 07/24/2014